Point of Interest Mid-Infrared Spectroscopy for Inline Pharmaceutical Packaging Quality Control

Good manufacturing practice for medicinal products is laid down in several guidelines and Directives of the European Commission. Those regulations imply, among other aspects, that medicinal product manufacturers have to ensure that the final products are fit for their intended use and do not place patients at risk due to inadequate safety, quality, or efficacy. In the manufacturing of pharmaceutical blisters, attaining this quality objective often requires qualified personnel for final visual verification of the blister pack content. Inline content verification of pharmaceutical blisters therefore calls for sensors that provide fast, noncontact, and accurate chemical information on each individual blister content. Here, we report on a quantum cascade laser (QCL)-based blister-verification sensor. The verification principle is chemical identification of the substance by means of backscattering mid-infrared (IR) spectroscopy. The light source is a palm-size wavelength-tunable mid-IR QCL with $\sim$1-kHz tuning speed. The blister content verification uses machine vision to obtain the required position information for each individual content and fast spatial scanning facilitated by a two-axis galvanometer scanner. Diffuse reflectance mid-IR spectra are acquired at each location, and their classification is conducted instantaneously. Different classifier approaches are evaluated and discussed, including machine learning and standard cross correlation to Fourier-transform-IR (FTIR) data. Altogether, this sensor is capable of scanning a standard 12-pill blister pack in $\sim$0.3 s; this scanning time is determined essentially by the desired classification accuracy and not by the spectral resolution, which is fixed.
Using machine learning classification, 100% identification accuracy is demonstrated for 13 different medication types (i.e., with different chemical nature), whereas only 97.4% identification accuracy is achieved by standard cross correlation to FTIR data. The pills used all have similar size, shape, and color, so that classification by visual inspection is barely possible.


I. INTRODUCTION
THE European regulatory system for medicines under administration of the European Medicines Agency [1] monitors the safety of all medicines that are available on the European market throughout their life span. For the case of medicines delivered within pharmaceutical blisters, regulated pharmaceutical quality requirements imply high standards on the blistering technology as well as full reliability on the blister's content [2], [3]. Specific blister packaging is a convenient and practical way to pack tablets and capsules for easy patient use. The package integrity of blister packs ensures the product quality and confers long shelf life by protecting the drug from moisture and oxygen [4]. It offers lower costs by reducing wastage and provides increased drug therapy safety in comparison to individual drug dispensing by the patient or care staff, owing to automated processes and additional quality control. Visual inspection verifies number, color, and shape of each individual blister package. Although not explicitly requested, current regulations imply in practice the need for an experienced pharmacist for final verification. However, for solid doses with similar physical appearance, i.e., color and shape, those visual quality measures fail, as no chemical verification is performed by default due to the limitations on time and qualified personnel. Spectroscopic methods could provide the required additional chemical information to further increase the security. Mid-infrared (IR) spectroscopy in the 4-12 µm (850-2500 cm⁻¹) range is particularly promising for this purpose, as it directly addresses the strongest and most characteristic fundamental absorption features exhibited by compounds, thereby providing unique fingerprints for each substance, even in a noncontact operation, such as backscattering or diffuse reflectance spectroscopy [5], [6], [7].
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
While the science underpinning this detection modality is well proven, practical deployment has been hampered by low measurement speeds, which are typically orders of magnitude too low for production-type, inline operation. Recent advances in mid-IR lasers promise to unlock the potential of this technology as the high spectral and spatial brightness offered by a laser source yields excellent spectral selectivity and allows measurements to be undertaken at considerable stand-off distance. In particular, external cavity quantum cascade lasers (EC-QCLs) hold promise as they are broadly tunable emitting devices and can address, by their design, the entire spectral fingerprint region in fractions of a second [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23]. In these devices, light is generated by a broadband-emitting quantum cascade laser (QCL) and wavelength tuning occurs by using the angular dependence of the optical feedback after reflection on an external diffraction grating. EC-QCL diffuse-reflectance spectroscopy studies for the analysis of pharmaceutical formulations and active pharmaceutical ingredients (APIs) have been reported by Galán-Feyle et al. [15] and Villanueva-López et al. [21].
In resonant micro-opto-electromechanical system (MOEMS) EC-QCLs, agile wavelength tuning is achieved by combining the QCL semiconductor laser chip with a MOEMS grating scanner (i.e., the diffraction grating is etched on the rotating MEMS itself [11]). The MOEMS acts here as a wavelength-selecting element and is operated at a mechanical resonance frequency, increasing the spectral scanning speed up to the kilohertz range [16], [17]. Müller et al. [20] report on a spectrometer with subsecond acquisition time that uses a MOEMS EC-QCL as a light source to acquire diffuse reflectance spectra on a variety of solid-state substances, including pharmaceutical pills, powders, and foils.
In this study, we describe how mid-IR diffuse reflectance spectroscopy can be used to support blistering machines in order to achieve a better inline content verification. Our approach consists of combining standard diffuse reflectance MOEMS-EC-QCL spectroscopy with artificial intelligence (machine vision and machine learning), enabling 100% inline content verification capabilities; 100% verification means here that each point of interest (POI) is scanned. Our system (see Fig. 1) operates continuously and without an operator, as it automatically detects and classifies a number of pills (currently 13 pill types or medications) every time that a blister with pills passes the sensor's field of view [also called region of interest (ROI)]. For this, the user deposits a blister in a random position and orientation on the conveyor belt.
The deposited blister contains 12 pills from the 13 pill types allowed. The 12 pills are randomly picked from the allowed 13 pill types and are also randomly distributed within the blister. The conveyor belt moves the blister to the field of view of the sensor. As soon as this position is reached, the conveyor belt stops, and the system identifies the individual pill positions using machine vision and performs spectral scans, while the QCL interrogation laser beam is rapidly and sequentially steered to each preidentified pill site. This increases the overall throughput of the sensor, as the spectroscopy measurement is limited only to the POIs.
The acquired data are translated into a diffuse reflectance spectrum (one spectrum for each POI, i.e., for each pill), which is then instantaneously classified. After this, the system moves the blister away from the sensor's ROI and waits for the next blister package. The system can be configured so that, after reversing the belt's direction, the blister moves back again into the ROI and the measurement is repeated in a slightly different blister position. This continuous operating mode, alternating the belt direction after each ROI pass, also allows for collecting measurement data for classifier training. More details on this point are given in Section III-A.

II. EXPERIMENTAL SECTION
A. Sample Selection and Preparation
The 13 medications addressed in this work are all of similar shape (circular, with diameters ranging within 12-13 mm) and color (whitish). They have furthermore no marks or engravings on at least one side, so that they are barely distinguishable by the eye. These characteristics make them ideal for our study, as human visual inspection (e.g., for the purposes of sorting or packaging-error exclusion) would certainly fail. Fig. 2 shows diffuse reflectance spectra for the 13 substances as acquired with a commercial Fourier-transform IR (FTIR) microscope. Characteristic diffuse reflectance spectra result from the different chemical nature of the substances contained in each pill (active ingredients, filler agents, etc.) as well as their spatial distribution within the pill. The spectra are stacked for illustration purposes.
For illustration purposes, the calculated molecular transition intensities for water vapor (H₂O) as well as for carbon dioxide (CO₂) are shown in Fig. 2. These molecules are particularly relevant for our measurement concept as our reflectivity measurement is conducted in an unpurged environment and those molecules are mainly responsible for light absorption by air within the mid-IR. Other air components (N₂, O₂, O₃, H₂, CH₄, NH₃, NO, CO, etc.) have no relevant transition strengths within this region or have too small concentrations or both. For wavenumbers within ∼1320-1920 cm⁻¹, water vapor presents several transitions due to the activation of vibrational-rotational modes. Similarly, CO₂ transitions are present within the ∼620-730 cm⁻¹ range. These transitions are sharp, as is typical for gas vibrational-rotational transitions. When conducting a stand-off measurement in air, those transitions reduce the signal-to-noise ratio in those wavenumber regions considerably. In our case, although our system is unpurged, such sharp dips are absent in our data, because our working wavenumber region (gray shadowed area in Fig. 2) lies intentionally outside those air-absorption regions.
There are no preparation steps for the samples. The pills are taken from the original packages and directly used for measurements. Fig. 3 shows schematically the components of the QCL-based blister-verification sensor. These are summarized in Table I.

B. Measurement Setup
As soon as a blister appears in the ROI, the conveyor belt stops, and an image of the ROI is taken. The coordinates (in pixels) of the POIs (i.e., the pills in the blister) within the ROI are determined from this image and then transformed into galvo coordinates (in volts) using a previously generated calibration matrix. A spectral scan is conducted by deflecting the QCL beam over each POI. Spectral data are acquired continuously during the entire blister scan and spectral averaging is conducted by acquiring data for several MOEMS periods at each POI.
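The pixel-to-galvo transformation described above can be illustrated with a minimal sketch. It assumes that the calibration matrix is a 2x3 affine matrix; the function name `pixels_to_galvo`, the matrix `CALIB`, and all numerical values are hypothetical and serve only as an illustration, not as the calibration used in the sensor.

```python
# Sketch only: the 2x3 affine form and all numbers are illustrative assumptions.

def pixels_to_galvo(points_px, calib):
    """Map (x, y) pixel coordinates to (Vx, Vy) galvo voltages.

    calib is a 2x3 affine matrix [[a, b, c], [d, e, f]] such that
    Vx = a*x + b*y + c and Vy = d*x + e*y + f.
    """
    (a, b, c), (d, e, f) = calib
    return [(a * x + b * y + c, d * x + e * y + f) for x, y in points_px]


# Hypothetical calibration: 0.01 V per pixel, origin at the center of a
# 640x480 image, image y axis pointing down.
CALIB = [[0.01, 0.0, -3.2],
         [0.0, -0.01, 2.4]]

poi_pixels = [(320, 240), (420, 240), (320, 140)]   # made-up POI centers
poi_volts = pixels_to_galvo(poi_pixels, CALIB)
```

In practice, such a matrix could be obtained by steering the beam to a few known voltages, locating the spot in the camera image, and fitting the six coefficients; a lens with noticeable distortion would require a higher-order model.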
A MOEMS EC-QCL spectral scan consists of the acquisition of the backscattered light intensity versus time. This works as follows: the QCL used is a broadband emitting device designed and fabricated at Fraunhofer IAF, covering the spectral emission range ∼1050-1310 cm⁻¹. This spectral emission range has been chosen in order to cover a large number of spectral features for the selected samples. With an increasing number of samples of different chemical nature, it would become more difficult to define a common spectral ROI, as the spectral bandwidth of MOEMS EC-QCLs is typically ∼300 cm⁻¹. As a solution, one could combine several MOEMS EC-QCL modules into a single ultrabroadband source without the need to operate the individual modules sequentially, i.e., keeping the total scan time the same as for a single MOEMS EC-QCL [22], [23].
The QCL is driven in pulsed mode, with current pulse trains of 500-kHz repetition rate and 100-ns pulsewidth. The average output power is 20 mW. The resonant MOEMS oscillates continuously with a frequency of ∼1 kHz. Each time the MOEMS crosses its equilibrium position, a new current pulse train starts, so that each current pulse leads to a light pulse with a different wavenumber yet with a fixed time delay with respect to the train's starting time. By using the relationship between MOEMS deflection and time as well as that between emission wavenumber and MOEMS deflection, the acquired data (light intensity pulses versus time) are translated into the wavenumber domain. This results in the measured backscattered spectrum. More details about the relationship between time and wavenumber space when operating a MOEMS EC-QCL can be found elsewhere, for example, in [17].
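As an illustration of the time-to-wavenumber translation, the following sketch assigns a wavenumber to each pulse of a train. It rests on two simplifying assumptions that are not taken from the text: the MOEMS deflection is a pure sinusoid in time, and the emission wavenumber depends linearly on the deflection. The center wavenumber and half span are chosen only to match the quoted ∼1050-1310 cm⁻¹ emission range; the real mapping uses the measured relationships described in [17].

```python
import math

F_MOEMS_HZ = 1_000.0     # MOEMS resonance frequency (~1 kHz in the text)
F_PULSE_HZ = 500_000.0   # QCL pulse repetition rate (500 kHz in the text)
NU_CENTER = 1180.0       # cm^-1, assumed center of the ~1050-1310 cm^-1 range
NU_HALF_SPAN = 130.0     # cm^-1, assumed half tuning range


def pulse_wavenumbers(n_pulses):
    """Wavenumber of each pulse under the sinusoidal, linear-tuning assumptions."""
    out = []
    for k in range(n_pulses):
        t = k / F_PULSE_HZ   # fixed delay after the equilibrium crossing
        deflection = math.sin(2 * math.pi * F_MOEMS_HZ * t)  # normalized to [-1, 1]
        out.append(NU_CENTER + NU_HALF_SPAN * deflection)
    return out


# First quarter period: from equilibrium to maximum deflection (pulses 0..125).
nu = pulse_wavenumbers(126)
steps = [b - a for a, b in zip(nu, nu[1:])]   # spacing between adjacent pulses
```

Under these assumptions the spacing between adjacent spectral points is largest near the equilibrium position and shrinks toward the turning point, so the resulting wavenumber axis is not equidistant.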

C. Calibration
The system is first referenced using a spectrally flat diffuse reflective plate and calibrated using the transmission of a polystyrene film with a known spectrum. Because of system constraints independent of the laser source, such as the detector sensitivity and the optical collection efficiency, the useful wavenumber region for a backscattering reflectivity measurement is ∼1070-1300 cm⁻¹. Fig. 4 shows the typical acquired data for a blister verification measurement. The top-left picture is the light intensity versus time as acquired during a blister scan. The blister used contains 12 pills, just as the one shown in Fig. 1. For this particular measurement, 20 spectral averages per POI are used. The portions of the full dataset that correspond to each POI are colored and indicated in Fig. 4 (top-left). These are "raw" light intensity versus time data, i.e., as recorded by the data-acquisition system. Depending on the specific characteristics of each pill (chemical and morphological), different light intensity levels are collected at each POI. The main contributors to this heterogeneity are the different reflectance characteristics of each pill in the spectral range used. An additional (minor) contribution comes from slight POI-position-dependent differences in the collection efficiency of the backscattered mid-IR light.

D. POI Spectroscopic Measurement
Zoomed views for two particular POIs are shown in Fig. 4 (bottom). The colored (blue/orange) portions of the data are used for averaging and subsequent data analysis. The unused data points (e.g., the values <450 and >470 ms for Fig. 4 bottom right) correspond to the QCL beam traveling between POIs.
Diffuse reflectance spectra for the first and 11th POI are shown in Fig. 4 (top-right). These reflectivity spectra result from averaging the "raw" data as in Fig. 4 (bottom), transforming the averaged data into the wavenumber space, and dividing the resulting spectrum by the previously taken reference spectrum. Measured FTIR reflectivity spectra for those pills are shown in Fig. 4 (top-right) as well. Those FTIR spectra were acquired using a commercial FTIR microscope. A high degree of similarity is found between the reflectivity spectra acquired by our sensor and those acquired with the commercial FTIR system. Our measurement, though, takes only ∼20 ms per POI, whereas the FTIR measurement requires several minutes of measurement time in order to achieve a similar quality and resolution (∼1.5 cm⁻¹ point spacing). Furthermore, as we will see later in more detail, using FTIR-reflectivity spectra for the pill's verification leads to larger errors in the pill's classification.
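The averaging and referencing steps described above can be sketched as follows. The sketch assumes that every sweep and the reference spectrum are already sampled on the same spectral axis, so the time-to-wavenumber transformation is omitted. The helper names and the toy numbers are made up for illustration.

```python
def average_sweeps(sweeps):
    """Point-by-point mean over repeated sweeps of equal length."""
    n = len(sweeps)
    return [sum(values) / n for values in zip(*sweeps)]


def reflectance(sample_sweeps, reference_spectrum):
    """Averaged sample intensity divided by the reference intensity."""
    mean_intensity = average_sweeps(sample_sweeps)
    return [s / r for s, r in zip(mean_intensity, reference_spectrum)]


# Toy data: three sweeps of four spectral points at one POI, plus the
# intensity recorded on the reference plate.
sweeps = [[0.40, 0.52, 0.31, 0.20],
          [0.44, 0.48, 0.29, 0.20],
          [0.42, 0.50, 0.30, 0.20]]
reference = [0.84, 1.00, 0.60, 0.80]
spectrum = reflectance(sweeps, reference)
```

Dividing by the reference removes the spectral shape of the laser emission and of the detection chain, which is why pills measured on different days can still be compared after wavelength calibration.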

III. MACHINE LEARNING CONSIDERATIONS
A. General Remarks
In our study, we conduct supervised learning to train a neural network with one hidden layer to correctly classify the 13 pill classes (each pill type, i.e., each medication type, defines one class). Typically, for each type of medication, more than 30 different pills originating from different packages have been used.
The input data for training and validation are large amounts of reflectivity spectra that have been collected over long time periods, sometimes on different days. This collection consisted of letting the system scan a blister carrying a single pill type and changing the blister orientation every few minutes, while the conveyor belt transported the blister back and forth, crossing the ROI each time. Because the POI positions within the ROI are slightly different every time the belt stops, some variations within the reflectivity spectra for the same pill type are registered (see Supplementary Information for some examples). Those slight fluctuations in the input data that correspond to the same class are exactly what is needed to achieve high-quality classifier training and validation. For implementing the neural network training and validation, the deep learning API Keras [24] was used, running on top of the machine learning platform TensorFlow [25].

B. Spectral Resampling
As explained in [17], because of the resonant operation of the MOEMS and the fixed QCL repetition rate, the spectral point spacing for the data acquired with our sensor is not equidistant. Furthermore, a wavelength-scale-calibration routine using calibration standard materials is necessary during operation in order to ensure a correct mapping of the data from the time domain into the wavenumber space. Because of this, diffuse reflectance spectra acquired at different moments (e.g., on different days) will not have exactly the same wavelength scale. In order to still be able to use these data to train our neural network, a resampling procedure was applied. This is essentially a 1-D projection of the measured spectrum onto a predefined wavenumber scale (see Fig. 5). This scale consists of equidistantly spaced points between 1070 and 1300 cm⁻¹. This raises the question: which spectral point spacing, i.e., which resampling resolution, should we apply? This is not a trivial question, and the answer depends on the average spectral width of the sample's reflectivity features, the QCL repetition rate, the number of QCL pulses, and the system's noise level. Too small a point spacing would take unwanted "sharp" features such as noise into account, reducing the training data quality. On the other hand, too large a point spacing would ignore relevant spectral features in the reflectivity data, reducing the training data quality as well. We decided to address this question empirically: we used the input data to train a neural network and measured the validation accuracy as a function of the resampling resolution.
For this test, a network with fewer nodes in the hidden layer and fewer epochs than in our final version was used. We did so in order to better resolve the differences in network performance between the resampling resolutions. The resulting validation accuracy versus resampling resolution is shown in Fig. 5 (inset).
As expected, the validation accuracy remains low for too small and too large resampling resolutions. A maximum validation accuracy is observed around 0.6 cm⁻¹ resampling resolution. For the rest of this study, this number was used for the resampling of the reflectivity spectra. After resampling, each reflectivity spectrum consists of 383 points. Each one of these points constitutes an input variable for training the neural network.
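A minimal sketch of such a resampling step is given below. It uses plain linear interpolation onto 383 equidistant points between 1070 and 1300 cm⁻¹ (a spacing of about 0.6 cm⁻¹). Linear interpolation is an assumption made here for illustration; the text does not specify which projection method was used, and the function name `resample` is hypothetical.

```python
def resample(nu_measured, r_measured, nu_start=1070.0, nu_stop=1300.0, n_points=383):
    """Linearly interpolate a spectrum onto an equidistant wavenumber grid.

    nu_measured must be sorted in ascending order and span the target range.
    """
    step = (nu_stop - nu_start) / (n_points - 1)
    grid = [nu_start + i * step for i in range(n_points)]
    resampled = []
    j = 0
    for nu in grid:
        # Advance to the measured interval that contains the grid point.
        while j < len(nu_measured) - 2 and nu_measured[j + 1] < nu:
            j += 1
        x0, x1 = nu_measured[j], nu_measured[j + 1]
        y0, y1 = r_measured[j], r_measured[j + 1]
        resampled.append(y0 + (y1 - y0) * (nu - x0) / (x1 - x0))
    return grid, resampled


# Toy input: a non-equidistant axis carrying a linear ramp, which linear
# interpolation should reproduce at every grid point.
nu_in = [1060.0, 1075.0, 1100.0, 1180.0, 1290.0, 1310.0]
r_in = [0.001 * nu for nu in nu_in]
grid, r_out = resample(nu_in, r_in)
```

Because every spectrum ends up on the same grid, spectra recorded with slightly different wavelength calibrations can be stacked directly into one training matrix with 383 columns.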

C. Machine Learning Training and Validation
For the machine learning classifier training, 383 nodes for the input layer (one per resampled spectral point), 70 nodes for the hidden layer, and 100 epochs were used. Adam optimization was used, together with sparse categorical cross entropy loss computation. If the validation accuracy reached 100% before all epochs were completed, the training was stopped. Otherwise, it was continued. Both for training and validation, 20 spectral averages per POI were used.
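For readers unfamiliar with the loss named above, the following sketch computes the sparse categorical cross entropy for a single sample in plain Python. "Sparse" means that the true class is given as an integer label rather than as a one-hot vector. This is only an illustration of the definition; it does not reproduce the Keras implementation (which operates on batches), and the example network outputs are made up.

```python
import math


def softmax(logits):
    """Convert raw network outputs into class probabilities."""
    m = max(logits)                      # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(logits, label):
    """Negative log-probability assigned to the true class (integer label)."""
    return -math.log(softmax(logits)[label])


# Made-up output for one spectrum: 13 classes, the network favors class 6.
logits = [0.0] * 13
logits[6] = 2.0
loss_if_true_is_6 = sparse_categorical_crossentropy(logits, 6)
loss_if_true_is_0 = sparse_categorical_crossentropy(logits, 0)
```

The loss is small when the network assigns high probability to the correct pill type and grows when probability mass is placed on other classes, which is the quantity the optimizer minimizes during training.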
We also investigated the effect of applying principal component analysis (PCA) on the classifier's performance. For this purpose, a dimension reduction on the resampled data was conducted, reducing the number of input variables from 383 to 50. In this case, 50 nodes were also used for the hidden layer. Fig. 6 (top) shows the confusion matrix that results for the classification of the 13 pill types using the PCA-preprocessed machine learning classifier. Excellent (100%) prediction capability is achieved, as predicted and expected labels coincide in all cases. (Each label corresponds to a pill type, just as enumerated in Fig. 2.) Fig. 6 (bottom) shows the confusion matrix that results when classifying the pills using standard cross correlation to FTIR data, reaching an average validation accuracy of 97.4%. The FTIR-supported classification performs worse when comparing spectra from two different pill types that have a high degree of spectral similarity. This is the case for the label pairs 0 and 6 (paracetamol versus aspirin/paracetamol/caffeine), 1 and 12 (aspirin versus metamizole/natrium), 4 and 9 (magnesium/potassium versus magnesium), and 5 and 10 (ambroxol versus doxylamine) (see also Supplementary Material for a more detailed view of the averaged diffuse reflectance spectra).
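As a reminder of how a confusion matrix and the accuracy derived from it are built, consider the following sketch. Rows are expected labels and columns are predicted labels. The label lists are invented for illustration and are not measurement results.

```python
def confusion_matrix(expected, predicted, n_classes):
    """Count how often each expected label (row) received each predicted label (column)."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for e, p in zip(expected, predicted):
        matrix[e][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e., correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Invented example with 13 classes: one class-0 sample is mistaken for class 6.
expected = [0, 0, 6, 6, 1, 12, 4, 9]
predicted = [0, 6, 6, 6, 1, 12, 4, 9]
cm = confusion_matrix(expected, predicted, 13)
acc = accuracy(cm)
```

A perfect classifier yields a purely diagonal matrix; off-diagonal entries show directly which pairs of pill types are confused with each other.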

IV. RESULTS AND DISCUSSION
In some cases, this similarity is due to the fact that the active ingredients in those pills are similar (e.g., paracetamol versus aspirin/paracetamol/caffeine or magnesium/potassium versus magnesium). In other cases, similar pharmaceutical excipients, their distribution over the pill's surface, and the pill's surface shape might play a significant role in the similarity of the acquired spectra. This is the case for the ambroxol versus doxylamine pills, as both contain lactose, SiO₂, magnesium stearate, and cornstarch and are fabricated by the same manufacturer. The exact reason why those spectra are similar is beyond the scope of this article.
In essence, the machine-learning classifier performs much better than the standard FTIR-supported classifier because the machine-learning classifier is trained on the subtle variations in the measured spectrum due to the different possible incidence angles for the QCL on the pill (as, during training, the pill can be located anywhere within the ROI). The FTIR classifier, on the other hand, has no training phase, and it identifies the substance by finding the maximum overlap (cross correlation) between the measured spectrum and FTIR-measured database spectra, completely ignoring the inhomogeneities when acquiring diffuse reflectance spectra at slightly different positions within the pill.
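The database-matching idea can be sketched as follows. The sketch assumes a zero-lag, mean-subtracted, normalized cross correlation (the Pearson coefficient) as the overlap measure; the exact correlation measure used for the FTIR classifier is not specified in the text. The library spectra, labels, and the measured spectrum are invented five-point examples.

```python
import math


def normalized_correlation(a, b):
    """Zero-lag normalized cross correlation (Pearson coefficient) of two spectra."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    norm = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / norm


def classify(measured, library):
    """Return the library label whose spectrum correlates best with the measurement."""
    return max(library, key=lambda label: normalized_correlation(measured, library[label]))


# Invented database spectra on a common five-point axis; labels are placeholders.
library = {
    "pill_A": [0.10, 0.30, 0.80, 0.30, 0.10],
    "pill_B": [0.80, 0.30, 0.10, 0.30, 0.80],
    "pill_C": [0.10, 0.20, 0.30, 0.40, 0.50],
}
measured = [0.22, 0.41, 0.95, 0.44, 0.20]   # scaled, offset, slightly noisy "pill_A"
best = classify(measured, library)
```

Such a matcher compares each measurement with a single reference per substance and has no knowledge of the position-dependent spectral variations, which is consistent with its weaker performance on pill types with very similar spectra.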
A validation accuracy of 100% is also achieved by the machine learning classifier without PCA-dimensionality reduction, as well as by a standard statistical analysis classifier (logistic regression, Python scikit-learn) with PCA-preprocessed input data. To better visualize the superiority of the PCA-preprocessed machine learning classifier, we performed what we call a "resilience test": having trained on 20 spectral averages, the classifiers' performance in making predictions on data taken with different numbers of spectral averages (1, 3, 5, 10, 50, and 100) is evaluated. Results are shown in Fig. 7. As expected, not much difference is observed for larger spectral averages (avg > 20), as the reflectivity spectra benefit from the improved signal-to-noise ratio in this case. For lower spectral averages, an overall reduction in the prediction accuracy is observed, as expected from the reduced signal-to-noise ratio. We observe, though, that the reduction in the prediction accuracy is somewhat slower for the PCA-preprocessed machine learning classifier, which reaches 96.2% prediction accuracy for avg = 1 (no spectral averaging). Machine learning without PCA-dimensionality reduction as well as the logistic regression classifier reach only about 94.6% for avg = 1.
Finally, we note the somewhat similar prediction performances of the machine learning and logistic regression classifiers when using PCA-preprocessed data (see, for example, avg = 3 in Fig. 7). Further tests (e.g., larger sample sizes, different spectral variability) are needed in order to better understand the strength of machine learning in prognostic modeling compared to traditional regression techniques.

V. CONCLUSION
We described an innovative approach for fast, noninvasive, contactless, and highly selective classification of pharmaceutical pills. Our sensor concept combines artificial intelligence (machine vision and machine learning) with laser-based diffuse reflectance spectroscopy. Because of the accuracy (which can be tuned by defining threshold values for the classification confidence), the measurement time (∼20 ms/pill), and the selectivity (mid-IR fingerprint region), this approach is fit for purpose, as it can be integrated into a pharmaceutical packaging line for the purposes of quality control. Different classification approaches were discussed, and PCA-fed machine learning proved to be the most accurate and resilient classifier approach. Standard FTIR-database-supported classification was found to deliver a much poorer prediction capability, especially when testing the classifier's performance in unfavorable measurement conditions (lower spectral averages).