PCB-Vision: A Multiscene RGB-Hyperspectral Benchmark Dataset of Printed Circuit Boards

Addressing the critical theme of recycling electronic waste (E-waste), this contribution is dedicated to developing advanced automated data processing pipelines as a basis for decision-making and process control. Aligning with the broader goals of the circular economy and the United Nations (UN) sustainable development goals (SDG), our work leverages noninvasive analysis methods utilizing RGB and hyperspectral (HS) imaging data to provide both quantitative and qualitative insights into the E-waste stream composition for optimizing recycling efficiency. In this article, we introduce “PCB-Vision,” a pioneering RGB-HS printed circuit board (PCB) benchmark dataset, comprising 53 RGB images of high spatial resolution paired with their corresponding high spectral resolution HS data cubes in the visible and near-infrared (VNIR) range. Grounded in open science principles, our dataset provides a comprehensive resource for researchers through high-quality ground truths, focusing on three primary PCB components: integrated circuits (ICs), capacitors, and connectors. We provide extensive statistical investigations on the proposed dataset together with the performance of several state-of-the-art (SOTA) models, including U-Net, Attention U-Net, Residual U-Net, LinkNet, and DeepLabv3+. By openly sharing this multiscene benchmark dataset along with the baseline codes, we hope to foster transparent, traceable, and comparable developments of advanced data processing across various scientific communities, including, but not limited to, computer vision and remote sensing. Emphasizing our commitment to supporting a collaborative and inclusive scientific community, all materials, including code, data, ground truth, and masks, will be accessible at https://github.com/hifexplo/PCBVision.


I. INTRODUCTION
The electronics market has witnessed remarkable growth and development over the last few years, driven by the high demand for new generations of electronic devices. However, this accelerated technological advancement has resulted in a considerably reduced lifespan for electronic products, leading to a surge in electronic waste (E-waste) [1]. Research studies and journal reports indicate that global E-waste generation has been escalating at an unprecedented rate, posing significant environmental challenges and sustainability concerns [2]. It is estimated that around 30 to 50 million tons of waste from electrical and electronic equipment (WEEE) are disposed of each year, with an estimated annual growth rate of 3 to 5% [3]. In 2019, a staggering 53.6 million metric tons of E-waste was generated globally, with only 17.4% of it officially documented as recycled [4]. The resulting accumulation of E-waste, along with its unrecovered critical and toxic raw materials, demands effective and efficient recycling strategies to mitigate its environmental impact and harness the economic value hidden within E-waste components.
The concept of a circular economy provides a framework for addressing E-waste challenges by promoting the recovery and reuse of materials [5]. E-waste recycling plays a crucial role in this transition by transforming discarded electronic devices into valuable resources. It also contributes to the achievement of several sustainable development goals (SDGs) outlined by the United Nations (UN) [6]. E-waste recycling reduces primary resource consumption and waste generation at the same time, which corresponds to SDG 12 "Responsible Consumption and Production: Ensure Sustainable Consumption and Production Patterns". Moreover, it helps reduce the greenhouse gas emissions associated with virgin material extraction and processing, supporting SDG 13 "Climate Action". Additionally, promoting the reuse of existing materials to reduce the environmental impact of virgin resource extraction aligns with SDG 15 "Life on Land".
Among the myriad electronic components contributing to E-waste, printed circuit boards (PCBs) hold particular significance due to their widespread use in various electronic devices. The recycling of waste PCBs can unlock significant value, given their considerable residual content. Approximately 28% of a PCB's weight is constituted by high-grade precious metals such as Au, Ag, Cu, Pd, and Ta [7]. The extraction of those metals can help reduce the need for raw material extraction. Nevertheless, waste PCBs present environmental and health hazards. The methods employed for extracting precious metals, particularly in open-air settings like PCB acid baths [8], can inadvertently release toxic substances, including lead and mercury, into the environment [7].
To achieve efficient and sustainable recycling of PCB components, i.e., optimized E-waste recycling, there is a pressing need for automated analytical techniques and information systems that can quickly provide both qualitative and quantitative information about the PCB composition. Smart systems that adopt optical sensors, including but not limited to RGB cameras, have demonstrated remarkable potential in this domain [9]. By leveraging the rich data obtained from these sensors, machine learning (ML) and deep learning (DL) methods can be deployed to streamline the recycling process and improve its accuracy and efficiency [10]. Motivated by these technological and computational advancements, our project Ramses for Circular Economy (Ramses-4-CE) spearheads the development of optical spectroscopy-based multi-sensor systems tailored for the E-waste recycling industry. At the core of our initiative lies the pursuit of advanced multi-source data fusion, e.g., of RGB and hyperspectral imaging sensors. The use of ML techniques for data fusion enables rapid data integration and automates information extraction from E-waste, thereby optimizing E-waste recycling practices.
RGB cameras offer several advantages, such as cost-effectiveness, high spatial resolution, low integration time, ease of use, and real-time data acquisition, making them popular choices for various PCB applications, such as defect detection and quality control. These cameras excel in capturing visual information at a sub-millimeter pixel scale and are widely utilized in image-based inspection systems within the electronics industry [11], [12].
Examples of work in which RGB cameras contributed to PCB-oriented information systems include Herchenbach et al. [13], who used the RGB and depth information of the Microsoft Kinect sensor to segment and classify the through-hole components (THC) mounted on a printed circuit board assembly (PCBA). Li et al. [14] proposed an automated information retrieval system for PCB recycling that detects and segments surface-mounted devices (SMDs) of two kinds: small devices, like resistors and capacitors, and integrated circuits (ICs). In another work, Li et al. [15] enhanced the text information retrieval of optical character recognition (OCR) from RGB images by developing a novel thresholding method that utilizes an adaptive window size along with background estimation. Their work shows that text recognition quality can be improved by enhancing the binarization of text contents. For PCB inspection and fault detection systems, Kim et al. [16] developed a skip-connected convolutional autoencoder that takes RGB images as input to detect defects and automate PCB surface inspection. Ding et al. [17] proposed a tiny defect detection network (TDD-Net) utilizing deep convolutional neural networks (CNN) to enhance PCB quality control systems using RGB images. Adibhatla et al. [18] used tinyYOLOv2, a version of the you-only-look-once (YOLO) [19] DL algorithm, for PCB quality inspection, detecting defects using 11,000 RGB images.
In contrast to RGB, hyperspectral imaging (HSI) cameras bring an abundance of benefits for PCB component detection and inspection systems, e.g., offering comprehensive spectral coverage that enables the identification and analysis of materials based on their unique spectral signatures [26]. Several studies employed spectroscopy-based methods to analyze and map PCB compositions. Englert et al. [27] proposed a monitoring method to detect and quantify organic contaminations on technical surfaces using hyperspectral imaging and X-ray photoelectron spectroscopy (XPS). Carvalho et al. [28] examined and analyzed 1,200 emission points on a 30 mm x 40 mm section of a 2011 mobile phone PCB sample within the wavelength range of 186 to 1040 nm using laser-induced breakdown spectroscopy (LIBS) and scanning electron microscopy with energy-dispersive X-ray spectroscopy (SEM-EDS). Their investigation focused on determining the metal compositions, and a graphical map of the metal distribution was obtained. In a study of end-of-life mobile phone waste, Palmieri et al. [29] showed that a characterization combining traditional methods, like scanning electron microscopy and Raman spectroscopy, with hyperspectral imaging in the short-wave infrared range significantly enhances recycling strategies and product recovery outcomes. Rapolti et al. [30] developed a two-stage sorting stand for E-waste, employing optical sensors in both stages. The first stage utilizes a Specim near-infrared FX10e hyperspectral camera and a robotic arm. Unrecognized items proceed to the second stage, which employs an IFM Electronics contour vision sensor and actuators for further sorting. Sudharshan et al. [31] proposed an enhanced version of the Faster-RCNN object detection method, called GOL-based Faster-RCNN, that utilizes RGB and FX17 HSI camera information to improve the detection of PCB components in support of PCB recycling and recovery systems. Polat et al. [32] fused 3D point clouds and HSI to classify different types of objects and materials in electronic waste, such as shredded PCBs and plastics; combining geometrical and physical information in this way helps routines that distinguish PCB components.
The aforementioned systems enrich the research towards building efficient non-invasive PCB information systems that act as a cornerstone for efficient PCB recycling streams. However, a foundational element in the development of those advanced systems is the availability of comprehensive datasets on which algorithms and models are designed and tested. To our knowledge, only a few PCB datasets are publicly available. For example, the 'FICS-PCB' dataset developed by Mehta et al. [20] stands out as a valuable resource for the development of robust PCB automated visual inspection (AVI) systems. Containing 9,912 RGB images with 77,347 annotated components from 31 distinct PCBs, the FICS-PCB dataset provides a comprehensive platform for evaluating and improving the performance of PCB AVI algorithms. The authors demonstrated the effectiveness of two deep learning (DL) architectures, AlexNet [33] and Inception-v3 [34], on the FICS-PCB dataset, achieving promising results in PCB component classification. Mehta et al. [21] also proposed 'FICS-PCB X-ray', the first annotated X-ray PCB dataset for automated PCB inter-layer inspection, which includes 5 PCBs, each with 3D volumetric data that can be extracted into 2D slices for analysis. In the field of PCB automated optical inspection (AOI), Huang et al. [22] proposed a PCB dataset referred to as 'PCBA-Defect' for defect detection and classification. It contains 1,386 RGB images with six kinds of synthetic defects (missing hole, mouse bite, open circuit, short, spur, and spurious copper) photoshopped onto cropped images of PCB traces scanned with an industrial camera equipped with a CMOS sensor and a zoomable industrial lens. Tang et al. [23] proposed 'DeepPCB', a dataset containing 1,500 binary image pairs of defect-free and defective traces with annotations of six common types of defects (open, short, mouse bite, spur, pinhole, and spurious copper) along with their positions; they also proposed a DL model utilizing a novel group pyramid pooling to efficiently detect PCB trace defects. To facilitate research on computer-vision-based PCB analysis, Pramerdorfer et al. [24] proposed a PCB dataset referred to as 'PCB-DSLR', containing 748 images of 165 different PCBs captured with a digital single-lens reflex (DSLR) camera in an industrial-like environment on top of a conveyor belt. 'PCB-DSLR' also provides accurate PCB segmentation information, in addition to 9,313 bounding boxes for detecting integrated circuits (ICs), along with textual information of the labels of 1,740 IC samples. A dataset similar to [24] but larger is 'PCB-Metal', introduced by Mahalingam et al. [25] (PCB-Metal is not available as of this publication date): a dataset of 984 high-resolution RGB images of 123 different PCBs with object detection ground truths of 4 main PCB components (ICs, capacitors, resistors, and inductors), useful for image-based PCB analysis, e.g., PCB classification and component detection. Table I provides a summary of the publicly available PCB datasets, with their sensor and data type information, to the best of our knowledge.
It can be noted that, although reflectance-based HSI provides further features (a spectral representation of a physical absorption process) that allow more advanced analysis for PCB recycling-oriented systems, progress in HSI image processing lags behind its RGB counterpart due to the scarcity of available data. The advancement of models and methodologies in HSI requires large datasets, which would facilitate the development of non-invasive PCB inspection systems.
Existing HSI datasets remain sparse; many are confined to a single scene and focus on non-industrial applications such as food [35], medical fields [36], and earth observation, e.g., Indian Pines, Houston, Salinas scene, Kennedy Space Center, and Pavia University. Those datasets formed a baseline for developing many HSI classification models, e.g., 'HybridSN' [37], a model by Roy et al. that combines 3D CNN with 2D CNN to extract rich spatial-spectral feature representations. Applying a 3D CNN to the full HSI scene on a graphical processing unit (GPU) demands more memory than is typically available; therefore, patches extracted from the scene were fed to the model. For objects without a defined shape, e.g., earth observation scenes, this is not a problem, but it might be for geometrically shaped objects like PCBs, as we discuss later on. Another example is Uchaev et al. [38], who presented RPNet-RF, a model that combines recursive filtering and a random patches network (RPNet) to extract informative features that are combined with the HSI spectral features to classify the HSI using a support vector machine (SVM) classifier. Nonetheless, the lack of multiscene hyperspectral (HS) data hinders the development of HSI image processing models with high generalization ability to unseen scenes; therefore, multiscene hyperspectral datasets are needed. Multiscene HSI datasets expose models to more illumination variations, noise (ambient and instrument), and variable proportions of content (objects/classes), which challenge existing models trained on single-scene datasets. Recently, more sophisticated multiscene HSI datasets have emerged. For instance, 'landslide4sense', presented by Ghorbanzadeh et al. [39], contains 3,799 HSI patches of size 128 by 128 acquired by fusing optical layers from the Sentinel-2 sensor, collected at four different times and geographical locations: Iburi (2018), Kodagu (2018), Gorkha (2015), and Taiwan (2009). Landslide4sense provides binary classification ground truth of landslide or non-landslide labels to facilitate accurate detection of landslide extents. A further multiscene dataset, for remote sensing of urban environments, is 'C2Seg', provided by Hong et al. [40]. C2Seg is a multimodal multiscene benchmark dataset consisting of two cross-city scenes, Berlin-Augsburg (Germany) and Beijing-Wuhan (China), for cross-city semantic segmentation studies. C2Seg contains multiple data types, including hyperspectral, multispectral, and synthetic aperture radar (SAR). Such datasets support the development of HS data classification models with high generalization ability, since they allow classifiers to be trained on data from multiple scenes, i.e., with more variance in the targets, which improves the prediction performance on unseen data.
Nevertheless, none of the above-mentioned datasets and investigations provides a comprehensive, simultaneous representation of PCBs across both the RGB and HSI domains. Addressing this gap, we introduce 'PCB-Vision', a multiscene RGB-HSI dataset centered on PCBs. The integration of RGB and HS data within a single dataset holds significant promise: such datasets foster cross-modal exploration, empowering researchers and practitioners to forge novel pathways for material stream digitalization, monitoring, and intelligent, data-driven decision-making in PCB recycling without the need to build expensive setups. PCB-Vision unlocks a multitude of prospects for advancing computer vision techniques. Its potential extends beyond segmentation and object detection to applications like pansharpening and superpixel restoration in remote sensing. The dataset is supported by statistical analyses and carefully crafted supervised learning ground truths for both the RGB and hyperspectral data modalities. Through comprehensive numerical evaluations of segmentation models on these data types, we lay the foundation for research that addresses the urgent challenges posed by E-waste recycling and supports future concepts of more robot-aided and object-oriented E-waste component separation. PCB-Vision can help identify the most valuable components of PCBs, as we later present in the classes of interest, which can maximize the recovery of valuable materials and improve the efficiency of recycling processes [41].
PCB-Vision provides a platform for evaluating and comparing different PCB analysis methods. This benchmarking fosters innovation and collaboration among researchers and industry in E-waste recycling. PCB-Vision pushes E-waste recycling methods a step forward toward more efficient and environmentally friendly technologies by accelerating the development of non-invasive real-time analytical systems that meet the industrial demand for in-line data processing pipelines that assist subsequent tasks. This supports the Green Deal and the circular economy.
The subsequent sections of this paper are structured as follows: Section two provides a comprehensive overview of the dataset, addressing acquisition requirements, outlining the provided annotations, and presenting their associated statistics.
Section three presents the methodologies and models employed on the dataset and details the experimental procedures involved in training the models, along with the subsequent numerical evaluations. Section four discusses the forthcoming challenges and outlines future research directions. Finally, section five presents our conclusions and summarizes the work's contributions.

II. DATASET DESCRIPTION
This section provides a comprehensive overview of the dataset, detailing the sensors employed for acquisition and presenting statistics related to the annotations. Figure 1 gives an overview of the PCB-Vision workflow. In the industrial conveyor belt setup used for data acquisition, the RGB and HSI cameras are mounted above the belt and scan the PCBs, yielding high spatial resolution images along with high spectral resolution HSI. The data is then preprocessed and normalized to fit the processing pipeline of the DL models, which, after the training phase, perform predictions to identify valuable PCB components.

A. Data Acquisition
The dataset encompasses RGB images captured by a Teledyne Dalsa Genie Nano-C4020 camera and HS data cubes obtained using a Specim FX10 spectrometer. The two sensors are positioned above the conveyor belt in our Helios laboratory at the Helmholtz Institute Freiberg for Resource Technology (HIF), as shown in figure 2. The C4020 camera features full-frame capture, while the FX10 spectrometer operates as a line scan device with high spatial resolution across the visible and near-infrared (VNIR) range. The PCB stream was illuminated using eight broad-band quartz-tungsten halogen lights. Table II provides detailed acquisition parameters for the two cameras. Hyperspectral data acquisition with the FX10 was performed using the Specim Lumo Recorder software (Spectral Imaging Ltd., Oulu, Finland). Before capturing the PCB cubes, a dark reference frame was acquired with the shutter closed. Subsequently, a white reference panel with 99% reflectance was captured. Using the Hylite toolbox [42], we converted the raw hyperspectral data cubes to reflectance by applying the reference levels from the white and dark reference measurements, such that each pixel within the resulting cubes constitutes a vector representing the corrected reflectance spectrum. Figure 3 presents a true-color representation of the HSI of PCB 1, along with the spectra of the classes of interest ('IC', 'Capacitor', 'Connectors') and of the conveyor belt surface. Each spectrum was taken from a random point on the surface of the corresponding object. From subfigure 3b we notice that the spectra of class 'IC' are similar to the spectra of the conveyor belt. This is expected, since the IC surface material and the conveyor belt material are similar black polymers and therefore reflect similar spectra in this range. We highlight this situation because it will confuse models that use only spectral information for pixel-wise classification (segmentation) of those two classes. Further analysis of this problem is introduced in the experiments section.
Fig. 2: Acquisition setup at Helios Lab [31].
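The dark and white reference correction described above can be sketched for a single pixel spectrum as follows. This is a minimal illustration of the standard flat-field formula, R = ρ (raw − dark) / (white − dark), with ρ the panel reflectance (99% here); it is an assumption that the toolbox applies the correction in exactly this form, and the function name and toy numbers are hypothetical.

```python
def to_reflectance(raw, dark, white, panel_reflectance=0.99, eps=1e-12):
    """Convert one raw pixel spectrum to reflectance, band by band.

    raw, dark, white: equal-length sequences of sensor counts per band.
    panel_reflectance: nominal reflectance of the white panel (0.99 assumed
    from the 99% panel mentioned in the text).
    """
    if not (len(raw) == len(dark) == len(white)):
        raise ValueError("raw, dark and white must have the same number of bands")
    reflectance = []
    for r, d, w in zip(raw, dark, white):
        span = w - d
        if abs(span) < eps:
            span = eps  # guard against division by zero in dead bands
        reflectance.append(panel_reflectance * (r - d) / span)
    return reflectance


# Toy three-band example (values are illustrative, not from the dataset).
raw = [120.0, 560.0, 1010.0]
dark = [100.0, 100.0, 100.0]
white = [1100.0, 1100.0, 1100.0]
spectrum = to_reflectance(raw, dark, white)
```

With array libraries the same expression is usually applied to the whole cube at once, with the dark and white references broadcast along the scan direction.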
For the annotation of hyperspectral cubes and the creation of pixel-wise classification ground truths, the Envi Classic (ENVI™ 5.1, Exelis Visual Information Solutions, Boulder, Colorado) software was employed. RGB images were annotated using the Anylabeling [43] annotation tool. These steps facilitated the generation of accurate and informative segmentation ground truths of three main components commonly found in every PCB, for subsequent computer vision tasks.

B. Annotations
Our research focuses on three fundamental objects commonly encountered in PCBs: Integrated Circuits (ICs), electrolytic capacitors, and connectors. The selection of ICs and connectors is driven by their significant economic value, while the inclusion of electrolytic capacitors is motivated by their potentially hazardous components. The accurate localization and detection of these components is one of the main technical objectives of our project. However, the tasks of our project 'Ramses' involve unique cases that demand a specialized type of segmentation ground truth, different from the general segmentation ground truth that covers the entire surface of an object. For this purpose, an annotation ground truth named 'Monoseg' was created. To enhance the dataset's versatility and contribution to the broader field of computer vision research, we also provide classic segmentation ground truths called 'General', making the dataset suitable for general-purpose tasks beyond the Ramses project. The main difference between the two segmentation types lies in the surface area of the object that is included, which is explained in detail below. While it is conceivable to expand the number of object classes within the dataset, this process is subject to scalability considerations, ensuring a balanced and comprehensive representation of various PCB elements.
1) General Segmentation Annotations: In this study, we define the 'General' segmentation ground truth as ground truth annotations encompassing the entire surface of the PCB components, regardless of the number of materials present on each surface. The resulting segmentation ground truth map for the PCB1 RGB image is depicted in subfigure 4a. It can be seen that the whole surface of each component of interest is taken as ground truth. This is the main difference between the two annotation styles 'General' and 'Monoseg'.
2) Monoseg Annotations: In the context of the Ramses project, the segmentation ground truths adhere to a specific approach where only the primary material of an object's surface is labeled. This means excluding any additional markings, labels, or inscriptions present on the surfaces of PCB components, such as labels printed on IC surfaces. The rationale behind the 'Monoseg' ground truth style aligns with the use of point measurement sensors, such as XRF and Raman spectroscopy, which provide valuable chemical composition information about the measured point. In this approach, a point measurement is conducted within a given segment, and this measurement must be taken where meaningful results are anticipated, not where other labeling materials exist. Measuring a point covered by the wrong material can lead to incorrect readings, potentially compromising the accuracy of the composition analysis. To mitigate this potential source of inaccuracies, the 'Monoseg' segmentation ground truth deliberately excludes all non-object-based materials.
Subfigure 4b shows this annotation strategy on the PCB1 RGB image, where the 'Monoseg' ground truth considers only a specific part of the component surface, e.g., the surface of the bottom left IC is partially selected since the label printed on the surface is ignored. For a better comparison of the two annotation styles, figure 5 demonstrates the ground truth maps of the PCB 1 RGB image for each annotation style.
The difference between the two annotation styles can be noticed in the bottom left IC and all the capacitors (green), where only the main material of the classes of interest is considered as the target.

C. Statistics
This section provides relevant statistics about the dataset and the ground truth containing the three classes of interest 'IC', 'Capacitor', and 'Connectors'. The two ground truth sets 'Monoseg' and 'General' have different sizes: the 'General' ground truth contains more segmented pixels for the 'IC' and 'Capacitor' classes. In this section, we present the statistics of the 'General' ground truth. Although we provide two independent ground truths for both RGB and HSI, we highlight the hyperspectral ground truth, since it plays an important role in selecting the training data for the HSI processing pipeline.
The dataset comprises 53 PCBs scanned using both hyperspectral and RGB cameras mounted above a black conveyor belt. This setup introduces the challenge of unwanted background (the black conveyor belt) surrounding the PCBs, which significantly impacts various HSI preprocessing steps, such as data normalization and dimensionality reduction, as well as processing steps like training State-Of-The-Art (SOTA) segmentation models. Moreover, the spectra of the conveyor belt closely resemble the spectra of the IC's dark plastic coating. To address this issue, we create PCB masks using the method proposed in [44], which help eliminate the undesired background vectors and preserve only the PCB, our object of interest.
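To illustrate why the background affects normalization, the sketch below min-max scales one spectral band using only the pixels inside a PCB mask, so that the dark conveyor belt does not set the lower bound. It assumes a binary mask is already available; how the mask is produced (the method of [44]) is not reproduced here, and the function name and toy values are hypothetical.

```python
def masked_minmax_normalize(band, mask):
    """Min-max scale one band using only pixels where mask is 1.

    band: 2-D list of reflectance values for a single spectral band.
    mask: 2-D list of 0/1 flags, 1 marking PCB pixels (assumed given).
    Background pixels are set to 0.0 in the output.
    """
    inside = [v for row, mrow in zip(band, mask) for v, m in zip(row, mrow) if m]
    if not inside:
        raise ValueError("mask selects no pixels")
    lo, hi = min(inside), max(inside)
    scale = (hi - lo) or 1.0  # constant band: avoid division by zero
    return [[(v - lo) / scale if m else 0.0 for v, m in zip(row, mrow)]
            for row, mrow in zip(band, mask)]


# Toy band: the first row and column are dark conveyor-belt pixels.
band = [[0.02, 0.03, 0.02],
        [0.02, 0.40, 0.80],
        [0.03, 0.60, 0.50]]
mask = [[0, 0, 0],
        [0, 1, 1],
        [0, 1, 1]]
normalized = masked_minmax_normalize(band, mask)
```

Without the mask, the belt value of 0.02 would become the minimum and compress the contrast between the PCB pixels.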
As mentioned previously, our dataset of 53 PCBs encompasses three classes: 'IC', 'Capacitor', and 'Connectors'. However, these classes are unbalanced, both object-wise and pixel-wise, across the dataset. Figure 6 presents a pie chart illustrating the percentage of pixels belonging to each class. It reveals a highly imbalanced class distribution, where the 'IC' class (red) dominates, constituting 82.0% of the ground truth, i.e., more than three-quarters of the labeled pixels. The 'Connectors' class (blue) follows with 9.2%, while the 'Capacitor' class (green) represents 8.7%.
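Pixel-wise class shares of the kind shown in the pie chart can be computed by counting labeled pixels over all ground truth masks. The sketch below assumes integer label maps with the hypothetical encoding 0 = unlabeled, 1 = 'IC', 2 = 'Capacitor', 3 = 'Connectors'; the actual encoding in the released files may differ, and the toy masks are not dataset values.

```python
from collections import Counter

# Assumed label encoding; 0 is treated as unlabeled and ignored.
CLASS_NAMES = {1: "IC", 2: "Capacitor", 3: "Connectors"}


def class_pixel_shares(masks):
    """Return the percentage of labeled pixels per class over all masks."""
    counts = Counter()
    for mask in masks:
        for row in mask:
            for label in row:
                if label in CLASS_NAMES:
                    counts[CLASS_NAMES[label]] += 1
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in CLASS_NAMES.values()}
    return {name: 100.0 * counts[name] / total for name in CLASS_NAMES.values()}


# Two tiny toy masks standing in for ground truth maps.
toy_masks = [
    [[0, 1, 1, 1], [0, 1, 1, 1], [0, 0, 2, 3]],
    [[1, 1, 0, 0], [1, 1, 0, 3], [0, 0, 0, 0]],
]
shares = class_pixel_shares(toy_masks)
```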
To gain a comprehensive understanding of the dataset and facilitate the subsequent train-test split of the HS cubes, we categorized the PCBs based on the classes they contain. As shown in figure 7, the most common PCB type contains only two classes, namely "IC-Connectors" (purple), accounting for 37.7% of the dataset. PCBs with all three classes, "IC-Capacitor-Connectors" (olive), constitute 28.3% of the dataset. PCBs containing the other two-class combination, "IC-Capacitor" (dark yellow), represent 24.5% of the dataset. Finally, PCBs containing only the 'IC' class (red) form 9.4% of the dataset.
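The categorization above can be sketched by collecting the set of classes present in each ground truth mask. The label encoding is the same hypothetical one (1 = 'IC', 2 = 'Capacitor', 3 = 'Connectors'). The per-category counts used below (20, 15, 13, 5) are an inference: they are the integers out of 53 that reproduce the reported percentages, not numbers stated in the text.

```python
from collections import Counter


def pcb_category(mask, class_names):
    """Name a PCB by the classes present in its label map, e.g. 'IC-Connectors'."""
    present = {label for row in mask for label in row if label in class_names}
    return "-".join(class_names[label] for label in sorted(present)) or "none"


def category_percentages(categories):
    """Share of PCBs per category, in percent, rounded to one decimal."""
    counts = Counter(categories)
    n = len(categories)
    return {cat: round(100.0 * k / n, 1) for cat, k in counts.items()}


names = {1: "IC", 2: "Capacitor", 3: "Connectors"}  # assumed encoding
example = pcb_category([[0, 1, 1], [0, 3, 0]], names)

# Counts inferred from the reported percentages (assumption, 53 PCBs in total).
assumed_counts = {"IC-Connectors": 20, "IC-Capacitor-Connectors": 15,
                  "IC-Capacitor": 13, "IC": 5}
categories = [cat for cat, k in assumed_counts.items() for _ in range(k)]
percentages = category_percentages(categories)
```

Grouping PCBs this way makes it possible to draw the train and test cubes from every category, so that no class combination is missing from either split.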
To comprehensively analyze the class imbalance in our dataset for each PCB category presented in figure 7, we provide histograms in figure 8. Each histogram illustrates the quantitative distribution of class pixels within every PCB of each category.
The four histograms presented in figure 8

III. EXPERIMENTS
This section presents the segmentation experiments for the two ground truths 'General' and 'Monoseg', for both the HSI and the RGB data types. The raw version of the data is used, without any background (conveyor belt) masking, to provide a benchmark for the raw PCB-Vision data. Five widely used segmentation models were used to benchmark the segmentation performance on the dataset. The following methodologies subsection elaborates on those models.

A. Methodologies
In this study, we conducted comprehensive experiments to evaluate the performance of various semantic segmentation SOTA models including U-Net [45], DeepLabv3+ [46], and Attention U-Net [47], among others. These models were chosen for their well-established capabilities in addressing semantic segmentation challenges in computer vision. The models used are:
1) Unet: Unet is a widely used convolutional neural network (CNN) architecture for semantic segmentation tasks in computer vision. The network's unique design features a symmetric U-shaped encoder-decoder structure [45]. The Unet architecture efficiently captures multi-scale contextual information through skip connections, allowing precise segmentation of objects in images [45]. Its effectiveness and versatility have made Unet a popular choice for various segmentation applications.
2) ResUnet: ResUnet is a CNN architecture that combines the power of ResNet and Unet for semantic segmentation tasks in computer vision. The ResUnet model leverages residual connections from ResNet to facilitate efficient feature propagation during the encoder-decoder process [48]. This fusion enables ResUnet to capture both local and global contextual information, enhancing segmentation accuracy and robustness.
3) Attention Unet: Attention Unet is a CNN architecture designed for precise semantic segmentation tasks in computer vision. The model integrates attention mechanisms within the standard Unet framework. By selectively attending to informative regions, Attention Unet achieves improved segmentation accuracy, particularly in cases where the target objects exhibit diverse appearances or complex structures [47]. The network's attention mechanisms enable it to focus on relevant image regions, enhancing its ability to accurately delineate objects of interest, such as organs in medical imaging or objects in natural scenes.
4) DeepLabv3+: DeepLabv3+ is an advanced CNN architecture tailored for accurate semantic segmentation tasks in computer vision. The model employs an encoder-decoder structure with atrous separable convolutions [46]. This design enables DeepLabv3+ to effectively capture multi-scale contextual information while preserving fine spatial details.
5) LinkNet: LinkNet is a CNN architecture specifically designed for efficient semantic segmentation. It introduces a novel encoding path, featuring residual-like connections called "link" connections, to preserve spatial information effectively [49]. The link connections enable seamless integration of high-level features from deeper layers with low-level features from shallower layers, facilitating accurate segmentation.
Through meticulous evaluation, we investigated the numerical results of each model's inference on the test set, enabling a comprehensive comparison of their performance in accurately segmenting our classes of interest.

B. Results: RGB
This subsection outlines the preprocessing steps applied to the RGB data and provides a numerical evaluation of the aforementioned DL models. By using the existing samples to create new synthetic data variants, data augmentation addresses the challenges of limited data availability and improves the generalization ability of DL models to unseen data [50]. Albumentations [51], an image augmentation library for Python, was employed to apply various transformations to both RGB images and their corresponding masks. These transformations fall into two categories:
• Spatial-level transformations (applied to both image and mask simultaneously):
- Vertical flip
- Horizontal flip
- 40 degrees clockwise rotation
- 40 degrees anti-clockwise rotation
- RGB channel color shifting with a 25-shift limit for the red, green, and blue channels
- Transpose
- Shift scale rotate
The following part contains two segmentation experiments: the first uses the 'General' ground truth set, and the second uses the 'Monoseg' ground truth. The five benchmark segmentation models described above are used for the semantic segmentation of the RGB PCB images.
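The central constraint in the augmentation step above is that a spatial transformation must be applied identically to an image and its mask, whereas a color shift changes only the image. The following is a minimal NumPy sketch of that idea; it is not the Albumentations pipeline used in this work, and the function names, the toy array sizes, and the fixed per-channel shift values are illustrative assumptions.

```python
import numpy as np


def flip_pair(image, mask, axis):
    """Flip an (H, W, 3) image and its (H, W) mask along the same axis."""
    return np.flip(image, axis=axis), np.flip(mask, axis=axis)


def transpose_pair(image, mask):
    """Swap height and width for both the image and the mask."""
    return image.transpose(1, 0, 2), mask.T


def rgb_shift(image, shifts):
    """Add a per-channel offset to a uint8 image; the mask stays untouched."""
    shifted = image.astype(np.int16) + np.asarray(shifts, dtype=np.int16)
    return np.clip(shifted, 0, 255).astype(np.uint8)


# Toy example: a 4x6 image with one labelled pixel in the top-left corner.
image = np.zeros((4, 6, 3), dtype=np.uint8)
mask = np.zeros((4, 6), dtype=np.uint8)
image[0, 0] = (200, 100, 250)
mask[0, 0] = 1

h_img, h_mask = flip_pair(image, mask, axis=1)  # horizontal flip
t_img, t_mask = transpose_pair(image, mask)     # transpose
c_img = rgb_shift(image, shifts=(25, -25, 25))  # shifts within a limit of 25
```

After the horizontal flip, the labelled pixel and its color move together to the top-right corner, which is the property that keeps image and ground truth aligned.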
While alternative hyperparameters were explored, the set presented in table III performed robustly across all models without overfitting. Training at a higher spatial image resolution (1280x1280x3) yielded improved results, but out-of-memory errors prevented its application across all five models. Additionally, reducing the batch size below 8 degraded performance, even at higher spatial resolutions.
Note: different hyperparameters were tested, resulting in varied outcomes for different models, mostly biased and overfitted performance. The chosen parameters strike a balance, showing good performance without overfitting for any of the models.
a) 'General' segmentation evaluation: As reported in table IV, the DeepLabv3+ and Attention Unet models stand out as the most effective of the five models, exhibiting strong performance across precision, recall, and F1 score for all classes, which indicates a robust overall segmentation capability. Figure 9 presents the predictions of DeepLabv3+ on some of the test images.
The quality of the segmentation prediction can be assessed by comparing the predicted masks to the ground truth masks. A good segmentation prediction should have a high overlap with the ground truth mask and, at the same time, be free of noise and artifacts.
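The overlap between a predicted mask and its ground truth can be quantified per class with pixel-wise precision, recall, and F1 score, the metrics reported in the tables, together with intersection over union (IoU). The sketch below shows one common way to compute them; the function name and the toy masks are hypothetical, and the evaluation code released with the dataset may differ.

```python
import numpy as np


def class_scores(pred, truth, class_id):
    """Pixel-wise precision, recall, F1 and IoU for a single class label."""
    p = pred == class_id
    t = truth == class_id
    tp = int(np.logical_and(p, t).sum())    # predicted and present
    fp = int(np.logical_and(p, ~t).sum())   # predicted but absent
    fn = int(np.logical_and(~p, t).sum())   # present but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# Toy masks: 0 = background, 1 and 2 = two classes of interest.
truth = np.array([[1, 1, 0],
                  [0, 2, 2]])
pred = np.array([[1, 0, 0],
                 [1, 2, 2]])

scores_1 = class_scores(pred, truth, class_id=1)
scores_2 = class_scores(pred, truth, class_id=2)
```

In this toy case class 2 is segmented perfectly, while class 1 has one hit, one false alarm, and one miss, giving a precision and recall of 0.5 and an IoU of one third.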
From figure 9 it can be seen that the DeepLabv3+ model performs well on the three classes, with a high overlap between the predicted and ground truth masks. However, there are some cases where the model fails. In more detail:
• The model performs very well on the 'IC' class, with a high overlap between the predicted and ground truth masks.
• The model also performs well on the 'Capacitor' class, but in some cases it misses some of the 'Capacitor' pixels, e.g., the last PCB (bottom right).
• The model performs worst on the 'Connectors' class, missing some of the 'Connectors' pixels, e.g., the third PCB from the left.
• The model tends to make more mistakes in images with complex backgrounds.
Overall, DeepLabv3+ is a promising model for PCB segmentation. However, no model is perfect, and there will always be cases where the model makes mistakes, especially in class-imbalanced scenarios such as those in PCB-Vision.
b) 'Monoseg' segmentation evaluation: Table V demonstrates the performance of the five segmentation models on the 'Monoseg' ground truth annotations.
Based on table V, DeepLabv3+ exhibits strong overall performance, excelling in both precision and recall.This

C. Results: HSI
In the context of hyperspectral data, several preprocessing steps are undertaken to enhance the quality and relevance of the dataset. These steps are crucial for ensuring that the hyperspectral information is effectively leveraged for subsequent model training.
1) Data normalization: The spectra in the hyperspectral data are normalized using information from the dark acquisition and the white reference panel. This normalization brings the spectral values into the standardized range of zero to one.
2) Data limiting: After normalization, some values might still fall outside the expected range due to factors like noise or the presence of materials with higher reflectance than the white reference panel, e.g., shiny metals (heat sinks) on the PCB surface. To address this, values exceeding the range are limited to 1.0, while negative values are set to 0.0.
3) Train, validation, test split: Given the imbalanced nature of the classes in the hyperspectral data, a manual split is performed to ensure a balanced representation in the training set. This is particularly important to mitigate the impact of the amplified class imbalance in HSI on model training. As stated in the statistics, the dataset is divided into training (56%), validation (5%), and test (39%) sets.
Results are split into two main categories depending on the data type. In the first, the dimensionality of the hyperspectral data cubes is reduced using Principal Component Analysis (PCA), keeping the first three components. In the second, patches of the raw hyperspectral cubes are used. For each data type we ran two segmentation experiments, one with the 'General' ground truths and one with the 'Monoseg' ground truths.
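The normalization and limiting steps above can be sketched as follows. The text states only that the dark acquisition and the white reference are used; the formula below, (raw - dark) / (white - dark), is the common flat-field reflectance calibration and is assumed here rather than taken from this work. The function name, the per-band reference vectors, and the toy values are likewise illustrative.

```python
import numpy as np


def normalize_cube(raw, dark, white, eps=1e-12):
    """Convert a raw (H, W, B) cube to reflectance limited to [0, 1].

    dark and white are per-band references of shape (B,), or any shape
    that broadcasts against raw.
    """
    raw = np.asarray(raw, dtype=np.float64)
    dark = np.asarray(dark, dtype=np.float64)
    white = np.asarray(white, dtype=np.float64)
    span = np.maximum(white - dark, eps)      # guard against division by zero
    reflectance = (raw - dark) / span         # assumed flat-field formula
    return np.clip(reflectance, 0.0, 1.0)     # data limiting step


# Toy cube with one row, two pixels, and two bands.
dark = np.array([10.0, 10.0])
white = np.array([110.0, 210.0])
raw = np.array([[[60.0, 110.0],    # mid-range pixel
                 [5.0, 250.0]]])   # below dark in band 0, above white in band 1

reflectance = normalize_cube(raw, dark, white)
```

The second pixel shows why limiting is needed: without clipping it would map to -0.05 and 1.2, as happens for noisy readings or for surfaces brighter than the white panel.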
1) HSI - PCA: Our hyperspectral cubes carry both spectral and spatial features. We therefore extracted the spectral features using PCA and the spatial features using the benchmark segmentation models, in an attempt to combine the merits of the two methods. The first three principal components were chosen since they cover up to 99% of the variance in the data, as can be seen in figure 13.
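As an illustration of this reduction, the sketch below projects a single cube onto its leading principal components and reports the fraction of variance each one explains. It uses a plain SVD on synthetic data; the function name, the cube size, and the three-source synthetic mixture are assumptions made for the example, not the implementation or data of this work.

```python
import numpy as np


def pca_reduce(cube, n_components=3):
    """Project an (H, W, B) cube onto its leading principal components."""
    h, w, b = cube.shape
    pixels = cube.reshape(-1, b).astype(np.float64)   # one row per pixel
    centered = pixels - pixels.mean(axis=0)
    # Rows of vt are the principal axes; squared singular values are
    # proportional to the variance captured along each axis.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = centered @ vt[:n_components].T
    return scores.reshape(h, w, n_components), explained[:n_components]


# Synthetic 16x16 cube with 20 bands, mixed from three latent sources
# plus a small amount of noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(256, 3))
mixing = rng.normal(size=(3, 20))
noise = 0.01 * rng.normal(size=(256, 20))
cube = (latent @ mixing + noise).reshape(16, 16, 20)

reduced, ratio = pca_reduce(cube, n_components=3)
```

Because the synthetic spectra are built from three sources, the first three components carry almost all of the variance, which mirrors the behavior used above to justify keeping three components. For full-size cubes a covariance-based or incremental solver is usually cheaper than an SVD over all pixels.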
To augment the training set, we applied eight spatial-level augmentation techniques, increasing the diversity of the inputs to the deep learning segmentation models. These augmentations comprised clockwise and counterclockwise rotation, vertical and horizontal translation (positive and negative), and vertical and horizontal flipping. This resulted in a total of 126 PCA data samples for training. The best-performing hyperparameters, which were used to train the segmentation models on the PCA data, are outlined in table VI.
a) PCA 'General' segmentation: Table VII shows the numerical evaluation of the five benchmark models on the PCA data with the 'General' segmentation ground truth. From table VII we conclude that, as in the RGB case, DeepLabv3+ exhibits strong performance across precision, recall, and F1 score for all classes, indicating a robust overall segmentation capability.
2) HSI - Patches: In this part, patches of raw data were fed to the DL models in an attempt to construct an end-to-end model capable of learning both the spectral and the spatial features relevant to the segmentation of the four classes. Ideally, the entire hyperspectral (HS) cube would have been used; however, its large size led to memory limitations. To address this, the spatial dimensions were reduced while the spectral characteristics were preserved. The largest feasible patch size, 128 by 128, was adopted with 214 spectral bands of the HS cube; the first ten bands were discarded because of their high noise content. Larger patch sizes could be accommodated by reducing the batch size, but experiments with a batch size of 4 or below resulted in diminished model performance. The hyperparameters employed for this patch-based approach are outlined in table IX.
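The patch preparation described above can be sketched as follows: the first ten bands are dropped, leaving 214 of the 224, and the cube is cut into 128 by 128 tiles together with its mask. The text does not state the stride or how image borders are handled, so the non-overlapping tiling and the discarding of incomplete border tiles are assumptions, as are the function name and the toy cube size.

```python
import numpy as np


def extract_patches(cube, mask, patch=128, drop_bands=10):
    """Tile an (H, W, B) cube and its (H, W) mask into square patches.

    The first drop_bands bands are removed. Tiles do not overlap, and
    incomplete tiles at the right and bottom borders are skipped.
    """
    cube = cube[:, :, drop_bands:]
    h, w, _ = cube.shape
    cubes, masks = [], []
    for r in range(0, h - patch + 1, patch):
        for c in range(0, w - patch + 1, patch):
            cubes.append(cube[r:r + patch, c:c + patch])
            masks.append(mask[r:r + patch, c:c + patch])
    return np.stack(cubes), np.stack(masks)


# Toy cube in which every pixel stores its band index, so the bands that
# survive are easy to check. The spatial size 256x300 is arbitrary.
cube = np.broadcast_to(np.arange(224, dtype=np.uint8), (256, 300, 224))
mask = np.zeros((256, 300), dtype=np.uint8)

patches, patch_masks = extract_patches(cube, mask)
```

With a 256 by 300 input this yields four patches of shape 128 x 128 x 214, and the 44 leftover columns on the right are dropped. Overlapping tiles or padding would be reasonable alternatives when border pixels matter.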
A comparison of the hyperparameters used for RGB and HSI PCA training (tables III and VI) with those in table IX reveals a notable difference in the learning rate. In the HSI patch training scenario, the learning rate is set to a lower value of 1e-5. This adjustment is supported empirically by the observed results and is also motivated by the substantial increase in the number of training samples. The dataset size has significantly expanded from around a hundred
Figure 14 shows the segmentation predictions of U-Net on some test HSI patches of the three classes.
Unet performs well on the three classes, with a high overlap between the predicted and ground truth masks, although there are some cases in which the model makes mistakes. Furthermore:
• The fact that the model achieves good segmentation performance on hyperspectral data, which are much more information-rich and complex than RGB images, is significant. This suggests that Unet has the potential to be used for a wide range of PCB HSI segmentation tasks.

IV. DISCUSSION
In this section, we highlight the challenges that come with performing segmentation on RGB images and multiscene HS data cubes.

A. RGB Segmentation
The segmentation performance on the RGB data leaves room for improvement. Increasing the dataset size beyond the current 53 scenes (400 after augmentation) is recommended, since more data generally improves model generalization; PCB-Vision will be regularly updated with more scenes. Additionally, exploring more sophisticated deep learning models, for example pre-trained models and large vision models (LVMs), could improve the segmentation results. The current benchmark models serve as a baseline, and further experimentation with larger datasets and advanced models is warranted.

B. HSI Segmentation
Segmenting multiscene hyperspectral data poses unique challenges due to the abundant information in hyperspectral cubes and their lower spatial resolution compared to RGB images. Achieving high-performance segmentation requires models capable of extracting both spectral and spatial features efficiently. Several insights emerge from the HSI segmentation experiments:
• Incompatibility of RGB models on HSI data: Directly applying RGB segmentation benchmark models to HSI data yields suboptimal results. The substantial increase in the number of channels (214 in HS compared to 3 in RGB) introduces more parameters in the input layers, which must be optimized with the same number of PCB scenes, making good generalization harder to achieve. Dimensionality reduction is an alternative, but the way PCA is applied is worth consideration.
• PCA implementation challenges: The memory limitation that motivates patching can instead be addressed by reducing the dimension of the HS data with a dimensionality reduction technique such as PCA. The first few principal components can capture more than 90% of the variance in the data, making them suitable for the task. Usually, methods like PCA are applied to the data cube first, and the result is then used to train the classification model. This sequence extracts the spectral features with a dimensionality reduction model first and the spatial features with a segmentation model second, combining the strengths of both towards better segmentation performance. However, in our application there are multiple data cubes in the training, validation, and test sets. Therefore, two questions arise:
1) Should PCA be fitted from scratch on each HS cube in the training and test sets, with the resulting principal components used as the segmentation model input?
2) Should all the training cubes be used to fit one PCA, which is then used to transform the whole dataset?
Applying PCA to each hyperspectral cube independently introduces higher variance in the results due to the diverse compositions of different cubes, thus posing a harder generalization challenge for the segmentation models. Alternatively, fitting a single PCA on all cubes increases homogeneity but demands a PCA method compatible with large multiscene hyperspectral datasets.
• Impact of undesired background: We would again like to highlight the negative effect that undesired background has on the processing pipeline. An undesired background in the HSI skews the calculations across the pipeline:
- Dimensionality reduction impact: The presence of undesired pixels in hyperspectral data significantly influences the principal component calculations, particularly when their quantity surpasses that of the desired pixels (i.e., classes of interest). This results in suboptimal principal components and an imperfect representation of the target classes [44].
- Segmentation challenges: Undesired background pixels complicate segmentation, particularly when they are noisy or share similar spectra with one of the classes of interest. For example, conveyor belt pixels have spectra akin to 'IC' pixels because both are black polymers, so the model struggles to capture the 'IC' class, potentially leading to suboptimal convergence [44].
To address these challenges, a comprehensive solution that applies background masking throughout the entire processing pipeline is yet to be implemented. By effectively masking out the undesired background pixels, the adverse effects on dimensionality reduction techniques and segmentation DL models can be mitigated, promoting more accurate and robust results.
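One way to explore the second PCA question together with background masking is to accumulate the band statistics of the foreground pixels cube by cube, so that no more than one cube is held in memory, and then derive a single set of components. The sketch below is a hedged illustration of that idea and not a method implemented in this work; the class name, its methods, and the random toy cubes are hypothetical. Accumulating raw sums is simple but less numerically stable than dedicated incremental PCA implementations.

```python
import numpy as np


class SharedPCA:
    """PCA fitted once on foreground pixels pooled from several cubes."""

    def __init__(self, n_components=3):
        self.n_components = n_components
        self.count = 0
        self.sum = None     # running sum of spectra, shape (B,)
        self.outer = None   # running sum of outer products, shape (B, B)

    def partial_fit(self, cube, foreground):
        """Add one (H, W, B) cube; foreground is a boolean (H, W) mask."""
        pixels = cube[foreground].astype(np.float64)   # (N, B)
        if self.sum is None:
            bands = pixels.shape[1]
            self.sum = np.zeros(bands)
            self.outer = np.zeros((bands, bands))
        self.count += pixels.shape[0]
        self.sum += pixels.sum(axis=0)
        self.outer += pixels.T @ pixels
        return self

    def finalize(self):
        """Turn the accumulated sums into a mean and principal axes."""
        self.mean = self.sum / self.count
        cov = self.outer / self.count - np.outer(self.mean, self.mean)
        eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalues
        order = np.argsort(eigvals)[::-1][: self.n_components]
        self.components = eigvecs[:, order].T    # (n_components, B)
        return self

    def transform(self, cube):
        """Project every pixel of a cube with the shared components."""
        return (cube - self.mean) @ self.components.T


# Two toy training cubes with different foreground masks.
rng = np.random.default_rng(0)
cube_a = rng.random((8, 8, 12))
cube_b = rng.random((8, 8, 12))
mask_a = np.zeros((8, 8), dtype=bool)
mask_a[2:6, 2:6] = True            # only the centre counts as PCB
mask_b = np.ones((8, 8), dtype=bool)

pca = SharedPCA(3).partial_fit(cube_a, mask_a).partial_fit(cube_b, mask_b).finalize()
projected = pca.transform(cube_a)
```

Every cube, including validation and test cubes, is then projected with the same mean and components, so the segmentation model sees a consistent feature space, and background pixels do not influence the fitted components.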

V. FUTURE CHALLENGES
The above discussion highlighted empirical challenges that demand attention for achieving higher generalized performance in multiscene hyperspectral data analysis. The first step is data masking throughout the dataset, using the provided PCB masks to mitigate the effects of the undesired backgrounds. Subsequent efforts involve the development of a multi-data type processing pipeline that integrates RGB images with their hyperspectral pairs to overcome challenges unique to each data type. Additionally, we plan to continuously expand our PCB-Vision dataset by incorporating additional PCB scenes from diverse PCB sources and scanning sensors. This will help to better capture the variability in PCBs and improve the generalization ability of the models to unseen data.
The primary focus is the refinement of segmentation models, with an emphasis on enhancing generalization within the processing pipeline to ensure fast and reliable performance on unseen HS data. In this pursuit, additional research initiatives aim to provide pre-trained backbones tailored for segmentation and super-resolution models, expanding the scope of this work towards comprehensive advancements in hyperspectral data analysis.

VI. CONCLUSION
In this work, we present a significant step forward in PCB analysis and in generalized hyperspectral data processing for optimized E-waste recycling aligned with a circular economy. We provide 'PCB-Vision', a pioneering RGB-HSI benchmark dataset of 53 different printed circuit boards (PCBs), offering essential insights into E-waste composition and setting the data basis for solution developments in sustainable product design and recycling strategies. Alongside the description of the acquisition setup and the data, we conducted an intensive statistical analysis of PCB-Vision. Moreover, we performed comprehensive experimentation with five SOTA segmentation models on both data types in PCB-Vision and highlighted the challenges and complexities inherent in classifying unseen hyperspectral data, which is one of the main motivations for presenting the PCB-Vision benchmark dataset. The reported results not only contribute valuable insights into the performance of benchmark models but also emphasize the need for models with robust generalization capabilities. By addressing the detection of PCB elements towards object-based recycling, PCB-Vision supports the United Nations (UN) "Climate Action" SDG 13 and promotes innovation in electronic waste management, contributing to sustainable industrial practices outlined in SDG 9 "Industry, Innovation And Infrastructure". As we look ahead, the outlined future challenges and proposed approaches aim to encourage further advancements in hyperspectral imaging and PCB analysis, fostering continued progress in this critical domain with the aim of optimized recycling. To democratize scientific knowledge and promote inclusive innovation, we will openly release the benchmark dataset, ground truths, and accompanying baseline codes at https://github.com/hifexplo/PCBVision, enabling researchers from diverse fields to explore and contribute to the development of advanced E-waste recycling methodologies. This fosters resource collaboration and an open-access environment that aligns with the principles of the Green Deal. This paper, therefore, represents a crucial step for achieving interconnected SDGs, fostering responsible consumption, reducing environmental impact, and advancing industrial sustainability.

Fig. 1: PCB-Vision setup to results: (a) RGB images and HS data cubes are acquired, (b) data normalization and preprocessing, (c) data preparation for ML model pipeline, (d) segmentation results.
multimodal (HSI and RGB) dataset consisting of 53 high spectral resolution hyperspectral data cubes along with their 53 high spatial resolution RGB images of 53 different PCBs scanned in industrial-like scenarios, coupled with two different segmentation annotations. In summary, PCB-Vision consists of:
1) 53 hyperspectral cubes of 53 different PCBs scanned in the visible and near-infrared range with 224 bands.
2) 53 RGB images of the 53 PCBs.
3) 'General' pixel-wise classification ground truth of three classes of interest ('IC', 'Capacitor', 'Connectors') for both RGB and HS data cubes.
4) 'Monoseg' pixel-wise classification ground truth of three classes of interest ('IC', 'Capacitor', 'Connectors') for both RGB and HS data cubes.
5) 53 background/foreground masks for separating the PCB from the image background.

Fig. 3: Four spectra from the PCB 1 HSI, captured at randomly selected points on the surfaces of our classes of interest and on the conveyor belt background: 'Conveyor belt' (orange), 'IC' (red), 'Capacitor' (green), and 'Connectors' (blue).

Fig. 7: PCB types based on the classes they contain.

Fig. 8: Class distribution for each PCB in each PCB class category.

Fig. 9: Visual comparison of DeepLabv3+ predictions and ground truths on eight PCBs from the test set.

Fig. 14: Visual comparison of Unet predictions and ground truths on several HSI test patches.

TABLE I :
A systematic review of publicly available PCB datasets

TABLE II :
Acquisition cameras parameters

TABLE III :
RGB images training hyperparameters Table IV demonstrates the performance of the five models on the 'General' ground truths annotations.Based on table IV it can be con-

TABLE IV :
Models' evaluation metrics on the RGB test set 'General' ground truth.

TABLE V :
Models' evaluation metrics on the RGB test set 'Monoseg' ground truth.

TABLE VI :
PCA data training hyperparameters

TABLE VII :
Models' evaluation metrics on the HSI PCA test set 'General' ground truth.
Table VIII demonstrates that Attention U-Net and DeepLabv3+ exhibited comparable performances across multiple classes, except for the 'Connectors' class, where DeepLabv3+ achieved superior performance.

TABLE VIII :
Models' evaluation metrics on the HSI PCA test set 'Monoseg' ground truth.

TABLE IX :
Patches training hyperparameters

TABLE X :
Models' evaluation metrics on the HSI patches test set 'General' ground truth.
HSI Patches General Segmentation: Table X demonstrates the results of training three DL models on the 'General' ground truth. Three benchmark models were used instead of five because the hyperspectral data patches are incompatible with the architectures of the DeepLabv3+ and LinkNet models. From table X it can be concluded that ResUnet and Unet perform similarly across multiple classes, with ResUnet having an edge only on 'IC', whereas Unet performs better on the 'Capacitor' and 'Connectors' classes.

TABLE XI :
Models' evaluation metrics on the HSI patches test set 'Monoseg' ground truth.