Review and Alignment of Domain-Level Ontologies for Materials Science

The growing complexity and interdisciplinary nature of Materials Science research demand efficient data management and exchange through structured knowledge representation. Domain-Level Ontologies (DLOs) for Materials Science have emerged as a valuable tool for describing materials properties, processes, and structures, enabling effective data integration, interoperability, and knowledge discovery. However, the harmonization of DLOs, and, more generally, the establishment of fully interoperable multi-level ecosystems, remains a challenge due to various factors, including the diverse landscape of existing ontologies. This work provides, for the first time in the literature, a comprehensive overview of the state of the art of DLOs for Materials Science, reviewing more than 40 DLOs and highlighting their main features and purposes. Furthermore, it presents an alignment methodology that includes both manual and automated steps, exploits the capability of Top-Level Ontologies (TLOs) to promote interoperability, and revolves around the engineering of FAIR standalone entities acting as minimal data pipelines (“bridge concepts”). A proof of concept is also provided. The aim of this work is to contribute to the establishment of a unified ontology framework for Materials Science, facilitating more effective data integration and fostering interoperability across Materials Science subdomains.


I. INTRODUCTION
Materials Science is an interdisciplinary domain that encompasses the study of the properties, processing, and applications of various materials, including metals, ceramics, polymers, and composites. The rapid advancement of Materials Science research, coupled with the escalating requirement for streamlined data management and exchange, has resulted in an increasing demand for structured knowledge representation in the form of ontologies.
The associate editor coordinating the review of this manuscript and approving it for publication was Ali Kashif Bashir.
In the context of information science and computer science, an ontology is a formal, explicit representation of shared knowledge, taking the form of a set of classes, relationships and axiomatic constraints. It serves as a tool for structuring data, enabling efficient information retrieval, data integration, interoperability, and knowledge discovery. In general, it is common practice in the literature to classify ontologies hierarchically depending on the generality of the concepts they include, i.e. their domain of application. The “levels” so individuated have vague boundaries, yet the presence of borderline cases does not undermine the practical utility of the criterion. By definition [8], a Domain-Level Ontology (DLO henceforth), or simply a domain ontology, is an ontology that focuses on concepts, properties, and relationships relevant to a particular area of knowledge or field of study. A DLO can be either a specialized module of an upper-level ontology, or a standalone ontology targeting a specific domain (e.g. additive manufacturing, composite materials). While they are intended to retain a measure of neutrality with respect to specific use cases, these ontologies are developed to cater to the informational needs of the domain they revolve around; as such, they hinge on more fine-grained concepts, and they are not useful beyond their respective domains.
Ontologies at this level are often lightweight, inasmuch as their classes are not thoroughly conceptualized formally, and often lack proper informal characterizations (i.e. documentation) as well. This is chiefly due to the fact that these ontologies are engineered to fit the practical needs of a specific community that takes a certain jargon for granted; hence, interoperability is made to rest on common ground rather than semantic transparency. In fact, lower-level ontologies might not necessarily take the form of fully-fledged axiomatic theories; instead, they tend to offer only a basic taxonomical organization of the relevant domain (focusing on hierarchical relations), and minimal terminological services.
Conversely, a Top-Level Ontology (TLO henceforth) provides very general concepts that can be used across various domains. TLOs usually offer systematic and inference-supporting conceptual schemata resting on few principles; thus, they can act as a framework to ensure semantic consistency and are frequently employed to harmonize, or ground, Domain-Level Ontologies (see [8] again).
DLOs for Materials Science provide a structured and standardized vocabulary for describing materials properties, processes, and structures, enabling effective data integration, interoperability, and knowledge discovery. However, DLOs frequently grapple with the absence of generally accepted definitions/elucidations for terms, and often support diverse interpretations, also due to their lack of generality. Furthermore, even ontologies whose scope covers the same, or overlapping, domains can be widely different: pluralism is a direct result of stakeholders' heterogeneous desiderata, resulting in a diverse landscape. This poses well-known challenges for their harmonization, compromising data exchanges among ontologies as well as their assimilation within the frameworks offered by Top-Level Ontologies.
In the context of ontologies, harmonization refers to the process of resolving inconsistencies to allow for efficient collaboration and interoperability. At the semantic and semiotic level [9], this involves ensuring the comparability of the concepts and relations employed by different ontologies as well as the removal of potential sources of ambiguity and friction, up to the achievement of uniformity with respect to formal and informal characterizations. This fosters coherence and unity in the knowledge representation, and satisfies a precondition for the consistent usage of ontologies in practical scenarios. Ontology harmonization is often understood in terms of alignments, i.e. formal connections among entities (like classes, relations, and instances) in different ontologies, up to correspondence/equivalence. These links can be used to transfer knowledge from one ontology to another or to merge ontologies together. Ontology alignment can be a core step in ontology integration [10], which involves merging multiple ontologies into a single, coherent framework, possibly repairing inconsistencies, resolving semantic conflicts, and filling in possible gaps. The “new” ontology resulting from the integration process contains the knowledge of all the original ontologies, and can be used in their place, making it easier to manage and navigate the associated information. However, notably, the ontologies resulting from integration processes might not be strictly compatible with the ones they are based on, which might be the ones de facto employed in the field. Consequently, alignment can often be more effective from a practical point of view than stricter forms of harmonization involving ontology integration.
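To make the notion of alignment more concrete, the following minimal sketch represents an alignment as a list of correspondences between entities of two ontologies and uses it to rewrite an entity from one vocabulary into the other. The ontology prefixes `a:` and `b:`, the entity names, and the relation labels are hypothetical and serve only as an illustration; they are not taken from any of the ontologies reviewed in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correspondence:
    source: str    # entity in the first ontology (hypothetical prefixed name)
    target: str    # entity in the second ontology (hypothetical prefixed name)
    relation: str  # "equivalent", "narrower" or "broader" (source w.r.t. target)


# Hypothetical alignment between two invented ontologies "a" and "b".
alignment = [
    Correspondence("a:TensileTest", "b:TensileTesting", "equivalent"),
    Correspondence("a:Steel", "b:Metal", "narrower"),
]


def translate(entity, alignment):
    """Return (target, lossless) pairs for an entity of the source ontology.

    Only an equivalence correspondence is treated as a lossless rewrite;
    narrower/broader correspondences are returned but flagged as lossy.
    """
    return [
        (c.target, c.relation == "equivalent")
        for c in alignment
        if c.source == entity
    ]


print(translate("a:TensileTest", alignment))
print(translate("a:Steel", alignment))
```

Keeping the relation type explicit reflects the point made above: correspondences range "up to" equivalence, and weaker links still carry information but do not license a one-to-one substitution.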
The purpose of this work is to review DLOs for Materials Science and propose (and showcase) a methodology for their alignment with one another and with salient TLOs, focusing on semantic and semiotic aspects. The first contribution of this paper consists of an overview of the existing DLOs for Materials Science, including details concerning their main features and purposes, such as the language in which their latest version is published, the state of their update, their actual reasoning (inference) capacity, their use in Materials Science, and whether they are based on or aligned with TLOs. This should increase said ontologies' findability and reusability, among other things, going some way towards the establishment of a defined ontology library for Materials Science.
The focus then turns to mediated alignments, taking inspiration from the fact that TLOs are often successfully employed to facilitate the establishment of connections among DLOs, and taking into account the results of the coverage analysis. A harmonization approach specifically tailored to the establishment of ecosystems of interoperable ontologies is thus presented: a key aspect in this approach concerns the individuation and informal definition of concepts capable of supporting salient, informative, formal connections across the network, i.e. “bridge concepts”. The analysis of Materials Science ontologies, and of the TLOs they are based on, can be considered a prerequisite for both the individuation of bridge concept candidates and bridge concept engineering.
Expounding the philosophical principles underlying bridge concepts is beyond the scope of this paper; nonetheless, given the aims of this work, a brief overview will be offered, focusing on practical aspects. Bridge concepts can be understood as degenerate content ontology design patterns [11], standalone entities engineered independently of any particular ontological framework and then employed as data pipelines. Formal connections are established between bridge concepts and ontology entities belonging to the knowledge representation artifacts to be harmonized; thus, the seamless exchange of information across varied ontological structures is mediated by the proposed tools, which operate as hubs in hub-and-spoke structures. Bridge concepts' inherent modularity and ontology-neutrality have been deemed a precondition for a sustainable approach to the harmonization of a plurality of diverse ontologies, possibly operating at different levels of abstraction and expressing different worldviews. It is also worth pointing out that bridge concepts are FAIR by design: they are jointly engineered by teams of domain experts and ontologists to ensure accessibility; their informal characterization in natural language (elucidation) is tailored for ontology usage, and the rationale underlying each engineering and mapping choice is properly documented.
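The hub-and-spoke idea can be illustrated with a small sketch. It assumes a bridge concept is stored as a plain record holding a label, an elucidation, and one link per ontology; the record layout, the ontology names, and the entities are hypothetical and do not reproduce the template proposed later in the paper.

```python
from itertools import combinations

# Hypothetical bridge concept: a hub with spokes to three invented ontologies.
bridge = {
    "label": "Material",
    "elucidation": "Placeholder elucidation, for illustration only.",
    "links": {
        "onto_a": ("a:Material", "equivalent"),
        "onto_b": ("b:EngineeringMaterial", "narrower"),
        "onto_c": ("c:Substance", "broader"),
    },
}


def mediated_pairs(bridge):
    """List the entity pairs that become connected through the hub.

    The pairs are only *connected*; the semantic relation between the two
    entities still has to be derived from the two spoke relations.
    """
    names = sorted(bridge["links"])
    return [
        (bridge["links"][x][0], bridge["links"][y][0])
        for x, y in combinations(names, 2)
    ]


def mappings_needed(n_ontologies):
    """Mappings to maintain for one concept: via a hub vs. direct pairwise."""
    return {
        "hub": n_ontologies,
        "pairwise": n_ontologies * (n_ontologies - 1) // 2,
    }


print(mediated_pairs(bridge))
print(mappings_needed(10))
```

The second function shows one practical motivation for a hub: the number of mappings to maintain grows linearly with the number of ontologies, whereas direct pairwise mappings grow quadratically.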
In this regard, this work proposes a template for defining bridge concepts, together with a method to extract prima facie effective candidates via the statistical analysis of the terms appearing in a set of relevant DLOs. This analysis can support a subsequent step of semantic analysis aimed at engineering bridge concepts with comprehensible elucidations that are both linked to standards and capable of supporting informative links with concepts from said DLOs. Salient bridge concepts resulting from the aforementioned process are also presented as a proof of concept.
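As a rough illustration of what a terminological, frequency-based candidate extraction could look like, the sketch below counts in how many ontologies each normalized class label occurs and keeps the labels shared by at least a given number of ontologies. This is an assumption-laden simplification rather than the statistical method used in this work: the normalization rule, the threshold `min_ontologies`, and the toy labels are all hypothetical.

```python
import re
from collections import Counter


def normalize(label):
    """Split camelCase, lowercase, and collapse separators into single spaces."""
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", label)
    return " ".join(re.split(r"[\s_\-]+", spaced.lower())).strip()


def candidate_terms(ontology_labels, min_ontologies=2):
    """Rank normalized labels by the number of ontologies they appear in."""
    counts = Counter()
    for labels in ontology_labels.values():
        # A set ensures each ontology contributes at most once per term.
        counts.update({normalize(label) for label in labels})
    shared = [(term, n) for term, n in counts.items() if n >= min_ontologies]
    return sorted(shared, key=lambda pair: (-pair[1], pair[0]))


# Toy class labels from three invented ontologies.
labels = {
    "onto_a": ["Material", "TensileTest", "Process"],
    "onto_b": ["material", "tensile_test", "Specimen"],
    "onto_c": ["Material", "Process", "Device"],
}

print(candidate_terms(labels))
```

A purely lexical count of this kind only suggests candidates; identical labels may denote different concepts, which is why a subsequent semantic analysis by experts remains necessary.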
By providing a comprehensive analysis of the state of the art of DLOs in Materials Science and offering a methodology for their harmonization based on bridge concepts, this work aims to contribute to the establishment of a unified yet pluralistic ontology framework for Materials Science; such a framework is expected to facilitate more effective data integration and interoperability across Materials Science subdomains, promoting collaborative research, innovation, and the accelerated discovery of new materials and their applications.
This work is carried out within the context of the OntoCommons H2020 project [12], a European initiative aiming to establish a common foundation for the development, harmonization, and application of ontologies in materials and manufacturing research.
This paper is structured as follows. In Section II, an overview of the state of the art concerning Materials Science ontologies is provided. A review of the relevant DLOs is offered; given TLOs' role in facilitating interoperability, the ontologies are grouped according to the TLOs they are aligned with, if any. The section also covers the most salient TLOs and ontology hubs related to Materials Science, as well as a discussion of relevant tools, approaches and initiatives found in the literature. Section III presents the methodology for analyzing existing ontologies' coverage for Materials Science. Additionally, it provides a brief outline of a harmonization approach based on bridge concepts. The coverage analysis and the harmonization approach are implemented in the following two sections, respectively. Section IV offers a partition of Materials Science into subdomains, and implements the coverage analysis of Materials Science DLOs, situating the latter in the landscape based on a statistical method resting on terminological grounds. Conclusions in the form of limitations and opportunities for DLO harmonization are presented. Section V implements the approach for harmonizing DLOs based on bridge concepts, offering more details in the process. The section includes an automated statistical approach for the individuation of effective candidate bridge concepts, a template for defining bridge concepts, and the description of a process of semantic analysis carried out by experts, which can be used to derive salient bridge concepts from a set of relevant DLOs and standards. The described methodology is applied to the reviewed DLOs, and a sample bridge concept is presented as the output (Section V-C); in light of that, the methodology is reassessed, focusing on matters pertaining to its role in ontology harmonization, practices to ensure efficient reuse of the results, and potential future improvements. Finally, in Section VI conclusions are drawn.

II. STATE OF THE ART: DOMAIN OVERVIEW, RELATED TLOS AND HARMONIZATION ATTEMPTS
The sole library/repository specifically targeting Materials Science ontologies is MatPortal (https://matportal.org/), which covers a total of 28 ontologies. IndustryPortal (http://industryportal.enit.fr/), also part of the OntoPortal (https://ontoportal.org/) initiative, includes Materials Science ontologies as well; however, it does not revolve specifically around that domain. Likewise, to the best of the authors' knowledge, the domain has not previously been subjected to a systematic coverage analysis. Various TLOs and ontology hubs have been put forward as frameworks to facilitate the harmonization of ontologies pertaining to Materials Science and to support the creation of new ontologies already harmonized with a given ecosystem.
However, perhaps surprisingly, there seem to have been no systematic attempts at harmonizing the various subdomains; in fact, there is also a significant scarcity of alignments between overlapping ontologies (even pairwise), despite the potential benefits. This might be in part due to the lack of a systematic analysis improving ontologies' findability and reusability.
120374 VOLUME 11, 2023 Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.

A. EXISTING TLOs AND HUBS RELATED TO MATERIALS SCIENCE
TLOs are often employed as tools to create new consistent and well-organized ontologies. DLOs based on (or otherwise aligned to) TLOs are more easily harmonized, and already minimally interoperable; as such, TLOs are often employed as the core of ontology hubs in the attempt to create harmonized networks of ontologies.
Harmonization strategies resting on semantic alignments with a specific TLO can be particularly effective when all the involved ontologies are based on the same TLO; however, the usual challenges immediately arise if the alignments still have to be established. That said, it is possible to establish connections among TLOs to facilitate alignments among ontologies based on different TLOs, a strategy which was also employed in the context of the OntoCommons project [12]. Nevertheless, harmonization strategies based solely on TLOs have limitations when it comes to establishing alignments among DLOs: while they can entail minimal mediated connections, TLOs' role is chiefly that of facilitators and safety checks; in order to establish connections supporting the exchange of meaningful data at the domain level, the standard issues have to be addressed. That said, TLOs' potential should not be underestimated, especially given the possibility of employing mixed harmonization strategies.
A number of TLOs have been deemed especially salient for Materials Science, as a result of a bottom-up coverage analysis, namely: BFO [13], EMMO [14], SIO [15] and SUMO [16]. A detailed description of each of them can be found in the following paragraphs.

1) BASIC FORMAL ONTOLOGY (BFO)
BFO [13] is a top-level ontology that provides a foundational structure consisting of thirty-six classes, including no terms particular to Materials Science. BFO has been employed as a reference by a number of lower-level ontologies, especially in the biomedical domain, but also by relevant hubs, e.g. the Industrial Ontologies Foundry (IOF).
BFO rests on a distinction between universals and particulars; in line with that, it adopts three basic relation types: universal-universal, universal-particular, and particular-particular (where the first relates subtypes to parent types and cannot be time indexed; and the only universal-particular relation is that of instantiation).
BFO's taxonomy hinges on criteria borrowed from analytic philosophy, concerning whether entities have temporal parts (or, alternatively, whether entities have their properties temporally, or atemporally) and relations of ontological dependence, among other things. BFO endorses a (i) realist, (ii) fallibilist, and (iii) adequatist stance: (i) it represents the world rather than language and concepts; (ii) it concedes that our understanding of the world (and specifically of universals) can change in light of new discoveries and, consequently, it is committed to tracking said changes over time; and (iii) it assumes a form of ontological and semantic anti-reductionism on the basis of the parity of (scientific) disciplines.

2) ELEMENTARY MULTIPERSPECTIVE MATERIAL ONTOLOGY (EMMO)
EMMO [14] is an interdisciplinary ontological framework that serves as a platform for applied sciences, specifically designed to cater to Materials Science. Its ultimate purpose is to standardize the representation of materials modeling information, improving interoperability among diverse materials models, data, and software tools.
EMMO combines a sciences-friendly core, comprising formal distinctions that support the qualitative representation of space-time and offer well-defined criteria of identity, with a plurality of “perspectives”, i.e. mutually compatible conceptual schemata augmenting the ontology's expressiveness.
The following are some of EMMO's main perspectives.
• Holistic perspective: this perspective focuses on the dependence relations between an entity, seen as an integrated system (a whole), and its parts, playing functional roles within the system.
• Semiotics perspective: this perspective takes inspiration from Charles S. Peirce's semiotic theory [17]; it is employed, among other things, to ground a nominalistic, metrology-friendly, approach to the representation of properties, by conveying meaning in the establishment of relationships among entities playing the role of sign and semiotic object for an interpreter.
• Physicalistic perspective: this perspective classifies entities by referring to concepts taken from Physics, and, specifically, from the Standard Model of Particle Physics and Materials Science.
• Reductionistic perspective: this perspective supports the analysis of systems at different levels of granularity, facilitating mechanistic approaches and fine-grained descriptions of systems' evolution in time.

3) SEMANTISCIENCE INTEGRATED ONTOLOGY (SIO)
SIO [15] is a comprehensive and versatile ontology framework that supports collaborative, integrative, and translational research in diverse scientific disciplines. SIO is intended to provide a semantic foundation for the representation of complex scientific knowledge, bridging the gap between the heterogeneity of data representation and the need for integrated knowledge in various research areas. SIO shares many similarities with the already cited BFO; comparatively, it adopts a more commonsense-friendly approach when it comes to architecture and labeling, and it is more permissive with respect to the representation of fictional and virtual entities.

4) SUGGESTED UPPER MERGED ONTOLOGY (SUMO)
SUMO [16] is an IEEE-sanctioned Top-Level Ontology created by merging a number of publicly available ontologies into a single cohesive structure. Taking into account its domain ontologies, it is one of the largest free formal ontological resources. It is public, and aligned with WordNet. Like other TLOs, SUMO provides a general framework for the representation of knowledge across various domains, facilitating cross-domain integration and interoperability, and the ontology has been used to support the development and integration of domain-specific ontologies in Materials Science.

B. EXISTING MATERIALS SCIENCE DLOs
This subsection offers an overview of the DLOs developed for Materials Science, as well as those used by its community of stakeholders and practitioners.
The authors surveyed the landscape of DLOs related to Materials Science, enlisting them in the study, taking the available resources listed above as a starting point. Domain experts and community practitioners individually made the initial selections; these decisions were then jointly validated to reduce the risks of false positives and false negatives, as well as to ensure that consistent criteria for inclusion were applied. In the process, information regarding the ontologies was also collected for subsequent analysis.
Table 1 includes the names of the ontologies, their acronyms (when present), their URLs, as well as the TLO they are aligned with (if any). The alignment of the ontologies with a TLO was inferred from the presence of subsumption relationships involving classes of the DLOs and classes of a TLO, along with bibliographic research. The latest public version of each ontology was taken as a reference.
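The subsumption-based part of this check can be sketched as follows. The example assumes that subclass axioms have already been extracted from an ontology as pairs of IRIs, and that a TLO class can be recognized by its IRI prefix; the prefix table is an illustrative assumption to be verified against each TLO's current release, and the DLO namespace `http://example.org/dlo#` is hypothetical.

```python
# Assumed IRI prefixes for two TLOs; verify against the releases actually used.
TLO_NAMESPACES = {
    "BFO": "http://purl.obolibrary.org/obo/BFO_",
    "SIO": "http://semanticscience.org/resource/SIO_",
}


def aligned_tlos(subclass_axioms, dlo_namespace):
    """Return the TLOs whose classes directly subsume a class of the DLO.

    subclass_axioms: iterable of (subclass IRI, superclass IRI) pairs.
    dlo_namespace:   IRI prefix identifying the DLO's own classes.
    """
    found = set()
    for sub, sup in subclass_axioms:
        if not sub.startswith(dlo_namespace):
            continue  # ignore axioms about imported, non-DLO classes
        for name, prefix in TLO_NAMESPACES.items():
            if sup.startswith(prefix):
                found.add(name)
    return sorted(found)


# Hypothetical axioms of an invented DLO.
axioms = [
    ("http://example.org/dlo#Sample", "http://purl.obolibrary.org/obo/BFO_0000040"),
    ("http://example.org/dlo#SteelSample", "http://example.org/dlo#Sample"),
]

print(aligned_tlos(axioms, "http://example.org/dlo#"))
```

A check of this kind only detects direct subsumption under TLO classes; alignments declared in separate mapping files or documented only in publications would still require the bibliographic research mentioned above.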
In what follows, each ontology is briefly described, paying particular attention to characteristics pertaining to the number of classes and relations, as well as their intended domain of application. When available, data related to the annotations of ontology entities has been included, along with a FAIR score coefficient endorsed by the OntoPortal initiative. The FAIR score depends, among other things, on the quantity and quality of annotations and documentation pertaining to an ontology. Relevant projects associated with the development of the ontologies are also addressed, with the aim of providing contextual information.

1) DLOs IN MATERIALS SCIENCE NOT ALIGNED WITH A TLO
a: AMONTOLOGY: ADDITIVE MANUFACTURING ONTOLOGY [18], [19], [20]
The Additive Manufacturing (AM) Ontology is a modular ontology that employs Basic Formal Ontology as its top level. AMO serves as an advanced framework to represent knowledge about additive manufacturing processes, particularly crucial for metal-based AM, where intricate interconnections among process parameters are yet to be fully understood. It comprises three main constituents: AMProcessOntology, ModelOntology, and AMOntology. AMProcessOntology encompasses a collection of entities capturing the specifics of additive manufacturing processes, aiming to untangle the web of interconnected parameters and establish a foundation for reliable process control models. ModelOntology, on the other hand, constitutes modeling concepts encapsulating potentially multi-physics, multi-scale processes. Its role is to standardize terminologies and modeling protocols, thus providing a more intuitive graphical view of these complex relationships. In essence, AMOntology weaves together the entities from AMProcessOntology and ModelOntology, creating a robust knowledge base delineating the characteristics of computational models for AM processes. By unifying these diverse aspects, AMOntology presents an integrated framework for understanding and modeling the AM processes. This ontology primarily consists of undefined, horizontally organized independent classes. It comprises 85 classes, of which 83 lack annotations, and includes 5 properties. The ontology features a relatively flat taxonomic structure with 55 child classes that do not further branch out. Only some of these classes are connected by restriction properties. With such a structure, the ontology readily responds to logical inference and deduction processes.
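The structural figures quoted in these descriptions (number of classes, classes lacking annotations, classes without children) can be computed mechanically once an ontology's class hierarchy and annotations have been extracted. The sketch below assumes such an extraction is already available as plain Python collections; the toy data is invented and does not reproduce any of the reviewed ontologies, and the notion of "leaf" used here (any class with no subclasses, including childless top-level classes) is an assumption that may differ from the counting criteria adopted in this review.

```python
def summarize(classes, subclass_of, annotations):
    """Compute simple structural figures for an ontology.

    classes:     set of class names
    subclass_of: dict mapping a child class to its parent class
    annotations: dict mapping a class to its textual annotation, if any
    """
    parents = set(subclass_of.values())
    unannotated = [c for c in classes if not annotations.get(c, "").strip()]
    leaves = [c for c in classes if c not in parents]
    return {
        "classes": len(classes),
        "unannotated": len(unannotated),
        "leaves": len(leaves),
    }


# Invented toy ontology: one parent with two children, plus an isolated class.
classes = {"Process", "Printing", "Sintering", "Model"}
subclass_of = {"Printing": "Process", "Sintering": "Process"}
annotations = {"Process": "A planned series of steps."}

print(summarize(classes, subclass_of, annotations))
```

A high share of unannotated classes combined with a high share of leaves is the pattern described in this review as a flat, lightweight taxonomy.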
b: BUILDMAT: BUILDING MATERIAL ONTOLOGIES [21]
The Building Material Ontology defines the main concepts of materials, the elements as layers that identify materials design, and general properties. Most of the classes lack elucidations, and there is no nested taxonomy; only first-degree subclass relationships are present. It consists of 26 classes, 23 of which do not have an elucidation, and 63 properties. This ontology's top-level classes are not connected to any TLO, making it an independent low-level ontology. It provides a semantic framework for the representation and sharing of building material data, catering to various phases of engineering and construction projects. This ontology addresses the central role of building material information in design decisions and different simulation processes such as energy, acoustics, and lighting. Despite the availability of different metadata schemas like the Industry Foundation Classes (IFC) [22], efficient data sharing among stakeholders is limited due to inherent constraints. To address this issue, the authors propose an ontology-based approach that utilizes semantic web concepts to enhance interoperability and Building Information Modeling (BIM) [23] data sharing in collaborative workflows. The outcome is the BUILDMAT ontology, which offers improved management of building material information in BIM-oriented collaboration processes.

c: DEB: DEVICES, EXPERIMENTAL SCAFFOLDS AND BIOMATERIALS ONTOLOGY [24]
The Devices, Experimental scaffolds, and Biomaterials Ontology (DEB) is an open-source tool designed to structure information related to biomaterials, covering their design, manufacturing, and biological evaluation. It comprises 601 classes, of which 597 lack elucidations, and includes 121 properties. It presents a nested structure, where classes are organized in a hierarchy with different levels of subsumption relationships. It was crafted using text analysis of a carefully curated biomaterials corpus to accurately capture the sector's terminology. Subject matter experts from the biomaterials research field validated the coverage of topics within the ontology. Stored in .owl format, DEB is versatile, with utilities that extend to term searching, machine learning annotation, standardizing metadata indexing, and facilitating cross-disciplinary data use, thereby promoting improved interoperability and data exploitation.

d: MATONTO [25]
MatOnto is an ontology created by the eResearch Lab at the University of Queensland, aimed at facilitating materials science. It seeks to integrate materials databases, model provenance data, and support knowledge extraction. Initially using DOLCE as its upper ontology, it weaves in pre-existing ontologies to describe diverse elements like units, time, and scientific experiments. Key concepts include Material, Property, Family, Process, Structure, and Measurement. It is composed of 848 classes, 351 of which do not have an elucidation, and 96 properties. It presents a nested structure, where classes are organized in a hierarchy with different levels of subsumption relationships. MatOnto also provides a unique way to describe relational database relationships, allowing dynamic SQL statement creation. Its primary application is in the fuel cell domain, and it is employed to discover new compounds for fuel cell electrolytes. MatOnto also serves as the backbone for MatSeek, a web search tool. Currently, MatOnto is being updated to version 2.0, planning to switch its upper ontology to BFO and enhance its logical consistency and commonality.
e: MDO-FULL: MATERIAL DESIGN ONTOLOGY [26]
The Materials Design Ontology (MDO) aims to address the challenges associated with data accessibility and interoperability across disparate materials databases, particularly prevalent in the materials design domain. It is composed of 37 classes, 8 of which do not have an elucidation, and 64 properties. The majority of the classes are superclasses without children. It has a FAIRness score of 242. Given that materials databases typically have different data models, users often struggle to locate and integrate data from various sources. The MDO leverages ontologies and ontology-based techniques to improve data availability and interoperability by formalizing domain knowledge representation. Crafted using domain knowledge from materials science, particularly solid-state physics, and guided by data sourced from several materials design databases, MDO is a comprehensive ontology encapsulating knowledge within the materials design field. The application of MDO has been demonstrated on data extracted from established materials databases, showcasing its practical utility in enhancing data interoperability.

f: MOCO: MAT-O-LAB CONTAINER ONTOLOGY [27], [28]
This ontology offers a lightweight solution for detailing the structure of tabular data (series data) stored in hdf5 containers. It has been effectively applied to describe time-force-displacement data from tensile tests. It is part of an ongoing initiative to model an ontology prototype for representing Materials Science experiments, as a common standard formalization for materials knowledge has not been realized yet. A use case was presented where such an ontology was used to facilitate the curation and comprehension of experiments, outlining the expectations from the ontology. It is composed of 8 classes, only one of which lacks an elucidation, and 21 properties.
g: MOL_BRINELL: BRINELL TEST ONTOLOGY [29]
This modeling is based on ISO standards. The Mol Brinell ontology is a knowledge representation system that focuses on the domain of material characterization, measurement processes, and associated quantities of interest. This ontology has been developed by distributing a significant number of individuals across a limited number of classes: it is composed of 37 classes, none of which has an elucidation, and 21 object properties, while hosting more than 2,000 instances. For this reason, it can be regarded as an ontology used primarily at the application level, with a large proportion of its classes being connected to the classes of the MSEO Ontology.
h: MPO: MATERIAL PROPERTIES ONTOLOGY [30]
The Material Properties Ontology is made up of about 150 classes and about 13 object properties, and provides the vocabulary to describe the building components, materials, and their corresponding properties relevant within the construction industry. More specifically, the building elements and properties covered in this ontology support applications focused on the design of building renovation projects. At the same time, the ontology reuses the SAREF4 Building Ontology to construct this classification. The SAREF4 Building Ontology focuses on the concept of device, which is defined as a tangible object designed to accomplish a particular task in households, common public buildings, or offices.
i: PMDco: PMD CORE ONTOLOGY [31]
The Platform Material Digital (PMD) project aims to store FAIR data in alignment with a standard-compliant ontological representation (application ontology) of a tensile test of metals at room temperature (ISO 6892-1:2019-11). The process includes developing an ontology per the relevant standard, converting standard test data into the interoperable RDF format, and establishing a connection between the ontology and data. This semantic association promotes interoperability and boosts querying capabilities. To enhance data and knowledge reusability, the PMD core ontology (PMDco), a DLO in Materials Science, has been developed. The interconnection of the tensile test application ontology with the PMDco is highlighted by the ontology's authors. Additionally, the development of a tool (Ontopanel) designed to aid domain experts in visual ontology development and mapping for FAIR data sharing in materials science and engineering has been reported.

2) DLOs IN MATERIALS SCIENCE ALIGNED WITH BFO
a: CAO: CHEMICAL ANALYSIS ONTOLOGY [32]
The Chemical Analysis Ontology (CAO) is structured around the Basic Formal Ontology (BFO) framework. It primarily comprises entries under various categories such as concepts, material entities, information content entities (data items), roles, and processes. At this stage, the ontology mostly resembles a vocabulary, as the development of predicates (ontology properties) linking subjects to objects has not been a central focus. This ontology is associated with ChEBI, CHEMINF and CHMO.
b: CHEBI: CHEMICAL ENTITIES OF BIOLOGICAL INTEREST [32]
The Chemical Entities of Biological Interest (ChEBI) ontology serves as a comprehensive dictionary for small molecular entities, providing a freely accessible resource for researchers across various disciplines. These molecular entities include any atom, molecule, ion, ion pair, radical, radical ion, complex, conformer, and so on, that can be distinguished as a separate entity. The ontology is composed of about 183,000 classes, 130,000 of which lack an elucidation, and 10 object properties. The entities covered by ChEBI are broad, ranging from products found naturally in biological organisms to synthetic products that have been designed to interact with living systems. This could encompass substances that are intentionally introduced into an organism, such as pharmaceutical drugs, or unintentional environmental chemicals that may have biological effects. ChEBI, however, specifically excludes entities directly encoded by the genome, which means that nucleic acids, proteins, and peptides that are derived from proteins by cleavage are generally not incorporated within this ontology. In addition to these individual entities, ChEBI also includes classes of molecular entities and part-molecular entities. These are forms of substituent groups or atoms, providing a more granular level of detail in the ontology's coverage.

c: CHEMINF: CHEMICAL INFORMATION ONTOLOGY [33]
The Chemical Information Ontology (CHEMINF) is an ontology developed to represent and describe chemical information. It is composed of 850 classes, 509 of which lack an elucidation, and 118 object properties. Cheminformatics, which involves the application of informatics techniques to solve chemical problems in silico, necessitates accurate data exchange, which is increasingly being accomplished through the use of ontologies. CHEMINF is particularly focused on data-driven research and the integration of calculated properties (descriptors) of chemical entities within a semantic web context. It distinguishes algorithmic, or procedural, information from declarative, or factual, information, making the annotation of provenance to calculated data particularly important. This ontology primarily aims to standardize the representation of chemical information. Its primary goals include producing an ontology to represent chemical structure and richly describe chemical properties, whether intrinsic or computed.

d: CHMO: CHEMICAL METHODS ONTOLOGY [32]
The Chemical Methods Ontology (CHMO) consists of over 3,000 classes, 1,000 of which lack an elucidation, and fewer than 30 object properties. It outlines methods employed in chemical experiments. It encompasses methods for data collection (like mass spectrometry and electron microscopy), methods for material preparation and separation (like sample ionization, chromatography, and electrophoresis), and methods for material synthesis (like epitaxy and continuous vapor deposition). Additionally, it describes the instruments employed in these experiments, such as mass spectrometers and chromatography columns, and their outputs. The ontology is represented in both OBO and OWL formats, and both can be edited in Protégé, an ontology editor as well as a knowledge management system.

e: ENM: eNanoMapper [34]
The eNanoMapper project provides a comprehensive computational infrastructure for the management of toxicological data associated with engineered nanomaterials (ENMs). The eNanoMapper ontology is made up of 26,000 classes, 3,000 of which lack an elucidation, and has 55 object properties. Utilizing ontologies, open standards, and interoperable designs, eNanoMapper promotes a unified approach to European nanotechnology research. Its database supports diverse data and provides an infrastructure for data sharing, analysis, and modeling of ENMs, accessible through an API. A key feature is a configurable spreadsheet parser that simplifies data preparation and upload. Additionally, a web application allows users to retrieve and analyze experimental data using machine learning algorithms. Importantly, the database supports the import and online publication of ENM data from various sources, facilitating the development of reproducible quantitative structure-activity relationships for nanomaterials (NanoQSAR).
f: LPBFO: LASER POWDER BED FUSION ONTOLOGY [35]
LPBFO, the Laser Powder Bed Fusion Ontology, was developed and effectively employed within the AluTrace project to establish digital connections between disparate data and knowledge silos that often emerge throughout the course of industrial product development and manufacturing cycles. The primary objective of the AluTrace initiative was to create seamless linkages between these silos, enabling a unified flow of linked data and knowledge. Once this integration was achieved, the linked data was harnessed to address specific use cases, including empowering design engineers to optimize components with a focus on lightweight design for additive manufacturing. This streamlined approach improved efficiency and precision in the product development process. The ontology has 509 classes, almost all with an elucidation, and 40 object properties.

g: MSEO: MATERIALS SCIENCE AND ENGINEERING ONTOLOGY [36]
MSEO utilizes the IOF Ontology stack, giving materials scientists and engineers the ability to represent their experiments and resulting data. The goal is to create machine- and human-readable semantic data that can be easily digested by other science domains. It is a product of the joint venture Materials Open Lab Project between the Bundesanstalt für Materialforschung und -prüfung (BAM) and the Fraunhofer Group MATERIALS, and uses the BWMD ontology created by Fraunhofer IWM as a starting point. It is considered in [37] an ''Upper domain level Ontology''. As such, it classifies equipment in a particular domain. It constitutes a base to which ontologies on specific technologies are added. It has 231 classes, 139 of which lack an elucidation, and 118 properties.

h: MOL_TENSILE: MAT-O-LAB TENSILE TEST ONTOLOGY [38]
This is an ontology for describing the tensile test process, made in the Materials Open Lab Project. It is a domain-specific ontology [37] and can be considered aligned to BFO. Most of its classes are aligned with MSEO. It has 327 classes, 50 of which lack an elucidation, and 97 properties. It is a detailed digital representation of the classical mechanical tensile test, a widely used and well-standardized test method with which comparable material parameters are generated.
i: NPO: NANOPARTICLE ONTOLOGY [39]
The NanoParticle Ontology (NPO) serves as a comprehensive repository of essential knowledge pertaining to the physical, chemical, and functional characteristics of nanotechnology in the context of cancer diagnosis and therapy. It is composed of about 2,000 classes with elucidations and about 80 object properties. Developed within the Basic Formal Ontology (BFO) framework and implemented using the Web Ontology Language (OWL) with well-defined design principles, NPO focuses on representing crucial information related to the preparation, chemical composition, and characterization of nanomaterials used in cancer research. To ensure accessibility and widespread usage, the NPO is made available to the public through the BioPortal website, which is maintained by the National Center for Biomedical Ontology. Furthermore, mechanisms for editorial and governance processes are under development to support the ongoing maintenance, review, and expansion of the NPO, ensuring its relevance and accuracy as the field of nanotechnology and cancer research progresses.

j: RXNO: REACTION ONTOLOGY [40], [41]
The RXNO Ontology, or more formally the Reaction Ontology, is a resource designed to provide a standardized classification and description of chemical reactions. This ontology is designed to annotate and integrate reaction data across various databases and information sources, thereby promoting interoperability. It has about 1,000 classes and 40 object properties. The core of RXNO focuses on reaction types, providing a high-level classification based on reaction mechanisms, reactant or product classes, and roles. These descriptions include a broad array of reactions, from simple transformations like oxidations, reductions, or cyclizations, to more complex processes like the Diels-Alder reaction. Among the presented ontologies, RXNO stands out as the most comprehensive and thorough. It includes over 500 name reactions, organized into a multi-layered ontology. Initially, reactions are categorized by their general types, such as oxidations or cyclizations. This primary layer serves as a broad grouping. Subsequently, a secondary layer further subdivides the reactions based on specific characteristics, such as the dedicated reactants involved. For example, oxidations are subcategorized into reactions that synthesize alcohols or alkenes. This hierarchical tree structure allows users to select reactions explicitly tailored to their desired outcomes. However, despite the wealth of reactions and their organization within the system, the ontology lacks additional information beyond the reaction names and their parent relationships. Moreover, it fails to provide information about the chemicals required for each reaction and vice versa. Additionally, there is no mention of any licensing details associated with the ontology.
3) DLOs IN MATERIALS SCIENCE ALIGNED WITH EMMO
a: ATOMISTIC [42]
An EMMO-based domain ontology for atomistic and electronic modeling. It is a domain ontology aligned to the EMMO multiperspective alpha 2 version. It includes all the concepts related to the fundamental atomistic theory. Most of its concepts are under EMMO's Reductionistic perspective, and use elucidations from the IUPAC Gold Book, IEC standards, and Wikipedia. The ontology, including its imports (the TLO it is based on), encompasses 533 classes and 46 object properties.
b: BattINFO: BATTERY INTERFACE ONTOLOGY [43]
The Battery Interface Ontology (BattINFO) is a free, open-source domain ontology designed to address the current challenges in the battery data landscape, which is characterized by heterogeneous sources and inconsistent metadata. BattINFO aims to enhance the interoperability, reusability, and machine-readability of battery data. Developed within the scope of the EU H2020 project BIG-MAP, it fosters data interoperability across over 30 battery research institutes and companies in Europe. BattINFO employs an ontology, i.e. a data model that encapsulates domain knowledge as a network of concepts and their relationships. This model permits data to be mapped to a common vocabulary, enabling the connection of two data pieces mapped to the same term. Expert knowledge can be expressed as machine-readable graphs that can be queried to identify data linkages and facilitate knowledge sharing. The ontology, including its imports (i.e. the version of EMMO it is based on), encompasses 3284 classes and 116 object properties.

c: CDO: CRYSTALLOGRAPHY DOMAIN ONTOLOGY [42]
A crystallography domain ontology that imports EMMO and the CIF core dictionary. It is implemented in a formal language with the objective of archiving and distributing crystallographic information. It is an extremely extensive ontology, in which the fundamental categories of crystallographic models and CIF files are integrated. Both imported ontologies contain the fundamental concepts, while CIF Core, the more detailed one, accounts for a large share of the ontology. The ontology, including its imports, encompasses 1806 classes and 47 object properties.
d: CHAMEO: CHARACTERIZATION METHODOLOGIES ONTOLOGY [44], [45]
The CHAMEO ontology, a domain ontology, seeks to bridge the divergence in terminologies and data management approaches that exist in the wide-ranging field of materials characterization. Stemming from a recent CEN Workshop Agreement (CWA 17815), CHAMEO is grounded in a standardized terminology and the Characterization Data (CHADA) documentation scheme. The overarching goal of CHAMEO is to serve as a unifying framework, harmonizing method-specific ontologies by reusing and specializing its generic constructs. CHAMEO is part of a broader European Materials Modelling Council (EMMC) initiative to develop interconnected materials modeling ontologies based on the Elementary Multiperspective Material Ontology (EMMO) root. Created within the scope of the NanoMECommons European project, CHAMEO's objective is to harmonize characterization protocols. Furthermore, it aligns with several newly developed EMMO-based domain ontologies for classifying materials, models, manufacturing processes, and software products relevant to materials modeling. The ontology's axiomatization can be accessed in a GitHub repository and is published online. It is composed of 479 classes and has a FAIR score of 195.
e: CIFO: CIF ONTOLOGY [46]
The CIF Ontology is an ontologization of the CIF vocabulary. The ontology, not including its imports, is made up of 68 classes and 5 object properties.

f: EMMO DISCIPLINES [47]
EMMO Disciplines ontologies are positioned between top-level and domain-specific ontologies, representing a unique and versatile category. Some prominent examples include Conformity Assessment, Chemistry, Manufacturing, and Life Cycle Assessment ontologies. These ontologies span across various domains, effectively capturing concepts within a multidisciplinary context. Due to their broad scope and comprehensive nature, the concepts within these ontologies can be applied in diverse and unrelated applications. By bridging the gap between middle-level and domain-specific ontologies, EMMO Disciplines ontologies provide a flexible framework for knowledge representation and exchange. Their rich conceptual structure allows for seamless integration of data from different domains, promoting interoperability and facilitating knowledge sharing across various fields. For this reason, these ontologies are frequently imported into domain-specific ontologies.
g: METAL-ALLOY [48]
This ontology hosts many of the main concepts from materials science and metallurgy that characterize a metal alloy structurally. There are many references to structural types of cells and properties. The ontology, not including its imports, comprises 11 classes and 18 object properties.

h: MICROSTRUCTURE DOMAIN ONTOLOGY [49]
The Microstructure Domain Ontology is a specialized ontology designed to encapsulate all vital aspects of metallic microstructures within the field of physical metallurgy. This includes components such as composition, particles (both stable and metastable), grains and subgrains, grain boundaries, particle-free zones, texture, dislocations, and alloy systems. The ontology's goal is to support both microstructure modeling and characterization. In terms of modeling, the ontology differentiates between evolution models, which alter the microstructure via a sequence of states, and property models, which associate a microstructure state with a specific property. Furthermore, it supports the description of the same concept at varying spatial resolutions, such as mean-field, 1D, 2D, and 3D. The ontology can encapsulate mean-field descriptions and offers the ability to detail quantities at different statistical levels, like mean size, normalized size distribution, and full-size distribution. It also incorporates a common method to link a state to external conditions such as temperature, volume/shape, and pressure, which are vital for describing a process. The ontology, including its imports (i.e. the TLO it is based on), encompasses 1305 classes and 107 object properties.

i: MTO: MECHANICAL TESTING ONTOLOGY [50]
This ontology comprises an extensive collection of mechanical properties and testing procedures, which are essential for determining and characterizing the mechanical behavior of materials. It also includes terminology for metallurgical treatments, i.e. the processes and techniques applied to metals and alloys to modify their physical, chemical, and mechanical properties. It is composed of 819 classes and has a FAIR score of 194.

j: OPEN INNOVATION ENVIRONMENT (OIE) ONTOLOGIES [51]
These are five EMMO-compliant, domain-level ontologies developed for the Open Innovation Environment platform (itself developed within the context of the EU-funded OYSTER [52] and NanoMECommons [53] projects). They tackle the areas of characterization methods, manufacturing processes, materials, models, and software products, respectively. The ontologies include 42 classes and no object properties (Characterization Methods), 221 classes and 3 object properties (Manufacturing), 117 classes and no object properties (Materials), 107 classes and 1 object property (Models), and 161 classes and no object properties (Software). A number of EMMO-based ontologies are aligned with them, including CHAMEO [44], [45] and MAEO [54], the latter being an application ontology modeling experts, expertise, and knowledge providers within EMMO's ecosystem.
k: PDO: PHOTOVOLTAICS DOMAIN ONTOLOGY [55]
This project aims to establish a comprehensive taxonomy of terminologies related to photovoltaic plants, starting with the fundamental concepts and gradually expanding to cover the detailed description of photovoltaic module composition. By systematically organizing and categorizing the terminology, this initiative seeks to enhance understanding, communication, and research within the field of photovoltaic technology. It imports another module, called Photovoltaics, that contains the classes related to the composition of the photovoltaic cell layers and the associated measurements.

l: PRECIPITATION MODEL [56]
This ontology includes many physical, statistical, and simulation models for studying how dislocations move and where they are located. It is focused on linear defects in material structures. Together with the Mechanical Testing Ontology described earlier, it forms part of the metal alloy module in EMMO, covering both experimental and numerical observations.

m: VIRTUAL MATERIAL MARKETPLACE (VIMMP) ONTOLOGY [57]
The Virtual Materials Marketplace project is an initiative that focuses on creating an open platform for offering and utilizing services related to materials modeling. A significant part of the project deals with the development of ontologies and advancing data technology. VIMMP has established a series of marketplace-level ontologies designed to describe services, models, and user interactions, all within the framework of the European Materials and Modelling Ontology, which serves as a top-level ontology. These ontologies play a vital role in data annotation, facilitating the storage of this information within the ZONTAL Space component of VIMMP. Additionally, these ontologies are instrumental in managing data and metadata intake and retrieval at the VIMMP marketplace's front-end.

4) DLOs IN MATERIALS SCIENCE ALIGNED WITH SIO
a: MM: MATERIALSMINE [58]
A materials ontology to support data publication involving nanomaterials and metamaterials. The MaterialsMine Team brings together expertise across five research institutions in the fields of mechanics, materials, design, manufacturing, data science, and computer science to build and develop an open-source, user-friendly materials data resource guided by FAIR principles, with current modules geared toward research communities in the domains of polymer nanocomposites (NanoMine) and mechanical metamaterials (MetaMine). MaterialsMine is an extension of NanoMine and incorporates curated data from research articles in the field. It is composed of 2052 classes, 1711 of which do not have an elucidation, and 325 properties. The majority of the classes have at least one subclass. The platform utilizes a knowledge graph structure, which is built upon linked data conforming to semantic web ontologies and vocabularies. The knowledge graph in MaterialsMine consists of curated data including information about materials, processing techniques, characterization methods, and bibliographic details.

b: NANOMINE [59]
NanoMine is based on an XML-based data schema designed for nanocomposite materials data representation and distribution. The schema aligns with a higher-level polymer data core, ensuring its consistency with other centralized materials data efforts. Alongside the schema, an ontology and a knowledge graph framework are implemented to provide a more comprehensive representation of nanopolymer systems. The schema, ontology, and knowledge graph ensure ease of accessibility and compatibility with concurrent material standards, thus establishing a robust platform for data storage and search, tailored visualization, and machine learning tool integration for material discovery and design. This integration supports a more systematic approach toward material data handling and application within the materials science domain. The ontology, not including its imports, comprises 172 classes and 1 object property.

5) DLOs IN MATERIALS SCIENCE ALIGNED WITH SUMO
a: TRIBAIN ONTOLOGY [60]
The TribAIn ontology was developed to organize, standardize, and enhance accessibility to experimental data in the field of tribology, which focuses on understanding and controlling friction, lubrication, and wear in interacting surfaces. Due to inconsistencies in test procedures and terminology, as well as the practice of publishing results in natural language, accessing and reusing knowledge from tribological experiments can be challenging and time-consuming. Comparisons between different tribological systems or test conditions can be difficult to make. The TribAIn ontology seeks to address these issues by providing a formal, explicit specification of knowledge in the field of tribology. This allows for semantic annotation and searching of experimental setups and results, facilitating the selection of potential tribological pairings based on specific requirements and enabling comparative evaluations. To ensure generalizability, TribAIn is linked to the intermediate-level ontology EXPO, which focuses on the ontology of scientific experiments. Additional subject-specific concepts are incorporated to address the unique needs of the tribology domain. The ontology's formalization is expressed using the OWL DL standard from the World Wide Web Consortium (W3C). In a use case demonstration, TribAIn effectively covers tribological knowledge from experiments, incorporating data from a range of sources, including natural language texts and tabular data. This illustrates the ontology's potential to improve the management, accessibility, and utility of experimental data in tribology. The ontology, including its imports (i.e. the version of SUMO it is based on), encompasses 238 classes and 59 object properties.

C. ONTOLOGY ALIGNMENT AND INTEGRATION APPROACHES
An analysis of the existing literature revealed no relevant deviations or innovations with respect to standard alignment and integration approaches or techniques [61] for the Materials Science area. Notably, automatic techniques used in many contexts (e.g. [62], [63], [64]) are strongly favored, possibly due to the large number of concepts appearing in the relevant ontologies, which is one of the peculiarities of the domain.
On the other hand, in Materials Science the focus has been on ontology-based data access, extraction, and integration [65]. To cite two examples, the authors of [66] proposed a hybrid Natural Language Understanding-driven technique to extract data from journal articles, whereas the authors of [67] created a database (the Crystallography Open Database) and employed it as a basis to unify and harmonize a plurality of resources.
Notably, the authors of [66] take inspiration from the Gene Ontology [68], i.e. a hub for biomedical ontologies, and their approach could, in principle, be used to ground alignments between ontologies covering Materials Synthesis. Likewise, an ontology version of the Crystallography Open Database could be used as a core to integrate ontologies as well as databases.

III. METHODOLOGY FOR ANALYZING AND HARMONIZING DOMAIN-LEVEL ONTOLOGIES IN MATERIALS SCIENCE
This section describes a methodology for analyzing and harmonizing DLOs in Materials Science, and for evaluating the steps required towards the establishment of a unified ontology framework for Materials Science. This process consists of two distinct phases, which are detailed in the following subsections.

A. COVERAGE ANALYSIS
The first phase consists of a coverage analysis that evaluates the existing Materials Science DLOs and their effective coverage of the domain. This phase includes the following steps:
1) Sub-domains in Materials Science are defined; a glossary of characterizing terms for each sub-domain is compiled, taking inspiration from gold standards. Domain experts and community practitioners are called upon to identify the standards and validate the assignments; the validation is carried out jointly by a group of experts in order to reduce risks and to ensure that coherent criteria are employed.
2) An automatic check is executed on the Materials Science DLOs with respect to the characterizing terms, in order to assess each ontology's coverage of a certain sub-domain in Materials Science and situate the ontologies within the landscape.
3) The overall coverage of the domain is evaluated; the ontologies are analyzed individually and comparatively to draw a better picture of the state of affairs.

B. HARMONIZATION OF DLOs VIA BRIDGE CONCEPTS
The second phase covers an approach to harmonize the ontologies identified in the first phase, as well as the ontologies they are based on. The inclusion of upper-level ontologies, and specifically TLOs, serves both as a check on the consistency of the proposed alignments and as a facilitator. The choice is supported by standard strategies in the literature, and also goes some way towards the establishment of a more comprehensive and reasoning-friendly framework, possibly extending beyond Materials Science per se to cover other aspects of manufacturing value chains. This phase includes the following steps:
1) A statistical analysis of relevant terms' frequency is carried out on the ontologies identified for the domain, in order to find potential candidates for bridge concepts capable of supporting connections among a large number of ontologies.
2) A template for bridge concepts is defined, both as a guide for bridge concept engineering and to guarantee a FAIR documentation of entities and alignments.
3) Bridge concepts are engineered by working groups including both domain experts and ontologists, following the results of the statistical analysis.
More details are provided in the following sections.
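Steps 1 and 2 of this phase can be illustrated with a minimal Python sketch. It assumes that each ontology has already been reduced to a set of normalized term labels. The function name candidate_bridge_terms, the threshold min_ontologies, the fields of the BridgeConcept template, and the toy ontology names are all hypothetical and serve only as an illustration; they are not taken from the methodology itself.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


def candidate_bridge_terms(
    term_sets: Dict[str, Iterable[str]], min_ontologies: int = 2
) -> List[Tuple[str, int]]:
    """Rank terms by the number of ontologies they occur in (step 1)."""
    # Each term is counted at most once per ontology.
    counts = Counter(
        term for terms in term_sets.values() for term in set(terms)
    )
    # Most widely shared terms first; ties are broken alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(term, n) for term, n in ranked if n >= min_ontologies]


@dataclass
class BridgeConcept:
    """Hypothetical documentation template for a bridge concept (step 2)."""
    label: str
    elucidation: str
    # Ontology name -> identifier of the aligned entity (illustrative only).
    alignments: Dict[str, str] = field(default_factory=dict)


# Toy input: term labels extracted from three made-up ontologies.
term_sets = {
    "onto_a": {"material", "process", "alloy"},
    "onto_b": {"material", "process"},
    "onto_c": {"material", "sample"},
}

candidates = candidate_bridge_terms(term_sets)
top_label = candidates[0][0]
bridge = BridgeConcept(
    label=top_label,
    elucidation="Placeholder text to be written by the working group.",
    alignments={
        name: top_label
        for name, terms in term_sets.items()
        if top_label in terms
    },
)
print(candidates)
print(bridge)
```

In this sketch, a term shared by more ontologies ranks higher as a candidate, which mirrors the stated aim of supporting connections among a large number of ontologies. The actual engineering of a bridge concept (step 3) remains a manual task for domain experts and ontologists.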

IV. MATERIALS SCIENCE: ANALYSIS OF THE DOMAIN AND OF THE DLOs
The following three subsections implement steps 1 to 3 of the coverage analysis in the methodology described above, respectively.

A. CLASSIFICATION OF SUB-DOMAINS IN MATERIALS SCIENCE
Materials Science is a very wide field encompassing a number of disciplines and sub-domains. In order to attain a clearer picture of DLOs' domain coverage, it was decided to assess the ontologies depending on the terms appearing in their documentation. Though indicative and susceptible to errors, this approach was deemed appropriate (and conducive to reasonably accurate results) given the sheer volume of data to be analyzed.
The well-respected, de facto standard textbook for Materials Science, i.e. Callister's Materials Science and Engineering (now in its 10th edition [69]) with its glossary of some 700 terms, was chosen as a point of reference. To ensure comprehensiveness, supplementary terms were sourced from the materials modeling CWA [70] (based on the Review on Materials Modeling (RoMM) [71]) and the materials characterization CWA. Key chemistry terms (atom, molecule, etc.) were also included in the list.
This collection of terms was then manually allocated by experts in Materials Science from the OntoCommons project to one of six sub-domains partitioning Materials Science's logical space (listed below, alongside the number of relevant terms), depending on their usage. The operation was carried out by groups of experts and subsequently subjected to cross-validation, to ensure that uniform criteria were adopted and that borderline cases were thoroughly discussed.
1) Materials classes - 86 terms
2) Materials structure - 143 terms
3) Materials properties - 99 terms
4) Materials behavior - 60 terms
5) Materials technologies - 258 terms
6) Materials theories - 91 terms
The complete list of terms for each sub-domain is reported in Appendix A. It is worth noting that a term could be classified in more than one sub-domain. However, a principle of salience was adopted to ensure more insightful results. Additionally, it was decided to adopt a permissive approach with respect to polysemy and semantic indetermination. To provide a concrete example, a single term might be used both for the (functional description of a group of) materials and their characteristic property, e.g. extrinsic semiconductors; similarly, a term can stand both for a device and a material, e.g. optical fiber. In such cases, both uses of the terms were entered. Furthermore, alleged synonyms were identified (e.g. environment and medium). Naturally, within a domain ontology that describes a particular materials technology, such as characterization, it is expected that specific tools will be connected to the materials amenable to investigation by the latter. Likewise, the structurally measured attributes of the materials are expected to be presented alongside the recorded properties. Nevertheless, in the present analysis, interconnections among the six sub-domains are intentionally omitted. This deliberate choice was made to provide a clear-cut assessment of each DLO's specific focus. The terms found in an ontology's documentation are strongly indicative of its contents and domain of application: for instance, it becomes evident whether an ontology describes particular materials, and also which materials structures it documents.
The identified list of terms is by no means exhaustive (relevant glossaries from textbooks on physics and chemistry could be added to the list), and it does not aim to be so. However, if a domain ontology claims to concern Materials Science, it is arguably expected to include some of the identified terms. Most of the material types described in Callister's textbook actually appear as materials classes in this analysis.

B. CLASSIFICATION OF THE DLOs AND (SUB-)DOMAIN COVERAGE
The glossary and the assignment of terms to sub-domains formed the basis for situating each ontology in the landscape and analyzing the overall coverage of the Materials Science domain. While domain experts initially reviewed DLOs against the glossary and the terms identified for each sub-domain, this curated process was also supported by an automated computational analysis, given the large number of ontologies involved.
The computational analysis aimed to determine whether these terms were present within each considered ontology, in order to evaluate each ontology's relevance with respect to the identified sub-domains. From a technical standpoint, the terms were organized within an Excel file by sub-domain, and served as input for a procedure designed to read, parse, and scan each ontology for their presence. It is essential to note that the search was conducted broadly, including verbatim terms and slight syntactical variations (e.g. their corresponding adjectival form), against all the elements of each ontology, including classes, properties, and comments/annotations. The results of this search are presented in Table 2, where the numerical values provided for each ontology and sub-domain correspond to the percentage of terms associated with a given sub-domain that have been detected within a given ontology. In line with the above, these figures provide insights into the extent to which an ontology covers a particular sub-domain.
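The procedure described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the implementation used for the analysis: the glossary is an in-memory dictionary rather than an Excel file, each ontology is represented by a plain list of strings (class names, property names, annotations) instead of a parsed ontology file, and the stem function is a deliberately crude stand-in for the matching of slight syntactical variations. All function names and toy data are hypothetical.

```python
import re
from typing import Dict, List


def stem(word: str) -> str:
    """Crude normalization: strip simple plurals, then adjectival suffixes."""
    if word.endswith("ies") and len(word) > 5:
        word = word[:-3] + "y"
    elif word.endswith("s") and not word.endswith("ss") and len(word) > 4:
        word = word[:-1]
    for suffix in ("ical", "ic", "al"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def tokens(text: str) -> List[str]:
    """Split camelCase identifiers and free text into lowercase stemmed words."""
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", text)
    return [stem(word) for word in re.findall(r"[a-z]+", spaced.lower())]


def contains(sequence: List[str], phrase: List[str]) -> bool:
    """True if `phrase` occurs as a contiguous run of words in `sequence`."""
    n = len(phrase)
    return n > 0 and any(
        sequence[i : i + n] == phrase for i in range(len(sequence) - n + 1)
    )


def coverage(
    glossary: Dict[str, List[str]], ontology_texts: Dict[str, List[str]]
) -> Dict[str, Dict[str, float]]:
    """Percentage of each sub-domain's terms detected in each ontology."""
    result: Dict[str, Dict[str, float]] = {}
    for name, strings in ontology_texts.items():
        docs = [tokens(s) for s in strings]
        result[name] = {}
        for subdomain, terms in glossary.items():
            hits = sum(
                1
                for term in terms
                if any(contains(doc, tokens(term)) for doc in docs)
            )
            result[name][subdomain] = round(100 * hits / len(terms), 1)
    return result


# Toy glossary and toy ontology contents (made up for illustration).
glossary = {
    "Materials classes": ["polymer", "ceramic", "alloy", "composite"],
    "Materials structure": ["grain boundary", "dislocation", "lattice", "vacancy"],
}
ontology_texts = {
    "toy_metals": [
        "GrainBoundary",
        "hasDislocationDensity",
        "An alloy is a metallic mixture.",
    ],
    "toy_polymers": [
        "PolymericMatrix",
        "Ceramics",
        "A composite of polymers and fillers.",
    ],
}

table = coverage(glossary, ontology_texts)
for ontology, row in table.items():
    print(ontology, row)
```

Each value in the resulting table plays the role of one cell of Table 2: the share of a sub-domain's glossary terms found anywhere in an ontology's classes, properties, or annotations. A production version would need a proper ontology parser and a more careful treatment of word variants than the suffix stripping used here.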
The main findings of this analysis are as follows.In general, the coverage of terms from the list is relatively low.The highest percentage coverage was observed in the Materials Properties sub-domain, with a maximum coverage of approximately 35% of terms, with 9 ontologies covering more than 10%.The lowest coverage was found in the Materials Classes sub-domain, with a maximum value close to 19%.This result might appear prima facie surprising, given the importance of materials classes, though it may be attributed to the extensive diversity of materials types: Materials Science DLOs typically focus on just some types of materials (e.g.metals and alloys vs polymers), and research and application areas are typically separated by materials types.
These results show that the overall coverage of each sub-domain by each ontology is limited, with the highest percentages being no higher than about 36%, and with average values being much lower.Moreover, they stress the need for harmonization, as no single ontology can comprehensively cover any sub-domain within Materials Science, not to talk about the domain itself.

C. DLOs ANALYSIS: LIMITATIONS AND OPPORTUNITIES FOR HARMONIZATION
TABLE 2.
Results of the automatic analysis of DLOs. The numerical values provided for each ontology and domain correspond to the percentage of terms associated with a given domain that have been detected within a given ontology.

Building on the previous analysis, several limitations and opportunities for harmonization can be discerned:
• Lack of a comprehensive and unified ontology framework. Existing Materials Science ontologies often focus on specific subdomains, resulting in fragmented knowledge representation. The establishment of a unified ontology framework could facilitate more efficient data integration and interoperability across subdomains.
• Limited coverage of Materials Science subdomains. Some subdomains, such as biomaterials and energy materials, are not well represented in existing ontologies. Harmonization approaches that include a built-in way to individuate and fill conceptual gaps might offer potential solutions to this issue, or contribute to that end.
• Inconsistencies in terminology and semantics. Different ontologies may use distinct terminologies and semantics for similar concepts; alternatively, they might use the same labels for different things. Both phenomena lead to challenges for data integration and interoperability. Harmonization efforts should concentrate on aligning and standardizing terminologies and semantics across ontologies.
• Opaque ontological commitments and semantics. It is often unclear what a certain ontology entity stands for, given alternative, equally reasonable interpretations. This is partly due to the fact that the architecture of DLOs is often organized pragmatically. Opacity poses challenges for alignment, which requires ways to check for consistency (e.g. links with TLOs).
• Incompatible (and often unclear) ontological commitments. As in other domains, Materials Science DLOs are partly task-oriented and thus endorse the worldview that they find most suitable given their specific aims. This poses another challenge when it comes to alignment, which can be partly addressed by making the commitments more transparent through links with TLOs.
• Scalability and maintainability. With the continuous growth of Materials Science, it is crucial to develop a scalable, open, and maintainable ontology framework capable of adapting to the evolving demands of the entire domain.

V. HARMONIZATION OF DLOs THROUGH BRIDGE CONCEPTS
The points above can be seen as guidelines and desiderata for a harmonization methodology. Among other things, it appears pivotal to focus on specific ontology entities that can support informative data exchanges across the network, to improve the FAIRness and transparency of the ontologies included in the ecosystem and of their ontological entities, and to provide a tool that also supports links with TLOs, in order to ensure consistency, validate the proposed alignments, and deal with some of the issues outlined above.
In this section, a methodology based on bridge concepts is briefly outlined, following the approach sketched in Section III. Specifically, the first three subsections implement steps 1 to 3 of the harmonization of DLOs via bridge concepts, respectively, while a fourth subsection discusses potential improvements and limitations of the outlined approach.
In short, a statistical analysis of the terms employed in the involved DLOs serves to individuate core nodes in the network of ontologies. Standalone entities are engineered taking the individuated terms as a starting point; they are informally characterized with reference to gold standards in a FAIR and extensive fashion; and semantic connections are established with ontology entities belonging to Materials Science DLOs and relevant TLOs. Bridge concepts operate as minimal data pipelines, enabling data exchange at core junctures.
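To make the hub-and-spoke idea concrete, the following minimal sketch derives candidate cross-ontology links from entities that are connected to the same bridge concept. All ontology names and entity identifiers are hypothetical placeholders, and the derived pairs are only candidates for expert review: the sketch makes no assumption about the type of semantic relationship involved and does not assert equivalence.

```python
from collections import defaultdict
from itertools import combinations
from typing import List, Tuple

# (bridge concept, ontology, ontology entity): all identifiers are hypothetical.
mappings = [
    ("Material", "OntoA", "ontoA:Material"),
    ("Material", "OntoB", "ontoB:Substance"),
    ("Material", "OntoC", "ontoC:Matter"),
    ("Experiment", "OntoA", "ontoA:Assay"),
]


def candidate_links(records: List[Tuple[str, str, str]]) -> List[Tuple[str, str, str]]:
    """Pair entities from different ontologies that share a bridge concept."""
    spokes_by_concept = defaultdict(list)
    for concept, ontology, entity in records:
        spokes_by_concept[concept].append((ontology, entity))

    links = []
    for concept, spokes in spokes_by_concept.items():
        for (onto1, ent1), (onto2, ent2) in combinations(sorted(spokes), 2):
            if onto1 != onto2:  # only links across ontologies are of interest
                links.append((ent1, ent2, concept))
    return links


links = candidate_links(mappings)
```

With n ontologies attached to one bridge concept, only n connections to the hub have to be curated by hand, whereas up to n(n-1)/2 pairwise candidates can then be derived from them.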

A. IDENTIFICATION OF THE MOST FREQUENT TERMS IN DLOs
In the previous section, the full list of 700 terms was used to check which terms appear in which ontology; however, it can also be used to determine the most frequent terms overall. While focusing on terms (labels) is fairly problematic, once again it should provide some hints as to which concepts are most relevant within the context of Materials Science DLOs. The results of this automatic statistical analysis are shown in Table 3: the numerical values indicate the number of ontologies (and the percentage) in which each term appears.
Table 3 lists the most frequently occurring terms, down to the term "molecule", which appears in 47% of the ontologies.
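The frequency count behind a table of this kind can be sketched as follows. The function name and the input data are hypothetical; only the 47% cut-off is taken from the text, and it is applied here purely for illustration.

```python
from typing import Dict, List, Set, Tuple


def term_frequency(term_hits: Dict[str, Set[str]],
                   n_ontologies: int) -> List[Tuple[str, int, float]]:
    """Return (term, number of ontologies, percentage), most frequent first."""
    rows = [
        (term, len(ontologies), 100.0 * len(ontologies) / n_ontologies)
        for term, ontologies in term_hits.items()
    ]
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(rows, key=lambda row: (-row[1], row[0]))


# Hypothetical scan result: for each term, the set of ontologies where it was found.
term_hits = {
    "material": {"OntoA", "OntoB", "OntoC", "OntoD"},
    "molecule": {"OntoA", "OntoC"},
    "alloy": {"OntoB"},
}

rows = term_frequency(term_hits, n_ontologies=4)

# Keep only terms found in at least 47% of the ontologies.
frequent_terms = [term for term, _, percentage in rows if percentage >= 47.0]
```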

B. BRIDGE CONCEPT TEMPLATE
The following template is proposed for documenting bridge concepts. The template, acting as a human interface, can be divided into three main parts:
1) General Information. It contains five subparts:
• Concept Name. The label, preferred label, or IRI title to be used for identifying the bridge concept.
• IRI. The IRI (Internationalized Resource Identifier: a unique string that identifies an ontology or an ontology entity in RDF spaces) proposed for the bridge concept.
• OWL Type. One of Class | ObjectProperty | Individual. In this work, the focus will be on the first of these (Class) for the sake of simplicity.
• Concept Elucidation. Natural language informal definition of the concept (elucidation), meant to be easily understandable by domain experts. Elucidations ought to be in line with common sense and knowledge domain resources; they must not refer to other ontology entities, and, ideally, they should be ontologically neutral, i.e. they should avoid taking a stance on matters that do not pertain to the specific domain the concept belongs to. Diverse examples of usage are included whenever possible, and relevant ambiguities that could potentially compromise ontology usage are explicitly addressed. A final desideratum for elucidations is brevity.
• Labels. Labels used to address the concept, ordered as: (i) preferred (one): the label primarily used to refer to the concept, meant to be both intuitive and informative; (ii) alternative (multiple): labels that are commonly used to address the concept in practice, even if they are used with a narrower or wider sense; (iii) deprecated (multiple): labels that are misleading with respect to the concept, in that they are ambiguous or encourage misuse. Hidden labels can be included to support queries.
2) Knowledge Domain Resources. It contains two subparts:
• Related Domain Resources. Existing domain resources (e.g. standards, books, articles, dictionaries) that were taken into account for the engineering of the bridge concepts. The template includes static references to the resources and quotes of the relevant informational content. More than one resource can be reported; widely shared resources that are likely to have influenced the users are also covered. Not only do these resources act as a guiding light in the engineering phase, but they also help domain experts situate the bridge concepts, acting as points of reference. As such, a second hub-and-spoke structure pertaining to informal characterizations is established, improving conceptual clarity overall.
• Comments. They explain the motivations behind the concept definition with reference to the domain resources, highlighting similarities and differences.
3) Alignment to Existing Ontologies. It contains six subparts:
• Target Ontology. IRI of the ontologies that encompass ontology entities supporting the establishment of semantic connections with the bridge concept.
• Related Ontology Entities. List of the identifiers of the specific ontology entities to which the bridge concept is semantically connected, included for the sake of FAIRness.
• Mapping Elucidation. Natural language discussion of the mapping choice and the underlying rationale, including possible alternative mappings considered and evidence gathered, facilitating the evaluation and validation of the proposed connection by third parties and contributing to the clarification of the bridge concept itself.
• Semantic Relationship Level.
[...] implementation in mind and it is FAIR by design. It is worth noting that, as a result, bridge concepts can contribute to addressing issues related to ontologies' lack of documentation, once mappings are established. Finally, the template is organized into three parts: the first part contains essential information relevant to users, while the second and third cater to domain experts and ontologists, respectively; thus, its structure is arguably user-friendly.
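As an illustration, the template can be mirrored by a simple data structure. The sketch below is one possible encoding, not an official serialization of bridge concepts: all class and field names are assumptions, the example IRI is a placeholder, and only the subparts that are described in the text above are modelled. The elucidation in the example is the one quoted in Section V-C for the concept of material.

```python
from dataclasses import dataclass, field
from typing import List

OWL_TYPES = ("Class", "ObjectProperty", "Individual")


@dataclass
class Labels:
    preferred: str
    alternative: List[str] = field(default_factory=list)
    deprecated: List[str] = field(default_factory=list)
    hidden: List[str] = field(default_factory=list)


@dataclass
class DomainResource:
    reference: str  # static reference to a standard, book, article, ...
    quote: str      # quoted informational content


@dataclass
class Alignment:
    target_ontology_iri: str
    related_entities: List[str]
    mapping_elucidation: str
    semantic_relationship_level: str = ""  # allowed values are an open assumption


@dataclass
class BridgeConcept:
    name: str
    iri: str
    owl_type: str
    elucidation: str
    labels: Labels
    resources: List[DomainResource] = field(default_factory=list)
    comments: str = ""
    alignments: List[Alignment] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject OWL types outside the three values listed in the template.
        if self.owl_type not in OWL_TYPES:
            raise ValueError(f"owl_type must be one of {OWL_TYPES}")


material = BridgeConcept(
    name="Material",
    iri="https://example.org/bridge#Material",  # placeholder IRI, not the real one
    owl_type="Class",
    elucidation="An amount of matter at the super-molecular level.",
    labels=Labels(preferred="Material"),
)
```

Keeping the three parts as separate structures reflects the intended audiences of the template: users, domain experts, and ontologists, respectively.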

C. BRIDGE CONCEPT ENGINEERING
Based on the results of the automatic and manual analyses, as well as on further expert discussion, domain-level interoperability in chemistry and materials would benefit considerably if bridge concepts were available for all of the terms in Table 3, in relation to their use in Materials Science. In addition, expert analysis suggested that the concept of "Materials Process" should also be included. While it does not occur with the highest frequency, the ontologies include a wide range of materials processes in a variety of ways; thus, a general superclass would be helpful for bridging.
In the following discussion, some general observations regarding the representation of the concepts 'material' and 'materials property' in Materials Science's DLOs are provided.These highlight some of the issues encountered throughout.
(a) Material: The concept of material is of course crucial to all Materials Science studies. However, while the more generic 'material entity' (meaning anything of a material rather than abstract nature) is widely used, and the concept of a chemical substance is also reasonably well established (see e.g. https://schema.org/ChemicalSubstance), material in the sense used in Materials Science tends to be represented in different and sometimes contradictory ways.
In particular, the questions are how the concept 'material' relates to 'matter', and how it relates to the field of chemistry, where the term 'chemical substance' is defined in the IUPAC Goldbook [72] as "Matter of constant composition best characterized by the entities (molecules, formula units, atoms) it is composed of". A further question is whether individual atoms and molecules (known as 'molecular entities' in the IUPAC Goldbook) are subclasses of materials or, more broadly, of 'matter'. In the bridge concepts developed in OntoCommons, the elucidation of a material states that it is "an amount of matter at the super-molecular level", hence differentiating it from molecular entities, in line with the IUPAC Goldbook. It includes chemical substances or mixtures of substances in different states of matter or phases ('continuum matter'), as well as nanomaterials such as nanoparticles ('mesoscale matter'). Material is hence broader than, or a superclass of, chemical substance, since the latter requires a defined composition, whereas the composition of a material may not even be known. This is in contrast to the usage of materials classes in a number of ontologies.
(b) Materials Property: A materials property is a particular characteristic associated with a material. It is often referred to as a quality, which is inherent in a material, but it can also be regarded as the outcome of an observation using a certain method and involving an interpretation of that observation as the property. A materials property is not necessarily quantitative and not necessarily related to measurement. Nevertheless, some ontologies do not clearly differentiate between materials properties and physical quantities, which require a standardized definition capable of supporting quantification.
So far, the bridge concepts for Material, Materials Property, Material Component and Experiment have been elaborated. As an example, the Material Component bridge concept is provided in Tables 4 to 9. Material Component plays an important role in one of the largest ontology efforts in Materials Science, namely NanoMine/MaterialsMine, the latter addressing composite materials. The study of (Material) Component may also enable further elaboration of other connected terms, such as 'Constituent' and 'Part'. Furthermore, by taking a closer look at the ontologies that have a high degree of domain coverage in Materials Science, the following priority ontologies (or groups of ontologies) were identified:
• eNanoMapper, which already has some alignment with the Nanoparticle ontology
• CHEBI
• EMMO-based ontologies
• MaterialsMine and NanoMine
MSEO, as a BFO-based DLO in Materials Science, was also deemed important, but it currently does not cover the term 'Component'. While CHEBI does not include 'Component' either, it is the most widely used chemistry ontology, and hence it was important to discuss its mapping with regard to such a relevant concept. Further bridge concepts are available and will be further developed in the following GitHub repository: https://github.com/OntoCommons/OntologyFramework.

D. POTENTIAL IMPROVEMENTS AND LIMITATIONS
The harmonization approach outlined in this section has several limitations. For instance, the focus on terms (labels) in the automatic analysis might lead to the selection of sub-optimal candidates, even more so considering the issues related to polysemy, semantic indeterminateness, and intra-domain variance considered in the course of the discussion. While the semantic connections are re-evaluated case by case, the selection of a sub-optimal candidate can lead to a waste of resources and to less informative connections.
In general, approaches revolving around pinpointed links will always be less informative (i.e.support the transmission of less information/data) than full mappings.
Another problem concerns the difficulty of establishing connections between the bridge concepts and ontology entities belonging to the ontologies that are part of the ecosystem: while having a properly characterized point of reference can help, this process is not per se any simpler than the establishment of a standard semantic connection between any two ontology entities. While the connections with TLOs might offer a consistency check, thus defining affordances and constraints and partially validating the links, extreme caution should be exercised for each and every alignment.
Lastly, inconsistencies might emerge due to mistakes in the ontologies considered, or due to incompatible ontological commitments. Given the focus on harmonization rather than integration, only a cooperative effort with the ontology developers can help resolve such issues.
Overall, the methodology might appear to be a fallback solution given the intrinsic difficulties involved in the establishment of a harmonized networked ecosystem; however, there could be ways to improve the approach without sacrificing scalability. For instance, automatic alignment tools could be employed to improve the selection of candidate terms, moving beyond terms (labels); the statistical analysis could also be improved by weighting the candidates using graph theory and network science; finally, it might be possible to partially automate the engineering process, going some way towards a fully bottom-up approach starting from a given set of ontologies and selected documentation. These points might be explored in future work.

VI. CONCLUDING REMARKS
This work has provided a comprehensive examination of the current state of the art of DLOs for Materials Science, highlighting their main characteristics and their connections with existing TLOs. The harmonization of DLOs with TLOs has been identified as a crucial challenge in developing a unified ontology framework for Materials Science. The role of TLOs in supporting the harmonization of DLOs has been examined, mentioning the main TLOs currently available and/or under development for the area, which may be exploited to facilitate consistency and cross-domain integration, as well as to ensure interoperability.
Harmonization via TLOs is arguably not enough to establish strong DLO-DLO pipelines; therefore, this work has proposed an approach for the establishment of multi-level, harmonized ontology networks based on the identification of "bridge concepts". A template for bridge concepts has been presented, and a methodology combining manual and automatic semantic analysis has been proposed and applied to the identified DLOs in order to produce a number of potential bridge concept candidates. Finally, an example of a bridge concept has been proposed as a result of this process, showcasing actual semantic alignments mediated by it.
Ideally, this work is a step toward the creation of a more coherent and unified ontology framework for Materials Science, which is one of the core objectives of the OntoCommons project from which this work stems. Such a framework is expected to facilitate more effective data integration and interoperability across Materials Science subdomains, ultimately promoting collaborative research, innovation, and the accelerated discovery of new materials and their applications.

APPENDIX A. SUB-DOMAINS IN MATERIALS SCIENCE WITH THEIR CORRESPONDING LIST OF TERMS

A. MATERIALS CLASSES

TABLE 3.
Most frequent terms derived from the automatic analysis of DLOs.

TABLE 4.
Part 1 of the Bridge Concept template filled for "Material Component": General information.

TABLE 5.
Part 2 of the Bridge Concept template filled for "Component": Knowledge domain resources.

TABLE 6.
Part 3.1 of the Bridge Concept template filled for "Material Component": Alignment to existing ontologies - ENM.

TABLE 7.
Part 3.2 of the Bridge Concept template filled for "Component": Alignment to existing ontologies - MM.

TABLE 8.
Part 3.3 of the Bridge Concept template filled for "Component": Alignment to existing ontologies - EMMO.

TABLE 9.
Part 3.4 of the Bridge Concept template filled for "Component": Alignment to existing ontologies - CHEBI.