Converting OCL and CGMES Rules to SHACL in Smart Grids

Models are first-class elements in Model-Driven Engineering (MDE). In this paradigm, the most widespread approaches adopted by the development community are Object-Oriented and ontological, formalized using Unified Modeling Language (UML) and Resource Description Framework (RDF), respectively. However, Object Management Group (OMG) does not provide a specific standard language for validating UML models against Object Constraints Language (OCL) constraints; meanwhile, World Wide Web Consortium (W3C) has defined Shapes Constraint Language (SHACL) as a standard validation language. Although the transformation between UML and RDF can be performed at the structural level, no effort has been made to transform OCL to SHACL. This paper addresses the transformation of OCL and text-based constraints to SHACL shapes in the context of Common Grid Model Exchange Standard (CGMES), a UML-based standard for electric utilities in Europe. This paper presents several contributions to the software engineering community. First, solving the validation problem in a standardized way. Second, facilitating European Network of Transmission System Operators for Electricity (ENTSO-E) the construction of an ontology associated with the CGMES standard. Third, allowing developers to integrate the two complementary approaches. Finally, Promoting the adoption and integration of the ontological approach in the software community.


IEC
International MDE is a paradigm where models are first-class elements throughout the software development life cycle. Object-Oriented is an MDE approach where the key elements to model the world are the class and the object. UML [1] is a visual modeling language that addresses modeling process with an Object-Oriented approach. OCL [2] defines added constraints on the objects in the UML model. RDF is a semantic web technology. In this approach everything is a resource with a unique identifier. Models are built from the perspective of resources, and predicates. Thus, the atomic knowledge representation is a triple in the form of (subject predicate object). Models are graphs. SHACL is a W3C standard validation language for validating conditions against RDF data graphs, in addition it can be used as a modeling language.
In MDE, models can be addressed from different approaches; this work focuses on the Object-Oriented (UML/OCL), and ontological (RDF/SHACL) approaches, as shown in Fig. 1. Regardless of the modeling paradigm that is adopted, a model consists of a metamodel, rules, and instances. The transformation from a model in the source paradigm to another model in the target paradigm involves transforming lexical, syntactic, and semantic aspects from the source paradigm to the target paradigm, as indicated in Fig. 2.  In the case that concerns us, the transformation is performed between the Object-Oriented paradigm (UML) and the ontological paradigm (RDF), as shown in Fig. 3.
Common Information Model (CIM) 1 is a UML-based standard adopted by electrical utilities for exchanging network models between companies. It consists of a set of International Electrotechnical Commission (IEC) standards; Table 1 shows a subset adopted in Europe by European Network of Transmission System Operators for Electricity (ENTSO-E).
In the context of ENTSO-E, Common Grid Model Exchange Standard (CGMES) is a standard aimed at exchanging power system models between Transmission System Operators (TSO).
CGMES is a standard for interoperability based on the UML/OCL standard and can be considered an MDE approach. However, although OMG provides UML together with OCL as languages for modeling, it lacks a standard 1 The CIM acronym can be found in two domains: Electricity domain officially adopted by IEC, and IT environment domain officially adopted by Distributed Management Task Force (DMTF). Our case is the first one.  validation language to validate UML models against OCL constraints. Consequently, validation is performed using proprietary, non-standard methods, see Fig. 4, thereof resulting in a negative impact on essential software factors such as quality and sustainability, among others.
The solution proposed in this work is to adopt SHACL as a language to validate models against the rules in the CGMES context, see Fig. 5. The effort consists of (i) the elaboration of a methodology to transform the OCL and textbased rules to SHACL (ii) the conversion of all CGMES rules to SHACL (iii) the development of an application that allows validating any model of a TSO, member of ENTSO-E, against the CGMES rules.
This work presents a series of contributions: (i) A methodology to convert OCL to SHACL, which has never been addressed before. (ii) Provide ENTSO-E stakeholders with an RDF/SHACL ontological model to validate their models and guide them in developing their own rules in SHACL, either by converting existing ones or creating new ones. (iii) A web application to validate model compliance with the CGMES rules. (iv) Since a big effort is being made from different communities to promote the adoption of ontological engineering through semantic web technologies, our contribution by addressing the conversion from OCL to SHACL bridges the gap between Object-Oriented and ontological paradigms.
The target audience of this paper is domain experts in utilities, and software developers in the context of model validation.
This paper is split into nine parts. In part 2, related work is presented. In part 3, CGMES standard documents are described. Part 4 explores the reasons to adopt SHACL as a validation language. Part 5 shows a basic modeling rule in SHACL. Part 6 describes the methodology to transform OCL and CGMES rules into SHACL shapes. Part 7 presents a web application to validate CGMES models. Finally, part 8 contains conclusions and future works.

II. RELATED WORK
In the last decades, the two most promising approaches for modeling are Object-Oriented and the ontological. The software community has adopted (UML/OCL) for the Object-Oriented modeling approach. On the other hand, the ontology modeling approach is performed using semantic web technologies (RDF/OWL/SHACL) [3]. This section is mainly divided into two parts: (i) a presentation of works related to Object-Oriented modeling, (ii) and another one that includes works related to modeling with ontologies.
A set of publications regarding UML/OCL approaches has been found. In [4], a group of automation strategies for OCL rules is presented. The meta-model evolution has a significant impact on the OCL rules. This work presents a semiautomatic approach throughout a strategy and a technique for OCL rules evolution concerning class diagram changes. In [5], a web-based tool has been developed using a user-friendly language at a higher level of abstraction. This language allows for rules definition. Therefore, knowledge of OCL syntax is not required. In this work [6], authors have used predicates logic to transform UML/OCL to validate UML models. This methodology reduces the abstraction level towards First-Order Logic (FOL) for validating purposes.
In [7], it is created a logical model around UML/OCL considering preconditions, postconditions, states, and transitions between them. This approach adds new constraints (modifies only statements) to validate and verify UML models, taking into account all possible states and transitions between them, thus, reasoning about behavioral validation and verification. [8] demonstrates a conversion from UML/OCL model to the constraint logic programming (CLP) paradigm to allow for validation, reasoning, and verification. Only OCL invariants are considered. A new framework has been developed for automating transformation between class diagram to logic specification (Formula). With a reasoner, satisfiability is determined within certain bounds, i.e., if an instance model can be created.
In [9], a framework is developed with a repository of models. A common data model is created to abstract and integrates heterogeneous sources such as Extensible Markup Language (XML), ontologies, and so on via transformation to query and homogeneously visualize them. It is an independent modeling tool and can be integrated with other modeling tools. In state-of-the-art, it is possible to validate rules in the ontology paradigm. A set of publications regarding ontologies approaches has been found.
In [10], instead of validating RDF data against a schema, a model is validated against RDF data. The purpose is validating model credibility or giving the capability of ontology learning. Validation is performed using axiom scoring heuristics based on possibility theory.
In [11], Semantic Web Rule Language (SWRL) is used for data exchange applied to the water domain using Water Data Transfer Format (WDTF) for the definition of integrity constraints and using Web Ontology Language (OWL) for modeling this domain. Despite OWL and SWRL consider Open-World Assumption (OWA), authors have used OWL/SWRL for model validation. Validation is performed by transforming axioms to queries on the logic domain and transforming these queries into SPARQL Protocol and RDF Query Language (SPARQL), which are machine-readable. However, the authors recognize that SHACL is suitable in the validation contexts.
As can be seen in the state-of-the-art, there are several nonstandardized validation approaches [10], [11]. All of them are in a web semantic context. In 2017 SHACL language has been released as an RDF validation standard in the ontology modeling domain. With our present state of knowledge, no works have been found related to the transformation of text-based and OCL rules to SHACL.
In [12], several validating RDF approaches are described. One of them is SHACL, the first standard validation language for RDF graphs. In [13], an online validation tool for SHACL is available. In [14], an open-source standard Experience API (XAPI) is a starting point for the e-learning domain in collecting learning data. Authors have transformed the XAPI standard model to an ontology, and they have used SHACL for validating the constraints defined in the XAPI standard. JSON is used to serialize data. In [15], authors have transformed Clinical Information Modeling Initiative (CIMI) to ontologies, using SHACL to model and define rules with validation purposes. In [16], a collaborative ontology is created from information streams coming from different sources, both public and private, in the domain of environmental sensors. Semantic mapping, along with data enrichment, is performed. Data are validated using SHACL defining collaborative rules. For that purpose, authors have developed a framework named LSane. In [17], the quality evaluation of a couple of Knowledge Bases (KB) is performed based on completeness and consistency by using a semantic web framework. This evaluation has been performed using SHACL and SPARQL queries. Since these KBs are RDF-based and public, a statistical analysis has been realized. The main contribution of this work is the quality assessment of KBs subject to constant evolution. In [18], the first attempt to use SHACL for modeling and validating CGMES models has been realized but limited to RDF schema validation without addressing OCL rules and text-based CGMES rules.
The domain of this work is data exchange for utilities. As it is shown in previous paragraphs, there is a need from different domains to transform the modeling process and models into an ontological approach with RDF/OWL and SHACL. This work is focused on the most tedious task for this transformation between different approaches: transforming OCL rules to SHACL shapes. For transforming UML to RDF or OWL, OMG specifies this transformation in Ontology Definition Metamodel (ODM) and in MOF to RDF Mapping, which is a new beta specification. However, no work has been found so far concerning the transformation between OCL rules and SHACL. W3C has specified SHACL as standard for RDF validation. In this work a methodology is presented for: (i) the conversion of OCL constraints to SHACL dealing with key lexical, syntactic, and semantic elements in both grammars, (ii) the transformation of text-based rules to SHACL has been addressed both manually and semi-automatically given the ambiguity of natural language. Moreover, with this methodology, the authors have modeled/implemented all the CGMES rules, namely OCL and text-based ones. Additionally, they have developed a microservices-based application for model validation.

III. CGMES STANDARD DOCUMENTS
''Common Grid Model Exchange Specification (CGMES) is an IEC Technical Specification (TS) based on the IEC CIM family of standards. It was developed to meet necessary requirements for TSO data exchanges in the areas of system development and system operation'' [19].
ENTSO-E CGMES provides a set of documents for the standard specification. The Key documents that are of interest for this work are shown in Table 2.
• QoDCRules.xsd is an xsd file for defining rules model, see Fig. 6.

IV. WHY SHACL
In the context of modeling, especially in validating models for quality in interoperability when exchanging models, there is a need for a standard validation language. ''SHACL is a language for validating RDF graphs against a set of conditions'' [22]. W3C provides SHACL to allows for the model's validation against the rules. However, no standard validation language is provided for validating models defined in UML (Object-Oriented). A key aspect of MDE is the model's transformation. Thus, transforming UML/OCL model to RDF/SHACL and validating with semantic web technologies.
A particular case is the CIM standard, whose foundations lie on Object-Oriented techniques (UML) [23]. Therefore, models are class and object diagrams, and with the purpose of model serialization, CIM standard has adopted the W3C RDF language.
It is necessary to note a characteristic of high relevance: the ''executability'' of the model. It can be defined as follows. A model M formalized in a language L under a paradigm P is executable if there exists a processor that allows reasoning tasks on the model for a specific objective without any model transformation. (e.g. P = Object-Oriented, P = ontology, L = RDF; L = UML).
Since UML is a visual language, models formalized with it, are not directly executable. Therefore, it is necessary a transformation to another language that allows the executability feature. However, models defined in RDF are directly executable, in the sense that they support reasoning, see Fig. 7. ENTSO-E has created CIM profiles to specify the CGMES standard. These profiles are mainly class diagrams, together with additional rules written in OCL or text-based. With this specification, it becomes clear that the model is not directly executable. A crucial task such as validation in this context can be performed through: (i) Transforming the model to another paradigm that allows model executability. (ii) Using a general-purpose language, but the text-based and OCL rules require a considerable effort to be implemented, which negatively affects the software quality. (iii) Using Eclipse Modeling Framework (EMF) with OCL, but the returned validation report lacks elementary information since it contains only a reference to the object and the invariant; in essence, OCL is not a validation language. VOLUME 8, 2020  W3C, in response to the community's need for a standard language for validation, has provided SHACL for this purpose.
The arguments for adopting SHACL are: (i) Incorporating SHACL as a standard validation language for the CGMES standard to solve the validation problem. (ii) The creation of an ontology as part of CGMES, representing the structural part in RDFS and the rules in SHACL. Therefore, providing a reusable and shared knowledge (reasoning, processing, querying) and being both machine-readable and human-readable, through semantic web technologies. (iii) RDF/SHACL models are directly executable, which translates into a considerable saving of time and effort in coding into a general-purpose programming language, and into an improvement of process and product quality, as this approach is more aligned with MDE [24]- [26] than UML/OCL.
Additionally, the authors have performed two comparisons concerning: (i) The presence of Object-Oriented and ontologies approaches in key modeling aspects in the CGMES standard, see Table 3. (ii) Features offered by the OCL and SHACL languages, see Table 4.

V. BASIC MODELING OF A RULE IN SHACL. BASIC CONSTRUCTS
This section aims to illustrate the essential elements of SHACL from an architectural, syntactic, and semantic point of view. In the RDF world, everything is a resource. The atomic model in RDF is a triple (subject predicate object) that corresponds to an underlying graph, see Fig. 8. 2 XML metadata interchange 3 Universal resource identifier (URI)  SHACL is a language to validate data models against restrictions. It should be noted that both the data and the language are represented in RDF language. Therefore, they are graphs. SHACL processor has two inputs, RDF data and   a set of rules. It returns a validation report in RDF, which contains the nodes of the data graph where the rules have been violated, see Fig. 9.
The SHACL processor traverses the graph, visiting the focus nodes to evaluate the rule. In case of non-compliance, information about the coordinates where the rules are violated is added to the validation report with additional information such as a feedback message indicated in the rule, see Fig. 10.
W3C has divided the SHACL specification into SHACL-CORE, which contains the basic constructs, and SHACL-SPARQL, which provides constructs for a more advanced expressivity, based on SPARQL.
The following example illustrates a simple rule, its implementation with SHACL-CORE, and the output of the validation report. The RDF data is described in Listing 1.
Where :Peter object of type :Person, is declared (:Peter a :Person;) with two identification codes ''123456789A''   and ''123456789B''. The graph corresponding to the previous RDF is shown in Fig. 11.
In this rule sample, all instances of type '':Person'' should have a unique value for '':idCode''. SHACL code that defines the rule is depicted in Listing 2.
Listing 3 shows the validation report obtained once SHACL processor is executed with input data corresponding to Listing 1 (RDF data) and Listing 2 (SHACL shapes).
In the RDF validation report depicted in Listing 3, the RDF data violates the rule because there are two ''idCodes'' for the same person. The report shows violation coordinates in terms of the node and the rule where non-compliance has occurred. Additional information, such as severity and a feedback message, is presented.

VI. METHODOLOGY TO TRANSFORM OCL AND CGMES RULES TO SHACL SHAPES
Object-Oriented and ontological paradigms are formalized by UML/OCL and RDF/SHACL, respectively. The transformation from UML/OCL to RDF/SHACL is shown in Fig. 12, and involves the following steps: 1. Class diagrams are transformed into RDF/OWL schemas by mapping UML concepts to RDF concepts. 2. OCL constraints are transformed into their SHACL shapes counterparts. 3. Text-based rules are defined in SHACL constraints 4. UML Objects are converted to RDF data. The focus of this work is on transforming OCL and text-based rules to SHACL.

A. OCL TO SHACL
This section aims to illustrate how to perform the transformation from OCL to SHACL by presenting the key ideas that facilitate UML transformation with semantic web technology applied to the domain of electrical utilities (CGMES).
Some requirements must be taken into consideration during the activity of the transformation from OCL to SHACL as follows: a. How the SHACL processor runs through the data graph with the constraint graph. b. The main task is to produce triples from the components of the OCL expressions. An invariant in OCL is defined as follows: ''Context ClassName inv: invariantName oclExpression'' defines a constraint identified by invariantName, and where ClassName represents the name of the class of the objects on which the boolean condition indicated in oclExpression is applied. An OCL invariant and a SHACL shape can be represented by the UML classes in Fig. 13.
The mapping between Invariant and ShaclRule is presented in Fig. 14.
The process of transforming OCL to SHACL consists of the following stages: 1. Extract the main concepts from the OCL invariant and build with them the initial structure of the   SHACL rule, as illustrated in the Ocl2Shacl() algorithm shown in Listing 4. As OCL is used to define constraints on UML class diagrams, and SHACL is a language to validate RDF data against RDF shapes graphs, the Ocl2Shacl() algorithm needs as a parameter CIM_UML_RDF_TABLE, a table that contains the mapping of the domain concepts between UML and RDFS. The mapping, which is an inherent part of the CGMES standard, is shown in Table 5. 2. The mapping between OCL and SHACL concepts lies in the observation that the primary or main element of an OCL expression is navigability, and it starts from the object whose type is indicated by default in the context statement of the OCL constraint, or explicitly, such as Class.allInstances().count() where Class is any class in the class diagram that represents the model. The main idea or task is the creation of triples (subject predicate object) from OCL expressions of the form ''a.x'' representing the navigability starting from an object on which the rule is applied to the feature x. The oclEx-pression2Sparql() algorithm illustrates a high-level way to realize this task (see Listing 5). Table 5 shows the semantic and syntactic mapping between OCL and SHACL. Listing 6 depicts an excerpt of OCL grammar, defined by OMG, and Fig. 15 shows its corresponding abstract syntax where the primary element has the following form: exp :: = a opNav b (params)? (where opNav is '.' or '− >') LISTING 5. Transformation from OCL expression to SPARQL query. LISTING 6. An excerpt of OCL grammar specified by OMG.

B. FROM TEXT-BASED RULES TO SHACL
The algorithm's parameters are a text-based rule and a schema domain; Text-rule is specified in natural language in the context of the domain, in this case, electrical utilities; Text-based to SHACL manual algorithm.
The schema domain is the metamodel domain defined in RDF. Concepts in the text-based rule are translated to CIM concepts and are connected using the schema domain to build the pattern that corresponds to the original rule. The conversion from the concept of the domain in natural language to its corresponding one in CIM/XML can be realized with two approaches: (i) manual, with the collaboration of the domain expert or (ii) semi-automatic using Natural Language Processing NLP technologies in two steps. First, similarity-based processing is performed to determine the RDF schema concepts involved, and the domain expert intervenes to select the most appropriate match. Second, the pattern corresponding to the rule is built by searching for the subgraphs that interconnect the concepts found in the previous step and involving the domain expert to select the most appropriate pattern.

VII. TOOL DEVELOPMENT FOR VALIDATING CGMES RULES
Authors have developed a Model View Controller (MVC) web application as a set of microservices to validate models from utilities. Hypertext Markup Language 5 (HTML5) is used to implement the view part; the model layer is represented through RDF/RDFS/SPARQL/SHACL; the controller part consists of a set of Java classes that manage Hypertext Transfer Protocol (HTTP) requests and perform validation tasks invoking SHACL processor. Fig. 16 shows the architectural layer of the developed application. A more detailed architecture, representing the application components and the communication among them is shown in Fig. 17. For validation purposes, the system uses RDF/XML and schema as input data, representing domain models from electrical utilities. Validation levels have been grouped into three categories: (i) A microservice copes with validating RDF files at levels 1, 2, and 3. (ii) A second microservice deals with level 4 validation (OCL). (iii) Finally, a third microservice deals with levels 5 to 8. The reason for this separation is that levels 1, 2, and 3 deal with file naming, XML wellformedness and XML structure, namespaces, headers, and file packaging. Level 4 deals with OCL rules defined by the standard. Levels 5 to 8 deal with cross-profile consistency, business rules related to load flow, cross Individual Grid Model (IGM) consistency, and Common Grid Model (CGM) plausibility.
A microservice is provided for the assembling process, which consists of producing a graph from the four files that represent an IGM (EQ, TP, SV, and SSH together with the Boundary Set) and the corresponding RDFS.
The user interface of the application is shown in Fig. 18. Another objective of this application is to evaluate (i) The contribution of adopting RDF/SHACL within the MDE approach. (ii) The impact of using models that are directly executable in software development. (iii) The improvement in scalability, maintainability, and agility, among others, in the software process.
For developing an application in electrical utilities context with this approach, it is necessary: (i) Background in CIM, UML, RDF/S, SPARQL, and SHACL. (ii) The mapping of concepts from natural language to CIM concepts. (iii) The model and semantics of a rule. (iv) The navigability within the metamodel (RDFS) and the models (RDF) of the domain.
(v) Graph pattern building, corresponding to the rule that will be translated into a SPARQL query which will be embedded into SHACL.
For developing the application, Spring microservices, SHACL validation library (developed with JENA library), HTML5, and a Javascript library are required.

VIII. CONCLUSION AND FUTURE WORKS
In this work, a standardized solution for the validation problem has been presented for UML/OCL-based models, adopting SHACL as the validation language. To this purpose, a methodology for converting OCL and text-based rules to SHACL has been specified. Since CGMES is based on UML/OCL, a rule transformation to RDF/SHACL is required.
The arguments for adopting SHACL as a validation language have been presented through a comparison between the two approaches (UML/OCL vs. RDF/SHACL) considering essentials criteria for the model validation task such as violation reporting, reasoning, among others.
The basic concepts and constructs of SHACL have been presented, as a first approach to the semantic and syntactic foundations of this validation language within the modeling domain.
The methodology for the transformation from OCL and text-based rules to SHACL has been presented as algorithms. For the conversion from OCL to SHACL, key elements have been considered in lexical, syntactic, and semantic terms within the scope of the two grammars. The text-based transformation to SHACL has been addressed from the linguistic perspective towards the syntax and semantics of SHACL.
Authors have implemented all the rules, defined by CGMES, in SHACL. They have also developed a web application with Spring microservices to validate the eight validation levels of CGMES specifications. They will make them available to the user community, both electrical and software engineering.
The contributions presented in this work are: (i) Transforming OCL and text-based rules to SHACL, a task never addressed before, through a methodology that allows a systematic implementation of the rules. (ii) Providing ENTSO-E stakeholders with an ontology that enables them to validate their models. Additionally, it can help as a guide for their rule implementation. (iii) Making available a web application, developed by the authors, to the community of electrical utilities, to validate CGMES models against the eight levels of validation specified in the standard. (iv) Bridging the gap between Object-Oriented and ontological paradigms by addressing the conversion of OCL to SHACL, since a considerable effort is being realized from different communities to foster the adoption of ontological engineering through semantic web technologies.
This work has focused on the CGMES standard release that is currently adopted by ENTSO-E members. A minor limitation of this work is the optimization of rules expressed in SHACL, in terms of performance for SHACL processor, and implementing new CGMES standard releases in the short term.
To further our research, we plan to develop a Domain-Specific Language (DSL) in the electrical utilities domain, enabling experts to write text rules to be transformed into SHACL automatically.