GreenMicro: Identifying Microservices from Use Cases in Greenfield Development

Microservices architecture is a new paradigm for developing a software system as a collection of independent services that communicate through lightweight protocols. In greenfield development, identifying the microservices is not a trivial task, as there is no legacy code lying around and no old development to start with. Thus, identification of microservices from requirements becomes an important decision during the analysis and design phase. Use cases play a vital role in the requirements analysis modeling phases in a model-driven software engineering process. It is a technique of capturing high-level user functions and scope of the system. In this paper, we propose GreenMicro, an automatic microservice identification technique that utilizes the use cases model and the database entities. Both features are the artifacts of analysis and design phase that depict complete functionality of an overall system. In essence, a collection of related use cases indicates a bounded context of the system that can be grouped in a suitable way as microservices. Therefore, our approach GreenMicro clusters close-knit use cases to recover meaningful microservices. We investigate and validate our approach on an in-house proprietary web application and three sample benchmark applications. We have mapped our approach to state-of-the-art software quality assessment attributes and have presented the results. Preliminary results are motivating and the proposed methodology works as anticipated in producing functionally cohesive and loosely coupled microservice candidate recommendations. Our approach enables the system architects to identify microservice candidates at an early analysis and design phase of development.


I. INTRODUCTION
Monolithic architecture is one of the most extensively used architectures for web applications. It bundles the user interface, business logic, and data store into a single executable that is deployed as a single package. The web server accepts incoming HTTP requests, executes them, and produces responses. However, as a monolithic application grows, its codebase becomes more complex, and maintaining and scaling it eventually become difficult and expensive. Even the smallest change necessitates testing and redeploying the entire application. Likewise, scaling under increased traffic involves deploying the entire codebase even though only a small subset of its components is overloaded.
Microservices architecture (MSA) addresses these problems of monolithic applications by breaking them into small services, where each service satisfies its own bounded context. Another important aspect of MSA is traceability between the functional requirements and the microservice system structure. Therefore, only one service has to be scaled, updated, and redeployed when the domain changes. This enables faster time-to-market and reduces the turnaround time for each release. These advantages come at the price of complex deployments of individual services, management overhead, and monitoring challenges.
Microservice application development can be categorized as greenfield or brownfield [4]. Greenfield development refers to development starting from a clean slate, i.e., with no legacy code. Brownfield development means developing a new software system from an existing application. Here, we focus on greenfield development: development for a totally new environment, from scratch, with no restrictions or dependencies. In such scenarios, the system analysis and design artifacts of the software development life cycle (SDLC) are available for decomposition. Major SDLC artifacts that can be used are use cases, functional requirements, non-functional requirements, DFD, BPMN, API specifications, class diagrams, UML diagrams, Domain-Driven Design, and application and data design. Since there is no clear direction for microservices development, the degree of risk is comparatively higher for greenfield approaches, which makes these developments more challenging.
A use case is a technique used in system analysis to identify, elucidate, and organize system requirements. Use cases are usually written by software analysts and can be used extensively during several stages of the SDLC. Use case diagrams give an abstract, high-level view of functionality. Use case modeling is accepted and widely used in industry. A use case model includes use cases, actors, and relationships between use cases [5].
Use Cases may be related to other use cases by the following relationships: Generalization, Include, and Extend [6] as shown in Figure 1. In Generalization, a parent use case may be specialized to one or more child use cases that indicate more specific forms of the parent. In such scenarios, the child inherits all structure, behavior, and relationships of the parent. Children of the same parent are all specializations of the parent. Generalization is simply the inheritance relationship between two use cases by which one use case inherits all the properties and relationships of another use case. Generalization between use cases is represented as a solid directed line with a large arrowhead toward the parent use case. Include relationship denotes the inclusion of a use case as a sub-process of another base use case. Here, the base use case is dependent on the included use case and without them the base use case is incomplete as the included use case represents a sub-sequence of interactions that may always happen. In other words, if a certain use case must function at the end of another use case then there will be an Include relationship between the two use cases. Base use cases require the completion of included use cases in order to be completed. Dotted Arrow is directed from base use case to included use case.

FIGURE 1. Modeling options in Use Case Relationships
The Extend use case is dependent on the base use case. It extends the behavior described by the base use case. The base use case should be fully functional in its own right. The arrow is directed from the extending use case towards the base use case. These three relationships can be used to group use cases and to discover appropriate partitions of the system. We use the use cases from the requirements phase for greenfield development.
Typically, microservices are designed intuitively, relying on the expertise of software architects and designers. However, drawing the wrong service boundaries can prove very expensive [7]. It leads to higher inter-service communication and highly coupled components, and consequently may be worse than having a single monolithic system. In this scenario, a microservice application works like a distributed monolith, i.e., an application deployed like microservices but built like a monolith.
Since a microservice is fine-grained, its functionality typically comprises several related use cases. As an example, a Flight Booking service has functionality that can encompass multiple use cases such as Book Flight, View Booking, and Cancel Booking. In principle, these use cases should be bundled together in a microservice. Often, such a grouping of use cases exceeds the functionality provided by a single class.
We have devised a five-step approach, GreenMicro, to identify microservices for greenfield developments based on clustering of use cases. GreenMicro uses several criteria drawn from functional requirement specifications: i) use cases, ii) dependency relationships among use cases, and iii) database entities. A similarity matrix is computed from these criteria, and a clustering algorithm is applied to it to find candidate services. We have evaluated our approach on an in-house proprietary application and three sample benchmark applications.
Key contributions of this paper are as follows:
C1: A five-step microservice extraction approach, GreenMicro, based on grouping use cases into services that are highly cohesive and loosely coupled at the same time.
C2: Applying the proposed approach to the 'Teachers Feedback Web Application (TFWA)' as a Proof of Concept (PoC).
C3: Applying our approach to three benchmark applications: i) Cargo Tracking System, ii) AcmeAir, and iii) JPetStore.
C4: Validating our results quantitatively.
Microservice implementations of some of the projects are available on GitHub (JPetStore, TFWA).
The remainder of the paper is organized as follows: Section II presents the related work in the greenfield decomposition domain. Section III describes the proposed methodology for microservice identification. Section IV walks through an in-house Java application taken as an example project to describe our approach and illustrates the implementation on three sample benchmark web applications. Section V presents the quantitative analysis of our methodology along with the results. Section VI states the conclusion.

II. RELATED WORK
There exist several techniques for identifying microservices in greenfield development. The techniques use the requirement documents, requirement models or design documents to achieve candidate services. Here, we discuss the existing approaches. Both [8], [9] used dataflow-driven decomposition techniques for identifying microservices from the given detailed business requirements. In [8] a top-down decomposition approach is used where authors have converted traditional dataflow diagram (DFD) to get purified DFD. A two-phase automation algorithm for decomposition is proposed: (1) generating a decomposable DFD from purified DFD; (2) identifying candidate microservice from decomposable DFD. Li et al. [9] suggested a semi-automatic dataflow-driven microservice decomposition approach to generate a process-data store version of DFD (DFDPS). DFDPS shows the relation between processes and related data stores. Next, a condensed DFD called decomposable DFD is taken out from DFDPS by extracting the sentence sets in which a process reads or writes data to a data store. Last step of their proposed approach is to group modules of fine-grained processes and the related data stores to identify microservices. Amiri et al. [10] presents a microservice identification method based on a set of business processes. Authors identified fine-grained services from business processes. They used the notions of structural and object dependency between business activities represented as business process model notation (BPMN). Ahmadvand [11] proposed a methodology that reconciles security and scalability requirements to be included in the requirements engineering phase. Their approach maps functional and non-functional requirements to identify more optimal system decomposition. Baresi et al. [12] proposed semantic similarity of available functionality evaluated through OpenAPI specifications. 
Their approach identifies potential candidate microservices by matching the key terms in the specifications against a reference vocabulary and suggests possible decompositions. The success of their approach depends on well-defined Application Programming Interfaces with meaningful names. Another service identification technique [13] is based on functional decomposition of use case requirements. The authors first create a model of the system that contains a finite set of system operations and the system's state space. They then create an operational/relational dependency graph using automated tools to derive possible decompositions. Fan et al. [14] analyze the system architecture using Domain-Driven Design (DDD) and extract the candidate microservices. Later, they analyze the database schema to verify whether it is consistent with the candidate microservices. Finally, they filter out inappropriate service candidates. In [15], an automatic identification approach based on a set of business processes is proposed. This multi-model approach combines different independent models that represent a business process, such as control, data, and semantic dependencies [16]. Rivera et al. [17] propose another method that works on user stories to decompose the functionalities or requirements of the application into microservices.
Most of these existing approaches have two important disadvantages: 1) lack of validation of the results, and 2) over-dependence on expert opinion. As seen above, researchers have used various techniques in greenfield developments. We build our approach on use cases, as they are the most common and straightforward way to model the functional requirements of a system that are defined during the early phase of software development. Use cases delineate a high-level view of the system without delving into its intricacies.
A complete set of use cases indicates all the functionality and behavior of the system. Thus, in our work, we harness the ability of use cases model, and finally group these business use cases into a set of candidate microservices. To achieve this, we describe a systematic approach GreenMicro to reengineer applications in the early analysis and design phase, based on the system's functional requirements illustrated as use cases.

III. METHODOLOGY
Use case modeling is extensively used in contemporary software development engineering as an approach for requirements elicitation [43]. Further, relationships between use cases model the dependency and connection between individual use cases elements in the system. These relationships add semantics to use case models by defining the structure and behavior between the model elements. Also, use cases accessing the same data are more related than other use cases. Therefore, we present a systematic methodology to find similarity of use cases by making use of use case relationships and database entities which lead to finding candidate microservices.

A. BASIC DEFINITION
In this section, we mathematically represent the problem of determining a set of microservices using use cases. Consider a set of use cases $UC_A = \{UC_1, UC_2, \ldots, UC_k\}$, where $UC_i$ represents an individual use case. Using this, we define a set of microservices $\mu_A = \{\mu_1, \mu_2, \ldots, \mu_n\}$ on $UC_A$ such that:
• $\bigcup_{i=1}^{n} \mu_i = UC_A$. It means all use cases are assigned to some microservice.
• $\mu_i \neq \emptyset, \ \forall i = 1, \ldots, n$. It means there is no empty microservice.
• $\mu_i \cap \mu_j = \emptyset, \ \forall i, j = 1, \ldots, n, \ i \neq j$. It means no use case belongs to more than one microservice.
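The three conditions above state that the microservices form a partition of the use case set. As a minimal sketch, they can be checked as follows; the function name `is_valid_decomposition` and the `Login` use case are hypothetical and introduced only for illustration.

```python
from itertools import combinations


def is_valid_decomposition(use_cases, microservices):
    """Return True if the microservices form a partition of the use cases."""
    use_cases = set(use_cases)
    services = [set(s) for s in microservices]
    # Condition 1: the union of all microservices equals the use case set.
    covers_all = set().union(*services) == use_cases
    # Condition 2: no microservice is empty.
    non_empty = all(len(s) > 0 for s in services)
    # Condition 3: microservices are pairwise disjoint.
    disjoint = all(a.isdisjoint(b) for a, b in combinations(services, 2))
    return covers_all and non_empty and disjoint


# Hypothetical use case set, loosely based on the Flight Booking example.
all_use_cases = {"Book Flight", "View Booking", "Cancel Booking", "Login"}
candidate = [{"Book Flight", "View Booking", "Cancel Booking"}, {"Login"}]
print(is_valid_decomposition(all_use_cases, candidate))
```

A decomposition that leaves a use case unassigned, contains an empty service, or places one use case in two services fails the check.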
The main idea for our approach is to identify microservices by assessing the functional dependencies/ relationship between use cases. This can be implemented by clustering closely associated use cases into a service. Due to the informal nature of functional description of use cases, the degree of dependency between them may not be calculated directly. Therefore, we propose the approach GreenMicro to effectively determine the functional dependency between two use cases, UC i and UC j .

B. PROPOSED APPROACH
GreenMicro is a systematic approach for microservice identification comprising a sequence of steps that is logical and easy to apply in practice. The proposed approach consists of five steps, as shown in Figure 2. We assume that use case models are fundamental artifacts of any object-oriented system and are readily available to system designers. The use case set, UC A, can be gathered manually from the requirement documents, or its extraction can be automated by parsing the given software requirements using NLP-based language models, as discussed in [44]. An input dataset is a <component-attribute> data matrix.
Components are the entities (use cases) that we want to combine on the basis of their similarities. Attributes are the properties of the components. Two datasets are utilized in our work for identifying microservices: i) the Use Case-Use Case Relationship Matrix, and ii) the Use Case-Database Entities Relationship Matrix. These two datasets are artifacts of the requirements analysis and design phase of application development. The two relationship matrices are aggregated to obtain a combined similarity matrix. In subsequent steps of the approach, clustering is performed on the combined similarity matrix to get the desired results. Below are the steps we use for microservice candidate identification:
1. Use Case-Database Entities Relationship Matrix (UC-D): Use cases accessing the same data are more related than other use cases. In other words, use cases manipulating the same database entities have some degree of relationship. Thus, by examining the functionality of use cases, one can determine the database entities manipulated by each use case. Database entities are therefore regarded as features, and the CRUD operations that specific use cases apply to them are recorded in the Use Case-Database Entities Relationship Matrix (also known as the CRUD matrix). This matrix has the use cases UC A as rows, the database entities as columns, and the semantic relationship tags Create (C), Update (U), Delete (D), and Read (R) as cells [18].
• "C" means the use case CREATES the database entity.
• "U" means the use case UPDATES the database entity.
• "D" means the use case DELETES the database entity.
• "R" means the use case READS the database entity.
This tag-based matrix is then transformed into numeric values: each tag is replaced with a corresponding value (weight) according to the priority C > U > D > R. In our work, to simplify computations, we adopt the substitutions C = 1, U = 0.75, D = 0.5, R = 0.25, as suggested in [18].
2. Use Case-Use Case Relationship Matrix (UC-UC): This matrix represents the degree of interdependence among all the use cases. Three types of relationships exist between use cases: Include, Extend, and Generalization, as discussed in the introduction. Here, the use cases, UC A, are represented in both rows and columns, and their relationships are marked in the cells of the matrix. In this paper, we adopt the substitutions Generalization = 1, Include = 0.66, Extend = 0.33, as suggested in [19].
3. Constructing a Resemblance Matrix for UC-D: Here, we find the resemblance coefficient between each pair of use case entries in the UC-D matrix. Each row of the UC-D matrix can be seen as a vector whose similarity to the other rows needs to be evaluated. To find the degree of similarity or dissimilarity between two use cases, we use the Cosine Similarity between pairwise component vectors to represent the weight of the relationship between them. Cosine Similarity quantifies the similarity between two vectors. It is a common approach and has been used in many other related studies on microservice identification [20], [21]. The score indicates how strongly two use cases are related in terms of accessing the database entities. The higher the value, the stronger the relationship between the use cases.
$\cos(x, y) = \frac{x \cdot y}{\|x\| \, \|y\|}$
where $x \cdot y$ is the dot product of the vectors $x$ and $y$, and $\|x\| \, \|y\|$ is the product of their Euclidean norms (magnitudes).
4. Generate combined Similarity Matrix: In this step, we create a combined Similarity Matrix of use cases, which is an N×N symmetric matrix. Its (i, j)-th element represents the similarity measure for UC i and UC j, where i, j = 1,…,N. For this, the Resemblance Matrices for UC-D and UC-UC are aggregated to generate the combined Similarity Matrix.

5. Clustering of use cases: The combined similarity matrix generated in the previous step is the input to the clustering technique. We perform clustering to obtain cohesive sets of use cases that may be bundled together as microservices. Given a set of use cases, UC A, clustering assigns each use case to a specific group called a cluster. In our work, we consider each use case UC i as a distinct object. Use cases in the same cluster should be as homogeneous as possible to ensure the cohesion of the cluster. In contrast, use cases belonging to different clusters should be as distinct as possible to ensure loose coupling between clusters. For our work, we have applied classical Hierarchical Agglomerative Clustering (HAC) [22] for two reasons.
Firstly, it has been utilized in numerous earlier works on software re-modularization [23] and microservice candidate identification [24], [25], [20]. Secondly, it has lower time complexity than the hill-climbing technique [26] and genetic algorithms [27], [10]. Although hierarchical clustering yields a fully connected hierarchical tree (a dendrogram) rather than a fixed partition, the optimal number of clusters to extract can still be determined. For this, we applied the Silhouette method, which suggested the optimal number of clusters. Each group of use cases can be considered a potential microservice candidate. By the end of this step, we obtain possible microservices μ A from the given set of use cases UC A.
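The five steps can be sketched end to end in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' implementation. The CRUD weights and the cosine similarity follow the text. The aggregation rule (element-wise mean after symmetrising the directed UC-UC weights), the average-linkage criterion, the fixed cluster count `k` (the paper selects it with the Silhouette method), and the toy use cases and entities are all assumptions made for this example.

```python
import math

# Weights from the text: C = 1, U = 0.75, D = 0.5, R = 0.25.
CRUD_WEIGHT = {"C": 1.0, "U": 0.75, "D": 0.5, "R": 0.25}


def cosine(x, y):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norms = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norms if norms else 0.0


def ucd_resemblance(ucd_tags):
    """Steps 1 and 3: map CRUD tags to weights, then compare rows pairwise."""
    rows = [[CRUD_WEIGHT.get(tag, 0.0) for tag in row] for row in ucd_tags]
    n = len(rows)
    return [[cosine(rows[i], rows[j]) for j in range(n)] for i in range(n)]


def combine(sim_d, uc_uc):
    """Step 4 (assumed rule): mean of UC-D similarity and symmetrised UC-UC weight."""
    n = len(sim_d)
    return [[(sim_d[i][j] + max(uc_uc[i][j], uc_uc[j][i])) / 2 for j in range(n)]
            for i in range(n)]


def hac(sim, k):
    """Step 5: average-linkage agglomerative clustering down to k clusters."""
    clusters = [[i] for i in range(len(sim))]

    def link(a, b):
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        pairs = [(p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters))]
        # Merge the pair of clusters with the highest average similarity.
        p, q = max(pairs, key=lambda pq: link(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


# Hypothetical toy system: 4 use cases, 2 database entities (Booking, User).
names = ["Book Flight", "Cancel Booking", "Login", "Register"]
ucd_tags = [["C", "R"], ["D", ""], ["", "R"], ["", "C"]]
# Step 2: Book Flight includes Login (Include = 0.66); all other cells are 0.
uc_uc = [[0.0] * 4 for _ in range(4)]
uc_uc[0][2] = 0.66

combined = combine(ucd_resemblance(ucd_tags), uc_uc)
clusters = hac(combined, k=2)
print([[names[i] for i in c] for c in clusters])
```

With these toy inputs, the two booking use cases end up in one candidate service and the two user-related use cases in the other. A different aggregation rule or linkage criterion could change the grouping, so both should be treated as tunable choices.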

FIGURE 2. Complete Outline of the GreenMicro Approach
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication.

IV. EMPIRICAL EVALUATIONS
In this section, we present subject applications on which empirical evaluations are being performed. We also describe baseline techniques with which results of GreenMicro are compared. We also elaborate quality assessment metrics that are utilized in this evaluation work.

A. SUBJECT APPLICATIONS
TFWA: Teachers Feedback Web Application (TFWA) automates the feedback system for teachers. Students use TFWA to submit their feedback online for all subjects and the respective teachers. A visually impaired student is a specialized actor who can give feedback through a chatbot. The Teacher In-Charge (TIC) of every department and the Principal can check the status of the feedback procedure and can view and analyze the feedback data from different analytics perspectives (department-wise, teacher-wise, class-wise, and subject-wise). All use cases are depicted in Figure 3. There are four main actors that interact with the system: Student, TIC, Principal, and Admin. We extracted twenty-two use cases and seven database tables. Thus, the dimension of the UC-D matrix is 22x7. The UC-UC matrix is a 22x22 matrix depicting relationships between use cases. The Combined Similarity Matrix is also a 22x22 matrix, on which the clustering algorithm is finally applied. Figure 4, Figure 5, and Figure 6 show excerpts of the Use Case-Database Entities Relationship Matrix (UC-D), the Use Case-Use Case Relationship Matrix (UC-UC), and the Combined Similarity Matrix, respectively, for TFWA.

FIGURE 5. Excerpts of UC-UC Matrix for TFWA
JPetStore: An e-commerce shopping application that allows users to browse the catalog, products, and items, add items to the cart, remove items from the cart, update cart items, and purchase items within several categories of pets.

B. RELATED TECHNIQUES FOR QUANTITATIVE EVALUATION
The objective of our evaluation is to assess whether GreenMicro can produce effective microservice candidates. For the JPetStore and AcmeAir applications, we compare our approach with four well-known baselines for microservice identification that have reported results for these applications: FoSCI [27], CoGCN [36], Mono2Micro [28], and MEM [2].
FoSCI collects and processes the execution traces of the monolithic application and identifies functional atoms using a search-based functional atom grouping algorithm built on hierarchical clustering. FoSCI then assigns these functional atoms to microservice candidates by merging them using a genetic algorithm.
CoGCN proposes a multi-objective Graph Convolution Network approach to partition monolithic applications using graph-based clustering. The technique minimizes the effect of structural and attribute outlier classes on the embeddings of other classes.
Mono2Micro employs a spatial-temporal decomposition technique that runs selected business use cases of the monolithic application and dynamically collects runtime call traces to find functionally cohesive clusters of application classes. The business use cases constitute the space dimension, and the control flow of the dynamic runtime traces conveys the time dimension.
MEM makes use of Kruskal's algorithm to find the minimum spanning tree (MST) for the monolithic application. This technique has two transformation stages. In the construction step, the monolith is transformed into a graph representation using three coupling criteria: logical, contributor, and semantic coupling. In the clustering step, the graph representation is decomposed to generate partitions that act as microservice candidates.
The baseline approaches listed above do not report an analysis of the Cargo Tracking System. So, for this application, we compare our approach with four other well-known baselines for microservice identification: Service Cutter [39], API Interface Analysis [12], DFD Analysis [9], and Business Processes Analysis [16].
Service Cutter is a state-of-the-art approach for microservice identification. The inputs to the tool are a set of requirement artifacts and weighted coupling criteria. The tool outputs a graph in which nodes indicate candidate services and the weights along the arcs represent how cohesive or coupled two candidate services are. Lastly, a clustering algorithm offers the most appropriate microservice cuts.
API Interface Analysis evaluates the semantic similarity of available functionality through OpenAPI specifications. This approach identifies potential candidate microservices by matching the key terms in the specifications against a reference vocabulary and suggests possible decompositions.
DFD Analysis is a semi-automatic dataflow-driven microservice decomposition approach that generates a process-data store version of the DFD (DFDPS). The DFDPS shows the relation between processes and related data stores. This DFDPS is condensed to get a decomposable DFD, in which the sentences between processes and data stores are joined.
Business Processes Analysis utilizes a multi-model approach that combines different independent models representing business process activities, such as control, data, and semantic dependencies, and captures the dependencies among them. A clustering algorithm is then applied to combine all extracted dependencies for identifying microservices.

C. METRICS UTILIZED
For the selected benchmark applications, we could not find quality assessment metrics that are common for all three applications. Researchers have validated their work on varied sets of quality attributes. Therefore, we test these benchmark applications for those quality assessment metrics which are used in other baseline studies. This eventually gives us an opportunity to assess and validate our approach on a diverse spectrum of quality attributes. We apply below mentioned five quality metrics for JPetStore and AcmeAir applications to measure the effectiveness of partitions recommended using GreenMicro.
Structural Modularity (SM), as defined in [28], [36], [27], measures the modularity quality of a microservice from a structural view. The higher the SM, the better modularized the microservice is.
Inter-Partition Call percentage (ICP) [28], [35] is the percentage of calls that occur between two microservices i and j, computed from the number of calls between microservice i and partition j. The lower the value of ICP, the better the microservice identification.
Business Context Purity (BCP) [28], [41] indicates the mean entropy of business use cases per partition. A microservice is considered functionally cohesive if it implements a smaller number of use cases. BCP is computed over N, the number of services, and the number of business use cases in each microservice. Since BCP is based primarily on entropy, lower values are better.
Interface Number (IFN) [27], [28], [36] indicates the average number of published interfaces exposed by a microservice to other services. The smaller the value of IFN, the better, as the service then follows the Single Responsibility Principle (SRP). A service publishing a large number of interfaces may provide numerous functionalities, thus violating SRP.
Non-Extreme Distribution (NED) [28], [36] measures how evenly distributed the sizes of the recommended microservices are. In general, a microservice that has too many or too few classes is not preferred. NED is computed from ni, the number of classes in service i, and V, the set of classes; service i is not extreme if its size is within the bounds {5:20}. The lower the value of NED, the better.
For quantitative evaluation of the Cargo Tracking System, we could not find any research paper in which the metrics listed above are used. So, we make use of four other object-oriented design metrics, namely i) Number of Incoming Dependencies, ii) Number of Outgoing Dependencies, iii) Instability, and iv) Relational Cohesion, as used in other baseline techniques [9], [12], [16], [39]. In general, all the metrics are based on coupling, cohesion, and the number of interactions between microservices. A concise description of these metrics is given below:
Number of Incoming Dependencies: Measures the number of classes outside this microservice that depend upon classes within it. It is also called afferent coupling (Ca).
Number of Outgoing Dependencies: Measures the number of classes inside this microservice that depend on classes outside it. It is also called efferent coupling (Ce).
Instability Index (I): Indicates a service's resilience to change. It ranges from 0 to 1 (both inclusive) and is calculated as the ratio Ce / (Ca + Ce). I = 0 (a maximally stable service) means no method in this service has a dependency on any method or class in another service. If there are no outgoing dependencies, Instability is 0 and the measured service is stable. If there are no incoming dependencies, Instability is 1 and the measured service is unstable. Stable means that the element is not so easy to change.
Relational Cohesion (RC): Measures the number of internal relations, which represent class inheritance, method invocations, access to class attributes, etc. Higher values of relational cohesion suggest more cohesion.
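Two of these metrics are simple enough to sketch directly. The Instability function follows the ratio Ce / (Ca + Ce) given above. The NED function is an assumption: it takes NED as one minus the share of classes that sit in non-extreme services, which matches the "lower is better" reading and the {5:20} bounds, but the exact formula should be checked against [28], [36]. The function names and the sample numbers are hypothetical.

```python
def instability(ca, ce):
    """Instability I = Ce / (Ca + Ce).

    Returning 0.0 for a service with no dependencies at all is an
    assumption made here to avoid division by zero.
    """
    total = ca + ce
    return ce / total if total else 0.0


def non_extreme_distribution(service_sizes, low=5, high=20):
    """Assumed NED: 1 - (classes in non-extreme services) / (all classes).

    A service is treated as non-extreme when its class count lies within
    [low, high]; the default bounds {5:20} come from the text.
    """
    total_classes = sum(service_sizes)
    if total_classes == 0:
        return 0.0
    non_extreme = sum(n for n in service_sizes if low <= n <= high)
    return 1 - non_extreme / total_classes


# Hypothetical service with 3 incoming and 1 outgoing dependency.
print(instability(ca=3, ce=1))
# Hypothetical decomposition with services of 10, 10 and 30 classes.
print(non_extreme_distribution([10, 10, 30]))
```

In the second example, the 30-class service falls outside the bounds, so a larger share of classes counts as extreme and the NED value rises.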

V. RESULTS
We evaluated GreenMicro's performance for three benchmark applications and one proprietary application.
For each application, we applied our approach and grouped sets of cohesive use cases to obtain microservices. We compared the Cargo Tracking System against four baselines on four evaluation metrics. Further, we compared AcmeAir and JPetStore against four baselines on five evaluation metrics. Tables II, III, IV, and V present the comparison of our results for all four applications across the two sets of metrics. For all the metrics, we have assigned one of two labels, "(-)" or "(+)". The label "(-)" indicates that lower values are better, and the label "(+)" indicates that higher values are better.
For the AcmeAir application, GreenMicro performed better than the other approaches for SM, ICP, BCP, and IFN, as shown in Table II.
For JPetStore, GreenMicro yields better results for ICP, BCP, IFN, and NED. A lower ICP indicates a lower call percentage between services. Winning in terms of NED indicates that the majority of the services contain 5 to 20 classes, as shown in Table III. It may be noted that the value of the SM metric is slightly lower than that of MEM (the highest).
For TFWA, we obtain three microservices, namely Authentication, Feedback, and Analytics. Table V (a) and (b) present the quality assessment metrics for both sets of metrics discussed above. GreenMicro improves the long-term health, quality, and maintainability of the application and yields better software restructuring.
For TFWA, we also compare the legacy monolithic application with the microservices application developed according to GreenMicro. Table VI aggregates the quality assessment parameters provided by SonarGraph Architect, as described below:
• System Maintainability Level -Evaluates maintainability (in %) by assessing the dependency structure between components in source files. Cyclic dependencies and incoming dependencies negatively influence the metric.
• Cyclic Java Packages -Number of Java packages involved in a cycle.
• Component Dependencies to Remove -Number of component dependencies to remove to break up all Java package cycle groups.
• Structural Debt Index -An estimation of the work needed to clean a software project from structural drift and erosion which happened due to unwanted dependencies that violates architectural rules and cyclic dependencies between packages.
• Physical Cohesion -Number of dependencies 'to' and 'from' other components in the same module.
• Physical Coupling -Number of dependencies 'to' and 'from' other components in other modules.
Our results show that the microservice application yields reduced cyclic package dependencies, structural erosion, and structural debt index, and an improved system maintainability level. This will consequently improve the long-term health, quality, and maintainability of the application. To conclude, the microservice identification performed by our approach yields greater cohesion, lower coupling, fewer operations offered by a microservice, and fewer average calls from one microservice to another. Thus, these results constitute a satisfactory PoC.

VI. CONCLUSION
Microservices are one of the most popular concepts in web application development. A microservice is a small, independent, loosely coupled, and highly cohesive service that is based on a bounded context. This new paradigm brings a lightweight, independent, reuse-oriented, and fast service deployment approach that minimizes infrastructural risks. However, microservice identification remains an important hurdle for system architects and designers. This task becomes even more challenging in greenfield developments. Finding microservices before the code exists, as done in our approach, enables system architects to design software of higher design quality. We proposed a microservice identification approach, GreenMicro, that uses business use cases, their inter-dependence, and the associated data dependencies as the primary sources of input. We applied a clustering algorithm to the combined similarity matrix to identify microservices. For evaluation, GreenMicro was applied to four enterprise Java applications to recommend candidate microservices. The initial results are promising, demonstrating better structural modularity, higher cohesion, and lower inter-service calling. In future work, we will perform further evaluations of GreenMicro on real-world, large-scale enterprise applications.