Operationalizing Human Values in Software Engineering: A Survey

Human values (e.g., pleasure, privacy, and social justice) are what a person or a society considers important. Failing to address them in software-intensive systems can result in numerous undesired consequences (e.g., financial losses) for individuals and communities. Various solutions (e.g., methodologies, techniques) have been developed to help "operationalize values in software". The ultimate goal is to ensure that software better reflects and respects human values. In this survey, "operationalizing values" refers to the process of identifying human values and translating them to accessible and concrete concepts so that they can be implemented, validated, verified, and measured in software. This paper provides a deep understanding of the research landscape on operationalizing values in software engineering, covering 51 primary studies. It also presents an analysis and taxonomy of 51 solutions for operationalizing values in software engineering. Our survey reveals that most solutions attempt to help operationalize values in the early phases (requirements and design) of the software development life cycle. However, the later phases (implementation and testing) and other aspects of software development (e.g., "team organization") still need adequate consideration. We outline implications for research and practice and identify open issues and future research directions to advance this area.


I. INTRODUCTION
Software systems (e.g., mobile apps, banking systems, video games) are now an integrated part of our society. Software systems are expected to address, respect, and be aligned with the individual human values of their diverse end-users who might have different characteristics (e.g., aging people, visually challenged people) [1]- [3]. On the macro level, software systems should not harm or jeopardize social justice and human rights (such as privacy) [4]. Human values such as inclusion, diversity, autonomy, and wealth are defined as "what is important for an individual or a society" [5]. Human values are also referred to as the principles that guide human actions and behavior in daily life [6]. Failing to address human values in software-intensive systems may bring problems and irreversible damages for all stakeholders who are directly or indirectly influenced by software-intensive systems [7], [8].
These difficulties and damages are enormous and can range from user dissatisfaction to reputational disaster, financial losses, or loss of life.
Some examples of human values ignored or violated by software systems and their creators have had such devastating and widespread damages that they have been widely covered in the media and led to public condemnation of the software industry [8], [9]. For example, Facebook and Cambridge Analytica were accused of violating privacy and abusing power by harvesting and using almost 87 million Facebook users' personal data without seeking their consent to help influence voters' choices in the US presidential election [10]. Facebook faced large fines (i.e., US$ 5 billion) and lost US$ 119 billion stock value in one day [11]. Amazon's "Prime same-day delivery service", which is designed to provide an egalitarian shopping experience for all US citizens, appears to be unfair and biased against black neighborhoods as they are systematically deprived of receiving this service [12].
VOLUME 0, 2022. arXiv:2108.05624v3 [cs.SE] 26 Jul 2022
Operationalizing human values in software is expected to prevent or minimize such undesired effects and to benefit software organizations, end-users, and practitioners, for example through a better reputation for software organizations, greater end-user trust in the software, and increased acceptability of the software [1], [13]. Inspired by the literature [14]-[16], we define "operationalizing human values in software" as the process of identifying human values and translating them to accessible and concrete concepts so that they can be implemented, validated, verified, and measured in software.
Due to the increasing importance of operationalizing human values in software, a growing body of literature has attempted to provide solutions (e.g., frameworks, tools, roles, design patterns, etc.) to help operationalize human values in software. However, such solutions and the knowledge (e.g., their motivations and limitations) around them are scattered in the literature that appears in diverse venues. Consequently, there is no clear and holistic view on how human values can be operationalized in software. We expect that having a comprehensive understanding of operationalizing values in software helps identify the areas such as tooling support and methodological aspects which need more support and investments. Further to this, such a comprehensive understanding can help software development organizations become more aware of possible solutions and associated tools for operationalizing values and adopt appropriate ones that match their needs and industrial settings.
A few secondary studies have reviewed the literature on human values [17], [18] and human values in software [19]- [21]. Perera et al. [19] investigated to what extent papers in the leading software engineering venues are relevant to values. Friedman et al. [17] studied 14 methods that aim to consider and integrate values in the design process of technologies. Salleh et al. [20] and Khurum et al. [21] carried out systematic mapping studies on Value-based Software Engineering (VBSE). VBSE mainly sees the concept of 'value' from the economic lens [22]. While Salleh et al. [20] focused on characterizing the research around VBSE (e.g., principles, research methods, etc.), Khurum et al. [21] developed the Software Value Map (SVM) to provide a unified and consolidated view of value. Our survey differs from these review studies in terms of objectives, the level of in-depth analysis, and research questions (See Section III). The scope of our survey is identifying, analyzing, and classifying solutions for operationalizing human values applied to software-intensive systems development. None of the previous works focused on this aspect. Our survey does not focus on operationalizing human values in technology or product development (e.g., car design).
To gain a comprehensive understanding of the state-of-the-art solutions for operationalizing values in software, we conducted a survey on 51 primary studies. We found these 51 primary studies by following Webster and Watson's guidelines [23], which suggest identifying a pool of initial papers and then applying the backward snowballing technique. Our main findings are as follows. (1) Our survey identifies 51 solutions to operationalize values, which can contribute to five areas in software engineering: requirements, design, implementation, testing, and team organization. (2) The 51 solutions are further classified into 10 "not mutually exclusive" groups, in which the majority of them (31) attempt to "capture values from different resources (e.g., stakeholders)". (3) The majority of solutions (32, 62.7%) are able to operationalize any values, while the rest (19) target one or two exclusive values. (4) Only a few solutions (14 out of 51, 27.4%) are supported by (semi-)automated tools.
The key contributions of this survey are:
• The first comprehensive survey on the current research on operationalizing values in software;
• A taxonomy of solutions published up to 2020 for operationalizing values in software;
• A list of promising research directions for future work and investments.
In Section II, we provide some definitions of human values and introduce some well-known values models. Section III summarizes existing review studies on human values. Section IV outlines our research method. We report our findings in Sections V, VI, and VII. Section VIII reflects on the key findings and proposes some promising research areas. In Section IX, we discuss possible limitations and threats to the validity of our survey. Finally, we conclude our paper in Section X.

A. HUMAN VALUE DEFINITIONS
The concept of "human values" has long been of interest among the researchers of sociology, psychology, anthropology [5] as well as science and engineering [19], [24]. As defined by Schwartz, "human values are desirable, transsituational goals, varying in importance, that serve as guiding principles in people's lives" [25]. Therefore, human values are something that individuals deem important in life [6]. Values can be fundamental and primary needs (e.g., food) or general needs (e.g., self-esteem) [26]. Many researchers from social science defined values as abstract goals, individual attitudes, behaviors, and beliefs [27]. Schwartz and Bilsky define values as "(a) concepts or beliefs, (b) about desirable end states or behaviors, (c) that transcend specific situations, (d) guide selection or evaluation of behavior and events, and (e) are ordered by relative importance" [28]. Values are also defined as principles that guide social life and are modes of conduct that a person likes or chooses among different situations [6], [27]. Above all, values are "ways to live" [29] which can be defined as a micro-macro concept where the micro level is individual behavior and macro level is cultural practices [27].
Related to values are two important concepts from human psychology, namely human motivation and emotions, that are worth mentioning. Motivation serves as a guiding force for all human behavior and actions. Humans are goal-directed creatures, and their motivation energizes, directs, and sustains their goal-directed activities [30]. The sought-out goals can be as concrete as obtaining food or clothing or abstract like developing a sense of meaning or purpose. On the other hand, emotions are kinds of desires, action tendencies, or feelings that correspond to physiological changes brought on by pleasure/displeasure or behavioral responses, e.g., heart racing when looking at an object perceived as dangerous.
Emotions are often intertwined with personality and motivation; at times they are aligned with motivational goals and rationality, but they often challenge any practical reasons [31]. According to modern theories of motivation, values and emotions underlie human motivation. The inter-relationship of these concepts helps us understand why individuals behave in the manner they do, e.g., approve or disapprove of something and engage or disengage in various activities [32]. While some values do have a moral import, not all values are derived from ethics or moral philosophy, that is, from the societal or religious perceptions of what is morally acceptable or unacceptable behavior [33].

B. VALUES MODELS
There is no universal agreement on either the number of human values or the way human values can be modeled. Nevertheless, several human values models introduced in social sciences are recognized as the most comprehensive values models to date. Table 1 provides an overview of six well-known human values models. In 1973, Rokeach identified 36 universal human values using a survey-based approach. Half of the values introduced in Rokeach's model are 'life goals' such as Inner Harmony and Social Recognition, while the rest are linked to modes of behavior such as Cheerfulness and Politeness [6]. Taking forward the survey-based approach, Schwartz (1992) proposed the theory of basic human values [34]. The theory includes ten main value categories, which are measured with 58 individual value items. Importantly, as illustrated in Figure 1, Schwartz organized values in a circular structure to depict their relationships. Values (e.g., Self-Direction and Stimulation) that appear close to each other are complementary, and those (e.g., Self-Direction vs. Tradition) that are further apart are in conflict. This theory has been developed using data from 82 countries with different socioethnic backgrounds [34].
Hofstede used value measurement analysis mainly on cultural aspects and suggested four cultural dimensions: "Power Distance", "Uncertainty Avoidance", "Individualism versus Collectivism", and "Masculinity versus Femininity" [35]. Moreover, there are noteworthy contributions to human values models proposed by various researchers. For example, Parashar et al. (2004) introduced the micro and macro concept of values which are individual behavior and cultural practices, respectively [27]. Gouveia et al. introduced the three-by-two framework with six basic value categories, and three specific values under each category [26]. Cheng and Fleischmann reviewed 12 different values models from different disciplines to produce a meta-inventory of values [37]. They categorized all human values from these 12 models into 16 main values categories. Their categorization allows us to understand the similar values concepts (or synonyms) discussed using different wordings in different values models. For example, to describe life's accomplishments, Schwartz uses the word Achievement while Rokeach uses A Sense of Accomplishment. The same idea is expressed by Kahle et al. [36] as Self-fulfillment.
The primary studies investigated in our survey use different values models as a reference point to motivate, describe, and evaluate their proposed solutions. Consequently, they may use various terminology to refer to the same human value. We carefully considered such similarities in our study and selected a common terminology.

III. EXISTING REVIEWS ON HUMAN VALUES
There are a few review papers on human values in software engineering. Perera [...]
FIGURE 1. [...] [34] (adopted from [2], [38]). The boxes indicate the ten universal values, and each of them is subdivided into some finer-grained values.
[...] their study, we developed our own taxonomy of solutions to incorporate human values and investigated if the proposed solutions address specific human values, such as fairness or authority.
Khurum et al. [21] carried out a systematic mapping to discover value-based aspects relevant to decision-making in software-intensive product development. Consequently, they introduced "Software Value Map" (SVM) as a consolidated view of the concept of value aggregated from the finance, customer, business process, and innovation/learning perspectives. Apart from building a unified view of value, they made two other notable contributions relevant to our discussion of value here: (1) categorizing value constructs as "value aspects", "value sub-aspects", and "value components", to facilitate the development of shared understanding among decision-makers and (2) mapping the interrelationship amongst various value constructs explicitly and collecting various methods to measure a specific value component. On their part, this was an attempt to facilitate practitioners' understanding of value for one or more perspectives and enhance their ability to make an informed decision about value creation in the (software) product they produce. Although customer perspective and their intrinsic value were explored, almost none of the human values such as benevolence, universalism, self-direction mentioned in the values models introduced in Section II-B were addressed. Furthermore, no solutions were identified to explicitly address or measure them.
Value Sensitive Design (VSD) is a method to incorporate values during the design phase of a product. VSD is defined as "a theoretically grounded approach to the design of technology that accounts for human values in a principled and systematic manner throughout the design process" [17]. Friedman et al. [17] identified and reviewed the existing VSD methods proposed in the literature by applying the following inclusion criteria: (1) the method has been invented or undergone substantial development for the investigation of values in technology, (2) the method is self-contained, and (3) the method covers a broad range of values and application areas. The use of these selection criteria resulted in a collection of 14 VSD methods, such as value source analysis and value scenario. With respect to their study, we argue that the definition of VSD is limited to only the design process, but on the other hand, also broader by covering the development of any technology. For example, one VSD method called value sketch focused on 'understandings, views, and values about a technology' [17], not specific about software. Our study aims to complement Friedman et al.'s study by covering solutions proposed in the literature to operationalize values in all software engineering aspects.
Another concept named 'Behavioral Software Engineering (BSE)' is, to some extent, relevant to human values [18]. Lenberg et al. defined BSE as "the study of cognitive, behavioral, and social aspects of software engineering performed by individuals, groups, or organizations" [18]. Lenberg et al. reviewed 250 papers published between 1997 and 2013 that discussed BSE concepts in the software engineering discipline [18]. They found that the software engineering research did not (sufficiently) investigate the majority of BSE concepts (e.g., life satisfaction, conformity) and only focused on a few BSE concepts (e.g., communication, personality, and group composition). It was also found that the vast majority of the 250 publications on BSE fall in only two software engineering areas: software engineering professional practice and software engineering management. While BSE concepts are related to human values, they include a wider range of human aspects (e.g., cognitive, behavioral, and social) in software engineering. Lenberg et al.'s work also mainly focused on providing a definition for BSE and identifying BSE concepts.
All previous review studies have been vital to showing the importance of values in software by focusing on an important goal (e.g., the relevance of the software engineering literature to values) or targeting a specific aspect of software development (e.g., VBSE or VSD techniques). Our survey attempts to build on this body of knowledge and provides a comprehensive view of operationalizing human values in software engineering by identifying and classifying solutions that can be applied in any aspect of software development. We also provide a taxonomy of the solutions and discuss what values are operationalized.

IV. RESEARCH METHOD
In the following sections, we define our research question and outline the scope of our survey and the execution of the survey.

A. RESEARCH QUESTION
Researchers, organizations, and practitioners have been trying to incorporate human values in technology development (e.g., medical devices) [40]. Our main goal in this survey is to identify and classify solutions that help operationalize human values in software. By solutions, we mean any techniques, approaches, practices, frameworks, tools, etc., that can be used in or support the process of operationalizing human values in software engineering. Hence, we formulate the following research question (RQ):
RQ. What solutions exist to enable or support operationalizing human values in software engineering?

B. SURVEY SCOPE
We apply the following inclusion/exclusion criteria to collect and select primary studies for our survey.
• The paper should propose a solution that can be applied to one or more software engineering phases/aspects. For example, the papers that offer solutions to address values in non-software systems (e.g., car design) are excluded.
• The papers that test a set of hypotheses on human values (e.g., investigating the relationship between human values and e-learning adoption [41]) without proposing any solutions to integrate values are excluded.
• If we find papers that present the same solution but appear in diverse venues (e.g., a conference paper and its extension in a journal), we include the most mature paper.
• As described in Section II, security and privacy are two human values that appear in many human values models (e.g., the Schwartz theory of basic values). We exclude the papers that provide technical solutions to improve security or privacy in the software for two reasons. First, such technical solutions are out of the scope of our study. Second, security and privacy have been extensively studied in the past decade in the software and systems engineering communities, and a substantial number of reviews exist on these two concepts.
• Fairness can be considered a human value. There are many papers in the AI and machine learning community that attempt to detect and address fairness issues in machine learning algorithms and models [42], [43]. We exclude such papers if they do not investigate fairness from the software engineering perspective, as this is not within our survey's scope.
• As we described in Section II-A, human values and the concepts such as motivation, ethics, and emotion are intertwined. Therefore, if a paper proposes a solution to operationalize these concepts in software engineering with human values-related examples and discussions, we will include such papers.

C. PAPER COLLECTION
There are two different techniques to identify the primary sources for literature review studies [44]. In the first technique, which is common in the software engineering community, search strings are developed and then executed on different digital libraries (e.g., ACM Digital Library) [45]. The second one is more common in the information systems community and starts with identifying a pool of initial papers, followed by the backward snowballing technique [23]. Jalali and Wohlin [44] applied both techniques on Agile practices in Global Software Engineering (GSE) and realized that although these techniques led to the identification of different sets of studies, no significant differences were observed in the findings.
Human values have been researched in many domains across different research areas. In Section II, we discussed that there is no consensus on what human values are, and there are many values models that cover a different number of human values with various terminologies. Further to this, there is no established theory on human values within the software engineering community [19]. Due to these limitations, it was not possible for us to build a search string that covers all human values and execute it on different digital libraries. Hence, we decided to follow the approach proposed by Webster and Watson in the information systems community, which includes the following two steps [23]. Figure 2 shows our paper collection process.

1) Initial Set of Papers
This step included four phases. In the first phase, we searched Google Scholar with two general terms: "human values" and "software". We looked at only the titles of the first 1000 out of 59,751 papers returned by Google Scholar and excluded 685 papers because they did not satisfy our inclusion and exclusion criteria. Next, the first author read the abstracts of the remaining 315 papers and chose 131 papers that had the potential to be included in this survey. As shown in Figure 2, the inclusion and exclusion criteria outlined in Section IV-B were applied to the abstracts of the 315 papers to select these 131. We maintained an Excel spreadsheet file to record which papers were included or excluded in these first two phases and the rationale behind our decisions. We shared the file with four of the authors to receive their feedback. Next, the 131 papers were distributed among five of the authors. They were asked to read the full text of the papers and determine which of their assigned papers proposed solutions for operationalizing values in software. They had to record the reason for including or excluding a paper in the Excel spreadsheet file for further discussions. Finally, 46 papers met all the inclusion and exclusion criteria and were included in our survey.
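The screening numbers reported above can be cross-checked with a few lines of Python. This is a minimal sketch rather than part of the original study: the stage labels and the `retention` helper are illustrative names, while the counts are those stated in the text.

```python
# Stage counts as reported in the text; the labels are illustrative.
funnel = [
    ("titles screened", 1000),
    ("abstracts read", 315),
    ("full texts read", 131),
    ("included after full-text review", 46),
]


def retention(stages):
    """Return (label, kept, dropped, share_kept) for every stage after the first."""
    rows = []
    for (_, before), (label, after) in zip(stages, stages[1:]):
        rows.append((label, after, before - after, after / before))
    return rows


rows = retention(funnel)
for label, kept, dropped, share in rows:
    print(f"{label}: kept {kept}, dropped {dropped} ({share:.1%} retained)")
```

The first row reproduces the 685 papers excluded at the title stage (1000 - 685 = 315), which confirms that the reported counts are mutually consistent.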

2) Backward Snowballing
We used the backward snowballing technique [46] to minimize the risk of missing pertinent studies. Following the guidelines proposed by Wohlin [46], the first author conducted the backward snowballing in several iterations until no new papers were found. The first author checked the references of all 46 papers and applied the inclusion and exclusion criteria discussed in Section IV-B. Similar to the "Initial Set of Papers" step, the rationale behind excluding a paper was recorded in the Excel spreadsheet file and shared with four of the authors to seek their comments. This step added 5 papers to the pool of primary studies. Table 3 shows all 51 primary studies that are studied in this survey. It is worth noting that we did not conduct forward snowballing, because the significant time required to conduct a systematic literature study often forces the majority of researchers to limit their search procedures (See Section IX) [44].
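The iterative procedure described above ("several iterations until no new papers were found") can be sketched as a fixed-point loop. The code below is a hypothetical illustration and not the authors' tooling: the function name, the `references` mapping, and the `meets_criteria` predicate are assumptions, and the paper identifiers are toy data.

```python
from typing import Callable, Dict, Iterable, List, Set


def backward_snowball(
    seeds: Iterable[str],
    references: Dict[str, List[str]],
    meets_criteria: Callable[[str], bool],
) -> Set[str]:
    """Follow reference lists of included papers until an iteration adds nothing new."""
    included = set(seeds)
    frontier = set(seeds)  # papers whose reference lists still need checking
    while frontier:
        new_papers = set()
        for paper in frontier:
            for cited in references.get(paper, []):
                # Only papers that pass the inclusion/exclusion criteria are added.
                if cited not in included and meets_criteria(cited):
                    new_papers.add(cited)
        included |= new_papers
        frontier = new_papers  # the next iteration checks only the newly added papers
    return included


# Toy citation graph (hypothetical): A cites B and X, B cites C, C cites A, X cites Y.
toy_references = {"A": ["B", "X"], "B": ["C"], "C": ["A"], "X": ["Y"]}
result = backward_snowball(["A"], toy_references, lambda p: p != "X")
print(sorted(result))
```

In this sketch, the references of an excluded paper (here `X`) are never followed, so `Y` is not reached; this mirrors checking only the reference lists of papers that are already in the pool.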

D. DATA EXTRACTION
The first author created another Excel spreadsheet file and shared it with five of the authors to extract the detailed contents from the 51 primary studies. The 51 primary studies were distributed among the first six authors to extract the relevant information: the first author (27 primary studies), the second one (9 primary studies), the third one (5 primary studies), the fourth one (5 primary studies), the fifth one (4 primary studies), and the sixth one (1 primary study). This allocation was based on the time availability of the authors. We collected the following data items from each primary study. Table 2 shows these data items.
• D1-D9: These data items were collected to provide comprehensive demographic information on the primary studies. We obtained the citations (D3) of each primary study from Google Scholar on 24 March 2021.
• D10: We extracted how a solution proposed in a given primary study for operationalizing values is evaluated (D10).
• D11: Our study recorded if a solution operationalizes a specific human value or is designed to operationalize any human values.
• D12-D13: Some researchers prefer to give a name to their proposed solutions. In such a case, we recorded the name of the solution (D12). Some of the solutions are supported by tools. Hence, we recorded if a solution is supported by a tool and collected the tool's name, provided that it was mentioned (D13).
• D14-D17: We wrote a critical summary of how a solution supports or enables operationalizing values (D14). We also recorded the problems (D15) solved by the solution and the benefits (D16) of the solution. Finally, we collected the possible limitations (D17) of the solution.
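The data items listed above can be pictured as one record per primary study. The following is a minimal sketch under stated assumptions: the class name, field names, and the choice of which items count as mandatory are hypothetical, the individual items D1-D9 are kept in a generic dictionary because only D3 (citations) is named in the text, and the example record uses a placeholder study with placeholder values.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ExtractionRecord:
    """One row of a hypothetical extraction sheet for a primary study."""
    study_id: str
    demographics: Dict[str, str] = field(default_factory=dict)  # D1-D9 (D3 = citations)
    evaluation: str = ""                  # D10: how the solution is evaluated
    targeted_values: str = ""             # D11: a specific value, or "any"
    solution_name: Optional[str] = None   # D12: only if the authors named it
    tool_name: Optional[str] = None       # D13: only if a tool is mentioned
    summary: str = ""                     # D14: critical summary
    problems: str = ""                    # D15: problems solved
    benefits: str = ""                    # D16: benefits
    limitations: str = ""                 # D17: possible limitations

    def missing_items(self) -> List[str]:
        """List the free-text items (assumed mandatory here) that are still empty."""
        required = {
            "D10": self.evaluation, "D11": self.targeted_values,
            "D14": self.summary, "D15": self.problems,
            "D16": self.benefits, "D17": self.limitations,
        }
        return [item for item, value in required.items() if not value.strip()]


# Workload per author as reported in the text; it should cover all 51 studies.
allocation = {"first": 27, "second": 9, "third": 5, "fourth": 5, "fifth": 4, "sixth": 1}
total_allocated = sum(allocation.values())

# Placeholder record, not data from any real primary study.
record = ExtractionRecord(study_id="PX", evaluation="placeholder",
                          targeted_values="any", summary="placeholder")
print(total_allocated, record.missing_items())
```

Keeping D12 and D13 optional reflects the text: a solution name or tool name is recorded only when the primary study provides one.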

E. DATA ANALYSIS
Data items D1 to D9 and D11 to D13 were analyzed using descriptive statistics. We used the taxonomy proposed by Wieringa et al. [47] to classify the research type used to evaluate a solution (e.g., a technique, tool, etc.). Wieringa et al. suggest that the evaluation of a solution (in the requirements engineering community) can be done through the following six research types: "validation research", "evaluation research", "solution proposal", "philosophical paper", "opinion paper", and "experience report" [47]. Note that Wieringa et al.'s taxonomy has been widely used in many review papers in the software engineering community to classify research types. For example, Engström and Runeson [48] used it to classify papers on software product line testing, and Jalali and Wohlin [44] used it in a review on global software engineering. The data collected from data items D14 to D17 were analyzed using the open coding procedure [49] to build a taxonomy of solutions for operationalizing values in software. The first author performed open coding iteratively in parallel with data extraction and labeled the data. First, the first author read all the extracted data to become familiar with it. He contacted the authors involved in data extraction whenever any ambiguity or missing information (e.g., a description of a solution was not understandable) was found in the data. Next, he coded each study and shared the codes with the author responsible for reading and extracting the data from the given study to seek their feedback on the identified codes. The first author updated the codes based on feedback and comments from the corresponding authors. In the next step, codes identified in one study were compared with those that emerged from other studies. The next step iteratively classified these emergent labels to build the taxonomy. In the last step, the taxonomy was shared with other authors to seek their feedback.
Any disagreements between the authors were resolved in several face-to-face and Zoom meetings. The final version of the taxonomy was agreed upon by all the authors.

Table 3 shows the 51 primary studies selected for analysis in this survey. We summarize the demographics of these primary studies in the following sections.
Figure 3 shows that conferences are the dominant venues to publish research on operationalizing values in software. 25 primary studies (49%) were published in conferences, 17 (33%) primary studies in journals, and the rest were workshop papers (7 primary studies, 14%) and book chapters (2 primary studies, 4%). Table 4 reveals that the 51 primary studies come from 42 distinct venues, in which "Requirements Engineering Journal" with 5 primary studies is the most popular one, followed by "ACM Conference on Human Factors in Computing Systems" (4 primary studies). There are two venues with two primary studies each and 38 venues with only one paper each. This indicates that research on human values does not have exclusive venues and attracts a wide range of researchers from computer science, software engineering, and human-computer interaction.
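The venue figures can be verified with simple arithmetic. The sketch below is illustrative only; the variable names are assumptions, and the counts are those reported in the text.

```python
# Primary studies per venue type, as reported in the text.
venue_types = {"conference": 25, "journal": 17, "workshop": 7, "book chapter": 2}
total_studies = sum(venue_types.values())
shares = {name: round(100 * count / total_studies) for name, count in venue_types.items()}

# (studies per venue, number of venues with that many studies), as reported in the text.
venue_histogram = [(5, 1), (4, 1), (2, 2), (1, 38)]
distinct_venues = sum(n_venues for _, n_venues in venue_histogram)
studies_from_histogram = sum(per_venue * n_venues for per_venue, n_venues in venue_histogram)

print(total_studies, shares)
print(distinct_venues, studies_from_histogram)
```

Both views agree: the four venue types add up to 51 studies, and 5 + 4 + 2 x 2 + 38 = 51 studies spread over 1 + 1 + 2 + 38 = 42 venues.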

B. AFFILIATIONS AND COUNTRIES
The authors of the 51 primary studies come from 22 countries. We found that researchers from the USA (13 primary studies, 863 citations), UK (9 primary studies, 514 citations), and Australia (7 primary studies, 193 citations) have contributed more to this research area than others (See Table 5). In total, 78 institutes published in this area. The Delft University of Technology had 4 primary studies, and researchers from Washington University and Lancaster University published three primary studies each (See Table 5). The vast majority of them (71 institutes) had only one primary study. Note that we did not find any authors with more than two papers on operationalizing human values in software.

Figure 4 shows the number of primary studies published from 2005 to 2020. On average, 3 primary studies were published per year over this period, rising to 3.9 per year in the last ten years; 2019 peaked (8 primary studies), followed by 2015 (6 primary studies). This indicates that interest in research on human values in software engineering has remained more or less constant since 2011.

D. CITATIONS
Citations and venues of a paper can partially show the quality of the research paper [98]. We obtained the citation counts of the 51 primary studies from Google Scholar on 24 March 2021. Table 3 indicates that the number of citations ranges from 0 to 1368, with a high average of 68.6. Figure 5(a) shows that 50% of the primary studies have more than 20 citations. We also used the Google Scholar H5-index of the primary studies' venues as an indicator to judge the quality of the primary studies. A higher H5-index implies that more quality papers appear in the venue. As shown in Figure 5(b), there are 12 primary studies published in venues whose H5-index was not available (e.g., new conferences, workshops); these were labeled as "None". The average H5-index for the remaining venues is 35. Figure 5(b) shows that most of the primary studies (29 out of 51, 56.8%) appeared in venues with an H5-index of more than 20. These results can (partially) show that research on human values is being published in quality venues.
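The gap between the average (68.6) and the midpoint (about 20 citations) suggests a skewed distribution. As a rough, hedged illustration, the sketch below uses only the three figures reported in the text to estimate how much the single most-cited study contributes; because the reported average is rounded to one decimal, the results are approximate, and the variable names are assumptions.

```python
n_studies = 51          # number of primary studies
mean_citations = 68.6   # reported average (rounded in the source)
top_citations = 1368    # reported maximum

total_citations = n_studies * mean_citations               # about 3499
share_of_top = top_citations / total_citations             # share held by the top study
mean_without_top = (total_citations - top_citations) / (n_studies - 1)

print(f"total ~ {total_citations:.0f}")
print(f"top study share ~ {share_of_top:.1%}")
print(f"mean without top study ~ {mean_without_top:.1f}")
```

Under these assumptions, the most-cited study accounts for roughly 39% of all citations, and the average of the remaining 50 studies drops to roughly 42.6, which is why the median is a useful complement to the mean here.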

E. RESEARCH TYPES
As we described in Section IV-E, we used the taxonomy provided by Wieringa et al. [47] to evaluate the solutions proposed in the primary studies. As illustrated in Table 6, most of the primary studies (29 out of 51 primary studies, 56.8%) are classified as "solution proposals". This group of primary studies introduces a solution and discusses its effectiveness, but usually without a solid and sufficient validation. Such primary studies examined the usefulness and actionability of their proposed solution by a small example application, sound argument, case studies, or experiments. We found 13 primary studies (P1, P2, P3, P5, P9, P14, P36, P37, P38, P39, P42, P46, P50) studying attributes of a solution that was not implemented in the industry (i.e., "validation research"). We classified 6 primary studies (P13, P16, P17, P22, P35, P43) as "evaluation research". These primary studies investigated the (positive and negative) consequences of a solution that was implemented in practice. Only one primary study is assigned to each of the "philosophical paper" (P15), "opinion paper" (P34), and "experience report" (P30) categories. P15 is classified as a philosophical paper because it proposes new notations and concepts for Tropos to support values in the software development process. In P34, the authors report their personal opinions about how the responsibilities of Scrum Master, Product Owner, and Development Team should be adapted to ensure accessibility in Agile projects. Based on an experience report, the authors in P30 suggest how to adjust Scrum (e.g., adding new tasks to Scrum) to meet accessibility requirements in a software project.
Given the small proportion of papers with industrial evidence, more validation and evaluation research is needed. Such research improves the practical applicability of solutions proposed to operationalize human values and encourages software development organizations to adopt those solutions in practice.

VI. SOLUTIONS FOR OPERATIONALIZING HUMAN VALUES IN SOFTWARE
We identified 51 solutions from the 51 primary studies (each of the primary studies proposes a solution to operationalize human values in software). As we described in Section IV-E, we applied the open coding procedure (on data items D14 to D17) to analyze these 51 solutions, leading to a taxonomy. Figure 6 shows the taxonomy of the 51 solutions. The identified solutions can be applied at the highest level in the following five areas: requirements, design, implementation, testing, and team organization. In the next level, we further classified the solutions into 10 categories (O1 to O10 in Figure 6). The 10 categories are not mutually exclusive, as there might be solutions that fall in more than one category. Table 14 in the Appendix provides a comprehensive overview of all these solutions. For brevity, a small subset of the solutions in each category as examples is elaborated in the following sections.

A. REQUIREMENTS
1) Capture Values from Different Resources
According to the definition provided in the Introduction section, the first step of operationalizing values in software is to identify values. However, understanding and eliciting human values is a non-trivial task. As highlighted in Table 7, this difficulty stems from three main factors:
• Values' characteristics. Values are tacit knowledge and subjective concepts, and cannot be (easily) understood outside of their social and cultural context.

In Table 8, we further classify these primary studies based on the sources that they investigate to elicit values. As shown in Table 8, 14 primary studies attempt to identify values from any relevant stakeholders. Stakeholders are defined as anyone who is directly or indirectly influenced by a software-intensive system [99]. Hence, stakeholders can be any combination of end-users, development teams, customers, and organizations. Six primary studies only use end-users, and one primary study only uses development team members for this purpose. Other sources for values elicitation are software development artifacts (e.g., requirements documents), existing systems, literature, and prototypes. Table 9 indicates that interviews and surveys are the main instruments to elicit values, followed by co-design workshops. Three primary studies (P1, P10, P11) use observations to extract values from stakeholders. Below are some examples that illustrate how values can be elicited from different resources using the instruments presented in Table 9.
Thew et al. (P1) propose Value-based Requirements Engineering (VBRE) to make values explicit in the requirements engineering process. The VBRE process provides a set of step-by-step guidelines that advise and assist the analyst in obtaining values from the requirements engineering artifacts, documents, and interviews. In this process, the analyst is provided with a paper-based taxonomy of values, which is supported by a website. The proposed taxonomy may not be easily understandable by the analyst. Hence, the supporting website provides detailed information about each item (e.g., value) in the taxonomy. This includes a list of representative interview questions and scenarios to help the analyst, particularly the novice analyst, identify and capture values.

[Figure 6, Requirements branch: Capture Values from Different Resources (24 solutions): P1, P3, P4, P5, P6, P10, P11, P12, P13, P15, P16, P18, P20, P22, P23, P36, P37, P38, P42, P44, P45, ...]

The Emotion-oriented Requirements Engineering technique (P36) uses a set of notations to model emotional goals ("how it feels"), functional goals ("what it does"), and quality (nonfunctional) goals ("how it does"). The technique first extracts "emotional threats" (users' pain points) from the relevant stakeholders using an emotion-oriented interview and survey. Then each extracted threat is translated into emotional, functional, and quality goals. For example, the "insecure" threat can be converted into "safe" as an emotional goal, "responsiveness" as a quality goal, and "anomaly detection" as a functional goal. Then, all functional goals should be further decomposed into sub-functional goals. Any emotional goal must be linked to one or more functional goals.

[...] to connect developers with security incidents and prompt them to speak about the consequences of such incidents. By adopting a positive, value-oriented approach toward security, the workshop first asks its attendees (developers) to read the report of a compromise (i.e., an overview of a security incident). A group of cards (e.g., value cards) is distributed among the attendees to prompt discussions about various security incidents and their impacts on stakeholders. In the next step, the attendees are encouraged to talk about the security incident from the viewpoint of a developer directly affected by the reported compromise. All this can reveal the possible impacts of the compromise on the values of the affected developer.
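The goal structure described above for the Emotion-oriented Requirements Engineering technique (P36) can be sketched as a small data model. This is an illustrative sketch only: the class and function names are hypothetical and are not part of P36. It encodes just one rule from the text, namely that every emotional goal must be linked to at least one functional goal.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    name: str
    kind: str  # "emotional", "functional", or "quality"
    # Names of functional goals this goal is linked to (hypothetical field).
    linked_functional: list = field(default_factory=list)


def unlinked_emotional_goals(goals):
    """Return names of emotional goals not linked to any known functional goal."""
    functional = {g.name for g in goals if g.kind == "functional"}
    return [
        g.name
        for g in goals
        if g.kind == "emotional"
        and not any(link in functional for link in g.linked_functional)
    ]


# The "insecure" threat from the text, translated into three goals.
goals = [
    Goal("safe", "emotional", linked_functional=["anomaly detection"]),
    Goal("responsiveness", "quality"),
    Goal("anomaly detection", "functional"),
]
```

Calling `unlinked_emotional_goals(goals)` on this example returns an empty list, because the only emotional goal ("safe") is linked to the functional goal "anomaly detection".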

2) Evaluate and Validate Values-critical Requirements
As discussed in Section VI-A1, many solutions have been developed to collect values from different resources. However, it is equally essential to ensure the gathered values-critical requirements (i.e., requirements that include human values) are complete and match stakeholders' needs. We found 12 primary studies (P5, P14, P16, P18, P22, P24, P26, P30, P31, P40, P42, P44) that propose solutions for this purpose.
Lee et al. (P26) argue that there is a lack of approaches to validate the elicited requirements, particularly validating the elicited requirements against customers' inner needs (i.e., values, beliefs, and motivations that someone has but are not easily visible) and behavioral data. The Customer Requirements Validation (CuRV) technique (P26), which leverages the mental model technique, can be considered a new step in the requirements engineering process after Elicitation, Analysis, and Specification. The CuRV technique first collects user behavior data, for example, from the earlier version of the software. It then creates a mental-requirements model to identify which of the elicited requirements correspond with users' mental states. Based on the mental-requirements model, it can (a) suggest that requirements may need to be re-elicited, as some customers' behaviors or inner needs are not captured in the elicited requirements, and (b) provide insights for the requirements reviewers that the elicited requirements may need to be re-analyzed, as some do not match customers' behaviors or inner needs.
The goal of Appraisal and Measurement of User Satisfaction (AMUSE) (P42) is to help the development team to select the best features that improve user satisfaction for the next releases. To this end, AMUSE first needs to measure the end-users' satisfaction with the current version of the product using a questionnaire. The responses collected from the questionnaire can reveal the extent to which the current users are satisfied with the product from five perspectives: effectiveness, productivity, hedonism, trust, and overall satisfaction.
The Ethics-Aware Software Engineering framework (P44) aims to attain ethical harmony in software artifacts and the development process and includes five phases: articulation, specification, implementation, and verification and validation. In the verification and validation phase, the software is continuously monitored to ensure that it is aligned with its ethical values specifications (e.g., diversity, transparency) and to detect and reveal any deviations.
In another study (P14), Colomo-Palacios et al. propose using the affect grid to measure stakeholders' emotions, in terms of pleasure and arousal, about the collected requirements.

3) Detect and/or Resolve Values-based Conflicts/Violations
As values are subjective concepts, conflicts among values are inevitable. Moreover, values can be easily violated, but it would be hard to monitor values violations. We found 8 studies (P1, P5, P7, P9, P13, P15, P22, P41) that develop solutions for detecting and/or resolving conflicts among and/or violations in values. Table 10 gives an overview of these solutions.
Value-based Requirements Engineering (VBRE) (P1) can reveal the conflicts between values using an iterative refinement process of acquiring and learning stakeholders' values, emotions, and motivation. However, VBRE offers no recommendation or solution for resolving the conflicts. The "soft-goal" concept in the Tropos methodology can, to some extent, be used to describe values (P15). However, many important characteristics of values are still missed. For example, it may not be possible to address potential conflicts between values if they are represented with the "soft-goal" concept. Detweiler et al. (P15) argue that a distinct notation is needed to represent values properly, as values are different from goals. Having a notation for values makes links between values explicit, which helps detect potential conflicts between values.
Three studies (P7, P9, P41) attempt to detect possible values violations. UMLtrust (P41) can model and monitor the relationship between participating parties and detect any violation of trust occurring between them at the very beginning of a software development process using a set of specialized UML rules and notions (e.g., trust-use-case diagram). Anthonysamy et al. (P9) focus on detecting privacy violations in online social networks. They propose a technique to measure the level of traceability between "privacy policies" and "privacy controls" in such networks and classify this relation as complete, partial, or broken.
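The three-way classification of traceability reported for P9 can be illustrated with a minimal sketch. The text does not describe how P9 measures traceability, so the rule below is an assumption made purely for illustration: a policy statement is associated with a set of expected privacy controls, and the relation is classified by how many of them are actually available. The function name and the control names are hypothetical.

```python
def classify_traceability(expected_controls, available_controls):
    """Classify the policy-to-control relation as complete, partial, or broken.

    Assumed rule (not taken from P9): compare the controls a policy
    statement refers to with the controls the system actually offers.
    """
    expected = set(expected_controls)
    if not expected:
        raise ValueError("a policy statement needs at least one expected control")
    matched = expected & set(available_controls)
    if matched == expected:
        return "complete"
    if matched:
        return "partial"
    return "broken"


# Hypothetical example: the policy promises two controls, the site offers one.
relation = classify_traceability(
    {"hide_profile", "delete_account"}, {"hide_profile", "block_user"}
)
```

In this example `relation` is `"partial"`, since only one of the two promised controls can be traced to an available control.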
Only two primary studies (P9, P22) go further and provide mechanisms or clues to resolve such conflicts and violations.

Detect
• Stakeholders participating in co-design workshops are asked to provide reflections on the design and specification of features/functions of the system under development to help reveal possible values violations (P7).
• The proposed solution leverages common reference points between policy statements and the corresponding privacy controls to recognize privacy violations (deficiencies) (P9).
• The proposed solution utilizes trust scenarios to generate trust rules in order to detect trust violations (P41).

Resolve
• It is suggested to clarify explicitly (the weight of) values (i.e., determining which values outweigh others) (P22).
• A set of guidelines (corrective measures) is proposed to address privacy violations between privacy policies and privacy controls (P9).

UML Diagram. Uddin and Zulkernine (P41) extend UML diagrams to document trust scenarios in the software development process. For example, the use case diagram is extended into a trust-use-case diagram, which includes two types of users (i.e., a <<trustor>> actor and a <<trustee>> actor) and four forms of trust relationships (e.g., <<trust -service>>).

User Story. Two primary studies (P30, P37) suggest that user stories can be modified to elicit values. Romero-Chacón et al. (P30) argue that the tasks related to web accessibility in user stories should be explicitly labeled to ensure the inclusion of accessibility. However, an extra phase (task) in the Planning stage called "identification of accessibility tasks" needs to be added to Scrum to identify accessibility tasks. Harbers [...] value are illustrated. Then, any concrete situation is mapped to a stakeholder requirement. The knowledge gained from the previous steps makes it possible to write user stories as: "As a <stakeholder>, I want <stakeholder need> in order to support <value>".

Other Artifacts. The study (P18) argues that the Persona template can be modified to document values. [...] Table", "Culturally Aware Requirements Framework", and eValue. Both "Value Identification Frame" and "Value Comparison Table" artifacts utilize a tabular visualization to document and manage values at the early stages of software-intensive systems development.
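The values-oriented user story template quoted above ("As a <stakeholder>, I want <stakeholder need> in order to support <value>") can be filled in mechanically. The helper below is a minimal sketch; the function name and the example stakeholder, need, and value are hypothetical and not taken from P37.

```python
def value_user_story(stakeholder, need, value):
    """Fill the user story template quoted in the text."""
    for label, text in (("stakeholder", stakeholder), ("need", need), ("value", value)):
        if not text.strip():
            raise ValueError(f"{label} must not be empty")
    return f"As a {stakeholder}, I want {need} in order to support {value}"


# Hypothetical example values, chosen only to show the resulting sentence.
story = value_user_story("patient", "to control who sees my records", "privacy")
```

Rejecting empty fields keeps the value explicit in every story, which is the point of extending the template with a value clause.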

B. DESIGN
In total, we found 15 primary studies that develop solutions to operationalize values during the design phase. Table 12 provides an overview of these solutions. The 'Solution' column shows the name of solutions (if any), followed by the 'Phases/Activities' column indicating the phases/activities of solutions. We also show who can be involved in the design process when a given solution is applied (See the 'Involved' column). Some solutions may use or be supported by other materials/techniques to accomplish their goal. We show such materials/techniques in the 'Supporting Materials/Techniques' column. Finally, the possible artifacts produced or modified by a given solution are shown in the 'Produced Artifacts' column. These 15 primary studies can be generally classified into the following categories.

1) Integrate Values into the Design Process
We found six primary studies (P3, P5, P13, P23, P38, P41) that develop a phased value-centered design process that complements existing development or design processes to make values an integrated part of the design process.

[...] (1) using values as the key drivers for decisions; (2) working closely with all relevant stakeholders; (3) leveraging design thinking methods for visioning and problem solving; (4) using Agile methods to develop software; (5) using "action research" methodology to seek reflection from stakeholders; and (6) embracing uncertainties and risks by rapidly building and evaluating prototypes.
Uddin et al. (P41) develop a trust-aware software development framework to manage trust concerns throughout the software development process. The framework includes four stages: trust scenarios identification, trust scenarios modeling, trust rule implementation, and trust rules deployment. Four UML diagrams, including use case, class, state machine, and package diagrams, are extended, and some trust rules are developed to model, implement, and monitor the identified trust scenarios.
Pereira and Baranauskas (P5) argue that both values and culture need to be considered when designing interactive systems, because values and culture cannot be separated.

Cockton (P23) introduces the Value-Centred Design Framework to develop value-centered systems. The framework consists of four processes: opportunity identification, design, evaluation, and iteration. In each process, the development team (e.g., developers, designers) needs to perform a set of activities, which result in constructing artifacts. For example, in the design process, the designer performs the "value delivery scenario authoring" activity to transform the "values statements" created in the opportunity identification process to the "value delivery scenarios" artifact. The "value delivery scenarios" present how the proposed design satisfies the values captured in "values statements". In the evaluation process, the designer appraises the extent to which the values delivered in the design may expose difficulties for the user, leading to the "value impact assessment" artifact. The activities conducted in and artifacts produced in the iteration process suggest how the proposed design should be modified to remove undesirable user difficulties.
Values-led Participatory Design (P13) as a three-phase design process aims to bring end-users', stakeholders', and designers' values into the design process. In each phase, the process provides several scaffolds. The first phase leverages different techniques such as workshops and presentations to foster the emergence of values with participants. In the second phase, a dialogical process is employed to encourage participants to think and discuss how the emergent values can be implemented in a new way in the design of the new software products (i.e., re-conceptualizing values). The last phase (grounding of values) ensures that the reconceptualized values are embedded in the final design.

2) Increase the Awareness of the Designer about Values
The decisions made by the designer to shape a software system may have implications on the values of individuals, organizations, and societies [55]. However, it is a challenge for the designer to recognize and think about values when designing software because values are high-level abstract concepts. We found 10 primary studies (P6, P7, P8, P12, P15, P18, P19, P39, P41, P45) that develop design materials, scaffolds, or design principles to increase awareness about values and equip designers when dealing with values during the design process.
Both Value-Centred Design Framework (P23) and Values-[...]

The Value Sensitive Action-Reflection Model (P12) leverages the ideas of co-design spaces and reflection-on-action to bring values to the technology-centric co-design process. First, the model prompts the participants of a co-design process to generate initial ideas (designs) for a design problem. Second, it uses two types of prompts to encourage the participants to elaborate on, evaluate, and reflect on their initial designs and apply the required changes (if any) to their initial designs. The first prompt, the stakeholder prompt, employs values scenarios to draw the participants' attention to the special socio-technical context in which yet-to-be-built tools are used (generating new features that are aligned with human values). The second one, the designer prompt, uses Envisioning Cards to draw attention to the more general social and contextual concerns that are readily neglected. The prompts (1) increase the number of design ideas/solutions and (2) lead to divergent thinking. The new ideas created or influenced by the two types of prompts are expected to consider the values of the yet-to-be-built tools' users.
Effectively addressing human values is a good indicator of the acceptance of socio-technical systems. This type of system is necessary to achieve societal sustainability. The current software engineering methodologies do not have any mechanisms to integrate and consider values in such systems. Barn et al. (P6) propose the co-design workshop technique to extract value-sensitive concerns and requirements and then incorporate the extracted value-sensitive concerns into the design. The co-design workshop technique emphasizes that all key actors, such as designers and (indirect and direct) stakeholders, should engage in building or evolving design features of the system under development. The design of a feature may lead to discussions that reveal value concerns (e.g., breaching privacy) or address value concerns. The co-design workshop technique captures such discussions using the concept of a value-based prompt and utilizes the contextual design method to solve a value-based prompt.

C. IMPLEMENTATION
1) Generate Values-conscious Codes
Our analysis shows that five studies (P27, P29, P32, P49, P51) develop approaches that enable developers to create code or software components aligned with some human values (such code is referred to as "values-conscious code").
Two studies (P27, P32) propose a domain-specific language, followed by a runtime-checking technique, with which developers can define and verify some values-related rules and specifications when they are coding. The goal is to guarantee that the code or user interfaces developed by developers do not violate or neglect human values. Albarghouthi et al. (P32) develop a specification language to define customized fairness specifications in the code for sensitive decision-making procedures (functions). A runtime-checking technique, similar to assertions in traditional testing, is applied to check if the decisions made by a fairness-sensitive procedure violate the fairness specifications defined in the code. The Rule-Based Generation of Mobile User Interface (RUMO) framework (P27) includes a domain-specific language that enables software engineers to define and create a set of rules and constraints to generate similar user interfaces for different platforms (i.e., usability). A rule engine checks if the user interfaces produced for different platforms meet the defined rules and constraints (e.g., usability-related rules and constraints).
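The idea of an assertion-like runtime check on a fairness-sensitive procedure can be sketched with a Python decorator. This does not reproduce the specification language of P32. It is a minimal illustration under stated assumptions: the decorator name `fairness_spec`, the `min_ratio` threshold of 0.8, and the `min_samples` warm-up are all hypothetical choices, and the check simply compares per-group acceptance rates observed so far.

```python
from collections import defaultdict
from functools import wraps


class FairnessViolation(AssertionError):
    """Raised when the monitored acceptance rates drift too far apart."""


def fairness_spec(group_of, min_ratio=0.8, min_samples=5):
    """Monitor a boolean decision procedure at run time.

    group_of: function mapping the procedure's first argument to a group label.
    min_ratio: lowest tolerated ratio between the smallest and largest
               per-group acceptance rates (an assumed, illustrative threshold).
    min_samples: decisions required per group before the check is enforced.
    """
    def decorator(decide):
        counts = defaultdict(lambda: [0, 0])  # group -> [accepted, total]

        @wraps(decide)
        def wrapper(applicant, *args, **kwargs):
            decision = bool(decide(applicant, *args, **kwargs))
            stats = counts[group_of(applicant)]
            stats[0] += decision
            stats[1] += 1
            # Only groups with enough observations take part in the check.
            ready = [s for s in counts.values() if s[1] >= min_samples]
            if len(ready) >= 2:
                rates = [accepted / total for accepted, total in ready]
                if max(rates) > 0 and min(rates) / max(rates) < min_ratio:
                    raise FairnessViolation(f"acceptance rates diverge: {rates}")
            return decision

        return wrapper
    return decorator


# Hypothetical decision procedure annotated with the specification.
@fairness_spec(group_of=lambda a: a["group"])
def approve(applicant):
    return applicant["score"] >= 50
```

Like an assertion, the check fails loudly at the call that pushes the observed rates past the threshold, so the violation surfaces during execution rather than in a separate audit.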
Compared to Albarghouthi et al. (P32), who introduce the fairness-aware programming technique to consider fairness as a first-class concern in the code, Mougouei (P51) does not focus on any specific types of values and develops the AIR framework to support the concept of 'value programming'. The AIR framework consists of four components: "value annotation of APIs", "value annotation of code", "value inspection", and "value recommendation". The first two components identify and annotate which APIs and parts of the code are relevant to human values. The "value inspection" component aims to detect values breaches and violations in the code. Finally, the "value recommendation" component provides recommendations to alleviate or fix the detected values breaches and violations. Although the AIR framework has not been evaluated yet, it is expected to help developers learn about user values and write code aligned with user values.
Similar to the study (P27), Rathnayake et al. (P29) try to embed human values into user interfaces. They specifically target adaptivity and usability as two human values. They introduce a development framework to generate an adaptive user interface automatically. This is achieved by a deep analysis of user behavior patterns and by customizing web user interfaces, both of which are supported by machine learning solutions. User behavior is captured by considering the user's mouse pointer waiting time and click count in each component on a webpage. The development framework detects which components/subcomponents of a web page are rarely used and dynamically switches them off based on the collected user behavior data. This improves the usability of the website, as non-technical users can obtain a customized interface with minimum configuration effort.
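The detection of rarely used components from pointer waiting time and click counts can be illustrated with a simple rule. P29 relies on machine learning for this step; the sketch below deliberately swaps that for fixed thresholds to stay short, and the function name, threshold values, and component identifiers are hypothetical.

```python
def rarely_used_components(usage, min_clicks=1, min_hover_seconds=2.0):
    """Return ids of components that fall below both usage thresholds.

    usage: dict mapping a component id to (click_count, hover_seconds).
    The thresholds are illustrative assumptions, not values from P29.
    """
    return sorted(
        component
        for component, (clicks, hover_seconds) in usage.items()
        if clicks < min_clicks and hover_seconds < min_hover_seconds
    )


# Hypothetical usage data collected for one user on one page.
usage = {
    "search_box": (14, 35.0),
    "news_ticker": (0, 0.4),
    "footer_links": (0, 6.5),
}
to_switch_off = rarely_used_components(usage)
```

With this data only `news_ticker` is selected, because `footer_links` received no clicks but still attracted noticeable pointer time.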

D. TESTING
1) Detect and Test Values-critical Features
This category includes solutions aiming to inspect if the values elicited from stakeholders or other resources (e.g., requirements documents) are being reflected in a designed prototype or implemented system (P5, P17, P18, P22, P42, P46, P49, P51). The VCIA model (P5) introduces the eValue artifact and practical guidelines on how to use this artifact, which help practitioners (e.g., designers) evaluate and reason about a software solution and its features from the perspective of values supported or ignored. The eValue artifact helps practitioners explicitly assess the impact of the neglected value(s) and provides suggestions to identify and implement new features to include the neglected values in the next release.
Galhotra et al. (P46) develop a fairness testing approach called Themis to automatically measure two types of discriminations in a software system that makes decisions (e.g., loan software): group discrimination and causal discrimination. Group discrimination focuses on detecting and scoring the difference between two or more input groups resulting in a similar output. Causal discrimination checks if the software remains fair when some characteristics (e.g., the race of individuals in the loan system) of inputs are changed.
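The two kinds of discrimination measured by Themis can be illustrated with simplified scores. The sketch below is not the Themis implementation: Themis generates test inputs automatically, whereas here the caller supplies them, and the exact score definitions are assumptions made for illustration (the gap in favourable-outcome rates across groups, and the fraction of inputs whose decision changes when only the protected attribute is altered). All names and the example loan rule are hypothetical.

```python
def group_discrimination(decide, inputs, attr):
    """Largest gap in favourable-outcome rate between groups defined by attr."""
    if not inputs:
        raise ValueError("inputs must not be empty")
    totals, favourable = {}, {}
    for x in inputs:
        group = x[attr]
        totals[group] = totals.get(group, 0) + 1
        favourable[group] = favourable.get(group, 0) + bool(decide(x))
    rates = [favourable[g] / totals[g] for g in totals]
    return max(rates) - min(rates)


def causal_discrimination(decide, inputs, attr, values):
    """Fraction of inputs whose decision changes when only attr is altered."""
    if not inputs:
        raise ValueError("inputs must not be empty")
    changed = 0
    for x in inputs:
        baseline = decide(x)
        alternatives = (v for v in values if v != x[attr])
        if any(decide({**x, attr: v}) != baseline for v in alternatives):
            changed += 1
    return changed / len(inputs)


# Hypothetical loan rule that (wrongly) depends on the protected attribute.
def biased_loan(x):
    return x["income"] >= 50 and x["race"] != "b"


applicants = [
    {"race": "a", "income": 60},
    {"race": "a", "income": 40},
    {"race": "b", "income": 60},
    {"race": "b", "income": 40},
]
group_score = group_discrimination(biased_loan, applicants, "race")
causal_score = causal_discrimination(biased_loan, applicants, "race", ["a", "b"])
```

For a rule that looks only at income, both scores would be zero on the same applicants, which is the behaviour a fair decision procedure is expected to show under these definitions.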
Tramer et al. (P49) propose the FairTest tool to aid software practitioners to identify and fix "fairness bugs". The FairTest tool takes several user attributes, including protected attributes (e.g., race), and the outputs produced by a given software system for users as input and generates an association bug report. A typical association bug report includes statistically significant associations between protected attributes and outputs, and developers can easily understand and interpret it to determine real bugs that require fixing from reported associations.
As discussed in Section VI-A2, AMUSE (P42) can reveal the level of user satisfaction with the features of the current version of a product. AMUSE further specifies if the users perceive the product as weak in the following quality aspects: effectiveness, productivity, hedonism, and trust. AMUSE evaluates the new features that are supposed to be included in the next release (i.e., Feature Appraisal). The evaluation is done by measuring how well each candidate feature improves the effectiveness, productivity, hedonism, trust, and overall satisfaction of the new product. AMUSE helps practitioners decide which features to include in the next releases to improve user satisfaction (called Feature Prioritization).
The Continual Value(s) Assessment (CVA) framework (P18) is composed of four components: "identifying values of stakeholders", "developing initial feature model", "value-brainstorming iterations", and "level-wise evaluation". The CVA helps practitioners build the rationale from highly abstract concepts of human values to design choices of a system that the practitioners develop. CVA uses a combination of Goal Modeling and Feature Modeling techniques to continuously evaluate values at different development levels such as conceptual, behavioral, and implementation levels. This helps decision-makers pick the values they want to implement and ensure that the values are being implemented across all development levels.

E. TEAM ORGANIZATION
1) Measure Team Members Values
Any organization may have internal and unique values expected to be accepted, respected, and followed by its staff and job applicants. However, software organizations usually do not examine to what extent their personnel and job applicants understand, agree with, and respect universal human values (e.g., inclusiveness, diversity). Hussain et al. [102] refer to "hiring staff with values in their minds" (i.e., values-minded staff) as a cultural approach to address values in software. Understanding how developers think about and appreciate values can reveal any conflicts between a project's values and the values of the staff assigned to the project. This may also show which areas of improvement an organization's staff should pursue. We found five primary studies (P2, P25, P28, P33, P43) that develop solutions to understand and measure team members' perceptions of human values.
As for developers or developers do not trust each others' profiles. Developer Reputation Estimator (DRE) (P30) addresses this issue by building a trustable profile for developers. DRE collects and considers both quantity and quality measures for this purpose: (1) It calculates the number of commits of a developer; (2) It assesses the impact of the works done by a developer (e.g., it calculates how many times other developers reuse code written by developers); and (3) It measures the importance and impact of a developer's collaborators.
Ying et al. (P33) propose a reviewer recommendation technique (EARec) to help core developers decide who should review an incoming Pull Request (PR). EARec simultaneously considers developer (reviewer) expertise and authority when making the decision. The level of expertise of a reviewer for each incoming PR is assessed by determining the similarity between the title and description of an incoming PR and the title and description of the PRs commented on by the reviewer. EARec measures authority by calculating the number of PRs that reviewers have commented on together. In the last step, the Random Walk with Restart (RWR) algorithm is applied to balance expertise and authority. A reviewer with many relationships with others is considered an experienced reviewer with more authority.
Values are not usually taken into account when organizing a software development team. A compatible team can better pursue and realize the team's goals (e.g., developing software systems aligned with today's diverse society). Surian et al. (P25) propose Developer-Project-Property (DPP) to find a list of compatible developers based on historical data. To this end, the DPP approach first collects data regarding developers and the projects that they worked on. For each project, the approach collects two properties: (1) the programming languages used in the project and (2) the category of the project. Then a graph is built with three types of nodes: Developer, Project, and Project Properties. In the last step, the RWR algorithm is applied to compute the similarity between developers.
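Random Walk with Restart, which both EARec (P33) and DPP (P25) are described as using, can be sketched in a few lines. The version below is a generic power-iteration form of RWR on an undirected graph, not the implementation of either study; the restart probability of 0.15, the iteration count, and the small developer-project-property graph are illustrative assumptions.

```python
def random_walk_with_restart(adjacency, start, restart=0.15, iterations=100):
    """Return a proximity score for every node relative to `start`.

    adjacency: dict mapping each node to a list of its neighbours
               (undirected graph; every node needs at least one neighbour).
    restart: probability of jumping back to `start` at each step (assumed value).
    """
    nodes = list(adjacency)
    scores = {n: 1.0 if n == start else 0.0 for n in nodes}
    for _ in range(iterations):
        # The restart mass always returns to the start node.
        updated = {n: (restart if n == start else 0.0) for n in nodes}
        for n in nodes:
            share = (1.0 - restart) * scores[n] / len(adjacency[n])
            for neighbour in adjacency[n]:
                updated[neighbour] += share
        scores = updated
    return scores


# Hypothetical graph: developers, projects, and project properties.
graph = {
    "dev_a": ["p1"],
    "dev_b": ["p1"],
    "dev_c": ["p2"],
    "p1": ["dev_a", "dev_b", "lang:python", "cat:web"],
    "p2": ["dev_c", "lang:java", "cat:web"],
    "lang:python": ["p1"],
    "lang:java": ["p2"],
    "cat:web": ["p1", "p2"],
}
scores = random_walk_with_restart(graph, "dev_a")
```

In this example `dev_b`, who shares a project with `dev_a`, receives a higher score than `dev_c`, who is connected to `dev_a` only through a shared project category.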

2) Adjust existing Roles and Responsibilities
Four studies (P21, P30, P34, P35) focus on roles and responsibilities to embed values in software. As Table 13 shows, software engineers/developers, Product Owners, Scrum Masters, validation team, and co-pilots need to adjust their responsibilities to address values. Miller and Larson (P23) recommend new responsibilities and skills for software engineers to develop software services or products that are more aligned with the values of a wide range of diverse stakeholders. They emphasize that software engineers should be familiar with and perform ethical analysis techniques (e.g., utilitarian analysis and deontological analysis) to put human values at a central point in the decision-making and predict the ethical consequence of each of their decisions. Pellegrini et al. (P34) argue that many accessibility issues in software projects are due to (1) postponing the implementation of accessibility features by teams that adopt Agile methods (for example, because they adopt the Minimum Viable Product approach), and (2) a lack of knowledge on the implementation of accessibility. Pellegrini et al. (P34) define a set of new responsibilities for roles involved in software development to address this issue. For example, Product Owner should prioritize accessibility from the beginning of the project and produce user stories that take into account disabled people and their needs. Scrum Master should guarantee that the DONE definition covers accessibility.
In the same line, Romero-Chacón et al. (P30) suggest that Scrum should be adjusted to integrate accessibility and usability criteria in the beginning steps of developing a software project. The customized Scrum has three extra phases (tasks) in Sprint: "accessibility test", "accessibility fixes", and "accessibility review". During the "accessibility test" phase, developers check if all users can access and use finished functionalities. Any accessibility issues need to be resolved in the "accessibility fixes" phase by developers. Once accessibility corrections are applied, the validation team, with the help of customers, determines which web pages and to what extent should be evaluated from the accessibility perspective. Next, the validation team uses the Web Content Accessibility Guidelines (WCAG) as a criteria-based framework to audit the entire website (or a representative sample of the website).

VII. A FURTHER ANALYSIS OF CURRENT SOLUTIONS
In Section VI, we grouped the 51 solutions into ten categories based on five areas in software engineering: requirements, design, implementation, testing, and team organization. This section further classifies and articulates the 51 identified solutions from two perspectives: the type of human values they aim to operationalize and tool support.

A. OPERATIONALIZED HUMAN VALUES
Our analysis shows that the 51 solutions can be arranged into two groups according to the type of human values operationalized by them: Holistic View and Exclusive View.

1) Holistic View
We found that the majority of the primary studies (32 out of 51, 62.7%) do not focus on specific values and provide solutions that aim to operationalize human values as a concept that refers to a wide range of human-centric issues (e.g., emotional value and economic value) without explicitly distinguishing them. In Table 14 and Figure 7, we label such solutions as a 'Holistic View' (See 'Values' column in Table 14). For example, while P12 focuses on safety, youth, and addressing homelessness to show the effectiveness of the Value Sensitive Action-Reflection model, the model is not restricted to any specific values. It can capture and reveal any human values during the design process and recognize (new) features that are more aligned with human values. Similarly, the RUMBO (Rule-Based Generation of Mobile User Interface) framework (P27) can be used to define any human values and detect any value violations, but P27 mostly targets accessibility and usability to evaluate the RUMBO framework.

Some studies (e.g., P2, P4, P16) use the well-known values models (e.g., the Schwartz theory of basic values) as a starting point to show and assess the functionalities of their proposed solutions. We also categorized such solutions (e.g., Values Q-sort (P2)) as 'Holistic View'.

B. TOOL SUPPORT
We also investigated whether any tools have been developed to support the proposed solutions. As Figure 7 shows, 10 of the 14 developed tools address one or two exclusive values, and the rest work for any human value. We also observe from Figure 7 that the majority of the solutions targeting any value ('Holistic View'), 28 out of 32, are not supported by any tool.
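The counts reported in Sections VII-A and VII-B can be cross-checked with a short tally. The sketch below is only a consistency check on the figures quoted in the text; the derived numbers (e.g., the count of 'Exclusive View' solutions) follow by subtraction and assume that each tool supports exactly one solution, which the text does not state explicitly.

```python
# Counts quoted in the survey text.
TOTAL_SOLUTIONS = 51
HOLISTIC = 32                # solutions labeled 'Holistic View'
HOLISTIC_WITHOUT_TOOL = 28   # holistic solutions with no tool support
TOOLS_TOTAL = 14             # developed tools
TOOLS_EXCLUSIVE = 10         # tools addressing one or two exclusive values


def derive_counts():
    """Derive the remaining figures by subtraction (assumed 1 tool per solution)."""
    exclusive = TOTAL_SOLUTIONS - HOLISTIC
    tools_holistic = TOOLS_TOTAL - TOOLS_EXCLUSIVE
    holistic_with_tool = HOLISTIC - HOLISTIC_WITHOUT_TOOL
    return {
        "holistic_share_pct": round(100 * HOLISTIC / TOTAL_SOLUTIONS, 1),
        "exclusive": exclusive,
        "tools_holistic": tools_holistic,
        "holistic_with_tool": holistic_with_tool,
        "exclusive_without_tool": exclusive - TOOLS_EXCLUSIVE,
        "solutions_without_tool": TOTAL_SOLUTIONS - TOOLS_TOTAL,
        # The two independent routes to "holistic solutions with a tool" agree.
        "consistent": tools_holistic == holistic_with_tool,
    }


if __name__ == "__main__":
    for key, value in derive_counts().items():
        print(f"{key}: {value}")
```

Under that assumption, the two routes to the number of tool-supported holistic solutions (14 - 10 and 32 - 28) agree, so the reported figures are mutually consistent.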

VIII. IMPLICATIONS FOR RESEARCH AND PRACTICE
In Sections V, VI, and VII, we provided an in-depth analysis of the research around operationalizing values in software and presented a taxonomy of solutions for this purpose. In the coming sections, we present implications for research and practice. We also highlight some promising research areas and open issues that need more consideration from researchers and practitioners.

A. DEVELOP VALUES-FRIENDLY SOLUTIONS BEYOND REQUIREMENTS ENGINEERING AND DESIGN
Our findings highlight that requirements engineering and design have received a significantly higher number of contributions than software implementation and testing. Out of the identified 51 solutions, 33 relate to the requirements engineering phase, making requirements engineering the most common area to seek guidance from for operationalizing values (See Table 14 and Figure 6). We argue that this fertility results from a combination of (1) the recognition and labeling of human values as goals in social sciences and psychology dating back to the 1970s [6], [28], and (2) the development, more than three decades ago, of goal-oriented RE techniques aimed at satisfying user goals and expectations from the system [103], [104].
The concentration of solutions on requirements engineering and design highlighted in our results helps explain the findings of [102], which reports that the traceability link between values design and values implementation is often broken. Even the techniques from HCI (Human-Computer Interaction) that may support values consideration during the implementation phase often do not cross over to software engineering. Our taxonomy of solutions in Figure 7 clearly highlights the need for more solutions for the implementation phase of software engineering. The current tally stands at 5 solutions for the implementation phase compared to 16 for the design phase.
Furthermore, the proposed solutions to integrate values into the software system during requirements engineering are better supported by industrial validations than those introduced in software development or testing.  to develop artifacts informed by human values. The requirements engineering artifacts, such as value stories, are of critical importance since they act as inputs to downstream activities, hence enabling the operationalization of values in software. While the availability of a healthy pool of techniques and guidelines during the RE phase is encouraging, further contributions are still needed for the development and testing phases to ensure the successful implementation of values into the software (as discussed in Section VIII-D).

B. COLLABORATE WITH PRACTITIONERS TO CO-DEVELOP AND CO-VALIDATE VALUES-BASED TOOLS
Practitioners mainly rely on tools to build technology. When it comes to sensitive issues like values, developers need tools that support the development and implementation of the target values. More importantly, the effectiveness of these tools needs to be validated beforehand, ideally in a commercial project or setting. Based on our examination of the literature, we highlight this as an open area requiring more attention and effort. Growth in the availability of tools can profoundly increase the incorporation of the specified values, as we have seen in the case of privacy, security, and accessibility, which enjoy mature research and implementation tool support [105], [106]. Our examination of the trend in tool development suggests that most of the proposed techniques in our sample of studies are either not supported by a tool or the accompanying tools are not validated in real-life projects. This creates a multi-faceted problem. a) It takes years before such innovations are commercially feasible and reach the industry (this is a general trend in software engineering research). b) It is challenging to prove their efficacy and efficiency in real-life projects. c) Since these tools are perceived as 'cooked-up' in research labs rather than co-developed with practitioners, tool uptake requires a disruption rather than a diffusion of the innovation [107]. Collaborative approaches such as co-design, suggested by the research on human values in software engineering, can be leveraged to improve the development and acceptance of tools designed to integrate values in software (discussed further in Section VIII-D).

C. ALIGN DEVELOPER VALUES WITH THE PURPOSE OF TECHNOLOGY DEVELOPMENT
Technologies actively help shape our societies, rather than being neutral means to help realize human ends [108]. Studies in software engineering [102], machine learning [109], computer game design [110] and social sciences [111] have adequately demonstrated that technologies can (by design or accident) embody values, especially of those who create them. Therefore, the role and power of the creators of these technologies at the individual level and team level cannot be overemphasized.
Our findings, however, highlight that only a handful of solutions exist to identify or measure developer team values. Close examination of these studies also reveals that no attempt has been made to link or align the identification of team and developer values with the technology's development purpose or intentions. Similar to other studies in software engineering [102], [112], our findings suggest that building values awareness in the team members, creating values-specific roles for team members, and aligning developer motivations with values expectations from technology are likely to result in achieving the desired software outcomes. We believe that the 'friction' caused by the misalignment of developer values with system intentions can severely hamper building technologies with the desired purpose and impact. The importance of the topic and the paucity of research to identify, measure, and align developer values with the purpose of technology call for a significant investment of research efforts in this area.

D. DEVELOP METRICS FOR VALUES VALIDATION AND MEASUREMENT
Metrics are a vital part of the verification and testing of software design and code artifacts [113], [114]. Although it is argued that values operationalization requires the development of value-specific metrics to support values-based testing and verification [16], our analysis reveals that there has been little progress on this front. While our survey identified a few mechanisms that measure team member values, the metrics that measure individual values are virtually non-existent. A lack of consensus on the definitions of values could be offered as a possible explanation for this phenomenon. Even if this justification is entertained, the implementation of values-critical requirements or their quality is particularly difficult to verify without the availability of metrics. Given the subjective nature of values, a general lack of availability of quantitative metrics is hardly surprising; however, not finding qualitative measures presents an opportunity for contribution and clearly suggests that the software engineering research community needs more focused efforts to address this problem.

IX. LIMITATIONS AND THREATS TO VALIDITY
The paper collection and data analysis steps may have introduced some limitations and threats to our survey. Following the recommendations proposed in [23], [44], we first created an initial pool of papers by executing a very simple but broad keyword-based search string on Google Scholar. We then performed the backward snowballing technique to minimize the risk of missing relevant papers. As we discussed in Section IV-C, we did not conduct the forward snowballing technique. Hence, we acknowledge that our final set of primary studies may not be the most comprehensive one, and we might have missed some important primary studies. Some steps in the paper collection process (See Section IV-C), such as reducing 315 papers to 131 papers and identifying papers for the snowballing iteration, were performed by the first author. Such activities may have introduced a subjectivity threat. Our strategy to minimize such a threat was to maintain a shared Excel spreadsheet file and record all the reasons for including and excluding papers in each step of the paper collection process. This made the paper collection process visible to all authors and enabled them to review the chosen and excluded papers and provide their feedback.
We have classified the 51 primary studies for different purposes (Sections V, VI, and VII). To minimize the misclassification of the primary studies in Sections V-E and VII-A, we tried to reduce our personal judgments and interpretations when analyzing the collected data. Concerning the taxonomy of solutions (Section VI), the first author manually analyzed the collected qualitative data (data items D14 to D17) for this purpose. To reduce possible subjective bias, the taxonomy was built in several iterations and constantly shared with the other authors to seek their feedback.

X. CONCLUSION AND FUTURE WORK
This survey has provided a detailed analysis of the state-of-the-art solutions for operationalizing values in software. Our findings will allow software engineering practitioners and researchers to understand the research around operationalizing values in software and identify essential open areas for further research and investment. The main results of our survey are:
In this study, we focused on solutions for operationalizing human values in software engineering. There are several future directions for this work. It is worth identifying and analyzing techniques proposed in the AI community that attempt to operationalize human values in machine learning algorithms and models. Given the increasing importance of values consideration in software among software practitioners, multi-vocal literature reviews [115]
The 'Category' column shows which of the 10 categories (O1 to O10 in Figure 6) each solution belongs to.
JOHN GRUNDY is an Australian Laureate Fellow and Professor of Software Engineering at Monash University. He leads the HumaniSE research lab. Currently, he researches new approaches to engineering software systems that fully take into account the "human" aspects of end-users and team members.
JON WHITTLE is director of CSIRO's Data61, the digital technologies and data science arm of Australia's national science agency. He is also an adjunct (full) professor with the Faculty of Information Technology, Monash University, Melbourne. His research interests include the intersection of software engineering and human-computer interaction. He is best known for his work in model-driven development, aspect-oriented modelling, digital technologies for social good, and values in software. VOLUME 0, 2022