Ethical Considerations and Checklist for Affective Research With Wearables

As the popularity of wearables increases, so does their utility for studying emotions. Using new technologies raises several ethical challenges that should be considered to improve research designs. Several ethical recommendations exist for utilizing wearables to study human emotions, but they focus on applications of emotion recognition systems rather than on research design and implementation. To address this gap, we have developed a perspective on studying emotions with wearables, especially in daily life, by adapting the ReCODE Health Digital Health Framework and its companion checklist. Our framework consists of four domains: (1) participation experience, (2) privacy, (3) data management, and (4) access and usability. We identified 33 primary risks of using wearables to study emotions, including research-related negative emotions; the collection, processing, storage, and sharing of personal and biological information; the validity and reliability of commercial technology; and exclusivity issues. We also proposed possible strategies for minimizing these risks. We consulted members of ethics committees and relevant researchers on the new ethical guidelines. The judges (N = 26) positively rated our solutions and provided useful feedback that helped us refine our guidance. Finally, we summarized our proposals in a checklist for researchers' convenience. Our guidelines contribute to future research by providing improved protection of participants' and scientists' interests.


INTRODUCTION
Most researchers hypothesize that emotions can be recognized using self-report data along with objective behavioral and physiological indicators [1]. Until recently, however, the collection of rich multimodal emotion data was restricted to laboratory settings [2], [3], which limited the ability to recognize emotions in everyday life. Now, we can collect data related to emotions experienced in the field through a combination of the Experience Sampling Methods [4] and wearables [5], [6], [7]. The rapid development of wearable technologies and artificial intelligence (AI) opens new possibilities in affective science that overcome lab-based limitations [8], [9], [10], [11], [12].
For instance, a recent review found that amusement elicitation does not cause significant respiratory, cardiovascular, or electrodermal changes [13]. This contradicts the common experience of being amused: individuals may have trouble catching their breath while laughing, and the accompanying muscle action is sometimes so strong that they may feel abdominal soreness the next day. The lack of support for physiological changes may result from the lab methods usually used in psychophysiological studies. Amusement is usually elicited with funny film clips while participants are attached to medical-grade apparatus that restricts their movement [13]. With wearables, researchers should be able to collect data on amusement experienced in everyday life that will hopefully include the strong physiological reactions mentioned above.
Using wearables also opens the possibility to account for the role of context when studying emotions. While collecting physiological and behavioral data with wearables, it is also possible to collect additional information about the context, including participants' location, the presence of other people, and sound or lighting conditions. Considering the role of context may help to overcome the limitations observed in other affective domains in which neglecting the importance of context led to large controversies (e.g., in the facial expression analysis domain [14]).
Wearables are body-worn devices embedded with sensors that monitor individuals' behavioral and physiological activity, such as smartwatches, wristbands, or chest straps. The usage of wearable technologies for research has nearly doubled in the last few years [15]. Due to their unobtrusiveness and convenience, wearables are increasingly being utilized by individuals to improve their well-being, sleep, and fitness [10], [16]. For instance, wearables have recently allowed researchers to effectively detect seizures [17], [18] and help with the precision management of diabetes [19]. We believe that exploiting behavioral and physiological signals acquired from wearables has similar potential for scientific discoveries in affective science.
Although using wearables to study emotions holds promise, it also poses potential ethical risks [20]. Given the incredible potential (current and future), it is critical to reflect on how to plan and conduct ethical and responsible research with wearables and human involvement. However, the digital research community lacks ethical guidance, making it difficult for scientists to determine how best to inform prospective participants and to gather, manage, and share data by means of wearables [21].
Furthermore, the interdisciplinary nature of affective computing research using wearables presents challenges not only to researchers but also to the relevant ethics committees [21], [22], [23]. These committees are guided by regulations and ethical principles, which, unfortunately, have not kept up with the pace of technological development [24]. For instance, committees have struggled to evaluate studies that passively collect data from participants' surroundings in real time [25]. Moreover, the regulations (if they exist in a given country) were created when most researchers came primarily from academic institutions, which are bound to apply federal or national regulations due to public funding. High-tech companies now possess sufficient resources to launch large-scale psychological and biomedical research. Since these studies do not depend on federal funding, such entities are not bound by regulations designed to protect research participants. This raises a risk that some investigations might be profit- rather than ethics-driven. Hence, changes in scientific interests should be followed, and even anticipated, by the evolution of ethical standards, guidelines, and codes for research. While members of ethics committees should keep up with the pace of technological development, numerous challenges prevent appropriate knowledge updates. A study's (dis)approval is influenced by the boundaries of its members' scientific knowledge, including awareness of the volume and granularity of data produced while using wearables. This knowledge gap can impact risk assessment in unexplored fields and research topics [26], which in turn results in unclear oversight mandates and inconsistent ethical evaluations [27].
Here, we aim to address ethical issues specific to studying emotions with wearables in field research. Building upon available frameworks in psychology and computer science [28], [29], [30], [31], [32], [33], [34], we identify ethical risks and group them into four domains inspired by the Digital Health Framework [21]. Following the general recommendation that ethics should provide examples of what is right, rather than prescribing what should be avoided [31], we have also developed strategies to minimize the risks. Finally, we consulted affective scientists and ethics committee members on our proposals, which resulted in the final list of potential risks and recommendations for minimizing them. We strongly believe that our recommendations may serve as guidelines for affective scientists working with wearables. Our work will help researchers address ethical concerns, not only in planning a study but also in the process of obtaining approval from an ethics committee. Furthermore, we argue that the guidelines may serve ethics committees evaluating the risks of projects that examine emotions with wearables in field studies.
The main contributions of this paper are: 1) We identified 33 risks specifically related to carrying out affective research with physiological signals provided by wearables, especially in everyday life. 2) We developed appropriate recommendations for each identified risk. 3) We consulted, validated, and revised both risks and recommendations with external experts worldwide. 4) Based on the risks and related recommendations, we developed a checklist to support researchers in preparing and conducting their studies.

EXISTING ETHICAL GUIDANCE
Ethics is the study of proper action [35]. New technologies raise new ethical challenges that need consideration to improve appropriate action in research. When working with new technologies, scientists usually start from more general ethical recommendations and tailor them to specific research questions. Thus, researchers follow general principles such as respect for persons, autonomy, beneficence, justice, and non-maleficence. These principles have been included in many national and international human research ethical guidelines, including the Charter of Fundamental Rights of the European Union [36], the Declaration of Helsinki [37], the Belmont Report [38], and the Menlo Report [39]. Furthermore, researchers follow their professional ethics or the ethics of their scientific field. For affective computing, which is an interdisciplinary field, scientists may rely on guidelines that emerged from computer science (e.g., the IEEE Code of Conduct [40], IEEE Code of Ethics [41], and IEEE Ethically Aligned Design [42]) and from psychology (e.g., the APA Ethical Principles of Psychologists and Code of Conduct [33] and the BPS Code of Human Research Ethics [34]). Although the general guidelines provide useful recommendations for high-order issues (e.g., the necessity of informed consent), they do not address specific risks related to narrower scientific areas, such as using wearables to recognize emotions.
There are few ethical guidelines in affective computing [28], [29], [30], [31], [32]. However, rather than addressing specific issues related to studying human emotions with wearables, these perspectives provide a very general ethical framework for affective computing [32] and for the ethical consequences of affectively-aware artificial intelligence [31], or focus on the ethical impact on members of scientific teams rather than research participants [28], on applications of emotion recognition systems [29], and on recognizing emotions from text [30].
As a result, in searching the existing ethical frameworks, we explored other scientific fields that collect data with wearables on human participants, including medicine and public health. In recent years, one promising and complementary ethics perspective for digital health research was created, namely the ReCODE Health Digital Health Framework and its companion checklist, the Digital Health Checklist for Researchers (DHC-R) [20]. The DHC-R was initiated using a framework grounded in the ethical principles spelled out in the Belmont Report and Menlo Report: beneficence, justice, respect for persons, and respect for Law and Public. Beneficence relates to appropriately balancing possible harms and benefits resulting from the research [39]. Justice relates to fairness in selecting research participants and fair distribution of the costs and benefits of research according to individual needs and effort [39]. Respect for persons relates to participants' autonomy, with specific protections for individuals with diminished autonomy (e.g., minors) [38]. Respect for Law and Public relates to compliance with relevant laws, contracts, and terms of service, and to transparency-based accountability [39]. Applying these ethical principles to each domain is critical for ethical decision-making [20]. The DHC-R is structured around four domains: (1) risks and benefits, (2) privacy, (3) data management, and (4) access and usability. Risks and Benefits focuses on weighing the potential harms and disadvantages against the potential benefits in terms of knowledge to be gained from the study. Privacy focuses on the type of personal information collected about participants, its ownership, and who has access to the data. Data Management focuses on collecting, storing, sharing, and protecting data. Access and Usability focuses on issues related to access and efficient usage of proposed devices and technology [20]. In this article, we renamed the Risks and Benefits domain the Participation Experience domain. As all domains relate to a study's risks and benefits, we believe this name fits our risks and recommendations better.

IDENTIFIED RISKS
First, we identified the primary ethical risks for affective research using wearables. To ensure the list of risks is comprehensive, we developed it using a combination of approaches that include (1) a state-of-the-art literature review; (2) our experiences in using wearables in research; (3) research participants' feedback; (4) suggestions from ethics committee members; and (5) suggestions from members of psychological and AI societies. Furthermore, we brainstormed with an extended team of 12 researchers. We then sorted our ideas by linking similar proposals and defining and clarifying risks. By risk, we mean the potential physical or psychological harm or discomfort to participants that may arise from the investigations. We identified risks that apply to a broad range of research contexts, including laboratory and field studies. Here, we evaluated specific risks related to affective research using wearables (e.g., distress caused by repetitive testing) rather than general risks in scientific research (e.g., involuntary participation). The general risks are listed at the end of the section. Although most of the identified risks apply to studies passively collecting data with wearables, we also detected some specific risks of using AI solutions in affective studies (e.g., Risks 11, 28, and 29).
Next, we recommended risk minimization strategies by proposing actions that can be performed during the planning or implementation stage of the study. Our recommendations are addressed to researchers, so we present them in the second-person grammatical form, i.e., you/your. Finally, based on the Digital Health Checklist for Researchers, we grouped our suggestions using four domains, namely (1) participation experience, (2) privacy, (3) data management, and (4) access and usability.
To clarify the research context, we added icons next to the risks' names that mark whether a given risk and recommendation apply to wearable research conducted in the lab, in the field, or in both scenarios.

Participation Experience Domain
Risk 1: Studying a sensitive topic
If a study involves recalling past situations, participants may experience emotions associated with those situations. If the emotions are unpleasant, participants may experience psychological harm [43].
Recommendation: You should help participants consider any unpleasantness they may experience during the study. Strategies to help participants process or recover from unpleasant feelings include positive psychology interventions, such as expressing gratitude and kindness to others. Additionally, participants may be compensated for any negative emotions experienced during the study. These steps may balance the unpleasantness and pleasantness associated with participation in the study. You may consider referring subjects to professional help at no cost to them.
Risk 2: Study-related guilt
If participants forget study procedures, they may experience feelings of guilt. Examples include forgetting to wear or charge the device or to answer survey questions on time. Furthermore, participants may feel guilty as their enthusiasm for the study reduces over time and they stop following the study procedures.
Recommendation: You can inform the participants that it is acceptable to skip some aspects of the study to protect themselves from unpleasant sensations. We also encourage you to create procedures to monitor participants' well-being and intervene if necessary. Participants should also be encouraged to withdraw from the study or take a temporary break if they experience unpleasant sensations as part of the research. Above all, participation in research is voluntary. You may also consider examining whether the data is biased according to the stages of the study, e.g., beginning, middle, and end. These steps can help normalize forgetting study procedures and prevent feelings of study-related guilt.
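As a minimal illustration of such a stage-bias check, the sketch below compares survey compliance across the beginning, middle, and end of a participant's enrollment. It assumes chronologically ordered 0/1 response flags, one per prompted survey; the function name and data layout are hypothetical.

```python
import numpy as np

def compliance_by_stage(answered: np.ndarray) -> dict:
    """answered: chronologically ordered 0/1 flags, one per prompted survey."""
    stages = ("beginning", "middle", "end")
    parts = np.array_split(answered, 3)  # thirds of the enrollment period
    return {stage: float(np.mean(part)) for stage, part in zip(stages, parts)}

# A pattern like {'beginning': 0.92, 'middle': 0.71, 'end': 0.55} would suggest
# fatigue-related bias toward the end of the study.
print(compliance_by_stage(np.array([1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0])))
```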
Risk 3: Study-related frustration
If the technology associated with the study does not work properly or as expected by participants, then the participants may experience feelings of frustration and even anger associated with the research.
Recommendation: You should pilot test the technology and the study procedures within the research team (alpha testing) and on real users (beta testing) before the research. Proper testing should minimize the possibility of errors and bugs during the actual study. We encourage you to clearly explain how participants are to use the technology during the study and provide additional instructions as needed. Furthermore, you should minimize the participants' burden in the case of a device failure. You should replace or fix the device as soon as possible and in a way that does not involve additional effort from participants. These steps may help to reduce the risk of frustrating situations.
Risk 4: Study-related fear
If the technology feels fragile or expensive, participants may become overly cautious when using the technology and concerned that it may be stolen or damaged.
Recommendation: We recommend providing the participants with information about (1) the actual value of the technology, (2) what to do if it is damaged or stolen, and (3) the technology's unique ID number that can be traced back, e.g., if somebody steals it and tries to sell it online. Additionally, if the device is particularly valuable, you may consider providing instructions on how to conceal the device properly. You may also consider purchasing an insurance policy for the technology. Participants should also be reassured that no retaliation will follow accidental damage. These steps can help to reduce participant concerns about accidental damage or theft.
Risk 5: Fatigue
If the study procedures involve repetitive processes, such as responding to daily survey questions and remembering to wear and charge a study technology, over time, some participants may develop feelings of study fatigue.
Recommendation: You should ask participants to communicate if/when they are experiencing fatigue during the study. Encourage participants who are feeling study fatigue to take a break from the study procedures. You should inform the participants that it is more important to provide reliable data than more data. If participants are tired and do not want to report their emotions, it is better to skip the notification than to answer it recklessly. Strategies to reduce study fatigue include adding incentive mechanisms to your study procedures, such as gamification and rewards for completed surveys, though such mechanisms can introduce some bias.
Risk 6: Wearing discomfort
If the study procedures involve wearing technology on a regular basis, some participants may experience physical discomfort associated with the technology due to its size, weight, fit, or other design factors.
Recommendation: During the consent process, you should inform the participant that collecting data may require wearing sensors in unusual places (e.g., on the chest), which might be uncomfortable. Consider providing participants with options for how to wear the technology and ways of adjusting the technology so that the fit is comfortable.
Risk 7: Skin damage
If the study procedures involve wearing a technology tightly against the skin, over time, wearing the device may result in skin irritation, abrasion, or other harm. Additionally, some participants may be allergic to the materials used to manufacture the technology (e.g., substances on the strap).
Recommendation: You should inform the participants that collecting reliable data may require wearing sensors that fit tightly and/or stick to the skin. However, you should strive to develop a technology that is not uncomfortable or harmful, e.g., causing skin damage or pain. To reduce these risks, you may provide options for adjusting the technology (e.g., replacing a metal smartwatch strap with a leather one). You can also provide participants with information about what to watch for (e.g., discomfort or rash) and what to do if this happens (e.g., remove the device, report the incident to researchers, and consult a primary physician if the skin rash persists after a certain time, e.g., three days).
Risk 8: Financial responsibility
If the technology relies on energy, Internet access, or other resources from the participant, then some participants may feel concerned about the financial costs associated with providing these resources as part of the study.
Recommendation: As part of the study planning, estimate the potential costs of maintaining the technology while it is in the participants' possession (e.g., energy costs, Internet access fees). Plan to reimburse or provide participants with these additional resources as part of the study procedures. Explain how the study accounts for these additional costs during the consent process to reduce participants' feelings of financial responsibility.
Risk 9: Social stigma
If the technology is visible, some participants may feel concerned about how other people perceive them when wearing the technology. Examples include technologies that record situational information, such as voice, images, and location.
Recommendation: You should clearly describe the technology (e.g., its look, wear, and functions), all the types of data that the technology collects, and how the data will be managed during the study. You should also provide the participant with sample responses to standard questions from other people about the technology. Additionally, encourage the participant to remove the device if it makes other people uncomfortable. When automatically collecting data, ask participants to obtain verbal permission from family members, cohabitants, workplace managers, or supervisors before the study begins. However, automatically recording data (e.g., voice) may sometimes not be permitted by law, for example, in jurisdictions that require two-party consent. To the extent possible, potential bystanders should be informed about how data collection and management procedures may relate to them personally, either by contacting the research team directly or by asking the participant to do so. These steps can prevent negative social perceptions and reduce instances where data has been collected without consent from third parties [44].
Risk 10: Unknown harm
As there have been rapid advancements in wearable technologies, participants may feel concerned about the potential for currently unknown harms associated with using the technology.
Recommendation: You should inform the participants that, to the best of your ability, the research team will strive to recognize potential risks as they emerge during the research and will promptly communicate those to all participants. Additionally, you should consider pilot testing all possible scenarios to identify and reduce as many unknown factors as possible.
Risk 11: Automation bias
If the technology involves artificial intelligence, some participants may feel overconfident in the recommendations provided by the technology [43]. For instance, if the technology uses artificial intelligence to make inferences about a participant's emotions, some participants may become reliant on the recommendations as an emotional guide in decision-making (e.g., buying a specific t-shirt because the smartwatch vibrated when looking at it).
Recommendation: You should inform the participants about the limitations of artificial intelligence systems, presenting the opportunities, risks, and limitations clearly. A clear explanation of the tested systems' capabilities and limitations can help participants remain appropriately cautious about the technology and about the results and recommendations returned to them through the research.

Privacy Domain
Risk 12: Data anonymization
Some participants may expect to participate anonymously; however, it may not be feasible for them to do so given the study procedures. In this case, participants may feel deceived when they learn that their data is not anonymized to meet their expectations. In addition, there is a risk that, with the development of technology, physiological signals (e.g., ECG) will be used to identify individuals, just like fingerprints [45].
Recommendation: You should make every effort to anonymize data [43], [46], [47]. In an ideal world, even the data collector does not know which data belong to whom. However, to support data collection and resolve possible technical problems, a map between participant IDs and participant data should be retained for the duration of data collection. Once the data collection process is complete, researchers should irreversibly delete the link allowing them to identify which data belongs to whom. You should also inform participants about situations when their data is only partially anonymized and that you cannot guarantee that participants' data will not be reidentified in the future. New, more advanced deanonymization techniques keep emerging, and someday multiple kinds of anonymized data, when combined, may enable the identification of someone. These steps may help the participants to feel comfortable with the data collection and management procedures.
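A minimal sketch of this pseudonymize-then-unlink pattern is shown below. All file and function names are hypothetical, and plain file deletion stands in for whatever secure-erasure procedure your data protection plan mandates.

```python
import json
import secrets
from pathlib import Path

# The ID map lives in a separate, access-restricted location, never next to study data.
ID_MAP_FILE = Path("id_map.json")

def assign_pseudonym(participant_contact: str) -> str:
    """Create a random pseudonym and record the link in the separate key file."""
    id_map = json.loads(ID_MAP_FILE.read_text()) if ID_MAP_FILE.exists() else {}
    pseudonym = secrets.token_hex(8)      # random ID, carries no personal information
    id_map[pseudonym] = participant_contact
    ID_MAP_FILE.write_text(json.dumps(id_map))
    return pseudonym                      # only this ID is stored alongside sensor data

def anonymize_after_collection() -> None:
    """Irreversibly remove the link once data collection is complete."""
    # In practice, use secure deletion; ordinary unlinking may leave recoverable traces.
    ID_MAP_FILE.unlink(missing_ok=True)
```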
Risk 13: Individual-level access
Some participants may expect to have access to their individual data; however, it may not be feasible for researchers to provide this access. In such cases, participants may feel they are not benefiting from the study insights based on their personal contributions to the research. On the other hand, providing unsupervised access to a data subject may unintentionally result in psychological harm or discomfort. For instance, a person may become distressed by being confronted with such data, or it may lead them to unconsciously develop inaccurate interpretations.
Recommendation: If possible, you should develop ways of returning study data to participants at individual or aggregated (group) levels. This may not be feasible once the data is anonymized, i.e., once the link between participant IDs and their data is deleted. You should inform the participants when and how they can obtain their data. Sometimes, what information is returned and how needs to be determined based on the type of data and whether it will be of value to the participant. Thus, the access should be planned to be of value to the subjects while minimizing any harm or distress that may arise from the subject's observation and exploration of the data (e.g., observing a heart rate above 170 bpm). Sometimes data may need to be interpreted by a clinician or other expert.
Risk 14: Third-party access and data ownership
If the study involves wearable technologies that are commercially available, then the device manufacturer or other third parties may have access to data collected during the study without the researchers' and participants' knowledge. This may create confusion about who owns the data. When participants recognize this consideration, they may lose trust in the research and/or be concerned about how their data might be used (or used against them).
Recommendation: You should clearly inform participants who owns the collected data. When using commercial devices and software, you should inform the participants that some data collected for study purposes will be transferred to commercial apps and processed according to their privacy policies. You should read the Terms of Service and Privacy Policy and provide study participants with access to them. If vendor practices might violate participant expectations, do not use the product, or be explicit about what specific information the company will have access to and what it might do with that information. Alternatively, you can register the product so that the participant's identity is not linked to it. Furthermore, we encourage you to use wearables that do not carry such risks or to clearly state the relevant policies in the consent form. We also encourage you to collect minimal data, keep it locally, develop safe data migration procedures, and store data only for the minimum required time. These steps may prevent unwanted data sharing. Some of the procedures and privacy policies might be governed by regulations such as the GDPR in European Union countries [48], [49]. Furthermore, as a research data owner, you should also be prepared for a situation in which some researchers (or even the whole team) cannot continue their work. The outgoing researcher should designate a person to take over responsibility for the collected data or destroy it. These steps may ensure the continuity of research data access.
Risk 15: Researcher access
If researchers have access to non-anonymized qualitative data that includes personally sensitive information (e.g., an affair, sexual orientation, opinions about other people), then participants may feel concerned about how their data may be shared and with whom. This might be an especially sensitive issue when some participants know the researchers or other people who may gain access to the data.
Recommendation: You should clearly state who will have access to which data and for how long. Participants should be fully aware of the safety of shared information.
Risk 16: Temporary break
If participants want to stop data collection during specific time periods or events (e.g., stop receiving notifications during intimate or professional situations), they might not know how to do it and whether it is acceptable based on the study procedure. This can lead to feelings of confusion and a lack of agency among participants.
Recommendation: You should clearly explain to the participant that it is fine to stop data collection when needed and that data quality matters more than data quantity. Participants should be instructed on how to stop data collection, either by switching off the device or by selecting an option in the app that lets them choose which measures are collected at a given moment. In this way, participants should be able to stop data collection when necessary.
Risk 17: Informed data collection
If participants do not know what is registered by the device (e.g., sound, location, type of physical activity, presence of other wearable devices, or smartphone keyboard input) and for what purpose it will be used, they may reveal some unwanted information during the study (e.g., logins and passwords). When participants recognize this consideration, they may feel concerned about how their data might be used (or used against them).
Recommendation: You should inform the participants about the type of data collected by the devices and how the data might be used. You may also want to occasionally remind participants about the nature and granularity of the data collected, since the pre-study informed consent may not be completely understood. Ongoing reminders may be helpful and result in a more meaningful consent process.

Risk 18: Data insecurity
If the collected data is not properly secured (e.g., lack of encryption during data transfers from devices to servers), then the data can be leaked (e.g., due to a cyber-attack). If the participants' data is leaked, it may lead to lost trust in the research and/or concern about how their data might be used (or used against them).
Recommendation: Original data should be stored in offline encrypted storage, locked in a secure place. You should maintain a backup. All research staff members should be informed about the consequences of data sharing. Data sharing should be controlled. A data storage and access protocol should be established and maintained, preferably consulted with external experts. Furthermore, you should describe in the consent form where the data is stored and how it is transferred from the wearable to the other storage. For instance, the data collected by the wearable is transmitted via Bluetooth to a smartphone and then uploaded to a secured cloud via mobile data. These steps may ensure the participants' data safety and establish a secure data flow. We recommend following local data protection guidelines (e.g., the GDPR in the European Union), which are designed to ensure that the utmost care is taken to protect personal data. When data is sensitive (e.g., not possible to pseudonymize), we encourage additional risk and impact assessments with additional protection.
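As one deliberately minimal illustration of encryption at rest, the sketch below uses the widely available `cryptography` package; the file names are hypothetical, and key management (keeping the key offline and separate from backups) is the part that actually carries the security guarantees.

```python
from cryptography.fernet import Fernet  # pip install cryptography

# Generate the key once and store it offline, separately from the data and backups.
key = Fernet.generate_key()
fernet = Fernet(key)

# Encrypt an exported sensor file before it leaves the collection machine.
with open("session_042_ppg.csv", "rb") as f:
    ciphertext = fernet.encrypt(f.read())  # authenticated symmetric encryption

with open("session_042_ppg.csv.enc", "wb") as f:
    f.write(ciphertext)

# On the secured analysis machine, the holder of the key can recover the data:
# plaintext = Fernet(key).decrypt(ciphertext)
```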

Risk 19: Low validity and reliability of commercial technology
If researchers use commercial devices (rather than scientific devices), which might lack reliability and validity, then their scientific conclusions might lack quality. This can lead to biased conclusions from the study and, in severe cases, result in flawed law or policy decisions.
Recommendation: You should use validated/verified devices or validate the devices yourself. We encourage you to collect the raw data provided by the wearables. Thus, you may test the differences between the processing solutions provided by device producers and other state-of-the-art available solutions. Sometimes you will have to choose the wearables based on the required data type (e.g., raw photoplethysmography signal versus preprocessed heart rate). Furthermore, we recommend checking the completeness of the documentation of the device itself and of the device software. You should also establish data quality monitoring procedures (e.g., calculating the signal-to-noise ratio) to detect artifacts and signal noise. You should be aware that poor signal quality will lead to questionable model inferences. These steps may ensure the quality of collected data.
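A simple quality proxy of this kind is sketched below: the fraction of spectral power that a photoplethysmography (PPG) segment carries in the plausible heart-rate band. This is an illustrative heuristic, not a validated metric; the function name, band limits, and flagging threshold are assumptions to be tuned during piloting.

```python
import numpy as np

def ppg_band_power_ratio(signal: np.ndarray, fs: float, band=(0.5, 4.0)) -> float:
    """Share of spectral power within `band` (Hz); low values suggest a noisy segment."""
    signal = signal - np.mean(signal)                  # remove the DC offset
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    total = spectrum.sum()
    return float(spectrum[in_band].sum() / total) if total > 0 else 0.0

# Flag segments for manual review when the ratio falls below a pilot-derived threshold,
# e.g.: ppg_band_power_ratio(segment, fs=64.0) < 0.6
```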
Risk 20: Poor wearable fit
If the study aims to use physiological data, then improperly worn wearables may produce low-quality data and, in turn, incorrect inferences.
Recommendation: The wearables' accessories, like straps, should properly fit the participant's body. Sometimes the original accessories may not be enough, as they may lack sufficient size adjustment options. We recommend equipping the devices used with dedicated accessories that overcome these issues, e.g., a magnetic strap for smartwatches that enables precise adjustment. Wearing the devices properly is a necessary first step for all subsequent stages of the study.
Risk 21: Reporting or editing data
If participants collect invalid data (e.g., accidentally completing a survey while the phone was in their pocket), then researchers might not be aware of the incident and treat the data as valid reports. The moment participants realize that the report was filled out incorrectly, they may feel discomfort.
Recommendation: You should provide participants with an option to flag data they believe may be corrupted. This would help researchers make informed decisions about including or excluding the reported data.
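One lightweight way to support this, sketched below, is a participant-controlled validity flag stored with each self-report record; the schema and field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SelfReport:
    participant_id: str
    timestamp: str                  # ISO 8601, e.g., "2022-05-16T14:32:00"
    valence: int                    # e.g., 1-9 self-assessment scale
    arousal: int
    flagged_invalid: bool = False   # participant can mark accidental entries later

# At analysis time, flagged rows are inspected or excluded rather than silently trusted:
# valid_reports = [r for r in reports if not r.flagged_invalid]
```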
Risk 22: Technical problems
If the efficiency of the study-related technology (e.g., devices, applications, or AI models) depends on the operating system version, then it may sometimes malfunction due to unexpected errors or anticipated operating system changes and updates. This can lead to participants' time being wasted on non-functioning technology and to project delays.
Recommendation: We recommend planning comprehensive and continuous testing procedures. For instance, we recommend monitoring announced system changes (e.g., a new Android OS version) and making the application compatible in advance. Furthermore, it might be helpful to implement near-real-time technical monitoring (e.g., each day, you may check the completeness and correctness of acquired data). Thus, you may intervene promptly rather than after the study is complete. These steps may ensure the proper efficiency of the study-related technology.
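Such a daily completeness check can be as simple as the sketch below, which assumes a hypothetical per-participant count of samples received the previous day and a 1 Hz heart-rate stream; the 0.8 threshold is an assumption to calibrate per study.

```python
EXPECTED_SAMPLES_PER_DAY = 24 * 60 * 60   # assumed 1 Hz heart-rate stream

def completeness_report(daily_counts: dict, threshold: float = 0.8) -> list:
    """Return participant IDs whose previous-day data fell below the threshold."""
    return [participant_id
            for participant_id, n_samples in daily_counts.items()
            if n_samples / EXPECTED_SAMPLES_PER_DAY < threshold]

# Run each morning; contact flagged participants or swap devices promptly.
counts = {"P01": 86000, "P02": 51000}            # e.g., pulled from yesterday's uploads
print("flagged:", completeness_report(counts))   # -> flagged: ['P02']
```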
Risk 23: Unexpected contact loss
If the research team loses contact with the participants (e.g., in extreme cases due to a participant's death), then the team may lose the devices and some research data. This may lead to increased project costs and allegations of researcher mismanagement.
Recommendation: During the consent-signing process, you may want to ask for contact details of someone close to the participant so that you can determine the possible reason for a loss of contact. In terms of collected data, you should plan in advance procedures for using or removing data from participants who prematurely terminate the study. You may sign a device lease contract with the participants to form a civil law relationship between you and the participant; in this way, you might search for missing participants by asking the authorities for help. You can also take out an insurance policy that will cover your losses. In some specific scenarios, you can consider collecting a deposit equivalent to the value of the lent equipment; however, this may discourage participation in the study. Furthermore, you should be aware that unexpected contact loss is possible and consider purchasing extra devices and planning the budget appropriately.

Risk 24: General exclusivity
If the researchers recruit individuals only from WEIRD (Western, Educated, Industrialized, Rich, and Democratic) populations, then it may contribute to growing biased datasets and, in turn, to sex, race, and age discrimination.
Recommendation: You should recruit participants based on the scientific goal of the study. You should consider whether participants were provided fair access to the study by recruiting people of different ages, sexes, and races. However, sometimes the research questions might focus on studying a specific group (e.g., elderly populations), or the study might be run in a country with a homogeneous population, so full inclusivity is not possible. If this is the case, you should avoid overgeneralizing your findings and applications [30]. Furthermore, if studying a specific group, it is important from an access and usability perspective that the device and AI models have been tested with the target population in advance and are deemed usable [31]. Moreover, when studying unique populations, we encourage you to start the study on an easily accessible group (e.g., students) and then progress to groups that may benefit from the technology the most (e.g., the elderly). We believe that testing the procedures and practical solutions on readily available groups and then tailoring them to other populations might be optimal.
Risk 25: Excluding participants with specific physical conditions
If researchers collect physiological data with wearables, then researchers may exclude people with specific physical conditions that interfere with sensors (e.g., tattoos, obesity) [50].
Recommendation: You should be aware of the technology's limitations. Once you know the conditions under which the devices do not collect reliable data, you might consider (1) using only the conditions that ensure collecting reliable and valid data or (2) working to improve sensor quality and data preprocessing procedures. You should consider whether you can address the wearable sensors' limitations. If not, you might inform participants about the reason behind the exclusion criteria for the study.
Risk 26: Technological unfairness
If researchers collect data with wearables, then they may exclude people who do not own specific technology. For instance, researchers may want to use individuals' smartphones to collect the data, with some software requirements and access to the Internet. This may exclude individuals with old or low-quality devices who may not want, or may not be able, to afford the newest smartphone models required for the study.
Recommendation: You should provide participants with all the equipment needed to participate in the study. Participants may use their own devices if they find them more comfortable; in that case, you should inform the participants what device specification is needed.

Risk 27: Digital illiteracy
If researchers collect data with wearables, then they may exclude people who are not technology enthusiasts or people less familiar with using wearables.
Recommendation: We encourage you to use diverse recruiting strategies (going beyond social media advertisements) to reach interested people of all ages and levels of digital fluency. It may be necessary to educate the targeted population about the benefits of the technology in order to recruit them. Furthermore, the language of the study instructions should be as simple as possible and adjusted to the targeted population.
Risk 28: Biased inferencing
If the study uses AI models trained on a non-representative dataset (e.g., with respect to age, sex, race, health status, social status, or digital literacy), then researchers' inferences might be biased. In turn, the technology or solutions produced in research might not be useful for the discriminated groups. For instance, technology that works based on cardiovascular data may not work well for people with some cardiovascular dysfunctions (e.g., cardiac arrhythmia or the use of drugs or medications).
Recommendation: We recommend using datasets containing samples from diverse subjects for training AI models. You should inform participants about the original population that the technology was validated on and that the system may not work correctly on data from underrepresented groups. Moreover, the AI models should be tested on the target population to ensure that they work correctly.
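A per-subgroup performance audit makes such gaps visible before deployment. The sketch below is a minimal illustration assuming arrays of true labels, model predictions, and one demographic attribute per sample; the function name and the example numbers are hypothetical.

```python
import numpy as np

def subgroup_accuracy(y_true: np.ndarray, y_pred: np.ndarray,
                      groups: np.ndarray) -> dict:
    """Accuracy per demographic group; large gaps indicate biased inferences."""
    return {g: float(np.mean(y_pred[groups == g] == y_true[groups == g]))
            for g in np.unique(groups)}

# e.g., subgroup_accuracy(y, y_hat, age_bands) -> {'18-30': 0.81, '60+': 0.64}
# would call for reporting the gap and retraining with better subgroup coverage.
```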
Risk 29: Overgeneralization of individuals
If the study uses AI models trained on a general-population dataset (e.g., due to the lack of personalized data, i.e., the cold start problem [51]), then individual differences (e.g., in emotional responses and evaluations) might reduce the models' usability and lead to incorrect predictions. Even for a given individual, physiology and perception may vary with time and context.
Recommendation: We recommend utilizing personalization and contextualization methods when creating AI models. We encourage you to retrain general models on data from specific participants to fit the model more accurately. You can inform the participants about the personalization process, which requires collecting the individual's data to create a better-performing model.
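One common personalization strategy of this kind (fine-tuning a general model on a participant's own labeled data) is sketched below with scikit-learn; the random data stands in for pooled and participant-specific feature matrices, and the number of personalization passes is an arbitrary assumption. Note that older scikit-learn versions name the logistic loss "log" rather than "log_loss".

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
# Dummy stand-ins: pooled multi-subject data and one participant's labeled data.
X_pooled, y_pooled = rng.normal(size=(500, 8)), rng.integers(0, 2, 500)
X_person, y_person = rng.normal(size=(40, 8)), rng.integers(0, 2, 40)

# 1) General model trained on the pooled dataset (cold-start fallback).
model = SGDClassifier(loss="log_loss", random_state=0)
model.fit(X_pooled, y_pooled)

# 2) Personalization: a few extra passes over the individual's data.
for _ in range(5):
    model.partial_fit(X_person, y_person)

# Evaluate on held-out data from the same participant before relying on the model.
```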
Risk 30: Medical inferences
If participants collect health-related data, but not medical-grade data, participants may mistakenly want to use research data for health evaluations and transfer the data to health records.
Recommendation: You should carefully consider whether the data collected in the study have health implications. If this is the case, you should be aware of additional data processing and storage regulations. Furthermore, you should clearly state whether the data collected in the study might be used to evaluate participants' medical conditions (e.g., cardiovascular health) and whether it is possible to transfer the data to the participant so it may be consulted with a physician. For instance, respiratory and cardiac data recorded with chest straps might be useful for identifying sleep apnea [52]. These steps may clarify whether research data can be used for proper medical inferences.
Risk 31: Device reduced functionality
If participants expect that taking part in the study will allow them to take full advantage of the device they receive, then they might be disappointed that, due to the research requirements, some device functionality might be reduced (e.g., the necessity of charging wearables at night limits the possibility of measuring sleep).
Recommendation: You should inform the participants about the benefits of using the wearables (e.g., reading messages or answering calls on the smartwatch) while clearly addressing device functionality limited by its usage for research (e.g., short battery life).
Risk 32: Duplicated devices
If participation in the study requires using a specific smartphone or smartwatch, then the participant may end up with two smartphones/smartwatches (one private, one for research) being used in parallel, increasing the burden of study participation. Furthermore, if the participant treats the research device as secondary, it can lead to loss of data.
Recommendation: We recommend presenting participants with the pros and cons of switching to research devices for the duration of the study, e.g., additional applications on their own devices may cause unexpected battery drain; research devices were extensively tested before the study to avoid unexpected problems; the research and users' applications were not tested together, so they may not work properly side by side; and research devices may be more recent and advanced, making participants more familiar with recent technological developments. Informing participants about these facts may convince them to use only the research device for the duration of the study, providing more complete data and limiting the study participation burden. If the participant cannot use the provided device for any reason, you should consider the consequences of excluding such a person or losing some data.
Risk 33: Reusability of the developed technology
If the research team produces some technological advancement, they may want to restrict access to it. Then, external researchers cannot reproduce, exploit, or validate the developed solutions, which in some cases may lead to duplicated mistakes and wasted resources. This is especially crucial in new and fast-growing technological domains, including wearables.
Recommendation: We recommend you share the code in the spirit of open science practices. You should take care to improve the findability, accessibility, interoperability, and reuse of your digital assets. For instance, be transparent about what data was used in different stages of the system construction. Other researchers might use the publicly available code to develop new solutions or use it in their studies. We believe that only transparent and accessible knowledge will lead to scientific advancement.

General Risks
We also noted more general issues of concern in conducting research when identifying risks. Among them, we highlighted (1) provision of informed consent; (2) inability to withdraw from (but also to rejoin, if practicable) the research; (3) language and study instructions not appropriate to the intellectual and technological proficiency of the participants; (4) anticipating missing data; (5) overall data anonymization and security; (6) balancing the burden on study participants with the benefit to researchers, e.g., asking too many questions or asking too often; (7) technical limitations of devices, e.g., sampling rate or low battery; (8) choosing an inappropriate emotion model (e.g., outdated or not suitable for the later needs of creating machine learning models [12]); (9) inference model use; (10) amount and method of compensation; (11) data quality; and (12) overgeneralization of context while experiencing emotions. Although the general risks might be as important as those determined by us, we focused on examining ethical risks specific to affective studies using wearables.

CONSULTATIONS
To validate the identified risks and recommendations, we created a survey and distributed it among affective researchers and members of ethics committees.

Identifying Related Researchers
We created a list of ethics committees related to affective computing based on a Google search, the WHO List of National Ethics Committees, the European Network of Research Ethics Committees website, and articles about recognizing emotions using machine learning and physiological signals that provided ethics committees' details. Our list included 317 committees from 119 countries on six continents. Additionally, we identified 278 researchers studying affective computing by extracting contact emails from the ACII 2021 conference proceedings. We also contacted members of the Society for Affective Science, the International Society for Research on Emotion, and the Association for the Advancement of Affective Computing, as well as the authors of the ethical frameworks mentioned in the Introduction section.

Creating Survey
We created the survey in the Google Forms tool. We asked researchers to evaluate the extent to which they agree with the proposed risks and recommendations using a single-item scale ranging from 1 (strongly disagree) to 5 (strongly agree). If judges (respondents) were uncertain about a risk (or recommendation), they were asked to mark 3 (neither agree nor disagree). We also provided an open-ended question box for respondents to explain their risk ratings and propose updates to our recommendations in a brief comment. At the end of the survey, we provided an additional open-ended question box for proposing novel risks and strategies for minimizing them. Researchers were also asked to report their age, sex, the location of their scientific institution, dominant scientific field, academic position, experience in research ethics, years of experience in scientific research, and membership in an ethics committee.

Distributing Survey
We sent the invitation to evaluate the identified risks and recommendations to researchers from the lists in mid-May 2022. A follow-up reminder was sent two weeks after the initial email. The response rate was 4.38%.
Along with the invitation, we explained how the risks and recommendations were identified. We kept the survey brief to encourage participation, with only two questions for every risk. Furthermore, to encourage researchers to participate in the study, we provided an option to evaluate only some of the risks and recommendations: after each block of eight or nine items, participants could end the questionnaire.

Results
Participants. In total, 26 researchers from 13 countries answered our call. The researchers represented different scientific fields, including psychology, computer science, ethics, clinical medicine, clinical trials, public health, engineering, and robotics. Their level of experience in research ethics ranged from 1 (novice) to 5 (expert) (M = 3.60, SD = 1.13), and 10 of them (38%) were members of ethics committees. Among respondents, three classified themselves as students (graduate or undergraduate), six as post-docs, one as a researcher, nine as professors, one as a medical doctor, two as ethics managers, and three as ethics committee members without academic positions. Research experience ranged from 3.5 to 43 years (M = 17.22, SD = 11.49), and age ranged from 21 to 77 (M = 43.04, SD = 14.15). Most respondents were female (N = 14, 54%).
Agreement. Overall, the judges positively rated the proposed risks (M = 3.82, SD = 0.27) and recommendations (M = 4.14, SD = …). Table 1 contains the detailed results: the mean agreement score, standard deviation, and number of responses for each risk or recommendation. The judges disagreed with only a few risks. We considered a risk or recommendation questionable if it received at least two "strongly disagree" or "disagree" ratings. We discussed the 16 questionable risks and seven recommendations. Furthermore, we evaluated the comments provided by the judges and developed final versions of the risks and recommendations.

We clarified some of our risks and recommendations based on the judges' comments. We added to the recommendation for Risk 2 (study-related guilt) that researchers might consider examining whether the data is biased according to the stages of the study. In the recommendation for Risk 4 (study-related fear), we suggested that participants should be reassured that no retaliation will follow accidental damage. We also noted in the recommendation for Risk 5 (fatigue) that adding incentive mechanisms to study procedures can introduce some bias. We added to Risk 13 (individual-level access) that providing unsupervised access to a data subject may unintentionally result in psychological harm or discomfort; for instance, a person may become distressed by being confronted with such data, or it may lead them to unconsciously develop inaccurate interpretations. In the recommendation for Risk 16 (temporary break), we clarified that researchers should explain to the participant that it is fine to stop data collection when needed and that data quality matters more than data quantity. In the recommendation for Risk 18 (data insecurity), we suggested following local data protection guidelines and developing a procedure for handling sensitive data. In the recommendation for Risk 23 (unexpected contact loss), we noted that awareness of unexpected contact loss should lead to appropriate budget planning. We also changed Risk 25 (excluding participants with specific physical conditions): we initially presented it as "excluding unhealthy participants" and gave the example that researchers may exclude people with some cardiovascular dysfunctions (e.g., cardiac arrhythmia or the use of drugs or medications) when collecting cardiovascular data; we believe the current version fits better with the provided recommendation. In the recommendation for Risk 27 (digital illiteracy), we noted that sometimes researchers may need to educate the targeted population about the benefits of the technology while recruiting. We added the example of technology malfunctioning due to participants' health conditions to Risk 28 (biased inferencing). We also clarified Risk 33 (reusability of the developed technology), which states that external researchers may not be able to reproduce, exploit, or validate the developed solutions when the original researchers restrict access; we also added an example of open science practice, namely, presenting what data was used in different stages of the system construction.
We also added to the general risk category: (1) the inability to withdraw from the study (but also to re-enter, if feasible), (2) language and study instructions not appropriate to the intellectual and technological proficiency of the participants, and (3) overgeneralization of context while experiencing emotions.
One judge also identified an additional risk and recommendation, addressing sound and voice recording with wearables. The researcher noted that voice recordings of third parties not participating in the research might not be permitted under state law in the US if it is a two-party consent state. We incorporated this suggestion into Risk 9 (social stigma).
After thorough discussions, we did not include some of the judges' comments and suggestions. For instance, one judge did not agree that study-related technology might elicit frustration or anger and thus argued that it need not be classified as an ethical risk. We disagreed with this comment, as we have observed in our studies that malfunctioning technology causes frustration, anger, and some discomfort in participants [53]. We also disagreed with a comment concerning rewards. The judge suggested that participants should not be offered an incentive to participate in research. We believe that participants should be compensated for the time devoted to the study. Paying participants with specific compensation structures corresponding to the level of involvement in the study is a well-known strategy in research using Experience Sampling Methods [54], [55].

CHECKLIST
Based on our list of risks validated with external experts, we have developed a checklist to help researchers prepare and carry out their studies (Table 2). The checklist is divided into five sections corresponding to the research stages: (1) developing procedures before the study, e.g., testing or privacy-protection procedures; (2) participant recruitment; (3) informing participants about the study, the devices used, data processing, etc.; (4) actions to be undertaken during the study, e.g., monitoring the study and providing equipment and technical support; and (5) validating the research, e.g., with respect to AI model biases or overgeneralization of findings.

TABLE 2
A Checklist for Ethical Considerations on Using Wearables in Affective Research
It is divided into sections corresponding to the research stages.

DISCUSSION
The usage of wearable technologies in affective research is growing rapidly. Researchers use wearables to track participants' cardiovascular activity, physical activity, and sleep patterns. With wearables, researchers should be able to overcome the limitations of traditional psychophysiological laboratory studies, e.g., by accounting for the role of context when studying emotions. As wearable devices become more common, the risks of misuse and harm are growing as well. Therefore, our work reviews possible ethical risks associated with using wearables in affective research.
We developed a list of potential risks using a combination of approaches, e.g., a state-of-the-art literature review, our own experiences in using wearables in research, research participants' feedback, and suggestions from ethics committees and affective researchers. To systematize our proposals, we grouped the risks into four sections, (1) participation experience, (2) privacy, (3) data management, and (4) access and usability, mirroring the four domains of the Digital Health Checklist for Researchers [20]. Other researchers have positively rated our solutions.
Furthermore, to help address the risks, we recommended risk minimization strategies by proposing actions that can be performed at the planning or implementation stage of the study. Our recommendations have been positively rated by other researchers as well. In our survey, researchers stated several times that they did not consider some of our proposals actual ethical risks. Furthermore, some of our ideas may sound like methodological, rather than ethical, recommendations. We believe that this supports the validity of our work: it is worth pointing out possible risks even when a situation may be considered a risk by some people and a typical case by others. We also believe that wasting participants' time by doing bad science is unethical and may decrease public trust in science. Our work provides recommendations that can assist researchers in preparing and running affective research, as well as ethics committees in the effective evaluation of submissions.
Although we have done our best, our work has some limitations. For instance, our recommendations mainly focus on specific issues related to the use of wearables in affective research, and the list is not exhaustive: it does not carefully evaluate the specific applications of knowledge gained with wearables [29] or the more general ethical consequences of affectively-aware artificial intelligence [31]. Thus, we recommend using our list along with traditional ethics committee frameworks and/or other guidance to help comprehensively identify sources of vulnerability in specific research domains [28], [29], [30], [31], [32], [33], [34]. Moreover, not all recommendations may be applicable in every case, and it is crucial that researchers carefully consider the potential risk-benefit balance for end-users. Further, a 5% response rate to a survey could be considered low in some cases. However, in this study, the response rate resulted in 26 individual reviews, many of which suggested how to improve the proposed risks and recommendations. Lastly, we did not collect data on the judges' experience or expertise in using wearables, which may introduce some bias to the results. Nonetheless, we believe the judges' valuable feedback improved the accuracy and overall quality of the risks and recommendations.
We hope our work will contribute to reliable communication across all parties involved in scientific research and promote awareness about using new technologies in affective science. Given the incredible potential (current and future) of wearable technologies and artificial intelligence, we may open new possibilities by adding them to researchers' toolboxes.

CONCLUSION
Wearables have become a very attractive and popular tool in scientific research. This creates an unquestionable opportunity: people wearing their personal devices collect rich data that can be exploited in affective research. To foster future ethical innovations, we evaluated potential risks and provided recommendations, as well as a suitable checklist, to help researchers detect and minimize risks in planning and conducting their studies. We hope to offer simple yet effective dedicated guidance to prevent or mitigate possible harms in affective research using wearables.

TABLE 1
Agreement With the Initial Risks and Recommendations
Researchers were asked to evaluate the extent to which they agree with the proposed risks and recommendations using a single-item scale ranging from 1 (strongly disagree) to 5 (strongly agree).