Personality Perceptions of Conversational Agents: A Task-Based Analysis Using Thai as the Conversational Language

Artificial intelligence (AI) based conversational agents (CAs) have recently grown tremendously in popularity, in large part because of their anthropomorphic, human-like qualities. Being anthropomorphic, however, raises a question: do these agents have a personality? Moreover, what effect may personality have on the different tasks these agents perform? In this research, we aim to answer these two questions, focusing on Thai as the language of communication between users and CAs. We use a multi-model approach that draws on human, brand, and website personality frameworks to propose our CA personality model. A systematic, multi-step procedure takes us from an initial pool of personality traits to the final set. The proposed personality model has 7 dimensions, each a bipolar continuum (calm – neuroticism, maturity – juvenility, intelligence – ineptness, openness – reserved, sociability – seclusion, self-control – instability, and aesthetics – unaesthetics). To examine the effect of personality on the nature of tasks, we identify two primary task categories (social and functional) and use a multi-criteria decision-making approach to examine the corresponding impacts. Social tasks are impacted most by the maturity – juvenility dimension, whereas functional tasks are impacted most by the intelligence – ineptness dimension. Based on the results, we provide recommendations for future research.


I. INTRODUCTION
Conversational agents (CAs), or intelligent personal assistants, are software systems that interact with users in natural language, much like human-human communication [1]. This has been made possible by advances in artificial intelligence (AI), machine learning (ML), and natural language processing (NLP) techniques. Consequently, CAs are being used extensively in a variety of scenarios ranging from healthcare [2], education [3], [4], and e-commerce [5] to entertainment [6]. Since CAs are able to establish human-like communication with users, they are typically considered to have anthropomorphic capabilities that result in the creation of a lovable man-machine bond [7], [8]. In interpersonal relationships, researchers have established the importance of personality in forming perceptions of friendliness or affection [9]. A natural extension to the human-AI world is to ask whether people perceive the personalities of artificial agents in the same way. For example, if a CA interrupts the user while he/she is speaking, the CA might be considered "rude". Likewise, if the CA mispronounces a common word, it might be considered "dumb", and if it forgets something that the user has just said, it might be considered "forgetful". It is therefore natural to ask: do CAs exhibit a personality? Is it beneficial for the user experience? What personality traits may a CA have? Do these traits differ when CAs perform different activities?
The associate editor coordinating the review of this manuscript and approving it for publication was Liang-Bi Chen.
Previous research has outlined three types of CAs based on their embodiment and communication mode [10]: text-based CAs (e.g., chatbots), embodied CAs (e.g., avatars or 3D animated figures), and voice-based CAs (e.g., Alexa, Siri, etc.). In this work, we restrict our scope to the voice-based ones.
Personality reflects human character and is one of the fundamental aspects of social relationships. It can be defined as "a set of features that is stable across multiple situations and time and acts as a guiding influence on conversational agent behavior and interactions" [11]. To explore the personality aspect of CAs, we asked three popular commercially available CAs the following question: "Hey XXXX, what is your personality?", where XXXX represents the wake word of the CA. Siri responded with, "Hmm... I don't have an answer for that. Is there something else I can help with?". Google Assistant replied with two answers, "I'd say I am curious, helpful, and always looking for reasons to use an emoji" and "Helpful meets silly meets curious meets positivity - that's me in a nutshell!". Alexa responded with, "I sure do, I even completed the Myers-Briggs personality assessment, it turns out I am a ESFJ which means I enjoy being around and helping people". As these responses show, the designers of Siri do not appear to give it any specific personality, whereas Google stresses the helpful and curious nature of its agent, and Amazon positions its agent as a caregiver that is outgoing, loyal, and organized. Evidently, there is no consensus as to what an ideal personality should be, and different designers approach the question differently. Nevertheless, integrating personality into these social agents is motivated by the goal of creating more human-like interactions with users [12]. The challenge is that there is currently no known systematic way of developing such an agent personality, and CAs still fall short in generating sufficiently human-like gossip or humour, which requires the presence of specific personality types [13].
Previous research on CA personality is not only scattered across disciplines such as psychology, sociology, and computer science, but is also mostly conceptual in nature [10]. The majority of these works, such as [14], [15], [16], and [17], follow an information systems (IS) based approach in which the authors attempt to identify the intelligent and anthropomorphic features of CAs that lead to their adoption in society. While these studies are important in establishing the value of giving personalities to CAs, they give no direction on how to build the personality model itself. The few works that have attempted to fill this void with CA personality models are heavily based on the human personality framework of the Big Five model. For example, [18], [19], and [20] developed personality models based on the Big Five taxonomy. This may be a good starting point, since CAs typically act as human proxies; they are nevertheless machines, and it is doubtful how much of their "artificialness" can be captured by human personality frameworks. Additionally, the Big Five model was developed through a psycholexical approach, which typically proceeds by collecting adjectives that describe humans [21], so how well such adjectives map to the CA context needs to be researched.
Another issue with CA personality relates to the language and task capabilities of these devices. Language variety may affect the personality assessment of CAs. For example, while assessing the personality of humans, researchers in [22] identified some problematic areas where personality was found to vary with the language spoken. Likewise, in the CA context, [23] proposed a taxonomy of social cues from a multidisciplinary perspective and concluded that the verbal aspect of CAs is an important determinant of the social cues. Similarly, researchers in [24] provided clear evidence that synthesized language variety (German and Austrian) influences human perception of a conversational agent's extroversion. The problem is that the very few works that have attempted to develop CA personalities have done so with English as the medium of human-CA interaction. If language variety affects personality, however, their results cannot be generalized across languages, which sits uneasily with the efforts of CA manufacturers to support multiple languages on their devices. Alexa, for example, allows custom skills to be created in 9 different languages as of 2023. Siri supports more than 20 languages globally, whereas Google aims to support more than 30 languages. Developing personality traits in non-English languages therefore becomes important if CAs are to deliver a decent user experience.
Motivated by the above discussion, we aim to answer the following research questions:
RQ1: How can we systematically develop a personality model for voice-based CAs in a non-English (Thai language) conversational scenario?
RQ2: What is the relative importance of the developed personality dimensions for the different tasks performed by voice-based agents?
VOLUME 11, 2023. Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
To answer these research questions, we adopt the psycholexical approach from the psychology domain and modify it to suit our needs. We focus on Thai as the conversational language between users and CAs. To reduce the reliance on current human personality frameworks such as the Big Five model (FFM) or the Myers-Briggs type indicator (MBTI) while developing our CA personality model, we employ a multi-model approach for building the initial set of adjectives that describe CAs. In addition to the human personality framework, we derive adjective traits for CAs from Aaker's Brand Personality Model [25] and the Website Personality Framework [26]. After consolidating the personality traits from all three frameworks, professional translators translate the entire initial set of personality adjectives from English to Thai. This initial personality bank then passes through several rounds of screening (expert review, a pilot study, and a main survey) to obtain the final personality model. In parallel, based on the current literature on CA usage, we create a taxonomy of the scenarios in which these agents are used. The final CA personality model is then distributed among real users of these devices, who prioritize the relative impacts of the identified personality dimensions across the multiple scenarios/activities. This work makes several contributions towards developing a personality model for voice-based CAs. First, based on our multi-model approach, we identify 29 unique personality traits for CAs.
These 29 personality traits are grouped into 7 dimensions that signify the differences in personality between humans and artificial conversational agents, because the dimensions do not correspond to any human personality framework in either number or content. This provides evidence that, although the objective behind creating CAs is to provide a suitable human proxy, people perceive CAs as different from humans [27]. Consequently, human personality frameworks such as the FFM fall short in explaining the personality of these AI agents. Second, our personality model is based on Thai as the conversational language between users and CAs. Thai is a tonal language, whereas English is not: English uses pitch mainly for intonation, whereas in Thai the tone of a syllable also distinguishes word meaning [28]. This may affect the user experience, as the competency of CAs in Thai and English may differ because the respective NLP models are at different levels of maturity. To the best of our knowledge, this is one of the first works to develop a CA personality model in a non-English context, i.e., Thai, which makes the model unique. Finally, we provide suggestions to the designers of these agents on which personality aspects should be prioritized in which scenario. CAs can perform various types of tasks, but not all tasks have the same needs in terms of competency or intelligence level. By prioritizing the personality requirements and expectations for CAs, we are therefore able to provide future research directions.

A. THEORETICAL BACKGROUND: THE NATURE OF PERSONALITY
Personality is a collection of individual differences, dispositions, and temperaments that exhibits some form of consistency across different situations and time [11]. It can be seen as an interaction between the expectations of an observer and the behavior of the observed [29]. However, where personalities are created intentionally, e.g., in AI-based agents, the behavior of the observed (the AI agent) has been designed with the expectations of an observer in mind [29]. Human personality can vary in infinitely many ways, which makes mapping the space of all possible personality types almost impossible. Therefore, for simplicity and feasibility, human personality has become synonymous with the Big Five model (FFM) or the MBTI [30]. With regard to social interactions specifically, previous research in [10], [31], and [32] has shown that agreeableness, friendliness, extroversion, and dominance are important factors, each of which can vary in two directions. Fig. 1 shows the two-dimensional interpersonal space for these factors, along with their possible variations.
For human personality, the most widely used approach is to classify personality types according to the Big Five model. It has five dimensions (openness, conscientiousness, extraversion, agreeableness, and neuroticism) and is also commonly referred to as the OCEAN model [33]. In the majority of HCI studies, extraversion is the most popular dimension because it has high informative value and is easy to observe [34]. However, the Big Five model has its own disadvantages. It has often been criticized for response bias arising from social desirability [35], and its generalizability in cross-cultural studies is questionable [36]. The other popular model, the MBTI, builds combinations of four dichotomies, resulting in 16 personality types. It too has been criticized by researchers for its poor viability and inconsistent results over time [37].
Apart from the human personality frameworks discussed above, the concepts of brand and website personality are also highly relevant to the present scenario. Brand personality was proposed by Aaker in [25] and is defined as the set of human characteristics associated with a brand. Brand personality tends to serve a symbolic or self-expressive function. Although human and brand personality traits may seem to share a similar conceptualization, they differ in how they are formed. Human personality traits are inferred from an individual's psychological perceptions and attitudes, while brand personality traits are formed and influenced by any direct or indirect contact the customer has with the brand. This allows the personality traits of the people associated with the brand to transfer directly to the brand itself [25]. The framework has 42 personality traits grouped into 5 dimensions (sincerity, excitement, competence, sophistication, and ruggedness). Likewise, websites have the potential to develop relationships with users that are characterized by dialogue and customized content [26]. A website is also a brand carrier, so it is important that the channel exhibits the personality characteristics of the brand. However, website personality differs from both the human and brand personality models in how it is formed: it is based not only on the direct or indirect contact that the user has with the website, but also on the interface and system design of the site.
As the above discussion shows, all three frameworks (human, brand, and website) are highly relevant for generating the CA personality. The human personality framework provides the ground upon which the other two have been developed. CAs engage with humans in natural language, much as one person would communicate with another. Moreover, CAs have a certain degree of anthropomorphism, which justifies using the human personality framework while building the CA personality model. CAs also belong to multiple brands, each of which uses its own proprietary algorithms to interact with users, resulting in different user perceptions. Additionally, each brand positions its agent in a unique manner, so brand image may play a vital role in shaping the personality model of these agents. Lastly, CAs have several things in common with a website. A CA has a well-defined interface (voice only) and system design through which users interact, which may in turn give rise to personality traits that are typically associated with websites. Hence, in this work we use a multi-model approach that combines all three personality frameworks.
Since the taxonomy of traits in current human personality models is based on natural language (lexical resources [12]), these traits correlate highly with a wide range of linguistic variables. This has been established from the frequency with which certain words are used, together with variations in word usage [38], [39]. Language therefore both constitutes and predicts personality [10]. For determining personality, psychologists have stressed the importance of considering both verbal and non-verbal cues, depending on the context [40]. For example, [41] and [42] stated that personality can be expressed via verbal language (words), type of language, paraverbal language (pitch, tone, volume, etc.), and body gestures. Language thus has a significant role in shaping personality.

B. PERSONALITY OF AI-BASED AGENTS
Humans tend to assign personality to machines that exhibit anthropomorphic capabilities. How users perceive the machine's personality shapes their user experience [43], [44], [45]. For example, [43] and [45] found that humans prefer a computer with a strong and consistent personality. Moreover, in many scenarios users prefer machines whose personality matches their own [29]. Contrasting situations also exist, however: in human-robot interaction, for instance, users may enjoy interacting with robots that have an opposite personality [46]. A favourable personality helps to create deeper man-machine bonds that result in a trusted relationship between the two [27], [47]. Based on the personality theories discussed above, researchers have used different gesture generation mechanisms and verbal communication techniques to give personality to AI-based agents [48], [49].
Although the benefits of assigning personality to conversational agents are known, and there has been some research progress using the verbal and gesture generation mechanisms outlined above, the open question is how users perceive such CA personalities. These user perceptions are extremely important, as they shed light on the current state of the art of CAs with respect to personality. There are a few relevant works in this area, but they have their own drawbacks. Table 1 summarizes the current works related to the development of agent personality. Many of the works are conceptual in nature ([1], [10], [11], [50]): the authors propose frameworks for AI-based agents but do not validate them. The majority of the developed personality models borrow concepts from human personality frameworks such as the Big Five (OCEAN) model. English has also been the predominant language of human-CA interaction (Table 1). Moreover, none of these works consider task-specific variations in personality, even though CAs, as intelligent devices, are capable of performing a variety of activities.

C. RESEARCH GAPS
From the literature review above, the current research gaps are clear. First, the importance of giving personality to AI-based agents is well recognised in the research community, since it promotes anthropomorphism. As Table 1 shows, there have therefore been user studies that investigate agent personality, but with a strong reliance on human personality frameworks such as the Big Five (OCEAN) model or the MBTI. It is natural to start with human personality models when exploring the personality of AI agents, all the more so because these agents are perceived to have anthropomorphic features. The problem is that no studies give conceptual or empirical evidence of how well these human personality models represent CAs. Moreover, researchers have called for widening the scope of personality studies by defining a new range of "emic" dimensions for artefacts, such as trust in autonomy, mental models of robots, or anthropomorphism of technology [30]. The current CA personality models lack this aspect because they focus only on human-centric traits.
Second, very little attention has been paid to the language used during human-CA interaction and how it may affect the personality of CAs, with current studies focusing heavily on English as the communication medium. There are several challenges with the use of English itself, e.g., the speaker's accent and the use of filler words [51], but they are outside the scope of the present work. Language has a significant impact on personality, so neglecting it points to the immaturity of the current CA personality models. This becomes an even greater issue as CA manufacturers push for more language support in their products. Hence, in this work we focus on Thai rather than English as the medium of human-CA interaction. To the best of our knowledge, this is one of the first works to develop a CA personality model in Thai, although current research has shown that the usability and acceptance of these devices among the Thai population is determined by a number of factors such as privacy issues, the age of the users, and the personality of the CAs [52]. We therefore anticipate that a focus on Thai will broaden our understanding of CA personality in general.
Third, CAs perform various types of activities in daily life, ranging from searching for information on the internet to playing music and even providing companionship through engaging conversations. These activities primarily represent two scenarios: goal-oriented (functional) tasks and social tasks. For goal-oriented tasks such as information enquiry or control of multimedia/IoT devices, accuracy and efficiency are important, whereas for social tasks such as general conversation or recreation, emotional goals become prominent [53], [54]. In fact, researchers in [55] established that the voice personalities of social robots not only vary with the usage context but also affect user satisfaction. Nevertheless, current works on CA personality have ignored task type entirely, including how personality may vary across, or impact, the functional and social tasks performed by CAs. Therefore, after systematically developing our CA personality model, we use a multi-criteria decision-making approach to evaluate the impact of each personality dimension on the different task categories.
Since we wanted to reduce the current dominance of human personality models while building the CA personality model, in order to account for the uniqueness of CAs, we take a multi-model approach that combines personality traits from the NEO personality inventory (human personality), the brand personality framework, and the website personality model. The relationship and attachment that humans develop with animate beings (other humans) differs from the one they develop with inanimate objects (AI-based technologies). For example, researchers in [8] and [58] showed that "human love" and "love for AI" differ in the presence of the love components of passion, intimacy, and commitment. Although CAs are intelligent devices and possess human-like features, they are fundamentally inanimate objects. We therefore decided to include the two additional personality perspectives of brand and website, since they align well with the present research context.

III. METHODOLOGY
Keeping in mind the research objectives, we use a two-stage approach in this work. In the first stage we build the CA personality model, whereas in the second stage we explore the impact of the different identified personality dimensions on the multiple types of tasks performed by the CAs. The overall methodology of this work is outlined in Fig. 2.
To generate the initial pool of personality traits, we start with three models: the NEO personality inventory (specifically the latest version, NEO-PI-3), Aaker's brand personality framework, and the website personality model. Although NEO-PI-3 evolved from the Big Five model, the latter has certain drawbacks and criticisms, because of which we decided not to use it directly. First, the Big Five model was developed from data gathered from young college students in WEIRD (Western, educated, industrialized, rich, and democratic) countries, which makes the model extremely sensitive to cultural variations [59]. Experiments have shown that individuals from non-WEIRD countries are less interpretable using this model, indicating the presence of acquiescence bias. Moreover, reduced and more recent improved versions of the Big Five model with better psychometric properties exist [59]. The NEO series of personality models is an improvement over the Big Five, and its latest revision is NEO-PI-3. An advantage of NEO-PI-3 is that it gives insight into the six facets that define each domain. Moreover, this version has better reliability and validity than the NEO-PI-R version and features new normative data, which makes it appropriate for a wider demographic [60]. Hence, we chose NEO-PI-3 as the representative human personality model.
In the initial stage, 30 personality descriptors are gathered from the NEO-PI-3 inventory, 42 from the brand personality framework, and 38 from the website personality model, yielding 110 personality descriptors in the initial pass. In the next round of refinement, we eliminate all traits that are redundant or synonymous. Before any removal, the exact meaning of each trait is checked either against the original authors' description or in the Merriam-Webster online dictionary (https://www.merriam-webster.com/). This process eliminates 8 synonymous traits, e.g., competence (NEO inventory)/competent (website personality model) and orderliness (NEO inventory)/orderly (website personality model). A total of 102 traits are therefore passed on to the experts for their opinion and evaluation. The initial pool of these 102 items, together with their Thai translations, is presented in the Appendix.
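The consolidation step can be sketched in a few lines of Python. This is only an illustration: the pool sizes come from the text, but the trait samples are small placeholders rather than the real inventories, and the `canonical` synonym map is a hypothetical stand-in for the manual, dictionary-backed check described above.

```python
# Pool sizes reported in the text.
pool_sizes = {"NEO-PI-3": 30, "brand": 42, "website": 38}
initial_total = sum(pool_sizes.values())  # descriptors in the initial pass
after_dedup = initial_total - 8           # 8 synonymous traits removed

# Tiny placeholder samples (NOT the real inventories).
pools = {
    "NEO-PI-3": ["competence", "orderliness", "warmth"],
    "brand": ["sincere", "rugged", "cheerful"],
    "website": ["competent", "orderly", "engaging"],
}

# Hypothetical hand-curated synonym map: variant -> canonical trait.
canonical = {"competent": "competence", "orderly": "orderliness"}


def consolidate(pools, canonical):
    """Merge the pools in order, keeping one entry per canonical trait."""
    seen, merged = set(), []
    for traits in pools.values():
        for trait in traits:
            key = canonical.get(trait, trait)
            if key not in seen:
                seen.add(key)
                merged.append(key)
    return merged


merged = consolidate(pools, canonical)
print(initial_total, after_dedup, len(merged))
```

In the placeholder data, the two synonym pairs collapse nine sample traits into seven, mirroring how the full pool shrinks from 110 to 102.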

2) EXPERT INTERVIEWS (PHASE 1)
In the next step, interviews are conducted with experts. This research was approved by the Institutional Review Board (IRB) of the authors' university, and informed consent is obtained before any data collection. Five experts are contacted (two males and three females): three have more than 10 years of experience in the psychology domain, and two are AI experts with more than 6 years of experience developing different aspects of conversational agents. The experts are selected through convenience sampling from the authors' contacts. The purpose of the expert interviews is two-fold: first, to have an open discussion about the suitability and relevance of the selected 102 personality traits together with their Thai translations, and second, to assess the content validity of each personality trait before the pilot testing phase. The objective of the interview is explained to the experts, and an Excel sheet containing all 102 personality traits (in both English and Thai) together with their descriptions is shared at least 1 week before the interview. The interview sessions are conducted online in a semi-structured fashion. At the end of the interview, the experts score every personality trait as 1 (the expert is sure that the trait is relevant and represents the CA well), 0 (the expert is unsure whether the trait is relevant and represents the CA well), or -1 (the expert is sure that the trait is not relevant and does not represent the CA well).
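A minimal sketch of how the three-level expert scores could be tallied is given below. The text does not state the rule used to retain or drop a trait, so the `min_net` threshold, the example traits, and the ratings are all assumptions made purely for illustration.

```python
# Hypothetical ratings: trait -> scores from the five experts (-1, 0, or 1).
ratings = {
    "friendly": [1, 1, 1, 0, 1],
    "rugged": [-1, -1, 0, -1, -1],
    "orderly": [1, 0, 0, 1, -1],
}


def summarize(scores):
    """Return the net score and the share of experts who rated the trait 1."""
    net = sum(scores)
    agreement = sum(1 for s in scores if s == 1) / len(scores)
    return net, agreement


def retain(scores, min_net=1):
    """Assumed rule (not from the paper): keep a trait if net score >= min_net."""
    return sum(scores) >= min_net


kept = [trait for trait, scores in ratings.items() if retain(scores)]
for trait, scores in ratings.items():
    net, agreement = summarize(scores)
    print(f"{trait}: net={net}, agreement={agreement:.0%}")
print("kept:", kept)
```

A stricter or looser threshold changes which borderline traits (such as the hypothetical "orderly" above) move on to pilot testing, which is why the actual rule would need to be taken from the authors.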

3) PILOT TESTING (PHASE 1)
After the expert interviews, we carry out pilot testing on the personality traits that passed the expert interview phase to ensure good psychometric quality of the items used in the main survey. The main objective of this phase is to check the internal consistency, i.e., the reliability, of the survey items. A total of 10 pilot users are selected according to the following criteria: at least 6 months of experience using a CA, and use of the CA at least once every month. Participants are recruited through announcements on the university's publicity bulletin and other appropriate social media channels. Responses are obtained from 7 males and 3 females aged 24-39 years, all of whom hold a bachelor's degree. Only the items that pass reliability testing are included in the next phase, the main survey.
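The text does not name the reliability statistic used in this phase. A common choice for internal consistency is Cronbach's alpha, computed for k items as k/(k-1) times (1 minus the sum of the item variances divided by the variance of the total score). The sketch below assumes that statistic; the ratings are hypothetical, and the frequently quoted 0.7 cut-off is a rule of thumb rather than a value taken from the paper.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of respondents and columns of items."""
    k = len(responses[0])                 # number of items
    item_columns = list(zip(*responses))  # transpose to per-item scores
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical 5-point Likert ratings: 5 pilot users x 3 trait items.
pilot = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 4],
    [2, 2, 3],
    [4, 4, 4],
]

alpha = cronbach_alpha(pilot)
print(f"alpha = {alpha:.2f}")
```

With only 10 pilot users, any such estimate is coarse, so it is best read as a screening signal rather than a precise measurement.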

4) MAIN SURVEY (PHASE 1)
The main survey is carried out online using Google Forms, based on the refined set of personality traits obtained from the pilot phase. The data were collected in January 2023. The eligibility criteria mirror those of the pilot testing: participants need at least 6 months of experience using any CA in the Thai language and must use the CA at least once every month. We did not focus on a specific CA or brand, since we wanted to capture the general state of the art of commercially available CAs. However, by incorporating proper screening questions, we ensured that all participants had sufficient experience with CAs and that their communication medium was Thai. Additionally, we created a short video depicting the concept of CA personality, which participants had to watch before filling in the survey. Participants were recruited through a combination of convenience and snowball sampling. The final set of personality descriptors is presented to the participants in random order, and they rate their agreement with each on a 5-point Likert scale ranging from "strongly disagree" to "strongly agree". A total of 86 responses are obtained; 5 of them are incomplete because the participants had less than 6 months of experience using a CA and so could not complete the survey. The final sample used to build the personality model therefore has 81 participants.

B. EXPLORING THE IMPACT OF PERSONALITY DIMENSIONS ON TASK TYPES
1) IDENTIFYING THE DIFFERENT TASK TYPES (PHASE 2)
After the completion of phase 1, the first objective of phase 2 was to identify the different types of tasks performed by the CAs, since this helps to analyse the task-wise variation of CA personality. The current literature largely points to two task types, functional and social [54], which we elaborated on in the literature review section. Based on the current literature and a group discussion among all the authors of this article, the possible scenarios of CA usage were finalized. Table 2 lists the scenarios together with example activities for each. Note that Table 2 is not comprehensive, i.e., it presents only representative sample scenarios.

2) MAIN SURVEY (PHASE 2)
For the phase 2 main survey, we contacted the same participants as in the first round of the main survey. This was done for the following reasons. First, these participants were already well aware of our overall research objectives. Second, since the CA personality model was built from their inputs, we believe it is justified to capture their perceptions of the relative importance of the different personality dimensions for each usage scenario. This may also benefit the reliability of the developed personality model: although this is not a strict example of test-retest reliability, capturing the perceptions of the same participants over a period of time may improve the model consistency. Likewise, as mentioned previously, personality is subject to cultural variations [59]. Therefore, we decided to have a homogeneous sample for both main surveys.
Before beginning the survey, the participants were shown an infographic that we created based on the developed CA personality model (shown later in the results section). We also created sample video clips for each of the 4 scenarios outlined in Table 2. The participants had to watch each video clip before rating the pairwise comparisons of the identified personality dimensions. The comparisons followed the standard AHP protocol outlined in [63]. The comparison rating scale ranges from 1 to 9, with each score carrying a particular meaning. Table 3 presents the semantics of the AHP scale.
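To make the pairwise comparison step concrete, the sketch below shows how one participant's judgments for one scenario could be arranged into a reciprocal AHP comparison matrix. It is an illustration under stated assumptions, not the authors' survey instrument: the helper `build_reciprocal_matrix` and the sample judgment are hypothetical, and the seven names are the poles selected for phase 2. With seven dimensions, each scenario requires 7 × 6 / 2 = 21 pairwise judgments.

```python
from itertools import combinations

DIMENSIONS = [
    "neuroticism", "juvenility", "intelligence", "sociability",
    "openness", "self-control", "aesthetics",
]


def build_reciprocal_matrix(names, judgments):
    """Build an AHP matrix from {(a, b): strength of a over b} judgments.

    Values lie on the 1-9 scale (or its reciprocals); missing pairs stay at 1.
    """
    index = {name: i for i, name in enumerate(names)}
    n = len(names)
    matrix = [[1.0] * n for _ in range(n)]
    for (a, b), value in judgments.items():
        if not (1 / 9 <= value <= 9):
            raise ValueError(f"judgment for {a} vs {b} is outside the AHP scale")
        i, j = index[a], index[b]
        matrix[i][j] = float(value)
        matrix[j][i] = 1.0 / value  # reciprocal entry
    return matrix


pairs = list(combinations(DIMENSIONS, 2))
print(len(pairs), "pairwise judgments per scenario")

# Hypothetical judgment: intelligence rated 5 over juvenility.
example = build_reciprocal_matrix(DIMENSIONS, {("intelligence", "juvenility"): 5})
```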

A. THE FLOW OF THE PERSONALITY TRAITS
To begin with, we present the overall flow of the personality traits in Fig. 3. Initially, 30 personality descriptors are gathered from the NEO-PI-3 inventory, 42 descriptors from the brand personality framework, and 38 descriptors from the website personality model, yielding a total of 110 descriptors. Of these, 8 personality traits are found to be either synonymous or redundant, thereby reducing the initial primary pool to 102 items that are passed on to the expert interview stage. During the expert interview, 23 personality traits are deleted based on the experts' suggestions, either because they overlap heavily with other traits (e.g., down-to-earth, sincere, leader, corporate, etc.) or because, in the experts' opinion, they are not relevant to the CA context (e.g., original, secure, young, etc.). Additionally, 19 traits (e.g., action-packed, flashy, western, upper-class, etc.) did not pass the content validity test. For these 19 traits we obtained an IOC (Item-Objective Congruence) score of less than 0.5, which disqualifies them from further processing [65]. Therefore, in the expert interview stage a total of 42 personality descriptors are eliminated, leaving 60 traits for the phase one main survey. Running an Exploratory Factor Analysis (EFA) on the remaining 60 traits results in the omission of 31 additional traits (discussed in detail in the next section), thereby resulting in 29 personality descriptors for the final model.
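The trait counts reported above can be checked with simple arithmetic. The sketch below reproduces the flow and adds a small IOC helper. The helper assumes the usual IOC convention in which each expert rates an item as -1, 0, or +1 and the score is the mean rating; the example ratings are hypothetical, since the individual expert ratings are not given here.

```python
# Trait flow as reported in the text.
initial_sources = {"NEO-PI-3": 30, "brand personality": 42, "website personality": 38}
initial_pool = sum(initial_sources.values())   # 110 descriptors
after_dedup = initial_pool - 8                 # synonymous/redundant traits removed
expert_removed = 23 + 19                       # expert suggestions + IOC failures
after_expert = after_dedup - expert_removed    # traits entering the main survey
final_traits = after_expert - 31               # traits left after the EFA


def ioc(ratings):
    """Item-Objective Congruence: mean of expert ratings in {-1, 0, +1}."""
    if any(r not in (-1, 0, 1) for r in ratings):
        raise ValueError("IOC ratings must be -1, 0, or +1")
    return sum(ratings) / len(ratings)


def passes_content_validity(ratings, threshold=0.5):
    """An item is retained when its IOC score is at least the threshold."""
    return ioc(ratings) >= threshold


print(initial_pool, after_dedup, after_expert, final_traits)
```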

B. EXPLORATORY FACTOR ANALYSIS
The EFA is conducted using SPSS version 13.0. Before conducting the EFA, certain prerequisites are checked in terms of data suitability and sampling adequacy. First, the correlation matrix is examined to check whether the conditions for factor analysis are satisfied. There are several correlations of 0.30 or higher, indicating that the data is suitable for factor analysis. Second, the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy is checked, which indicates the proportion of variance in the items that might be caused by underlying factors. Third, Bartlett's test of sphericity is carried out, which tests the hypothesis that the correlation matrix is an identity matrix; an identity matrix would indicate that the items are unrelated and hence unsuitable for structure detection, so a significant result supports factor analysis. The relevant statistics are reported in Table 4. The KMO value is greater than the threshold of 0.5 and Bartlett's test is significant, indicating the suitability of carrying out factor analysis [66].
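For readers who want to reproduce these suitability checks outside SPSS, the following is a minimal NumPy sketch of the overall KMO measure and the Bartlett chi-square statistic, using their textbook formulas. It is not the authors' analysis: the data are synthetic (81 simulated respondents and 6 items driven by one latent factor), and the function names are hypothetical. The p-value for Bartlett's test would be read from a chi-square distribution with the returned degrees of freedom.

```python
import numpy as np


def bartlett_sphericity(data):
    """Return (chi_square, degrees_of_freedom) for Bartlett's test of sphericity."""
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    _, logdet = np.linalg.slogdet(corr)  # log|R|, never positive for a correlation matrix
    chi_square = -(n - 1 - (2 * p + 5) / 6) * logdet
    return chi_square, p * (p - 1) // 2


def kmo(data):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    # Partial correlations controlling for all remaining items.
    partial = -inv / np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diagonal] ** 2)
    p2 = np.sum(partial[off_diagonal] ** 2)
    return r2 / (r2 + p2)


# Synthetic illustration: one latent factor plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(81, 1))
data = latent + 0.5 * rng.normal(size=(81, 6))

chi_square, dof = bartlett_sphericity(data)
print(f"KMO = {kmo(data):.3f}, Bartlett chi-square = {chi_square:.1f}, df = {dof}")
```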
For conducting the EFA, we use the Principal Component Analysis (PCA) methodology, which has been widely reported in the current literature. PCA is an unsupervised technique in which the inter-related items are transformed into a new set of variables (called principal components) that are uncorrelated, with the first few components explaining most of the variance of the entire dataset. To better align the factor structure, we used the Equamax rotation, which is a type of orthogonal rotation technique. An initial visual walkthrough of the factor loadings revealed several anomalies. First, some of the items did not load onto any factor, i.e., they had very low loadings. As per the standard procedure, we suppressed all items having loadings less than 0.4 in further iterations of the factor analysis. Second, we observed some cross-loading issues, i.e., items loading onto multiple factors with values greater than 0.4, or items whose loadings on multiple factors differ in absolute magnitude by less than 0.2. All such items are deleted. Following this procedure, 31 items are deleted in total in this round. The final factor structure indicated the presence of 7 components with an eigenvalue greater than 1, with the seven factors together explaining 73.40% of the total variance. Fig. 4 presents the scree plot, which confirms the seven-factor solution for the current dataset. The detailed factor loadings are presented in Table 5, together with the relevant descriptive statistics of the personality traits.
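The item-retention rules described above can be expressed as a small filter over each item's rotated loadings. The sketch below is one possible reading of those rules rather than the authors' exact procedure; the function name `keep_item` and the example loadings are hypothetical.

```python
def keep_item(loadings, min_loading=0.4, min_gap=0.2):
    """Decide whether an item survives, given its loadings on all factors."""
    magnitudes = sorted((abs(value) for value in loadings), reverse=True)
    primary = magnitudes[0]
    secondary = magnitudes[1] if len(magnitudes) > 1 else 0.0
    if primary < min_loading:
        return False  # the item does not load onto any factor
    if secondary > min_loading:
        return False  # cross-loading: more than one factor above the cut-off
    if primary - secondary < min_gap:
        return False  # the two strongest loadings are too close together
    return True


# Hypothetical loadings for four items on three factors (illustration only).
examples = {
    "clean item": [0.72, 0.10, 0.05],
    "weak item": [0.35, 0.20, 0.10],
    "cross-loading item": [0.60, 0.45, 0.05],
    "close loadings": [0.50, 0.35, 0.02],
}
for name, loadings in examples.items():
    print(name, "->", "keep" if keep_item(loadings) else "drop")
```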
In terms of the reliability analysis, we first checked the internal consistency of all the items by inspecting the Cronbach's alpha (α) values. For all the traits, the alpha values are greater than 0.90 (well above the threshold value of 0.70), which indicates sufficient reliability of the items [67]. Next, we checked for convergent validity, i.e., whether all the items within a single factor are highly correlated. Researchers have proposed different cut-off values for convergent validity, with 0.4 [68] and 0.5 [69] being the most popular ones. Keeping in mind the exploratory nature of this research, we decided to take 0.4 as the cut-off value, particularly since only 1 personality trait (organized) had a factor loading as low as 0.438. Therefore, convergent validity is also satisfied. Lastly, we checked for discriminant validity by verifying the uni-dimensionality of the items (i.e., the absence of any cross-loadings; see Table 5) [70] and by examining the factor correlation matrix. None of the correlations between the factors exceeds the threshold of 0.70 [70], which indicates sufficient discriminant validity of our proposed model. After ensuring sufficient model reliability and validity, we carried out the factor (dimension) naming based on the item loadings in Table 5. While proposing the dimensions, we have attempted to classify each into a two-dimensional personality space in order to account for both the positive and negative personality aspects. The dimension conceptualization and description are provided in Table 6. All the traits corresponding to component 1 (except excitement-seeking) originally belong to the ''neuroticism'' dimension of the NEO-PI-3 inventory. Therefore, we name component 1 ''neuroticism – calm''. In terms of human personality, people who experience mood swings, anxiety, or irritability are classified as neurotic.
Mostly, it refers to the negative traits of the CAs. For the other extreme, we propose the dimension ''calm'', which is characterized by emotional stability, matureness, and outgoingness. We name component 2 ''juvenility – maturity''. The traits corresponding to component 2 reflect the unsystematic, incoherent, and confusing manner of communication by the CAs, due to which we name this dimension ''juvenility'', reflecting their immature status. For the opposite pole we propose the dimension ''maturity'', wherein the CAs are able to establish formal, relevant, and systematic conversation with the users. The third component is named ''ineptness – intelligence''. The personality traits loading onto this dimension reflect either the incompetency of the CAs in understanding and communicating with the users, which might make the user experience unsatisfying (''ineptness''), or the knowledgeable and smart aspects of striking conversations with the users (''intelligence''). We name the fourth component ''seclusion – sociability''. This dimension reflects the scenario wherein the CA is either very inactive, dormant, and less interactive with the users (''seclusion'') or very energetic and interactive with the users (''sociability''). The fifth component is named ''reserved – openness''. Traits belonging to this dimension reflect either the reluctance of the CAs to learn new things and their indifference to the user during conversation (''reserved'') or their eagerness to learn new things and strike a proper conversation with the user depending on the user's emotional state and mood (''openness''). The sixth dimension is named ''instability – self-control''. This dimension reflects either the inability of the CAs to exercise self-control and their occasional careless and uncompromising nature (''instability'') or their capability to control emotions and to be prudent and humble (''self-control'').
The last component is named ''unaesthetics – aesthetics''. It reflects CAs having an unattractive or monotonic voice (''unaesthetics'') vs. those having a pleasant, attractive, and natural-sounding voice (''aesthetics'').

C. TASK-WISE VARIATION OF PERSONALITY
As a second objective, we next examined the relative impact of the identified personality types of the CAs on the different types of tasks that they perform. Since the CA personality can vary across the two-dimensional plane presented above, for the second research objective we consider either the positive or the negative pole of each dimension, depending on the nature of the traits (positive or negative) that load onto it. For example, in the case of the ''neuroticism – calm'' continuum, the majority of the traits, excluding ''excitement seeking'', are negative, so for this phase we take the dimension name to be neuroticism. Likewise, for the ''juvenility – maturity'' continuum, all the traits are negative, and hence this dimension is considered to be juvenility. Similarly, intelligence, sociability, openness, self-control, and aesthetics are the other selected dimensions. The scenarios for which the video clips are created and shown to the users are outlined in Table 2. Since both surveys (phase 1 and 2) had the same participants, we showed a summary of the developed CA personality model as an infographic before the start of the phase 2 survey. Fig. 5 presents the visualization.
First, we checked the consistency of the pairwise comparison matrix by computing the consistency ratio (CR). We obtained a CR value of less than 0.1, which is within the recommended limit [71]. The priority weights for each of the scenarios are presented in Fig. 6 in the form of radar charts. The results show that for casual activities, the juvenility dimension has the greatest impact, followed by sociability, openness, aesthetics, self-control, intelligence, and neuroticism, respectively. However, for all the remaining scenarios, intelligence has the highest impact. Surprisingly, juvenility, which scored the highest for casual activities, is found to have the least effect on the multimedia control and follow-the-instruction activities. Its impact on the information inquiry scenario is also only marginally higher than the lowest one. The corresponding implications are discussed in the next section.
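As an illustration of how priority weights and the consistency ratio are typically derived in AHP, the sketch below estimates the principal eigenvector by power iteration and computes CR = CI / RI with CI = (λmax − n) / (n − 1). It is a sketch under stated assumptions, not the authors' computation: the random index values are the commonly cited Saaty values and may differ from those used in [71], the function names are hypothetical, and the 3 × 3 matrix is a toy example.

```python
# Commonly cited Saaty random index values (assumption; see the lead-in).
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix, iterations=100):
    """Return (priority_weights, lambda_max) via power iteration."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(product[i] / weights[i] for i in range(n)) / n
    return weights, lambda_max


def consistency_ratio(lambda_max, n):
    """CR = CI / RI; matrices with fewer than 3 rows are always consistent."""
    if n < 3:
        return 0.0
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Toy example: a perfectly consistent matrix built from weights 0.6, 0.3, 0.1.
toy = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]
weights, lambda_max = ahp_weights(toy)
cr = consistency_ratio(lambda_max, len(toy))
print([round(w, 3) for w in weights], "acceptable" if cr < 0.1 else "revise judgments")
```

For the seven dimensions used in this study, the same functions would apply to a 7 × 7 matrix, with the random index for n = 7 taken from the table above.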

V. DISCUSSION
With the growing popularity of different AI-based applications and services, the notion of human-machine communication is gradually evolving. Traditional graphical user interface-based systems are being replaced with voice-based systems that have come to the forefront of research. These voice-based systems are special and different from legacy IT systems due to their humanness and ability to establish relationships with their users [8], [58]. This anthropomorphic and humanness aspect of the CAs further motivated us to explore the personality types of these devices and examine their impact on the day-to-day activities they perform. Below, we provide a discussion and implications of this study with respect to our research questions.
To answer the first research question (RQ 1), we adopted a multi-model strategy combining human, brand, and website personality models, which resulted in 7 personality dimensions of the CAs. The entire model building process was done by considering Thai as the communication modality between the CAs and the users; to the best of our knowledge, this is one of the first works to propose a personality model in this language. In doing so, we extend the current literature on CA personality and provide our own unique contributions to this field. First, the authors in [10] previously proposed a conceptual framework of personality cues for CAs based only on the Big five model, where they hinted that some personality traits and cues have not been studied although they are important from an IS and HCI perspective. By taking the wider perspective of human, brand, and website personality models, in this work we are able to consolidate ''animate'' and ''non-animate'' personality types and provide empirical evidence for the same.
Second, previous research in [50] hinted at the possibility of cultural variations in the context of CA personality and suggested that not only the language but also the location is important. For example, the same words may have different meanings in different regions, such as 'pants' in UK and US English, due to which localization becomes important. In this study we paid special attention to localization by including a homogeneous sample group only from Thailand. Moreover, our personality model indicates the presence of 7 dimensions, which in our opinion ensures comprehensiveness by following a systematic approach. Previous works such as [24] examined only specific personality aspects, such as introversion-extroversion, in non-English languages (German and Austrian).
Third, with respect to the personality dimensions we obtain, it is interesting to note that the majority of the dimensions represent either desirable or undesirable characteristics. This indicates that whether the CAs are able to fulfil the user's expectations while conversing or performing tasks is an important aspect of user perception. In this respect, we would like to highlight the two dimensions of self-control and aesthetics. The dimension of self-control seems to describe the CAs' similarity to humans in terms of conversations. For example, impressions of autonomy, independence, or giving opinions on the topic of conversation during interactions with the user seem to imitate human characteristics. How self-conscious the current generation of CAs technically is may be a matter of debate; however, it does have an impact on the CA personality. On the one hand, designers of future CA systems may want to make them more autonomous to improve their self-control aspect; on the other hand, there is the contradicting concept of the uncanny valley, wherein greater human-likeness beyond a certain level has been shown to promote perceptions of discomfort and eeriness [72]. Similar arguments can be made for the aesthetics personality type: although vocal quality and naturalness of voice are important aspects to consider, the extent to which such naturalness is desirable and leads to a pleasurable experience must be identified.
Finally, our CA personality model contains 7 dimensions, some of which, like ''aesthetics – unaesthetics'' or ''self-control – instability'', are not present in the human personality frameworks. Therefore, our multi-model approach of proposing the personality traits is justified, since it helps to reveal the actual personality type of the CAs, which goes beyond the current human personality models. However, we would like to emphasize that due to continuous improvements in AI and machine learning algorithms, the intelligence and capabilities of the CAs keep changing; we therefore do not regard the proposed personality model of the CAs as the only available solution, but rather as a starting point for future research. The main contributions towards answering RQ 1 are summarized in the box below.
As the second objective, we wanted to check the relative importance of the identified CA personality types for the different usage scenarios. For this we proposed two types of scenarios: social tasks and functional tasks. The results show a distinct difference in the personality types affecting the two usage scenarios. For the social tasks, juvenility is found to have the highest impact, whereas for functional tasks intelligence is the most important personality type. The social tasks are meant for establishing a social connection with the CAs and primarily serve hedonic purposes. For example, cracking jokes or playing games with the CAs is indicative of the enjoyment and pleasure that users derive from such activities. In this regard, the users expect relevant answers, e.g., funny conversations that make them laugh in the case of jokes, rather than serious conversations. The whole purpose of hedonism and enjoyment is defeated if the CAs give out-of-context responses. Hence, the maturity of the CAs plays an important role here, which explains the high impact of this dimension.
Next, for the functional tasks the scenario is completely different. Intelligence of the CAs has the highest impact since it is related to the correctness and smoothness of the operations performed. Being intelligent makes the CAs do the tasks correctly and with competency. Likewise, when seeking information or requesting for navigation services, correctness of the knowledge becomes important so that the CAs are able to successfully guide the users towards their objectives. Hence, it becomes evident that for multiple activities different personalities have their own impact level depending on the activity type. Thus, the user's purpose of interacting with the CAs (social vs. functional use) may influence the perceptions of personality. Although previous research highlights artificial autonomy to be an important aspect for the successful adoption of CAs into the society [73], how such artificial autonomy can be created in the first place with CAs having a favourable personality towards the users is a fuzzy area [74]. The results obtained in this study will help in providing an initial research direction towards creating favourable personality types of these agents based on the nature of the activities they perform. The main contributions towards answering RQ 2 are summarized in the box below.
From the above discussion it becomes clear that HCI designers must proceed with caution while designing a suitable personality for the CAs based on the activities they perform. In fact, we suggest incorporating all seven personality continuums (Table 6) in a dynamic fashion. Our results clearly indicate that people prefer the CAs to have a specific type of personality when performing specific activities. Therefore, we suggest that designers give the CAs a flexible personality, which can be easily and instantly switched by turning on the most appropriate ''personality mode'' based on the activity being performed at that instant. Moreover, such ''personality mode'' changes must be made automatically by the agents depending on the nature of the interaction with the user. Additionally, personalization is an important aspect in this regard, which can leverage the advancements made in AI and ML techniques so that these agents know their primary user/owner better than ever before. Knowing the owner's mood, preferences, and likings will enable these agents to switch themselves to the most appropriate personality mode to maximise the user experience. The designers always have specific characteristics in mind when designing their CAs. In this regard, our personality descriptors can be used as a communication tool for designing these specific characteristics explicitly, which should help since the impact of each of these traits on a particular task type is also known. Moreover, the designers can work with the negative personality descriptors and try to leverage the benefits of natural language processing and machine learning techniques to overcome these drawbacks and improve the societal adoption of these devices.

VI. CONCLUSION AND FUTURE WORK
In this work we provided a systematic analysis of the different personality descriptors and dimensions based on the concepts of human, brand, and website personality models. We propose a total of 29 personality descriptors grouped into 7 personality dimensions. Moreover, the proposed personality model has been created based on using Thai as the conversational medium for human-CA interaction. We also explore the relative impact of the different personality dimensions on the different task types commonly performed by these agents.
On a broader perspective our personality model does not match with the human personality models like Big five, or the NEO-PI-3 inventory. The personality traits include items that are not related to the human personality models. This provides some evidence into the fact that people might perceive these agents in a different manner from humans. Moreover, as common in psychology our personality model should be re-validated by future studies.
This work is not without limitations, which pave the way for future research. First, we focused only on non-embodied conversational agents. It is unknown whether the current results will hold true for embodied agents, chatbots, and social robots. Second, currently only two commercially available CAs (Siri and Google Assistant) have the capability of conversing in Thai. Therefore, our personality model is limited specifically to these two agents and might need modifications when other manufacturers add support for this language to their products. Third, in this work we focused only on Thai as the conversational medium. While this is important for the context of Thailand, it makes our personality model exclusive to the Thai language. Other local languages might have different effects of personality that need to be tested by future research. Nevertheless, the methodology that we adopt in this work is fully general (language independent) and can be adopted by future researchers for building personality models in other local languages. Moreover, future studies can do a comparative analysis of personality models in both English and Thai from the same users. This will enable the research community to understand the exact nature of the variations that might arise from a change of language. Another aspect that we would like to highlight is that the current work aimed to build the personality model of the commercially available CAs that are sold in the mass consumer electronics market and support Thai. In doing so, we focused only on the default capabilities of these devices in terms of the features and support provided by the manufacturers. This might have resulted in the omission of certain language aspects like accent, dialect, or use of filler words, for which customized CAs with specific language models must be trained.
This can be a second line of research that not only checks the robustness of our proposed Thai personality model but also helps in gaining insights into language-specific features.

APPENDIX
See Table 7.