Synthetic data were generated based on the semantic similarities and analysed (upper path). The same analyses were done for previously collected survey data. The results ...
Abstract:
Questionnaires are essential for measuring self-reported attitudes, beliefs, and behaviour in many research fields. Semantic similarity of the questions is recognized as a source of covariance in the human data, implying that response patterns partly arise from the questionnaire itself. A practical method to assess the influence of semantic similarity could significantly facilitate the design of questionnaires and the interpretation of their results. The current study presents a novel method for estimating the influence of semantic similarity for questionnaires with Likert-scale responses. The method transforms Likert-scale responses into natural language sentences and applies the Sentence-BERT algorithm to compute a semantic similarity matrix. Synthetic response data are generated using the semantic similarity matrix and a noise parameter as input. Synthetic data can then be analysed using the same tools as human survey data, making the comparison straightforward. The method was tested with a questionnaire measuring the acceptance of automated driving. Synthetic data explained 40% of the observed correlations in the human response data. This means that semantic similarity substantially influenced responses. Using synthetic data, it was possible to identify the same factor structure as in the human data and to identify relationships between factors that might have been inflated by semantic similarity. This demonstrates that semantically generated synthetic data could help in designing multi-factor questionnaires and correctly interpreting the found relationships between factors.
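The generation step described above (a semantic similarity matrix plus a noise parameter as input, Likert-type responses as output) can be illustrated with a minimal sketch. It is not the authors' procedure, which the abstract does not specify. It rests on several assumptions: the similarity matrix is hand-written here rather than computed with Sentence-BERT, noise is modelled as a linear blend of the similarity matrix with the identity matrix, and latent scores are cut into five categories at fixed thresholds. All function and variable names (`synthetic_likert`, `cholesky`, `pearson`, `similarity`) are hypothetical.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def synthetic_likert(similarity, noise, n_respondents, levels=5, seed=0):
    """Draw Likert-type responses whose correlations follow `similarity`.

    Assumption: noise in (0, 1] blends the similarity matrix with the
    identity matrix, so noise=1 yields uncorrelated items.
    """
    rng = random.Random(seed)
    n = len(similarity)
    cov = [
        [(1 - noise) * similarity[i][j] + (noise if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    lower = cholesky(cov)
    offset = (levels + 1) / 2 + 0.5  # centres the scale on the midpoint
    rows = []
    for _ in range(n_respondents):
        z = [rng.gauss(0, 1) for _ in range(n)]
        latent = [sum(lower[i][k] * z[k] for k in range(i + 1))
                  for i in range(n)]
        # Cut each latent score into an integer category 1..levels.
        rows.append([min(levels, max(1, math.floor(x + offset)))
                     for x in latent])
    return rows


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Illustrative similarity matrix for three items (made-up values).
similarity = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
data = synthetic_likert(similarity, noise=0.3, n_respondents=2000)
item = lambda j: [row[j] for row in data]
print(round(pearson(item(0), item(1)), 2), round(pearson(item(0), item(2)), 2))
```

Because the output is an ordinary respondents-by-items table of integers, it can be passed to the same correlation or factor-analysis tools as human survey data, which is the property the abstract relies on for the comparison.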
Published in: IEEE Access (Volume: 13)
IEEE Keywords (Index Terms):
- Survey Responses
- Semantic Similarity
- Similar Influence
- Synthetic Data Generation
- Factor Structure
- Response Patterns
- Human Data
- Natural Language
- Response Data
- Similarity Matrix
- Likert Scale Responses
- Empirical Data
- Structural Equation Modeling
- Confirmatory Factor Analysis
- Measurement Model
- Related Data
- Social Influence
- Behavioral Intention
- Path Coefficients
- Entailment
- Sentence Embedding
- Self-driving
- Hedonic Motivation
- Questionnaire Statements
- Effect Of Social Influence
- Negative Sentences
- Numerical Code
- Sufficient Reliability
- Reverse Coding
- Amount Of Noise