Abstract:
Developing value-aligned agents is a complex undertaking and an ongoing challenge in the field of AI. Indeed, designing Large Language Models (LLMs) that can balance multiple, possibly conflicting moral values based on context is a problem of paramount importance. In this paper, we propose a system that performs contextual value alignment through contextual aggregation of possible responses. This aggregation is achieved by integrating a subset of possible LLM responses that are best suited to a user's input, while taking into account features extracted about the user's moral preferences. The proposed system, trained on the Moral Integrity Corpus, displays better alignment to human values than state-of-the-art baselines.
Published in: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 06-11 April 2025
Date Added to IEEE Xplore: 07 March 2025