Abstract:
Large Language Models, even when specifically trained to process long input contexts, struggle to capture relevant information located in the middle of their input. This phenomenon is known as the "lost-in-the-middle" problem. In this study, we propose a new method, Found In The Distribution (FITD), which uses Latent Dirichlet Allocation (LDA) to address the "lost-in-the-middle" problem. Specifically, we first apply LDA to capture the critical information within the text. Based on the captured results, we generate corresponding probabilities and use a masking method to compress the prompt, enabling LLMs to focus correctly on these important pieces of information even when they are situated in the middle. Finally, we demonstrate that our approach performs better at identifying pertinent information within long contexts.
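To make the general idea concrete, the sketch below illustrates one way an LDA-scored masking step could compress a long context: sentences are scored against the query's topic distribution and low-scoring ones are dropped. This is a minimal sketch, not the paper's implementation; the function compress_prompt, the period-based sentence splitter, the cosine relevance score, and the keep_ratio parameter are assumptions made for illustration.

    # Illustrative sketch only: score sentences with LDA topic distributions and
    # mask out low-relevance ones so key information survives prompt compression.
    # Not the authors' FITD implementation; relevance score and keep ratio are assumed.
    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    def compress_prompt(context: str, query: str, n_topics: int = 5, keep_ratio: float = 0.5) -> str:
        # Naive sentence segmentation (assumption: periods delimit sentences).
        sentences = [s.strip() for s in context.split(".") if s.strip()]
        vectorizer = CountVectorizer(stop_words="english")
        counts = vectorizer.fit_transform(sentences)

        # Fit LDA over the sentences and obtain per-sentence topic distributions.
        lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
        doc_topics = lda.fit_transform(counts)          # shape: (n_sentences, n_topics)

        # Project the query into the same topic space and use cosine similarity of
        # topic distributions as a stand-in relevance score (an assumed choice).
        query_topics = lda.transform(vectorizer.transform([query]))[0]
        scores = doc_topics @ query_topics / (
            np.linalg.norm(doc_topics, axis=1) * np.linalg.norm(query_topics) + 1e-12
        )

        # Mask (drop) the lowest-scoring sentences, preserving the original order so
        # retained information stays where it was, including in the middle.
        n_keep = max(1, int(len(sentences) * keep_ratio))
        keep_idx = np.sort(np.argsort(scores)[-n_keep:])
        return ". ".join(sentences[i] for i in keep_idx) + "."

The compressed context returned by such a step would then be passed to the LLM in place of the full prompt, so that relevant middle-of-context sentences occupy a larger share of the input.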
Published in: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 06-11 April 2025
Date Added to IEEE Xplore: 07 March 2025