Found In The Distribution: Utilizing Latent Dirichlet Allocation Improves Long Context Comprehension of Large Language Models | IEEE Conference Publication | IEEE Xplore


Abstract:

Large Language Models (LLMs), even when specifically trained to process long input contexts, struggle to capture relevant information located in the middle of their input. This phenomenon is known as the "lost-in-the-middle" problem. In this study, we propose a new method, Found In The Distribution (FITD), which uses Latent Dirichlet Allocation (LDA) to address this problem. Specifically, we first apply LDA to capture the critical information within the text. Based on the captured results, we generate corresponding probabilities and use a masking method to compress the prompt, enabling LLMs to focus on these important pieces of information even when they are situated in the middle of the input. Finally, we demonstrate that our approach performs better in identifying pertinent information within long contexts.
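
The pipeline outlined in the abstract (fit LDA, derive probabilities, mask the prompt) can be illustrated with a minimal sketch. The abstract does not specify the scoring or masking rules, so everything below is an assumption made for illustration rather than the authors' FITD implementation: each sentence is treated as one LDA document, LDA is fit with a small collapsed Gibbs sampler, each word is scored by its probability under the fitted topic mixture, and the lowest-scoring tokens are dropped. The names `fit_lda`, `word_probabilities`, `compress`, `keep_ratio`, and the toy `STOPWORDS` list are hypothetical.

```python
import random
import re

STOPWORDS = {"the", "a", "an", "on", "as", "of", "in", "to", "and", "is"}  # toy list (assumption)

def norm(token):
    """Lowercase a token and keep only letters and digits."""
    return "".join(re.findall(r"[a-z0-9]+", token.lower()))

def fit_lda(docs, num_topics=2, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    V = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]                       # counts per (doc, topic)
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(num_topics)]  # counts per (topic, word)
    topic_total = [0] * num_topics                                     # tokens assigned to each topic
    assign = []
    for d, doc in enumerate(docs):  # random initial topic for every token
        zs = [rng.randrange(num_topics) for _ in doc]
        for w, k in zip(doc, zs):
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]  # remove the token's current assignment
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [(doc_topic[d][t] + alpha) * (topic_word[t][w] + beta)
                           / (topic_total[t] + V * beta) for t in range(num_topics)]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k  # record the resampled topic
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed topic-word distributions and corpus-level topic proportions.
    phi = [{w: (topic_word[k][w] + beta) / (topic_total[k] + V * beta) for w in vocab}
           for k in range(num_topics)]
    total = sum(topic_total)
    theta = [(topic_total[k] + alpha) / (total + num_topics * alpha)
             for k in range(num_topics)]
    return phi, theta

def word_probabilities(docs, **lda_kwargs):
    """p(w) = sum over topics k of theta[k] * phi[k][w]."""
    phi, theta = fit_lda(docs, **lda_kwargs)
    return {w: sum(t * p[w] for t, p in zip(theta, phi)) for w in phi[0]}

def compress(context, keep_ratio=0.5, **lda_kwargs):
    """Keep roughly keep_ratio of the tokens, preferring high-probability words."""
    sentences = re.split(r"(?<=[.!?])\s+", context.strip())
    docs = [[w for w in map(norm, s.split()) if w and w not in STOPWORDS]
            for s in sentences]
    prob = word_probabilities(docs, **lda_kwargs)
    tokens = context.split()
    scores = [prob.get(norm(t), 0.0) for t in tokens]  # stopwords score 0
    n_keep = max(1, round(keep_ratio * len(tokens)))
    ranked = sorted(range(len(tokens)), key=lambda i: -scores[i])
    keep = set(ranked[:n_keep])
    return " ".join(t for i, t in enumerate(tokens) if i in keep)  # original order
```

Calling `compress(long_context, keep_ratio=0.5)` returns a shortened prompt whose surviving tokens stay in their original order. In practice a library LDA implementation and a model-specific tokenizer would likely replace the toy sampler and whitespace splitting used here.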
Date of Conference: 06-11 April 2025
Date Added to IEEE Xplore: 07 March 2025
Conference Location: Hyderabad, India
