Abstract:
In recent work [1], we developed a distributed stochastic multi-armed contextual bandit algorithm that learns optimal actions when the contexts are unknown, with M agents working collaboratively under the coordination of a central server to minimize the total regret. In our model, the agents observe only the context distribution; the exact context is unknown to them. Such a situation arises, for instance, when the context itself is a noisy measurement or is based on a prediction mechanism. By applying a feature vector transformation and building on the UCB algorithm, we proposed a UCB-based algorithm for stochastic bandits with context distribution. In this paper, we test our algorithm on a real-world dataset and investigate the interactions between drugs and proteins. To this end, we perform a data pre-processing step to fit the model and evaluate the performance of our algorithm on the drug-protein interaction study against other benchmark algorithms. Furthermore, we present the results of biological experiments and draw inferences from our findings.
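To illustrate the idea of a feature vector transformation combined with UCB, the following is a minimal single-agent sketch, not the distributed algorithm of [1]. It assumes a linear reward model, a finite set of contexts, and a known context distribution; the names `expected_features` and `LinUCB`, and the parameters `alpha` and `reg`, are hypothetical and chosen for illustration only. Each action's feature vector is replaced by its expectation under the context distribution, and a standard linear UCB rule is then applied to the transformed features.

```python
import numpy as np


def expected_features(context_probs, features):
    """Average the feature vectors over the context distribution.

    context_probs: shape (n_contexts,), probabilities summing to 1.
    features:      shape (n_contexts, n_actions, dim).
    Returns an array of shape (n_actions, dim).
    """
    return np.tensordot(context_probs, features, axes=1)


class LinUCB:
    """Plain linear UCB on the transformed (expected) features.

    Assumption: a single agent with a ridge-regression estimate;
    the server coordination described in the abstract is omitted.
    """

    def __init__(self, dim, alpha=1.0, reg=1.0):
        self.V = reg * np.eye(dim)  # regularized Gram matrix
        self.b = np.zeros(dim)      # sum of reward-weighted features
        self.alpha = alpha          # exploration weight (assumed constant)

    def select(self, psi):
        """Return the index of the action with the largest UCB score."""
        V_inv = np.linalg.inv(self.V)
        theta_hat = V_inv @ self.b
        mean = psi @ theta_hat
        # Per-action confidence width sqrt(psi_a^T V^{-1} psi_a).
        width = np.sqrt(np.einsum("ad,de,ae->a", psi, V_inv, psi))
        return int(np.argmax(mean + self.alpha * width))

    def update(self, psi_a, reward):
        """Update the statistics with the chosen feature and its reward."""
        self.V += np.outer(psi_a, psi_a)
        self.b += reward * psi_a
```

In each round, one would compute `psi = expected_features(context_probs, features)`, call `select(psi)` to choose an action, observe the reward, and then call `update(psi[action], reward)`.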
Published in: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 14-19 April 2024
Date Added to IEEE Xplore: 18 March 2024