Multi-Agent Deep Reinforcement Learning With Progressive Negative Reward for Cryptocurrency Trading


Schematic diagram of our system overview: (a) data preparation, (b) multi-agent proximal policy optimization (MAPPO), and (c) simulated cryptocurrency market environment.

Abstract:

Recently, reinforcement learning has been applied to cryptocurrencies to make profitable trades. However, cryptocurrency trading is a very challenging task due to the volatility of the market, especially during bearish periods. To address this problem, the existing literature employs single-agent techniques such as deep Q-network (DQN), advantage actor-critic (A2C), and proximal policy optimization (PPO), or their ensembles. Moreover, in the context of cryptocurrencies, the mechanisms for restricting losses during a bearish market are insufficiently robust. Consequently, the performance of reinforcement learning methods for cryptocurrency trading in the existing literature is constrained. To overcome this limitation, we propose a novel cryptocurrency trading method based on multi-agent proximal policy optimization (MAPPO) with a collaborative multi-agent scheme and a local-global reward function that optimizes both the individual and collective performance of the agents. A multi-objective optimization technique and a multi-scale continuous loss (MSCL) reward are used to train the agents with a progressive penalty that discourages consecutive losses of portfolio value. For evaluation, we compared our method against multiple baselines and found that it achieves higher cumulative returns than all of them. The superiority of our method is most evident on the bearish test set, where it is the only method that makes a profit: our method obtains a 2.36% cumulative return, whereas the baseline methods yield negative cumulative returns. Compared to FinRL-Ensemble, a reinforcement learning-based method, our method achieves a 46.05% higher cumulative return on the bullish test set.
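To make the reward design concrete, the following is a minimal sketch of two ideas named in the abstract: a progressive penalty that grows with consecutive portfolio-value losses, and a local-global blend of per-agent and team rewards. The function names, the exponential penalty schedule, and all constants (base_penalty, growth, weight) are illustrative assumptions for exposition, not the authors' MSCL formulation or their MAPPO reward.

import numpy as np

def progressive_loss_penalty(portfolio_values, base_penalty=0.01, growth=2.0):
    """Sketch of a progressive negative reward for consecutive losses.

    Each step whose portfolio value falls below the previous step's value
    incurs a penalty that grows with the length of the current losing
    streak; a profitable step resets the streak. All constants are
    illustrative, not taken from the paper.
    """
    rewards = []
    streak = 0  # number of consecutive losing steps so far
    for prev, curr in zip(portfolio_values[:-1], portfolio_values[1:]):
        step_return = (curr - prev) / prev
        if step_return < 0:
            streak += 1
            # penalty escalates with the losing streak (progressive penalty)
            rewards.append(step_return - base_penalty * growth ** (streak - 1))
        else:
            streak = 0
            rewards.append(step_return)
    return rewards

def local_global_reward(local_rewards, weight=0.5):
    """Blend each agent's own (local) reward with the team average (global)."""
    team_reward = float(np.mean(local_rewards))
    return [weight * r + (1.0 - weight) * team_reward for r in local_rewards]

if __name__ == "__main__":
    values = [100.0, 98.0, 97.0, 95.0, 101.0]  # toy portfolio trajectory
    print(progressive_loss_penalty(values))    # penalties grow over the losing streak
    print(local_global_reward([0.02, -0.01, 0.005]))

In such a scheme, the escalating penalty term discourages a policy from riding out extended drawdowns, while the local-global blend ties each agent's incentive to the collective portfolio outcome.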
Published in: IEEE Access (Volume: 11)
Page(s): 66440 - 66455
Date of Publication: 27 June 2023
Electronic ISSN: 2169-3536
