Twitter Attribute Classification with Q-Learning on Bitcoin Price Prediction

Accurate Bitcoin price prediction based on people's opinions on Twitter usually requires collecting millions of tweets, applying different text mining techniques (preprocessing, tokenization, stemming, stop word removal), and developing a machine learning model to perform the prediction. These efforts consume a significant amount of computing power, central processing unit (CPU) utilization, random-access memory (RAM), and time. To address this issue, in this paper, we classify the tweet attributes that affect price changes and computer resource usage levels while obtaining an accurate price prediction. To identify the tweet attributes with a high effect on price movement, we collect all Bitcoin-related tweets posted in a certain period and divide them into four categories based on the following tweet attributes: $(i)$ the number of followers of the tweet poster, $(ii)$ the number of comments on the tweet, $(iii)$ the number of likes, and $(iv)$ the number of retweets. We train and test a Q-learning model separately on each of the four categorized sets of tweets and identify the set that yields the most accurate prediction. In particular, we design several reward functions to improve the prediction accuracy of Q-learning. We compare our approach with a classic approach in which all Bitcoin-related tweets are used as input data for the model, by analyzing the CPU workload, RAM usage, memory, time, and prediction accuracy. The results show that tweets posted by users with the most followers have the most influence on the future price, and using them leads to 80\% less time, 88.8\% less CPU consumption, and 12.5\% more accurate predictions compared with the classic approach.


I. INTRODUCTION
Earlier stock market forecasting research relied on past stock values [1]-[3]. Most studies have found that analyzing previous prices is not sufficient to anticipate stock market changes because stock market prices are highly volatile. According to the efficient market hypothesis [4], financial market movements are influenced by news, current events, and product releases, all of which have a substantial impact on a company's stock value. As a large market, Bitcoin has no central controlling authority and is regulated solely by the public. As a result, Bitcoin is viewed as a volatile cryptocurrency whose value is influenced by public opinion. According to the analysis of Kristoufek [5], several significant drops occurred in the Bitcoin exchange rate and price during dramatic events in China. Another study conducted by the American Institute for Economic Research [6] shows that Bitcoin prices fluctuated substantially between 2016 and 2017 as a result of global news and emotions.
Owing to the rise of social media, information regarding popular sentiments has become more accessible. Social media is becoming an ideal medium for sharing the public mood on any issue, and it has a significant effect on general public opinion. Twitter, a social networking service (SNS), has recently received significant academic attention. Twitter is a real-time micro-blogging service that allows users to follow and comment on others' thoughts and views [8]. Approximately 140 million tweets are sent to more than a million people daily. Each tweet is up to 140 characters long and expresses the public view of a particular issue. Information derived from tweets is valuable for forecasting [9]. Over a million Bitcoin-related tweets are available to researchers for processing and application in the field of predicting future Bitcoin prices. Processing a large number of Bitcoin-related tweets normally consumes a high level of computer resources (CPU, RAM, memory) and time [31]-[34]. Most previous works focus on reducing resource usage and do not consider maximizing prediction accuracy at the same time. However, tweets written by an expert, public figure, or celebrity tend to become viral, with many replies, likes, and retweets. Tweets with few replies, likes, or retweets are unlikely to become viral because they are likely to circulate mainly among close friends. Consequently, viral tweets are expected to have a greater influence on price changes than other tweets. If we can separate the tweets with the highest impact on future price changes from less important tweets, we can use fewer computer resources while still obtaining accurate forecasts.
Hence, unlike the previous approaches, in this study, we analyze how Bitcoin-related information on Twitter affects the actual Bitcoin price by considering four main attributes: (i) the number of followers of the poster, (ii) the number of comments on a tweet, (iii) the number of likes, and (iv) the number of retweets. For this, we gather all Bitcoin-related tweets within a particular period and divide them into four groups based on their attributes. We use the sentiment information of tweets as the input for the prediction, yet there is no particular guidance that tells the model under which sentiment conditions the price will increase or decrease. Therefore, we need an optimal policy to achieve valuable prediction accuracy. The model can improve its initial non-optimal policy by receiving good or bad rewards based on the prediction results. Considering the above, we choose the well-known Q-learning method to find the most valuable attribute for predicting Bitcoin's future price. To the best of our knowledge, our work is the first study of the predictive power of the attributes of tweets containing the term "Bitcoin" for the future returns and volatility of Bitcoin.
After identifying which attribute is helpful for selecting highly effective tweets for prediction, we compare our approach with the classic approach, in which all Bitcoin-related tweets are utilized without attribute filtering, by looking at the CPU workload, RAM utilization, memory usage, time required to complete the same task, and prediction accuracy.
We summarize our main contributions in more detail as follows: (a) First, we study the predictive power of four main tweet attributes: number of tweet poster's followers, number of comments, number of likes, and number of retweets. We create four datasets consisting of tweets sorted according to the above attributes. Next, we extract the sentiment of each tweet. By making four separate predictions based on the datasets and evaluating the prediction results, we detect the most useful attribute for the Bitcoin price prediction.
(b) Second, we develop the predictive model based on the Q-learning algorithm. For this, we first consider a Markov decision process (MDP) as follows: the current actual price of Bitcoin serves as a state, the prediction of the Bitcoin price as an action, and the difference between the actual price and the predicted price as a reward. In general, the state transition probability is often not provided, which leads us to adopt a model-free version of reinforcement learning (RL). Using this, we design several reward functions to improve the prediction accuracy of Q-learning.
(c) Finally, we improve the prediction accuracy and minimize the utilization of computer resources (CPU, RAM, and memory) and researcher time. The Q-learning-based model receives two different datasets as input: the first consists of all Bitcoin-related tweets without attribute filtering (classic approach), and the second is the most useful of the four datasets, as determined earlier (proposed approach). With the two datasets, the model produces two different prediction outputs. By comparing the metrics of these predictions, we conclude which approach is better.
The remainder of this paper is organized as follows. Section II provides an overview of related research. Detailed information about the data collection, data preprocessing, and sentiment analysis is provided in Section III. Section IV describes the model learning algorithm and its use in our research. The experimental results are detailed in Section V. Section VI summarizes the limitations of the study and points out directions for further research. Finally, Section VII concludes the paper.

II. RELATED WORK
In this section, we classify the related research into the following three categories: (1) Bitcoin price prediction with public opinion, (2) Striving for accurate prediction, and (3) Resource usage minimization. Table I shows general information about the related studies along with the key algorithms/methods they used.

A. Bitcoin price prediction with public opinion
Sentiment analysis is an important field for researchers, as analyzing people's thoughts and emotions has become a popular and accepted technique for examining public opinion. Twitter, Facebook, and Instagram are examples of social media platforms used to collect sentiment data for research. The major goal of adopting these approaches is to identify and extract emotions in spoken or written language using natural language processing techniques. Among social media platforms, Twitter has recently attracted interest from a wide range of academic disciplines, as it is considered useful for analyzing economic and social datasets. Applying machine learning algorithms to data extracted from Twitter has opened up a wide range of opportunities, including the identification of hate speech [10], the analysis of personalities based on profile pictures [11], and the prediction of offensiveness in tweets [12].
Over the past decade, several studies have investigated the links between price movements and sentiments extracted from Twitter. Kaminski et al. [13] found that the platform appears to have an impact on users and information dissemination. Ranasinghe et al. [14] demonstrated that Twitter may be related to a shift in the public image of Bitcoin. According to this research, there is a strong link between the probability of a Twitter user influencing others and the probability of being influenced, but the majority of users maintain a balance in their attitudes in both circumstances. Nagar et al. [15] claimed that the sentiment of news obtained from a news corpus and stock price movements were highly correlated. Pagolu et al. [16] focused on forecasting stock price movements using Twitter sentiment and revealed a strong connection between sentiments on Twitter and stock market movements. Sul et al. [17] developed a sentiment classifier, applied it to 2.5 million tweets related to S&P 500 companies, and compared the results with stock returns. The findings revealed that rapidly spreading sentiment was more likely to be reflected in a stock price on the same trading day, whereas slower-spreading sentiment was more likely to be reflected on upcoming trading days. In our previous research [18], we scraped more than 9.2 thousand tweets that were posted in a two-month period and found that, when sentiment analysis was applied to tweets regarding Bitcoin and financial data, the sentiment on Twitter had a predictive impact on the Bitcoin price.

B. Striving for accurate prediction
It is known that tweet sentiments are positively related to price fluctuations. Based on this fact, several techniques have been proposed to accurately predict future prices using different machine learning algorithms. Mittal et al. [19] gathered approximately 7.5 million tweets and applied long short-term memory (LSTM), recurrent neural network (RNN), and polynomial regression models to tweet sentiment, tweet volume, and Google Trends data; tweet volume and Google Trends predicted the Bitcoin price direction with accuracies of 77.01 percent and 66.66 percent, respectively. Pant et al. [20] built another RNN model that categorized Bitcoin tweets as positive or negative. They combined the percentages of positive and negative tweets with the historical price of Bitcoin. The results showed an overall prediction accuracy of 77.62 percent.
While many studies have investigated token economics based on the Bitcoin network, several studies have focused on analyzing the effect of network sentiment on the overall price of Bitcoin. Serafini et al. [21] compared two models used for Bitcoin time-series predictions: the Auto-Regressive Integrated Moving Average with eXogenous input (ARIMAX) and RNN. The line of studies adopting LSTM for price prediction was continued by Ye et al. [22], who used a gated recurrent unit (GRU) together with LSTM as an ensemble method. Their model achieved a performance of 88.74% on real data from September 2017 to January 2021.
Thanekar et al. [23] demonstrated that artificial intelligence (AI) models using sentiment analysis of tweets containing the keywords "bitcoin" or "btc" predicted the volatility of Bitcoin values with higher accuracy than models without sentiment analysis; the machine learning models in that comparison were an autoregressive integrated moving average (ARIMA) model and an LSTM network. Gurrib et al. [24] achieved an accuracy of 0.828 in forecasting the next-day price direction by using linear discriminant analysis (LDA) with sentiment analysis of Bitcoin-related tweets. Another study [25] compared ARIMA and LSTM models for real-time prediction of the Bitcoin price using public sentiments in tweets and achieved more accurate results with LSTM. Colianni et al. [26] studied how tweet sentiments may be utilized to inform investment decisions, focusing on Bitcoin. The authors employed supervised machine learning algorithms to achieve hour-by-hour and day-by-day accuracies above 90%. Similar to the above researchers, Jain et al. [27] focused on current tweets, classifying them into positive, negative, and neutral sentiments and accumulating their counts every two hours to predict the prices of Bitcoin and Litecoin two hours in advance. Using a multiple linear regression (MLR) model, they utilized more than 1.8 million Bitcoin-related and Litecoin-related tweets to investigate whether social factors were capable of predicting the future price of cryptocurrencies. The study notes that the MLR model predicts the prices of Bitcoin and Litecoin with scores of 44% and 59%, respectively.
As Bitcoin has no central controlling authority and its fluctuations are tied to ongoing news and events, some researchers have studied how COVID-19 outbreak data (numbers of new cases, recoveries, and deaths) can impact the future price of Bitcoin. Pano et al. [28] provided a corpus of tweet text for Bitcoin-related tweets during the summer of the COVID-19 period. This dataset is publicly available and covers three months, which allows unimpeded research. To make an accurate price prediction, Luo et al. [29] fed four different machine learning models with three different types of data: Bitcoin exchange data, COVID-19 data, and Twitter data from January 2020 to July 2020. One of the findings of this study is that COVID-19 data does not help to improve the prediction.

C. Resource usage minimization
Many researchers have studied how to minimize PC resource usage while keeping the same accuracy. In one such study, Steinkraus et al. [31] reported over three times faster training and testing when the model was implemented on a graphics processing unit (GPU) rather than a CPU. A greater difference was reported by Catanzaro et al. [32], where classification was eight times faster when a support vector machine (SVM) was implemented on a GPU than when an alternative SVM algorithm ran on a CPU. In contrast to the above two studies, McNally et al. [33] ran an LSTM model on a CPU and a GPU to ascertain the accuracy of the direction of the Bitcoin price in USD. They reported that the GPU outperformed the CPU by 67.7%. For growing training datasets, Sumarsih et al. [34] compared GPU performance with that of an Apache Spark cluster, which is an in-memory data processing engine that uses RAM instead of disk I/O. Their data processing simulation using linear regression (LR) to learn Bitcoin trading showed faster results when run on the Apache Spark cluster.
The common point of all the aforementioned studies is that they considered all types of tweets related to cryptocurrency, without considering the importance of the tweet attributes for price movements. To the best of our knowledge, our work is the first attempt to classify the attributes of tweets involving the terms "Bitcoin" and "BTC" that affect the future volatility of the Bitcoin price.

III. DATA PREPARATION
In this section, we describe the data-preparation steps for Bitcoin price prediction. We consider the following four steps in data preparation: (i) data collection, (ii) preprocessing, (iii) attribute division, and (iv) sentiment analysis. In the data collection step, we collect data containing tweets relating to Bitcoin. Thereafter, we remove noise such as repeated tweets, URLs, user mentions, and extra repeated characters from the dataset in the preprocessing step. In the attribute division step, we build four datasets containing tweets sorted according to their attributes. We perform sentiment analysis on the gathered tweets in the final sentiment analysis step. A detailed explanation of each step is provided below.

A. Data Collection
Bitcoin Price Data. We use a total of 1690 days of Bitcoin market price data, covering the period from April 1, 2014 to November 14, 2018 (see [35]), as the real data for predicting the Bitcoin price. We choose this period because the Bitcoin price fluctuated substantially during it, which motivates us to verify the effectiveness of the proposed method on this period.

Bitcoin Tweet Data. We use Tweepy and Twitter's streaming API [36] to collect the Bitcoin-related tweet data. Tweepy is a Python-based open-source framework that makes it easier to gather tweets using the Twitter API [37]. Tweepy allows data filtering based on hashtags or terms, which is an effective means of collecting relevant data. The filter keywords are selected using the most definitive Bitcoin context phrases; for example, "cryptocurrency" may capture attitudes towards other cryptocurrencies, and therefore, the scope must be narrowed even further to include only Bitcoin synonyms, such as "Bitcoin" and "BTC." Using this method, we gather 5,496,138 Bitcoin-related tweets generated within the real data period of the Bitcoin price. Table II lists the statistical values for the dataset.

Table II
Tweet category | Number of tweets
Tweets with keywords used once | 3,462,567
Tweets with keywords used twice | 1,154,192
Tweets with keywords used more than three times | 879,379

Tweets obtained directly from Twitter typically form noisy datasets. This is due to the social nature of social media use. Certain noise in tweets, such as URLs, emoticons, and user references, must be eliminated appropriately. For this purpose, raw Twitter data must be formatted to build a dataset that can be easily processed by multiple classifiers.
To this end, we apply several preprocessing steps to normalize the dataset, minimize its size, etc. Table III presents our preprocessing tasks; the order of the tasks is not important. We use the data refined according to these tasks.

Table III
No. | Task
1 | Change all letters in the tweet to lower case
2 | Replace two or more dots (.) with a space
3 | Replace two or more spaces with a single space
4 | Remove the user-mention symbol (@)
5 | Change hashtags into ordinary words
6 | Remove the retweet symbol (RT) and URLs
7 | Reduce characters repeated more than 3 times
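The tasks of Table III can be sketched with regular expressions as follows. This is only an illustrative sketch, not the authors' implementation: the function name `preprocess_tweet`, the specific patterns, and the order of the steps (URLs are removed first so that later rules do not touch them) are assumptions.

```python
import re

def preprocess_tweet(text: str) -> str:
    """Apply the seven normalization tasks of Table III to one tweet (assumed patterns)."""
    text = text.lower()                                  # task 1: lower case
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # task 6: remove URLs
    text = re.sub(r"\brt\b", " ", text)                  # task 6: remove the retweet marker
    text = re.sub(r"\.{2,}", " ", text)                  # task 2: two or more dots -> space
    text = re.sub(r"@(\w+)", r"\1", text)                # task 4: drop the '@' symbol only
    text = re.sub(r"#(\w+)", r"\1", text)                # task 5: hashtag -> ordinary word
    text = re.sub(r"(.)\1{3,}", r"\1\1\1", text)         # task 7: cap character runs at three
    text = re.sub(r"\s+", " ", text).strip()             # task 3: collapse whitespace
    return text
```

Collapsing whitespace last is a design choice of this sketch: the earlier substitutions insert spaces, so a final pass leaves exactly one space between tokens.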

B. Attribute Division
To determine the effects of tweet attributes, we divide the preprocessed data into the following four types: (1) number of followers of the poster, (2) number of comments on the tweet, (3) number of likes, and (4) number of retweets.
Sorting According to Attributes. The tweet data cover tweets posted within 1,688 days, and at this point we have a single dataset with over 5 million tweets for this period.
To create the datasets of interest, the tweets posted on each day are sorted separately into four datasets according to the above attributes in decreasing order. That is, we sort the dataset by attribute (1) and save it separately, sort it again by attribute (2) and save it separately, and repeat this process with attributes (3) and (4). At this stage, however, the four datasets still contain the same tweets and differ only in their order.

Avoiding Similar Data. To prevent similar data from appearing in each dataset, only the first half of each sorted dataset is used in the experiment. In simple terms, all tweets posted in one day are sorted in decreasing order of the poster's number of followers (1), and only the first half of the tweets is used as the first dataset. Subsequently, the tweets are re-sorted by the number of comments (2), and only the first half is used as the second dataset. Similarly, they are sorted according to the number of likes (3) and the number of retweets (4) to create the third and fourth datasets.
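The per-day sorting and halving described above can be sketched in plain Python as follows. The record layout (dictionaries with `date`, `followers`, `comments`, `likes`, and `retweets` keys) and the function names are hypothetical, and rounding the half up for days with an odd number of tweets is an assumption of this sketch.

```python
from collections import defaultdict

# Hypothetical field names for the four tweet attributes.
ATTRIBUTES = ("followers", "comments", "likes", "retweets")

def top_half_by_attribute(tweets, attribute):
    """For each day, keep the first half of the tweets sorted by `attribute`, descending."""
    by_day = defaultdict(list)
    for tweet in tweets:
        by_day[tweet["date"]].append(tweet)
    selected = []
    for day in sorted(by_day):
        ranked = sorted(by_day[day], key=lambda t: t[attribute], reverse=True)
        half = (len(ranked) + 1) // 2   # assumption: round up on odd counts
        selected.extend(ranked[:half])
    return selected

def build_attribute_datasets(tweets):
    """Return one filtered dataset per attribute, keyed by attribute name."""
    return {name: top_half_by_attribute(tweets, name) for name in ATTRIBUTES}
```

Because each dataset keeps a different top half, the four datasets overlap only where a tweet ranks highly on several attributes at once.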

C. Sentiment Analysis
As a final step, we apply sentiment analysis to determine the subjective emotions or views on Bitcoin expressed in the tweets. We perform sentiment analysis by categorizing textual views into categories such as "positive," "negative," or "neutral." We use the Valence Aware Dictionary and Sentiment Reasoner (VADER) [38] to classify the content of each tweet. VADER is a Python sentiment analysis library that uses a lexicon and rules to analyze sentiments expressed on social media. For a given text, VADER provides three valence scores: positive, negative, and neutral. In addition, it provides a compound score: the valence ratings of each word in the lexicon are added together, modified according to the rules, and then normalized into [−1, 1], where −1 is extremely negative, +1 is extremely positive, and 0 is neutral. The compound score is useful because it provides a single unidimensional estimate of the emotion of each tweet. Based on this, we use the compound score to describe the sentiment of each tweet. Subsequently, we perform Q-learning for the price prediction with the sentiment-analyzed tweet data, as described in the following section.
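The normalization of the summed valence into [−1, 1] can be illustrated with a small sketch. To our understanding, the reference VADER implementation maps a summed valence x to x / sqrt(x² + α) with α = 15, and its documentation suggests ±0.05 as thresholds for labeling a text positive or negative; both values, as well as the function names below, should be read as assumptions of this sketch rather than as part of the method described above.

```python
import math

def normalize_valence(total_valence: float, alpha: float = 15.0) -> float:
    """Map an unbounded summed valence into [-1, 1] (assumed VADER-style normalization)."""
    score = total_valence / math.sqrt(total_valence ** 2 + alpha)
    return max(-1.0, min(1.0, score))

def label_sentiment(compound: float, threshold: float = 0.05) -> str:
    """Turn a compound score into a coarse label (thresholds are an assumption)."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"
```

In practice, the compound score would be taken directly from the VADER analyzer; the sketch only makes the shape of the mapping explicit.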

IV. LEARNING ALGORITHM
In this section, we introduce our approach to predicting Bitcoin prices based on Twitter data. For this, we adopt simple reinforcement learning, in which the environment is the Bitcoin market. We first briefly explain RL and then present the proposed RL-based approach in the following subsections.

A. RL and Q-Learning
Standard RL is formulated based on a Markov decision process (MDP). An MDP is a tuple $\langle \mathcal{S}, \mathcal{A}, r, P, \gamma \rangle$, where $\mathcal{S}$ and $\mathcal{A}$ are the sets of states and actions, respectively, and $\gamma \in [0, 1]$ denotes the discount factor. A transition probability function $P : \mathcal{S} \times \mathcal{A} \to \mathcal{S}$ maps the states and actions to a probability distribution over the next states, and $r : \mathcal{S} \times \mathcal{A} \to \mathbb{R}$ denotes the reward. The goal of RL is to learn a policy $\pi : \mathcal{S} \to \mathcal{A}$ that solves the MDP by maximizing the expected discounted return $R_t = \mathbb{E}\left[\sum_{k=0}^{\infty} \gamma^k r_{t+k} \mid \pi\right]$. The policy induces a value function $V^\pi(s) = \mathbb{E}_\pi[R_t \mid s_t = s]$ and an action value function $Q^\pi(s, a) = \mathbb{E}_\pi[R_t \mid s_t = s, a_t = a]$.
In general, the state transition probability is often not provided in RL. In this case, the agent must learn the optimal policy by trial and error through exploration. In RL, determining a policy that maximizes the expected reward through this process is known as model-free learning. Q-learning is one of the most famous model-free algorithms. RL strategies (such as Q-learning) have recently been used to improve prediction models in various areas of social network research [39].
Q-learning [40] is a simple RL algorithm that, given the current state, finds the best action to take in that state. It is an off-policy algorithm because it can learn from actions, including random ones, that are not selected by the current policy. It constructs a Q-table $Q(s, a)$, where the value of each entry is the reward when the agent selects action $a \in \mathcal{A}$ at state $s \in \mathcal{S}$. The algorithm operates in three basic steps: (1) the agent starts in a state, takes an action, and receives a reward; (2) for the next action, the agent has two choices: either reference the Q-table and select the action with the highest value, or take a random action; and (3) the agent updates the Q-values (i.e., $Q(s, a)$) in the table.
The main objective is to learn the Q-function. To describe this precisely, let $s_t$ and $a_t$ be the state and action at the current time $t$. Before the iteration, $Q$ is initialized to an arbitrary value. Subsequently, at each time $t$, the agent selects an action $a_t$ at $s_t$ and observes a reward $r_t$, following which it enters a new state $s_{t+1}$, and the values of $Q$ are updated. At the core of the algorithm is the Bellman equation as a simple value iteration update using the weighted average of the old value and the new information:
$$Q(s_t, a_t) \leftarrow (1 - \theta)\, Q(s_t, a_t) + \theta \left( r_t + \gamma\, Q^* \right),$$
where $\theta$ ($0 < \theta \le 1$) is the learning rate and $\gamma$ is a discount factor with $0 \le \gamma \le 1$. The value of $Q^*$ is the estimate of the optimal future value, which is expressed by
$$Q^* = \max_{a \in \mathcal{A}} Q(s_{t+1}, a).$$
This process continues until $s_{t+1}$ reaches its final or terminal state.
Owing to the lack of model information (the transition probability of the Bitcoin price), we adopt Q-learning as the RL approach for our Bitcoin price prediction problem.
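The three basic steps of Q-learning described above can be sketched as a minimal tabular implementation. The function names, the dictionary-based Q-table initialized to zero, and the exploration rate `epsilon` are hypothetical choices for illustration; the update follows the standard weighted-average form with learning rate `theta` and discount factor `gamma`.

```python
import random
from collections import defaultdict

def make_q_table():
    """Q-table as a dictionary: Q[(state, action)] -> value, initialized to 0."""
    return defaultdict(float)

def choose_action(q, state, actions, epsilon, rng=random):
    """Step (2): take a random action with probability epsilon, else the best known one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])

def q_update(q, state, action, reward, next_state, actions, theta=0.1, gamma=0.95):
    """Step (3): blend the old value with the reward plus the discounted best next value."""
    best_next = max(q[(next_state, a)] for a in actions)
    old = q[(state, action)]
    q[(state, action)] = (1 - theta) * old + theta * (reward + gamma * best_next)
    return q[(state, action)]
```

For example, with `theta=0.5`, `gamma=0.9`, a reward of 1.0, an old value of 0, and a best next value of 2.0, the updated entry is 0.5 × 0 + 0.5 × (1.0 + 0.9 × 2.0) = 1.4.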

B. Bitcoin Price Learning
In this prediction problem, an agent interacts with the environment, which is the Bitcoin market, and learns how to predict future prices based on Q-learning. For this purpose, we define a tuple $\langle \mathcal{S}, \mathcal{A}, r \rangle$ as follows:
• State Space $\mathcal{S}$. The state of the agent at time $t$ is $s_t := (AP_t, TS_t) \in \mathcal{S}$, where $AP_t$ is the actual Bitcoin price and $TS_t$ is the tweet sentiment score at time $t$. The Bitcoin price is usually expressed with two decimal places (e.g., 21,254.50 USD), and we treat the tweet sentiment score as a discrete value by rounding it to two decimal places. Hence, the considered state space is also discrete.
• Action Space $\mathcal{A}$. The action $a_t \in \mathcal{A}$ of the agent at time $t$ is defined as a prediction of the Bitcoin price. However, to reduce the number of actions, the agent selects the percentage by which the current price increases, decreases, or remains unchanged. That is, the action space is the rate of the price change as a percentage, which is expressed by $\mathcal{A} := \{-1000, -999, \ldots, 0, \ldots, 999, 1000\}$. For example, if the agent selects 50, it predicts the next price by increasing the current price by 50% of the current price; that is, the predicted price is $1.5 \times AP_{t-1}$.
We also denote this action by the predicted price $PP_t$ at time $t$.
• Reward Function $r$. For the prediction of the actual Bitcoin price, we consider the following three reward functions: (1) simple difference reward (SDR), (2) relative difference reward (RDR), and (3) comparative difference reward (CDR). A detailed description of each function is as follows:
(i) SDR. This reward function is simply based on the difference between the actual price ($AP_t$) and the predicted price ($PP_t$). Because the model should receive a higher reward for a smaller difference, it receives only non-positive rewards, with the highest possible reward $r_t = 0$ when $AP_t$ and $PP_t$ are the same. Formally, the SDR is defined by:
$$r_t = -\left| AP_t - PP_t \right|.$$
(ii) RDR. It is based on the relative difference between $AP_t$ and $PP_t$, which is formally defined by:
$$r_t = -\frac{\left| AP_t - PP_t \right|}{AP_t},$$
where $AP_t > 0$. Therefore, $r_t \in (-\infty, 0]$, where $r_t = 0$ means a perfect fit of $PP_t$ to $AP_t$.
(iii) CDR. In predicting the actual price of Bitcoin, how much the price has increased or decreased compared with the previous step is important information. In the third reward function, we consider this additional information about the rate of change. To formally describe this, we first introduce the concept of a zero-reward value as follows.
Let $\alpha$ denote the rate of change of the actual price, and let $l = AP_t - PP_{t-1}(1 + \alpha) > 0$. We call a point whose distance from $AP_t$ is $l$ a zero-value reward (ZR) point.
There are two such zero-reward values, as shown in Figure 4: one lies $l$ below $AP_t$ and the other lies $l$ above it. We denote the former by $ZR^1_t$ and the latter by $ZR^2_t$, respectively. Then, $ZR^1_t$ is computed by (see Figure 4) $ZR^1_t = AP_t - l = PP_{t-1}(1 + \alpha)$, and $ZR^2_t$ is computed by $ZR^2_t = PP_{t-1} + (PP_{t-1} \cdot \alpha + 2l)$.
From the two zero-reward points, we compute the reward value based on whether $PP_t$ is higher or lower than $AP_t$. The formula for computing the reward value differs according to the value of $PP_t$. The possible cases of $PP_t$ and the corresponding formulas are explained as follows.
(a) The case where $PP_t$ is smaller than the actual price ($PP_t < AP_t$). Because $AP_t$ lies in the middle of the positive-reward interval, the agent receives a negative reward if $PP_t < ZR^1_t$ and a positive reward if $PP_t$ lies between $ZR^1_t$ and $AP_t$ ($ZR^1_t < PP_t \le AP_t$). The value of the reward is calculated as follows:
(b) The case where the predicted price is higher than $AP_t$ ($PP_t > AP_t$). As described above, when $PP_t$ is higher than $AP_t$, the model computes the $ZR^2_t$ value to decide whether the reward is positive or negative. If $PP_t$ lies between $AP_t$ and $ZR^2_t$ ($AP_t \le PP_t < ZR^2_t$), the reward is positive. If $PP_t$ is higher than $ZR^2_t$, the reward is negative. This is computed by:
In all cases, the closer the predicted price comes to the actual price, the higher the reward value becomes. The second reward function, namely the RDR, also varies according to the current Bitcoin price. That is, even if the difference between the predicted and actual prices is the same, the reward value is higher when the current price is larger.
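The action-to-price mapping and the reward definitions above can be sketched as follows. Several points are assumptions of this sketch: the SDR and RDR are taken to be the negative absolute difference and its ratio to the actual price, the rate of change `alpha` is passed in as a given number, and for the CDR only the sign of the reward implied by the two zero-reward points is computed, not its magnitude. All function names are hypothetical.

```python
def predicted_price(prev_actual_price: float, action_percent: int) -> float:
    """Turn a percentage action (e.g. 50 for +50%) into a predicted price."""
    return prev_actual_price * (1 + action_percent / 100)

def sdr(actual: float, predicted: float) -> float:
    """Simple difference reward (assumed form): 0 at a perfect fit, negative otherwise."""
    return -abs(actual - predicted)

def rdr(actual: float, predicted: float) -> float:
    """Relative difference reward (assumed form); requires a positive actual price."""
    if actual <= 0:
        raise ValueError("actual price must be positive")
    return -abs(actual - predicted) / actual

def zero_reward_points(actual: float, prev_predicted: float, alpha: float):
    """Return (ZR1, ZR2), the points at distance l below and above the actual price."""
    l = actual - prev_predicted * (1 + alpha)
    if l <= 0:
        raise ValueError("l must be positive")
    zr1 = actual - l                                       # equals prev_predicted * (1 + alpha)
    zr2 = prev_predicted + (prev_predicted * alpha + 2 * l)
    return zr1, zr2

def cdr_sign(actual: float, predicted: float, prev_predicted: float, alpha: float) -> int:
    """Sign of the CDR: +1 strictly between the zero-reward points, -1 outside, 0 on them."""
    zr1, zr2 = zero_reward_points(actual, prev_predicted, alpha)
    if zr1 < predicted < zr2:
        return 1
    if predicted < zr1 or predicted > zr2:
        return -1
    return 0
```

For instance, with an actual price of 130, a previous prediction of 100, and `alpha = 0.25`, we get l = 5, so the zero-reward points are 125 and 135, and any prediction strictly between them earns a positive CDR.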
Based on the defined reward functions and the preprocessed tweet data, the agent learns the actual Bitcoin price and attempts to make a prediction by repeating the following steps:
• The agent starts in a state $s_1 = (AP_1, TS_1)$ (the actual price of Bitcoin and the sentiment score), takes an action $a_1$ (a number between −1000 and 1000, which is applied as a percentage change to the actual price), and receives a reward $r$ (computed using one of the SDR, RDR, and CDR reward functions).
• The agent chooses an action by referring to the highest value in the Q-table.
• The agent updates the Q-values.
As the Q-values are updated and the agent chooses the action with the maximum value in the table, the agent's performance improves. The model with the above parameters is tested using four different datasets to experimentally verify the predictive power of the tweet attributes. The experiment is presented in Section V.
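The working steps above can be put together in a toy training loop. This is a sketch under several assumptions and not the authors' implementation: the action range is reduced to −10..10 percent to keep the example small (the text uses −1000..1000), the reward is an SDR-style negative absolute difference, the exploration rate, learning rate, and episode count are hypothetical, and the three-point price series is made up.

```python
import random
from collections import defaultdict

# Reduced action range for illustration only.
ACTIONS = list(range(-10, 11))

def train(prices, sentiments, episodes=200, theta=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning over states (actual price, sentiment score)."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        for t in range(len(prices) - 1):
            state = (prices[t], sentiments[t])
            # Explore with probability epsilon, otherwise use the best Q-table entry.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            predicted = prices[t] * (1 + action / 100)
            reward = -abs(prices[t + 1] - predicted)   # SDR-style reward (assumed form)
            next_state = (prices[t + 1], sentiments[t + 1])
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] = (1 - theta) * q[(state, action)] + theta * (reward + gamma * best_next)
    return q

def greedy_prediction(q, price, sentiment):
    """Predict the next price with the highest-valued action for the given state."""
    state = (price, sentiment)
    action = max(ACTIONS, key=lambda a: q[(state, a)])
    return price * (1 + action / 100)
```

Because all rewards are non-positive and the table starts at zero, unvisited actions look attractive at first, so the greedy choice itself tries every action before settling on the one with the smallest error.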

V. EXPERIMENT AND RESULTS
In this section, we present three different experimental results in order to determine the best reward function, the tweet attribute that has the most influence on the price, and the computer resource workloads of both the classic and the proposed approaches. For this, we use Python to create the experimental environment and the Pandas library for data preprocessing. Sentiment analysis is performed using the VADER analyzer tool, and TensorFlow and Keras are used for training and testing, respectively. To monitor and analyze the usage of computer resources (CPU, RAM, and memory), we use a standard Windows 10 tool called Performance Monitor [41]. It is useful because it allows the user to customize what data to collect, when the collection begins, how long the analysis needs to run, etc.
Training with Q-Learning. In the model training, we use a dataset of tweets posted between April 1, 2014, and June 30, 2017. The training process yielded promising results when the first part of the divided dataset was used to feed the model. We use γ = 0.95 as the discount factor because this value provided the best performance during the experiment.

A. Performance Measures
To evaluate the performance of our model with each reward function, we use six metrics chosen from a wide range of evaluation metrics, as they are the most suitable for the prediction task and provide a valuable evaluation. We briefly describe them as follows.
(i) Variance Accounted For (VAF). VAF [42] is used to verify the correctness of a model by comparing the real output with the predicted output. VAF values close to 100% indicate a highly accurate prediction. With the actual price denoted by $AP$ and the predicted price by $PP$, VAF is given by
$$\mathrm{VAF} = \left(1 - \frac{\mathrm{var}(AP - PP)}{\mathrm{var}(AP)}\right) \times 100\%,$$
where $\mathrm{var}(x)$ is the variance of $x$, which is computed as
$$\mathrm{var}(x) = \frac{1}{n}\sum_{t=1}^{n} (x_t - \bar{x})^2.$$
Here, $x_t$ is the value of $x$ at time $t$ and $\bar{x}$ is the average value of $x_t$ over $1 \le t \le n$. In our experiment, we set $n = 1690$ for all performance metrics.
(ii) Coefficient of Determination ($R^2$). $R^2$ is used to evaluate the forecast outputs and provides a measure of how well the observed outcomes are replicated by the model [43]. Formally, it is computed by
$$R^2 = 1 - \frac{RSS}{TSS},$$
where $RSS$ is the residual sum of squares, $RSS = \sum_{t=1}^{n} (AP_t - PP_t)^2$, and $TSS$ is the total sum of squares, $TSS = \sum_{t=1}^{n} (AP_t - \overline{AP})^2$. Here, $AP_t$ and $PP_t$ denote the actual price and the predicted price of Bitcoin at time $t$, respectively, and $\overline{AP}$ is the average value of $AP_t$ over $1 \le t \le n$. Hence, the range of $R^2$ is $[0, 1]$, where 1 indicates a perfect match between the predicted data and the actual data.

(iii) Mean Absolute Percentage Error (MAPE).
Like the aforementioned metrics, MAPE measures prediction accuracy; unlike them, it is commonly used as a loss function in model evaluation because of its highly intuitive interpretation in terms of relative error [46]. It is computed by
$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{t=1}^{n} \left|\frac{AP_t - PP_t}{AP_t}\right|.$$

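(iv) Nash-Sutcliffe model efficiency coefficient (NSE).
The fourth evaluation metric that we consider is NSE, which is used to assess the predictive skill of models [47]. The following formula is used to calculate the NSE value of the model prediction:
$$\mathrm{NSE} = 1 - \frac{\sum_{t=1}^{n} (AP_t - PP_t)^2}{\sum_{t=1}^{n} (AP_t - \overline{AP})^2},$$
where $\overline{AP}$ is the average of the actual prices $AP_t$. Hence, the NSE becomes one in the case of a perfect prediction.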
(v) Root-mean-square error (RMSE). This evaluation metric is frequently used to measure the difference between the values predicted by the model and the observed values [48]. Formally, it is computed by
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} (AP_t - PP_t)^2}.$$
Hence, the RMSE value is always non-negative, and a lower RMSE indicates a more accurate prediction than a higher RMSE.

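(vi) Weighted Mean Absolute Percentage Error (WMAPE).
WMAPE is a variant of MAPE in which errors are weighted by the actual values [49]. Its advantage over MAPE is that it overcomes the "infinite error" issue [50], which arises when an actual value is zero or close to zero. The metric is defined by
$$\mathrm{WMAPE} = \frac{\sum_{t=1}^{n} |AP_t - PP_t|}{\sum_{t=1}^{n} |AP_t|} \times 100\%.$$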
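For readers who want to reproduce the evaluation, the six metrics can be sketched in plain Python as follows. The sketch assumes the standard textbook definitions of VAF, $R^2$, MAPE, NSE, RMSE, and WMAPE (with MAPE and WMAPE expressed in percent and the variance normalized by $n$); the function names and the toy price lists are hypothetical and are not taken from the paper's code.

```python
import math


def _mean(xs):
    return sum(xs) / len(xs)


def _var(xs):
    """Population variance (normalized by n)."""
    m = _mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def vaf(ap, pp):
    """Variance Accounted For, in percent."""
    residuals = [a - p for a, p in zip(ap, pp)]
    return (1.0 - _var(residuals) / _var(ap)) * 100.0


def r2(ap, pp):
    """Coefficient of determination: 1 - RSS / TSS."""
    m = _mean(ap)
    rss = sum((a - p) ** 2 for a, p in zip(ap, pp))
    tss = sum((a - m) ** 2 for a in ap)
    return 1.0 - rss / tss


def mape(ap, pp):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 / len(ap) * sum(abs((a - p) / a) for a, p in zip(ap, pp))


def nse(ap, pp):
    """Nash-Sutcliffe efficiency: 1 - squared errors / squared deviations from the mean."""
    m = _mean(ap)
    num = sum((a - p) ** 2 for a, p in zip(ap, pp))
    den = sum((a - m) ** 2 for a in ap)
    return 1.0 - num / den


def rmse(ap, pp):
    """Root-mean-square error, in the same unit as the prices."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(ap, pp)) / len(ap))


def wmape(ap, pp):
    """Weighted MAPE, in percent: total absolute error over total absolute actuals."""
    return 100.0 * sum(abs(a - p) for a, p in zip(ap, pp)) / sum(abs(a) for a in ap)


# Toy data (hypothetical): actual and predicted prices.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
scores = {f.__name__: f(actual, predicted) for f in (vaf, r2, mape, nse, rmse, wmape)}
print(scores)
```

For this toy input, VAF is about 99.2%, $R^2$ and NSE are both about 0.992, RMSE is 10, and WMAPE is 4%. With these definitions NSE and $R^2$ share the same formula, so they coincide on any input.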

B. Results for each reward function with four attributes
As the first experimental result, we show the prediction performance for the three reward functions in order to determine the most useful tweet attribute for predicting the price. For this, we use a dataset of tweets posted between July 1, 2017, and November 14, 2018. We obtain the prediction results based on four attributes: most commented, most liked, most retweeted, and the number of poster followers.
Tweet Attribute Classification. First, in Figure 5, we see that tweets posted by users with the most followers and tweets with the most comments exhibit the best prediction results for all of the SDR, RDR, and CDR reward functions. However, the prediction with CDR is better than that with SDR and RDR because CDR provides a reward by comparing the current action $a_t$ with the previous action $a_{t-1}$. Comparing each action with the previous one makes it possible to compare all actions relative to each other, which boosts the learning process. The result suggests that tweets from users with the most followers are likely to go viral, catch the public's attention, and have some influence on future events. Moreover, the results of the experiment show a ranking among the attributes based on their predictive power. For all three reward functions, the dataset sorted by the number of user followers yields the most accurate prediction. Next, the dataset created from tweets with the most comments yields a more accurate forecast than the remaining two datasets. As the prediction results in Figure 5 show, the most retweeted attribute comes in third place, whereas the most liked attribute is in last place.
Performance Evaluations. To assess the prediction performance, we compute the six evaluation metrics for each reward function and each attribute; the results are listed in Table VI. First, we see that, in the case of CDR, the VAF values show the most accurate prediction compared with SDR and RDR. Further, the attribute given by the number of the poster's followers has the highest prediction performance, as we expected.
In contrast to VAF, $R^2$ takes values in the range $[0, 1]$, where 1 indicates an ideal prediction. With this definition in mind, comparing the $R^2$ values of each reward function shows that the model achieves a more precise prediction with CDR, which reaches a maximum of 0.8, than with SDR and RDR, which reach 0.63 and 0.25, respectively. The maximum $R^2$ values are achieved with the dataset that consists of tweets from the posters with the highest number of followers.
Evaluating the three prediction outputs with MAPE gives the level of error in the predictions, so a lower MAPE value indicates higher accuracy. The MAPE values are also consistent with the superiority of CDR over the SDR and RDR functions. For example, when the model uses SDR, the first attribute has a value of 13.919, which is lower than the values of the second, third, and fourth attributes (19.257, 22.833, and 26.785, respectively). With RDR, the model has the lowest prediction quality: the MAPE value of the first attribute increases to 17.690, although it is still better than those of the remaining attributes.
For the NSE metric, we observe results similar to those for the $R^2$ metric. Because the performance values are quite similar, we refrain from repeating the analysis of the reward functions' preferability and the ranking of the attributes.
Because RMSE is based on errors, a low RMSE value indicates a more accurate prediction than a high one. When the model uses SDR, the follower attribute has the lowest value among all attributes. The model has the poorest prediction quality with RDR; in this case, the RMSE value of the follower attribute increases to 2789.5, but it is still better than those of the remaining attributes. As we expected, RMSE also shows the best prediction when the model uses CDR as the reward function.
WMAPE is the last evaluation metric used in this study. Because it is a variant of MAPE, a smaller WMAPE value indicates a more accurate prediction. The WMAPE values indicate that CDR gives the most accurate forecast when compared to SDR and RDR. For example, the first attribute has a WMAPE value of 10.7 with CDR, whereas it has values of 16.9 and 23.1 with SDR and RDR, respectively.
Performance Comparisons. To assess how good the model's performance is, we compare the accuracy of our prediction with that of similar studies that used different approaches to achieve an accurate prediction. However, several problems prevent a fair comparison: the types of data and their time periods differ across studies; the model design and its implementation are not explained in detail in some studies; the metrics used to evaluate model performance vary; and it is difficult to gather all source codes and run them in the same PC environment. Therefore, in Table VII, we briefly compare the results of previous relevant work with our proposed method. Most of the references use Twitter as the main data source to obtain Bitcoin price predictions, and yet only a few of them analyze the PC resource usage level. In Table VII, we use the following terms:
(i) Non-filtered: BTC historical price data is used as the main dataset in its entirety, without being filtered by any condition.
(ii) Non-attribute filtered: Bitcoin-related tweets are used as the main dataset in their entirety, without being filtered by any Twitter attribute.
(iii) Attribute-filtered: Bitcoin-related tweets are used as the main dataset after being filtered by the "number of followers" attribute.
We see that the result of Ye et al. [22] shows the highest accuracy level, 88.74%, but resource usage was not considered. Our proposed Q-learning model, which uses only Bitcoin-related tweets posted by the users with the most followers, takes the PC resource usage level into account while obtaining 84.81% accuracy, which exceeds the results of most previous studies.

C. Results with computer resource usage
In this subsection, we will describe the comparison results between our proposed approach and the classic approach. These two are explained as follows.
(a) Proposed approach: In the proposed approach, we obtain the Bitcoin-related tweets only from those who have the most followers, i.e., we use the data with attribute-filtering.
(b) Classic approach: In this approach, we obtain all Bitcoin-related tweets, i.e., we use all of the data without attribute-filtering.
... than in the classic approach, it achieves a 36.2% accuracy, whereas the accuracy of the classic approach is 23.8%, which is 13% less accurate than the proposed approach.

(ii) Second experiment result (Fixed target accuracy).
Finally, we perform the same experiment with a fixed target accuracy of Bitcoin price prediction. We set the target accuracy level to VAF = 84.81%, because we observed that the model with the proposed approach achieved this level of accuracy during the first experiment. We run both approaches until they reach the target accuracy level and compare their resource usage. The results are shown in Figure 6 and Table IX. First, we see that there is a significant difference in CPU usage in this experiment. In the classic approach, the CPU workload is between 46.1% and 85.6%, with an average of 61.8%. The proposed approach shows a minimum of 4.3%, an average of 7.7%, and a maximum of 16.6%, which is almost 9 times less than the classic approach. For RAM usage, there is no significant difference, whereas the average memory usage of the proposed approach is better than that of the classic one. Finally, the classic approach runs for 21 hours 36 minutes 39 seconds to achieve the target accuracy, which is about five times longer than the 4 hours 17 minutes 14 seconds required by the proposed model to achieve the same level. From the experimental results, we conclude that the proposed approach has significant advantages over the classic approach. Using the tweets of the posters with the highest number of followers can lead to accurate prediction and prevent the computer from wasting its resources.
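The time and CPU comparisons can be checked with a few lines of arithmetic. The sketch below uses only the run times and the average CPU workloads quoted in the text; the variable and function names are illustrative, and the values in Table IX are not used here.

```python
def to_seconds(hours, minutes, seconds):
    """Convert a duration given as h/m/s into seconds."""
    return hours * 3600 + minutes * 60 + seconds


# Run times quoted in the text for reaching VAF = 84.81%.
classic_s = to_seconds(21, 36, 39)
proposed_s = to_seconds(4, 17, 14)

time_ratio = classic_s / proposed_s                    # how many times longer the classic run is
time_saving_pct = (1 - proposed_s / classic_s) * 100   # relative time saved, in percent

# Average CPU workloads quoted in the text, in percent.
cpu_classic_avg = 61.8
cpu_proposed_avg = 7.7

cpu_ratio = cpu_classic_avg / cpu_proposed_avg
cpu_saving_pct = (1 - cpu_proposed_avg / cpu_classic_avg) * 100

print(f"time:    x{time_ratio:.2f}, {time_saving_pct:.1f}% less")
print(f"CPU avg: x{cpu_ratio:.2f}, {cpu_saving_pct:.1f}% less")
```

With these inputs, the classic run takes about 5.04 times as long as the proposed one (about 80.2% less time for the proposed approach), and the quoted CPU averages differ by a factor of about 8.03 (about 87.5% less average CPU workload).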

VI. DISCUSSION
In this work, we have shown that, in Bitcoin price prediction from Twitter data, extracting and using only the data suitable for price prediction is better for both prediction performance and resource efficiency than using all of the data. In particular, even when only the tweets of the users with the most followers were used, the results were much better than those of the classic approach using all data. Furthermore, the model contributes to the literature on tweet sentiment studies and price prediction using reinforcement learning and provides reliable guidance for further in-depth analysis.
However, the approach considered in this paper has some limitations. First, we used only Twitter data to analyze people's feelings, which may be biased since not all crypto-traders express their opinions on Twitter. We recognize that Bitcoin values are affected by a variety of variables that cannot be captured through Twitter sentiment alone. Tweets together with other social media (e.g., Reddit and Facebook), as well as news and other sources including photos and videos from YouTube or TV channels, may be used to extract real-world feelings. Second, we analyzed the price prediction of Bitcoin by considering only four attributes of Twitter. Additional comparison results can be obtained by considering other attributes such as the tweet language and the tweet poster's location. In the case of tweet language, most of the data is expressed in one language (e.g., English), so it will not significantly affect the price prediction. However, it may be interesting to see how the tweet poster's location affects the Bitcoin price and the prediction performance. Third, the algorithm of the predictive model can be extended to deep reinforcement learning algorithms. This has the advantage that the Q-function used in Q-learning can be approximated more accurately with deep learning, so it is expected to help improve prediction performance. Finally, considering other sources of sentiment data and other types of cryptocurrencies could also increase the accuracy of predictions. We leave all of these directions for future research.

VII. CONCLUSION
In this paper, we considered Bitcoin price prediction based on Q-learning using tweet data. We analyzed the manner in which Bitcoin-related information on Twitter affects the actual Bitcoin price by considering four main attributes: the number of followers of the poster, the number of comments on tweets, the number of likes, and the number of retweets. We predicted the actual Bitcoin price using a Q-learning method and identified the most valuable attributes with three reward functions. We verified that tweets posted by users with the most followers had the greatest effect on the future Bitcoin price. Next, we compared our approach with a classic approach, in which all Bitcoin-related tweets are used as input data for the model without attribute-filtering, by analyzing the CPU workloads, RAM usage, memory, time, and prediction accuracy. We conclude that the proposed approach has significant advantages over the classic approach.