
Anchor-Based Explainable and Causal Artificial Intelligence for Enhancing Financial Predictions of Future Earnings



Abstract:

Accurate prediction of future earnings is crucial for stakeholders. However, existing machine learning models often operate as “black boxes,” offering high accuracy but minimal interpretability. Prior approaches focus on correlational patterns without establishing genuine causal relationships or providing straightforward rule-based explanations. This lack of transparency and causal insight limits the actionable value of current financial prediction models. We propose an anchor-based explainable and causal AI framework for earnings prediction. It integrates an optimized XGBoost classifier (with RENN undersampling to address class imbalance) for high-performance prediction, the Anchor XAI method to generate human-readable “if-then” rules explaining model decisions, and the DoWhy causal inference tool to validate genuine cause-and-effect factors in the financial data. The optimized XGBoost with the RENN model achieved an overall accuracy of ~93.3%, with precision, recall, and F1-scores ranging from 93% to 94%, outperforming other classifiers. Key features such as Inventory/Total Assets, %$\Delta$ Net Profit Margin, and Cash Dividends/Cash Flows emerged as the most influential factors. Coordinated adjustments in these variables yielded significantly better predictive outcomes than isolated changes. Furthermore, DoWhy-based analysis confirms that improvements in these factors causally drive earnings growth, as verified by robustness checks like placebo tests. The proposed framework effectively bridges the gap between predictive accuracy and interpretability. It provides financial decision-makers with reliable earnings predictions and transparent, actionable insights for strategic planning and management, making the predictive model trustworthy and informative.
Published in: IEEE Access (Volume: 13)
Page(s): 61026 - 61047
Date of Publication: 02 April 2025
Electronic ISSN: 2169-3536

SECTION I.

Introduction

Predicting future earnings is crucial for various stakeholders in the financial ecosystem. Financial analysts rely on these forecasts to provide well-informed investment recommendations to their clients [1], [2]. Investors, in turn, utilize these projections to make sound investment choices, focusing on companies with a track record of consistent and robust financial performance. Creditors also have a vested interest in understanding a company’s future earnings potential, as it directly impacts the likelihood of the business fulfilling its contractual obligations and repaying its debts. Moreover, accurate earnings forecasts are invaluable for an organization’s management team. By anticipating future financial performance, executives can proactively plan and adjust their strategies to ensure the company’s growth, development, and attainment of its objectives. A reliable model for predicting future earnings would be a powerful tool for formulating effective strategies and guiding organizational planning. Given the widespread demand for accurate earnings predictions among financial analysts, creditors, investors, and corporate executives, there is a clear need for robust forecasting models. These models can leverage various underlying data points, such as cash flow, profit, or other relevant accounting components, to generate reliable predictions of an organization’s future financial performance [3].

Statistical methods play a significant role in predicting earnings in research. Ball and Watts [4] investigated the time-series properties of accounting income by applying random walk and time-series models to predict future earnings. Their study aimed to determine whether accounting income could be characterized as a random walk or a more complex process. By analyzing the earnings data of individual firms over time, they found that the random walk model provided a reasonable approximation of the behavior of accounting income. However, some deviations were observed [4]. Freeman et al. [5] focused on forecasting the direction of earnings change in the subsequent period. They acknowledge the robustness of the random-walk model in terms of historical earnings data but argue that incorporating additional predictor variables beyond past earnings can enhance the accuracy of predictions. To illustrate this point, they extend the model by including a single additional variable—book value—in conjunction with earnings, effectively utilizing the book rate of return. Hou et al. [6] proposed a new approach to estimating the implied cost of capital (ICC) using earnings forecasts from a cross-sectional model instead of analysts’ forecasts. They showed that their model-based earnings forecasts had better coverage and lower forecast bias than analysts’ forecasts. The model-based forecasts were also a better proxy for the market’s expectations, as evidenced by their higher earnings response coefficient. They found that the ICC estimated using their model-based earnings forecasts was a more reliable proxy for expected returns than the ICC based on analysts’ forecasts. In 2012, Gerakos and Gramacy [7] conducted a thorough study of regression-based earnings forecasts, examining variable selection, estimation methods, estimation windows, and Winsorization. They found that forecasts generated using ordinary least squares and lagged net income were generally more accurate for scaled and unscaled net income. Azevedo et al. [8] proposed a method to forecast corporate earnings that combined the accuracy of analysts’ forecasts with the unbiasedness of a cross-sectional model. They built on recent insights from the earnings forecast literature to improve analysts’ forecasts by reducing their sluggishness concerning recent stock price movements and improving their long-term performance.

The exponential growth in the number of financial statements has rendered conventional statistical approaches inadequate for precise data classification. As a result, machine learning techniques have become imperative to efficiently process and analyze these extensive datasets, facilitating more accurate predictions of future earnings. Organizations can extract valuable insights from the vast financial data repositories by harnessing machine learning capabilities. This, in turn, empowers companies to make better-informed decisions and enhances the accuracy of their financial projections, ultimately contributing to improved strategic planning and risk management. Ishibashi et al. [9] applied various data mining techniques for variable selection to construct an earnings prediction model based on financial statement data. They compared the performance of different variable selection methods, including relief, correlation-based feature selection (CFS), consistency-based subset evaluation (CNS), C4.5 decision tree learner, and stepwise methods, across multiple datasets from different years. The authors found that variable selection methods generally improved prediction accuracy compared to a model using all variables. Random forests were used by Anand et al. [10] to make predictions about changes (increases or decreases) in five measures of profitability: return on equity (ROE), return on assets (ROA), return on net operating assets (RNOA), cash flow from operations (CFO), and free cash flow (FCF). With a minimum set of independent variables and out-of-sample testing, their method achieved classification accuracy ranging from 57–64% for the profitability measures, compared to 50% for the random walk. The authors also found that predictive accuracy was similar across forecast horizons of 1 to 5 years, with better performance on cash flow measures than on traditional, earnings-based profitability measures. Nguyen [11] compared the performance of a gradient-boosted regression tree (GBRT) model to analysts’ forecasts in predicting earnings. The GBRT model was trained on historical public financial data from 2013 to 2016, using predictors from existing earnings forecasting literature. The study found that the machine learning model could not outperform analysts’ predictions when comparing median absolute percentage error (MdAPE) on out-of-sample earnings from 2017 to 2019. Hunt et al. [12] extended the work of Ou and Penman [13] by applying machine learning techniques to predict the signs of future earnings changes. They compared the out-of-sample accuracy and trading strategy profitability of stepwise logit regression, elastic net, and random forest models. They also found that random forest significantly outperformed the other models in terms of prediction accuracy and generated the most profitable trading strategy. Chen et al. [3] utilized machine learning methods and high-dimensional, detailed financial data to predict the direction of one-year-ahead earnings changes. Their models demonstrated significant out-of-sample predictive power, with the area under the receiver operating characteristics curve ranging from 67.52% to 68.66%, substantially higher than the 50% of a random guess. Their models outperformed two conventional models that used logistic regressions, small sets of accounting variables, and professional analysts’ forecasts. Belesis et al. 
[14] investigated the forecasting of profitability direction by employing a range of machine learning techniques, broadening the scope of their research to include European markets and assessing the impact of profit mean reversion on the predictability of profitability. The authors discovered that basic algorithms such as Linear Discriminant Analysis (LDA) had the potential to surpass classification trees when the data underwent appropriate preprocessing. Furthermore, their findings revealed that measures based on cash flow, including Free Cash Flow and operating cash flow, demonstrated higher predictive accuracy than accrual-based measures like return on assets or equity. Hess et al. [15] predicted earnings for forecast horizons of up to five years using a fully flexible machine learning approach, utilizing the entire Compustat financial statement data set as input. Their approach yielded, on average, 13% more accurate one-year-ahead predictions than the best-performing traditional linear approach, with superior performance consistent across various evaluation metrics and translating into more profitable investment strategies. Extensive model interpretation revealed that income statement variables and, in particular, different definitions of earnings were the most important predictors.

The increasing demand for dependable AI systems has driven broader adoption of causal methods in machine learning (ML) research. As underscored in [16], comprehending causality is imperative for resolving the current deficiencies of machine learning. One example is the frequent use of black box models for socially impactful decisions, which requires transparency in their decision-making logic [17]. Most conventional machine learning methods concentrate on correlational patterns between variables rather than genuine causal relationships, which may result in inaccurate, biased, or potentially detrimental decisions.

Data science frequently encounters causal questions, particularly in marketing effectiveness, feature impact analysis, customer retention, and medical treatment efficacy. Traditional data analysis can disclose patterns and correlations; however, causal questions necessitate a more profound comprehension of the underlying processes and how the data was generated. Organizations frequently encounter difficulties in effectively implementing and interpreting causal analyses despite the existence of a variety of mathematical methodologies that have been developed in various domains to address causality. This has resulted in the creation of specialized frameworks, such as non-parametric structural equations and graphical models. Predictive modeling recognizes patterns between inputs and outputs in observed data. However, it is necessary to comprehend counterfactuals, a fundamental concept in causal inference, to make many real-world decisions. The challenge is the impossibility of observing both the intervention and non-intervention states simultaneously. This becomes particularly important when answering inquiries such as: What are the potential consequences of system modifications on user outcomes? What are the determinants of system outcomes? What systemic modifications could improve user experiences? How do human behavior and systems interact? What is the influence of system recommendations on user behavior? Or, more precisely, in the context of machine learning: “What is the impact of varying input features on the decision-making process of a classifier?” Additionally, “Did feature X directly cause outcome Y?” To address these inquiries, it is necessary to transition from correlation to causal reasoning [18].

Prior research in earnings prediction has demonstrated significant progress in utilizing various machine learning approaches to predict future corporate earnings. However, several critical gaps remain in the existing literature. First, while these studies have achieved considerable accuracy in predictions, they predominantly operate as “black box” models, lacking transparency and interpretability in their decision-making processes. Second, although previous research has identified various factors affecting earnings predictions, there has been limited exploration of the causal mechanisms linking these factors to earnings changes. This is particularly significant as understanding causality, not just correlation, is crucial for effective decision-making. A notable disconnect exists between predictive modeling and causal inference in the earnings prediction domain. Current studies have primarily focused on correlational patterns rather than establishing causal relationships, potentially limiting their practical utility for decision-makers. Furthermore, while recent research has incorporated explainable AI approaches, these efforts have mainly concentrated on feature importance without addressing the underlying causal structures that drive earnings changes.

To address these gaps in the literature, this study proposes four research questions:

  • RQ1:

    How can machine learning models effectively utilize financial statement data to predict future corporate earnings accurately?

  • RQ2:

    What factors or variables contribute to increasing and decreasing a company’s future earnings?

  • RQ3:

    How can explainable AI methods provide interpretable insights into the reasons behind these earnings changes?

  • RQ4:

    How can causal inference methods be integrated with machine learning to understand the cause-and-effect relationships between identified factors and future earnings for more informed decision-making?

Based on previous research and the research questions above, this study aims to significantly contribute to future earnings prediction by integrating explainable artificial intelligence (XAI) techniques with causal inference methods, explicitly utilizing the DoWhy framework. The study introduces a novel hybrid approach combining “Anchor Explanation” with causal analysis through DoWhy to improve the interpretability and causal understanding of the predictive model’s decisions. DoWhy, as a robust causal inference framework, provides a structured approach through potential outcomes and graphical models, allowing us to model assumptions and explicitly identify causal effects in earnings predictions. The study employs the Anchor Explanation method at the foundational level, which provides straightforward rules or conditions that easily explain the model’s predictions. This approach helps users comprehend how the model arrives at its predictions, facilitating a better understanding of the decision-making process. Integrating these explanations with causal analysis through DoWhy further enhances the model’s interpretability by revealing what factors are essential and why they matter.

The remainder of this paper is organized as follows: Section II provides the related works. Section III delves into the details of our proposed method. Section IV presents the experimental setup and results. Section V engages in discussion. Finally, Section VI summarizes the conclusions.

SECTION II.

Related Works

A. Future Earnings

Predicting future earnings from financial statement data means forecasting a company’s profitability based on information in its financial reports. This process uses financial statements to indicate whether the business will generate profits or incur losses in the future. Ou and Penman [13] developed a method to estimate the likelihood of earnings increasing or decreasing in the following fiscal year by analyzing descriptors found in the financial statements. Ishibashi et al. [9] introduced a binary indicator that represents a firm’s ability to generate earnings per share in the subsequent fiscal year relative to the current fiscal year, as expressed in equation (1):\begin{align*} {Pr}_{i}\left ({ t+1 }\right)=\begin{cases} 0 & \left ({e.p.s}_{i}\left ({ t+1 }\right)-{e.p.s}_{i}\left ({ t }\right)-{drift}_{i}\left ({ t+1 }\right)\lt 0\right) \\ 1 & \left ({e.p.s}_{i}\left ({ t+1 }\right)-{e.p.s}_{i}\left ({ t }\right)-{drift}_{i}\left ({ t+1 }\right)\gt 0\right), \end{cases} \tag {1}\end{align*} The term ${drift}_{i}\left ({ t+1 }\right)$ represents the average change in earnings per share (EPS) over the last four years leading up to fiscal year $\left ({ t+1 }\right)$. ${Pr}_{i}\left ({ t+1 }\right)=1$ denotes earnings growth for the company, whereas ${Pr}_{i}\left ({ t+1 }\right)=0$ signifies a decline in earnings.
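To make equation (1) concrete, the following minimal Python sketch computes the label from an annual EPS series for a single firm; the EPS figures and the rolling-window reading of the drift term are illustrative assumptions, not data or code from the study.

```python
import pandas as pd

def earnings_label(eps: pd.Series) -> pd.Series:
    """Compute Pr_i(t+1) from an annual EPS series indexed by fiscal year.

    Label = 1 when the EPS change from t to t+1 exceeds the drift term
    (read here as the mean EPS change over the preceding four years),
    and 0 otherwise, following equation (1). Early years without enough
    history yield 0 because the drift is undefined (NaN).
    """
    change = eps.diff()                                # e.p.s(t+1) - e.p.s(t)
    drift = change.rolling(window=4).mean().shift(1)   # mean change over the prior four years
    return (change - drift > 0).astype(int)

# Hypothetical EPS figures for one firm, fiscal years 2013-2022
eps = pd.Series([1.20, 1.35, 1.10, 1.42, 1.60, 1.55, 1.72, 1.90, 1.83, 2.10],
                index=range(2013, 2023))
print(earnings_label(eps))
```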

B. Explainable Artificial Intelligence (XAI)

The development of Explainable AI (XAI) techniques aims to bring transparency and clarity to the decision-making processes of AI systems, enabling them to articulate their strengths and limitations and provide insights into their anticipated behavior in future scenarios. The XAI approach explores and develops various methods, supplying AI developers with diverse design choices. These options enable developers to find the optimal balance between an AI system’s performance and its ability to explain its internal workings understandably. XAI provides a range of techniques to cater to diverse needs and preferences, optimizing the performance-explainability trade-off according to the unique requirements of each AI application [19], [20].

XAI can potentially transform the financial services industry by addressing financial transparency. Recent advances in AI models have shown their ability to enhance the precision, efficiency, and effectiveness of various financial services, such as liquidity balance prediction, credit score assignment, and investment portfolio optimization. However, to prevent bias and ensure fair decision-making, it is essential for all stakeholders, including banks, investors, employees, and customers, to understand the logic and rationale behind these AI models. XAI techniques provide clarity and transparency, allowing stakeholders to gain insights into AI decision-making and understand how the models reach their conclusions. XAI demystifies AI in financial services, fostering trust, promoting accountability, and ensuring the unbiased and equitable harnessing of AI’s benefits [21]. Medianovskyi et al. [22] introduced an XAI method to predict financial crises in Small and Medium Enterprises (SMEs) using regression models like Gradient Boosting (XGBoost and Catboost), Random Forests, Logistic Regression, and Artificial Neural Networks. They applied Shapley’s Additive Explanations (SHAP) framework to interpret and validate their predictions. The study found that the Catboost model best forecasted financial crises within SMEs. SMEs younger than 10 with a cash ratio below 20% have a higher risk of experiencing a financial crisis. The study also noted that specific combinations of features deemed ‘hazardous’ substantially impact the probability of a financial crisis. Aruleba and Sun [23] combined ensemble classifiers with SMOTE-ENN for data imbalance and SHAP for model interpretability in credit risk prediction. Their results showed XGBoost performed best on the German credit dataset (0.930 Recall, 0.846 Specificity), while Random Forest excelled on the Australian dataset (0.907 Recall, 0.922 Specificity). The SHAP integration provided transparency into feature importance for credit decisions. De Lange et al. [24] proposed an XAI model for predicting credit default on unsecured consumer loans using LightGBM with SHAP. The LightGBM model outperformed the bank’s logistic regression credit scoring model. The most important explanatory variables for predicting default were the volatility of the utilized credit balance, remaining credit as a percentage of total credit, and the duration of the customer relationship. Tran et al. [25] applied machine learning algorithms to predict the financial distress of listed companies in Vietnam from 2010 to 2021 and utilized SHAP values to interpret the results. Extreme gradient boosting and random forest models outperformed other models regarding recall, F1 scores, and AUC. According to Shapley’s values, long-term debts to equity, enterprise value to revenues, accounts payable to equity, and diluted EPS significantly influenced the model outputs. Nallakaruppan et al. [26] proposed an XAI framework based on machine learning models and techniques like LIME, SHAP, and partial dependence plots for credit risk assessment. They experimented with a loan prediction dataset and found that the random forest model achieved the highest accuracy, sensitivity, and specificity of 0.998, 0.998, and 0.997, respectively. The LIME and SHAP explainers provided explanations with local and global surrogates for various parameters on the features, helping customers understand the reasons behind loan approval or rejection and enabling banks to avoid legal issues and financial losses. 
It can be observed that research in the field of financial analysis has utilized SHAP and LIME XAI to explain the reasons behind predictions by presenting essential values. However, no study has yet employed Anchor XAI to explain the reasons for predictions in the form of if-then rules, which would make the explanations more straightforward to understand.

C. Shapley Additive Explanations (SHAP)

Lundberg and Lee [27] introduced SHAP, a unified framework that has emerged as a powerful tool for improving the interpretability of machine learning models. By quantifying the contribution of each input feature to the model’s output, SHAP values provide valuable insights into the model’s decision-making process. This technique allows users to understand how different features influence the model’s predictions, making the model more transparent and explainable. The Shapley value $\phi _{i}$ for a specific feature i is calculated by averaging the incremental contributions of feature i over all potential subsets S of the feature set N that exclude feature i, as shown in equation (2).\begin{align*} \phi _{i}= \frac {1}{n!}\sum \nolimits _{S\subseteq N\setminus \{i\}} {\left |{ S }\right |!\left ({ n-1-\left |{ S }\right | }\right)!\left [{ f\left ({ S\cup \left \{{ i }\right \} }\right)-f\left ({ S }\right) }\right ] } \tag {2}\end{align*} where S represents a subset of features, f refers to the predictive model, N is the complete set of features, and n is the number of features. For a specific observation x, the output of the prediction model is explained through a linear function g, as shown in equation (3).\begin{equation*} f\left ({ x }\right)=g\left ({ x^{\ast } }\right)=\phi _{0}+\sum \nolimits _{i=1}^{M} {\phi _{i}x_{i}^{\ast }} \tag {3}\end{equation*} where x represents the specific instance being explained, $x_{i}^{\ast }$ indicates the simplified version of the input, M refers to the total number of these simplified input features, and $\phi _{0}$ is the baseline value used when no input features are present.
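For tree ensembles such as XGBoost, the SHAP values in equations (2)-(3) are usually obtained with the shap library’s TreeExplainer rather than by enumerating feature subsets. The sketch below is illustrative only; model and X_test stand for a fitted classifier and a feature DataFrame, which are not defined here.

```python
import shap  # pip install shap

# model: a fitted XGBoost classifier; X_test: a pandas DataFrame of financial descriptors
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)   # one value per feature per instance

# Average SHAP value per feature: positive values push predictions toward
# class 1 (earnings increase), negative values toward class 0 (decrease).
mean_shap = shap_values.mean(axis=0)
ranked = sorted(zip(X_test.columns, mean_shap), key=lambda p: -abs(p[1]))
for name, value in ranked[:10]:
    print(f"{name}: {value:+.4f}")
```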

D. Anchor (High-Precision Model-Agnostic Explanations)

The Anchor method explains any black-box classification model’s predictions by identifying decision boundaries that sufficiently “anchor” the prediction. A rule that effectively ties a prediction to a specific local context within the examined instance, known as an anchor explanation, ensures that modifications to other feature values do not significantly impact the rule’s ability to explain the prediction. The Anchors technique relies on reinforcement learning methods and a graph search algorithm to minimize the number of model calls required during runtime and efficiently recover from local optima [28]. Anchor is an algorithm that employs a perturbation-based approach to generate local explanations in simple IF-THEN rules. This method creates scorable rules that accurately explain previously unseen instances as an alternative to LIME’s surrogate models. The coverage concept in Anchor ensures that these rules can precisely explain any potential unseen instances. The search for anchors is framed as an exploration (multi-armed bandit) problem and addressed with reinforcement learning techniques. Anchor analyzes the neighbors or perturbations of each instance under explanation, enabling the algorithm to function without considering the black-box model’s internal structure and parameters. As a result, the black box’s structure and parameters remain hidden and unchanged throughout the process. Anchor’s model-agnostic approach, which does not rely on the specific internal workings of the black box under explanation, makes it applicable to any class of models. By studying the perturbations of each instance, Anchor can generate local explanations in the form of simple and reusable IF-THEN rules, providing a clear and interpretable understanding of the model’s predictions [29]. Equation (4) provides the formal definition of an anchor.\begin{equation*} \mathbb {E}_{\mathcal {D}_{x}(z\vert A)}\left [{ 1_{\hat {f}\left ({ x }\right)=\hat {f}\left ({ z }\right)} }\right ]\ge \tau, \quad A\left ({ x }\right)=1 \tag {4}\end{equation*} where x represents the data point being examined, like a record in a dataset. A is a set of conditions or the generated rule/anchor, where $A\left ({ x }\right)=1$ holds true if the features specified by A align with the attributes of x. f denotes the model being explained, such as an artificial neural network, used to predict outcomes for x and its perturbations. ${\mathcal {D}}_{x}(\cdot \vert A)$ is the distribution of instances close to x that satisfy A. The parameter $0\le \tau \le 1$ sets a precision threshold, with rules needing to achieve a local fidelity at or above $\tau $ to be considered reliable explanations.

Precision in Anchor Explanations

Precision is a key metric in Anchor explanations, quantifying how reliable an anchor is in making the prediction invariant. Formally, the precision of an anchor A (for an instance x with model prediction $f(x)$) is the conditional probability that the model’s prediction remains the same for any perturbed instance z that satisfies the anchor. Mathematically, this is defined as $\mathrm {Precision}\left ({ A }\right)=P\left ({ f\left ({ z }\right)=f\left ({ x }\right)\mid A\left ({ z }\right)=1 }\right)$. This means that precision measures the probability that a random instance z, sampled from the vicinity of x, maintains the same predicted class as x, given that z fulfills the anchor conditions. A high precision (close to 1.0 or 100%) implies that whenever the anchor conditions hold, the model almost always outputs the same prediction as it did for the original instance, ensuring a stable, trustworthy rule. By design, Anchor algorithms prioritize high-precision rules, often imposing a minimum precision threshold (e.g., $\tau =0.95$) that an anchor must achieve to be considered valid. This ensures that the explanation remains consistent and reliable across multiple perturbed instances.
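As a self-contained illustration of this precision definition (not the authors’ implementation), the quantity can be estimated by Monte Carlo sampling: draw perturbations around x, keep those satisfying the anchor, and count how often the prediction is unchanged. The sampler and rule predicate below are hypothetical placeholders.

```python
import numpy as np

def estimate_precision(model, x, anchor_holds, sample_neighbors, n_samples=1000):
    """Monte Carlo estimate of Precision(A) = P(f(z) = f(x) | A(z) = 1).

    anchor_holds(z)        -> bool: True when instance z satisfies the anchor rule A.
    sample_neighbors(x, n) -> (n, m) array of perturbed instances drawn around x.
    """
    target = model.predict(x.reshape(1, -1))[0]
    z = sample_neighbors(x, n_samples)
    mask = np.array([anchor_holds(row) for row in z])   # keep only instances where A holds
    if not mask.any():
        return float("nan")
    preds = model.predict(z[mask])
    return float(np.mean(preds == target))
```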

E. DoWhy Causal Inference

Causal inference is a methodological approach for understanding the impact of interventions by determining cause-and-effect relationships. A classic example is studying how medications affect disease outcomes. Among various causal inference methodologies, DoWhy is a comprehensive open-source Python library that implements powerful frameworks, including potential outcomes and graphical models founded on modeling assumptions and causal effect identification [30].

The DoWhy framework implements causal inference through four essential stages (a brief code sketch follows the list):

  1. First, the Model stage involves constructing a causal graphical model following the methodology of [16]. This model explicitly outlines causal assumptions, though it need not be exhaustive. When working with partial graphs representing known variable relationships, DoWhy automatically considers remaining variables as potential confounders.

  2. Second, the Identification stage determines viable approaches for measuring desired causal effects based on the graphical model. Using graph-based criteria and do-calculus, DoWhy identifies expressions that can quantify causal effects through various methods, including back-door criterion, front-door criterion, instrumental variables, and mediation analysis.

  3. Third, the Estimation stage implements various methodologies to calculate causal effects, providing non-parametric confidence intervals and permutation tests for statistical significance. The estimation approaches include:

    • Treatment assignment-based methods (e.g., propensity-based stratification, matching, and inverse weighting)

    • Outcome model-based methods (e.g., linear regression and generalized linear models)

    • Instrumental variables-based methods (e.g., binary instrument estimation and two-stage least squares)

    • Mediation analysis methods using two-stage linear regression

  4. Finally, the Refutation stage validates the estimated effects through multiple testing methods. These include:

    • Adding random common causes to verify estimate stability

    • Using placebo treatments to confirm null effects (Hint: the effect should go to zero)

    • Testing with dummy outcomes to validate that the estimated effect disappears (Hint: the effect should go to zero)

    • Employing simulated outcomes to match known data-generating processes

    • Testing sensitivity to unobserved confounders

    • Validating across data subsets

    • Performing bootstrap validation
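A minimal sketch of these four stages with DoWhy’s Python API is shown below; the treatment, outcome, and confounder column names are illustrative placeholders rather than the configuration used in this study.

```python
from dowhy import CausalModel

# df: a pandas DataFrame with treatment, outcome, and confounder columns (names assumed)
model = CausalModel(data=df, treatment="treatment_var", outcome="earning",
                    common_causes=["confounder_1", "confounder_2"])   # 1) Model

estimand = model.identify_effect(proceed_when_unidentifiable=True)    # 2) Identification

estimate = model.estimate_effect(estimand,                            # 3) Estimation
                                 method_name="backdoor.linear_regression")
print("Estimated causal effect:", estimate.value)

refutation = model.refute_estimate(estimand, estimate,                # 4) Refutation
                                   method_name="placebo_treatment_refuter")
print(refutation)  # the placebo effect should be close to zero
```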

The DoWhy framework has already been applied to analyze cause-and-effect relationships, as demonstrated by Liu et al. [31]. Their study presents a fault early warning model for high-speed trains that combines feature importance analysis using SHAP with causal relationship analysis using DoWhy. The experimental results show that this approach improved model performance by over 10% and increased computational efficiency by 35%. Moreover, it enables experts to verify causal relationships between features, leading to better-informed decision-making. Wang et al. [32] suggest a novel method for wind power forecasting that employs causal inference techniques, which is a departure from conventional neural network-based methods. The process entails the utilization of potential outcomes models from causal frameworks to analyze numerical weather prediction (NWP) data. Cabije et al. [33] examine the causal relationships between the factors influencing investments in green facade technology for residential buildings. The results indicate that the Utility factor has the most potent positive causal effect (0.8469) on Willingness to Pay, while Environment and Risk Aversion do not exhibit any significant causal influence. Noh and Kim [34] employ DoWhy to examine the causal relationship between mental health and type 2 diabetes. Based on a socioenvironmental framework, the researchers developed a causal model and evaluated the statistical significance of their findings. The study determined that the incidence of diabetes is elevated by approximately 15% as a result of mental health issues (mean value: 0.146) after administering a variety of refutation tests.

F. Optuna for Hyperparameter Optimization

Optuna is a popular open-source library that simplifies the process of hyperparameter tuning in machine learning projects [35]. Optuna offers several key advantages that make it a powerful tool for hyperparameter optimization. One of its primary benefits is the define-by-run style API, which draws inspiration from deep learning frameworks. This API allows users to dynamically define the hyperparameter search space, providing flexibility and adaptability throughout the optimization process. Optuna’s efficient pruning and sampling mechanism is another significant advantage. This mechanism comprises two main policies: efficient searching and efficient performance estimation. These policies work together to create a cost-effective optimization method that intelligently explores the hyperparameter space while minimizing computational resources. Lastly, Optuna is known for its ease of setup, making it accessible to experienced and novice machine learning practitioners. Its user-friendly interface and comprehensive documentation enable users to quickly integrate Optuna into their existing projects, streamlining the hyperparameter tuning process and accelerating the development of high-performing models.

G. Hyperband Bandit-Based Approach

Hyperband, introduced by Li et al. [36], is an innovative hyperparameter optimization strategy that efficiently allocates computational resources among potential configurations. By dynamically assigning more resources to promising setups and quickly discarding underperforming ones, Hyperband streamlines the search for optimal hyperparameters in machine learning models. The key strength of Hyperband lies in its adaptive resource allocation mechanism, which initially assigns a small amount of resources to each setup and gradually increases the allocation for those showing promise. This approach enables the algorithm to identify and prioritize the most promising configurations early in the optimization process. Moreover, Hyperband balances exploration and exploitation, two crucial aspects of efficient optimization. Our study demonstrates that combining Hyperband with Optuna improves the hyperparameter tuning process. Hyperband, a pruning algorithm, effectively prioritizes high-performing hyperparameter sets and discards less promising ones dynamically. This integration accelerates the search for optimal hyperparameters, leading to more efficient and effective optimization outcomes.

SECTION III.

Proposed Methodology

The proposed approach for predicting future earnings, as depicted in Fig. 1, consists of three main elements. The first component focuses on enhancing the accuracy and efficiency of future earnings predictions by employing Optuna, a state-of-the-art hyperparameter optimization framework, in conjunction with Hyperband, a robust pruning algorithm. This combination allows for the efficient exploration and identification of optimal hyperparameter configurations for the predictive models, resulting in improved performance and more reliable earnings prediction. The second component of the proposed method involves utilizing Anchor explainable AI techniques to provide interpretable and transparent explanations for predicted future earnings. We can uncover and present the underlying reasons and key factors influencing earnings predictions in a comprehensible manner by applying Anchor. The third component focuses on discovering causal relationships between factors affecting future earnings using DoWhy, a causal inference library. This enables us to understand correlations and actual cause-and-effect relationships between various factors and future earnings, providing deeper insights into the mechanisms that drive earnings predictions. This comprehensive three-pronged approach combines optimization, explainability, and causal inference to create a robust and interpretable framework for future earnings prediction.

FIGURE 1.

The proposed method of future earnings prediction through optimized XGBoost with causal and explainable AI.

A. The Improvement of Future Earnings Prediction through Optuna With Hyperband Optimization

The process in Algorithm 1 is divided into two key stages, as detailed in the following sub-sections.

Algorithm 1 Advanced Optimization of XGBoost by RENN Balancing

Inputs:

  • $X_{raw}$ : Original feature matrix, where $X_{raw}\in \mathbb {R}^{n\times m}$ , with n instances and m features.

  • $y_{raw}$ : Original labels vector, where $y_{raw}\in \mathbb {R}^{n}$ .

Outputs:

  • ${model}_{final}$ : Optimally trained XGBoost classifier.

  • ${accuracy}_{final}$ : Accuracy of ${model}_{final}$ on $X_{test}$ .

  • Additional metrics: Confusion matrix, precision, recall, and F1-score.

Step 1:

Handle Imbalance Data with RENN

  1. Initialize Repeated Edited Nearest Neighbours (RENN):

    • Configure RENN to iteratively edit the dataset by removing misclassified samples based on their k nearest neighbors.

  2. Apply RENN to Full Dataset:

    • $\left ({{ X_{balanced},y_{balanced} }}\right)\mathrm {\leftarrow }$ Apply RENN to $X_{raw}$ , $y_{raw}$ to achieve a more balanced dataset.

  3. Split Balanced Data into Training and Testing Sets:

    • Randomly split $\left ({{ X_{balanced},y_{balanced} }}\right)$ into $X_{balanced}^{train}$ , $y_{balanced}^{train}$ (training set) and $X_{balanced}^{test}$ , $y_{balanced}^{test}$ (testing set).

Step 2:

Initialization

  1. Data Loading and Preprocessing:

    • Ensure that the datasets $X_{balanced}^{train}$, $y_{balanced}^{train}$ and $X_{balanced}^{test}$, $y_{balanced}^{test}$ are correctly loaded and preprocessed.

Step 3:

Hyperparameter Tuning via Optuna

  1. Define Objective Function $\boldsymbol {f}$:

    • Configure f to optimize XGBoost parameters dynamically during model training.

    • Use cross-validation on $X_{balanced}^{train}$ to evaluate model performance.

  2. Execute Optuna:

    • Configure and execute an Optuna study to minimize the log loss and identify the best model parameters ${param}_{best}$ .

Step 4:

Train Optimized Model

  1. Final Model Configuration:

    • ${model}_{final}\mathrm {\leftarrow }$ Configure the final XGBoost model with ${param}_{best}$ .

  2. Train Model with Instance Weights:

    • Train ${model}_{final}$ on $X_{balanced}^{train}$.

Step 5:

Performance Evaluation

  1. Make Predictions:

    • $predictions\mathrm { \leftarrow }$ Use ${model}_{final}$ to generate predictions on $X_{balanced}^{test}$ .

  2. Compute Metrics:

    • Evaluate ${accuracy}_{final}$ , confusion matrix, and other performance metrics.

1) Handling Imbalanced Data Using Repeated Edited Nearest Neighbours (RENN)

To tackle the challenge of imbalanced datasets characterized by disproportionate class representation, this study employed the Repeated Edited Nearest Neighbor (RENN) method [37]. RENN is an undersampling technique that iteratively applies the Edited Nearest Neighbors algorithm to balance the collected datasets. The RENN method operates by repeatedly executing the EditedNearestNeighbors algorithm until one of the following stopping criteria is met: (i) The algorithm achieves the predefined maximum number of iterations; (ii) It does not remove any further observations during an iteration; (iii) A majority class changes into a minority class, or (iv) The under-sampling process eliminates a majority class.
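In practice this step can be carried out with the RENN implementation in imbalanced-learn; the sketch below uses illustrative default parameters rather than the exact configuration of the study.

```python
from collections import Counter
from imblearn.under_sampling import RepeatedEditedNearestNeighbours

# X_raw, y_raw: original feature matrix and labels (0 = earnings decrease, 1 = increase)
renn = RepeatedEditedNearestNeighbours(n_neighbors=3, max_iter=100)
X_balanced, y_balanced = renn.fit_resample(X_raw, y_raw)

print("Class counts before:", Counter(y_raw))
print("Class counts after: ", Counter(y_balanced))
```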

2) Utilizing Optuna and Hyperband for Optimizing Hyperparameters in Future Earnings Predicting Models

In this study, we employed tree-based machine learning algorithms, coupled with Optuna and Hyperband, to predict future earnings. Breiman [38] introduced the Random Forest (RF), the first tree-based model we utilized in our research. To produce the final model output, RF constructs an ensemble of decision trees, randomly selects features for each tree, and aggregates the outputs of all trees. This study employs XGBoost as the second tree-based machine learning approach [39]. To achieve ensemble learning, XGBoost uses two fundamental principles: bagging and boosting. Bagging entails simultaneously training models and generating trees through independent sampling, which improves the models’ stability and accuracy. Boosting, on the other hand, creates trees sequentially, with each tree building upon the weaknesses of the previous one, allowing for iterative improvement in the learning process. Finally, we applied LightGBM as the third tree-based approach. Developed by Ke et al. [40], LightGBM is a computationally efficient algorithm based on the gradient-boosting framework. It employs a novel, gradient-based, one-sided sampling technique to filter data instances and generate segmentation values. Moreover, LightGBM performs exclusive feature bundling to reduce the dimensionality of the feature space, resulting in a highly efficient training process.

The integration of Optuna and Hyperband provides a comprehensive and automated solution for hyperparameter tuning in the context of future earnings predictions. By leveraging this advanced optimization technique, we can fine-tune the predictive models to capture complex patterns and relationships within the financial data, ultimately leading to more accurate and robust earnings forecasts.
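The sketch below shows one way such a tuning loop can be assembled with Optuna’s HyperbandPruner for the XGBoost model; the search-space ranges are illustrative assumptions, and X_train, y_train denote the balanced training split.

```python
import optuna
import xgboost as xgb
from sklearn.model_selection import cross_val_score

def objective(trial):
    # Illustrative search space; the study's actual ranges may differ.
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 100, 1000),
        "max_depth": trial.suggest_int("max_depth", 3, 12),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.5, 1.0),
    }
    model = xgb.XGBClassifier(**params, eval_metric="logloss")
    # Cross-validated log loss on the balanced training set (lower is better).
    score = cross_val_score(model, X_train, y_train,
                            scoring="neg_log_loss", cv=5).mean()
    return -score

# Hyperband pruning takes effect when trials report intermediate scores
# (e.g., via optuna.integration.XGBoostPruningCallback); that step is omitted here.
study = optuna.create_study(direction="minimize",
                            pruner=optuna.pruners.HyperbandPruner())
study.optimize(objective, n_trials=100)
print("Best parameters:", study.best_params)
```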

B. AI Explanation for Future Earnings Prediction

In this research, we distinguish the significance of features for each class in a two-class classification problem using SHAP values. By separating the positive and negative average SHAP values, we can discern the features that contribute favorably to class 1 (increased earnings) and those that contribute unfavorably, leaning predictions toward class 0 (decreased earnings). This methodology provides a comprehensive grasp of the feature relevance for each class, enabling better model interpretation and potential feature selection or feature engineering based on class-specific importance. Additionally, we employ ANCHOR techniques to identify a decision boundary that appropriately “anchors” the black box model’s prediction of future earnings. The anchor approach can precisely articulate its explanations’ boundaries for each instance as an if-then rule, simplifying them and rendering them easily understandable. The anchor explanations for XGBoost model predictions are generated by Algorithm 2.

Algorithm 2 Generating Anchor Explanations for XGBoost Model Predictions

Objective: To provide interpretable, rule-based explanations for predictions made by an XGBoost model using the Anchor Explainer methodology, enhancing transparency and aiding in compliance with regulatory requirements.

Inputs:

  • $X_{train}$ : Training feature matrix, where $X_{train}\in \mathbb {R}^{n\times m}$ , with n instances and m features.

  • $y_{train}$ : Training labels vector, where $y_{train}\in \mathbb {R}^{n}$ .

  • $X_{test}$ : Test feature matrix, analogous to $X_{train}$ .

  • $y_{test}$ : Test labels vector, analogous to $y_{train}$ .

  • ${model}_{final}$ : A trained XGBoost classifier.

  • $class\_names$ : A list of class names in the target variable.

  • $feature\_names$ : A list of feature names in the dataset.

  • idx: Index specifying the test instance to be explained.

Outputs:

  • Anchor: A set of rules that explain the conditions under which the model makes a specific prediction for the test instance.

  • Precision: The accuracy of the anchor in predicting the same outcome as the model for other similar instances.

  • Coverage: The proportion of instances the anchor applies to within the dataset.

Steps:

1.

Initialize the Anchor Explainer:

  • Construct an AnchorTabularExplainer using the training dataset $X_{train}$, class names $class\_names$, and feature names $feature\_names$.

2.

Prepare the Instance for Explanation:

  • Select the instance at the index idx from $X_{test}$ for which the explanation is to be generated.

3.

Generate the Anchor Explanation:

  • Use the Anchor Explainer to compute an explanation for the selected instance.

  • Set a minimum precision threshold of 0.95 in Anchor explanations to ensure that an anchor is considered valid only if it maintains the model’s prediction with at least 95% confidence.

4.

Evaluate the Explanation:

  • Determine the precision of the anchor, which measures how often the model prediction remains the same under the conditions specified by the anchor across all data points.

  • Calculate the coverage, quantifying the proportion of dataset instances that satisfy the anchor’s conditions.

5.

Report the Explanation:

  • Present the anchor as a set of if-then rules.
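A minimal sketch of Algorithm 2 with the anchor-exp package (listed in the experimental configuration) is given below; variable names mirror the algorithm’s inputs, and any preprocessing of the feature matrices is omitted.

```python
from anchor import anchor_tabular  # pip install anchor-exp

# model_final: trained XGBoost classifier; X_train, X_test: numpy feature matrices
explainer = anchor_tabular.AnchorTabularExplainer(
    class_names=class_names,      # e.g., ["decrease", "increase"]
    feature_names=feature_names,
    train_data=X_train,
)

idx = 0  # index of the test instance to explain
explanation = explainer.explain_instance(
    X_test[idx], model_final.predict, threshold=0.95)  # minimum precision of 0.95

print("Anchor   : IF", " AND ".join(explanation.names()))  # if-then conditions
print("Precision:", explanation.precision())
print("Coverage :", explanation.coverage())
```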

C. Causal Discovery of Factors Affecting Future Earnings Using DoWhy

The analysis of causal relationships in Algorithm 3 comprises five primary steps, commencing with data preparation, which encompasses data cleaning, variable identification, and data standardization. The Causal Graph is subsequently constructed and visualized to represent the relationships between variables. Next, it employs DoWhy’s identify_effect method and appropriate estimation techniques to identify and estimate the Causal Effect. A robustness analysis then evaluates the reliability of the results by assessing the impact of potential unobserved confounders. Lastly, it conducts Scenario-Based Predictions to assess the effects of hypothetical interventions on the outcome variable, which can be employed for future decision-making and planning. The algorithm systematically evaluates the relationships between treatment variables, outcome variables, and confounders during these stages to offer insights into causal relationships and their potential implications.

Algorithm 3 The Causality Between Factors Influencing Future Earnings through Causal Discovery using DoWhy

Inputs:

  • Dataset $\boldsymbol {D}$: A balanced, structured dataset containing observed data for predictor variables X, treatment variable T, outcome variable Y (Earning), and potential confounders C.

  • Causal Graph $\boldsymbol {G}$: Assumed causal relationships represented as a directed acyclic graph (DAG), constructed based on domain knowledge.

Outputs:

  • Causal Estimate $\hat {\tau }$: The estimated causal effect of treatment variables on the outcome.

  • Sensitivity Analysis $\boldsymbol {S}$: Result of robustness testing against unobserved confounders.

  • Scenario-Based Predictions: Expected outcomes under hypothetical interventions on the treatment variables.

Step 1:

Construct and Visualize Causal Graph

1.1

Define Causal Graph Structure: Construct the causal graph G in DOT format, representing the assumed relationships among T, Y, C, and any other relevant variables based on domain expertise or theoretical assumptions.

Step 2:

Causal Effect Identification

2.1

Specify Causal Estimand:

  • Using DoWhy’s identify_effect method, define the causal estimand $\mathbb {E}\left [{ Y\mid do(T=t) }\right]-\mathbb {E}\left [{ Y\mid do(T=t^{\prime }) }\right]$ that represents the causal effect of T on Y.

  • Determine the adjustment set Z necessary to satisfy the backdoor criterion, ensuring that confounding paths between T and Y are blocked.

Step 3:

Causal Effect Estimation

3.1

Estimate the Causal Effect:

  • Apply methods based on estimating the outcome model (Linear Regression) to calculate the causal effect $\hat {\tau }$ of T on Y, representing the average treatment effect (ATE).

  • This effect quantifies the direct influence of T on Y, controlling for the identified confounders C.

Step 4:

Predictive Effect of Treatment Optimization

4.1

Calculate Predicted Outcome: Use $\hat {\tau }$ to quantify the expected impact on Y when T variables are optimized (e.g., increase, decrease, or adjust specific values in T).

Step 5:

Robustness Analysis

5.1

Refutation Testing: Apply placebo treatment to verify the causal estimate against possible biases and confirm its validity.

Step 6:

Scenario-Based Causal Prediction

6.1

Define Hypothetical Scenarios: Design hypothetical intervention scenarios by simulating changes in treatment variables T, such as increasing or decreasing values to test their impact.

6.2

Estimate Effects Under Scenario:

  • Re-estimate the causal effect $\hat {\tau }_{scenario}$ for the hypothetical scenario using the updated data.

  • Assess the scenario’s impact on Y and interpret the results in terms of operational strategy for future earnings growth.
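A minimal sketch of Algorithm 3’s main steps with DoWhy follows; the DOT graph and column names are illustrative placeholders (the study’s actual graph contains the treatments and confounders of Table 6), and parsing a DOT string requires a graph backend such as pydot.

```python
from dowhy import CausalModel

# Illustrative DOT graph: one treatment, one confounder, and the outcome variable.
causal_graph = """
digraph {
    treatment_var -> Earning;
    confounder_var -> treatment_var;
    confounder_var -> Earning;
}
"""

model = CausalModel(data=df, treatment="treatment_var",
                    outcome="Earning", graph=causal_graph)          # Step 1

estimand = model.identify_effect(proceed_when_unidentifiable=True)  # Step 2
estimate = model.estimate_effect(estimand,                          # Step 3
                                 method_name="backdoor.linear_regression")
print("ATE:", estimate.value)

# Step 5: placebo refutation -- permuting the treatment should drive the
# re-estimated effect toward zero if the original estimate is genuine.
placebo = model.refute_estimate(estimand, estimate,
                                method_name="placebo_treatment_refuter",
                                placebo_type="permute")
print(placebo)

# Steps 4 and 6 (scenario sketch): with a linear outcome model, the expected
# change in the outcome under a hypothetical shift of the treatment is
# approximately ATE * delta, where delta is the size of the intervention.
delta = 0.10
print("Expected change for a +0.10 intervention:", estimate.value * delta)
```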

SECTION IV.

Experimental Setup and Result

A. Data Description

This study examined 7,530 financial statements from 959 companies listed on the Stock Exchange of Thailand between 2013 and 2022, as illustrated in Fig. 2. From these statements, we derived 68 accounting descriptor variables presented in Table 1. Each company’s earnings for the subsequent fiscal year were calculated using formulas established by Ou and Penman [13] and Ishibashi et al. [9], providing critical data on future earnings performance—the primary focus of our analysis. Reliability testing yielded a Cronbach’s Alpha coefficient of 0.737, confirming strong internal consistency and supporting the validity of our findings. The final dataset comprised two distinct classes: 5,036 instances where earnings decreased (class 0) and 2,494 cases in which earnings increased (class 1).

TABLE 1 Example of Descriptor Variables and Financial Earnings From Two Companies
FIGURE 2.

Number of financial statements by year.

B. Assessment Matrices

We employed various metrics, including accuracy, precision, and recall, to assess the predictive performance of the study. Table 2 presents the classification results as a confusion matrix, which displays each class’s correct and incorrect predictions. In the matrix, {P, N} denotes the positive and negative testing data, while {Y, N} represents the classifier’s predictions for the positive and negative classes, respectively [41]. This confusion matrix provides a comprehensive overview of the model’s performance in correctly identifying instances in each category.

TABLE 2 Confusion Matrix

The number of correct predictions for positive instances in the confusion matrix is called “true positive” (TP). In contrast, the count of accurate predictions for negative instances is called “true negative” (TN). On the other hand, incorrect predictions for positive examples are labeled as “false positive” (FP), and those for negative examples are termed “false negative” (FN). The prediction model’s performance on the datasets was evaluated using the metrics defined in equations (5) to (8) below. These utilize the counts from the confusion matrix to assess the model’s effectiveness in correctly classifying instances from both positive and negative classes.

Accuracy:\begin{equation*} \frac {TP+TN}{TP+FP+FN+TN} \tag {5}\end{equation*} Precision:\begin{equation*} \frac {TP}{TP+FP} \tag {6}\end{equation*} Recall:\begin{equation*} \frac {TP}{TP+FN} \tag {7}\end{equation*} F-measure:\begin{equation*} \frac {\left ({ 1+\beta ^{2} }\right)\times Recall\times Precision}{\beta ^{2}\times Recall+Precision} \tag {8}\end{equation*} where $\beta$ is a coefficient that modulates the trade-off between precision and recall, often established at 1. A superior F-measure signifies the effectiveness of a learning algorithm concerning the target class, demonstrating elevated levels of both recall and precision.
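As a quick worked illustration of equations (5)-(8), the helper below computes the four metrics directly from confusion-matrix counts; the counts in the example call are made-up numbers, not results from this study.

```python
def classification_metrics(tp, tn, fp, fn, beta=1.0):
    """Accuracy, precision, recall, and F-measure from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = ((1 + beta**2) * recall * precision) / (beta**2 * recall + precision)
    return accuracy, precision, recall, f_measure

# Illustrative counts only
print(classification_metrics(tp=420, tn=430, fp=30, fn=35))
```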

C. Experimental Configuration

The testing environment was set up on a Windows 11 computer with an NVIDIA GeForce RTX 4070 Ti 8GB GPU and 32GB of RAM. Development was primarily conducted in Python, utilizing several machine learning libraries and components such as RandomForestClassifier, XGBClassifier, lightgbm, optuna, HyperbandPruner, imbalanced-ensemble, anchor-exp, and DoWhy for causal inference and model training. The DoWhy library was specifically employed to discover and analyze causal relationships between variables in the dataset.

D. Experimental Results

1) Results of the Improvement of Future Earnings Prediction through Optuna with Hyperband Optimization

Equation (9) illustrates how a standardization technique normalizes the input data. Standardization rescales the features to follow a standard normal distribution with a mean of zero and a standard deviation of one. This method, called Z-score normalization, centers the data around zero and adjusts the scale to unit variance.\begin{equation*} Z\left ({ x }\right)= \frac {x-\bar {x}}{\sigma } \tag {9}\end{equation*} where x is an n-dimensional feature vector, $x\in \mathbb {R}^{n}$. The feature-wise means and standard deviations are denoted by $\bar {x}$ and $\sigma$, respectively, both of which are n-dimensional vectors in $\mathbb {R}^{n}$.

Our study evaluated and compared four different sampling methods to address the issue of class imbalance in the dataset. The goal was to determine which technique was most effective at balancing the classes within the dataset to reduce bias and improve the validity of the analysis results. The sampling techniques employed in this study were SMOTE, ADASYN, Tomek Links, and Repeated Edited Nearest Neighbors (RENN). After balancing the dataset, we split the data into a training set (70% of the samples) and a test set (30%). This split was performed using stratified sampling [42] to preserve the class distribution in both subsets and prevent any class imbalance between training and testing data. By maintaining the same proportion of each class in both sets, stratification ensures that the model is trained and evaluated on representative data for all classes. This approach improves the model’s generalization to unseen data and provides a fair, unbiased evaluation across classes. To ensure the reproducibility of the results and avoid any selection bias in the splitting process, we used a fixed random seed when creating the train-test split. This guarantees that the exact same split can be obtained in repeated experiments. We used the training set for hyperparameter tuning using Optuna and Hyperband and reserved the test set for evaluating the final model. Table 3 presents the test-set results for each classification algorithm (Random Forest (RF), XGBoost (XGB), and LightGBM) under the different sampling methods. Furthermore, Table 4 reports the optimal hyperparameters identified for the Random Forest (RF), XGBoost (XGB), and LightGBM models on the RENN-balanced dataset. The results show that the model achieving the best accuracy, recall, and F-measure scores for both classes was the one combining RENN undersampling with the XGBoost algorithm. In other words, when paired with XGBoost, the RENN undersampling method outperformed all other techniques at correctly classifying instances while maintaining a good balance between precision and recall for each class.
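A minimal sketch of this preparation step is given below, assuming the RENN-balanced arrays produced in Algorithm 1; the random seed shown is arbitrary.

```python
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# 70/30 stratified split with a fixed seed for reproducibility
X_train, X_test, y_train, y_test = train_test_split(
    X_balanced, y_balanced, test_size=0.30, stratify=y_balanced, random_state=42)

# Z-score standardization (equation 9): fit on the training set only to avoid leakage
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
```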

TABLE 3 The Effectiveness of Random Forest, XGBoost, and LightGBM Following the Application of Balancing Techniques
TABLE 4 The Best Hyperparameter Settings for the RENN-Enhanced Dataset

2) Results of Using Anchor XAI for Transparent Future Earnings Predictions

Interpreting the rationale behind earnings prediction and recognizing the relative importance of various features for each earnings class is crucial. We can better interpret and explain the model’s predictions by understanding the factors influencing the predicted outcomes for different earnings groups. This understanding is invaluable for making informed business decisions and formulating effective strategies. This research employs two explainable AI techniques to improve the interpretability of the predictions. First, we apply Anchor XAI to generate easily understandable IF-Then rules illuminating the reasoning behind each prediction. These rules offer precise justifications for decisions based on the input features. Second, we use SHAP values to assess each feature’s significance for specific classes. By examining the signs (positive or negative) and the magnitude of the SHAP values, we can discern which features strongly influence the prediction of a particular class. Positive SHAP values indicate a feature’s contribution toward class 1 (increased earnings), while negative values suggest a feature’s influence toward class 0 (decreased earnings). By combining these two approaches, this study aims to provide a comprehensive and transparent understanding of the model’s predictions, enabling users to grasp the key factors driving each prediction and the overall importance of features for each class. In predicting future earnings, a threshold of 0.95 is set for the certainty of anchor explanations. This means that the generated IF-Then rules are considered reliable only if they have a certainty score of at least 0.95. By evaluating these rules’ precision and coverage metrics, we can better understand their effectiveness in distinguishing between companies that are likely to experience an increase in earnings and those that may face a decrease. Precision measures the rules’ accuracy, indicating the proportion of correctly identified instances within each predicted class. Coverage, in turn, evaluates the rules’ applicability by indicating the percentage of cases that the identified rules can explain. Table 5 presents the top examples of IF-Then rules generated by the Anchor XAI method. These rules provide clear and concise explanations for the model’s predictions, highlighting the key features and conditions contributing to classifying a company’s future earnings as increasing or decreasing. Table 6 displays the top five key features that significantly influence each class’s prediction of future earnings. This table highlights the most impactful variables that drive the model’s classification of a company’s future earnings as either increasing or decreasing.
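A minimal sketch of this two-part explanation workflow is given below, assuming the anchor-exp and shap packages, a trained XGBoost classifier named model, and placeholder feature names and data arrays.

import shap
from anchor import anchor_tabular

anchor_explainer = anchor_tabular.AnchorTabularExplainer(
    class_names=["decrease earnings", "increase earnings"],
    feature_names=feature_names,   # placeholder list of financial ratio names
    train_data=X_train,
)

# IF-Then rule for a single company, accepted only if its precision is at least 0.95.
exp = anchor_explainer.explain_instance(X_test[0], model.predict, threshold=0.95)
print("Rule:", " AND ".join(exp.names()))
print("Precision:", exp.precision(), "Coverage:", exp.coverage())

# Class-wise feature importance from SHAP values of the tree ensemble.
shap_explainer = shap.TreeExplainer(model)
shap_values = shap_explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test, feature_names=feature_names)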

TABLE 5 Top 10 Anchor-Generated IF-Then Rules for Each Future Earnings Class
TABLE 6 Key Features Influencing Future Earnings Prediction for Each Class

3) Results of Causal Inference

This research examines the relationship between earnings (outcome) and important financial factors (treatments) identified in Table 6. For causal analysis, treatments are variables that potentially affect earnings, while confounders can influence both treatments and earnings, potentially creating bias in the observed relationships. Controlling confounders is essential in causal analysis to isolate the actual effect of the treatment on earnings, removing biases that might distort the findings. The causal graph in Fig. 3 illustrates the relationships between earnings (Earning) and important financial factors. In the graph, nodes represent variables, while arrows indicate potential causal influences between them. The variables on the left, such as %$\Delta $ in net profit margin, Inventory/Total assets, and others, are identified as treatment factors that potentially affect earnings. Variables like $\Delta $ in dividend per share and Cash dividend as % of cash flows are confounders, as they may influence both the treatments and earnings, introducing bias if not adequately accounted for. The arrows connecting treatments and confounders to earnings reflect hypothesized causal pathways, demonstrating how various financial factors interact to impact the outcome.

FIGURE 3. Causal graph depicting the relationships between important financial factors and earnings.

Our investigation is driven by three crucial causal questions concerning financial factors and earnings, outlined in Table 7. These are high-stakes questions because they address core operational and financial variables that directly impact a company’s long-term profitability and sustainability. High-stakes questions focus on factors where decisions have significant consequences, such as optimizing operational efficiency, improving profitability, and mitigating risks related to earnings decline. They require careful evaluation because the outcomes of these decisions influence both short-term financial performance and long-term strategic viability.

TABLE 7 Summary of Causal Questions, High-Stake Aspects, Treatments, Outcomes, and Confounders for Earnings Analysis

By framing the causal questions as high-stakes, this research highlights the importance of balancing operational and financial priorities while considering resource constraints and risk mitigation. These questions assist in evaluating and balancing various operational and financial alternatives, leading to more effective resource allocation decisions. With a comprehensive understanding of how key variables influence each other causally, companies can develop more targeted strategies to meet their earnings growth goals. High-stakes questions ensure that decisions are data-driven and focus on areas with the potential to yield the most impactful results, making them central to the strategic decision-making process.

To enhance the causal validity of our findings, we conducted a rigorous assessment of confounding biases and sensitivity analyses within the DoWhy causal inference framework. Confounders—variables that simultaneously influence both the treatment (financial variables) and outcome (earnings)—pose a significant challenge in causal estimation. To address this, we applied the backdoor criterion by explicitly controlling for confounding variables identified in Table 7, such as $\Delta $ in Dividend per Share, Cash Dividend as % of Cash Flows, %$\Delta $ in Net Profit Margin, and %$\Delta $ in Operating Profit to Sales. These confounders were incorporated into the causal graphical model, ensuring an unbiased estimation of the causal effect. The causal effect analysis results in Table 8 reveal key insights derived from multiple metrics. The analysis employs the backdoor.linear_regression method as the Causal Estimation Method, ensuring transparency and reproducibility. The Direct Effect of Treatment on Outcome quantifies the impact of treatment variables, such as ‘Inventory/Total Assets’ and ‘%$\Delta $ in Net Profit Margin’, on Earnings, demonstrating the effect of operational adjustments on financial performance. The Estimated Effect After Larger Adjustments extends this analysis by modeling more substantial operational changes, such as a 10% reduction in inventory and a 20% increase in net profit margin, providing insights into the potential for strategic interventions to optimize earnings. To validate these causal claims, we conducted sensitivity analyses using the Placebo Treatment Refuter, which distinguishes between actual causal relationships and random variation. This refutation analysis consists of three key sub-metrics (a minimal DoWhy sketch illustrating the overall workflow follows the list below):

  1. The Estimated Effect confirms the initially found direct effect.

  2. The New Effect demonstrates zero impact (0.0) from the placebo test, confirming that the original effect is genuine and not driven by noise.

  3. A p-value of 1.0 indicates that the placebo effect is statistically indistinguishable from zero, further strengthening confidence in the validity of the original estimate.
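A minimal DoWhy sketch of this identify–estimate–refute workflow, using Causal Question 2 as an example, is shown below. The DataFrame df and the exact column spellings are placeholders; only the method names (backdoor.linear_regression and the placebo treatment refuter) follow the analysis described above.

from dowhy import CausalModel

causal_model = CausalModel(
    data=df,   # placeholder DataFrame of financial ratios and the earnings outcome
    treatment=["Inventory/Total assets", "Pct change in Net profit margin"],
    outcome="Earnings",
    common_causes=["Change in Dividend per Share", "Cash Dividend as pct of Cash Flows"],
)

# 1) Identify the estimand by applying the backdoor criterion to the causal graph.
estimand = causal_model.identify_effect()

# 2) Estimate the direct effect of the treatments on earnings.
estimate = causal_model.estimate_effect(estimand, method_name="backdoor.linear_regression")
print("Direct effect:", estimate.value)

# 3) Placebo refutation: permuting the treatment should drive the estimated effect to ~0.
refutation = causal_model.refute_estimate(
    estimand, estimate,
    method_name="placebo_treatment_refuter",
    placebo_type="permute",
)
print(refutation)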

TABLE 8 Causal Effect Analysis Results

These methodological refinements strengthen the causal validity of our study, ensuring that the identified financial factors are not merely correlated but causally drive earnings outcomes. By integrating these validation techniques, our study enhances the reliability of causal inference for financial decision-making, providing practitioners with actionable insights for optimizing corporate financial strategies. This approach effectively addresses concerns regarding confounding biases and causal robustness, ensuring that financial interventions are based on rigorous, data-driven evidence.

SECTION V.

Discussion

The following solutions and a discussion about predicting future earnings are proposed based on the research questions.

A. RQ1: How Can Machine Learning Models Effectively Utilize Financial Statement Data to Accurately Predict Future Corporate Earnings?

Our analysis of Tables 3 and 4 reveals significant insights into model performance and optimization techniques. The study assessed three machine learning models—Random Forest (RF), XGBoost (XGB), and LightGBM—alongside various data balancing techniques, including SMOTE, ADASYN, Tomek Links, and Repeated Edited Nearest Neighbors (RENN), to mitigate class imbalance in financial datasets. The initial results on the imbalanced data indicated that XGBoost and LightGBM outperformed Random Forest in terms of accuracy, precision, recall, and F-measure. However, the substantial impact of class imbalance was evident in the low recall values for the minority class, highlighting the need for an advanced resampling strategy.

Among the evaluated balancing techniques, RENN consistently improved performance metrics across all classifiers, particularly in detecting the minority class, due to its iterative noise-reduction approach [43]. The most effective combination emerged as XGBoost with RENN, achieving the highest accuracy of 93.25% while significantly improving minority-class precision (92.56%), recall (92.86%), and F-measure (92.71%). For the majority class (earnings increase), the model maintained strong performance with precision (93.84%), recall (93.58%), and F-measure (93.71%), confirming its balanced predictive capabilities. The success of this approach is attributed to RENN’s ability to remove mislabeled data while retaining meaningful financial patterns, thereby preventing overfitting and enhancing generalization [44].

Beyond data balancing, further optimization was achieved through Optuna Hyperband, which efficiently fine-tuned XGBoost hyperparameters, including gamma, colsample_bytree, subsample, and min_child_weight, on the RENN-enhanced dataset. Optuna Hyperband’s successive halving strategy dynamically allocated resources to promising hyperparameter configurations, allowing the model to focus on optimal learning settings while minimizing computational overhead [36]. This approach ensured that XGBoost retained relevant financial trends without being influenced by irrelevant noise, ultimately enhancing predictive accuracy and reducing bias in financial earnings classification.
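A simplified sketch of this tuning step is shown below, assuming Optuna with a HyperbandPruner and the RENN-balanced training split from earlier; the search ranges are illustrative rather than the exact ones reported in Table 4. In this simplified form, trials do not report intermediate values, so the pruner does not actively stop trials; it is included only to mirror the study’s configuration.

import optuna
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

def objective(trial):
    params = {
        "gamma": trial.suggest_float("gamma", 0.0, 5.0),
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.5, 1.0),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),
        "min_child_weight": trial.suggest_int("min_child_weight", 1, 10),
    }
    model = XGBClassifier(**params, eval_metric="logloss")
    # Mean cross-validated accuracy on the training split is the tuning objective.
    return cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy").mean()

study = optuna.create_study(direction="maximize", pruner=optuna.pruners.HyperbandPruner())
study.optimize(objective, n_trials=50)
print("Best hyperparameters:", study.best_params)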

From a computational efficiency perspective, XGBoost demonstrated superior practicality for large-scale financial datasets, completing training in just 1.13 minutes, compared to Random Forest’s 4.62 minutes. While LightGBM achieved the fastest processing time, its accuracy remained slightly lower than that of XGBoost. These findings align with Douaioui et al. [45], who highlight XGBoost’s effectiveness in high-accuracy financial prediction tasks due to its efficient tree-based learning mechanism [46].

The high predictive accuracy, computational efficiency, and scalability of our XGBoost+RENN model with Optuna Hyperband have significant implications for financial earnings prediction. First, enhanced predictive performance enables financial analysts, investors, and corporate decision-makers to make more informed strategic decisions. Our optimized XGBoost+RENN model with Optuna Hyperband achieved 93.25% accuracy, significantly improving precision, recall, and F-measure for both the earnings-increase and earnings-decrease classes. This improvement reduces uncertainty, allowing stakeholders to anticipate corporate performance more accurately and adjust investment strategies effectively. Second, risk mitigation and financial stability benefit from the model’s precise earnings prediction. Businesses can proactively detect potential financial downturns, enabling creditors to assess solvency risks more effectively. Early identification of declining earnings allows companies to implement corrective measures, stabilizing financial performance. This is particularly crucial for banks, credit institutions, and policymakers who rely on robust predictive models to maintain economic stability. Third, cost efficiency and operational optimization stem from XGBoost+RENN’s faster computational capabilities, enhanced by Optuna Hyperband’s efficient hyperparameter tuning. Our study showed that XGBoost+RENN required only 1.13 minutes for training, compared to 4.62 minutes for Random Forest, demonstrating its computational advantage. The integration of Optuna Hyperband dynamically optimizes hyperparameters, further boosting predictive accuracy while minimizing training time. This efficiency enables timely prediction, allowing firms to adapt quickly to market fluctuations without incurring excessive computational costs. Additionally, faster training and optimized hyperparameters facilitate frequent model updates, ensuring that earnings predictions remain current and responsive to the latest financial data.

B. RQ2: What Factors or Variables Contribute to Increasing and Decreasing a Company’s Future Earnings?

We determine the important attributes for each class in Table 6 by analyzing the SHAP values based on their sign (positive or negative) and magnitude. The essential elements for the “decrease earnings” class (class 0) include Inventory/Total assets, %$\Delta $ in net profit margin, Cash dividend as % of cash flows, Gross margin ratio, and Equity to fixed assets. The details are as follows:

  1. Inventory/Total assets measure the proportion of a company’s total assets tied up in inventory. A high ratio indicates significant capital invested in inventory, leading to decreased earnings due to storage costs, obsolescence risks, and reduced working capital flexibility. This aligns with Gupta et al. [47], whose studies have shown that firms engaging in opportunistic overproduction often experience declines in future accounting performance.

  2. %$\Delta $ in net profit margin reflects variations in a company’s profitability efficiency. A declining trend in this metric often precedes decreased earnings, indicating fundamental challenges in maintaining profitability. Shirzad et al. [48] have shown that changes in gross margin percentage are related to abnormal market returns, with the relationship being more pronounced when corresponding revenue and earnings surprises accompany positive or negative changes in gross margin percentage.

  3. Cash dividends as % of cash flows represent the portion of a company’s cash flows paid out as dividends. Consistently paying out more in dividends than the cash flows generated may force the company to rely on debt or reserves, negatively impacting its financial health and earnings. Benartzi et al. [49] support this, finding that firms with unsustainable dividend payouts are more likely to experience future earnings declines.

  4. Gross margin ratio shows how much revenue remains after accounting for the cost of goods sold. A decreasing or low ratio suggests challenges in managing production costs or maintaining effective pricing strategies. Abarbanell and Bushee [50] demonstrated that companies experiencing gross margin deterioration are more likely to face earnings declines and negative earnings surprises.

  5. Equity to fixed assets indicates the proportion of a company’s fixed assets financed by equity. A declining ratio suggests increased reliance on debt, leading to higher financial risk and interest expenses. This aligns with the findings of Muradoğlu and Sivaprasad [51], which show that firms with higher leverage (lower equity to fixed assets ratio) are more prone to negative abnormal returns and potential earnings declines.

The essential elements for the “increase earnings” class (class 1) include $\Delta $ in dividend per share, %$\Delta $ in operating income/total assets, %$\Delta $ in pretax income to sales, %$\Delta $ in Gross margin ratio, and %$\Delta $ in Times interest earned. The details are as follows:

  1. $\Delta $ in dividend per share can signal a company’s financial health and future prospects. Positive changes typically indicate improved profitability and management’s confidence in sustaining higher earnings. According to Nissim and Ziv [52], their analysis showed that companies increasing their dividends tend to experience subsequent earnings growth, confirming the predictive power of dividend changes for future profitability.

  2. %$\Delta $ in operating income/total assets reflects a company’s efficiency in using its assets to generate operating profits. A positive trend in this metric indicates improving operational efficiency and asset utilization. This implies that improvements in operating income relative to total assets can enhance a company’s financial performance and growth prospects [53].

  3. %$\Delta $ in pretax income to sales measures a company’s profitability before tax relative to its revenue. A positive trend in this metric indicates improving operational efficiency and effective cost management throughout the income statement. This aligns with Ou and Penman [13], who found that increases in pretax profit margins are strong predictors of future earnings growth. The research demonstrated that companies showing consistent improvement in pretax income to sales typically maintain this momentum, as it reflects fundamental improvements in pricing power, cost control, and overall business efficiency.

  4. %$\Delta $ in Gross margin ratio indicates shifts in a company’s essential profitability, measured as revenue retained after accounting for the cost of goods sold. When this change is positive, it signals the company’s enhanced ability to manage its core operational efficiency through better pricing power or more effective cost control in production. Fairfield and Yohn [54] provided empirical evidence that companies experiencing improved gross margin ratios tend to see corresponding future profitability and earnings increases. Their research demonstrates that positive changes in gross margin often translate into sustained earnings growth, as they reflect fundamental improvements in a company’s operational efficiency and market position.

  5. %$\Delta $ in Times interest earned reflects the evolution of a company’s debt servicing capability using its operating income. An upward trend in this metric indicates that a company’s operating income is growing faster than its interest expenses, demonstrating enhanced financial strength and operational efficiency. According to Aivazian et al. [55], companies that improve their times interest earned ratio typically have greater financial flexibility to pursue growth opportunities. Their research revealed that such companies face fewer constraints in accessing capital markets, as the improved ratio signals better creditworthiness and operational stability, ultimately contributing to increased earnings potential through strategic investments and expansion opportunities.

C. RQ3: How Can Explainable AI Methods Provide Interpretable Insights Into the Reasons Behind These Earnings Changes?

This study uses Anchor XAI to provide explanations for the prediction outcomes. We present the explanations in the form of if-then rules. The following is an example of an if-then rule for predicting earnings change from Table 5:

  1. An example of an if-then rule resulting in decreased earnings (class 0) is as follows:

    • The rule “If Cash dividend as % of cash flows $\lt = -0.17$ and Operating profit (before depreciation) to sales $\lt = -0.17$ and Sales/total assets $\lt = 0.18$ and $\Delta $ in dividend per share <= -26.30 Then decrease earnings.” suggests that a company’s earnings are likely to decrease when these conditions are met simultaneously. This can be explained as follows: 1) Cash dividend as % of cash flows $\lt = -0.17$ : This condition indicates that the company pays out more in dividends than it generates in cash flows. A negative ratio, especially one as low as −0.17, suggests that the company is using external financing or its cash reserves to pay dividends, which is not sustainable in the long run. This could be a sign of financial strain and may lead to decreased earnings in the future. 2) Operating profit (before depreciation) to sales $\lt = -0.17$ : This ratio measures the company’s operating profitability before considering depreciation. A negative ratio, particularly one as low as −0.17, indicates that the company’s operating expenses exceed its sales revenue. This suggests that the company is not generating enough revenue to cover its operating costs, or it may be unable to control or reduce its operating costs, leading to decreased earnings. 3) Sales/total assets $\lt = 0.18$ : This ratio, known as the asset turnover ratio, measures how efficiently a company uses its assets to generate sales. A low ratio, such as 0.18, suggests the company is not effectively utilizing its assets to generate revenue. This inefficiency can result in decreased profitability and earnings. 4) $\Delta $ in dividend per share $\lt = -26.30$ : This condition indicates a significant decrease in the company’s dividend per share, with a change of -26.30 or lower. A substantial reduction in dividends could indicate that the company is facing financial difficulties and is trying to conserve cash. This may be due to declining profitability or earnings, as companies typically aim to maintain stable or growing dividends over time. When all four conditions are met simultaneously, the company risks experiencing financial problems, including difficulties maintaining dividend payments to shareholders, negative operating profitability, inefficient asset utilization, and a significant decrease in dividends per share. These factors collectively suggest that the company struggles to generate sufficient earnings and cash flows. As a result, there is a high probability of decreased earnings, as the company lacks sustainability in its business operations and finds it difficult to maintain profits and debt repayment ability.

  2. An example of an if-then rule resulting in increased earnings (class 1) is as follows:

    • The rule “If %$\Delta $ in Equity to fixed assets >69.74 and %$\Delta $ in sales >21.97 and $\Delta $ in dividend per share >68.24 Then increase earnings.” suggests that a company’s earnings are likely to increase when these conditions occur together: 1) %$\Delta $ in Equity to fixed assets >69.74: This condition indicates that the company has experienced a significant increase in its equity to fixed assets ratio, with a change greater than 69.74%. A higher equity-to-fixed assets ratio suggests that the company has more equity than its fixed assets, which can signify financial strength. This increase in equity could be due to various factors, such as higher retained earnings or the issuance of new shares. A strong equity position can provide a solid foundation for the company to invest in growth opportunities and generate higher earnings in the future. 2) %$\Delta $ in sales >21.97: This condition indicates that the company has experienced a substantial increase in sales, with a change greater than 21.97%. A significant growth in sales suggests that the company is successfully expanding its customer base, increasing its market share, or introducing new products or services. Higher sales revenue can directly contribute to increased earnings. Sustained sales growth is often a key driver of earnings growth. 3) $\Delta $ in dividend per share >68.24: This condition suggests that the company has significantly increased its dividend per share, with a change greater than 68.24%. A substantial increase in dividends per share is generally a positive signal, indicating that the company is confident in its financial performance and prospects. Additionally, boosting dividends can attract investors and boost shareholder confidence, potentially supporting the company’s growth and earnings potential. Simultaneously meeting all three conditions presents a strong case for increased earnings. A significant increase in the equity to fixed assets ratio indicates a stronger financial position; substantial sales growth directly contributes to increased revenue and potential profits; and a notable increase in dividends per share reflects the company’s confidence in its ability to maintain or grow its earnings in the future. These factors create a favorable environment for the company to enhance its profitability and generate higher earnings stably and sustainably.

Explainable AI (XAI) techniques—SHAP, LIME, and Anchors—offer distinct interpretability, stability, and computational efficiency trade-offs. SHAP (Shapley Additive Explanations) provides the most consistent and theoretically sound feature attributions, but its high computational cost makes it impractical for real-time applications [56]. LIME (Local Interpretable Model-agnostic Explanations) is computationally efficient but produces unstable explanations, making it less reliable for high-stakes decisions [57]. Anchor explanations, in contrast, generate high-precision, rule-based conditions that ensure stable and interpretable decision rules, though they may not generalize well across broader data distributions [56]. For financial earnings prediction, where transparency and reliability are critical, Anchor-based XAI is the most suitable approach. Unlike SHAP and LIME, Anchor explanations provide clear if-then rules, aligning with financial decision-making practices. Anchors ensure consistent, high-confidence explanations by setting a precision threshold (e.g., 0.95), making them ideal for justifying individual model predictions.

D. RQ4: How Can Causal Inference Methods Be Integrated with Machine Learning to Understand the Cause-and-Effect Relationships Between Identified Factors and Future Earnings for More Informed Decision-Making?

Causal Question 1: The direct effect of 0.2097 shows that the combined influence of Inventory/total assets, %$\Delta $ in sales, and %$\Delta $ in Pretax income to sales on Earnings is moderate. When considered together, this value represents the baseline effect of these three variables on earnings. It indicates that they contribute positively to Earnings, although not to an overwhelming degree individually. In the placebo test, the causal model’s effect estimate was checked for robustness using a dummy treatment (a placebo) to see whether the observed effect could be due to random variation. The new effect of 0.0 and p-value of 1.0 indicate that the placebo produces no effect, reinforcing that the original estimate is likely due to genuine causal relationships rather than random noise. In a scenario where Inventory/total assets is reduced by 10%, %$\Delta $ in sales is increased by 20%, and %$\Delta $ in Pretax income to sales is improved by 10%, the estimated effect on Earnings increases to 0.2330. This effect is higher than the baseline of 0.2097, indicating that larger changes across multiple variables produce a synergistic effect on earnings. This finding shows that a more substantial impact on profitability can be achieved by optimizing various operational areas simultaneously rather than focusing on smaller or isolated changes.

Causal Question 2: The direct effect of 0.1978 indicates that Inventory/total assets and %$\Delta $ in Net profit margin positively impact earnings. The value suggests that better inventory efficiency and improved profitability can help to increase earnings, making them valuable levers for reducing risks associated with earnings decline. The placebo test, with an estimated effect of 0.0 and a p-value of 1.0, confirms that the observed direct impact is not due to random variation. This validation adds confidence to the interpretation, suggesting that the relationship between the adjustments in inventory and net profit margin and earnings is robust. When simulating a scenario with a 10% reduction in Inventory/total assets and a 20% increase in %$\Delta $ in Net profit margin, the effect on earnings rises to 0.2197. This result shows that more significant, simultaneous changes across both variables amplify their positive impact on earnings. This supports the idea that substantial, coordinated adjustments in inventory management and profitability can further reduce the risk of earnings decline by creating a more pronounced stabilizing effect on earnings.

Causal Question 3: The direct effect of −9.2133e-06 suggests a very slight negative impact on Earnings when both %$\Delta $ in Operating profit (before depreciation) to sales and %$\Delta $ in sales are increased. This slight negative value, close to zero, indicates that changes in these two variables alone may not move earnings in the expected positive direction. The placebo test shows a new effect of 0.0 with a p-value of 1.0, indicating that the small negative impact observed is statistically insignificant and potentially due to noise. This result implies that the direct effect may not be a genuine causal relationship but rather a weak or negligible association with Earnings. In a simulated scenario where %$\Delta $ in Operating profit (before depreciation) to sales is increased by 10% and %$\Delta $ in sales is increased by 20%, the effect on earnings is estimated at −8.3920e-06. This value, still close to zero, reinforces the idea that even larger changes in these two variables do not contribute to a meaningful increase in earnings. This suggests that focusing solely on operating profit and sales changes might not be the optimal strategy for earnings growth.

These findings suggest that the factors with the most substantial positive impact on earnings are Inventory/total assets, %$\Delta $ in sales, %$\Delta $ in Pretax income to sales, and %$\Delta $ in Net profit margin. These variables, particularly when optimized together, contribute significantly to earnings growth and profitability. Conversely, %$\Delta $ in Operating profit (before depreciation) to sales and %$\Delta $ in sales alone (when paired with operating profit) do not influence earnings meaningfully. This analysis suggests that a strategic focus on inventory efficiency, sales growth, and improved profit margins is more effective for driving earnings than focusing on operating profit or isolated changes in sales.

E. Implementing a Predictive Model for Predicting Future Earnings

This study introduces an implementation framework for predicting a company’s future earnings. The Securities and Exchange Commission of Thailand [58] recently accused a company in the Industrials sector of irregularities in its 2022 financial reporting. The predictive model leverages historical financial descriptor data from the company spanning a decade (2019–2022). Table 9 presents the forecasted results generated by this model.

TABLE 9 The Anchor If-Then Rule Results From Deployment Model

The Securities and Exchange Commission of Thailand accused the company of irregularities in its 2022 financial reporting, and the model’s predictions from 2019 to 2022 reveal unusual patterns and potential red flags consistent with that accusation. In 2019, the model predicted increased earnings based on the combination of a significant decrease in the pretax income to sales ratio and the operating income to total assets ratio, along with a substantial increase in sales. This pattern may suggest potential earnings manipulation or unsustainable growth, consistent with the findings of Beneish [59], who developed a model to detect earnings manipulation and identified indicators such as rapid sales growth, deteriorating profit margins, and increasing accruals as signs of aggressive accounting practices. In 2020, the model predicted increased earnings based on a significant increase in dividend per share despite a decrease in the gross margin ratio, which raises questions about the sustainability of earnings growth and dividend payouts and contradicts the principle of dividend smoothing established by Lintner [60], which holds that companies aim to maintain stable dividend payouts. In 2021, the model predicted increased earnings based on the combination of a significant increase in dividend per share, a substantial increase in sales, a decrease in the gross margin ratio, and a substantial increase in total assets. This combination may suggest aggressive growth strategies or potential earnings management, consistent with the findings of Dechow et al. [61], who examine indicators of earnings management such as rapid growth in sales and total assets, changes in accounting policies, and unusual accruals. Finally, in 2022, the model predicted decreased earnings based on a significant increase in the debt-equity ratio, a non-positive gross margin ratio, a significant decrease in sales, and a negative return on opening equity, strongly indicating financial distress and deteriorating performance, in line with the Altman Z-score model introduced by Altman [62], which uses financial ratios to predict the likelihood of corporate bankruptcy. The unusual patterns and inconsistencies in the predictions from 2019 to 2022—such as significant dividend increases despite decreasing gross margins, substantial increases in sales and total assets alongside declining profitability ratios, and the sudden deterioration in 2022—may have raised suspicions and led to the accusation by the Securities and Exchange Commission of Thailand.

From 2019 to 2022, the analysis reveals a dynamic interplay between financial variables and earnings, closely aligned with the high-stakes causal questions. In 2019, the focus was on operational efficiency and sales growth, with key variables such as %$\Delta $ in Pretax income to sales, %$\Delta $ in sales, and the Quick ratio driving earnings growth. These findings align with Causal Question 1, emphasizing the importance of improving operational and financial efficiency to enhance profitability. In 2020, the emphasis shifted to profitability and stability, highlighting $\Delta $ in dividend per share, %$\Delta $ in Net profit margin, and the Gross margin ratio as critical factors. This supports Causal Question 2, which emphasizes the role of profitability metrics and financial policies in stabilizing earnings. By 2021, the analysis underscored synergistic growth through variables including Inventory/total assets, %$\Delta $ in sales, %$\Delta $ in Net profit margin, and the Debt-equity ratio, aligning with Causal Questions 1 and 2. This year demonstrated the importance of coordinated, multi-faceted strategies to maximize earnings.

In 2022, the focus shifted to risk mitigation, with factors such as a high Debt-equity ratio, a negative Gross margin ratio, and declining sales contributing to the earnings decline. These findings reflect Causal Question 2, emphasizing the importance of identifying and addressing risks to maintain financial stability. Across all years, the limited emphasis on %$\Delta $ in Operating profit (before depreciation) to sales aligns with Causal Question 3, confirming that this variable has minimal influence on earnings growth compared to others. The analysis highlights consistent growth drivers, including Inventory/total assets, %$\Delta $ in sales, %$\Delta $ in Pretax income to sales, and %$\Delta $ in Net profit margin, while identifying key risk factors such as the Debt-equity ratio and Gross margin ratio, particularly in challenging financial environments like 2022.

The findings demonstrate a shift from operational and profitability growth in 2019–2021 to risk management in 2022, underscoring the adaptability of the high-stakes causal questions in addressing both growth opportunities and risks. These insights validate the importance of focusing on inventory efficiency, sales growth, profit margins, and financial stability as essential strategies for achieving long-term profitability and managing earnings volatility in a dynamic business environment.

SECTION VI.

Conclusion

This study presents an integrated framework combining Optimized XGBoost with RENN, Anchor explanations, and DoWhy causal inference to analyze earnings factors. The framework balances predictive accuracy with interpretability, enabling reliable earnings predictions and a clear understanding of causal relationships in financial data.

The findings suggest that Inventory/total assets, %$\Delta $ in sales, %$\Delta $ in Pretax income to sales, and %$\Delta $ in Net profit margin have the most substantial positive impact on earnings. When optimized together, these variables contribute significantly to earnings growth and profitability, demonstrating the synergistic effects of coordinated operational improvements. In contrast, %$\Delta $ in Operating profit (before depreciation) to sales and %$\Delta $ in sales alone (when paired with operating profit) show minimal or negligible impact on earnings, suggesting that a strategic focus on inventory efficiency, sales growth, and profit margins is more effective than isolated adjustments in operating profit or sales. This study makes notable contributions to academic research by demonstrating a practical causal and predictive analysis methodology in financial contexts. Combining Optimized XGBoost with RENN, Anchor explanations, and causal inference offers a replicable framework that balances high model performance with interpretability. The research emphasizes the importance of addressing class imbalance in financial data, a frequent issue in real-world datasets, using RENN with XGBoost. Additionally, it shows how causal inference can validate machine learning predictions by distinguishing genuine causal factors, enriching the literature on causal AI and its applications in finance.

From a practical perspective, this research provides actionable insights for financial decision-makers aiming to optimize earnings. By identifying key drivers of profitability, the study offers a data-driven basis for prioritizing operational variables in strategic planning. The interpretability provided by Anchor explanations enables managers to understand and communicate the impact of specific operational adjustments on earnings, fostering transparency in AI-driven decision-making. Moreover, the robust predictive performance of Optimized XGBoost with RENN offers a reliable tool for earnings forecasting, which can be applied across various industries where financial prediction and risk assessment are critical.

This study has certain limitations. First, using a single dataset with specific variables may limit the generalizability of the findings across different industries, as each sector may have unique factors influencing earnings that require tailored models and variables. Second, while RENN effectively addresses class imbalance, it may also remove valuable data points, potentially affecting model performance. Third, our model primarily relies on firm-specific financial indicators to predict earnings fluctuations. Although these indicators provide valuable insights, they do not explicitly account for external macroeconomic conditions, industry-specific shocks, or competitive dynamics, which can significantly influence earnings changes.

For future research, the model could be enhanced by integrating external economic indicators such as GDP growth, interest rates, inflation, and market competition indices to assess their impact on predictive performance. Additionally, testing the proposed framework on datasets from multiple industries would improve its generalizability, applicability, and robustness. Researchers could also explore alternative causal inference methods, such as causal forests or Bayesian networks, which can handle non-linear relationships more effectively. Combining these advanced causal techniques with macroeconomic and sector-specific variables would provide a more comprehensive understanding of earnings dynamics.
