Loan Portfolio Dataset From MakerDAO Blockchain Project

Decentralized finance (DeFi) offers a range of financial instruments and services that leverage the capabilities of web3 technology. Maker protocol, which enables users to obtain loans backed by cryptocurrencies, is one of them. Unlike traditional banks, Maker’s data is transparently recorded on the Ethereum blockchain. In this research paper, we focus on analyzing the lending aspect of Maker from a traditional finance perspective. To achieve this, we create a unique dataset with loan portfolios from the MakerDAO project, making it the first dataset of its kind in the DeFi field. This publicly available dataset contains essential financial characteristics related to borrowing, including balance, loss given default, annual equivalent rate, and probability of default. Additionally, we develop a specialized mathematical model tailored specifically to this project. This model allows us to estimate the probability of default by considering the presence of crypto-collateral and utilizing Brownian motion passage levels. The results of this study provide valuable insights into lending practices in DeFi projects. They also help bridge the gap between traditional finance and blockchain-based financial services.


I. INTRODUCTION
Financial services refer to the range of activities and products provided by financial institutions, such as banks, insurance companies, and investment firms, to meet the financial needs of individuals and businesses. These services encompass various aspects of managing money, including lending, investing, insurance, and risk management [1], [2]. Financial services are highly regulated to maintain confidence in the financial system, provide financial stability, and protect consumers [3], [4]. For example, the Basel framework (BF) is the full set of standards of the Basel Committee on Banking Supervision, which is the primary global standard setter for the prudential regulation of banks [5]. As of early 2024, 28 jurisdictions covering half of the world's population use the BF. The key quantities of a loan in the BF are the interest rate, the loss given default, and the probability of default. However, real bank data is confidential and may contain sensitive personal information [6]; hence, it is not openly available for computing these parameters.
The associate editor coordinating the review of this manuscript and approving it for publication was Mueen Uddin.
Decentralized finance (DeFi), i.e., peer-to-peer financial services on public blockchains [7], [8], not only brings new web3-based services but also provides analogs of traditional financial instruments [9], [10]. One of them is Maker, a blockchain protocol that facilitates crypto-backed loans [11]. A set of smart contracts implements the Maker protocol on the Ethereum blockchain, and a decentralized autonomous organization (DAO) named MakerDAO governs the project, including the assignment of economic parameters. Hereafter, we will use the terms MakerDAO and Maker interchangeably to refer to both the project and the protocol.
Maker's smart contracts are deployed on the Ethereum blockchain, making all related transactions visible to everyone. These transactions contain financial information, including user operation names and amounts. While it is technically possible to conceal this information using zero-knowledge proofs [12], [13], [14], doing so would greatly complicate and slow down the protocol, increase transaction fees, and reduce transparency for participants, ultimately making the project less reliable. Maker does not encrypt transaction data; the only hidden information relates to the real-world entities behind the protocol users' identifiers. As a result, financial information can be extracted from the protocol. While user identifiers are visible, actual names are not.
Despite the transparency of transactions in blockchain-based projects, they lack regulation and standards, as noted in previous studies [15], [16]. The goal of the current research is to analyze lending in the DeFi project Maker from a traditional finance perspective. The outcome is twofold: we provide a real lending portfolio dataset and equip it with standard banking numerical parameters.
The main contributions of our work are summarized below:
1) We introduced a novel loan portfolio dataset obtained from the MakerDAO project, making it the first dataset of its kind in the field of DeFi. The dataset, along with utility functions for easy access, is publicly available at [17]. However, it should be noted that the dataset is currently limited to the ETH-A lending program.
2) We incorporated borrowing-driven financial characteristics, such as balance, loss given default, annual equivalent rate, and probability of default, into the aforementioned dataset.
3) We developed a project-specific mathematical model to estimate the probability of default. This model takes into account the presence of crypto-collateral and utilizes Brownian motion passage levels to provide a comprehensive understanding of both individual loan defaults and the correlation among different loans.
The rest of the paper is organized as follows. Section II provides an overview of the related papers. In Section III, we present a comprehensive perspective on the Maker protocol from the borrower's point of view. Section IV introduces mathematical models that describe the financial characteristics of loans. The structure of the dataset is outlined in Section V. In Section VI, a quantitative analysis is conducted to demonstrate the practicality of the collected data and the effectiveness of the proposed computational models. Finally, Section VII provides concluding remarks for the paper.

II. RELATED WORK
Banks use a variety of tools to maintain reasonable risk levels and increase efficiency, including regulator-required frameworks like the Basel framework [5] and machine learning models [18], [19]. Loan portfolio data has also attracted researchers' attention, with some studies accessing proprietary data that is not publicly available. For example, papers [20], [21] examine the impact of loan portfolio diversification on risk and capital efficiency based on large German and Australian banks, respectively. They have access to more than a thousand individual bank portfolios over seven years. In [22], various machine learning techniques for predicting non-performing loans are compared using a four-year portfolio dataset provided by a bank, consisting of 181 thousand borrowers and hundreds of features. The paper [23] applies a random forest to classify non-performing loans in an Indonesian bank loan dataset with 3300 borrowers and 12 features.
Several classic finance datasets are publicly available, such as Home Credit Default Risk on Kaggle, which challenges participants to predict the probability of default among 307 thousand debts using 239 features [24]. The UC Irvine Machine Learning repository contains several credit datasets, with the Taiwan credit card defaults dataset being the largest, containing 30 thousand debts and 24 features [25]. The peer-to-peer lending platform Lending Club provides a dataset containing 887 thousand debts collected from 2007 until 2015 with 79 features to predict the probability of default [26].
In addition to traditional finance, there have been several studies on decentralized finance (DeFi) and its potential impact on traditional finance. For instance, the risk model of DeFi money lending was analyzed using tools from modern portfolio theory in papers [27], [28]. The challenges and opportunities of DeFi in the financial industry were analyzed in [29]. The risks and benefits of DeFi from the perspective of financial intermediation were analyzed in [30]. The latter paper also proposed a framework for analyzing DeFi projects.
While available data from traditional banking are limited due to trade secrecy and privacy concerns, DeFi loan data is openly available on public blockchains. Several studies have been conducted on DeFi lending platforms such as Maker, AAVE, Compound, and Spark Lend [11], [31], [32], [33], including their data collection, economic parameters estimation, and risk management. For example, the paper [34] analyzes data from the decentralized Ethereum protocol called Compound, using a relational database and providing statistical details to facilitate further analysis. The paper [31] assesses the stability of the DAI stablecoin of the Maker project over the course of its first year of deployment, including the cryptocurrency crisis in March 2020. The issue of high collateral requirements for blockchain-based loans using cryptocurrencies as collateral due to their high volatility is discussed in [35], and the authors propose a solution to make loans more accessible by offering lower collateral requirements while keeping the risk for lenders bounded.
DeFi possesses unique operational characteristics. For instance, all computations are executed lazily and only upon request. However, it is worth noting that the volume of centralized crypto finance surpasses that of the decentralized counterpart. Consequently, DeFi relies on oracles to incorporate centralized rates [36], [37], [38]. Oracles are compensated for their transactions and receive interest for their services. Typically, platforms obtain values from oracles at desired intervals. Nevertheless, in times of crisis [39], delays may occur, leading to deviations and unstable values as shown in [40]. Moreover, the unique features of DeFi pose challenges for new risk models. While models for the entire system exist, for example, the stochastic model for collateral-based stablecoins [41], models specifically addressing individual debts are yet to be presented.
However, there has been limited research on providing a real lending portfolio dataset for Maker and equipping it with standard banking numerical parameters. Such a dataset could be useful for both academic research and practical applications, such as risk management and portfolio optimization in Maker lending.

III. MAKER PROTOCOL FOR BORROWER
The Maker Protocol [11] operates using the native DAI token, an ERC-20 token [42] with a one-to-one soft peg to the United States dollar. The protocol allows for collateral-secured DAI debts, with loan terms defined by financial parameters. A DAO mechanism to modify certain debts is included in the corresponding smart contract. Financial parameters, such as the lending interest rate $f$ (the multiplier applied to the loan balance over time) and the liquidation ratio $r$ (the minimum allowed ratio of the locked collateral value to the debt value), are examples of these loan terms. Users can deposit the Ethereum native cryptocurrency called Ether (ETH) or other tokens into their instance of a specific smart contract (Vault) and use them as collateral to mint DAI debt.
Let's consider a borrower's workflow in the Maker Protocol. The borrower starts by creating a Vault and depositing supporting collateral. The Maker Protocol's Oracle then evaluates the collateral, providing a real-time price feed for each asset. Based on the current market value of the collateral and the chosen borrowing program, the protocol calculates the maximum amount of DAI that can be borrowed. For example, ETH-A, ETH-B, and ETH-C are different borrowing programs with ETH as collateral, each with its own set of parameters and risk profiles. ETH-A is the original and most commonly used program, while ETH-B and ETH-C were introduced to offer additional options for users with different risk tolerances or preferences.
Once the borrower has minted DAI, they can use it for any purpose. The borrower is responsible for repaying the loan with interest, which is calculated based on the lending interest rate and the duration of the loan. They can fully or partially repay the loan at any time, borrow more up to the maximum permitted by the collateral program and the size of the collateral, or increase or decrease the amount of collateral.
If the value of the collateral falls below a certain threshold, the Vault is at risk of being liquidated. In this case, the MakerDAO system will automatically initiate a liquidation process, which involves selling off a portion of the collateral to cover the outstanding debt. The borrower cannot interact with the Vault during the liquidation process. The liquidation process is designed to be fast and efficient, with the goal of minimizing losses for both the user and the MakerDAO system. When a Vault is liquidated, the collateral is sold through an auction, allowing users to bid on the collateral using DAI. The auction is competitive, with bidders offering progressively lower prices until the collateral is sold.
If the auction is successful and the collateral is sold for a price that covers the outstanding debt, the remaining DAI is returned to the user. If the auction is unsuccessful and the collateral is not sold for a sufficient price, the MakerDAO system may take a loss on the liquidation. The resulting penalty for the borrower is flexible but usually ranges from 10% to 33%.
All actions involving the Vault and system parameters are recorded as plaintext Ethereum blockchain transactions. However, these transactions may be challenging for the general audience to understand due to Maker's use of technical terms such as ilk, frob, and art. To address this issue, we aim to present the loan portfolio dataset from the Maker project in a more accessible format.

IV. MATHEMATICAL MODELS
This study focuses on a single type of collateral. To obtain a loan in the Maker protocol, a user must have a Vault. A Vault can be associated with a single user only and cannot be transferred. Ethereum addresses, which are 42-character hexadecimal strings, represent users. While an address does not reveal information about the real-world owner, some users may indirectly or directly disclose their identity. At the same time, a user can have multiple Vaults, and we keep track of which Vaults belong to which user even in the case of anonymous users.
We can determine if a Vault has an active loan at a given time by checking if its DAI debt is non-zero. Therefore, we define the beginning of a loan as the moment when the DAI debt changes from zero to a positive number. We define the end of a loan as the moment when the DAI debt becomes zero from any nonzero value. A loan can be active or ended, with the latter occurring either through successful repayment or liquidation. A single Vault can have multiple loans, and all loans in it have non-overlapping beginning-to-end time intervals. Now let's examine the financial characteristics of a loan.
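The loan definition above can be sketched in a few lines of Python. This is a minimal illustration, assuming a Vault's history is given as sorted update times with the DAI debt right after each update; the function name `split_into_loans` and the input layout are hypothetical and are not taken from the published utility functions.

```python
from typing import List, Optional, Sequence, Tuple


def split_into_loans(
    times: Sequence[float], debts: Sequence[float]
) -> List[Tuple[float, Optional[float]]]:
    """Split one Vault's debt history into (start, end) loan intervals.

    times -- sorted update times (hypothetical input format)
    debts -- DAI debt right after each update
    An end of None marks a loan that is still active.
    """
    loans: List[Tuple[float, Optional[float]]] = []
    start: Optional[float] = None
    previous_debt = 0.0
    for t, debt in zip(times, debts):
        if previous_debt == 0 and debt > 0:
            start = t  # debt changes from zero to positive: loan begins
        elif previous_debt > 0 and debt == 0:
            loans.append((start, t))  # debt returns to zero: loan ends
            start = None
        previous_debt = debt
    if start is not None:
        loans.append((start, None))  # loan still active at the last update
    return loans


history_times = [1, 2, 3, 4, 5]
history_debts = [100.0, 150.0, 0.0, 50.0, 60.0]
print(split_into_loans(history_times, history_debts))  # [(1, 3), (4, None)]
```

Because a new loan can only start after the previous one has ended, the resulting intervals are non-overlapping by construction.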

A. BALANCE
When a user borrows DAI in the Maker project, they need to provide collateral. Without loss of generality, we will refer to the collateral asset as ETH. The loan starts at $t_0$ and lasts until $T$, which can be either the liquidation time, the full repayment time, or the maximum observed time if the loan is still active at that point.
The amount of collateralization assets at any given time $t$ is denoted by $a(t)$ (an example is shown in Figure 1). The blockchain records updates to the collateral balance as a piece-wise constant function, represented by update times $\tau$ and corresponding changes $\Delta a(\tau)$ due to collateral deposits or withdrawals, and liquidation processes. The maximum allowed debt is determined by the collateral price in DAI and the minimum allowed collateralization ratio $r_{\min}(t)$. Oracles provide the ETH/DAI exchange rate $e(t)$, which is typically consistent with centralized exchange rates except in cases of extremely high transaction fees [31]. The minimum allowed collateralization ratio $r_{\min}(t)$ is a piece-wise linear function with small slopes at non-constant intervals to ensure platform stability. Since debts in the Maker project are over-collateralized, we have that $r_{\min}(t) > 1$. Let $d(t)$ be the debt at time $t$ (an example is shown in Figure 2). Interest is charged on the active debt, with the logarithm of the interest over time denoted by $f(t)$. If no actions are taken on the debt during an interval $(t_1, t_2]$, then the debt at time $t_2$ can be calculated as
$$d(t_2) = d(t_1) \exp\left(\int_{t_1}^{t_2} f(t)\,dt\right).$$
The log-interest $f(t)$ is piece-wise constant by design of the platform. If $f(t) \equiv f$ on $(t_1, t_2]$, then $d(t_2) = d(t_1)\, e^{f (t_2 - t_1)}$. So the debt is piecewise exponential. The function breaks are due to debt repayment, additional borrowing, or the liquidation process. Changes in the log-interest cause derivative breaks without function breaks.
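As an illustration of how the debt accrues between borrower actions, the sketch below integrates a piece-wise constant log-interest over an interval and applies the exponential growth described above. The function name `accrue_debt` and the representation of the rate schedule (change times plus rates per unit time) are assumptions made for this example only.

```python
import math
from typing import Sequence


def accrue_debt(
    d1: float,
    t1: float,
    t2: float,
    rate_times: Sequence[float],
    rates: Sequence[float],
) -> float:
    """Debt at t2 given debt d1 at t1 and no borrower actions in (t1, t2].

    rate_times[i] is the time from which the log-interest rates[i]
    (per unit time) applies; rate_times is sorted and rate_times[0] <= t1.
    """
    integral = 0.0
    for i, f in enumerate(rates):
        seg_start = max(rate_times[i], t1)
        next_change = rate_times[i + 1] if i + 1 < len(rate_times) else t2
        seg_end = min(next_change, t2)
        if seg_end > seg_start:
            integral += f * (seg_end - seg_start)  # integral of f over the segment
    return d1 * math.exp(integral)


# Constant log-interest 0.01 for 10 time units.
print(accrue_debt(100.0, 0.0, 10.0, [0.0], [0.01]))
# Log-interest changes from 0.01 to 0.02 at t = 5: the curve stays continuous,
# only its slope changes.
print(accrue_debt(100.0, 0.0, 10.0, [0.0, 5.0], [0.01, 0.02]))
```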
The current collateralization ratio $r(t)$ for $d(t) > 0$ equals (see Figure 3)
$$r(t) = \frac{a(t)\, e(t)}{d(t)}.$$
If $d(t) = 0$, we set $r(t) = +\infty$. If $r(t)$ drops below $r_{\min}(t)$ at any point in time, the platform triggers the liquidation. The collateralization requirement check is normally near real-time, and the borrower is responsible for paying the interest during the liquidation period.
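The ratio check can be written down directly. The following is a minimal sketch, assuming the ratio is the collateral value in DAI divided by the debt, as the definitions above suggest; the function names are illustrative and do not correspond to Maker contract methods.

```python
import math


def collateralization_ratio(a: float, e: float, d: float) -> float:
    """Collateral value a * e (in DAI) divided by the debt d; +inf if no debt."""
    if d == 0:
        return math.inf
    return a * e / d


def is_liquidatable(a: float, e: float, d: float, r_min: float) -> bool:
    """True when the current ratio has dropped below the minimum allowed one."""
    return collateralization_ratio(a, e, d) < r_min


def max_debt(a: float, e: float, r_min: float) -> float:
    """Largest debt that keeps the ratio at or above r_min."""
    return a * e / r_min


# 10 ETH at 2000 DAI/ETH backing 10000 DAI of debt, minimum ratio 1.5.
print(collateralization_ratio(10.0, 2000.0, 10000.0))  # 2.0
print(is_liquidatable(10.0, 1400.0, 10000.0, 1.5))     # True
```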

B. LOSS GIVEN DEFAULT
Loss given default (LGD) refers to the portion of an asset that is lost in the event of a borrower defaulting [5]. In the Maker protocol, debts are typically over-collateralized, resulting in losses for users in most cases. We can represent a user's balance at time $t$ as […] and the left-side limit for any given function $\phi$ as $\phi(t-) = \lim_{\tau \uparrow t} \phi(\tau)$. To calculate LGD for a user's collateral liquidation at time $t$, we use the following formula: […] A positive value for LGD indicates a loss for the user, while a negative value indicates a gain. To determine the average of $D$ user defaults at times $t_1, \ldots, t_D$, we use a weighted average: […] (4)

C. ANNUAL EQUIVALENT RATE
The interest rate on the platform changes over time and is charged at one-second intervals. If a loan is liquidated, the loss of collateral value during liquidation is calculated as $(a(T) - a(T-)) \cdot e(T)$. The log-equivalent rate (LER) is a constant log-interest rate that results in the same final debt, including any liquidation losses, for a debt from $t_0$ to $T$ with debt changes $d_0, \ldots, d_N$ at times $t_0, \ldots, t_N$, respectively. To find the LER, we use the cumulative debt at time $T$ with LER $= x$, denoted by $h(x)$, which is calculated as […] The LER is then determined by solving the following equation: […] (6) where $h(x)$ equals the final debt plus any liquidation losses. The function $h(x)$ is monotonically increasing for $x > 0$ since $d(t) > 0$ for $t \in (t_0, T)$. Therefore, if there is no default, the LER falls within the range $[\min_{t \in (t_0, T)} f(t), \max_{t \in (t_0, T)} f(t)]$. However, if there is a default, the LER can be large and the solution of (6) may be unstable. To avoid this issue, we only consider values of LER that are less than or equal to a fixed constant $f_{\max} > 0$.
To determine the average of $D$ users at time $t$, we use a weighted average: […] (7)
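A numerical sketch of the LER computation is given below. It rests on assumptions that are not quoted from the paper: the cumulative debt is taken as $h(x) = \sum_n d_n e^{x (T - t_n)}$ (each debt change compounding at the constant log-rate $x$ until $T$), time is measured in seconds, and the annualized rate is obtained as $e^{x \cdot \text{year}} - 1$. The names `cumulative_debt`, `solve_ler`, and `annualize` are hypothetical.

```python
import math
from typing import Sequence, Tuple

SECONDS_PER_YEAR = 365 * 24 * 3600


def cumulative_debt(x: float, changes: Sequence[Tuple[float, float]], end_time: float) -> float:
    """Assumed form of h(x): every change (t_n, d_n) grows at log-rate x until end_time."""
    return sum(d * math.exp(x * (end_time - t)) for t, d in changes)


def solve_ler(
    changes: Sequence[Tuple[float, float]],
    end_time: float,
    target: float,
    f_max: float,
    iterations: int = 200,
) -> float:
    """Solve cumulative_debt(x) = target on [0, f_max] by bisection.

    The result is capped at f_max, mirroring the cap on the LER in the text.
    Monotonicity of h is assumed (positive debt changes).
    """
    if cumulative_debt(f_max, changes, end_time) <= target:
        return f_max
    lo, hi = 0.0, f_max
    for _ in range(iterations):  # fixed iteration count: per-second rates are tiny
        mid = 0.5 * (lo + hi)
        if cumulative_debt(mid, changes, end_time) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def annualize(ler: float) -> float:
    """Annual rate implied by a per-second log-rate (an assumed conversion)."""
    return math.expm1(ler * SECONDS_PER_YEAR)


# 1000 DAI borrowed at t = 0, 1050 DAI owed after one year.
f_max = math.log(3.0) / SECONDS_PER_YEAR
ler = solve_ler([(0.0, 1000.0)], SECONDS_PER_YEAR, 1050.0, f_max)
print(annualize(ler))
```

A fixed number of bisection steps is used instead of an absolute tolerance because per-second log-rates are of the order of $10^{-9}$, where a conventional tolerance would be too coarse.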

D. PROBABILITY OF DEFAULT
The probability of default (PD) is a risk assessment parameter commonly used by financial institutions. It describes the likelihood of default over a particular time horizon. Let us consider a dataset of loans from the Maker platform in the format of time intervals as follows:
• $N$ intervals of the time till default for default debts: $t_1, \ldots, t_N$
• $M$ intervals of the time during which there were no defaults: $\tau_{N+1}, \ldots, \tau_{N+M}$. These intervals correspond to either active debts or returned debts. For active debts, $\tau$ is the interval from the debt opening until the dataset generation; for returned debts, it is the interval from the debt opening until the debt repayment.
We consider different debt models below.

1) POISSON MODEL
The classic finance baseline model is a Poisson model that assumes all debts are independent and have an exponential distribution with an unknown parameter $\lambda > 0$ for the time until default [43]. This simplification allows for the estimation of $\lambda$ [44], [45]. However, this assumption does not hold for Maker data since all debts are based on the same collateral type but with different collateralization ratios.

Statement 1.
Let $\lambda$ be the parameter of the exponential distribution. Let $X_1, \ldots, X_{N+M}$ be independent and identically distributed (i.i.d.) random variables from the exponential distribution with parameter $\lambda$. Let $x_1, \ldots, x_{N+M}$ be realizations of $X_1, \ldots, X_{N+M}$. Given $x_1, \ldots, x_N$ and deterministic parameters $y_{N+1}, \ldots, y_{N+M}$ such that $\forall n = N+1, \ldots, N+M: x_n > y_n$, the maximum likelihood estimator (MLE) $\hat{\lambda}$ of the parameter $\lambda$ is […] (8)
Proof: The likelihood is defined as […] where $I(A)$ is the indicator function of the event $A$, i.e., $I(A) = 1$ if $A$ is true and $I(A) = 0$ if $A$ is not true. The likelihood is non-negative, and $L_{N,M}(\lambda)$ equals 0 if any $x_n < y_n$, $n = N+1, \ldots, N+M$. So the maximum of $L_{N,M}(\lambda)$ is for $x_{N+1}, \ldots, x_{N+M}$: […] (9) and the maximization of (9) is equivalent to the classic problem for the exponential distribution with the log-likelihood expressed as […] The observation that the MLE $\hat{\lambda}$ of $\lambda$ is given by (8) finishes the proof.
The probability of default for a single debt during $T$, where $X$ is an exponential random variable with parameter $\lambda$, can be written as
$$PD = P(X \le T) = 1 - e^{-\lambda T}.$$
As the likelihood is functionally equivariant [44], the MLE for PD is
$$\widehat{PD} = 1 - e^{-\hat{\lambda} T},$$
where $\hat{\lambda}$ is given by (8). It is important to note that the model assumes independence between debts, so the covariance between different users is zero.
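For orientation, the Poisson baseline can be sketched as follows. The rate estimator used here is the textbook maximum-likelihood estimator for right-censored exponential data (number of observed defaults divided by total observed time); this choice is an assumption and may differ from estimator (8). The function names are hypothetical.

```python
import math
from typing import Sequence


def censored_exponential_mle(
    default_times: Sequence[float], censored_times: Sequence[float]
) -> float:
    """Textbook rate estimate: observed defaults divided by total exposure time.

    default_times  -- times until default for defaulted debts
    censored_times -- observation times of debts that did not default
    """
    n_defaults = len(default_times)
    exposure = sum(default_times) + sum(censored_times)
    if n_defaults == 0 or exposure <= 0:
        raise ValueError("need at least one default and positive exposure")
    return n_defaults / exposure


def poisson_pd(lam: float, horizon: float) -> float:
    """Probability that an exponential time with rate lam is at most horizon."""
    return -math.expm1(-lam * horizon)


lam_hat = censored_exponential_mle([1.0, 2.0, 3.0], [4.0, 10.0])
print(lam_hat)                    # rate per unit time
print(poisson_pd(lam_hat, 1.0))   # plug-in PD over a horizon of one time unit
```

The plug-in step uses the invariance of maximum likelihood under transformations, which is the property the text invokes for the PD estimate.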

2) BROWNIAN MOTION MODEL
Another model considers the minimal allowed collateralization ratio $r_{\min}(t)$ in comparison to the actual user's collateralization $r(t)$. We assume that the logarithm of $e(t)/e_0$ follows a Brownian motion with zero mean and an unknown standard deviation $\sigma > 0$. Therefore, $\frac{1}{\sigma} \ln \frac{e(t)}{e_0}$ is a Brownian motion $B_t$ with zero mean and unit variance.
Let us denote […] for fixed parameters $f$ and $T$.
Theorem 1. If
1) the normalized exchange rate $\frac{1}{\sigma} \ln \frac{e(t)}{e_0}$ for a given constant $\sigma > 0$ is a Brownian motion $B_t$ with zero mean and unit variance,
2) the borrower has a debt $d_0$ and collateral $a_0$ at time $t = 0$,
3) the borrower has no actions with debt and collateral during $t \in (0, T]$,
4) the platform's interest rate $f \ge 0$ and the minimum collateralization ratio $r_{\min} > 0$ are constant,
then the probability of the borrower's default during the time interval $(0, T]$ and its variance are given by $PD = \psi(x_{\min})$ and $\mathrm{Var} = \psi(x_{\min}) (1 - \psi(x_{\min}))$ (16), respectively, where […] (17)
Proof: Firstly, let the stability fee $f = 0$. Then $a(t) = a(0) \equiv a_0$ and $d(t) = d(0) \equiv d_0$. Therefore, a debt default is equivalent to the existence of such $t > 0$ that […] A debt default occurs when the Brownian motion $B_t$ reaches the level $x_{\min}$ (17). Let us denote $T_C = \inf\{t > 0 : B_t = C\}$ […] I.e., a default is the passage of a level by the Brownian motion $B_t$ (see Figure 4). Let […] Then for $C < 0$ and $f > 0$ […] And from (14): $PD = \psi(x_{\min})$. As the default is a Bernoulli random variable, its variance is given by (16) and the theorem statement follows.
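The level-passage probability can be evaluated in closed form with standard results for Brownian motion, sketched below. For a constant level $C < 0$ the reflection principle gives $2\Phi(C/\sqrt{T})$; for a level moving linearly in time, the known formula for the running minimum of a Brownian motion with drift is used. The mapping from loan parameters to the level and slope in `default_probability` is our own normalization derived from the default condition on $e(t)$ and may differ from the paper's definition of $x_{\min}$; all function names are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def passage_probability(level: float, horizon: float, slope: float = 0.0) -> float:
    """P(B_t <= level + slope * t for some t in (0, horizon]) for standard B_t.

    For slope == 0 this reduces to the reflection-principle value
    2 * normal_cdf(level / sqrt(horizon)).
    """
    if level >= 0:
        return 1.0  # the process starts at or below the level
    s = math.sqrt(horizon)
    return normal_cdf((level + slope * horizon) / s) + math.exp(
        -2.0 * level * slope
    ) * normal_cdf((level - slope * horizon) / s)


def default_probability(
    a0: float, d0: float, e0: float, r_min: float, f: float, sigma: float, horizon: float
) -> float:
    """PD under an assumed normalization: level = ln(d0 r_min / (a0 e0)) / sigma,
    slope = f / sigma (illustrative, not quoted from the paper)."""
    level = math.log(d0 * r_min / (a0 * e0)) / sigma
    return passage_probability(level, horizon, f / sigma)


# 10 ETH at 2000 DAI/ETH, 10000 DAI debt, minimum ratio 1.5,
# daily log-price volatility 0.04, horizon of 30 days, no fee.
print(default_probability(10.0, 10000.0, 2000.0, 1.5, 0.0, 0.04, 30.0))
```

A positive fee raises the default level over time, so the passage probability with a positive slope is never smaller than the zero-fee value.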
Theorem 2. If, in addition to the assumptions 1)-4) of Theorem 1,
5) the second borrower has a debt $\tilde{d}_0$ and a collateral $\tilde{a}_0$ at time $t = 0$,
then the covariance of two borrowers' defaults during the time interval $(0, T]$ equals
$$\mathrm{cov} = \psi(\min\{x_{\min}, y_{\min}\}) - \psi(x_{\min})\, \psi(y_{\min}),$$
where […] (24)
Proof: The second user has a single asset as collateral with a default at level $y_{\min}(t) = y_{\min} + ft$, where $y_{\min}$ is given by (24). Without loss of generality, $y_{\min} \le x_{\min}$. Then the probability of the passage of both $x_{\min}$ and $y_{\min}$ is […] Denote by $I(A)$ the indicator of the event $T_{x_{\min},f} < T$, i.e., $I(A) = 1$ if $T_{x_{\min},f} < T$, and $I(A) = 0$ if $T_{x_{\min},f} \ge T$. Denote by $I(B)$ the indicator of the event $T_{y_{\min},f} < T$. The mathematical expectations and the covariance can be written as in (26) and (27), in turn.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
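The covariance of two default indicators follows from the identity $\mathrm{cov}(I(A), I(B)) = P(A \cap B) - P(A) P(B)$, with the joint probability equal to the passage probability of the lower of the two levels. The sketch below evaluates it for the zero-fee case using the reflection principle; the zero-fee simplification and the function names are assumptions of this example.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def psi(level: float, horizon: float) -> float:
    """Zero-fee passage probability of a level below zero (reflection principle)."""
    if level >= 0:
        return 1.0
    return 2.0 * normal_cdf(level / math.sqrt(horizon))


def default_covariance(x_min: float, y_min: float, horizon: float) -> float:
    """cov(I(A), I(B)) = P(A and B) - P(A) P(B).

    Both loans share one price path, so reaching the lower level
    implies that the higher level has been reached as well.
    """
    joint = psi(min(x_min, y_min), horizon)
    return joint - psi(x_min, horizon) * psi(y_min, horizon)


print(default_covariance(-1.0, -2.0, 1.0))
# With identical levels the covariance equals the Bernoulli variance.
print(default_covariance(-1.0, -1.0, 1.0))
```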
As the standard deviation of the Brownian motion $\ln \frac{e(t)}{e_0}$ is unobservable, we estimate it. More precisely, given a sample […]

3) MODELS COMPARISON
Both the Poisson process and the Brownian motion provide models for predicting the probability of default (PD). However, the true model cannot be observed directly. To test the accuracy of these models, we generated a dataset of daily defaults and utilized both models to predict defaults one day in advance. Initially, the models assume that the data is stationary. To verify this hypothesis, we employed the Augmented Dickey-Fuller (ADF) test [46]. The ADF test is based on an autoregressive model and assesses the presence of unit roots in time series, which cause trends. The null hypothesis of the ADF test is the presence of a unit root in a time series. If the p-value (the probability, under the null hypothesis, of observing a test statistic at least as extreme as the one obtained) is less than a given significance level (usually 0.05), then the null hypothesis is rejected, indicating that the time series is stationary.
To compare the fit of models to data, we use four quantities [43], [44], [47]:
• Kullback-Leibler divergence (KL) measures the divergence between two probability distributions. It quantifies the amount of information lost when one distribution is used to approximate another.
• Total Variation (TV) measures the divergence between two probability distributions. It is defined as half the sum of the absolute differences between the corresponding probabilities in the two distributions.
• Relative Root Mean Squared Error (RRMSE) and Relative Mean Absolute Error (RMAE) are quadratic and linear aggregations of point-wise distances between two datasets, respectively. In our specific case, RMAE is equivalent to TV, but we retain both terms since they are commonly used by machine learners and statisticians.
For each quantity, a smaller value indicates a better fit of the theoretical model to the empirical data.
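The four quantities can be computed with a few lines of Python. KL and TV follow their usual definitions for discrete distributions; the normalization of RRMSE and RMAE by the magnitude of the actual values is an assumption of this sketch, since the text does not spell it out.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for discrete distributions; requires q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def total_variation(p: Sequence[float], q: Sequence[float]) -> float:
    """Half the sum of absolute differences between corresponding probabilities."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def rrmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root of summed squared errors relative to the root of summed squared actuals
    (assumed normalization)."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(actual, predicted)))
    den = math.sqrt(sum(a ** 2 for a in actual))
    return num / den


def rmae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Summed absolute errors relative to summed absolute actuals
    (assumed normalization)."""
    num = sum(abs(a - b) for a, b in zip(actual, predicted))
    den = sum(abs(a) for a in actual)
    return num / den


empirical = [0.5, 0.5]
model = [0.25, 0.75]
print(kl_divergence(empirical, model))
print(total_variation(empirical, model))  # 0.25
print(rmae(empirical, model))             # 0.5
```

With this normalization and inputs that sum to one, RMAE is exactly twice TV, which is one way to read the equivalence mentioned in the list above.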

V. DATASET STRUCTURE
Our focus is on the ETH-collateralized risk program A debts (ETH-A) within the MakerDAO protocol deployed on the Ethereum network. This program has the largest number of debts (137,441 out of 259,048) and debt volume (13.4 billion DAI out of 36.9 billion DAI). We utilized publicly available data from November 11th, 2019 (the start of the first debt in the considered asset) to July 31st, 2023, accessed via the BigQuery project by Google. The collected raw data was decoded and further processed using Python. To verify specific information and ensure data correctness, we used a third-party Ethereum API provider, Infura. After processing all the internal terms such as frob and wad and focusing solely on the borrower-related aspect of the Maker protocol, we collected a loan portfolio dataset. The dataset comprises two parts: system and borrower data.
The system data contains common parameters, such as the ETH/DAI exchange rate $e(t)$, which is used to estimate the collateralization ratio since the ETH-A program deals with ETH as collateral. The exchange rate is provided by oracles and typically corresponds to the centralized exchange rate [31]. Maker's loans are overcollateralized, and the minimum allowed collateralization ratio is defined as $r_{\min}(t)$. The platform receives interest from borrowers, and the dataset contains the log-interest $f(t)$.
The borrower data contains the borrowers' catalog and debt details. The borrower catalog lists all borrowers and their debts, each with start and end times, status, and loan actions (see Table 1). The possible statuses are
• repaid: the borrower returned the debt
• liquidated: the debt is fully repaid via the liquidation process
• restructured: the debt is partially repaid via the liquidation process, and a new debt started immediately after the liquidation
• active: the loan is active by the end of the observation period (July 31st, 2023).
The raw data only includes reference points for these parameters. Our utility functions allow obtaining their values over time and plotting system parameters for the entire period (see Figure 5) and loan characteristics for a loan's lifetime (see Figure 6). The prepared dataset is publicly available on GitLab [17].
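Recovering a value over time from reference points can be illustrated with a simple lookup for piece-wise constant quantities such as the log-interest $f(t)$. This is a sketch of the idea only: the function name `value_at` and the data layout are hypothetical and do not reproduce the utility functions shipped with the dataset.

```python
from bisect import bisect_right
from typing import Sequence


def value_at(times: Sequence[float], values: Sequence[float], t: float) -> float:
    """Value of a piece-wise constant parameter at time t.

    times  -- sorted reference times at which the parameter was updated
    values -- parameter value in force from each reference time onwards
    """
    i = bisect_right(times, t) - 1  # index of the last update at or before t
    if i < 0:
        raise ValueError("t precedes the first reference point")
    return values[i]


reference_times = [0, 10, 20]
reference_values = [0.02, 0.035, 0.01]
print(value_at(reference_times, reference_values, 9.9))  # 0.02
print(value_at(reference_times, reference_values, 10))   # 0.035
print(value_at(reference_times, reference_values, 25))   # 0.01
```

Binary search keeps each lookup logarithmic in the number of reference points, which matters when a parameter is evaluated at every second of a loan's lifetime.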

VI. NUMERICAL EXPERIMENTS
To showcase the practicality of the gathered dataset (Section V) and the viability of the suggested computational models (Section IV), we conducted a quantitative analysis. The code to reproduce the experiments is available on GitLab [17]. We do note the possible presence of unexpected events that can significantly affect the market. Examples of the latter include the Black Thursday price crash on March 12th and 13th, 2020 [40], [41], and Maker's announcement of their upcoming Spark Lend project in 2023. In the following subsections, we will observe their effects.

A. BALANCE
The borrower data contains the details of all the debts. For our purposes, we find the number of debts (Figure 7) and the total collateral amount (total value locked) and debt (Figure 8), where the balance is their difference. The number of borrowers experienced a remarkable surge initially, but came to a halt during Black Thursday in 2020. The debt amount peaked in mid-2021, only to decline due to the cryptocurrency market downturn known as the crypto winter in the same year. Currently, the total debt in ETH-A is steadily decreasing towards zero as MakerDAO transitions into the Spark Lend Protocol.
24850 VOLUME 12, 2024

B. LOSS GIVEN DEFAULT
The monthly average LGD (4) is depicted in Figure 9. Maker receives a fixed percentage of 13% from each auction (represented by the dashed black horizontal line), which indicates the typical loss level in an efficient market. The efficiency of the auction is a critical factor [40] and tends to decrease during periods of declining ETH prices.

C. ANNUAL EQUIVALENT RATE
The monthly average AER (7) is shown in Figure 10. Although the AER for returned debts aligns with Maker's annual interest rate for ETH-A (Figure 5), the majority of liquidated debts have an AER that exceeds 100%.

D. PROBABILITY OF DEFAULT
The number of daily defaults has an ADF test statistic of −8.89, which is well below the 1% critical value of −3.45. This statistic enables us to reject the null hypothesis of data non-stationarity and conclude that the time series is stationary.
We have computed the day-ahead actual number of defaults together with the predictions by the Poisson and Brownian motion models. The Brownian motion model is found to be superior to the baseline Poisson process model in describing and predicting debt defaults (see Table 2). This conclusion is supported by the KL divergence, which measures the difference between the two probability distributions. The regression-specific RRMSE is worse for the Brownian motion model compared to the Poisson process model. However, the RMAE and TV are comparable between the two models.
Both the RRMSE and RMAE values are close to one, indicating that both models have poor predictive power from a regression perspective. This finding suggests that further analysis is needed.
Overall, the findings confirm the hypothesis that the Brownian motion model is better suited for capturing real-world data and demonstrate its superiority in fitting probabilistic distributions in collateral-based DeFi lending.

VII. CONCLUSION AND FUTURE WORK
This research focuses on analyzing the lending aspect of the Maker protocol in the DeFi space from a traditional finance perspective. The authors have gathered a unique dataset comprising loan portfolios sourced from the MakerDAO project, making it the first dataset of its kind in the DeFi field. This publicly available dataset contains essential financial characteristics related to borrowing, including balance, loss given default, annual equivalent rate, and probability of default. The current version of the dataset covers only Maker's most popular borrowing program, ETH-A. However, the authors plan to expand the dataset to include other programs and new Spark Lend data in future work.
In addition to collecting this dataset, the authors have developed a specialized mathematical model tailored specifically to the Maker project. This model allows them to estimate the probability of default by considering the presence of crypto-collateral and utilizing Brownian motion passage levels. The proposed model outperformed the Poisson process baseline model on the loan portfolio dataset. By incorporating borrowing-driven financial characteristics into the dataset and developing this model, the authors provide a comprehensive understanding of both individual loan defaults and the correlation among different loans.
Expanding the analysis to include other borrowing programs beyond ETH-A presents challenges in finding the default correlation of level passage times between two correlated Brownian motions representing different collateral types. The authors acknowledge this as future work. However, this also opens up opportunities to estimate the platform's risk, where simultaneous defaults of a significant portion of borrowers could pose a threat.
The findings of this study offer valuable insights into lending practices in DeFi projects and help bridge the gap between traditional finance and blockchain-based financial services. This research contributes to the understanding of how DeFi lending operates and offers a standardized approach to analyzing and evaluating loan portfolios in the DeFi space. Furthermore, the methodology can be extended to other DeFi lending platforms such as Compound and Aave.

FIGURE 1. The amount of collateralization assets $a(t)$ as a function of time.

FIGURE 2. The debt $d(t)$ as a function of time.

FIGURE 3. The collateralization ratio $r(t)$ and the minimum allowed collateralization ratio $r_{\min}(t)$ as functions of time.

$T_C = \inf\{t > 0 : B_t = C\}$ for $C < 0$. Then for $T > 0$ [45], from the reflection principle we have that $P(T_{x_{\min}} < T) = \int_0^T p_{x_{\min}}(t)\,dt$ with the density $p_C(t) = \frac{|C|}{\sqrt{2\pi t^3}}\, e^{-\frac{C^2}{2t}}$. Now, let the stability fee be a nonzero constant $f > 0$. Then $d(t) = d_0 \cdot e^{ft}$ and a default is equivalent to the existence of such $t > 0$ that $e(t) = \frac{d_0 \cdot e^{ft}}{a_0} \cdot r_{\min} \equiv e_{\min}(t)$.

FIGURE 4. Borrower default can be described as a Brownian motion level passage. The black solid curve represents the normalized log-exchange rate of the collateral. The magenta dashed line indicates the minimum allowed rate before default for a user starting from $x_{\min}$. Similarly, the blue dash-dotted line represents the minimum allowed rate before default for a user starting from $y_{\min}$.

FIGURE 5. Maker's annual interest rate for ETH-A.

FIGURE 7. The number of ETH-A debts.

FIGURE 8. The total collateral amount and borrowers' balance.

TABLE 2. Comparison of PD models. The lower the value in each column, the better the fit to the data. The best result in each column is highlighted.