Fuzzy Machine Learning: A Comprehensive Framework and Systematic Review

Machine learning draws its power from various disciplines, including computer science, cognitive science, and statistics. Although machine learning has achieved great advancements in both theory and practice, its methods have some limitations when dealing with complex situations and highly uncertain environments. Insufficient data, imprecise observations, and ambiguous information/relationships can all confound traditional machine learning systems. To address these problems, researchers have integrated fuzzy techniques into machine learning from different aspects, including fuzzy sets, fuzzy systems, fuzzy logic, fuzzy measures, fuzzy relations, and so on. This article presents a systematic review of fuzzy machine learning, from theory and approach to application, with the overall objective of providing an overview of recent achievements in the field. To this end, the concepts and frameworks discussed are divided into five categories: 1) fuzzy classical machine learning; 2) fuzzy transfer learning; 3) fuzzy data stream learning; 4) fuzzy reinforcement learning; and 5) fuzzy recommender systems. The literature presented should provide researchers with a solid understanding of the current progress in fuzzy machine learning research and its applications.

Index Terms: Data stream learning, fuzzy logic, fuzzy sets and systems, machine learning, recommender systems, transfer learning.

The authors are with the Australian Artificial Intelligence Institute, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW 2007, Australia (e-mail: jie.lu@uts.edu.au; guangzhi.ma@student.uts.edu.au; guangquan.zhang@uts.edu.au).
Digital Object Identifier 10.1109/TFUZZ.2024.3387429

I. INTRODUCTION
In the dynamic realm of technology, machine learning has profoundly transformed various sectors. It leads innovation by decoding complex data patterns, driving advancements in artificial intelligence, and influencing how we engage with information and understand the capabilities of computational systems. However, with most existing machine learning methods, accuracy suffers in scenarios characterized by uncertainty, such as when the only available observations are imprecise or when the data are noisy or incomplete. In addition, many real-world datasets contain uncertain relationships, and conventional machine learning methods generally find it difficult to identify or work with these structures. To address these issues, researchers have integrated fuzzy techniques into machine learning, an approach called fuzzy machine learning (FML) [1], since fuzzy techniques are effective at dealing with uncertainty. FML systems fuse machine learning algorithms with fuzzy techniques, such as fuzzy sets [2], fuzzy systems [3], fuzzy clustering [4], fuzzy relations [5], fuzzy measures [6], fuzzy matching [7], fuzzy optimization [8], and so on, to build new models that are more robust to the many and varied types of uncertainty found in real-world problems.
FML offers substantial advantages in complex and dynamic (uncertain) environments. Unlike traditional machine learning approaches, fuzzy techniques, which are generally based on the concept of fuzzy sets [9] and fuzzy theory [10], excel at capturing the nuanced shades of uncertainty inherent in dynamic scenarios. Their inherent ability to model uncertainty allows them to adapt to the ever-changing patterns that characterize dynamic environments. In situations where traditional models might falter or struggle to keep pace, fuzzy techniques provide a more accurate representation of the inherent fuzziness present in real-world data [11]. Furthermore, FML models are interpretable: they not only navigate complexity but also offer clear insights into decision-making processes. This interpretability is a critical asset in dynamic environments, where understanding the rationale behind model decisions is paramount. Next, we summarize some of the main ways in which fuzzy techniques can improve machine learning algorithms.
1) Fuzzy sets [2] can be used to represent vague or ambiguous concepts and data, such as that commonly found with linguistic variables, noisy or incomplete data, and interval-valued data. The fuzzy sets enhance the algorithm's ability to make decisions in uncertain and complex situations, which can be particularly useful in applications where real-world conditions can be unpredictable, such as robotics or autonomous vehicles.
2) Fuzzy-rule-based systems [3] can provide a transparent and interpretable prediction framework. Fuzzy-rule-based systems use linguistic rules to represent knowledge and, so, can be used to generate explanations for the decisions made by the system. This can be useful in applications like medical diagnosis.
3) Fuzzy clustering [4], which is a well-known approach to clustering, can improve machine learning algorithms by identifying patterns in data that traditional clustering methods may not easily identify. Fuzzy clustering not only allows for overlapping clusters but can also handle data points that may not belong to any particular cluster with certainty. This can be useful in applications like image recognition.
4) Fuzzy relations [5] can provide a more flexible and nuanced representation of the relationships between variables or data points. They can also capture nonlinear relationships to enable more accurate and expressive machine learning models. In addition, fuzzy relations are useful when handling multimodal data or data assembled from multiple sources because researchers can define fuzzy relations between the different modalities to result in a more comprehensive and accurate model.
In the past decade, there have been over 500 000 articles in high-quality journals and conference proceedings containing the words "fuzzy" and "machine learning." However, none of these articles provides a comprehensive review of the recent literature on FML. Several previous surveys in the area only offered valuable insights into certain subfields of FML. For example, Baraldi and Blonda [12] provided a brief review of fuzzy clustering algorithms for pattern recognition, while Škrjanc et al. [13] summarized models based on evolving fuzzy rules and neuro-fuzzy networks (NFNs) for clustering, regression, identification, and classification problems. In addition, Zheng et al. [14] reviewed recent work on fusing deep learning models with fuzzy systems. Moreover, the last decade has witnessed the emergence of new subfields in FML, such as fuzzy transfer learning and fuzzy data stream learning. Providing an investigative report to outline these new subfields is significant. For these reasons, a new, more comprehensive, and more up-to-date survey of FML is warranted. This article primarily targets researchers interested in employing fuzzy techniques to enhance the performance of machine learning methods, particularly in situations involving complex or uncertain factors.
The studies included in this survey were selected in the following three steps.
Step 1) Identify and determine an appropriate set of publication databases to search. We searched the well-known databases of Science Direct, ACM Digital Library, IEEE Xplore, and SpringerLink. These provided a comprehensive bibliography of research papers on machine learning and FML.
Step 2) Preliminary screening of articles: The first search was based on keywords. The articles were then selected for inclusion in the review if they: a) presented a new theory, algorithm, or methodology in the area of FML; or b) reported an application built around an FML algorithm.
Step 3) Filtering the results for presentation: The articles selected in Step 2 were then divided into five groups to be summarized in separate sections: a) fuzzy classical machine learning; b) fuzzy transfer learning; c) fuzzy data stream learning; d) fuzzy reinforcement learning (RL); and e) fuzzy recommender systems. At this point, we undertook one final screening of the articles (see Fig. 1). A study was retained if it demonstrated sufficient: a) novelty, i.e., it had been published within the last decade; and b) impact, i.e., it had been published in a high-quality journal/conference or was highly cited.
The main contributions of this article are as follows.
1) It comprehensively summarizes the developments and achievements in the field of FML. Work in this field is divided into five main categories for discussion.
2) The shortcomings of traditional machine learning methods in real-world scenarios are analyzed for each category, followed by an explanation of how FML has been used to address these issues. The insights provided are designed to help researchers understand the context of developments in FML research and its applications.
3) It provides a critical discussion of the state-of-the-art (SOTA) FML models and outlines directions for future research.
The rest of this article is organized as follows. Section II provides some relevant mathematical concepts to illustrate how fuzzy logic can be integrated into machine learning. Sections III-VII discuss the five categories of FML, respectively. Finally, Section IX summarizes the material covered and goals of this review and outlines future work.

II. BASIC CONCEPTS OF FML
In this section, we briefly introduce some relevant mathematical concepts to illustrate how fuzzy logic can be integrated into transfer learning, data stream learning, RL, and recommender systems. These concepts should help researchers to better understand the articles introduced in the following sections.

A. Fuzzy Transfer Learning
Transfer learning [15] aims to train a well-performing model in one domain (the target) by leveraging knowledge from another domain (the source) whose data distribution or learning task differs from that of the target. This section introduces two representative fuzzy transfer learning frameworks: 1) fuzzy-rule-based [16] and 2) fuzzy-equivalence-based [17].
1) Fuzzy-Rule-Based Transfer Learning Framework [16]: Let $S = \{S_1, S_2, \dots, S_N\}$ denote $N$ source domains, where $S_n = \{(x_i^n, y_i^n)\}_{i=1}^{m_n}$ and $(x_i^n, y_i^n) \in X_n \times Y$ is the ith input-output data pair in the nth source domain. Here, $X_n \subset \mathbb{R}^p$ denotes the feature space of each source domain and $Y$ is a response space ($Y = \{1, 2, \dots, K\}$ given a classification task, and $Y \subset \mathbb{R}$ given a regression task).
$T = \{x_i^T\}_{i=1}^{m_T}$ is the unlabeled target domain (for the unsupervised scenario), where $X_T \subset \mathbb{R}^p$ is the feature space of the target domain. In homogeneous cases, $X_1, \dots, X_N$ and $X_T$ have the same number of features, while they contain different numbers of features in heterogeneous cases.
We denote $R = \{R_1, R_2, \dots, R_N\}$ as the constructed fuzzy rules space of $S$, where $R_n$ collects the fuzzy rules constructed on the nth source domain $S_n$. Let $R_T$ denote the obtained fuzzy rules of target domain $T$. Finally, $\Phi = \{\Phi_1, \Phi_2, \dots, \Phi_N\}$ is denoted as the conclusion of $R$ (e.g., linear combination), where $\Phi_n$ is the conclusion of $R_n$. Hence, fuzzy-rule-based transfer learning aims to use the knowledge from $D = \{S, R, \Phi\}$ to fit the data in the target domain, i.e., obtain $R_T$ and the conclusion of $R_T$.
2) Fuzzy-Equivalence-Based Transfer Learning Framework [17]: Different from fuzzy-rule-based transfer learning, this framework applies the fuzzy equivalence relations among features in source and target domains to replace the fuzzy rules. Let $U = \{U_1, U_2, \dots, U_N\}$ denote the membership function space of the features in $S$, and let $R_{M_S} = \{R_{M_{S_1}}, \dots, R_{M_{S_N}}\}$, where $R_{M_{S_n}}$ is an $m_n \times m_n$ matrix (see [17] and [18] for details) obtained from $R_{S_n}$, a fuzzy equivalence relation operator on $S_n$. Hence, the fuzzy-equivalence-based transfer learning framework aims to use the knowledge from $D = \{S, U, R_{M_S}\}$ to fit the data in the target domain.

B. Fuzzy Data Stream Learning
Data stream learning [19], [20], also known as stream mining, refers to a set of techniques and algorithms designed to handle and analyze data that arrive continuously over time in a streaming fashion. However, in real-world scenarios, the statistical properties of the data may change over time, making models and algorithms that were previously accurate less effective. This phenomenon is known as concept drift [21], [22], [23]. A formal definition of concept drift follows.
Definition 1 (Concept drift [23]): Consider a time period $[0, t]$ and a set of samples, denoted as $S_{0,t} = \{d_0, \dots, d_t\}$, where $d_i = (X_i, y_i)$ is one observation (or one data instance), $X_i$ is the feature vector, $y_i$ is the label, and $S_{0,t}$ follows a certain distribution $F_{0,t}(X, y)$. Concept drift occurs at time stamp $t + 1$ if $F_{0,t}(X, y) \neq F_{t+1,\infty}(X, y)$, denoted as $\exists t : P_t(X, y) \neq P_{t+1}(X, y)$.
Hence, when a concept drift occurs at $t + 1$, we aim to adapt the predictor $H_t = \arg\min_{h \in \mathcal{H}} \ell(h, X, y \mid (X, y) \in P_t(X, y))$, where $\ell$ denotes the loss, to fit the new distribution $P_{t+1}(X, y)$. Next, we briefly introduce a fuzzy-clustering-based drift learning structure [24] to show how fuzzy logic can be integrated into data stream learning.
In fuzzy-clustering-based drift learning [24], fuzzy clustering is applied to learn how many patterns exist in the observed data instances and the membership degree of each instance belonging to each pattern during the process of learning the parameters for the predictor. Let $\mu_{tk}$ be the membership of the tth instance belonging to the kth cluster, $C_k$ be the kth cluster centroid, $X_t$ be the input variable at time step $t$, and $\theta_k$ be the parameter for the kth predictor. The objective of fuzzy-clustering-based drift learning is then formulated over $\mu_{tk}$, $C_k$, and $\theta_k$ (see [24] for its explicit form), with $\lambda_1$ and $\lambda_2$ as two preassigned parameters of that objective. Fuzzy clustering [25], [26] is utilized to optimize $\mu_{tk}$ and $C_k$.
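To make the roles of the memberships and centroids concrete, the following is a minimal sketch of the standard fuzzy c-means alternating updates in plain Python. It is not the drift learning objective of [24]: the predictor parameters and the two preassigned weights are left out, and the function name `fcm`, the fuzzifier `m`, and the explicit initial centroids are assumptions made for illustration.

```python
def fcm(points, centroids, m=2.0, n_iter=100, tol=1e-9):
    """Standard fuzzy c-means; returns (memberships, centroids).

    points:    list of equal-length tuples (the inputs X_t)
    centroids: initial guesses for the cluster centroids C_k
    m:         fuzzifier (> 1); larger values give softer memberships
    """
    centroids = [list(c) for c in centroids]
    dim = len(points[0])
    power = 1.0 / (m - 1.0)
    memberships = []
    for _ in range(n_iter):
        # Membership step: mu_tk is proportional to d(X_t, C_k)^(-2/(m-1)).
        memberships = []
        for x in points:
            d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centroids]
            if min(d2) == 0.0:
                # X_t sits exactly on a centroid: split membership among those.
                row = [1.0 if d == 0.0 else 0.0 for d in d2]
            else:
                row = [d ** -power for d in d2]
            total = sum(row)
            memberships.append([r / total for r in row])
        # Centroid step: C_k is the mean of the X_t weighted by mu_tk^m.
        updated = []
        for k in range(len(centroids)):
            w = [row[k] ** m for row in memberships]
            total = sum(w)
            if total == 0.0:
                updated.append(centroids[k])  # empty cluster: keep old centroid
                continue
            updated.append([
                sum(wi * x[j] for wi, x in zip(w, points)) / total
                for j in range(dim)
            ])
        shift = max(abs(a - b) for old, new in zip(centroids, updated)
                    for a, b in zip(old, new))
        centroids = updated
        if shift < tol:
            break
    return memberships, centroids


# Two well-separated groups of instances; each row of mu sums to one.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
mu, C = fcm(points, centroids=[(0.0, 0.0), (5.0, 5.0)])
```

Each instance receives a degree of membership in every cluster rather than a single label, which is what lets a drift learner weight several predictors for the same instance.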

C. Fuzzy Reinforcement Learning
RL [27] is the study of planning and learning in a scenario where a learner (called an agent) proactively interacts with the environment to achieve a certain goal. The agent's aim is to develop the optimal strategy for accumulating rewards. It does this by learning from the feedback it receives. RL has been successfully applied to a variety of real-world problems, such as robotics control [28], game playing [29], and autonomous driving [30]. In this section, we describe how fuzzy logic can be integrated into RL.
First, fuzzy sets can be used to represent uncertainty in state, action, or reward spaces in RL. For instance, fuzzy reward signals [31] represent the uncertainty or imprecision in the reward received by an agent. In addition, fuzzy controllers [32] that use fuzzy logic to map inputs to control actions can be integrated into RL systems to handle uncertain or qualitative control decisions. Next, we give a general mathematical expression for a fuzzy controller. Let $X_1, \dots, X_n$ be the input variables to the fuzzy controller and $Y$ be the output variable representing the control action. The fuzzy sets associated with each variable are denoted as $A_1, \dots, A_n$ for inputs and $B$ for the output. Let $\mu_{A_i}(x_i)$ represent the membership function for the fuzzy set $A_i$ of input $X_i$, and let $\mu_B(y)$ represent the membership function for the fuzzy set $B$ for output $Y$. Then, generic fuzzy rules that define the mapping from inputs to outputs can be expressed as
$$\text{IF } X_1 \text{ is } A_1 \text{ AND } \cdots \text{ AND } X_n \text{ is } A_n, \text{ THEN } Y \text{ is } B.$$
After applying the fuzzy rules, defuzzification is performed to obtain a crisp output value.
Furthermore, a fuzzy inference system can be used to make decisions in RL [33], such as determining the next action to take based on fuzzy input signals representing uncertain states or rewards. For example, fuzzy Q-learning [34] extends Q-learning by incorporating fuzzy logic to handle uncertain and imprecise state-action pairs. Fuzzy rules and membership functions are applied to update the Q-values.
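The fuzzy controller described above can be sketched in a few lines of Python. This is one common instantiation under stated assumptions, not the specific design of [32]: membership functions are triangular, AND is modeled by the minimum, rule outputs are clipped and aggregated with the maximum (Mamdani style), and defuzzification takes the centroid over a sampled output range. The heater rule base and all names are hypothetical.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a and c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzy_control(inputs, rules, y_min, y_max, n_grid=201):
    """Mamdani-style inference with centroid defuzzification.

    rules: list of (antecedents, consequent); antecedents holds one (a, b, c)
           triangle per input variable and consequent is the triangle for B.
    """
    # Firing strength of each rule: AND is modeled by the minimum.
    strengths = [
        min(tri(x, *fuzzy_set) for x, fuzzy_set in zip(inputs, antecedents))
        for antecedents, _ in rules
    ]
    num = den = 0.0
    for j in range(n_grid):
        y = y_min + (y_max - y_min) * j / (n_grid - 1)
        # Clip each consequent at its rule strength, then aggregate with max.
        mu = max(min(s, tri(y, *consequent))
                 for s, (_, consequent) in zip(strengths, rules))
        num += y * mu
        den += mu
    # Centroid of the aggregated output set; None if no rule fired.
    return num / den if den > 0.0 else None


# Hypothetical heater example: one input (temperature error), one output (power %).
COLD, OK, HOT = (-20.0, -10.0, 0.0), (-10.0, 0.0, 10.0), (0.0, 10.0, 20.0)
LOW, MEDIUM, HIGH = (-50.0, 0.0, 50.0), (0.0, 50.0, 100.0), (50.0, 100.0, 150.0)
RULES = [([COLD], HIGH), ([OK], MEDIUM), ([HOT], LOW)]

power = fuzzy_control([-4.0], RULES, 0.0, 100.0)  # partly COLD, partly OK
```

In an RL setting, the crisp value returned here would play the role of the control action, while the rule strengths could equally be used to weight per-rule action values, as fuzzy Q-learning does.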

D. Fuzzy Recommender System
A recommender system [35], [36] is a type of information filtering system that analyzes user preferences or behavior to provide suggestions personalized to that particular user. In this section, we describe how fuzzy logic can be integrated into recommender systems.
Let $V = \{v_1, v_2, \dots, v_M\}$ denote the item set and $U = \{u_1, u_2, \dots, u_N\}$ denote the user set. In data preprocessing, a fuzzy set or linguistic variable can be used to represent item/user terms and the user-item rating matrix $R$ (each entry of $R$ is a rating of a user for an item). Fuzzy sets can help deal with some types of uncertainty in the description of item features. For example, Yager [37] uses a set of primitive assertions to describe items, denoted as $A = \{A_1, \dots, A_n\}$. An item $v$ can be viewed as a fuzzy subset over the space $A$. If item $v$ satisfies assertion $A_i$, the assertion has validity equal to one, otherwise zero. The membership degree of $v$ on each assertion $A_i$ then describes the item, so an item in a recommender system can be represented as a fuzzy set over an assertion set. In addition, linguistic variables are widely used to generate the user-item linguistic-term-based rating matrix $R$.
In the fuzzy user preference/profile generation process, a fuzzy-rule-based system, such as the Takagi-Sugeno-Kang fuzzy system (TSK-FS), is usually applied to model the uncertainty and imprecision inherent in users' preferences. Finally, in order to obtain the final predicted ratings $r_{u_i,v_j}$ for unrated items, fuzzy similarity is widely used for calculating the similarity between items and users. For instance, $S(v_i, v_j)$ [38] is a fuzzy similarity that measures the similarity between items $v_i$ and $v_j$. It is defined over $U_{ij}$, the set of users that rated both items $v_i$ and $v_j$, where $[r_{u,v_i}]_\alpha$ represents the $\alpha$-cut of the linguistic-variable rating $r_{u,v_i}$, and $f$, $g$, and $h$ are predefined functions.
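As a small illustration of the $\alpha$-cut used above, the sketch below computes the $\alpha$-cut of a linguistic rating modeled as a triangular fuzzy number $(a, b, c)$, for which the cut is the interval $[a + \alpha(b - a),\, c - \alpha(c - b)]$. The rating scale and the names `alpha_cut` and `TERMS` are hypothetical and are not taken from [38]; the similarity $S(v_i, v_j)$ itself is not reproduced here.

```python
def alpha_cut(tfn, alpha):
    """Alpha-cut (lower, upper) of a triangular fuzzy number (a, b, c)."""
    a, b, c = tfn
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    return (a + alpha * (b - a), c - alpha * (c - b))


# Hypothetical linguistic rating scale on a 1-5 axis (an assumption, not from [38]).
TERMS = {
    "poor": (1.0, 1.0, 2.0),
    "fair": (1.0, 2.0, 3.0),
    "good": (2.0, 3.0, 4.0),
    "very good": (3.0, 4.0, 5.0),
    "excellent": (4.0, 5.0, 5.0),
}

# Higher alpha keeps only the ratings the term supports more strongly,
# so the interval narrows toward the peak of the triangle.
for alpha in (0.25, 0.5, 1.0):
    print(alpha, alpha_cut(TERMS["good"], alpha))
```

Working with the resulting intervals at several levels of $\alpha$ is one way a similarity measure can compare two linguistic ratings without first collapsing them to single numbers.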

III. FUZZY CLASSICAL MACHINE LEARNING
Classical machine learning algorithms, such as decision trees, support vector machines (SVMs), and neural networks, have been responsible for remarkable achievements both theoretically and from a practical point of view. Numerous articles involve combining fuzzy techniques with classical machine learning algorithms to overcome different types of problems with uncertainty, such as incomplete information and imprecise observations. In this section, we summarize these works, dividing the techniques into two categories: 1) non-deep-learning-based methods and 2) deep-learning-based methods.

A. Non-Deep-Learning-Based Method
The non-deep-learning-based methods can be further divided into three main types: clustering, regression, and classification. Each is discussed in turn next.
1) Clustering: Fuzzy clustering has been widely researched over the last 40 years, and several survey papers have already been published summarizing prior work in this field [39], [40]. First, we summarize the main advantages of applying fuzzy techniques in clustering as follows.
a) Soft assignment of data points: Traditional clustering algorithms assign each data point to a single cluster, resulting in a hard assignment. However, in many cases, some of the data points may have ambiguous relationships with the clusters or their memberships may overlap into multiple clusters. Fuzzy clustering allows for soft assignment, where a data point's membership in a cluster is not simply binary, but rather it is measured in degrees and can apply to multiple clusters.
b) Flexibility in cluster shape: Unlike traditional hard clustering algorithms, such as K-means, which assume spherical clusters of equal size, fuzzy clustering allows for more flexible and irregular cluster shapes. Fuzzy logic allows researchers and analysts to model overlapping clusters, clusters of varying sizes and densities, and clusters with complex boundaries. Thus, fuzzy clustering is highly suitable for datasets with complex structures.
c) Handling outliers and noise: Applying fuzzy logic makes clustering more robust to outliers and noisy data than traditional clustering methods. With fuzzy logic, a data point can have a low membership degree to a cluster, which effectively reduces the influence of outliers or noisy data points on the overall clustering results.
d) Interpretability and granularity: The fuzzy membership degrees assigned to data points offer a quantitative measure of their association with each cluster. This allows for a more nuanced understanding of the data and provides insights into the degree of similarity or dissimilarity between data points and clusters. Fuzzy logic also allows for the representation of gradual transitions, providing a more detailed and fine-grained view of the clustering.
One of the most powerful and well-known algorithms in fuzzy clustering analysis is fuzzy c-means (FCM), developed by Dunn in 1973 [25] and further developed by Bezdek et al. in 1984 [41]. In the intervening years, FCM has been widely used and revised many times to deal with different types of problems [42], [43], [44], [45], [46]. Among the most recent of these achievements, Ding and Fu [42] proposed a novel kernel-based FCM clustering algorithm that uses genetic algorithm optimization to improve clustering performance. To enhance the robustness of image segmentation, Gao et al. [46] presented a new robust FCM clustering method that combines an elastic FCM with a smoothing method. This elastic FCM provides a sparser description for reliable points and a fuzzier description of the marginal points of clusters. Lei et al. [43] designed a more efficient and more robust FCM algorithm for fast and reliable image segmentation. Their variant is based on morphological reconstruction and membership filtering. Subsequently, Lei et al. [47] built a fuzzy clustering framework around the above implementation for image segmentation.
In research departing from FCM, Jiao et al. [48] developed a fuzzy clustering algorithm that relies on unsupervised fuzzy decision trees to improve model interpretability. To cluster multiple nominal data streams, Sangma et al. [49] proposed a fuzzy hierarchical clustering method that involves the clustering-by-variable approach. The method calculates the fuzzy affinity of data streams to different clusters using normalized cosine similarity and handles concept evolution by updating the hierarchical clustering structure.
2) Regression: Fuzzy regression models [50] perform a type of regression analysis that incorporates both possibility theory and fuzzy set theory [51]. They are particularly useful when precise data are lacking or when the relationships between input and output variables are complex and difficult to model using classical regression methods. In addition, fuzzy regression models are good at expressing nonlinear relationships and dealing with noisy data. Fuzzy logic handles the nonlinear relationships between variables. Noisy or incomplete data are handled by allowing for partial memberships and fuzzy sets. Thus, by assigning lower membership values to outliers or inconsistent data points, fuzzy regression models provide a mechanism for mitigating the impact of this type of uncertainty. Another technique for improving the performance of regression models has been to integrate fuzzy techniques with classical machine learning techniques, such as SVM [52], [53], [54] and neural networks [55], [56], [57], [58]. Other solutions combine interval regression analysis [59] with machine learning methods [60], [61], [62], [63].
The latest developments in fuzzy regression analysis include He et al. [64], [65], who developed a fuzzy nonlinear regression model using a random weight network that takes triangular fuzzy numbers as its inputs and outputs the same. Baser and Demirhan [66] proposed a new method that combines fuzzy regression models with an SVM to estimate the yearly mean and daily values of horizontal global solar radiation. By applying fuzzy regression functions, their method is robust to outlier observations and problems with overfitting. Chachi [67] designed a robust fuzzy regression modeling technique based on weighted least squares fuzzy regression to handle crisp input-fuzzy output data. Choi et al. [68] addressed issues with multicollinearity in fuzzy regression models by incorporating ridge regression. Naderkhani et al. [69] proposed an adaptive neuro-fuzzy inference system for analyzing and predicting nonparametric fuzzy regression functions with crisp-valued inputs and symmetric trapezoidal fuzzy outputs. Xia et al. [70] developed a novel regression model built around a Takagi-Sugeno fuzzy regression tree to address complex industrial modeling problems, while Zhang et al. [71] introduced an interpretable model based on graph community neural networks and time-series fuzzy decision trees for predicting the delays experienced by a high-speed train.
3) Classification: Numerous studies have combined fuzzy techniques with classical machine learning algorithms to address classification problems. The techniques used include fuzzy decision trees [72], [73], [74], [75], [76], neuro-fuzzy classification [77], [78], [79], and support-vector-regression-based fuzzy classification [80], [81]. Rabcan et al. [82], for example, have recently introduced a new approach to signal classification that includes a fuzzification procedure in the transformation process and fuzzy decision trees to perform classifications. Xue et al. [83] proposed an adaptive softmin model based on an enhanced TSK-FS to classify high-dimensional datasets. An adaptive softmin function overcomes the drawbacks of "numeric underflows" and "fake minimums" that frequently arise in existing fuzzy systems. Moreover, the enhanced TSK-FS maintains an adequate number of rules without that number growing exponentially with the number of features. Ma et al. [11], [84] put forward a novel framework for addressing multiclass classification problems with imprecise observations that provides a theoretical analysis of the problem based on fuzzy Rademacher complexity. The imprecise observations can be either fuzzy-valued or interval-valued, and the framework, which combines classical machine learning techniques like neural networks and SVM with a fuzzy-membership-based defuzzification method, extracts crisp-valued information from these fuzzy- or interval-valued features.

B. Deep-Learning-Based Method
Fuzzy neural networks (FNNs), also known as NFNs, are a type of hybrid neural network that combines fuzzy techniques with neural networks to improve the efficiency and interpretability of machine learning models. A standard FNN structure is illustrated in Fig. 2. The fuzzy logic component of FNNs allows them to handle imprecise or incomplete data and make decisions based on uncertain inputs. In addition, using an FNN capitalizes on the many significant advances achieved through deep learning in fields such as computer vision, natural language processing, and robotics.

Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.

TABLE I
SUMMARY OF THE SOTA DEEP-LEARNING-BASED FNN ACHIEVEMENTS

Many researchers have fused deep learning methods with fuzzy techniques to address different types of problems with uncertainty. The most commonly used deep learning models include deep belief networks [85], convolutional neural networks (CNNs) [86], and recurrent neural networks (RNNs) [87]. In this section, we summarize the SOTA achievements in FNNs from 2020 to 2023. Earlier research successes can be found in prior surveys like [14], [88], and [89].
Chen et al. [90] devised a fuzzy deep neural network (DNN) with a sparse autoencoder to predict human intentions. The model is based on human emotions and identification information. Lu et al. [91] constructed a novel hashing method that integrates DNNs and fuzzy logic to measure the similarity between pairwise images.
Zadeh [51] introduced the concept of a type-2 fuzzy set as far back as 1975. These sets, whose membership grades are themselves type-1 fuzzy sets, can be used when there is uncertainty about the membership function itself; for example, if one does not know the shape of the function or some of its parameters. The superior performance of type-2 fuzzy sets has seen them used in a range of machine learning tasks. For instance, to perform complex stock time-series tasks, Cao et al. [92], [93] designed two multiobjective evolution models. Both combine interval type-2 fuzzy sets with rough FNNs.
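To illustrate what uncertainty about a membership function's parameters means in practice, the following sketch evaluates an interval type-2 fuzzy set built from a Gaussian membership function whose center is fixed but whose spread is only known to lie in an interval. This is a common textbook construction and an assumption made here for illustration; it is not the specific model of [92] or [93], and the function name is hypothetical. Each input receives a lower and an upper membership grade rather than a single value.

```python
import math


def it2_gaussian(x, center, sigma_lower, sigma_upper):
    """Lower and upper membership of x for a Gaussian interval type-2 set.

    Assumes a fixed center and an uncertain spread in [sigma_lower, sigma_upper].
    """
    if not 0.0 < sigma_lower <= sigma_upper:
        raise ValueError("need 0 < sigma_lower <= sigma_upper")
    # The narrowest Gaussian bounds the grade from below, the widest from above.
    lower = math.exp(-0.5 * ((x - center) / sigma_lower) ** 2)
    upper = math.exp(-0.5 * ((x - center) / sigma_upper) ** 2)
    return lower, upper


# The gap between the two grades widens away from the center and closes at it.
for x in (0.0, 0.5, 1.0, 2.0):
    print(x, it2_gaussian(x, center=0.0, sigma_lower=0.5, sigma_upper=1.0))
```

The band between the lower and upper grades is often called the footprint of uncertainty; when the two spreads coincide, the set reduces to an ordinary type-1 fuzzy set.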
Several fuzzy-based ensemble models have also been developed to address problems like load forecasting [94], image classification [95], [96], [97], [98], and image fusion [99], [100]. For example, Khatter and Ahlawat [101] combined an RNN with fuzzy techniques and a web blog searching method to enhance classification performance, while Concepción et al. [102] presented a theoretical analysis of why fuzzy-rough cognitive networks delivered better performance than the SOTA classifiers. Long short-term memory models have also been combined with fuzzy techniques. Some example works include [103], [104], [105], and [106].
In summary, combining fuzzy techniques with classical machine learning algorithms is not only useful for solving uncertainty problems, like imprecise or noisy data, but can also improve the interpretability and robustness of the algorithms. Fuzzy sets are good at handling ambiguity and uncertainty and typically provide a more realistic representation of the inherent fuzziness and uncertainty present. In addition, fuzzy logic tends to improve interpretability. These logics often rely on rule-based systems, where the rules express relationships between the input variables and the output decisions. The rules can either be derived from expert knowledge or be learned from the data.

IV. FUZZY TRANSFER LEARNING
Notably, most current transfer learning [158] methods have limitations when handling real-world situations with uncertainty, such as when only a few labeled instances are available.
To overcome these problems, many researchers have turned to fuzzy sets and fuzzy logic.
Existing studies on transfer learning can be divided into categories based on the type of knowledge that is being transferred. These knowledge categories include instances [159], feature representations [160], model parameters [161], and relational knowledge [162]. Alternatively, in terms of the problem settings tackled, studies can be grouped into four categories: multitask learning [163], domain adaptation [164], [165], cross-domain adaptation [166], and heterogeneous learning [167]. We have divided our summary of recent works (2015-2023) into three areas based on the fuzzy technique used. These are fuzzy sets, fuzzy systems, and fuzzy relations. Table II summarizes recent achievements in the field of fuzzy transfer learning.

A. Transfer Learning Based on Fuzzy Sets
Behbood et al. [168] proposed an innovative fuzzy-based transfer learning framework to predict long-term bank failures. The framework relies on fuzzy sets, as well as similarity and dissimilarity, to modify the labels of target instances predicted by an FNN classifier. Wu et al. [169] developed OwARR, a new algorithm that combines fuzzy sets with domain adaptation. The aim is to reduce the amount of object-specific calibration data so as to solve the important regression problem of estimating online drowsiness in drivers from EEG signals in brain-computer interfaces. Gargees et al. [170] proposed a transfer learning method for the possibilistic c-means clustering problem with insufficient data, overcoming a crucial problem for clustering tasks where the source and target domains have a different number of clusters. Based on the idea of fuzzy sets, the proposed algorithm employs historical cluster centers of the data in the source domain as a reference to guide the clustering of data in the target domain.
In terms of applying type-2 fuzzy sets to transfer learning models, Sun et al. [171] proposed a new transfer learning model to address the uncertainty caused by conflicting implications in text sequence recognition. The proposed model uses FCM to transform the correspondences among words into information granules. By integrating type-2 fuzzy sets into a hidden Markov model, this granular information can be used for sequence recognition. To reliably estimate gross domestic product (GDP) from only CO2 emission data, Shukla et al. [172] proposed a new approach to a kernel extreme learning machine (KELM) that combines transfer learning with interval type-2 fuzzy sets. Interval type-2 fuzzy sets are used to improve the efficiency of the knowledge transfer. To consider the uncertainty in input datasets, Kumar et al. [173] presented a novel transfer learning approach that incorporates type-1 and interval type-2 fuzzy sets into a KELM framework. The aim is to predict GDP based on uncertain carbon emissions data.
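To make the notion of an interval type-2 fuzzy set concrete, the following minimal sketch computes the lower and upper membership grades of a Gaussian primary membership function whose standard deviation is only known to lie in an interval. It is a textbook construction offered for illustration, not the formulation used in [171], [172], or [173]; the function name and parameters are assumptions.

```python
import math


def it2_gaussian_membership(x, mean, sigma_lower, sigma_upper):
    """Return (lower, upper) membership of x in an interval type-2 fuzzy set.

    Assumed form: a Gaussian primary membership function with a fixed mean
    and an uncertain standard deviation in [sigma_lower, sigma_upper].
    """
    if not 0 < sigma_lower <= sigma_upper:
        raise ValueError("require 0 < sigma_lower <= sigma_upper")
    z2 = (x - mean) ** 2
    # The narrowest Gaussian bounds the footprint of uncertainty from below,
    # the widest one bounds it from above.
    lower = math.exp(-0.5 * z2 / sigma_lower ** 2)
    upper = math.exp(-0.5 * z2 / sigma_upper ** 2)
    return lower, upper


if __name__ == "__main__":
    print(it2_gaussian_membership(1.0, mean=0.0, sigma_lower=0.5, sigma_upper=1.0))
```

The gap between the two grades is what lets a type-2 model express uncertainty about the membership function itself; a type-1 set is the special case in which the two standard deviations coincide.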
In general, fuzzy sets have been widely applied to address uncertainty in data in transfer learning scenarios, and, experimentally, they have been shown to improve both the efficiency and accuracy of knowledge transfer in comparison to nonfuzzy methods.

B. Transfer Learning Based on Fuzzy Systems
Most of the existing transfer learning methods have a number of drawbacks. For instance, the performance of model-based transfer learning algorithms is heavily dependent on the selected classifier. In addition, feature-based transfer learning methods can negatively impact the discriminant information and geometric properties of instances from both the source and target domains. Furthermore, the lack of interpretability and an inability to handle uncertainty are two significant flaws. To address these issues, researchers have turned to fuzzy-rule-based systems to improve interpretability and handle uncertainty. Notably, the TSK-FS [3] has received significant attention in this regard.
Shell and Coupland [174] proposed FuzzyTL, a novel structure that combines transfer learning with a fuzzy-rule-based system. This structure is designed to bridge the knowledge gap between contexts that lack prior direct contextual knowledge. Meher and Kothari [175] developed an interpretable domain adaptation method, named the rule-based fuzzy extreme learning machine (ELM) classification model, that uses a fuzzy inference system to design an ELM architecture for remote sensing image classification. The model uses the maximum fuzzy membership grade of features, which is characterized by class-belonging fuzzification, to construct the fuzzy rules and two rule extraction matrices. Moreover, Deng et al. [176], [177] proposed two novel transfer learning approaches for regression tasks using the Mamdani-Larsen fuzzy system and the TSK-FS coupled with a new fuzzy logic algorithm and its objective functions. However, they noticed that the antecedent parameters of the TSK-FS model constructed in the target domain were directly inherited from the source domain, which meant that they could not leverage enough knowledge from the source domain. To address this problem, Deng et al. [178] proposed a new transfer learning method that contains two knowledge-leveraging strategies to better learn the antecedent and consequent parameters in the TSK-FS model. First, they applied an FCM-based clustering transfer technique to the antecedent parameters, which means that the antecedent parameters can be learned from both the source and target domains. Second, they introduced an enhanced knowledge-leverage mechanism to learn the consequent parameters. Another knowledge-leverage term is then introduced to make more effective use of the knowledge in the source domain. Furthermore, they applied and modified these methods so that they could be used for analysis in scenarios with insufficient data, such as recognizing EEG signals [179], [180], [181], [182], [183], [184] or with situations involving multiple-source domains [183].
The aim of transfer representation learning is to learn a shared space that matches the distributions of instances from both domains. However, transfer representation learning based on kernels suffers from some shortcomings, such as a lack of interpretability and difficulties with selecting a kernel function. To overcome these issues, Xu et al. [185] proposed a new transfer representation learning method that uses the TSK-FS instead of kernel functions to realize nonlinear transformations. In this approach, instances from both domains are transformed into a fuzzy feature space to minimize the differences between the distributions. Meanwhile, any discriminant information or geometric properties are preserved using linear discriminant analysis and principal component analysis.
Notably, Zuo et al. [186] devised a new way of constructing a TSK-FS model for regression tasks. This model uses data from the source domain to construct fuzzy rules and then modifies these rules using a nonlinear continuous function based on sigmoid functions to estimate values in the target domain. To address any significant difference in the label distribution between the source and target domains, Zuo et al. [187] developed several fuzzy-system-based domain adaptation models for classification tasks. In [188], they applied granular computing techniques to transfer learning and proposed a comprehensive domain adaptation framework based on a Takagi-Sugeno fuzzy model to handle three different regression scenarios: one where the source and target domains have different conditions, one where they have different conclusions, and one where both differ. Moreover, they identified two issues in fuzzy transfer learning that had not yet been resolved: how to choose an appropriate source domain and how to efficiently select labeled data for the target domain when the target data structure is unbalanced. The solutions, which involve an innovative method again based on a Takagi-Sugeno fuzzy model [189], combine an infinite Gaussian mixture model with active learning to improve the performance and generalizability of the initial model. Li et al. [190] designed a new transfer learning model for multisource domain adaptation that relies on a fuzzy-rule-based DNN. To address the more challenging problem in multisource domain adaptation where no source data are available, Li et al. [191] proposed a new model based on a DNN with fuzzy rules.
Importantly, all the domain adaptation studies mentioned so far only work when both domains have identical feature spaces and the same number of fuzzy rules, i.e., they are all methods of homogeneous domain adaptation. Zuo et al. [192], however, devised a novel approach to heterogeneous scenarios based on a Takagi-Sugeno fuzzy model. In this framework, fuzzy rules are constructed in the source domain and then transferred to the target domain using canonical correlation analysis so as to minimize the discrepancy between the feature spaces of the two domains. This was the first article to solve heterogeneous domain adaptation problems using a fuzzy-rule-based system. Subsequently, Lu et al. [16] addressed the more challenging scenario of when the only available instances to build the model span multiple source domains. They proposed two novel transfer learning methods for regression tasks based on a Takagi-Sugeno fuzzy model: one for when the feature spaces are homogeneous and one for when the spaces are heterogeneous. In the former, knowledge from multiple source domains is merged in the form of fuzzy rules, while, in the latter, knowledge is merged in the form of both data and fuzzy rules. Che et al.'s [193] fuzzy transfer learning method addresses multioutput regression problems in both homogeneous and heterogeneous scenarios. Their approach applies fuzzy rules to accurately capture the commonalities and characteristics of multiple numerical output variables.
In summary, most of the above methods share a common model construction framework: they begin by constructing a fuzzy-rule-based model on the source data (e.g., a TSK-FS) and subsequently modify the existing model (fuzzy rules) to establish a new fuzzy model for the target domain. Fuzzy-rule-based systems provide a linguistic representation of knowledge, enabling generalization and adaptation, while also making the model more robust to domain shift. Their power to transfer relevant knowledge also helps to improve a model's interpretability. All these characteristics make fuzzy-rule-based systems well suited to transfer learning tasks, particularly the more challenging ones, such as heterogeneous domain adaptation and source-free domain adaptation.
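The common framework summarized above (build fuzzy rules on source data, then modify them for the target domain) can be sketched as follows. This is a simplified illustration under stated assumptions rather than any of the cited algorithms: a first-order TSK model with Gaussian antecedents whose centers and widths are inherited unchanged from the source domain, and target consequents obtained by least squares with a penalty that pulls them toward the source consequents. All function names, the `pull` and `ridge` parameters, and the synthetic data are hypothetical.

```python
import numpy as np


def firing_strengths(X, centers, widths):
    """Normalized firing strength of each Gaussian rule for each row of X."""
    diff = X[:, None, :] - centers[None, :, :]            # (n, rules, d)
    log_w = -0.5 * np.sum((diff / widths[None, :, :]) ** 2, axis=2)
    w = np.exp(log_w - log_w.max(axis=1, keepdims=True))  # numerically stable
    return w / w.sum(axis=1, keepdims=True)


def design_matrix(X, centers, widths):
    """Firing-strength-weighted [1, x] blocks, one block per rule."""
    phi = firing_strengths(X, centers, widths)            # (n, rules)
    X1 = np.hstack([np.ones((X.shape[0], 1)), X])         # (n, d + 1)
    return (phi[:, :, None] * X1[:, None, :]).reshape(X.shape[0], -1)


def fit_consequents(X, y, centers, widths, ridge=1e-3, prior=None, pull=0.0):
    """Minimize ||G t - y||^2 + ridge ||t||^2 + pull ||t - prior||^2 over t."""
    G = design_matrix(X, centers, widths)
    p = G.shape[1]
    prior = np.zeros(p) if prior is None else np.asarray(prior, dtype=float)
    A = G.T @ G + (ridge + pull) * np.eye(p)
    b = G.T @ y + pull * prior
    return np.linalg.solve(A, b)


def predict(X, centers, widths, theta):
    return design_matrix(X, centers, widths) @ theta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    centers = np.array([[-0.5], [0.5]])   # antecedents shared by both domains
    widths = np.array([[0.5], [0.5]])
    # Plentiful source data, scarce and shifted target data (synthetic).
    Xs = rng.uniform(-1, 1, size=(200, 1)); ys = 2 * Xs[:, 0] + 1.0
    Xt = rng.uniform(-1, 1, size=(5, 1));   yt = 2 * Xt[:, 0] + 1.5
    theta_s = fit_consequents(Xs, ys, centers, widths)
    theta_t = fit_consequents(Xt, yt, centers, widths, prior=theta_s, pull=0.5)
    print(predict(np.array([[0.0]]), centers, widths, theta_t))
```

Setting `pull` to zero ignores the source model, while a very large value reproduces it; intermediate values trade off the few target labels against the source knowledge. Inheriting the antecedents unchanged is exactly the limitation that [178] addresses by also learning antecedent parameters from both domains.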

C. Transfer Learning Based on Fuzzy Relations
Most studies mentioned so far focus on supervised or semisupervised transfer learning in homogeneous scenarios, where both the source and target domains have labeled instances and only their data distributions are different. However, it is not uncommon in the real world for there to be no available labeled instances in the target domain. Furthermore, the feature spaces of the source and target domains will usually be different. This scenario, which is characterized by a high degree of uncertainty, is commonly referred to as heterogeneous unsupervised domain adaptation (HeUDA). Recently, researchers have developed n-dimensional fuzzy geometry theory [194] and fuzzy equivalence relations [195] to analyze and handle such problems with uncertainty.
Liu et al.'s [18] solution to HeUDA problems, called F-HeUDA, is to use fuzzy geometry to measure the similarity of features between the source and target domains. Shared fuzzy equivalence relations are then introduced, which means that both domains will share the same number of clustering categories. Hence, knowledge can be transferred from a heterogeneous source domain to a target domain with only unlabeled data. Using these techniques, F-HeUDA outperformed the SOTA models on four real datasets and performed especially well when the target domain had very few instances. Moreover, Liu et al. [17], [196] focused on a more realistic problem called the multisource HeUDA problem. Solving this problem involves transferring knowledge from several different source domains that have labeled data but heterogeneous dimensions and one target domain with unlabeled data. Their approach, called a shared fuzzy equivalence relations neural network, improves upon previous work in shared fuzzy equivalence relations to extract the shared fuzzy information contained in multiple heterogeneous domains.
In summary, because there is a high degree of uncertainty when transferring knowledge from a heterogeneous source domain to a target domain with only unlabeled data, nonfuzzy models will not usually perform well. Fuzzy relations offer a flexible, interpretable, and adaptable framework for representing and transferring knowledge between such domains. Hence, researchers tend to apply fuzzy relations to improve transfer efficiency in heterogeneous situations.
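For readers unfamiliar with fuzzy equivalence relations, the sketch below shows the standard construction on which such methods build: a reflexive, symmetric fuzzy similarity matrix is closed under max-min transitivity, and cutting the result at a level λ yields a crisp partition into clusters. It illustrates the general concept only and is not the shared fuzzy equivalence relation procedure of [18]; the function names and the example matrix are assumptions.

```python
import numpy as np


def max_min_compose(R, S):
    """Max-min composition: (R o S)[i, j] = max_k min(R[i, k], S[k, j])."""
    return np.max(np.minimum(R[:, :, None], S[None, :, :]), axis=1)


def transitive_closure(R):
    """Smallest max-min transitive relation containing R.

    R is assumed to be reflexive and symmetric, so the result is a
    fuzzy equivalence relation.
    """
    R = np.asarray(R, dtype=float)
    while True:
        R_next = np.maximum(R, max_min_compose(R, R))
        if np.array_equal(R_next, R):
            return R
        R = R_next


def lambda_cut_classes(E, lam):
    """Cluster labels induced by cutting the fuzzy equivalence E at level lam."""
    labels = -np.ones(E.shape[0], dtype=int)
    next_label = 0
    for i in range(E.shape[0]):
        if labels[i] < 0:
            labels[E[i] >= lam] = next_label
            next_label += 1
    return labels


if __name__ == "__main__":
    R = np.array([[1.0, 0.8, 0.2],
                  [0.8, 1.0, 0.6],
                  [0.2, 0.6, 1.0]])
    E = transitive_closure(R)
    print(E)
    print(lambda_cut_classes(E, 0.7), lambda_cut_classes(E, 0.5))
```

Lowering λ merges clusters, so a single level shared by two domains fixes the same number of clustering categories in both, which is the property the text above relies on.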

V. FUZZY DATA STREAM LEARNING
Learning from data streams [19], [20] involves developing algorithms and techniques to adaptively and incrementally process and learn from continuously arriving data. Unlike traditional machine learning scenarios where a static dataset is available for offline training, data stream learning deals with dynamic, evolving data streams that may not be stored entirely. Such streams often exhibit concept drift, which refers to changes in the statistical properties of the data. Detecting and adapting to concept drift are two important challenges in data stream learning. One approach is to continuously monitor the data and update models or retrain them periodically to account for changes. Another approach is to use online learning techniques that can adapt to changes in the data stream in real time. Because concept drift often comes with uncertainty problems (for example, making predictions from data streams with mixed drift problems and detecting drift in data streams with missing values), researchers are considering the application of fuzzy techniques to address these challenges.
The aim of concept drift detection is to identify when concept drift has occurred so that appropriate measures can be taken to update or retrain the models in question. Several research teams have turned to FCM-based methods to detect concept drift [201], [202]. These two methods derive fuzzy membership functions from the data stream and use the membership results to mine concept drift patterns. Zhang et al. [206] designed a new drift detection model based on fuzzy set theory to address drift problems associated with user interests for recommender systems, while Dong et al. [207] developed a data-distribution-based drift detection method for business intelligence and data-driven decision support systems that incorporates fuzzy set theory. In both these methods, fuzzy set theory is used to handle the challenging issue that an item's features and its related information are usually incomplete and imprecise. Along these lines, Liu et al. [222] proposed a robust drift detection algorithm that can handle missing values. This algorithm comprises a masked distance learning algorithm to reduce the cumulative errors caused by missing values and a fuzzy-weighted frequency method to identify discrepancies in the data distribution.
Concept drift adaptation refers to the process of updating or modifying a machine learning model in response to concept drift so that it remains accurate and effective over time. Over the past five years, many new adaptation models that fuse fuzzy techniques with machine learning algorithms have been built to deal with the phenomenon of concept drift. The applied fuzzy techniques include fuzzy clustering algorithms [24], [203], [204], [205], fuzzy-rule-based systems [208], [209], [210], [211], and fuzzy time series (FTS) [220], [221]. Song et al. [24], [203], [204] proposed a series of kernel FCM-based adaptive models to handle data stream regression problems with concept drift. In [203] and [204], kernel FCM is used to determine the most relevant learning set, while, in [24], kernel FCM is used to measure the degree to which upcoming examples belong to different patterns. These fuzzy membership values are then embedded in the learning process to handle mixed drift data streams.
In terms of applying fuzzy-rule-based systems, Garcia et al. [210] developed a modified evolving granular fuzzy-rule-based model that incorporates an incremental learning algorithm to simultaneously impute missing data as well as adapt the model's parameters and structure over time. García-Vico et al. [211] proposed an evolutionary fuzzy system to extract knowledge from data streams as a way to adapt to concept drift. Both these methods use type-1 fuzzy systems; by contrast, Pratama et al. [208] proposed an evolving type-2 recurrent FNN to simultaneously address three challenges: data uncertainty, temporal behavior, and system absence. FTS [228] is a mathematical framework that combines fuzzy logic and time-series analysis to model and forecast uncertain and imprecise data over time. FTS is particularly useful in situations where the data have missing values, outliers, or noise, and where traditional time-series models may not perform well. de Lima e Silva et al. [220] introduced a nonstationary FTS, while Severiano et al. [221] introduced an evolving forecasting model based on FTS to deal with concept drift. Moreover, Liu et al. [223] proposed a new concept drift adaptation method based on a fuzzy windowing approach. Unlike traditional windowing methods, this approach employs sliding windows with an overlapping period to enable precise identification of the data instances that belong to different concepts. Focusing on multiple relevant data stream regression with concept drift, Song et al. [224] developed a new adaptation model based on fuzzy drift variance, where the variance is designed to measure the correlated drift patterns among streams.
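As background for the FTS models mentioned above, the following sketch implements a basic first-order fuzzy time series forecaster in the spirit of the classical interval-partition approach: the universe of discourse is split into equal intervals, each observation is assigned to the fuzzy set of the interval it falls in, transitions between consecutive states are grouped by their left-hand side, and the forecast is the mean of the midpoints of the intervals reachable from the current state. It is a stationary baseline only and is not the nonstationary or evolving model of [220] or [221]; all names and defaults are assumptions.

```python
import numpy as np


def fuzzify(values, edges):
    """Index of the interval (fuzzy set with maximal membership) per value."""
    idx = np.searchsorted(edges, values, side="right") - 1
    return np.clip(idx, 0, len(edges) - 2)


def fit_fts(series, n_intervals=7):
    """Partition the universe of discourse and learn first-order rule groups."""
    series = np.asarray(series, dtype=float)
    edges = np.linspace(series.min(), series.max(), n_intervals + 1)
    mids = 0.5 * (edges[:-1] + edges[1:])
    states = fuzzify(series, edges)
    groups = {}  # left-hand state -> set of observed right-hand states
    for a, b in zip(states[:-1], states[1:]):
        groups.setdefault(int(a), set()).add(int(b))
    return edges, mids, groups


def forecast_next(value, edges, mids, groups):
    """One-step-ahead forecast from the latest observed value."""
    state = int(fuzzify(np.array([value]), edges)[0])
    rhs = groups.get(state)
    if not rhs:
        # No rule has this state on its left-hand side: fall back to its midpoint.
        return float(mids[state])
    return float(np.mean([mids[j] for j in sorted(rhs)]))


if __name__ == "__main__":
    history = [0, 10, 0, 10, 0, 10]
    edges, mids, groups = fit_fts(history, n_intervals=2)
    print(forecast_next(10, edges, mids, groups))
```

Because the intervals and rule groups are fixed after fitting, this baseline cannot follow a drifting stream; updating the partition or the rule base as new data arrive is the kind of extension that drift-aware FTS methods pursue.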
In addition, several works simultaneously address concept drift detection and adaptation [212], [213], [225], [226]. For example, Dong et al. [225] introduced an adaptive ensemble algorithm based on fuzzy instance weighting to handle data streams involving concept drift. Yu et al. [213] presented an evolving neuro-fuzzy system for streaming data regression that employs an online topology learning algorithm to self-organize each layer of the proposed system. To effectively detect drift and adapt the learned model, Zhang et al. [226] proposed a novel approach that combines a dynamic intuitionistic fuzzy cognitive map scheme and a concept drift detection algorithm.
More recently, researchers have used fuzzy techniques to address data stream classification and regression problems. These techniques include evolving fuzzy systems [214], [215], [216], neuro-fuzzy systems [217], granular fuzzy-rule-based systems [218], [219], and the fuzzy time-matching method [227]. Not only can these techniques help to improve the performance of streaming data classification and regression in uncertain environments, they can also be applied to handle the phenomenon of concept drift. Table III summarizes these recent achievements in the field of fuzzy data stream learning.
In summary, fuzzy techniques are applied in data stream learning, especially to handle concept drift scenarios, owing to their capacity to handle uncertainty, adapt to changing patterns, and provide interpretable models. These features make fuzzy techniques valuable for detecting, understanding, and adapting to concept drift, leading to better performance than nonfuzzy methods.

VI. FUZZY REINFORCEMENT LEARNING
RL [27] represents a powerful paradigm in machine learning, where agents learn to make decisions through interaction with an environment, guided by a system of rewards or penalties. However, the traditional RL framework is not without its challenges, especially in scenarios where the training process is inherently slow due to complex and uncertain environments [229] or sparse reward signals [230]. Fuzzy RL emerges as a promising approach to address these limitations, leveraging fuzzy logic to enhance training efficiency and overcome the hurdles associated with slow reinforcement processes.
One of the primary advantages of fuzzy RL lies in its adaptability to dynamic (uncertain) environments. In traditional RL, slow training processes can be exacerbated by the challenges posed by dynamic scenarios where the optimal strategy may change rapidly. Fuzzy logic allows the system to gracefully adapt to these changes, incorporating fuzzy rules that capture the gradual transitions and uncertainties in the environment. In addition, in many RL applications, the scarcity of meaningful rewards can impede the learning process, leading to slow convergence or even stagnation. Fuzzy RL introduces the concept of fuzzy rewards [31], enabling the system to consider partial or intermediate successes that may not be fully captured by binary reward signals. This approach helps mitigate the challenge of sparse rewards by providing a more nuanced and continuous feedback mechanism, allowing the agent to learn from a broader spectrum of experiences.
Fuzzy Sarsa learning (FSL) [31] is a critic-only fuzzy RL algorithm that combines the Sarsa algorithm with fuzzy logic. In traditional Sarsa learning, an agent learns through trial and error to take actions that maximize a reward signal in a given environment. In FSL, the state and action spaces are represented as fuzzy sets, which allows for a more gradual transition between states and actions, rather than strict boundaries. The algorithm updates the Q-value function based on the fuzzy membership functions of the current state and action, as well as the fuzzy reward function. The use of fuzzy logic allows FSL to handle more complex environments with uncertain or imprecise information, while still maintaining a high level of performance. As an example, Fathinezhad et al. [237] proposed a novel method of robot navigation that combines supervised learning and FSL. Their method applies a zero-order Takagi-Sugeno fuzzy controller with some candidate actions for each rule as the main module. Also, Hein et al. [238] developed a new particle swarm approach to RL based on fuzzy controllers. This approach builds fuzzy RL policies by training parameters on world models that simulate real system dynamics. Furthermore, Shi et al. [240] developed an adaptive fuzzy comprehensive evaluation method that integrates a fuzzy analytical hierarchy process, a Bayesian network, and RL. The authors successfully applied this method to a robot soccer system, which is a typical complex time-sequence decision-making system.
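The mechanics of a fuzzy Sarsa update can be sketched as follows, assuming a zero-order Takagi-Sugeno structure with a one-dimensional state, Gaussian rule memberships, a shared list of candidate actions for every rule, and epsilon-greedy selection within each rule. These simplifications, together with the class name and its parameters, are assumptions made for illustration; the sketch should not be read as the exact algorithm of [31] or [237].

```python
import numpy as np


class FuzzySarsa:
    """Zero-order TS fuzzy critic with one weight per (rule, candidate action)."""

    def __init__(self, centers, width, actions,
                 alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
        self.centers = np.asarray(centers, dtype=float)  # rule centers (1-D state)
        self.width = float(width)
        self.actions = np.asarray(actions, dtype=float)  # candidates shared by rules
        self.w = np.zeros((len(self.centers), len(self.actions)))
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = np.random.default_rng(seed)

    def memberships(self, s):
        """Normalized firing strengths of the rules for state s."""
        z = -0.5 * ((s - self.centers) / self.width) ** 2
        mu = np.exp(z - z.max())
        return mu / mu.sum()

    def act(self, s):
        """Pick one candidate per rule, then blend actions and Q by firing strength."""
        mu = self.memberships(s)
        choice = np.argmax(self.w, axis=1)
        explore = self.rng.random(len(choice)) < self.epsilon
        choice[explore] = self.rng.integers(len(self.actions), size=int(explore.sum()))
        rows = np.arange(len(choice))
        action = float(mu @ self.actions[choice])
        q = float(mu @ self.w[rows, choice])
        return action, q, mu, choice

    def update(self, mu, choice, q, reward, q_next, done=False):
        """Sarsa temporal-difference step, credited to rules by firing strength."""
        target = reward + (0.0 if done else self.gamma * q_next)
        delta = target - q
        self.w[np.arange(len(choice)), choice] += self.alpha * delta * mu
        return delta


if __name__ == "__main__":
    agent = FuzzySarsa(centers=[-1.0, 0.0, 1.0], width=0.5,
                       actions=[-1.0, 0.0, 1.0], epsilon=0.0)
    action, q, mu, choice = agent.act(0.2)
    print(action, agent.update(mu, choice, q, reward=1.0, q_next=0.0, done=True))
```

In an episode loop, the agent would call `act` on the next state to obtain `q_next` before calling `update` for the current step, which is what makes the method on-policy. Because each rule is credited in proportion to its firing strength, neighboring states share what is learned, giving the gradual transitions described above.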
In the field of control systems, Zhang et al. [231] designed a new fault-tolerant control algorithm by combining RL with a fuzzy augmented model for partially unknown systems with actuator faults. The fuzzy augmented model was inspired by the well-known Takagi-Sugeno fuzzy model. With this algorithm, less information needs to be transmitted, which reduces computational loads during the learning process, even when dynamic matrices are partially unknown. Similarly, Zhang et al. [232] proposed a novel parallel tracking control optimization algorithm using fuzzy RL techniques for partially unknown fuzzy interconnected systems. This algorithm uses the precompensation technique to treat working feedback controls as reconstructed dynamics with virtual controls. This approach to building the model results in a new augmented and interconnected fuzzy tracking system where a valid performance index is guaranteed for optimal control. In the realm of traffic light control systems, Kumar et al. [233] proposed a novel system that is both dynamic and intelligent to overcome issues with long waiting times, fuel waste, and rising carbon emissions. Traditional traffic light systems operate on a fixed duration mode, whereas Kumar's proposed system uses a deep RL model to switch the lights and a fuzzy inference system to select one among three modes based on current traffic information. To mitigate frequency deviations caused by power fluctuations, Yin and Li [239] developed a fuzzy vector RL approach to control how much power a power system generates. The framework also considers flywheel energy storage systems.
Turning to large-scale multiagent RL, Li et al. [234] introduced the concept of fuzzy agents to be used for training homogeneous agents. They also proposed a new RL method that uses fuzzy logic to learn abstract policies. In comparison to other simplification methods, their fuzzy agents both reduce the computing resources required to train a model and ensure that an effective policy is learned. Zhu et al. [235] devised a new control strategy based on RL and a fuzzy wavelet network. The aim here is to improve the stability of the hybrid system's buffer compliance control. To reduce the negative effects of noisy information in communication channels on multiagent RL, Fang et al. [236] developed a two-stream fused fuzzy DNN by applying a fuzzy inference module and a DNN module. Experiments with two large-scale traffic signal control environments demonstrate the proposed method's superior performance. Table IV summarizes these recent achievements in the field of fuzzy RL.
In general, by combining RL and fuzzy logic, fuzzy RL can handle the complexity of real-world environments that involve uncertain information and imprecise data, making it a promising technique for solving problems in fields such as robotics, control systems, and game theory.

VII. FUZZY RECOMMENDER SYSTEMS
In real-world recommender systems, descriptions of user preferences and item features, item values, and business knowledge are often vague, imprecise, and plagued with uncertainty. Further, these issues can occur across the entire recommendation process, from collecting the data to generating the recommendations. Other key problems that can occur with recommender systems include sparsely populated user-item matrices and problems with measuring the similarity of items and users (see Fig. 3). Commonly used fuzzy techniques to deal with these issues include intuitionistic fuzzy sets [241], fuzzy user profiles [242], fuzzy-rule-based systems [243], and fuzzy similarity [244]. This section provides a summary of recent articles focused on these techniques.
Collaborative filtering is a key approach to recommender systems. However, traditional collaborative filtering methods, such as unsupervised clustering, are quite sensitive to uncertainty and therefore often experience high error rates. Once again, FCM or modified FCM algorithms have been implemented to eliminate these issues [245], [246], [247]. For example, FCM has been used to classify the users in a dataset according to the similarity of their item ratings. To improve the quality of recommender systems with sparse datasets, Nilashi et al. [248] designed a hybrid item similarity model that combines an adjusted Google similarity with an intuitionistic Kullback-Leibler similarity based on fuzzy sets. This approach essentially makes a tradeoff between prediction accuracy and efficiency.
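To illustrate how FCM can group users by the similarity of their item ratings, the sketch below runs the standard alternating FCM updates on a small, fully observed rating matrix. It is a generic textbook implementation and not one of the modified algorithms in [245], [246], or [247]; the function name, the toy ratings, and the assumption that no ratings are missing are all illustrative.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard FCM: returns (centers, U) with U[i, j] = membership of row i in cluster j."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)          # each row of memberships sums to 1
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)               # guard against division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.max(np.abs(U_new - U)) < tol
        U = U_new
        if converged:
            break
    return centers, U


if __name__ == "__main__":
    # Rows are users, columns are items; ratings on a 1-5 scale (toy data).
    ratings = np.array([[5, 5, 1, 1],
                        [5, 4, 1, 2],
                        [1, 1, 5, 5],
                        [2, 1, 4, 5]], dtype=float)
    centers, U = fuzzy_c_means(ratings, n_clusters=2)
    print(np.round(U, 3))
```

Unlike hard clustering, each user keeps a graded membership in every group, so predictions for a user with mixed tastes can blend the preferences of several groups. Real rating matrices are sparse, so a practical system would need to handle missing entries, which this sketch does not.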
In terms of content-based filtering recommender systems, Yera et al.'s [249] solution uses a fuzzy decision tree to match the most appropriate function in the individual recommendation aggregation step. Other researchers have also relied on fuzzy-rule-based systems to extract relevant knowledge from uncertain data to improve the performance of knowledge-based recommender systems [250], [251], [252].
Hybrid recommender systems are another area of research progress. Here, Walek and Fajmon [253] designed a new hybrid recommender system that combines a collaborative filtering system, a content-based filtering system, and a fuzzy expert system to enhance recommendation performance. The fuzzy expert system is used to evaluate the importance of the recommended products with vague information and rank them appropriately for users. Table V summarizes these recent studies in the field of fuzzy-based recommender systems.
In summary, fuzzy techniques provide a rich spectrum of methods for managing uncertainty, vagueness, and imprecision in data both during the learning process and when making recommendations. In particular, fuzzy techniques are well suited to handling imprecise user preference descriptions (e.g., in linguistic terms), knowledge description, and the gradual accumulation of user preference profiles. Therefore, applying fuzzy techniques in recommender systems can yield more efficient and accurate performance than nonfuzzy models.

VIII. FUTURE RESEARCH DIRECTIONS
So far, we have summarized recent achievements of FML. In this section, we further discuss FML's current research trends and share some insights on future research directions.

A. Fuzzy Classic Machine Learning
Most current research in fuzzy classic machine learning focuses on the following aspects: 1) handling noisy or incomplete data; 2) addressing imbalanced datasets; and 3) enhancing algorithms' interpretability and robustness. Analyzing imprecise data (fuzzy-valued or interval-valued) [84] has not received widespread attention. However, in many real-world scenarios, we will inevitably encounter this kind of data. Therefore, it would be a promising direction to investigate how to apply fuzzy logic to analyze imprecise data.
Moreover, deep learning [86] has made significant strides, but it still faces several challenges. DNNs, particularly complex architectures like deep convolutional or recurrent networks, are often viewed as black boxes. Understanding how these models arrive at specific decisions is crucial, especially in applications where interpretability is essential, such as health care and finance. In addition, deep learning models are vulnerable to adversarial attacks [276], where small, carefully crafted perturbations to input data can lead to misclassification. Fuzzy techniques are potential tools for overcoming these challenges, and we suggest exploring them in future work.

B. Fuzzy Transfer Learning
Recent fuzzy transfer learning works [191], [277] mainly focus on applying fuzzy-rule-based systems to model the uncertainty and variability between different source and target domains, enabling more effective adaptation of knowledge from the source to the target domain. However, open-set problems [165], [278], where the target domain contains private categories, have gained increasing attention in transfer learning. Detecting unknown classes is a challenging problem that involves a large degree of uncertainty. We believe that applying fuzzy techniques to address this challenging problem is worth investigating in future work.

C. Fuzzy Data Stream Learning
A couple of recent works [222], [224] in fuzzy data stream learning are focused on developing adaptive fuzzy models that can effectively handle concept drift in data streams. Learning from multiple streams [20] is a crucial and challenging problem in data stream learning, especially when streams have different rates, arrive asynchronously, or experience delays. Streams may vary in terms of data types, formats, and modalities. In addition, there is an uncertain relationship between each pair of streams. Traditional machine learning algorithms face difficulty in addressing these challenges. Therefore, we recommend that researchers use fuzzy techniques in future studies to tackle these issues.

D. Fuzzy Reinforcement Learning
Recent research [233], [237] is mainly focused on integrating fuzzy systems with RL for improved performance in complex and dynamic environments. Research is exploring the integration of fuzzy logic into Q-learning algorithms to handle uncertainties in estimating state-action values. Furthermore, fuzzy logic is applied to model and handle uncertain or imprecise reward signals in RL. However, in multiagent RL [234], capturing complex relationships and dependencies between agents while maintaining a scalable and efficient learning process is a key challenge. In addition, agents in a multiagent system may have diverse capabilities, objectives, or learning speeds. Coordinating heterogeneous agents and ensuring fair and effective collaboration is another challenging problem. We believe that it would be a promising direction to investigate how to apply fuzzy techniques to address these challenges.

E. Fuzzy Recommender Systems
Fuzzy techniques are mainly used to handle imprecise user preference descriptions (e.g., in linguistic terms), knowledge description, and the gradual accumulation of user preference profiles in fuzzy recommender systems [260], [264]. Cross-domain recommendations [279], where recommendations are made across different domains or platforms, present several challenges due to the diversity and heterogeneity between domains. Different domains may have distinct characteristics, user behaviors, and item features. Moreover, there will be uncertain relationships between different domains. Applying fuzzy techniques to address these challenges in cross-domain recommendations is a promising direction for future work.

IX. SUMMARY
In this article, we reviewed recent developments across the five main research streams of FML. Our review shows that fuzzy techniques can significantly improve machine learning algorithms by providing a way to handle different uncertainty situations. The main improvements are reflected in the following five aspects: 1) enhancing the representation of the inputs; 2) improving the learning process of different machine learning algorithms; 3) enhancing measurement accuracy and reliability; 4) improving the accuracy of the matching function; and 5) enhancing the performance (e.g., accuracy, robustness, and interpretability) of the output results.
In future research, several new directions in the field of FML warrant thorough consideration; for instance, applying fuzzy techniques to address open-set transfer learning problems, where the target domain encompasses classes that are unknown in the source domain. In addition, multistream learning, multiagent RL, and cross-domain recommendations are three challenging problems that are far from being solved. They all involve intricate relationships and heterogeneous information, posing difficulties for traditional machine learning algorithms. Fuzzy techniques emerge as promising tools for investigating and addressing these complex problems.
We believe that this survey can provide researchers with SOTA knowledge on machine learning based on fuzzy techniques and offer guidance on future research directions in the field of FML.

Manuscript received 4 July 2023; revised 8 January 2024 and 7 March 2024; accepted 1 April 2024. Date of publication 11 April 2024; date of current version 2 July 2024. This work was supported by the Australian Research Council under Grant FL190100149 and Grant DP220102635. Recommended by Associate Editor F. Doctor. (Corresponding author: Jie Lu.)
Jie Lu, Fellow, IEEE, Guangzhi Ma, Student Member, IEEE, and Guangquan Zhang (Survey Paper)
Table I provides a summary of the SOTA literature related to deep learning with FNNs.

TABLE II SUMMARY OF THE SOTA PAPERS IN FUZZY TRANSFER LEARNING

TABLE III SUMMARY OF THE SOTA ACHIEVEMENTS IN FUZZY DATA STREAM LEARNING

TABLE IV SUMMARY OF THE SOTA ACHIEVEMENTS IN FUZZY RL

Fig. 3. Fuzzy recommender system framework.

TABLE V SUMMARY OF THE SOTA FUZZY-TECHNIQUE-BASED RECOMMENDER SYSTEM ACHIEVEMENTS