Abstract:
In modern process industries, data-driven soft sensors are important for predicting key performance indicators (KPIs) that are difficult to measure directly. However, industrial data are usually characterized by uncertain time series, varying degrees of outliers, multiple redundant variables, and abundant unlabeled data, all of which make data-driven modeling difficult. To address these difficulties, a semi-supervised and robust data-driven modeling algorithm is proposed. First, t-distributed stochastic neighbor embedding (t-SNE) is applied to reduce the dimensionality of unlabeled samples. Second, a bidirectional long short-term memory (BiLSTM) network based on a capped Huber loss is developed to deal with outliers, and the least absolute shrinkage and selection operator (LASSO) is introduced to remove redundant variables. Finally, experiments on an artificial dataset and an industrial dataset demonstrated that the developed algorithm achieved higher prediction accuracy than other state-of-the-art methods. Furthermore, ablation studies were conducted to evaluate the contribution of each technique to the model performance.
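The abstract does not give the exact form of the capped Huber loss or of the LASSO term, so the following is only a minimal sketch of one common reading: a Huber loss whose value stops growing once the absolute residual exceeds a cap, plus an L1 penalty on the weights. The function names and the parameters `delta`, `cap`, and `lam` are illustrative assumptions, not the authors' definitions.

```python
def capped_huber(residual, delta=1.0, cap=3.0):
    """Huber loss that stays constant once |residual| exceeds `cap`.

    Assumed form (not taken from the paper):
      quadratic for |r| <= delta,
      linear for delta < |r| <= cap,
      constant for |r| > cap, so extreme outliers add a bounded loss.
    """
    if cap < delta:
        raise ValueError("cap must be at least delta")
    r = abs(residual)
    if r <= delta:
        return 0.5 * r * r
    # Clamp the residual at the cap so the loss stops growing beyond it.
    return delta * (min(r, cap) - 0.5 * delta)


def l1_penalty(weights, lam=0.01):
    """LASSO-style L1 penalty; large `lam` pushes weights toward zero."""
    return lam * sum(abs(w) for w in weights)


def total_loss(residuals, weights, delta=1.0, cap=3.0, lam=0.01):
    """Mean capped Huber loss over the residuals plus the L1 penalty."""
    data_term = sum(capped_huber(r, delta, cap) for r in residuals) / len(residuals)
    return data_term + l1_penalty(weights, lam)
```

With these assumed defaults, a residual of 10 and a residual of 100 contribute the same loss, which is the property that limits the influence of extreme outliers; a plain Huber loss would keep growing linearly.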
Published in: IEEE Transactions on Instrumentation and Measurement (Volume: 74)
- IEEE Keywords
- Index Terms
- Short-term Memory
- Long Short-term Memory
- Long Short-term Memory Network
- Semi-supervised Learning
- Bidirectional Long Short-term Memory
- Soft Sensor
- Time Series
- Absolute Shrinkage
- Unlabeled Data
- T-distributed Stochastic Neighbor Embedding
- Data-driven Models
- Key Performance Indicators
- Industrial Data
- Artificial Datasets
- Redundant Variables
- Huber Loss
- Loss Function
- Training Data
- Artificial Neural Network
- Prediction Error
- Input Variables
- Flotation Process
- Mean Absolute Error
- Extreme Outliers
- Gated Recurrent Unit
- Target Variable
- Training Time
- Mean Absolute Percentage Error
- Foaming Agent
- Locally Linear Embedding