Learner Emotional Value Recognition Technology: An Artificial Neural Network Optimized by the Grey Wolf Algorithm

In cross-cultural communication, multimedia animation is crucial in defining a nation’s image and cultural form. It is a key vehicle for cultural diffusion and a tool for film and television to convey national culture and highlight regional culture. Animation’s particular charm, position, and function in cultural dissemination are further highlighted by the special cinematic language used to portray emotions, which also reflects the medium’s unique significance in a rapidly evolving modern society. In-depth research is needed to better understand the emotions generated by various multimedia animations. To investigate these issues, this article explores the use of the sigma-pi artificial neural network (SP-ANN) algorithm, trained with the grey wolf optimization algorithm (GWOA), to identify emotional states. Unlike traditional sigma-pi artificial neural network algorithms, the training process requires none of the complex derivative calculations used in derivative-based algorithms, and sigma-pi networks can benefit from the proposed learning algorithm. The algorithm has high approximation accuracy and is particularly suitable for real-time approximation of nonlinear processes. The test results indicate that the proposed algorithm works as expected.


I. INTRODUCTION
The universality of emotions refers to the fact that virtually no intelligent human behavior can be separated from emotion [1], [2]. It is also difficult to imagine a person living in an environment without feelings; even the simplest intelligence of seeking benefit and avoiding harm is closely related to emotions. At the same time, however, emotions are so subtle that even emotional communication between humans is difficult to grasp. On the other hand, there is broad market demand in fields such as film and television animation, virtual agents, video chat, and virtual hosts. Therefore, the research direction of emotional animation has attracted more and more attention from researchers [3].
The associate editor coordinating the review of this manuscript and approving it for publication was Ali Shariq Imran.
Emotional animation research encompasses various fields, such as object detection, recognition, and tracking in video processing, computer 3D modeling, and animation generation [4], [5], as well as the definition and classification of emotions in psychology and the impact of emotions on human behavior. Research in each of these fields has a long history, but the organic integration of these subsystems, represented in a computer-understandable language, is only beginning. Related scholars have attempted to integrate audio emotion modeling and video emotion modeling to establish a multimedia-driven emotion animation system [6].
Emotions and learning outcomes can be influenced by the particular qualities of students' visual and linguistic perception of multimedia resources. On the other hand, user emotional data can serve as an implicit tool for managing the learning process, making it nonlinear and branching and properly incorporating recurring actions, thereby making educational resources effective and customized. The most recent developments in comprehensive and in-depth automatic facial expression detection are found in [7], [8], [9], and [10].
Several fundamental steps must be taken to solve the issue of automatic emotion recognition in humans: locating facial areas in images during image preprocessing, allocating facial features, and detecting emotion from the feature eigenvalues. This article suggests an improved SP-ANN algorithm based on the GWOA to address the issue of computer emotion recognition.
The goal of this research is to develop artificial intelligence optimization algorithms that are distinct from derivative-based optimization algorithms for SP-ANN training. There are some drawbacks to using derivative-based methods for optimization. The Hessian matrix, used by some derivative-based optimization techniques, necessitates understanding the second derivative. Determinant knowledge is needed in this situation. However, the determinant might be zero in some computations. Numerous derivative-based optimization methods could also encounter local minima.
The remainder of this article is organized as follows. The second part introduces related work. The third part presents the network structure of animation emotions. The SP-ANN technology and GWOA are introduced in the fourth part. The fifth part is the application section. The discussion and conclusion in the sixth part make up the conclusion section.

II. RELATED WORKS
Media animation is incomparable in its intuitiveness and effectiveness to other media forms; it is a significant medium and cultural form of cross-cultural communication for the general audience. In engaging audiovisual communication, multimedia animation contributes significantly to the penetration and spread of culture. When analyzing the effect of discourse reconstruction on cross-cultural communication in animated movies, no statistical inferences about the data were made; the analysis was based solely on estimating the similarity or distance function of a group of clustering objects, which is known as unsupervised learning in the field of machine learning [11], [12]. In real-world applications, even incomplete supervision is frequently thought to enhance the performance of unsupervised clustering. Several currently used semi-supervised clustering methods were created by adding supervised information to conventional clustering algorithms.
Using Google's Word2Vec to generate word vectors as CNN input, Hagendorff suggested a Word2Vec convolutional neural network framework. This was the first time Word2Vec and a CNN employed a seven-layer architecture model to assess phrase sentiment. Three convolutional and pooling layers created an acceptable CNN architecture for sentiment analysis computations. The experimental results demonstrate that convolutional neural networks can outperform shallow classification techniques through pre-training and fine-tuning. Convolutional neural networks are better than other machine learning models at training for problems involving natural language processing [13], [14].
The perceptron is a notion that Zhang suggested and that has since gained prominence and significance in the study of neural networks. With a clear understanding of the solutions to linearly separable problems, he initially put forward the concepts of self-organization and self-learning. A strict mathematical proof and a convergence procedure were presented. Later models were developed based on this guiding concept or its advancements and generalizations [15]. To solve the spectral decomposition problem of the graph matrix, Shi et al. introduced the spectral clustering algorithm, which is based on graph partitioning theory.
Pairwise constraints are used in the semi-supervised spectral clustering algorithm to modify the data similarity matrix and improve clustering [16]. Wang and Zhang proposed a discriminative semi-supervised clustering algorithm based on pairwise constraints [17]. This algorithm efficiently reduces data dimensionality and clusters data by integrating pairwise constraint information. The approach preprocesses the entire dataset using the pairwise constraints, generates a feature projection matrix based on them, and then applies a pairwise-constrained K-means algorithm to cluster the data in a projection space chosen to separate the classes.
The artificial neural network method is a compelling emotional value recognition technique [18], [19], [20]. The term ''feedforward NN model'' here refers to a high-order ANN that combines additive and multiplicative units in its architecture [21]. These NNs also have higher computational power than many ANN models. The higher-order ANN has a more straightforward design and fewer weights than the multilayer perceptron, one of the most widely used NN models in the time series prediction literature, because it has fewer trainable weight layers. High-order ANNs can, therefore, learn more quickly than many other ANNs.
The sigma-pi artificial neural network (SP-ANN) and pi-sigma artificial neural network (PS-ANN) are two commonly used high-order neural network models [22], [23]. In SP-ANN, the activation function is applied to the weighted sum of the input signals, which helps control the activation of neurons and can improve the stability and convergence of the network in some cases. Relevant researchers have contributed significantly to the study of SP-ANN. For time series prediction, an enhanced SP-ANN technique based on the backpropagation learning algorithm is proposed in [24]. A genetic optimization algorithm for SP-ANN can be found in [25]. In [26], the author used the steepest descent optimization method to train SP-ANN. The SP-ANN approach was applied in [27] to identify emotions. However, these methods are derivative-based, and some rely on the second derivative. In that case determinant information is required, and because the determinant may evaluate to zero, training may fall into a local minimum in certain cases.
Although SP-ANN performs well in many tasks, its performance is still affected by hyperparameter selection and by falling into local optima. Using the GWOA to optimize SP-ANN can help overcome these potential drawbacks, improve the performance and robustness of the neural network, and make it more adaptable to complex tasks and data distributions. This combination allows the potential of SP-ANN to be utilized more effectively.
According to the substantial literature investigation conducted for this study, there has been no survey evaluating emotional research across various multimedia animations. Emotion identification technology based on people's physiological information has recently proliferated. As a result, it is now possible to examine emotional concerns in various multimedia animations.

III. FEATURE DATA EXTRACTION IN ANIMATION
When conducting emotional analysis of multimedia animation, the first step is to convert animation information into information a computer can process. Accordingly, animation data must first be converted into forms that machine learning algorithms can process readily. The most fundamental animation unit is the frame number. Frame numbers are formalized for training and decision-learning models by converting them into real numbers or vectors. Word2Vec is applied for coarse-grained vectorization to categorize coarse-grained and fine-grained animation emotions.
The Word2Vec model addresses the vector sparsity and loss of semantic information that arise when one-hot codes are used to vectorize words. As a result, word vector representation becomes simpler, and word vectors with related semantics can be clustered together. The Word2Vec model takes a text corpus as input and produces a vector space as output. The model produces a low-dimensional vector for each word that preserves contextual information while compressing the data size. The input layer is a one-hot vector of the context word, and the hidden layer has no nonlinear unit or activation function. The Word2Vec model's neural network architecture is shown in Figure 1.
The input, hidden, and output layers of the Word2Vec word vectorization tool are effectively the three layers of a neural network model, as seen in Figure 1. This method can translate the original word vector space to a new word vector space and represent the semantics of the text in a distributed digital form. Additionally, it can map words with related semantics to nearby locations in the new vector space. Word vectors also make clustering analysis easier by helping researchers identify words with similar meanings using Euclidean distance or cosine similarity.
The continuous bag-of-words (CBOW) model presupposes that the context determines the current word. Because the CBOW model takes numerous contextual terms, their word vectors are averaged. For any m-tuple in the training corpus with window size W, the hidden representation produced is the average of the input word vectors:

h = (1 / 2W) Σ_{−W ≤ j ≤ W, j ≠ 0} M(w_{t+j})

where p(w_t | context) denotes the likelihood of correctly predicting the primary target word. The word vectors V, collected in the matrix M, serve as the sole neural network parameters in the CBOW model. The word vector matrix M is optimized to maximize the logarithmic likelihood of each word:

L(M) = Σ_t log p(w_t | w_{t−W}, ..., w_{t−1}, w_{t+1}, ..., w_{t+W})
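The CBOW forward pass described above, averaging the context word vectors stored in M and then scoring every vocabulary word, can be sketched as follows. This is an illustration only: the toy vocabulary, dimensions, and all names (`cbow_predict`, `M`, `U`) are invented for the sketch and do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["cat", "sat", "on", "the", "mat"]
V, d = len(vocab), 4                     # vocabulary size, embedding dimension
M = rng.normal(scale=0.1, size=(V, d))   # input word-vector matrix (the model parameter)
U = rng.normal(scale=0.1, size=(d, V))   # output projection

def cbow_predict(context_ids):
    """Average the context word vectors, then softmax over the vocabulary."""
    h = M[context_ids].mean(axis=0)      # hidden vector: average of context vectors
    scores = h @ U
    p = np.exp(scores - scores.max())
    return p / p.sum()                   # p[w]: likelihood of each candidate centre word

p = cbow_predict([0, 2, 3])              # context: "cat", "on", "the"
```

In training, M (and U) would be adjusted to raise `p` at the index of the observed centre word, which is exactly the log-likelihood objective above.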

IV. STRUCTURES OF NN FOR SENTIMENT ANALYSIS

A. SP-ANN ALGORITHM
SP-ANN is a feedforward neural network model comprising an input layer, a hidden layer, and an output layer. The weights between the hidden and output layers are held at constant values, so training optimizes only the weights between the input and hidden layers. Due to this feature, SP-ANN is more computationally efficient than multilayer perceptron artificial neural networks and has fewer parameters. Figure 2 depicts the K-th order, P-input SP-ANN structure.
FIGURE 2. The architecture of Kth-order P-input SP-ANN.
In Figure 2, the Q matrix represents the weight matrix from the network input to the hidden layer, and the vector δ contains the bias values of the hidden-layer units. Equation (3) gives the form of the Q matrix and the vector δ:

Q = (q_ij), i = 1, ..., P, j = 1, ..., K;  δ = (δ_1, ..., δ_K)^T   (3)

Here, q_ij is the weight from the i-th input unit to the j-th hidden-layer unit, and δ_j stands for the j-th hidden-layer unit's border (bias) value.

Equation (4) yields the output of the j-th hidden-layer unit, represented by h_j:

h_j = Σ_{i=1}^{P} q_ij x_i + δ_j   (4)

With the linear activation function f_1(x) = x, formula (5) yields the output of the network y:

y = f_1 ( Π_{j=1}^{K} h_j )   (5)
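A minimal sketch of this forward pass, assuming the sums-then-product reading of Figure 2, follows; the function name and the numeric values are invented for illustration.

```python
import numpy as np

def sp_ann_forward(x, Q, delta):
    """Forward pass for the K-th order, P-input network of Figure 2.

    Each hidden unit computes the weighted sum h_j = sum_i q_ij * x_i + delta_j
    (formula (4)); the output unit combines the hidden outputs and applies the
    linear activation f1(x) = x (formula (5)).
    """
    h = x @ Q + delta        # hidden-layer outputs, shape (K,)
    return np.prod(h)        # y = f1(prod_j h_j), with f1 the identity

x = np.array([0.5, -1.0])                  # P = 2 inputs
Q = np.array([[1.0, 0.0], [0.0, 1.0]])     # P x K weight matrix
delta = np.array([0.5, 2.0])               # hidden-layer biases
y = sp_ann_forward(x, Q, delta)            # h = [1.0, 1.0], so y = 1.0
```

Note that only Q and δ are trainable, which is why the model has far fewer parameters than a multilayer perceptron of comparable order.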

B. GREY WOLF OPTIMIZATION ALGORITHM
Seyedali Mirjalili [28], a scholar at Griffith University in Australia, created the GWOA in 2014. It is a swarm intelligence algorithm that imitates grey wolves' social structure and foraging strategies in the wild. It is easy to program, has few adjustment parameters, and is straightforward to operate.

1) PREDATION BEHAVIOR OF GRAY WOLF POPULATIONS
The social carnivores known as grey wolves have clear social class distinctions. Gray wolves can be divided into four categories within a wolf pack: a, b, c, and d. The superior gray wolf rules over the inferior gray wolf, and the latter submits to and carries out the superior wolf's directives.
Wolf a symbolizes the head of the pack and sits at the top of the pyramidal structure in Figure 3. This wolf is responsible for making choices regarding hunting, resting, and other activities. The second stratum, b, is a's subordinate, whose job is to support a's decision-making; it is also the best candidate to become the new a when the current a loses its advantage. Although c controls the underlying d, it still complies with a's and b's requests. Most of the gray wolves in the group fall into class d, at the bottom. They are responsible for carrying out the decisions made by the three upper gray wolf levels and for managing relationships within the pack.
In the optimization process, we regard the best solution as a, the second-best solution as b, the third-best solution as c, and the rest as d. To complete the predation behavior and realize the global optimization search, the gray wolf groups at all levels carry out the three phases of encirclement, pursuit, and attack through the GWO iteration process, in which a, b, and c direct d. We discuss each of these three phases in more detail below.

2) GWO MATHEMATICAL MODEL
To model the social hierarchy of grey wolves numerically in GWO, the top three best wolves (optimal solutions) are designated a, b, and c, and they direct the other wolves toward the target. The remaining wolves (candidate solutions) are designated d, and they reposition themselves around a, b, and c.
A: Surround prey
When foraging near prey, gray wolves first encircle it. Equation (6) reflects the distance between an individual and its prey, and Equation (7) is the formula for updating a gray wolf's position:

D = |C · X_P(t) − X(t)|   (6)

X(t + 1) = X_P(t) − A · D   (7)

where A and C are coefficient vectors, X_P and X are the position vectors of the prey and the gray wolf, respectively, and t is the current iteration. A and C are calculated as follows:

A = 2a · r_1 − a   (8)

C = 2 · r_2   (9)

where a is the convergence factor, which decreases linearly from 2 to 0 with the number of iterations, and r_1 and r_2 are random vectors whose components lie in [0, 1].

B: Hunting
Grey wolves can locate their prey and encircle it: the gray wolf locates the prey, and b and c lead the wolf pack to surround it under the guidance of a. In the decision space of an optimization problem, however, the ideal solution (the location of the prey) is unknown. To mimic the hunting behavior of gray wolves, we therefore suppose that a, b, and c have the best estimate of the likely location of the prey: we keep the top three solutions found so far and use their positions to locate the prey. The positions of these optimal individuals compel the other gray wolves, including d, to update their positions as they approach the prey. Figure 4 depicts the process of individual prey tracking within a wolf pack. The mathematical model used by individual gray wolves to pinpoint the location of the prey is as follows:

D_a = |C_1 · X_a − X|,  D_b = |C_2 · X_b − X|,  D_c = |C_3 · X_c − X|   (10)

X_1 = X_a − A_1 · D_a,  X_2 = X_b − A_2 · D_b,  X_3 = X_c − A_3 · D_c   (11)

X(t + 1) = (X_1 + X_2 + X_3) / 3   (12)

where D_a, D_b, and D_c are the distances between a, b, c and the other individuals, respectively; X_a, X_b, and X_c are the current positions of a, b, and c; C_1, C_2, and C_3 are random vectors; and X is the current position. Formula (11) defines the step size and direction of an individual gray wolf d towards a, b, and c, while formula (12) defines the final position of gray wolf d.
C: Attack and search for prey
The gray wolf completes the hunt by attacking once the prey stops moving. As the value of a steadily declines to represent the approach to the prey, the fluctuation range of A declines with it, as depicted in Figure 5. When a drops linearly from 2 to 0 during the iteration process, the corresponding value of A varies within the range [−a, a]. When A lies in this range, the next location of a gray wolf can be anywhere between its present location and the prey. The wolf pack attacks its prey when |A| < 1; otherwise, the wolves are driven away from the current prey to search for better prey.
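Formulas (6) through (12) condense into a single position update per wolf. The sketch below is illustrative only: the function name `gwo_step`, the toy leaders at the origin, and the 50-step schedule are our assumptions, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)

def gwo_step(X, X_a, X_b, X_c, a):
    """Move one wolf X toward the three best wolves a, b, c.

    For each leader: C = 2*r2 (formula (9)), A = 2*a*r1 - a (formula (8)),
    D = |C * leader - X| (formula (10)), candidate = leader - A * D
    (formula (11)); the new position is the average of the three candidates
    (formula (12)).
    """
    candidates = []
    for leader in (X_a, X_b, X_c):
        r1, r2 = rng.random(X.shape), rng.random(X.shape)
        A = 2 * a * r1 - a
        C = 2 * r2
        D = np.abs(C * leader - X)
        candidates.append(leader - A * D)
    return sum(candidates) / 3.0

# Toy use: leaders fixed at the origin pull a wolf starting at (5, 5) inward
# as the convergence factor a decays linearly from 2 to 0.
X = np.array([5.0, 5.0])
for t in range(50):
    a = 2 - 2 * t / 50
    X = gwo_step(X, np.zeros(2), np.zeros(2), np.zeros(2), a)
```

Early on, |A| can exceed 1, so the wolf is sometimes pushed away from the leaders (exploration); as a shrinks, |A| < 1 dominates and the wolf closes in (the attack phase described above).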

C. OPTIMIZED SP-ANN THROUGH GWOA
In this paper, SP-ANN is trained using GWOA. As a result, SP-ANN's efficiency is enhanced using one of the most recent artificial intelligence optimization algorithms.
The role of GWOA in SP-ANN is to optimize the hyperparameters of SP-ANN to improve the performance of neural networks on specific tasks.The main functions are as follows.
1) Hyperparameter optimization: The performance of SP-ANN is influenced by hyperparameters such as the learning rate, the number of neurons, and the number of layers. The main function of GWOA is to search the hyperparameter space to find the optimal hyperparameter combination, thereby improving the performance of SP-ANN on a given problem.
2) Global search: GWOA has global search properties, which helps avoid falling into local optima. It searches the hyperparameter space extensively by simulating the collaborative behavior of gray wolf populations and thus has the opportunity to find better hyperparameter combinations.
3) Automation: GWOA provides an automated method for finding the best hyperparameters without manual adjustment. This can save considerable time and effort, especially when the hyperparameter space is very large or complex.
In short, GWOA is a swarm intelligence algorithm that simulates the collaborative behavior of gray wolf populations to find the best solution to an optimization problem, while SP-ANN is a neural network structure whose performance depends on hyperparameters such as the learning rate, number of neurons, and number of layers. By combining GWOA with SP-ANN, the hyperparameters of SP-ANN can be searched and optimized automatically, improving the performance of the neural network on specific tasks. The detailed stages of this method are as follows.
Stage 1: Determine the variables used in the SP-ANN learning procedure, for example the lag number (n), the SP-ANN degree (deg), and the number of wolves in the wolf pack (ks).
Stage 2: Create the initial wolf group. Each wolf corresponds to one GWOA solution: during the learning process, the weight and bias values of the SP-ANN identify each wolf's position in the pack.
Equation (15) gives the RMSE in terms of the number of learning samples m, the observed values x_t, and the predicted values x̂_t:

RMSE = sqrt( (1/m) Σ_{t=1}^{m} (x_t − x̂_t)^2 )   (15)

In equation (13), x_{t−i} is the lag corresponding to the i-th input. Stage 4: Determine the order of the gray wolves, namely a, b, c, and d. From the preceding phase, the wolf with the lowest RMSE value is a, the wolves with the next two lowest RMSE values are b and c, and the remaining wolves are d.
Stage 5: Update the position of every wolf in the pack according to formulas (11) and (12). Stage 6: Further update wolves a, b, and c based on the updated locations of the wolf pack from the previous step.
Stage 7: The process ends if either the maximum number of iterations has been reached or the best wolf's fitness value is lower than the predefined error value e. If not, return to Stage 4.
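Putting the stages together, the following sketch trains the SP-ANN of Figure 2 with GWOA, using the RMSE of formula (15) as the fitness function. The toy data, the small pack, and the short iteration budget are invented for illustration (a real run would use the reported settings of 30 wolves and 2000 iterations), and all function names here are ours.

```python
import numpy as np

rng = np.random.default_rng(2)

def rmse(w, X, y, P, K):
    """Fitness of one wolf: RMSE of the SP-ANN encoded by the weight vector w."""
    Q = w[:P * K].reshape(P, K)                # input-to-hidden weights
    delta = w[P * K:]                          # hidden-layer biases
    pred = np.prod(X @ Q + delta, axis=1)      # forward pass, formulas (4)-(5)
    return np.sqrt(np.mean((y - pred) ** 2))   # formula (15)

def gwo_train(X, y, P, K, n_wolves=30, iters=200):
    dim = P * K + K
    wolves = rng.uniform(-1, 1, (n_wolves, dim))      # Stage 2: initial pack
    best_w, best_err = None, np.inf
    for t in range(iters):
        fit = np.array([rmse(w, X, y, P, K) for w in wolves])  # Stage 3: fitness
        order = np.argsort(fit)
        if fit[order[0]] < best_err:                  # keep the best wolf ever seen
            best_err, best_w = fit[order[0]], wolves[order[0]].copy()
        Xa, Xb, Xc = wolves[order[:3]]                # Stage 4: rank a, b, c
        a = 2 - 2 * t / iters                         # convergence factor 2 -> 0
        for i in range(n_wolves):                     # Stage 5: formulas (10)-(12)
            new = np.zeros(dim)
            for leader in (Xa, Xb, Xc):
                A = 2 * a * rng.random(dim) - a
                C = 2 * rng.random(dim)
                new += leader - A * np.abs(C * leader - wolves[i])
            wolves[i] = new / 3.0
    return best_w, best_err                           # Stage 7: best solution

# Toy target y = x1 * x2, which a 2-input, 2-hidden-unit network can represent.
Xd = rng.uniform(-1, 1, (40, 2))
yd = Xd[:, 0] * Xd[:, 1]
w, err = gwo_train(Xd, yd, P=2, K=2)
```

Because every wolf (including the current leaders) moves at each step, the best-so-far solution is recorded explicitly; no derivative of the RMSE is ever needed, which is the point made about derivative-based methods above.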

V. APPLICATION

A. EXPERIMENTAL PREPARATION
To evaluate the effectiveness of the algorithm suggested in this article, sixty-six children from a local school were chosen for the experiment. They were split into three groups and given access to various multimedia animation resources, including interactive, vector, and traditional animation. Each educational resource lasts 10 minutes and is chosen at random. Each group then learned from one of the three multimedia resources, and the students reported their emotions in the post-test.
We set the Grey Wolf algorithm's initial settings before the experiment: the maximum number of iterations is 2000, and the initial wolf pack contains 30 wolves.

B. EXPERIMENTAL ANALYSIS
To verify the effectiveness of the improved activation function PPReLU, the ANN structure in Figure 2 is adopted, and ReLU, PReLU, and PPReLU are used as activation functions for training. This study also compared three artificial neural networks: B-HANN, LSTM-ANN, and SP-ANN. The outcomes of the experiment are as follows.
From Table 2, it can be seen that the method proposed in this article has a faster iteration speed.
The RMSE of three groups of learners based on three types of animation under different algorithms is shown in Figure 6.
Based on the above results, the following conclusions can be drawn. For traditional animation, compared to LSTM neural networks, SP-ANN, and B-HANN, the proposed method in this paper reduced average RMSE values by 8.8878, 6.5144, and 6.3758, respectively.
For vector animation, compared to LSTM neural networks, SP-ANN, and B-HANN, the proposed method in this paper reduced average RMSE values by 12.9294, 12.5282, and 8.9938, respectively.
For interactive animation, the method proposed in this paper likewise reduced the average RMSE values compared to the LSTM neural network, SP-ANN, and B-HANN. It can be seen that the algorithm proposed in this article has low measurement errors for different types of animation.

C. SENSITIVITY ANALYSIS
To further verify the effectiveness of the algorithm proposed in this article, a sensitivity analysis was conducted. This section uses different numbers of wolves for the experiments, namely small, medium, and large wolf packs.
A comparison of the number of convergence iterations and the convergence time for different numbers of wolves is shown in Table 3. It can be seen that increasing the number of gray wolves increases the computational resource requirements of the algorithm. Each gray wolf must maintain its position and fitness values and participate in the update process, which requires more memory and computing power, increasing the computation time and the number of iterations.
The RMSE results of the three groups of learners for the three types of animation under different numbers of wolves are compared in Figure 7. It can be seen that as the number of gray wolves increases, the resulting RMSE decreases.
Therefore, the number of wolves should be chosen according to actual needs to balance iteration time against experimental accuracy.

VI. CONCLUSION
Multimedia animation contains many emotional elements, which help people clearly understand the meaning conveyed by animation and contribute to the dissemination of culture. Starting from computer technology, this article uses the Grey Wolf algorithm to train a traditional SP-ANN to identify emotions in multimedia animation. Based on the analysis results, it can be concluded that the proposed method outperforms the compared methods, mainly in prediction accuracy and error.

FIGURE 1. NN structure of the Word2Vec model.

FIGURE 3. The social structure of grey wolves.

FIGURE 4. Individual prey tracking within a wolf pack.

FIGURE 5. Attack and search for prey.

FIGURE 6. Compare the RMSE results of three groups of learners in different algorithms for three types of animations.

FIGURE 7. Compare the RMSE results of three groups of learners in different numbers of wolf packs for three types of animations.
Table 1 displays the positions of wolves made up of SP-ANN weights and bias values. Stage 3: Calculate each wolf's fitness function value, using the root-mean-square error (RMSE) criterion shown in formula (15) to determine each wolf's fitness within the wolf group.

TABLE 1. Positions of a wolf.

TABLE 2. Comparison of the number and time of convergence iterations using different methods.

TABLE 3. Comparison of the number and time of convergence iterations using different numbers of wolves.