Optimization of the Virtual Scene Layout Based on the Optimal 3D Viewpoint

At present, visual and parametric design is widely used in architectural design, urban planning and landscape design as a source of design inspiration. Highly integrated auxiliary design software has become more prevalent and has greatly changed the traditional way of working in architectural design and urban planning through parameterised operations, convenient editing methods and efficient workflows. However, existing methods for layout optimisation have various problems, such as monotonous forms, a lack of aesthetic principles and difficulty adapting to the layout optimisation of complex architectural environments, and they usually hinder users’ understanding of hierarchical relationships in complex environments. This paper presents a novel 3D scene layout optimisation method that combines scene view information with an interactive genetic algorithm to optimise the virtual environment’s layout. We apply the visual perception information of 3D building models to improve the scene’s layout and integrate the evaluation of visually salient features with other factors to realise intelligent scene layout based on optimal 3D viewpoints and a genetic algorithm. The experimental results show that our method can efficiently optimise the layout of the 3D building environment and that it is novel and has practical application value in the fields of architectural visualisation, 3D animation, landscape planning and game design.


I. INTRODUCTION
A 3D scene can be construed as a digital reflection of the real world, especially in the fields of virtual reality, the metaverse and game design. The layout design and scene planning of a complex scene in these fields are tedious and complicated. The layout of buildings in real scenes follows specific aesthetic principles, and the real urban environment is usually composed of scene partitions with different functions, each with its own aesthetic characteristics.
After layout planning, the urban landscape group or commercial centre area often employs landmark buildings with more prominent shapes as the visual focus, which is used for organising and expanding the surrounding landscape. From the perspective of geometric information, these areas of visual interest are considerably different from unplanned areas in terms of building area, height, visual saliency and contour features (as shown in Figure 1). In the implementation of the genetic algorithm, the above elements can be used as the evolutionary goal to improve the layout of the virtual environment.
The associate editor coordinating the review of this manuscript and approving it for publication was Christian Pilato.
However, many studies have not focused on the visual information and aesthetic features of the scene. Numerous scenario layouts designed by parameterisation are unattractive, boring, and even confuse the user's line of sight in the virtual world.
In this paper, we propose a novel optimisation method for scene layout that combines the viewpoint feature information of 3D building models with an interactive genetic algorithm to achieve rapid scene layout and reconstruction. Our method is based on interactive genetic user evaluation combined with mesh visual quality to improve the scene's layout phenotype (as shown in Figure 1). At the same time, a neural network evaluation strategy is introduced to effectively reduce user fatigue. In a comparison with five typical layout optimisation methods, the experimental results show that our method is superior to a random layout, SGA, TIGA, SA and PSO in complex scene layout optimisation and achieves a more efficient virtual scene layout scheme that better matches designers' requirements while relying less on user evaluation.
As shown in Figure 1, the planning and layout of complex scenarios can be achieved relatively easily by using our approach. It is very difficult for artists to further adjust a complex 3D scene generated by manual layout. For example, adjusting the building density, changing the layout position of group units and fine-tuning the orientation of adjacent buildings all affect the overall layout form and require considerable work. These trivial tasks sometimes cost as much manpower and resources as rearranging the whole scene. It is evident that the traditional manual layout faces many problems, and thus, the proposed interactive layout method has practical application value.
Our method has been applied by some researchers in the School of Architecture, Hebei University of Engineering, to the planning layout of a new urban area of Handan and has been used in practice to plan some blocks in the middle willows community. In these actual projects, although the final solution still relies on the planning of professional designers and architects, our method played a supporting role in the early stage and improved work efficiency (as shown in Figure 2).

II. RELATED WORK
A. RESEARCH BASIS OF THE GENETIC ALGORITHM AND PARAMETRIC DESIGN
Architectural parametric design reflects the beauty of mathematics in a sense; that is, some natural laws contained in mathematical logic have been presented in a special form of space art; therefore, it usually has unique artistry. In the parametric design of a large scene, relevant research results of optimisation theory have usually been integrated into the process, and the creation core of such stages has usually been regarded as the process of mathematical logic modelling.
In the real world, buildings should produce an echoing harmony, so the implicit correlation between the visual information and the layout appears to be particularly important. In previous work, scenes were organised by visual guidance of landscape nodes or geometric extension lines (as shown in Figure 3).
With the development of architectural visualisation technology, digital means to guide landscape planning and architectural design have become mainstream. Parametric designers often combine parametric means with a scene's geographic information to reasonably allocate scene resources to achieve a more efficient way of working (as depicted in .
In summary, we were inspired by these existing works. We believe that the current urgent problem is combining the organisation of visual elements with the concept of scene planning, so we carried out the following research.

B. APPLICATION OF THE INTELLIGENT OPTIMISATION ALGORITHM IN PARAMETRIC DESIGN AND INDUSTRIAL DESIGN
In parametric design and scene optimisation problems, novel bionic optimisation algorithms are also emerging, which broaden the research ideas for optimisation problems. Zhao et al. proposed a novel artificial hummingbird algorithm (AHA) and applied this bioinspired optimiser to engineering applications [1]. In this algorithm, the flight behaviour and intelligent foraging strategy of hummingbirds were modelled. Guided foraging, territorial foraging and migratory foraging behaviours were also simulated to handle different optimisation tasks, and global optimisation was combined with the memory function of the food source to obtain the global optimal solution more efficiently. Compared with previous metaheuristic algorithms, the algorithm requires less computation and achieves higher solving precision. It performs well in comparative experiments against recognised benchmark algorithms and on 10 classical engineering cases and has been successfully applied in the design practice of hydropower station operating equipment.
Yuan et al. proposed a novel grey wolf optimiser algorithm based on elite opposition-based learning and a chaotic K-best gravitational search strategy [2], which has been applied to nonlinear optimisation problems in industrial design. The proposed method is based on elite opposition-based learning and a chaotic K-best gravitational search strategy (EOCS), and the chaotic K-best gravitational search strategy (CKGSS) is used to improve the global search ability. Experimental results show that this method has excellent robustness and accuracy in complex optimisation problems. In practical application, this method can effectively improve the global search ability and convergence speed, and its search accuracy and reliability have proved to be superior to those of similar algorithms, which makes it very suitable for highly nonlinear optimisation problems in practical engineering applications.
Yan et al. proposed an enhanced whale optimisation algorithm for global optimisation, which improved the efficiency of obtaining global optimal solutions [3]. The algorithm simulates the behaviour of humpback whales, such as searching for prey, encircling prey and bubble-net attacking. The Levy flight strategy and the sortium-based mutation operator were also introduced to expand the search scope and enhance the global search ability. This method can effectively avoid the common drawbacks of other whale optimisation algorithms, such as slow convergence and a tendency to fall into a local optimal solution.
Hu et al. proposed a new quadratic interpolation-inspired optimisation algorithm with wavelet mutation [4]. In this method, the quadratic interpolation strategy was adopted to improve the solution accuracy, and wavelet mutation was also introduced to improve the population diversity. This method can avoid falling into a local optimal solution and effectively improve the convergence speed.
In the optimisation problem of engineering design, some more novel swarm intelligence optimisation algorithms and bionic optimisation algorithms based on human competitive behaviour have been successively proposed. Yuan and Ren et al. proposed a novel alpine skiing optimisation algorithm (ASO) based on mathematical modelling of skiers' racing behaviour [5]. In this method, the static sliding and dynamic sliding behaviours were simulated by combining physical fitness and sprinting to solve engineering optimisation problems efficiently and robustly. This method performs well in specific comparative experiments, such as twenty-three unconstrained benchmark functions, some classical competitive optimisation algorithms and four constrained engineering problems. This method has been successfully applied to engineering design practice.
In the field of architectural planning, genetic algorithms are regarded as the most widely used optimisation method and have also been widely applied in the practical work of scene parametric layout design, such as building form optimisation based on genetic algorithms [6] and building thermal design and control optimisation based on multicriteria genetic algorithms [7]. Many researchers have contributed to this area and worked to refine the underlying theory. For example, Han et al. proposed a parallel layout, the MRHC method, based on a genetic algorithm to optimise landscape quality [8], and Wu et al. proposed a multiobjective genetic algorithm to optimise the conceptual design of high-rise buildings [9]. Han et al. proposed a method to obtain architectural view preferences based on CG images and genetic algorithms [10]. In the optimisation of architectural structure and parametric modelling design, many works based on genetic algorithms have also been carried out, such as the optimisation design of residential structures based on genetic algorithms by Tuhus-Dubrow et al. [11], the optimisation of architectural modelling based on free-form surfaces by using genetic algorithms by Jin et al. [12], and the 3D object layout constraint method based on genetic algorithms proposed by Stephan et al. [13].
However, the above methods and commercial software generally lack the ability to handle complex scenes and aesthetic interest, and they mainly focus on user experience and functionality. The resulting layouts are often boring, aesthetically unappealing or confusing for users roaming the scene, and these methods cannot carry out an intelligent layout according to visual aesthetic characteristics. Finally, these approaches often rely on subsequent manual processing.
Our method realises scene layout optimisation through an interactive genetic algorithm based on the optimal viewpoint and adopts visual "preference" as an important evaluation basis. Because the interactive genetic stage of our method relies on user evaluation, a deep learning method is also introduced to address user fatigue and estimation accuracy. We adopt a neural network to solve the user preference estimation problem and effectively improve the objectivity and accuracy of interactive evaluation. Experimental analysis shows that our method can effectively improve the scene layout and integrate the user's psychological preference into the optimised layout strategy to obtain better evaluation results.

III. 3D SCENE LAYOUT BASED ON VIEWPOINT QUALITY AND A GENETIC ALGORITHM
A. THE BASIC IDEA
In our work, optimal view theory is combined with an interactive genetic algorithm to achieve a 3D scene layout. The approach builds mainly on previous work and introduces research on optimal view theory into the design of the interactive genetic algorithm.
There are many calculation methods for obtaining optimal viewpoints, such as a mesh saliency calculation based on Gaussian weighted curvature [14], a mesh uniqueness analysis based on vertex diffusion distance [15], and a surface region of interest calculation [16]. In our past work, we also proposed a calculation method to calculate aesthetic superior viewpoints [17]. All of these methods are beneficial to our present research and would be applicable to our scene layout method.
In a large-scale 3D scene, architectural settlements and spatial organisation are complex, and it is difficult to comprehensively understand scene information by only relying on the traditional planar plan method, while the texture information of single building facades and the mesh visual quality are also important factors for determining the street landscape in a 3D environment (as seen in Figure 7). Therefore, in our method, we consider combining the building facade and the plane information. It is difficult to obtain good user evaluations by relying only on the texture mapping of building facades, so the mesh visual quality of the building models can be regarded as an important factor to improve the composition of street facades.
VOLUME 10, 2022
The architectural layouts of the 3D environment are similar to real-world layouts and need to draw on the basic principles of urban planning and architectural design in the real world. It is generally believed that there are four basic forms of planar combinations of architectural groups, namely, determinant, peripheral, point group and mixed patterns. Many complex scene structures have been formed by combining and arranging these basic layout forms according to certain rules.
Among them, the determinant is a common form: architectural objects are arranged in order and kept at a certain distance from one another, so that the spatial organisation looks regular but slightly monotonous. Residential buildings, military barracks, school dormitories, factories and other common scene layouts mostly adopt this form in the real world. The determinant has the advantage of keeping most buildings well oriented and forming a relatively regular layout, as shown in Figure 8.
The determinant layout easily becomes monotonous. It has a rigid visual feel and lacks aesthetic appeal, so it is often transformed into many "variant" forms in real scene design to enhance the aesthetic quality and interest of the scene layout. Parallel and staggered arrangements are commonly used for determinant layouts. Multiple determinants can be interspersed and spliced to form a richer layout, as illustrated in Figure 9.
A more complex layout can also be achieved by changing the orientation of the building group, but the "variation" of various forms should be mixed carefully; otherwise, it can easily cause visual "confusion." Figure 10 shows a more complex layout form developed by changing the spacing between buildings, splicing determinant units of different forms, and appropriately changing the orientation of the building group.
The peripheral type has often been shown as a closed or semiclosed courtyard space, which is conducive to dividing space and enhancing regional sense and can form a layout structure similar to a courtyard.
A peripheral layout readily gives people a sense of security and privacy, but many of its buildings are poorly oriented, and the environment is relatively closed. Therefore, it often suffers from poor lighting and is not conducive to air circulation. The peripheral layout in a 3D environment severely blocks the view, often causing users to feel lost in space (as shown in Figure 11).
The point-group layout is relatively free in form and is often represented as a single-courtyard, multistorey point-type or high-rise tower layout in the real architectural environment. The point group ensures good building orientation, but it uses space poorly and tends to produce a relatively scattered spatial experience. To make up for the shortcomings of the above common layouts, determinant, peripheral and point groups are usually combined to form a spatial organisation with "rhythm" and "rhyme."
For the 3D scene layout problem, if the designer does not have specialist knowledge of scenario planning, completing the layout of a complex scene appears to be an almost impossible task. Such work not only requires a lot of time for building placement but is also frequently interrupted when an adjustment or improvement triggers an awkward "domino effect". In other words, the efficiency is very low, and readjustment on the original basis is costly and difficult.
In our work, the role of mesh visual quality in the evolution of complex scene layouts is very important. First, we analyse the effect of mesh visual quality on the local layout formed by a small number of building groups and then extend the canonical form to more complex scenes.
Because the user's knowledge of the scene layout is limited and access to information is inadequate, we employ a method based on the "bounding box" to realise layout optimisation for small-scale building groups. We assume that the boundaries of the bounding boxes wrapping the building models are parallel to the x axis and the y axis of the layout space coordinates. The initial simple layout model based on bounding boxes is given by Formula (1), where Fit is the utilisation rate of the bounding box, $n$ denotes the number of buildings in the bounding box, $l_i$, $w_i$ and $h_i$ represent the length, width and height of the i-th building rectangle area included in the bounding box, respectively, and $L$, $W$ and $H$ represent the length, width and height of the bounding box, respectively. l and w are the reserved volumes of the rectangular area around the building. If $(x_i, y_i, z_i)$ and $(x_j, y_j, z_j)$ represent the centroid coordinates of the i-th and j-th objects in the bounding box, respectively, then the relationship for the i-th object is given by Formula (2), and the basic spatial constraint relation of the bounding box for the j-th object is given by Formula (3). Through these constraints, a specified number of buildings are incorporated into the bounding box, thus forming a building Group O.
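As a minimal sketch of the bounding-box model described above, the following Python snippet computes a utilisation rate and checks a pairwise separation condition. The exact forms of Formulas (1) to (3) are not reproduced here: the ratio of padded building volume to bounding-box volume, the footprint lying in the x-y plane, the names `Building`, `utilisation` and `separated`, and the margin parameters `dl` and `dw` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Building:
    # Centroid (x, y, z) and footprint length l, width w, height h.
    x: float
    y: float
    z: float
    l: float
    w: float
    h: float


def utilisation(buildings, L, W, H, dl=0.0, dw=0.0):
    """Assumed utilisation rate: padded building volume / box volume."""
    used = sum((b.l + dl) * (b.w + dw) * b.h for b in buildings)
    return used / (L * W * H)


def separated(a, b, dl=0.0, dw=0.0):
    """True if the padded, axis-aligned footprints of a and b do not overlap."""
    gap_x = abs(a.x - b.x) >= (a.l + b.l) / 2 + dl
    gap_y = abs(a.y - b.y) >= (a.w + b.w) / 2 + dw
    return gap_x or gap_y


# Two hypothetical buildings inside a 20 x 10 x 20 bounding box.
group = [Building(5, 5, 0, 4, 4, 10), Building(15, 5, 0, 6, 4, 20)]
fit = utilisation(group, L=20, W=10, H=20)
ok = separated(group[0], group[1], dl=1.0, dw=1.0)
```

A layout candidate would be rejected, or penalised, whenever `separated` is false for any pair of buildings in the group.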
In architectural Group O, the layout attribute of a single building can be denoted as $U(o)$. We define the following constraints. The boundary-crossing constraint, $E_{Insert}(U(o))$, captures the crossing information of the boundary between buildings and the layout area in a specific region. The overlap constraint, $E_{Overlap}(U(o))$, is used to calculate the overlap information between individual buildings in the layout area. The direction constraint, $E_{\theta}(U(o))$, is used to constrain the building orientation by penalising orientations that deviate from the positive z-axis direction. The traffic constraint around the building, $E_{Road}(U(o))$, is used to constrain the minimum path width around the building. The mesh visual quality constraint, $E_{S}(U(o))$, represents the visual quality of the object and its mesh saliency after normalisation.
This paper argues that the higher the mesh visual quality of architectural blocks, the higher the aesthetic quality of the street facade structure, and the more willing people are to explore it. The boundary-crossing constraint $E_{Insert}(U(o))$ and the overlap constraint $E_{Overlap}(U(o))$ can be combined into the position constraint, which is given by Formula (4). The optimisation problem is then defined over these constraint terms. We adopted the commonly used energy function method for the basic layout group to transform the optimisation problem of the objective function into the solution of the cost function, where $E_{max}$ denotes a designated constant that avoids negative results, as shown in Formula (6).
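The following Python sketch illustrates one way the constraint terms could be folded into a single non-negative cost value. It is not the paper's Formulas (4) to (6): the unweighted sum, the use of `1 - e_s` to turn the normalised visual quality into a penalty, and the function name `layout_cost` are assumptions.

```python
def layout_cost(e_insert, e_overlap, e_theta, e_road, e_s, e_max=100.0):
    """Assumed cost: E_max minus the total layout energy (larger is better)."""
    # Position constraint: boundary-crossing and overlap penalties combined.
    e_position = e_insert + e_overlap
    # e_s is a normalised quality in [0, 1], so (1 - e_s) acts as a penalty.
    energy = e_position + e_theta + e_road + (1.0 - e_s)
    # Subtracting from a constant keeps the value non-negative for small energies.
    return e_max - energy


perfect = layout_cost(0.0, 0.0, 0.0, 0.0, 1.0)
flawed = layout_cost(1.0, 2.0, 0.5, 0.5, 0.5)
```

Under these assumptions, a layout with no violations and full visual quality scores `e_max`, and every violation lowers the score.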
Multiple simple local layout templates can be generated by solving the cost function of the multigroup building model. Then, based on the layout templates of these local groups, an interactive genetic algorithm can be combined to realise the layout expansion of more complex scenes. The processing of this stage is relatively simple and can be optimised by a simulated annealing algorithm [18], [19], [20], [21] to obtain the preliminary layout template of the building cluster. The layout template in this stage presents a random layout and does not affect the results of the interactive genetic algorithm and other methods.
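A simulated annealing pass of the kind mentioned above can be sketched as follows. This is a toy example, not the implementation used in the paper: buildings are reduced to equal-width intervals on a line, the energy counts only pairwise overlap, and the cooling schedule, step count and names (`overlap_energy`, `anneal`) are assumptions.

```python
import math
import random


def overlap_energy(xs, width=2.0):
    """Toy energy: total pairwise overlap of equal-width buildings on a line."""
    total = 0.0
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            total += max(0.0, width - abs(xs[i] - xs[j]))
    return total


def anneal(xs, span=20.0, steps=2000, t0=1.0, cooling=0.995, seed=0):
    rng = random.Random(seed)
    current = list(xs)
    e_cur = overlap_energy(current)
    best, e_best = list(current), e_cur
    temp = t0
    for _ in range(steps):
        cand = list(current)
        k = rng.randrange(len(cand))
        # Perturb one building and clamp it to the layout span.
        cand[k] = min(span, max(0.0, cand[k] + rng.uniform(-1.0, 1.0)))
        e_new = overlap_energy(cand)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if e_new <= e_cur or rng.random() < math.exp((e_cur - e_new) / temp):
            current, e_cur = cand, e_new
            if e_cur < e_best:
                best, e_best = list(current), e_cur
        temp *= cooling
    return best, e_best


start = [10.0, 10.0, 10.0, 10.0]
layout, energy = anneal(start)
```

Because the best state seen so far is tracked separately, the returned energy is never worse than that of the starting layout.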
After obtaining the preliminarily generated layout template, we divide the scene space plane into different building groups and into an m × n matrix of unit blocks, whose elements are the genotypes of the genetic operation. The same building can occupy multiple blocks (as shown in Figure 12). The architectural units laid out on the "master slice" are traversed line by line and form the corresponding coding string. The size of the matrix and the length of the corresponding coding string can be set according to the actual situation, and the mesh visual quality score, building elevation texture score and height information of the corresponding regional units can also be encoded. After this stage, the information of the layout template should be saved. We take $A_i$ to represent the i-th chromosome in the k-th generation population $P_k$, and the gene element is marked as $a^i_{pq}$, where $p \in \{1, 2, \ldots, m\}$ and $q \in \{1, 2, \ldots, n\}$. Two matrices ($A_i$ and $A_j$, for example) are randomly selected from the population $P_k$, and the crossover operation is performed according to the given probability $\phi$. Then, matrix chromosomes are randomly selected from $P_k$ for the single-point mutation operation.
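The row-by-row encoding described above can be illustrated with a short Python sketch. The integer block labels, the use of 0 for an empty block and the helper names `encode` and `decode` are assumptions; the paper's encoding may also carry visual quality, texture and height scores per block.

```python
def encode(grid):
    """Traverse an m x n grid of block labels row by row into a flat string."""
    return [cell for row in grid for cell in row]


def decode(chromosome, m, n):
    """Rebuild the m x n matrix from the flat coding string."""
    if len(chromosome) != m * n:
        raise ValueError("chromosome length must equal m * n")
    return [chromosome[r * n:(r + 1) * n] for r in range(m)]


# 0 marks an empty block; a building id may occupy several blocks.
grid = [
    [1, 1, 0, 2],
    [1, 1, 0, 2],
    [0, 0, 3, 3],
]
chromosome = encode(grid)
restored = decode(chromosome, 3, 4)
```

Keeping the matrix dimensions alongside the coding string makes the mapping reversible, which matters when the layout template has to be saved and restored.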
In the optimisation stage of the interactive genetic algorithm, the evolutionary individual is denoted as $x$. The i-th gene meaning unit of its phenotype is denoted as $X_i$, and the corresponding $j_k$-th allele meaning unit is denoted as $X_i^{j_k}$, so the user's preference for $X_i^{j_k}$ can be expressed as $p(X_i^{j_k})$. After stable evolution, the adaptive value of $x$ before the end of the global search is expressed in terms of $p(X_i^{j_k})$ and $\tau(X_i^{j_k})$, where $p(X_i^{j_k})$ represents the user's preference for some phenotypes, and $\tau(X_i^{j_k}) \in (0, 1]$ denotes the normalised mesh visual quality measure of the object $O$ corresponding to this allele segment $X_i^{j_k}$. This calculation process is based on our previous work, and $\xi(\cdot)$ denotes the normalisation operation used in it.
To obtain the user's preference, the distribution of the meaning units of satisfied alleles in the user-evaluated individuals should be counted and measured by the difference between them and the mean of the adaptation value of the user-evaluated individuals. The dataset for which the user has completed the evaluation is denoted as $D_u$, and (x, y) represents the operation of seeking the same. In the resulting expression for the user preference, $x_i(t)$ denotes an individual evaluated by the user in the t-th generation, where t has at least reached the stable generation, and $|D_u|$ is the size of the evaluated population.
To obtain the objective distribution of users' evaluations of $X_i^{j_k}$, the proportion of $X_i^{j_k}$ in the evaluated population $D_u$ can be counted. If the length of the chromosomes is $s$ and R encoding has been adopted, a discriminant can be introduced to compare the average ratio of $X_i^{j_k}$ with the average ratio of meaning units selected for general alleles. When the proportion of $X_i^{j_k}$ being selected is greater than that of the general allele meaning unit, $p(X_i^{j_k})$ is used to calculate the adaptation value.
$\tau_a$ represents the mean value of the normalised mesh visual quality of the objects and serves as the measure of the visual quality of the meaning unit of the general alleles.
$X_i^{j_k}$ is taken as the user preference gene meaning unit according to the given probability (0.25 and 0.4 according to the scale of different scenes in this paper), and $p(X_i^{j_k})$ is used to calculate the adaptation value. In other cases, the user preference is $p(X_i^{j_k}) = 0$. For the acquisition of meaning units of unknown alleles, our work refers to the estimation method used in TIGA to address this problem [22], [23]. $T_{st}$ is the evolutionary generation of the stable period. If $X_i^{j_k}$ is still an unknown allele, the mean of the upper and lower limits of user evaluation can be taken as the estimated value of the unknown allele unit to obtain the individual adaptation estimate given by Formula (10), where $\overline{F}(x_i(t))$ denotes the upper limit of the user fitness evaluation and $\underline{F}(x_i(t))$ is the lower limit.
The multiple approximation model method [22] has been adopted to divide the search space into $M_n$ intervals. In the approximation model dataset of user-evaluated individuals, $K$ is the number of evolving individuals and $F(x_i)$ is the adaptive value of the evolving individual $x_i$. We need to search for the specific value of $X_i^{j_k}$ that maximises $F(X_i^{j_k})$.
As evolution continues, phenotypes among individuals in the global search stage gradually converge, and a more accurate search cannot be obtained. Therefore, the global search must be terminated at an appropriate evolutionary generation, which is determined by calculating the difference between individuals through $\rho(x(t))$, where $\delta_{\sigma}$ denotes the threshold set according to the actual situation. If the evolutionary generation meets $\rho(x(t)) \le 0$, the individuals of the population tend to be assimilated; the global search is then terminated, and a local search is carried out.
The evolutionary generation that meets this condition is taken as the global search termination generation $t_{Global}$, together with the corresponding global optimal individual. In our work, other researchers' methods are adopted for stratification [17+5]. For all $x_i \in M$, the relationship in Formula (13) holds. If $S_{min}$ denotes the lower limit of the dataset required to generate the approximate model $\hat{f}_i(\cdot)$ in a particular subspace, then when $\eta_i > S_{min}$, the dataset $S_i = \{(x_k, f(x_k)) \mid x_k \in M_i, k \in \{1, 2, \ldots, K\}\}$ can be obtained, and the approximate model $\hat{f}_i(\cdot)$ of the subspace is generated from $M_i$. Sims believes that the data need to be accumulated through user evaluation until the lower limit of the dataset meets the condition in the later stage of evolution [25]. Takagi believes that adopting a hybrid strategy with multiple models is effective for solving this problem [26].
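The midpoint estimate for unknown alleles and the termination test for the global search described above can be sketched in Python as follows. Only the midpoint rule is stated in the text; measuring the difference between individuals by the population standard deviation of their fitness values, and the names `estimate_unknown` and `should_stop_global`, are assumptions.

```python
import statistics


def estimate_unknown(upper, lower):
    """Midpoint of the user's upper and lower evaluation bounds."""
    return (upper + lower) / 2.0


def should_stop_global(fitness_values, delta_sigma):
    """Assumed criterion: stop when the fitness spread falls to the threshold."""
    # rho <= 0 means the individuals have become nearly indistinguishable.
    rho = statistics.pstdev(fitness_values) - delta_sigma
    return rho <= 0


estimate = estimate_unknown(0.8, 0.4)
converged = should_stop_global([0.5, 0.5, 0.5], delta_sigma=0.01)
diverse = should_stop_global([0.1, 0.9], delta_sigma=0.01)
```

Once the test returns true, the search would switch from the global stage to the local stage.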
However, other effective methods can be adopted to segment the search space according to different situations in practical applications. For the situation where the adaptive values of population individuals are scattered and the data volume is large, another method can be adopted to segment the evolutionary space [23]. By using the decision variable $x_i(t)$ and the correlation coefficient method, we calculate the dispersion of the individual adaptive values $f(x_i(t))$ and $x_i(t)$ of the t-th generation to obtain the segmentation of the space, which is shown in Formulas (15) and (16) and their associated relationships. The correlation between the decision variables and the individual adaptation values of $x_i(t)$ can be obtained by Formula (19). By obtaining the correlation coefficients between multiple decision variables and adaptive values, the search space is divided by the variable that maximises the mean value $\bar{\kappa}(f(x_i(t)), x_i)$, and the search space is segmented by the symmetric Latin square sampling method. The purpose of this step is to split the search space into reasonable subspaces to improve efficiency.
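The idea of choosing the split variable by its correlation with fitness can be illustrated with the Python sketch below. It is a simplification under stated assumptions: the Pearson coefficient stands in for the paper's correlation measure, the absolute value is used for ranking, and the Latin square sampling step is omitted. The names `pearson` and `split_variable` and the sample data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def split_variable(population, fitness):
    """Index of the decision variable most correlated with fitness."""
    dims = len(population[0])
    scores = [abs(pearson([ind[d] for ind in population], fitness))
              for d in range(dims)]
    return max(range(dims), key=scores.__getitem__)


# Four hypothetical individuals with two decision variables each.
population = [(0.1, 0.9), (0.4, 0.2), (0.6, 0.7), (0.9, 0.3)]
fitness = [0.2, 0.5, 0.6, 0.95]
best_dim = split_variable(population, fitness)
```

Here the first variable tracks fitness almost linearly, so it is the one selected for dividing the search space.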
In the later stage of evolution, adaptive methods can be used to segment the search space by determining the approximation accuracy. After segmenting the search space, we obtain the comprehensive approximation model by combining the approximate models of multiple subspaces. We calculate the distance between the evaluated individuals and the subspace. The minimum distance is denoted as $H_{min}(x_k, M_n)$, and the average value is denoted as $H(x_k, M_n)$; their relationship is given by Formula (20). If $\hat{f}_n(\cdot)$ denotes the approximate model on the subspace $M_n$ of the neighbourhood around $x_k$, then the adaptive value of $x_k$ can be obtained from $H_{min}(x_k, M_n)$, as given by Formula (21), where $\hat{f}_n(x_k)$ denotes the adaptive value calculated by the approximate model $\hat{f}_n(\cdot)$ in the neighbourhood subspace $M_n$.
The adaptive value of user evaluation might contain errors due to user fatigue or mood swings, so many researchers have adopted various improved evaluation strategies [22], [23], [24], [25], [26]. In our work, a neural network method is introduced to solve the problem of user fatigue. To increase the network's memory capacity and effectively extract user preferences, we use the ART1 network as a proxy for user evaluation, which is described in the following section.
The n-th allele unit corresponding to candidate $x$ is denoted as $U_n^{m_t}$, and $F_{eva}(\cdot)$ represents the evaluation function of the layout scheme corresponding to the object. $U_n^{m_t}$ denotes the energy function corresponding to the layout scheme that corresponds to the adaptive value, which has been split into polynomial factors; thus, the evaluation function of the layout scheme is established in Formula (22) as follows:
$$F_{eva}(x) = \omega_1 R(x) + \omega_2 M(x) + \omega_3 C(x) + \omega_4 H(x) \tag{22}$$
where $x$ is the selected layout scheme, and $R(x)$, $M(x)$, $C(x)$ and $H(x)$ are the evaluation functions of space utilisation, mesh visual quality, building facade texture and height of the scheme, respectively. Therefore, the optimisation scheme should not only meet the optimal solution in the adaptation value calculation but should also maximise $F_{eva}(x)$. $\omega_1$, $\omega_2$, $\omega_3$ and $\omega_4$ are the weights of each measurement; their values should lie in the range [0, 1], and $\sum_{i=1}^{n} \omega_i = 1$. Building facade texture can be classified and coded according to tone, symmetry and user preference. In our work, the influence weight of the building facade texture is set to a small value because relevant literature and user surveys found that the influence of the facade texture on the scene is always limited [35]. The modelling details and the visual salience of the mesh have a greater impact on scene visual information. The user's interest in the texture information on a "flat" mesh has been shown to be generally lower than that in modelling details, such as "concave and convex" features.
Holland et al. argued that the "optimal retention" strategy could be adopted to optimise the evolution of individuals [22], [29]. If the evolutionary individuals $x$ of the t-th generation need to be evaluated, we denote the optimal individual of the $t_m$-th generation as $x_b(t_m)$ with $T_{st} \le t_m < t$.
The adaptive value is denoted as $F(x_b(t_m))$, and the adaptive value of the optimal scheme $x_e(k_i)$ of the $k_i$-th generation is denoted as $F(x_e(k_i))$. The population begins to evolve at generation $K_s$ and ends at generation $K_e$.
In this stage, we set δ(U m t n , $k_i$) to control the composition of $x_e(k_i)$; the relation is given in Formula (23). In the training process, we take the corresponding binary form as the input pattern and obtain the corresponding value according to the user's evaluation. As the user evaluation stage is subject to many restrictions and users are prone to fatigue, the initial population should not be too large. The genetic operators are tournament selection, two-point crossover and single-point mutation. During initialisation, users can select appropriate crossover and mutation probabilities according to the actual situation because the distribution of 3D objects varies greatly between virtual scenes. Our method adjusts adaptively to different evolutionary goals, which allows different evolutionary strategies to be compared.
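The three operators named above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a layout scheme is encoded as a list of bits, and the function names, the tournament size `k` and the default probabilities (taken from the values reported later in the experiments) are illustrative choices.

```python
import random


def tournament_select(population, fitness, k=2, rng=random):
    """Return the fittest of k individuals drawn at random (with replacement)."""
    contenders = [rng.randrange(len(population)) for _ in range(k)]
    best = max(contenders, key=lambda i: fitness[i])
    return population[best]


def two_point_crossover(a, b, p_cross=0.75, rng=random):
    """Swap the segment between two cut points with probability p_cross."""
    if len(a) < 3 or rng.random() >= p_cross:
        return a[:], b[:]
    i, j = sorted(rng.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]


def single_point_mutation(genome, p_mut=0.04, rng=random):
    """Flip one randomly chosen bit with probability p_mut."""
    child = genome[:]
    if rng.random() < p_mut:
        pos = rng.randrange(len(child))
        child[pos] = 1 - child[pos]
    return child
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when comparing evolutionary strategies.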
We test different weights $\omega$ and take the space utilisation ratio $R(x)$, mesh visual quality evaluation $M(x)$, facade texture $C(x)$ and height $H(x)$ as factors of the evaluation function $F_{eva}(x)$. The layouts are scored by 16 subjects to analyse the importance of the different attributes to scene layout optimisation. The experimental analysis in the next section shows that our method performs better than the other five common layout methods and that using the mesh visual quality assessment $M(x)$ as the optimisation goal yields the best results.
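Formula (22) is not reproduced in this text. A form consistent with the description (four factors, weights in [0, 1] that sum to 1) would be the weighted sum F_eva(x) = ω1·R(x) + ω2·M(x) + ω3·C(x) + ω4·H(x). The sketch below assumes that form and assumes each factor score is already normalised to [0, 1]; the function name and the default weights are hypothetical, with the texture weight kept small as the text suggests.

```python
def layout_score(r, m, c, h, weights=(0.25, 0.4, 0.1, 0.25)):
    """Weighted sum of the four factor scores (assumed form of F_eva).

    r, m, c, h: space utilisation, mesh visual quality, facade texture
    and height scores, each assumed to lie in [0, 1].
    """
    if len(weights) != 4 or any(w < 0 or w > 1 for w in weights):
        raise ValueError("need four weights in [0, 1]")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    w1, w2, w3, w4 = weights
    return w1 * r + w2 * m + w3 * c + w4 * h
```

Testing a different emphasis then amounts to passing another weight tuple, for example one that puts most of the weight on mesh visual quality.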

B. EVALUATION OF INDIVIDUAL FITNESS BASED ON NEURAL NETWORKS
The adaptive resonance theory (ART) network can simulate the cognitive and behavioural characteristics of the human brain and offers scalability and fast memorisation [30]. If the neural network cannot accurately approximate the user's evaluation, it misleads the direction of the genetic operation, so the selection of the training dataset has a great influence on the performance of the network.
In this section, we adopt the ART1 network to approximate user evaluation, and the SLHD method is adopted to obtain the training data for the proxy model [23].
When user evaluation tends to be stable, we denote the learning sample as $T_u = \{(x_i, f(x_i)),\ i \in \{1, 2, \ldots, N\}\}$, where $f(x_i)$ is the adaptive value of $x_i$ selected by the user, and $N$ denotes the number of typical samples. $S_{\min}$ is the lower limit of the number of learning samples for the neural network. When $N \ge S_{\min}$, the ART1 network can obtain an approximate model $\hat{f}(x_i)$ close to $f(x_i)$ through learning.
In our work, the comparison sample is set as $\{(x_i, \hat{f}(x_i)),\ i \in \{1, 2, \ldots, N_c\}\}$, where $x_i$ denotes the input of the ART1 neural network, $\hat{f}(x_i)$ is the adaptive value of $x_i$ output by the neural network, and $N_c$ denotes the number of test samples. An error test can be adopted to determine whether the ART1 neural network approximates user evaluation closely enough [22], as given in Formula (24). We set $\sigma$ to determine whether the learnability of the neural network meets the accuracy requirement. Next, the ART1 network learns the characteristics of the adaptive values selected by the user by performing the following algorithm:

[Algorithm listing not recoverable from the source text; only its closing lines, CONTINUE and end for, survive.]
As shown above, $f(x_i(t))$ is the adaptive value of the $t$-th generation evolutionary individual, which we associate with the corresponding network input. We set the vigilance (alert) threshold $\rho \in (0, 1]$ and the learning rate $\beta \in (0, 1]$, and the weights of the neural network are updated as $V^{1,2}$ and $V^{2,1}$.
At layer R, ART1 computes the inner product $V^{2,1}_m x_n$, obtains the clustering pattern $s$ corresponding to the maximum inner product, and performs the vigilance test by computing $V^{1,2}_s x_n / (x_n^{\mathrm{T}} x_n)$. The ART1 network then determines whether the result is greater than the vigilance threshold.
If $V^{1,2}_s x_n / (x_n^{\mathrm{T}} x_n) > \rho$, the network resonates, classifies the new pattern into clustering pattern $s$, and then updates the feedforward and feedback weights of clustering pattern $s$. If all existing neurons fail the test, the network creates a new neuron to represent a new category pattern.
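The choice, vigilance test and weight update described above can be illustrated with a minimal textbook-style ART1 clusterer for binary patterns. It is a sketch under stated assumptions rather than the authors' network: it uses fast learning (equivalent to a learning rate of 1), accepts a match when the ratio is at least ρ rather than strictly greater, and the class name, attribute names and the constant `L` are illustrative.

```python
class ART1:
    """Minimal ART1 clusterer for binary vectors (fast learning assumed)."""

    def __init__(self, rho=0.7, L=2.0):
        self.rho = rho        # vigilance threshold in (0, 1]
        self.L = L            # feedforward scaling constant, must exceed 1
        self.top_down = []    # feedback weights: one binary prototype per category
        self.bottom_up = []   # feedforward weights: normalised prototypes

    def train(self, x):
        """Present one binary pattern and return the index of its category."""
        norm_x = sum(x)
        if norm_x == 0:
            raise ValueError("pattern must contain at least one 1")
        # Rank categories by the feedforward inner product, largest first.
        scores = [sum(b * xi for b, xi in zip(w, x)) for w in self.bottom_up]
        order = sorted(range(len(scores)), key=lambda j: -scores[j])
        for s in order:
            overlap = [t & xi for t, xi in zip(self.top_down[s], x)]
            # Vigilance test: for binary vectors this equals (w_s . x) / (x . x).
            if sum(overlap) / norm_x >= self.rho:
                self._store(s, overlap)  # resonance: update both weight sets
                return s
        # Every existing category failed the test: recruit a new neuron.
        self.top_down.append(None)
        self.bottom_up.append(None)
        self._store(len(self.top_down) - 1, list(x))
        return len(self.top_down) - 1

    def _store(self, s, proto):
        """Write the prototype and its normalised feedforward copy."""
        self.top_down[s] = proto
        denom = self.L - 1.0 + sum(proto)
        self.bottom_up[s] = [self.L * p / denom for p in proto]
```

A higher `rho` produces more, finer categories; a lower one merges more patterns into the same category.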
Through the training of the ART1 network, users' preference for the genotype combination form has been remembered by the ART1 network, and user evaluation is approximated through the learning process.
In our work, the optimal adaptive solution evaluated by users is denoted as $f(x_i)$, and the accuracy error of the adaptive solution $\hat{f}(x_i)$ output by ART1 is given in Formula (25). The threshold of the accuracy error is denoted as $\xi$. If $\xi(f(x_i), \hat{f}(x_i)) \le \xi$, the learning accuracy of the neural network is close to user evaluation, and the network can replace manual evaluation for further population evolution. The choice of threshold depends on the actual problem. By setting different thresholds, the neural network learning process is layered so that interactive evolution is divided into different periods that are evaluated separately.
In general, the evolution process can be divided into two periods, which can be achieved by setting thresholds for two levels of accuracy.
To improve the learning efficiency of the neural network, we first set a large threshold value $\xi_1$. Under a large threshold, the ART1 network only needs to capture relatively obvious differences between the adaptive values, so it can quickly learn the samples at a coarse level.
Because the evolving individuals differ markedly and are widely dispersed, this process achieves low-precision clustering in a short amount of time. Once this stage has been completed, the network can no longer classify the adaptive values effectively, so the neural network should be trained again for the second stage of evolution.
At this time, users' evaluations are still needed, and they can be reused as samples for training the ART1 network. According to the degree of difference between individuals, a smaller threshold $\xi_2$ can be set, and a higher-precision approximation model can be obtained through neural network learning.
According to the population size and the actual demand, the whole evolutionary process can be divided into more stages, but in principle there should not be too many; 2-3 evolutionary stages are appropriate [22], [23]. The final goal of the process is to obtain an approximate model that can replace the user for intelligent evaluation, and the result depends on the accuracy error between user evaluation and the neural network output.
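The two-level threshold idea can be sketched as a simple check of how closely the network output tracks the user's values. Formula (25) is not reproduced in this text, so the mean absolute error used here is a stand-in, and the function names and the threshold values for ξ1 and ξ2 are hypothetical.

```python
def mean_abs_error(user_values, net_values):
    """Average absolute gap between user and network adaptive values."""
    if not user_values or len(user_values) != len(net_values):
        raise ValueError("need two non-empty lists of equal length")
    gaps = [abs(u - v) for u, v in zip(user_values, net_values)]
    return sum(gaps) / len(gaps)


def surrogate_stage(user_values, net_values, xi_coarse=0.2, xi_fine=0.05):
    """Return which accuracy level the network currently meets.

    0: neither threshold met, keep asking the user
    1: coarse threshold (xi_1) met, usable for the first period
    2: fine threshold (xi_2) met, usable for the second period
    """
    err = mean_abs_error(user_values, net_values)
    if err <= xi_fine:
        return 2
    if err <= xi_coarse:
        return 1
    return 0
```

With more than two periods, the same check generalises to a sorted list of thresholds.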
Our method is based on the principle of an interactive genetic algorithm, so user preference plays an important role in the evolution process. Different subjects steer the evolution in different directions, depending on their aesthetic taste and personal preference. Figure 13 shows small building groups at a scale of 10 buildings obtained with our method by three subjects with different academic backgrounds. Although the same individual building models were used, the scene layout schemes generated on the master slice differ noticeably. Figure 13(a) shows the results obtained by a subject with a professional art background, while Figures 13(b) and 13(c) show the designs of subjects without a professional art background. Although the layout schemes differ, even non-professionals can obtain a reasonable scene layout with our method (as shown in Figure 14), whereas a purely manual layout by non-professionals gives a very poor result. Therefore, our method can substantially improve efficiency and design quality for those without a professional background in art design and planning.
For professional subjects, the questionnaire confirms that the generated layout considerably improves production efficiency. In practical applications (such as 3D game scene design and parameterised construction planning), however, the design still needs to be fine-tuned and improved using relevant professional knowledge or according to actual demand. Overall, our method improves the efficiency and quality of both professional and non-professional workers in 3D scene layout.
We encode each building's location, orientation, adjacency relations, grading and the visual quality scores of its mesh and facade texture to facilitate analysis of the implicit effect of viewpoint quality on the scene layout. We randomly select 10 building models to compose the building layout group and adopt a universal evaluation methodology to analyse the performance of the different methods. At the same time, we provide 19 typical types of buildings for the subjects to choose from.
The 19 types of buildings are as follows: Palace style, Classical residence, Pagoda style, Gothic style, Byzantine style, Rococo style, Baroque style, Commercial buildings, Steampunk style, Modern residential buildings, Pavilion, Gallery, Gymnasium, Museum, Villa, Islamic style architecture, Japanese style, Buddhist temple and Modernism style. Road network design is not carried out during the layout operation because road planning is complicated and depends on practical problems.
The experiment is mainly based on a rectangular "master slice" planning process. The aesthetic characteristics of a single building are particularly important in a building group but have very little influence in a larger scene. Therefore, a larger scene can be decomposed into small building groups. A layout optimisation strategy that places a "bounding box" and sets layout constraints such as area, orientation and energy function according to actual requirements is relatively simple to implement.

A. PERFORMANCE EVALUATION OF ALGORITHMS
Commonly used evolutionary strategies in the field of architectural visualisation and layout optimisation include proportional selection of adaptive values, multipoint crossover and single-point mutation [6], [7], [8], [9], [10], [11], [12], [13]. A number of researchers suggest that the crossover probability should be set within [0.6, 0.95], while the optimal range of the mutation probability is [0.001, 0.1].
In the experiment, tournament selection is adopted because it is more suitable for interactive genetic operation than proportional selection, random traversal and roulette; two-point crossover and single-point mutation are also adopted. At initialisation, the crossover probability is set to 0.75, the mutation probability to 0.04, the termination generation to 200, and the population size to 10. The number of participating users varies according to the evolutionary goal. When users become tired, evaluation of the adaptive value is handed over to the ART1 network.
To compare the performance of layout generation, we set up a series of comparative experiments. For simplicity and ease of distinction, we denote our method as the VP-based method. In comparative Experiment 1, we compare our method (VP-based method) with TIGA and SGA. The three methods are each tested 10 times, and the numbers of evolutionary generations are compared. The generation counts of TIGA and SGA are 63.8 and 89.4, respectively, and that of our method is 46.5; therefore, our method can effectively reduce the number of generations required for the population to evolve. The experimental results confirm that our method can effectively exploit the user agent method for evaluation and reduce user fatigue, as shown in Figure 15.
In the 10 experimental sets, the average number of individuals that users need to evaluate with our method is approximately 195.8, while the average number of evolutionary individuals searched in the evolution process is approximately 321.4. The average number of individuals evaluated by the TIGA method is approximately 283.5, and the average number of individuals it searches is roughly the same as the number it evaluates. The number of individuals searched by our method is approximately 1.13 times that of TIGA, but the number of individuals evaluated is only 0.69 times that of TIGA, as displayed in Figure 16.
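The two ratios can be checked directly from the reported averages, assuming that TIGA searches the same 283.5 individuals it evaluates (the text states only that the two numbers are roughly consistent):

```latex
\[
\frac{321.4}{283.5} \approx 1.13,
\qquad
\frac{195.8}{283.5} \approx 0.69
\]
```

So the search covers about 13% more individuals while users evaluate about 31% fewer.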
We use Gong et al.'s satisfaction calculation method to calculate the satisfaction of TIGA and our method and adopt the upper limit mean of the interval of users' evaluation of the optimal solution for estimation [22], [23]. Satisfaction with SGA is calculated by the satisfaction definition in a simple genetic algorithm [23]. Figure 17 depicts the comparison of the satisfaction degree of the optimal solutions of the three methods in the 10 experiments. The results show that the average satisfaction degree of the optimal solutions obtained by our method is the highest.

B. USER SATISFACTION AND AESTHETIC EVALUATION
The reliability of the interactive genetic method is low in the early stage of population evolution because users evaluate the layout of complex scenes blindly and randomly at that stage. As users gradually become familiar with the scene content, their cognition of the scene improves, and so does the credibility of their evaluation. In comparative Experiment 2, our method adopts an evolutionary strategy that combines mesh visual quality with user preference to effectively eliminate these drawbacks.
The contrasting experiments recruited 6 postgraduate students and 10 undergraduate students studying landscape design as subjects to evaluate the scene layout results. All the subjects were familiar with Galapagos and Rhinoceros 3D landscape layout plug-ins based on a genetic algorithm and had professional background in landscape planning and aesthetic evaluation.
In the experiment, we select the 3D scene layout schemes generated by the following six methods and acquire 5 groups of schemes from each: random layout, simple genetic algorithm (SGA), traditional interactive genetic algorithm (TIGA), simulated annealing algorithm (SA), particle swarm optimisation (PSO) and our method (VP-based method).
In the experiment, the 16 subjects with a professional background were ranked in descending order of professional ability, so subjects with higher rankings had stronger professional ability. The subjects then evaluated the various aesthetic attributes of the optimal solutions of the different layout methods. The evaluation covers the spatial organisation form of the layout (as seen in Figure 18), the aesthetic evaluation of the street facade (as seen in Figure 19), and the user experience of roaming in the generated scene (as seen in Figure 20).
Analysis of the subjects' scores showed that the evaluation of the various aesthetic attributes of the layouts was subjective. The subjects' evaluation curves did not show an upward trend with increasing professional ability (individual differences in aesthetic evaluation were more pronounced), but for the same subject, the scores given to the different types of methods were stable. The scores therefore objectively reflect the different aesthetic qualities obtained by the six layout methods. Figure 18 displays the subjects' satisfaction with the layout plane organisation obtained by the different methods. Figure 19 shows the evaluation of the aesthetic quality of the street scenes presented by street facades. Our method not only generates a layout plane with high satisfaction but also obtains a high evaluation of the combination of street facades. Figure 20 depicts the user experience of the subjects during virtual tours in the scene schemes generated by the different methods. This result is attributed to the optimisation and evolution of the mesh visual quality of the 3D buildings.
After analysing the comparative experiments in which subjects with professional backgrounds evaluated the layout schemes generated by the different methods, we set up comparative Experiment 3 to analyse how non-specialists (ordinary subjects) score the various layout optimisation methods. Ordinary subjects gave aesthetic scores to the optimal solutions obtained by the same six methods (random layout, SGA, TIGA, SA, PSO and VP-based method). The subjects were all non-art professionals, the scene scale was set to 6, 15, 30, 70 and 120 buildings, and 5 types of schemes were selected from each method for the comparative experiments.
In the experiment, 37 ordinary subjects without professional backgrounds were surveyed by questionnaire. They evaluated the layouts by viewing the scene plan and perspective and then scored them. We adopted Cronbach's alpha to analyse the reliability of the questionnaire. Scores were given on a 10-point scale with one decimal place and then normalised to obtain a mean score in (0, 10]. The analysis (as shown in Figure 21) shows that our method obtains good scores at all scene scales, while the random layout method always has the lowest scores. The score of the TIGA layout was relatively stable and comparatively high (possibly because TIGA reflects the preference of the subjects), but it was generally lower than that of our method. The score of the PSO method is lower than that of the TIGA method and fluctuates with the scene scale. The SGA and SA methods fluctuate strongly, and their scores are considerably lower than those of the TIGA method, the PSO method and our method.
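For reference, Cronbach's alpha for a questionnaire with k items is α = k/(k-1) · (1 - Σ item variances / variance of the total score). The sketch below computes it from a table of scores; the function name and the data layout (one row per respondent, one column per item) are assumptions, and the paper does not report how its value was computed.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha; scores[r][i] is respondent r's score for item i."""
    if len(scores) < 2 or len(scores[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Values close to 1 indicate that the items are scored consistently across respondents.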
It can be observed that with the increasing scene scale, the scoring performance of other methods except ours shows a downward trend. This result indicates that the building cluster formed by these methods performs poorly after expanding into large scenes and should only be suitable for layout optimisation of small scenes or local areas (as shown in Figure 21).
On the basis of the scene layout scheme obtained by our method, subjects are also permitted to fine-tune the location and orientation of some buildings in the scene, which easily yields a better result. Without our method, subjects may spend almost five times as long manually adjusting a layout that starts from a random arrangement of buildings.
It is worth noting that users who rely on purely manual methods to lay out a virtual scene usually need to readjust the overall layout 3-5 times or more. This may be attributed to an insufficient understanding of the complexity and diversity of the scene information, because local adjustments can easily destroy the integrity of the scene layout. The result is a vicious cycle of "overturn-reconstruction". Such manual approaches make 3D environment layout planning very inconvenient. It can therefore be expected that intelligent scene layout methods will become the trend in virtual environment design.

C. ANALYSIS AND SUMMARY OF EXPERIMENTAL RESULTS
We set up three groups of comparative experiments and compare our method with five classical methods: the random layout method, SGA (simple genetic algorithm), TIGA (traditional interactive genetic algorithm), SA (simulated annealing algorithm), and PSO (particle swarm optimisation). From the above comparative analysis, it can be concluded that our method avoids the shortcomings of the other methods. Compared with other recognised algorithms, our method can expand the search scope, prevent the algorithm from falling into a local optimum, and enhance the global search ability. It converges faster, finds the global optimal solution more easily, and is more competitive in accuracy and robustness.
The results of Experiment 1 show that our method requires fewer evolutionary generations than the TIGA and SGA methods and can effectively reduce user fatigue. The number of individuals searched by our method is approximately 1.13 times that of the TIGA method, but the number of evaluated individuals is only 0.69 times that of the TIGA method, which effectively expands the search scope and reduces the number of evaluated individuals. For satisfaction, we adopt the calculation method proposed by Gong and Sun et al. to calculate the satisfaction of SGA, TIGA and our method [23], and the results show that the optimised solutions obtained by our method have the highest average satisfaction.
In comparative Experiment 2, our method is compared with the random layout method, SGA, TIGA, SA and PSO in terms of professional subjects' satisfaction with the layout plane organisation form, the scene roaming user experience and the aesthetic quality of the street view facade. The results of Experiment 2 show that, because the mesh visual quality of building objects is effectively used to optimise the layout quality, our method not only generates a layout plane with high satisfaction but also obtains a higher evaluation of the combination of street facades than the other five classical methods.
In comparative Experiment 3, we use the same six methods (random layout, SGA, TIGA, SA, PSO and VP-based method), collect aesthetic scores from ordinary subjects and adopt Cronbach's alpha to analyse the reliability. The results of Experiment 3 show that our method obtains better scores at different scene scales and performs considerably better than the other five methods. The random layout method always has the lowest scores, and the score of the TIGA method is stable and relatively high (possibly because the TIGA method reflects the preference of the subjects), but it is generally lower than that of our method. The scores of the PSO method are lower than those of the TIGA method and our method and fluctuate with the scene scale. The SGA and SA methods fluctuate strongly, and their scores are substantially lower than those of the TIGA method, the PSO method and our method.
The results of Experiment 3 also show that, as the scene scale increases, the scores of the other five methods trend downwards. This indicates that the building groups formed by these methods perform poorly when extended to large scenes and are only suitable for layout optimisation of small scenes or local areas. Experiment 3 confirms that our method is suitable for larger and more complex scenarios than the other methods. The advantage of our method is that the larger the scale of the scene, the better its performance and the stronger its robustness and stability.
Our approach has also been applied to real project cases. We apply our method to the rapid generation of urban layouts, with high efficiency and good visual effect. As seen in Figure 22, we built 3D models of landmark buildings in the real scene and placed them in specific positions according to the real scene content. Then we imported a large number of 3D building models from the building library. With our method, the architectural models from the library are filled in between the landmark buildings in a well-organised layout that closely resembles the real scene. The whole scheme is implemented very efficiently. It can be seen that our method is well suited to the digital construction of large urban scenes.
We put the virtual scene in the Unity 3D engine and set up the material and lighting effects, highlighting the appearance of the landmark buildings (as seen in Figures 23-26). These landmarks are defined elements of the scene and are immutable. We restored these landmark buildings essentially one-to-one and implemented the corresponding interactive functions so that users can enter these buildings for roaming or virtual shopping. For the area around these landmarks, all the scene layouts are generated automatically by our method, which can not only improve the efficiency of digital city construction but also provide a meaningful reference for the renovation and expansion of the city block.

V. CONCLUSION
We present a novel optimisation method for scene layout and guide the evolution direction of the scene layout through mesh visual quality evaluation. Then, we combine an interactive genetic algorithm with visual feature information of 3D buildings to achieve fast layout and reconstruction of a scene. This method is based on an interactive genetic algorithm and user evaluation and is combined with mesh visual quality to improve scene layout phenotypes. At the same time, a neural network evaluation strategy is introduced to effectively reduce the error caused by user fatigue.
According to the experimental analysis, among the different evolutionary objectives, adopting the mesh visual quality evaluation as the objective that guides virtual scene layout optimisation gives the best performance. It also relies less on user evaluation, which effectively reduces the cost of manual evaluation. The SGA, PSO and SA methods are random and ignore users' psychological preferences; these characteristics lead to low-quality scene layouts and make them unsuitable for layout optimisation of large scenes.
Compared with the five typical layout optimisation methods, the experimental results show that our method is superior to the random layout, SGA, TIGA, SA and PSO in complex scene layout optimisation. Our method can achieve a more efficient virtual scene layout scheme that is more in line with designers' requirements on the basis of less reliance on user evaluation. In practice, this method can guide designers to conduct rapid draft design, scene layout optimisation and scheme evaluation and assist designers in achieving scene reconstruction and evaluation decisions. It is envisaged that this method will also be applicable in 3D game scene design, landscape animation design and other virtual scene planning problems.
In future work, we will continue to advance this research and fully explore the practical application value and development potential of our work in related fields, such as scene layout, reconstruction and parametric design.