A Data Integration and Simplification Framework for Improving Site Planning and Building Design

Site planning and building design results are generally managed separately in Geographic Information System (GIS) and Building Information Modeling/Model (BIM) platforms. This data incompatibility creates challenges for the assessment and delivery of the results. This paper proposes a data integration and simplification framework for improving site planning and building design. A BIM-GIS integrated model with a multi-scale data structure is developed to link the results of site planning and building design. Geometric optimization algorithms are then designed to generate simplified building models with different levels of detail (LODs) based on the information required at each scale. The framework provides a feasible way to integrate planning and design data from different sources to enhance the evaluation and delivery of the results. The proposed approach is validated on a village construction project in east China, and the results show that the method can integrate site planning and building design results from different platforms and support seamless visualization of multi-scale geometric data. It is also found that a seamless database facilitates understanding of planning and design results and improves communication efficiency. The main limitation of this work is the limited access to real-world 3D data; data collection techniques such as point clouds are expected to address this problem.


I. INTRODUCTION
Site planning and building design are two closely related stages in a building construction project. The task of site planning is to determine the optimal location of the building so as to balance convenience, resident comfort and harmony with natural landscapes [1]. In built-up areas, interaction with neighboring buildings is also an indispensable factor. The subsequent building design stage focuses on the arrangement of spaces and building components to create a safe and livable structure. In a typical construction workflow, the site selection result is an important reference for architects designing the building [2]. Window placement should be based on the ventilation conditions at the site to make full use of natural winds [3]. Daylighting conditions are used to arrange the orientation and layout of the building to reduce lighting energy consumption [4]. Neighborhood conditions, including prosperity and accessibility, also have a major influence on the design of the floor plan [5].
In recent years, information technologies including Geographic Information System (GIS) and Building Information Modeling/Model (BIM) have been widely applied in the site planning [6] and building design process [7] . The decision-making process of site planning involves multiple geospatial factors including local availability, land cost and topography [8,9] , which can be managed in and accessed from GIS. As a result, GIS has been considered a popular and effective tool to determine suitable locations under multiple criteria and constraints [8] , and the planning results are often presented in 2D or 3D GIS maps [10] . On the other hand, BIM is found to be ideally suitable for the building design process for its ability to integrate and deliver building information [11] . Architects can optimize their design in the BIM platform and deliver design results to the subsequent stages of the building lifecycle in BIM files [12] .
Although site planning and building design are closely related in the workflow, there are still gaps in data delivery between the two processes. Site selection results are generally delivered as GIS files such as 2D aerial maps [13], while the building design plan is usually presented in BIM [12]. Since BIM and GIS are independent information systems that were originally designed for different purposes and follow different data exchange standards [14], the results of the two processes are currently difficult to manage and present in an integrated system. Other participants in the building project, including owners and constructors, need to access and view the planning and design results on two independent platforms. Site selection results, including surrounding buildings, topography and landscapes, are difficult to link with the design of the building itself, which reduces communication efficiency in the collaborative design process.
The integration of BIM and GIS has become a rapidly developing and widely applied idea in the building industry. At the data level, BIM-GIS integration mainly focuses on the conversion and unification of data formats, which provides a consistent and easy-to-access database for collaborative work in the building project [15]. Currently, the results of site planning and building design are managed in GIS and BIM separately, which requires extra effort to connect the work of the two processes. This paper applies data-level BIM-GIS integration to address the problem. BIM-GIS integration has been successfully applied in the building industry, including the site planning and building design stages [16]. BIMs of existing buildings can be integrated into GIS to supplement the semantic information essential for site selection [17]. GIS can also provide geospatial information to BIM to support sustainable building design [18]. However, current BIM-GIS integration research mostly focuses on separate stages of planning and design [14]. Its purpose is mainly to supplement data so that the decision-making process can consider more factors, while the results of planning and design remain difficult to deliver, manage and visualize in a unified manner. This paper aims to develop a cross-stage BIM-GIS integration method to provide a comprehensive view of site planning and building design results.
In detail, this paper proposes a BIM-GIS integrated framework for improving the workflow between site planning and building design. A unified information model integrating BIM and GIS is first designed to link the results of the two stages. Since the two stages focus on different spatial scales, a multi-scale data structure with different levels of detail (LODs) is implemented in the model. Geometric optimization algorithms are then developed to generate building models with different LODs corresponding to different scales, which simplifies the geometric models and accelerates data transmission and visualization. Finally, a web-based visualization platform is developed to provide a unified view of site planning and building design results. The proposed framework bridges the gaps in data integration between the site planning and building design processes, provides practical tools to manage and visualize the results of the two stages together, and is expected to improve communication efficiency in the collaborative design process.
The remainder of this paper is organized as follows. Related research on applying BIM and GIS to improve the site planning and building design processes is first reviewed. The methodology, including data model definition, BIM-GIS integration and multi-scale geometric optimization algorithms, is then detailed. The next section presents a case study to evaluate the performance of the proposed approach. Finally, the discussion and conclusion are given.

II. LITERATURE REVIEW
Site planning is a key process of building planning and has a significant impact on the subsequent design and construction stages [19] . In the past few decades, various site selection techniques from heuristic methods to precise methods have been proposed [8] . Due to the complexity of the problem, evaluation of building location often involves multiple criteria such as local climate, terrain, land use and existing buildings [8] . GIS, which is an information system for storage, query, analysis and visualization of geographic data [20] , provides an ideal tool for managing these decision data. Cheng et al. [19] established a GIS database integrating population, streets, traffic volume and household income to help determine the optimal location of a mall. Kumar and Bansal [8] developed several GIS-based datasets including elevation raster data, road features and existing buildings to support the safe site selection process. Algorithms such as multi-criteria analysis [10] and analytical hierarchy process [21] can be then developed to acquire data from GIS and calculate the optimal location. The evaluation results can also be integrated into GIS for visualization [10] . Data in the site planning process can be delivered to support the subsequent stages of the building lifecycle, such as the construction planning process [22] through GIS.
Building design requires the collaboration of multiple parties including owners, architects and structural engineers [23]. BIM is considered to be an ideal medium to manage and deliver design results in the building design process [24] for its ability to store, manage, exchange and express building information based on three-dimensional models [25]. BIM-based platforms have been developed to integrate building design results from multiple software in different formats and promote data interoperability [26]. It is proved that the appropriate application of BIM can address the problem of data sharing and communication barriers, enabling interdisciplinary design teams to understand each other's work more deeply [27,28]. Data-driven algorithms can also be applied to the integrated information in BIM to optimize the design plan. For example, the thermal performance of building envelopes can be assessed through the physical properties of components in BIM [34], and evacuation in emergency situations can be simulated based on the spatial topology of the building [35]. Building design results are suggested to be provided to the subsequent stages of the project lifecycle in BIM to improve collaboration [36].

TABLE 1. BIM-GIS INTEGRATION RESEARCH IN SITE PLANNING AND BUILDING DESIGN STAGE
Research | Stage | Integration approach | Extracted data | Integration purpose
Isikdag et al. [17] | Site planning | Extract BIM data into GIS | Semantic data of buildings | Supplement data for site selection analysis
Wang et al. [29] | Site planning | Extract BIM data into GIS | Functional zoning of buildings | Simulate traffic flow for site layout optimization
Ouyang and Du [30] | Building design | Extract GIS data into BIM | Terrain data, climate data, economic data, urban planning data | Supplement data for building performance analysis
D'Amico et al. [31] | Building design | Extract BIM data into GIS | Geometric data of buildings | Analyze the impact of the buildings on the surroundings
Amirebrahimi et al. [32] | Building design | Extract BIM data into GIS | Semantic data of buildings | Assess flood damage to buildings
Bai et al. [33] | Building design | Extract BIM and GIS data into another system | |
Although GIS provides a data modeling specification, CityGML, to exchange building data [37], most GIS software lacks sufficient tools for detailed building modeling [38]. On the other hand, BIM is poorly suited to managing the geographic information surrounding buildings [16]. In applications where both building and surrounding environment data are involved, BIM-GIS integration has proved to be an effective method to improve the data management process [14]. Integrating BIM and GIS can combine the advantages of both systems to support comprehensive building and city modeling [15] and multi-scale information management [39]. A seamless BIM-GIS database can also reduce the workload of information acquisition and improve the efficiency of data exchange [40]. Some BIM-GIS integration applications have been achieved in the site planning and building design stages, as summarized in Table 1.
The mainstream methodology of data integration can be divided into three categories, including extracting BIM data into GIS, extracting GIS data into BIM, and extracting both BIM and GIS data into another system [16] . During the integration process, BIM is designed to provide detailed information of buildings, such as semantic and geometric data of buildings, and material property of components, while GIS is responsible for supplying regional geographic data, such as terrain and climate. The purpose of integration is mainly to provide essential data for numerical analysis [15] .
Most of the current BIM-GIS integration research focuses on improving workflows within a single stage, while the cross-stage application still remains to be explored [15] . As a result, although BIM-GIS integration applications have been achieved in both stages, the delivery of planning and design results is still based on a single information system. Currently, design results of the two stages often need to be managed and visualized on different systems due to the lack of effective cross-stage data integration methods [14] . In site selection tasks, GIS is a mainstream method to manage geographic data involved in the decision-making process and present candidate locations [41] , while the presentation of building design results is mainly based on electronic drawings or BIM [23] . Therefore, an original BIM-GIS integration framework is proposed to realize data integration of the two stages. As shown in Table 1, the proposed approach extracts GIS data from site selection and BIM data from building design, and manages them in the designed data model. The proposed method is expected to support integrated delivery and visualization of site planning and building design results.
BIM focuses on data management of building internal details, while GIS supports a broader information scale from buildings to cities [42]. How to integrate data of different scales is one of the main challenges in achieving BIM-GIS integration [15]. One solution is to map the scale structure of BIM or GIS to the corresponding scales of the other system [18]. However, details may be lost during the conversion process [42]. The idea of multi-scale models has been proposed to manage data of different levels [43]. The multi-scale data structure consists of data models of different scales, such as a micro-scale model and a macro-scale model, which share the same database with consistent information and are closely linked with each other [39]. Multi-scale models have been applied to integrate BIM and GIS information for different purposes, such as collaborative railway design [43] and planning of infrastructure projects [44]. Generally, models of different levels in a multi-scale model can be divided by LODs [39]. BIM and GIS have both defined LOD schemes to organize elements with different amounts of detail [45,46], but their information definitions differ. As a result, LOD mapping is often required when integrating BIM and GIS data at different scales [47]. Original LOD frameworks have also been proposed for applications integrating BIM and GIS data into another system [48]. In addition, LODs can be applied to reduce the workload of loading and rendering models during the visualization process [49]. For example, urban buildings can be rendered in high fidelity and in real time with multi-LOD building models [50]. In this case, geometric simplification algorithms are often required to generate building models with different LODs [51].

III. METHODOLOGY
As illustrated in Fig. 1, an original framework to improve the workflow between site planning and building design is proposed in this paper. Some commonly used BIM and GIS data formats in the planning and design process are considered in the approach. These formats are introduced in Table 2. IFC is an open standard that supports data exchange of geometric and semantic information in BIM. The format is widely applied in practice and can be exported by almost all mainstream design software [52]. Therefore, the proposed framework selects IFC as the exchange format for building design results to access data from various design platforms. Shapefile is a widely applied format for exchanging geographic vector data. A variety of data that need to be considered in site planning, such as regional data and transportation data, can be represented in vector format and exchanged with Shapefile. However, Shapefile is not an open standard and is not well supported by some platforms. Therefore, another widely accepted format, GeoJson, is selected as a supplement to exchange vector data. The two formats are applied together to ensure that the proposed framework can access planning data from most GIS databases. Site planning sometimes also involves 3D data such as reconstructed oblique photography models or point cloud models [53]. The OBJ format is selected to work with these data for its versatility in exchanging 3D information.
To process the delivered design results in BIM and GIS files, a unified pipeline to deal with data in different formats was proposed. A file parsing program was first developed to extract the required information from files. The extracted data were then merged and integrated into a unified data model, and geometric optimization algorithms were further applied to generate building models with different LODs.

A. THE MULTI-SCALE INFORMATION MODEL
As shown in Fig. 2, the proposed multi-scale information model consists of four levels. The structure of the model is designed to combine the information characteristics of different BIM/GIS file formats. BIM files usually contain detailed design results of single buildings, and the hierarchical modules Building, Building Element and Mesh are designed to match and store the information in BIM files. Shapefile and GeoJson are used to integrate information of building groups, which is organized in the Building Group module. OBJ files are applied to exchange reconstructed 3D building and terrain models in the proposed approach. These data are managed in the Building and Terrain modules. Corresponding data collections are established in a MongoDB database on the cloud server to manage the integrated BIM/GIS information.
As shown in Fig. 3, multiple information levels at different scales are defined in the proposed model to support multiscale applications and carry out the subsequent geometric optimization algorithms. The proposed model extends the concept of LOD in CityGML [46] to organize planning and design information at different scales. Corresponding storage structures are designed in the Terrain and Building Module to organize the hierarchical data.
(1) The regional scale contains 3 LODs including LOD0, LOD1 and LOD2. This scale mainly organizes the region and terrain information that are involved in the decision-making process of site selection, as well as the location and the brief appearance of buildings as planning results. Results of site selection can be visualized together with the outline of design results at this scale to evaluate if the planned building is in harmony with regional landscapes and the community.
(2) The single building exterior scale contains the information defined in LOD3. This scale manages a building model that is geometrically optimized from the original design plan. Only the exterior components of the building design results are preserved. The scale is mainly designed to display the exterior appearance of the building.
(3) The single building interior scale corresponds to LOD4. Detailed information about interior building components is retained. The scale is designed to help participants in the building project, including constructors and owners, view the internal design results.
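To make the relationship between the three scales and the five LODs concrete, the following minimal sketch models the Building Group, Building and Mesh modules as Python dataclasses. The field names, the per-LOD mesh dictionary and the helper `lods_for_scale` are illustrative assumptions and are not taken from the paper's MongoDB schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Tuple


class Scale(Enum):
    REGIONAL = "regional"
    EXTERIOR = "single building exterior"
    INTERIOR = "single building interior"


# LOD-to-scale assignment as described in the text:
# LOD0-LOD2 -> regional, LOD3 -> exterior, LOD4 -> interior.
LOD_TO_SCALE = {
    0: Scale.REGIONAL, 1: Scale.REGIONAL, 2: Scale.REGIONAL,
    3: Scale.EXTERIOR,
    4: Scale.INTERIOR,
}


@dataclass
class Mesh:
    vertices: List[Tuple[float, float, float]]
    triangles: List[Tuple[int, int, int]]  # indices into vertices


@dataclass
class Building:
    name: str
    # Hypothetical layout: one mesh per LOD, keyed by LOD number.
    lod_meshes: Dict[int, Mesh] = field(default_factory=dict)

    def lods_for_scale(self, scale: Scale) -> List[int]:
        """Return the stored LODs that belong to the requested scale."""
        return sorted(l for l in self.lod_meshes if LOD_TO_SCALE[l] is scale)


@dataclass
class BuildingGroup:
    name: str
    buildings: List[Building] = field(default_factory=list)
```

An application working at the regional scale would then request only the meshes whose LOD maps to `Scale.REGIONAL`, leaving the LOD3 and LOD4 geometry unloaded.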

B. INFORMATION EXTRACTION AND INTEGRATION
As illustrated in Fig. 4, information extraction and integration methods are developed for the involved BIM/GIS file formats. Since decoding methods for these file formats already exist, the main work of this paper is to reorganize the extracted BIM and GIS data and map them to the corresponding levels of the multi-scale information model to achieve data integration and unified management.

1) IFC FILES
IFC files contain rich geometric and semantic information about buildings. The open-source package xBIM [54] was applied in this paper to extract information from IFC files. The extracted data are stored in EXPRESS entities. These entities are then organized into a hierarchical structure through the Decomposes and IsDecomposedBy attributes defined by the IFC schema. For example, an IfcProject entity has an IsDecomposedBy attribute, whose value is a set of IfcRelDecomposes entities that reveal the entities making up the IfcProject. Generally, these entities are IfcSite entities. The IfcSite entities can be further decomposed into IfcBuilding entities. In this way, the hierarchical information structure shown in Fig. 4 is obtained. The hierarchical entities are then mapped to the corresponding modules of the information model. An IfcProject entity is mapped to a BuildingGroup collection, and the IfcBuilding and IfcBuildingElement entities correspond to the Building and BuildingElement modules, respectively.
Multiple geometric representations can be applied to define the geometry of IfcBuildingElement entities, including bounding box representation, surface model representation and boundary representation (BREP). The geometric representation of a certain building element can be obtained from the IfcShapeRepresentation attribute. For these representations, xBIM provides functions to uniformly convert them into surface models composed of triangular meshes. The converted geometric data is recorded in the Mesh module.

2) SHAPEFILE AND GEOJSON FILES
Shapefile and GeoJson are common formats to organize vector geospatial data. The Geospatial Data Abstraction Library (GDAL) [55] is used to process files in these two formats. Although data are organized differently in original files, GDAL can read these files into memory in the same data structure, as shown in Fig. 4. Each Layer class is mapped to a building group entity. The Feature class under Layer corresponds to the Building or Terrain module. Coordinate data in the Geometry class are extracted and stored in the Mesh module.
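The layer-to-group and feature-to-building mapping can be sketched for the GeoJson case as follows. To keep the example dependency-free it uses Python's standard `json` module rather than GDAL, so it is a stand-in for the described pipeline and not the paper's implementation; the function name and the output dictionary layout are assumptions.

```python
import json


def geojson_to_building_group(text, group_name):
    """Map a GeoJSON FeatureCollection to one building group record.

    Each Polygon feature becomes one building entry holding its attribute
    table and the outer ring of its footprint. Other geometry types are
    skipped in this sketch.
    """
    data = json.loads(text)
    buildings = []
    for feature in data.get("features", []):
        geom = feature.get("geometry") or {}
        if geom.get("type") != "Polygon":
            continue
        outer_ring = geom["coordinates"][0]
        # GeoJSON rings repeat the first position at the end; drop the copy.
        if len(outer_ring) > 1 and outer_ring[0] == outer_ring[-1]:
            outer_ring = outer_ring[:-1]
        buildings.append({
            "properties": feature.get("properties") or {},
            "footprint": [tuple(p[:2]) for p in outer_ring],
        })
    return {"name": group_name, "buildings": buildings}
```

The footprint polygons produced here still need to be triangulated before they can be stored as triangle meshes.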

FIGURE 5. HLOD tree to organize spatial information
Geometric data are organized as triangles in the Mesh module, while Shapefile and GeoJson files represent geometry with polygons, so the geometric data must be structurally transformed. The ear clipping algorithm [56] is applied in the proposed method to triangulate polygons. The algorithm is based on the two ears theorem, which states that every simple polygon without holes that has more than three vertices has at least two "ears". An "ear" is a triangle formed by three consecutive vertices in which two sides are edges of the polygon and the third side lies entirely inside the polygon. The algorithm iteratively detects ears and removes them from the polygon until only a single triangle remains. The polygon is thereby represented as a triangle mesh and stored in the Mesh module.

3) OBJ FILES
OBJ files are often used to exchange 3D models generated by oblique photography and point cloud scanning in site planning projects. These files organize geospatial information in text documents and distinguish different geometric elements by the identifiers at the beginning of text lines. A parsing program is developed in the proposed approach to read OBJ files and map the geometric elements to the information model. The group element, which is a collection of points and faces, is mapped to the Building module. The face, vertex, texture and normal vector elements are integrated into the Mesh module. Since OBJ files also apply surface models represented in triangle meshes, the geometric data can be imported into the Mesh module directly.
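The line-prefix dispatch described above can be sketched as a small parser. This is an assumed minimal reader covering only the `v`, `vt`, `vn`, `g` and `f` identifiers; the returned dictionary layout is hypothetical, and faces with more than three corners are fan-triangulated as a safeguard.

```python
def parse_obj(text):
    """Parse OBJ text into vertex arrays and per-group triangle lists."""
    vertices, texcoords, normals = [], [], []
    groups = {}
    current = "default"
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # strip comments
        if not line:
            continue
        tag, *args = line.split()
        if tag == "v":
            vertices.append(tuple(float(x) for x in args[:3]))
        elif tag == "vt":
            texcoords.append(tuple(float(x) for x in args[:2]))
        elif tag == "vn":
            normals.append(tuple(float(x) for x in args[:3]))
        elif tag == "g":
            current = args[0] if args else "default"
        elif tag == "f":
            # A corner is "v", "v/vt", "v//vn" or "v/vt/vn"; OBJ indices are
            # 1-based, and negative values count back from the latest vertex.
            corners = [int(a.split("/")[0]) for a in args]
            idx = [c - 1 if c > 0 else len(vertices) + c for c in corners]
            for k in range(1, len(idx) - 1):
                groups.setdefault(current, []).append(
                    (idx[0], idx[k], idx[k + 1]))
    return {"vertices": vertices, "texcoords": texcoords,
            "normals": normals, "groups": groups}
```

Each key of `groups` would correspond to one entry in the Building module, with its triangle list and the shared vertex arrays going to the Mesh module.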

C. MULTI-SCALE GEOMETRIC OPTIMIZATION
To generate building models with different LODs for multiple scales, a multi-scale geometric optimization method is developed. As shown in Fig. 5, the proposed method achieves the geometric optimization of building models based on the LOD framework. For a single building, optimization algorithms at the single building interior, single building exterior and regional scales are carried out separately to generate three optimized models with different LODs. These models are then integrated back into the Building module of the multi-scale information model. Data can then be extracted at different levels for delivery to applications of different scales. When buildings need to be displayed, these models are exported to Cesium [57], on which the visualization platform developed in this study is built, to achieve seamless visualization of site planning and building design results.
As shown in Fig. 5, the visualization platform implements the hierarchical display of models of different LODs and scales through the Hierarchical LOD (HLOD) framework. All geometric information is organized in an HLOD tree where each node in the tree represents a specific range of space and records the geometric information of the model in the space. Specifically, the root node manages all the geometric information of the construction project, the node at single building exterior scale records data of the single building, and the leaf node stores geometric data of the room scale. The child nodes are always inside the boundaries of the parent node and usually have more detailed information than the parent node. When the camera is far from the model, the spatial data in the root node is first displayed. As the camera approaches, the content of the parent node is replaced by the child node to display a more detailed model. In the proposed method, models of the regional level, single building exterior level and interior level are organized in the root node, intermediate node and leaf node respectively.
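The replace-on-approach behavior of the HLOD tree can be illustrated with a short traversal. The node class, the per-node `refine_distance` field and the distances used below are hypothetical; Cesium's own selection logic is based on screen-space error and is more involved than this distance test.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class HlodNode:
    name: str
    center: Tuple[float, float, float]
    # Assumed rule: when the camera is closer than this distance, the node's
    # content is replaced by the content of its children.
    refine_distance: float
    children: List["HlodNode"] = field(default_factory=list)


def select_visible(node, camera):
    """Return the names of the nodes whose content should be displayed."""
    distance = math.dist(node.center, camera)
    if node.children and distance < node.refine_distance:
        visible = []
        for child in node.children:
            visible.extend(select_visible(child, camera))
        return visible
    return [node.name]
```

With a three-level tree (regional root, exterior intermediate node, interior leaf), a distant camera selects only the root, and moving closer swaps in the exterior model and then the interior model.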

1) GEOMETRIC OPTIMIZATION FOR SINGLE BUILDING INTERIOR
The single building interior scale model is the most detailed model that contains rich geometric and semantic information of building components. This scale is used to present the design results inside buildings, including interior decoration, furniture placement and layout of space elements such as rooms, corridors and stairs. Since the main focus is inside the building at this scale, surrounding environment and buildings in the site planning process can be first filtered out. The geometric optimization algorithm was then designed to further reduce the size of the single building model.
The general idea of the algorithm is that the building model does not have to be displayed in its entirety, since the overall appearance of the building is not of concern at this scale. Building elements far away from the camera are blocked by walls or other components that are closer to the camera. As a result, only components around the camera are visible, and other parts of the building cannot be observed from the viewpoint even if they are loaded into the rendering pipeline. As an optimization, these parts can be unloaded to reduce the computation required for visualization while maintaining the fidelity of the view around the camera. This idea can be implemented by dividing building models into parts and organizing them into HLOD trees.
FIGURE 9. Flow of the geometric optimization algorithm for single building exterior

The proposed HLOD data structure for building division is shown in Fig. 6. The HLOD structure uses nodes to represent a certain cuboid space. Building components inside or intersecting with the space are also recorded in the node. Specifically, octrees are selected to organize the spatial data in this method. Each node can generate eight child nodes in octrees to represent the subdivision of the space. In the proposed method, the eight child nodes are divided evenly. The flow of the proposed algorithm is illustrated in Fig. 7.
After the single building model is filtered from the multiscale information model, the root node containing all the elements of the building is first created. The node is then divided iteratively to generate child nodes. In each iteration, the bounding box of the node is calculated and subdivided into smaller boxes. Child nodes are generated corresponding to the subdivided boxes, and building components of the parent node are assigned to child nodes based on the spatial relationship. The division of nodes continues until any termination condition in Fig. 7 is met. In termination condition 1, it is appropriate to stop the division when the length and width of the box are smaller than around 5 meters, as it is the common size of a room in a typical building.
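The iterative division can be sketched as a recursive octree build. Components are represented here by axis-aligned bounding boxes, which is an assumption. Only termination condition 1 (length and width below about 5 meters) comes from the text; the empty-node and maximum-depth checks are assumed stand-ins for the remaining conditions in Fig. 7.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Component:
    name: str
    lo: Vec3  # minimum corner of the component's bounding box
    hi: Vec3  # maximum corner of the component's bounding box


@dataclass
class OctreeNode:
    lo: Vec3
    hi: Vec3
    components: List[Component]
    children: List["OctreeNode"] = field(default_factory=list)


def overlaps(lo1, hi1, lo2, hi2):
    """Axis-aligned box intersection test (touching boxes count)."""
    return all(lo1[a] <= hi2[a] and lo2[a] <= hi1[a] for a in range(3))


def subdivide(node, min_size=5.0, max_depth=6, depth=0):
    length, width = node.hi[0] - node.lo[0], node.hi[1] - node.lo[1]
    small_enough = length < min_size and width < min_size  # condition 1
    if small_enough or not node.components or depth >= max_depth:
        return
    mid = tuple((node.lo[a] + node.hi[a]) / 2 for a in range(3))
    for bits in product((0, 1), repeat=3):  # eight evenly divided octants
        lo = tuple(mid[a] if bits[a] else node.lo[a] for a in range(3))
        hi = tuple(node.hi[a] if bits[a] else mid[a] for a in range(3))
        inside = [c for c in node.components
                  if overlaps(lo, hi, c.lo, c.hi)]
        child = OctreeNode(lo, hi, inside)
        node.children.append(child)
        subdivide(child, min_size, max_depth, depth + 1)


def leaves(node):
    """Collect all leaf nodes below (and including) the given node."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]
```

A component that straddles several octants is recorded in each of them, matching the rule that components intersecting a node's space belong to that node.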
The camera distance at which each node is loaded can be preset. For general buildings, a distance threshold of 5 to 8 meters was found to give good results through trial and error. The Cesium platform can then decide whether a part of the building needs to be loaded based on the camera position. As shown in Fig. 8, only the node where the camera is located and its adjacent nodes are prepared for loading and rendering. As a result, the number of ready-to-display models is reduced.

2) GEOMETRIC OPTIMIZATION FOR SINGLE BUILDING EXTERIOR
The single building exterior scale model is used to display the exterior appearance of the building. Although the designed building is still the main interest at this scale, the surrounding environment and other existing buildings will also have a certain impact on the concerned building. For example, a nearby building that is too high or close to the target building may block the sunlight. Therefore, local surroundings and buildings around the target building also need to be evaluated from the multi-scale information model. Since surrounding buildings are not the main concern of the scale, they can be presented in optimized forms at the regional scale to only show the size and location. The generation of the regional scale model will be introduced in the next section.
For the target building, the geometric optimization algorithm at the single building exterior scale is designed to reduce model size. The core idea of the algorithm is that most building elements inside the building will be blocked by external components such as walls, windows, doors and roofs, and can be unloaded without affecting the appearance of the building. Only exterior components that can be directly observed from the outside need to be retained at this scale.
The flowchart of the proposed algorithm is illustrated in Fig. 9. As before, the algorithm first creates the root node containing all components of the building and reuses the building division method introduced in Fig. 7. At the single building exterior scale, however, only termination condition 2 is applied in the iteration to ensure that the division of the building is fine enough for internal and external building elements to fall into different nodes. After the HLOD tree is generated, the leaf nodes of the tree are traversed and marked as one of three categories: empty, boundary or internal. As shown in Fig. 10, empty nodes are outside the building and do not intersect any building component. Boundary nodes intersect at least one building component and have no non-empty nodes on their outside. Internal nodes lie inside the boundary nodes.
Each node needs to be marked from six directions: up, down, left, right, front and back. Taking the marking process from the right as an example, a given node is marked based on the category of its neighbor, that is, the node on its right. If the neighbor has not been marked, the neighbor is processed first and the original node is temporarily stored in a stack until the marking of its neighbor finishes. The specific marking criterion is shown in Fig. 10. For instance, node 1 in the figure does not intersect any components, and there is no node on its right. It is therefore marked as empty according to the criterion. Node 2 intersects the outer wall, and its right neighbor is marked as empty, so node 2 is marked as boundary.
After all leaf nodes are marked, only nodes marked as boundary in at least one direction are retained. Building components recorded in the retained nodes are reorganized and written into a new building model. The model contains only the boundary components of the building, and internal elements are discarded. The number of geometric elements can thus be reduced while maintaining the original appearance of the building.
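The marking procedure above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the leaf nodes form a uniform 3D grid of cells (a simplification of the HLOD leaves), that each cell is only flagged as intersecting a component or not, and that an intersecting cell with no neighbor on the marked side counts as boundary. The names `mark_direction` and `exterior_cells` are hypothetical.

```python
from itertools import product

EMPTY, BOUNDARY, INTERNAL = "empty", "boundary", "internal"

# Six marking directions, written as the step toward the neighbor that is consulted.
DIRECTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def mark_direction(occupied, shape, step):
    """Mark every grid cell for one direction.

    occupied: set of (i, j, k) cells that intersect at least one component.
    shape:    grid dimensions (nx, ny, nz); an assumed uniform leaf grid.
    """
    marks = {}
    for start in product(*(range(d) for d in shape)):
        stack = [start]  # cells waiting for their neighbor to be marked
        while stack:
            cell = stack[-1]
            if cell in marks:
                stack.pop()
                continue
            neighbor = tuple(c + s for c, s in zip(cell, step))
            in_grid = all(0 <= n < d for n, d in zip(neighbor, shape))
            if in_grid and neighbor not in marks:
                stack.append(neighbor)  # handle the neighbor first
                continue
            # The outer side is clear if there is no neighbor or it is empty.
            outer_clear = (not in_grid) or marks[neighbor] == EMPTY
            if outer_clear:
                marks[cell] = BOUNDARY if cell in occupied else EMPTY
            else:
                marks[cell] = INTERNAL
            stack.pop()
    return marks


def exterior_cells(occupied, shape):
    """Return cells marked as boundary in at least one of the six directions."""
    kept = set()
    for step in DIRECTIONS:
        marks = mark_direction(occupied, shape, step)
        kept |= {cell for cell, mark in marks.items() if mark == BOUNDARY}
    return kept
```

For a solid 3 x 3 x 3 block of occupied cells placed inside a 5 x 5 x 5 grid, the sketch keeps the 26 surface cells and drops the single cell at the center, which mirrors the intent of discarding internal elements.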

3) GEOMETRIC OPTIMIZATION AT THE REGIONAL SCALE
The regional scale is used to present the surrounding environment and spatial layout of the community in the site planning stage. Although all building models in the region need to be displayed at this scale, the detailed information of each building is no longer important. The location, size and orientation of each building, and the harmony between buildings and the surrounding environment, are the main concerns at this scale. Therefore, the appearance of the building can be simplified to retain only the information of interest, and internal elements of the building can be ignored. Exterior building elements can be further simplified geometrically, and the geometric and semantic information of the model can be organized in units of buildings. A geometric optimization algorithm is designed based on these ideas.
The flow of the proposed algorithm is illustrated in Fig. 11. The algorithm takes the output model of the optimization algorithm at the single building exterior scale as its input. Exterior building components are first filtered by type. Only certain types of building components, including walls, columns, slabs, roofs, doors and windows, are retained. Other building components are no longer important when displaying the appearance of the building at the regional scale and can be ignored. Each retained component is then checked to determine whether it can be replaced with its bounding box. The judgment is based on the proportion of the bounding box that is occupied by the component, which can be calculated according to Eq. 1.
FIGURE 11. Flow of the geometric optimization algorithm at regional scale
$$\text{Proportion} = \frac{V_{\text{component}}}{(maxX - minX)\,(maxY - minY)\,(maxZ - minZ)} \tag{1}$$
Here maxX, maxY and maxZ are the largest x, y and z coordinates in the component model, and minX, minY and minZ are the smallest x, y and z coordinates. The denominator is the volume of the bounding box, obtained by multiplying its length, width and height. The numerator is the volume of the component, which can be obtained from the properties attached to the IfcBuildingElement entity. A threshold of 0.9 is used in the judgment process for the case study in the next section. If the proportion is higher than the threshold, the geometry of the component is simplified to its bounding box, which is a cuboid represented by six rectangular faces.
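The type filter and the bounding-box test can be sketched in a few lines of Python. This is an illustration under assumptions rather than the paper's code: the `Component` class, its field names and the lowercase type labels are hypothetical, and the component volume is assumed to have been read from the IFC properties beforehand. Only the 0.9 threshold and the list of retained types come from the text.

```python
from dataclasses import dataclass

# Component types retained at the regional scale (from the text).
RETAINED_TYPES = {"wall", "column", "slab", "roof", "door", "window"}
THRESHOLD = 0.9  # value used in the case study


@dataclass
class Component:
    kind: str        # hypothetical type label
    volume: float    # component volume, assumed to come from IFC properties
    vertices: list   # list of (x, y, z) tuples of the component mesh


def bounding_box(vertices):
    """Return ((minX, minY, minZ), (maxX, maxY, maxZ)) of the vertices."""
    xs, ys, zs = zip(*vertices)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def fill_proportion(component):
    """Share of the bounding-box volume occupied by the component."""
    lo, hi = bounding_box(component.vertices)
    box_volume = (hi[0] - lo[0]) * (hi[1] - lo[1]) * (hi[2] - lo[2])
    # A degenerate (flat) box cannot be compared, so it is never replaced.
    return component.volume / box_volume if box_volume > 0 else 0.0


def classify(components, threshold=THRESHOLD):
    """Drop unimportant types; flag the rest for bounding-box replacement."""
    result = []
    for component in components:
        if component.kind not in RETAINED_TYPES:
            continue  # ignored at the regional scale
        result.append((component, fill_proportion(component) > threshold))
    return result
```

A solid wall fills almost all of its bounding box and is flagged for replacement, while a wall whose volume is reduced by a large window opening falls below the threshold and is passed on to the opening-filling step described next.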
For building components that cannot be replaced by their bounding boxes, the proposed algorithm checks whether they contain openings. The opening information of a building component is retrieved through the inverse HasOpenings attribute of the IfcBuildingElement entity when parsing IFC files, and is recorded in the attribute list of the building element module. Any openings are filled by ignoring the related IfcOpeningElement entities and regenerating the triangular meshes. The processed components are then reorganized to integrate their geometric and semantic information in units of buildings. Finally, the optimized model at the regional scale can be exported.

IV. CASE STUDY
A building project in Jiangsu Province, China is selected to verify the feasibility of the proposed approach. The local government initiated a construction project for a village to replace the cramped and unsafe original buildings, as shown in Fig. 12. The buildings were built in the form of townhouses, with 2 to 6 single-family houses in each building. The entire construction project contains more than 140 buildings, with over 1000 villagers involved as owners. As mentioned above, the project has encountered challenges in data interoperability between the site selection and building design stages. The site planning results are delivered to the owners and constructors in GIS files, while the detailed design of buildings is presented in BIM. Owners need to view the design results on two platforms separately, and it is difficult for them to associate a building with its planned location when giving feedback. Constructors and subsequent O&M personnel also need to spend extra effort reintegrating BIM and GIS data to support numerical analysis. Based on the proposed approach, a platform is developed to address these problems and improve the workflow between site planning and building design.
As illustrated in Fig. 12, multiple BIM and GIS data sources are involved in the site selection and building design process. The site planning process needs to consider multiple factors, including topography, landscapes and surrounding transportation, to determine the most suitable location for each building. The topography and landscapes are obtained from 3D models reconstructed from drone photography and stored as OBJ files. With the oblique photography technique, the reconstructed model has good resolution and accuracy. The road data is obtained from Google Maps in GeoJSON files. The accuracy of this GIS database has been verified in practice. These data are first integrated into the platform to help planners determine the location and the scale of each building. The GIS software ArcGIS is applied to perform the process, and the output results are presented in Shapefile format and also integrated into the proposed platform. The detailed design process is subsequently carried out based on the position and scale of the building. The topography data are also involved in the process to optimize the daylighting of the rooms. The design process is carried out with the BIM software Revit, and the design results are exported to IFC files. Finally, the IFC files are imported into the platform to achieve the integration of planning and design data. ArcGIS and Revit are both commonly used software in the building industry, and their output can ensure the completeness of planning and design results.
Based on the proposed approach, data files were uploaded to the platform and integrated into the multi-scale information model after information extraction. The validation of the process is illustrated in Fig. 13. The visualization of the planning and design data in their original platforms before integration is displayed in the first row. The conversion does not take much time, and a medium-configured computer (2.8 GHz processor and 8 GB of RAM in this study) can complete the task in seconds. The data stays valid in the proposed workflow and is ready to support collaborative work in the planning and design process.
The integrated visualization of the planning and design data is shown in Fig. 14. Fig. 14(a) shows the integration results of the OBJ and GeoJSON files. The scene is used to visualize the site planning results. Fig. 14(b) adds the scale and orientation information of buildings, which was exchanged as conceptual models in Shapefile format, to refine the selection plan. The detailed building models are presented together with the surrounding data in Fig. 14(c) to evaluate the coordination of the designed buildings with their surrounding environments.
The multi-scale geometric optimization algorithm is then carried out to generate building models with different LODs. The optimized models are used to build multi-scale hierarchies for organizing spatial data of different scales. Geometric optimization is also used to improve the display efficiency of the platform, since it is costly and unnecessary to show all the details of site planning and building design at the same time. Optimization processes are carried out at three scales: single building interior, single building exterior and regional. The results of the geometric optimization algorithms are shown in Fig. 15. Compared with the original model, the geometric optimization algorithms at the three scales reduced the model size and the number of triangles while keeping the content of interest. At the single building interior scale, building components around the camera were completely preserved, while distant building components were not loaded. The algorithm at the single building exterior scale preserved building surface components and ignored internal elements. Geometric optimization at these two scales reduced the model size while retaining the appearance of the model in the parts that people were most interested in. At the regional scale, the proposed method greatly reduced the model size, but the appearance of the model was changed and details of building elements were lost, as expected. The cost was acceptable since the location and orientation of the building were the main concerns at this scale.
The optimized building models were then integrated into the multi-scale information model and used for multi-scale visualization. Models at different scales were organized as the HLOD tree introduced in Fig. 5 to achieve hierarchical display. The multi-scale visualization effect is illustrated in Fig. 16. When the camera was far from the model and the viewing range was wide, the regional scale model at the root of the HLOD tree was displayed. As the camera approached the building, the regional model was replaced by the single building exterior scale model, which is stored in the child nodes of the root. When the camera moved into the building, the single building interior scale model was displayed. As shown in Fig. 16, the multi-scale geometric optimization algorithm reduced the model size and increased the frames per second (FPS), an indicator of the smoothness of model browsing, at all scales. The result is valid since computers with a medium configuration (2.8 GHz processor and 8 GB of RAM in this study) could display the model smoothly with an FPS higher than 25. Planners and designers can upload their data without being concerned with the details of data integration, and participants in the project can browse planning and design results seamlessly and smoothly without expensive computer devices.
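The camera-driven switching between scales can be sketched as a simple tree traversal. The sketch below is an assumption-laden illustration, not the platform's code: the `HlodNode` class, the single camera-to-building distance and the switch distances of 200 and 10 units are all hypothetical, since the paper does not state how the switching criterion is computed.

```python
from dataclasses import dataclass, field


@dataclass
class HlodNode:
    name: str
    # Hypothetical criterion: refine to the children when the camera is
    # closer than this distance; the paper does not give actual values.
    switch_distance: float = 0.0
    children: list = field(default_factory=list)


def select_models(node, camera_distance):
    """Return the names of the models to display for a camera distance."""
    if node.children and camera_distance < node.switch_distance:
        selected = []
        for child in node.children:
            selected.extend(select_models(child, camera_distance))
        return selected
    return [node.name]  # coarse model is sufficient, or the node is a leaf


# Three-level hierarchy mirroring the scales described in the text.
interior = HlodNode("single building interior")
exterior = HlodNode("single building exterior", 10.0, [interior])
regional = HlodNode("regional", 200.0, [exterior])
```

With these assumed thresholds, a distant camera receives only the regional model, a camera near the building receives the exterior model, and a camera inside the building receives the interior model, so only one level of detail is loaded at a time.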
An inside look at the multi-scale visualization is shown in Fig. 17. The regional scale contains the surrounding environment model at LOD0 and the optimized building model at LOD2. Building models at this level contain their location and general appearance, which helps designers and owners evaluate site planning results and judge whether the building is in harmony with the surrounding environment. Semantic properties of buildings are also preserved at this level for potential regional analysis tasks in subsequent stages. At the single building exterior scale, the building of interest is presented at LOD3 to keep the fidelity of its appearance. Surrounding environments and buildings are also displayed at this scale to assess the impact of the surroundings on the building, such as the blocking of sunlight. Since only the impact on the main building is of concern, the surrounding buildings are managed at LOD1 to keep only their size and orientation information. At the single building interior scale, the building model at LOD4 is presented. The building model at LOD4 contains detailed information about the building to present the building design results.

V. DISCUSSION
The main contribution of the paper is a feasible data integration and simplification method that integrates site planning and building design results, which facilitates cross-stage data exchange and improves the understanding of planning and design results. Compared with existing BIM-GIS integration methods, the proposed framework addresses the current challenges from the following aspects:
(1) Current BIM-GIS integration research focuses on applications in a single stage, while cross-stage data integration is still limited. As a result, site planning and building design results often need to be managed and visualized on different platforms, which reduces the efficiency of the collaborative design process. This paper improves the workflow of site planning and building design by developing a cross-stage BIM-GIS integration framework. Site selection results in GIS files can first be integrated into the data model. Designers can obtain the geographic data used in the site selection process, including transportation, topography and landscapes, from a unified data source and apply them as references to improve building design. For example, the topography data is used to optimize the daylighting of the rooms in the case study. After the design is completed, detailed building models can be imported into the platform and linked with the site planning data. Other participants in the building project, including owners and constructors, can access and view the results of the two stages in a unified platform.
(2) Information loss of features during the integration process is another challenge for current research [15] . One of the main reasons is that BIM and GIS are "partially" integrated to address specific problems in most of the current studies [42] . Specifically, only the data involved in the problem is extracted and integrated, while irrelevant data is ignored. Although this integration approach is easy and flexible, it is only effective for specific problems, and the integrated database is difficult to exchange with other applications. To solve the problem, a multi-scale information model is designed in this paper to balance flexibility and data completeness. While the multi-scale model manages all the information extracted from BIM and GIS, the data involved can be defined flexibly based on specific applications at different scales. The geometric optimization algorithm is developed to generate the minimized geometric model suitable for specific applications. The integrated multi-scale model can be exchanged, and original levels can be defined for new applications.
(3) Current BIM-GIS integration is at the professional stage, which means that the users of the integrated systems are mostly experts in the building industry [42] . Few general and public users have benefited from BIM-GIS integrated applications. The proposed approach tries to fill this gap by linking the multi-scale information model with multi-level visualizations. In the developed web-based platform, users can browse site planning and building design results at different scales seamlessly. Models with different LODs are switched automatically based on the perspective, just like browsing an electronic map such as Google Maps [58] . Multi-level visualization eliminates the need to switch data sources when visualizing different scales, and is expected to help public users use the integrated systems more conveniently.
The geometric optimization algorithm proposed in this paper aims to generate building models with multiple LODs corresponding to different scales. When accessing data at a certain scale, only the minimized model that satisfies the information requirement needs to be extracted. It should be noted that the aim of the algorithm is to reduce the model size in the data extraction, transmission and visualization process. The size of the files that need to be stored and managed, however, increases because multiple models with different LODs are generated for the same building. The cost is acceptable since the model files are processed and stored in the back end of the platform, which is deployed on a cloud server with sufficient storage and computing capabilities, whereas the browser-based front end running on personal computers is often the bottleneck when visualizing large-scale models. In future work, storage optimization methods can be further developed to compress model files. For example, buildings and components with similar geometry can be stored in one model uniformly, and the mesh network can be optimized to reduce the number of triangles.
At present, there are still limitations in the proposed framework. The acquisition methods of 3D real-world data are limited, and the integrated model contains little semantic information about the surroundings. In future work, point cloud models acquired by laser or lidar scanning can be applied as a source of high-resolution 3D models [59] . With computer vision algorithms, the semantic information of the surroundings can be supplemented to provide richer information for planning and design. Shirowzhan et al. [60] developed a compactness metric to compute the 3D dimensions of buildings from airborne lidar data. The work provides rapid access for climatic design for building designers. Justo et al. [61] designed a supervised learning approach to automatically generate IFC models from point cloud data, which is expected to reduce the workload of building modeling. Currently, the information integrated by the framework is static and cannot be updated automatically. After the BIM-GIS integration model is delivered to subsequent stages, the Internet of Things (IoT) technique can be applied to support the establishment of digital twin models [62] . Real-time monitoring data can be integrated into the multi-scale information model to update the status of the building in real time and help residents better understand the operation of the building.

VI. CONCLUSION
Site planning and building design results are generally managed in GIS and BIM systems respectively, although the two stages are closely related in the workflow. The data barrier makes it difficult for practitioners to evaluate the planning and design results in a unified platform, and the results are difficult to deliver in a uniform format. In this paper, a framework based on BIM-GIS integration and geometric optimization is proposed to improve the workflow between site planning and building design processes. Parsing programs for common BIM and GIS data formats in the planning and design stages are developed to extract information, and a multi-scale information model is proposed to achieve BIM-GIS data integration. Three scales, namely single building interior, single building exterior and regional, are defined to support the management of planning and design results at multiple levels. Geometric optimization algorithms are further implemented at each scale to generate simplified building models with corresponding LODs. A web-based platform is developed, and the proposed approach is validated with on-site data from a building project in east China. The application results indicate that data in the planning and design stages can be converted, integrated and managed in the platform. With the multi-scale data structure and geometric simplification, data can be browsed seamlessly and smoothly. The framework provides a feasible way to integrate site planning and building design results to improve the workflow. The proposed work constructs a seamless browsing scene to enhance the understanding of planning and design results. It also reduces the data gap between site planning and building design to improve collaborative design.