Workplace Assignment to Workers in Synthetic Populations in Japan

In this article, we assign workplace attributes to each worker in each household in a synthetic population using multiple censuses conducted in Japan. A synthetic population is a set of artificial individual attributes for each resident that is synthesized according to census data. We have synthesized a set of synthetic populations of Japan. We assign a workplace attribute to each worker to estimate daytime population distribution and to develop activity-based models in agent-based simulations or microsimulations. Although statistical information on residential areas and workplaces is released by the government and some individual movement data are released by cellphone companies, it is hard to collect information that links a worker's home and workplace locations with their family and working information. We employ origin–destination–industry (ODI) statistics to estimate workplaces for workers. Since some attributes in ODI statistics are not available for privacy reasons, we propose a workplace assignment method for all cities, towns, and villages using restricted ODI and OD statistics in Japan. We show how much the number of workers obtained with the complete ODI statistics differs from the number obtained by the proposed workplace assignment method. We show that 88.2% of workers in a city in Japan are assigned to the correct cities as workplaces by our proposed method. We also show several maps of daytime population distributions produced by our proposed method. Synthetic populations with workplace attributes enable real-scale social simulations to design transport or business systems in times of peace or to estimate victims and plan recoveries in times of emergency, such as disasters or pandemics.

researchers tried to see how infected patients spread in a target region [1], [2], [3], [4], [5], [6]. To observe such an increase of patients in a specific region or area using social simulations, researchers need a synthetic population with attributes of each resident and household composition in the target area or region. Currently, only a few research projects publicly release country-level synthetic population data, namely those for the U.S. [7], [8], the U.K. [9], Japan [10], [11], [12], and Belgium [13]. As a Japanese team, we distribute the Japanese synthetic population to researchers who conduct social simulations for regions in Japan. The first method to synthesize populations based on statistics, called the synthetic reconstruction method (SR method), was proposed by Wilson and Pownall [14]. They reconstruct individual and household data from statistics with some actual samples using an iterative proportional fitting (IPF) procedure [15]. Barthelemy and Toint [16] indicate that the SR method has difficulty reconstructing populations that fit both individuals and households simultaneously. To cope with this difficulty, Gargiulo et al. [17] and Barthelemy and Toint [16] proposed synthetic population methods without sample data. Lenormand and Deffuant [18] compare these methods with the SR method and show that the synthetic population method without samples can reconstruct a better solution. We employed an SR method without samples in synthesizing Japanese populations [10].
A synthetic population with attributes of residents and households is synthesized based on the census collected from residents in a municipality. Therefore, a synthetic population shows the information on residents as a so-called "night-time population," that is, attributes of individual residents and their households are connected with their home location. In order to estimate the activities of residents in agent-based simulations or microsimulations, it is essential to estimate the movements of residents during the daytime. Since youths and children go to school mainly on foot or by bicycle, we assume that their schools are not far from their homes. On the other hand, the workplaces of workers are scattered over a wider area that includes their home city and neighboring cities. In this article, we assign workplaces based on the workers' statistics available in Japan, which helps to estimate their daytime activities.
In order to estimate workplaces for workers, many researchers employ origin-destination (OD) surveys. Workplace assignment methods differ depending on whether the OD survey is: 1) a sampling survey that extracts several subjects from the entire population or 2) a complete survey that collects replies from the entire population.
Many workplace assignment methods employ 1) sampling OD surveys. For example, Abdel-Aal [19] conducted a workplace assignment using a sampling OD survey among 15 zones in Alexandria, Egypt. The sampling OD survey is adjusted using the IPF procedure to estimate the entire movements between 15 zones in Alexandria.
As an example of 2) a complete OD survey, Fournier et al. [20] employed longitudinal employer household dynamics (LEHD) origin-destination employment statistics (LODES) collected by the Center for Economic Studies of the U.S. Census Bureau (see footnote 2), and Ye et al. [21] employed the Chinese National Population and Economic Census. Because origin-destination-industry (ODI) data are not available, Fournier et al. [20] utilize three LODES statistics: workplace OD totals, workplace origin totals by industry (OI), and workplace destination totals by industry (DI). Fig. 1 shows the relation among the three attributes (origin, destination, and industry) in the three tables. As shown in Fig. 1, each of the three tables is a projection of the ODI data onto two attributes. Since they can utilize only the projected tables, they have to generate a joint distribution from the OI, OD, and DI tables.
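The relation between the ODI data and its three projections can be sketched with a small array example. The following is a minimal illustration, not the procedure of [20]: it assumes an IPF-style fit (the procedure cited as [15]) applied to a three-way table, and the function names and the toy table are hypothetical.

```python
import numpy as np

def _scale(target, current):
    """Element-wise target / current, with 0 wherever current is 0."""
    return np.divide(target, current, out=np.zeros_like(current), where=current > 0)

def ipf_joint(od, oi, di, iters=500, tol=1e-10):
    """Fit an origin x destination x industry table to its three 2-D projections."""
    od, oi, di = (np.asarray(t, dtype=float) for t in (od, oi, di))
    joint = np.ones((od.shape[0], od.shape[1], oi.shape[1]))
    for _ in range(iters):
        joint *= _scale(od, joint.sum(axis=2))[:, :, None]  # match OD totals
        joint *= _scale(oi, joint.sum(axis=1))[:, None, :]  # match OI totals
        joint *= _scale(di, joint.sum(axis=0))[None, :, :]  # match DI totals
        if np.abs(joint.sum(axis=2) - od).max() < tol:
            break
    return joint

# Toy ODI table (2 origins, 3 destinations, 4 industries); values are made up.
true_odi = np.arange(1, 25, dtype=float).reshape(2, 3, 4)
od = true_odi.sum(axis=2)  # projection onto origin-destination
oi = true_odi.sum(axis=1)  # projection onto origin-industry
di = true_odi.sum(axis=0)  # projection onto destination-industry
est = ipf_joint(od, oi, di)
```

The fitted table reproduces all three projections, but it is not guaranteed to equal the original ODI table, since many joint tables share the same projections.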
In Japan, the complete ODI data (the number of workers for each combination of origin, destination, and industry) is released by local governments whose areas have more than 200 000 residents. However, the destination categories are not fine-grained for cities, towns, or villages with fewer than 200 000 residents. In those areas, only six categories are shown for the destination: "At home," "In the same city," "In another city in the same prefecture," "In another city in another prefecture," "Unknown in Japan or outside Japan," and "Not available." This article proposes a workplace assignment method for workers in all cities, towns, and villages in Japan (as another unit, there are "wards" in Tokyo and some larger cities). We compare the proposed method with a method using statistics available in a city with more than 200 000 residents to see challenges in the workplace assignment to workers in the proposed method.
Footnote 2: The following note is given at https://lehd.ces.census.gov/data/ for the data: "The data released by LEHD are based on tabulated and modeled administrative data, which are subject to error. Because the estimates are not derived from a probability-based sample, no sampling error measures are applicable. However, the data are subject to nonsampling errors, which can be attributed to many sources: misreported data, late reporters whose records are missing and imputed, and geographic/industry edits and imputations. The accuracy of the data is impacted by the joint effects of these nonsampling errors. While no direct measurement of these joint effects has been obtained, precautionary steps are taken in all phases of collection and processing to minimize the impact of nonsampling errors."
This article consists of the following sections. Section II shows a workplace assignment method using the ODI data for all cities, towns, and villages in Japan. We first assign a workplace at the city level (or town or village level) for each worker. Because only restricted ODI data are available for cities, towns, and villages with fewer than 200 000 residents, we also employ the OD data with the names of cities, towns, and villages as destinations to assign workplaces. On the other hand, cities with more than 200 000 residents release more precise ODI data. We apply the proposed method with restricted statistics to a city with more than 200 000 residents and then compare the results with the workplace assignment using the complete ODI data for that city. Section III shows a method of workplace assignment at the small area level within the city assigned in Section II. Section IV shows the results of the proposed workplace assignment method for workers, including the tendency of workplaces by industry. Section V concludes and shows some further applications of the synthetic population with workplace attributes.
Fig. 2 shows attributes for each household member in a synthetic population synthesized by our method [6], [7], [8]. Four address attributes are assigned to each household: (A) prefecture, (B) city, (C) small area in the city, and (D) home coordinate (latitude and longitude). As noted in Fig. 2, Japan had more than 127 million residents in 2015. It has 47 prefectures; the biggest is Tokyo, with more than 13.5 million residents, and the smallest is Tottori, with 573 thousand. Fig. 2 shows Osaka Prefecture, with more than 8.8 million residents. It has 33 cities, 9 towns, and 1 village. Among them, eight cities have more than 200 000 residents, and the others have fewer. Fig. 2 shows an example of a city with more than 200 000 residents, Takatsuki-city in Osaka. The city has 448 small areas indicated by red lines in the figure.
The Geospatial Information Authority of Japan, Ministry of Land, Infrastructure, Transport and Tourism, Japan, releases the fundamental geospatial data, including the shape data of each building with latitude and longitude. Each household in our synthetic population has a coordinate based on the fundamental geospatial data.

A. Attributes of Household Members in Synthetic Population
The synthetic population also includes seven biological and social attributes for each household member: (1) age, (2) sex, (3) role in the household, (4) working status, (5) working industry, (6) industry size, and (7) income if the member is working. The role is assigned to each household member according to the family type, such as husband, wife, child, and parent. We employ the nine family types shown in Fig. 3. These nine family types cover 95% of all households in Japan.
The government of Japan collects census data every five years. Attributes (A) and (B) and (1)-(3) are synthesized from the census data by prefecture and by city, town, or village [6]. To assign households to small areas in a city, we utilize the number of residents by sex in each small area [7]. Then, we allocate each household to a building in the assigned small area to specify the address attributes (C) and (D) in Fig. 2 [7]. Attributes (4)-(7) are assigned to each worker based on the Basic Survey on Wage Structure [8].

B. Workplace Assignment From City to City
The workplace assignment is conducted for each worker of a household in a synthetic population. Using three attributes, namely (1) age, (2) sex, and (5) working industry, we assign a workplace to each worker.
To assign a workplace to each worker, we need the ODI data in Fig. 1. In Japan, the census data provide ODI data from residential cities to working cities. However, for privacy reasons, only incomplete destination information is available for cities, towns, and villages with fewer than 200 000 residents. To compensate for the lack of finer destinations, we propose a method for all cities, towns, and villages (especially for cities, towns, and villages with fewer than 200 000 residents).
1) All Cities, Towns, and Villages: The Statistics Bureau of Japan releases ODI data for all cities, towns, and villages in a restricted manner. Table I shows the number of workers by destination and by working industry in an originating city (residential city of workers). Table I indicates the destinations only coarsely, such as "In the Same Prefecture" or "In Other Prefectures." Therefore, we also employ Table II, which indicates the number of workers by destination at the city level. From Table II, we can see the number of male and female workers in each city, town, or village in Japan. It should be noted that Table II does not indicate the industry of workers. Among 1896 wards, cities, towns, and villages in Japan, 487 are listed in Table II for workers in Takatsuki, Osaka, Japan, in 2015 (with four more destinations, namely "At Home," "In City," "Unknown in Japan or Outside Japan," and "Not Available," the number of destinations becomes 491 in total).
Before assigning cities of workplaces to workers, we adjust the number of workers in "T) Industry unable to classify" in Table I, since the respondents classified in T) described their occupations in a way that could not be classified. On the other hand, the Economic Census used in Section III does not have such a category since all companies and self-employed persons register themselves as one of A)-S). Table I shows that there are 5622 male workers in T). The same statistics show that there are 4644 female workers in T). Therefore, there are 10 266 workers in T) in total. We distribute these workers to industries A)-S) according to the rate of male and female workers in each industry over A)-S).
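The redistribution of the workers in T) can be sketched as a proportional allocation. The following is a minimal sketch for one sex; the largest-remainder rounding used to keep integer head counts is an assumption made here for illustration, and the industry counts in the example are made up rather than taken from Table I.

```python
def redistribute_unclassified(counts, unclassified):
    """Add `unclassified` workers to the industries in `counts` in proportion
    to the existing counts, keeping every result an integer."""
    total = sum(counts.values())
    raw = {k: unclassified * v / total for k, v in counts.items()}
    extra = {k: int(r) for k, r in raw.items()}       # integer part of each share
    leftover = unclassified - sum(extra.values())     # workers not yet placed
    # Assumption: hand the remaining workers to the largest fractional parts.
    by_fraction = sorted(raw, key=lambda k: raw[k] - extra[k], reverse=True)
    for k in by_fraction[:leftover]:
        extra[k] += 1
    return {k: counts[k] + extra[k] for k in counts}

# Hypothetical male counts for three industries and ten workers in T).
adjusted = redistribute_unclassified({"A": 100, "D": 300, "E": 600}, 10)
```

The total number of workers is preserved: the sum of the adjusted counts equals the original A)-S) total plus the number of workers in T).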
Using Tables I and II, we assign a workplace using the following procedure.
Step 1: Assign one of six workplace categories a in Table I according to the sex x (male or female) and the industry d of the workers in a synthetic population [attributes (2) and (5) in Fig. 2]. Let us denote n(x, d, a) as the number of x workers who work for industry d in area category a in Table I. The workplace category is defined using the following rate:
$$P(a \mid x, d) = \frac{n(x, d, a)}{\sum_{a' \in A} n(x, d, a')}$$
where A is the set of six workplace categories in Table I, namely "At home," "In the same city," "In another city in the same prefecture," "In another city in another prefecture," "Unknown in Japan or outside Japan," and "Not available." Note that the workplace is not assigned stochastically but assigned randomly up to n(x, d, a).
Step 2: Assign a city, town, or village in the worker's living prefecture according to Table II up to the number of workers in the specific industry (e.g., 52 male workers in A) Agriculture & Forestry in Table I). Let us denote n(x, w) as the number of x workers who work in a city, town, or village w. One of the cities, towns, and villages is assigned using the following rate:
$$P(w \mid x) = \frac{n(x, w)}{\sum_{w' \in B} n(x, w')}$$
where B is the set of workplaces in the living prefecture in Table II.
Step 3: Assign a city, town, or village in another prefecture up to the number of workers in the industry (e.g., 29 male workers in agriculture and forestry in Table I). One of the cities, towns, and villages is assigned using the following rate:
$$P(w \mid x) = \frac{n(x, w)}{\sum_{w' \in C} n(x, w')}$$
where C is the set of workplaces in cities of other prefectures in Table II.
Step 4: The rest of the workers are not assigned to any city, town, or village since their workplace is "Unknown in Japan or Outside Japan" or "Not available" (e.g., three in "Unknown in Japan or Outside Japan" and two in "Not available" in Agriculture and Forestry in Table I).
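Steps 1 to 4 can be sketched as a quota-based random assignment for the workers of one sex. This is a minimal sketch under stated assumptions, not the authors' implementation: the category keys (`at_home`, `same_city`, and so on), the function names, and the toy counts are hypothetical, and the city pools from Table II are shared by all industries because Table II does not distinguish industry.

```python
import random
from collections import Counter

def quota_pool(quotas, rng):
    """Return a shuffled list in which each label appears quotas[label] times."""
    pool = [label for label, n in quotas.items() for _ in range(n)]
    rng.shuffle(pool)
    return pool

def assign_workplaces(workers, table1, same_pref, other_pref, home_city, seed=0):
    """workers: list of (worker_id, industry) for one sex x.
    table1[d][a] plays the role of n(x, d, a); same_pref[w] and other_pref[w]
    play the role of n(x, w) for sets B and C."""
    rng = random.Random(seed)
    same_pool = quota_pool(same_pref, rng)    # Step 2 destinations
    other_pool = quota_pool(other_pref, rng)  # Step 3 destinations
    by_industry = {}
    for wid, d in workers:
        by_industry.setdefault(d, []).append(wid)
    result = {}
    for d, ids in by_industry.items():
        cats = quota_pool(table1[d], rng)     # Step 1: categories up to n(x, d, a)
        for i, wid in enumerate(ids):
            cat = cats[i] if i < len(cats) else "not_available"
            if cat in ("at_home", "same_city"):
                result[wid] = (cat, home_city)
            elif cat == "same_prefecture" and same_pool:
                result[wid] = (cat, same_pool.pop())
            elif cat == "other_prefecture" and other_pool:
                result[wid] = (cat, other_pool.pop())
            else:
                result[wid] = (cat, None)     # Step 4: no city is assigned
    return result

# Toy example: ten workers in industry "A" living in a hypothetical home city.
table1 = {"A": {"at_home": 2, "same_city": 3, "same_prefecture": 3,
                "other_prefecture": 1, "unknown": 1}}
res = assign_workplaces([(i, "A") for i in range(10)], table1,
                        {"Ibaraki": 2, "Suita": 1}, {"Kyoto": 1}, "Takatsuki")
city_counts = Counter(city for _, city in res.values())
```

Because labels are drawn without replacement from the pools, the assigned counts never exceed the counts in the tables, which is one reading of "assigned randomly up to n(x, d, a)."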
2) Cities, Towns, and Villages With More Than 200 000 Residents: The Statistics Bureau of Japan releases more precise ODI data for cities with more than 200 000 residents. Those cities have statistics that connect Tables I and II. Table III shows an example for Takatsuki city. Although Tables I and II distinguish sex, Table III shows only the total of male and female workers (i.e., no distinction by sex) in every industry type for privacy reasons. The number of cities, towns, and villages in Table III for Takatsuki city is 487, the same as in Table II. In Japan, 290 of the 1896 wards, cities, towns, and villages have more than 200 000 residents. The workers in these 290 bodies account for 52.2% of all workers in Japan.
Using Table III without the distinction of sex, we assign a workplace using the following procedure for cities with more than 200 000 residents.
Step 1: Assign one of 491 destinations in Table III according to the industry of the workers in a synthetic population [attributes (2) and (5) in Fig. 2]. Let us denote n(x, d, w) as the number of x workers who work for industry d in a city, town, or village w (Table III gives only the total over both sexes). One of the cities, towns, and villages is assigned using the following rate:
$$P(w \mid x, d) = \frac{n(x, d, w)}{\sum_{w' \in D} n(x, d, w')}$$
where D is a set of workplaces in cities of another prefecture in Table III.
Step 2: The rest of the workers are not assigned to any city since their workplace is "Unknown in Japan or Outside Japan" or "Not available" (e.g., four in "Unknown in Japan or Outside Japan" and two in "Not available" in Agriculture & Forestry in Table III).

3) Differences in City Allocation Between Two Methods:
Since Tables I and II are available for every city in Japan, we apply the proposed workplace assignment method in Section II-B1 to Takatsuki city, Osaka, Japan. Fig. 4 shows the number of workers by industry type based on Table I (i.e., Step 1 in Section II-B1) after distributing the number of workers in T) to each industry for Takatsuki city. In each industry, the figure over a bar indicates the number of workers in other cities, towns, and villages. To each of those workers, one of the cities, towns, and villages in Table II is assigned. Fig. 5 shows the number of workers assigned to cities correctly and mistakenly. We apply the proposed method using Tables I and II according to the procedure in Section II-B1. We also apply another method using Table III by the procedure in Section II-B2. We assigned cities to workers using ten sets of the synthesized populations of Takatsuki city. In Fig. 5, the solid bar shows the average number of workers assigned to the correct cities indicated in Table III. Fig. 5 also shows the rate of workers mistakenly assigned among all workers who work in other cities in each industry.
Table IV shows the classification rate of workers who are correctly assigned to the cities of workplaces by the proposed method and the rate of workers who are working in their home city. The accuracy of workplace assignment by the proposed method is 88.2% for all industries. The underlined figures in the table are larger than the overall value (i.e., 88.2% in "Accuracy" and 46.3% in "Rate of Home City"). This result shows that the industries with higher accuracy than the overall accuracy are those with a higher rate of workers working in their home city. Fig. 6 depicts the relation between the number of all workers by industry and the number of workers correctly assigned by the proposed method. The vertical difference between each marker and the diagonal line shows the number of workers who are mistakenly assigned to workplaces. From Fig. 6, the proposed method using only the OD data assigns relatively many workers correctly to their workplaces. Of course, this depends on the number of workers who are working outside their living city. For Takatsuki-city, 43.6% of the workers work in their living city, whereas the figure for the whole of Japan is 53.8%. That is, the accuracy of workplace assignment for the whole of Japan is expected to be higher than the result for Takatsuki-city.
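The comparison between the two methods can be read as a count overlap between two tables. The sketch below rests on an assumption that is not stated in the text: a worker counts as correctly assigned when the number of workers the proposed method places in an (industry, destination) cell does not exceed the number in the same cell of Table III. The function name and the toy counts are hypothetical.

```python
def assignment_accuracy(assigned, reference):
    """assigned, reference: dicts mapping (industry, destination) to a head count.
    Returns the share of reference workers matched by the assigned counts."""
    total = sum(reference.values())
    # Assumption: the overlap of the two counts in a cell is "correct".
    correct = sum(min(assigned.get(cell, 0), n) for cell, n in reference.items())
    return correct / total

# Toy example with made-up counts for one industry and two destinations.
reference = {("E", "Osaka"): 10, ("E", "Kyoto"): 10}
assigned = {("E", "Osaka"): 15, ("E", "Kyoto"): 5}
accuracy = assignment_accuracy(assigned, reference)
```

Under this reading, workers who work at home or in their home city are always matched, which is consistent with the observation that industries with a higher home-city rate show higher accuracy.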

III. WORKPLACE ASSIGNMENT AT SMALL AREA LEVEL
Using the procedures in Section II, we have assigned a working city to each worker. In this section, we assign a small area to each worker in the assigned city. As shown in Fig. 2, each city, town, or village has small areas. Census data are aggregated based on these small areas. We employ the Economic Census for Business Frame in Japan. This census is a survey targeting all establishments and enterprises to identify the structures and activities of businesses in Japan and is conducted every five years. Although we employ the census data of 2015 to synthesize a population and assign cities of workplaces, the Economic Census was not conducted in 2015. We therefore employ the Economic Census of 2014 in this article.

A. Location of Workplace for Workers Who Work at Home
We assign the home coordinate (i.e., latitude and longitude) as the workplace coordinate for workers who work at home in Tables I or III. We do not need the results of Economic Census for them.

B. Location of Workplace for Workers Who Work in a Specific City
For each worker who works within the city where they live or in another city assigned by the procedures in Section II, we employ the Economic Census. In each city, town, or village in Japan, the Economic Census shows the number of workers in each small area by industry. Table V shows an example of Ibaraki city, the western neighbor of Takatsuki, Osaka, Japan. One of the small areas is assigned to a worker of the synthetic population according to the following probability based on Table V:
$$P(c_k \mid w, d) = \frac{n(w, d, c_k)}{\sum_{j=1}^{N_w} n(w, d, c_j)}$$
where n(w, d, c_k) is the number of workers who work for a company of industry d in a small area c_k of a city, town, or village w, and N_w is the number of small areas in w. It should be noted that the number of workers assigned to each small area does not always become equal to the corresponding value in the table, such as Table V, since a small area is assigned probabilistically to each worker. Furthermore, workers from a city, town, or village with fewer than 200 000 residents are sometimes assigned to a workplace city mistakenly, as shown in Fig. 5. Therefore, there is room to optimize the number of workers after assigning workers from all originating cities, towns, and villages. This challenge remains for further study.
[Figure caption: Attributes of household members and workplace coordinate of workers in the household (pseudo household is shown).]
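The probabilistic assignment of a small area can be sketched as weighted sampling. In this minimal sketch, the small-area identifiers and worker counts are made up rather than taken from Table V, and the function name is hypothetical.

```python
import random

def assign_small_area(area_counts, rng):
    """area_counts maps a small-area id c_k to n(w, d, c_k) for a fixed city w
    and industry d. Returns one id drawn with probability proportional to
    its count."""
    areas = list(area_counts)
    weights = [area_counts[c] for c in areas]
    return rng.choices(areas, weights=weights, k=1)[0]

# Hypothetical counts of workers in one industry for three small areas.
rng = random.Random(0)
counts = {"area_001": 120, "area_002": 0, "area_003": 30}
draws = [assign_small_area(counts, rng) for _ in range(1000)]
```

Because each worker is drawn independently, the number of workers landing in a small area fluctuates around its expected share instead of matching the table exactly, which is the effect noted above.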
After assigning one of the small areas in the assigned city (town or village) to each worker, we assign the center coordinate (i.e., latitude and longitude) of the small area in the Economic Census. Using this coordinate, developers of agent-based simulations can easily implement the movement of each agent from home to workplace. It is not the coordinate of the workplace building, although the coordinate of each home [(D) in Fig. 2] is represented by a building coordinate. Fig. 7 shows the workplace attributes for a household with workers. The coordinate of the center of the small area where the worker's workplace exists is added to each worker. The coordinate of the center of the small area is given to the fifth decimal place. At 35° north latitude, a difference of 0.00001° in longitude corresponds to 0.913 m. This is sufficiently precise since each building for residents or workers is larger than 1 m².
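The precision figure can be checked with the standard formula for the length of one degree of longitude on an ellipsoid. The sketch below assumes the WGS 84 ellipsoid, which the text does not name; the constant and function names are chosen here for illustration.

```python
import math

WGS84_A = 6378137.0          # semi-major axis in metres (assumed ellipsoid)
WGS84_E2 = 0.00669437999014  # first eccentricity squared

def metres_per_degree_longitude(lat_deg):
    """Length in metres of one degree of longitude at the given latitude."""
    phi = math.radians(lat_deg)
    radius = WGS84_A * math.cos(phi) / math.sqrt(1 - WGS84_E2 * math.sin(phi) ** 2)
    return math.pi / 180 * radius

# Ground distance covered by the fifth decimal place of longitude at 35 deg N.
step_m = metres_per_degree_longitude(35.0) * 0.00001
```

At 35° north latitude this gives roughly 0.913 m per 0.00001° of longitude.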

IV. RESULTS OF WORKPLACE ASSIGNMENT
We depict the number of workers in some industries on the map of Osaka Prefecture and its neighboring areas (Hyogo, Kyoto, Nara, and Shiga Prefectures). Fig. 8 shows the color scales of the number of workers from Takatsuki city, Osaka, Japan. Fig. 9 shows the number of workers in all industries from Takatsuki city in each small area. We show the number of workers in D) Construction, E) Manufacturing, H) Transport and Postal Services, and P) Medical and Welfare Services. From Figs. 8 and 9, we can see that residents of Takatsuki-city commute to Hyogo Prefecture (western side of Osaka), Kyoto Prefecture (northeast of Takatsuki-city), and Shiga Prefecture (eastern side of Kyoto, including Biwa Lake). On the map, the Yodo River flows from Biwa Lake to Osaka Bay from northeast to southwest. Since Takatsuki-city is located on the northern side of the river and two train lines (Japan Railways and Hankyu Railways) run along the river, we can see more workers on the northern side of the river than on the southern side of Takatsuki.
From Figs. 10 and 11, the workers in D) Construction and E) Manufacturing are working along the river and in downtown Osaka (the plain next to Osaka Bay). Those workers commute to the southern side of the river in downtown. From Fig. 12, the workers in H) Transport and Postal Services are working in areas along the northern side of the river. Since many storehouses are located along the river, many workers in transport and postal services are working in these areas. Fig. 13 shows a different tendency of workplaces in P) Medical and Welfare Services. Workers in this category tend to work in the city where they reside. Since they may have night shifts, they live and work in the same city.
In order to confirm the tendency observed in Fig. 13, we aggregate the ratio of workers living and working in the same city for four cities, namely Takatsuki, Ibaraki, Suita, and Toyonaka, in Fig. 14. Fig. 15 shows the ratio of workers living and working in the same city in E) Manufacturing and P) Medical, Health Care & Welfare services. Dashed and solid lines indicate the ratio of workers who live in the corresponding city. The horizontal axis shows the city of the workplaces. From Fig. 15, we can see the ratio of workers living and working in the same city. The ratio of workers in E) Manufacturing who work in the city where they live is at most 34.31%. On the other hand, the ratio of workers in P) Medical and Welfare Services is over 50% in these four cities. We can see the apparent tendency of workers in P) Medical and Welfare Services to work in their home city.

V. CONCLUSION
This article proposed a method to assign workplace attributes to each worker of the synthetic population in all cities, towns, and villages in Japan. Since the ODI data for cities, towns, and villages with fewer than 200 000 residents have only restricted information on the destination city for work, we proposed a method to assign a workplace using OD data without the industry information. By applying the proposed method to Takatsuki-city, Osaka, Japan, we showed the difference between the number of assigned workers using the complete ODI data and the number of assigned workers by the proposed method with the restricted ODI data. We showed that 88.2% of workers in a city in Japan are assigned to the correct cities as workplaces by our proposed method. Since Takatsuki-city has a larger share of workers who are working outside the city than the average for the whole of Japan, we can expect better workplace assignment in many places in Japan.
We have already extended our proposed method so that it applies to all cities in Japan. We will soon release the data to research communities to develop social simulations. Synthetic population data with workplaces can be utilized in many applications, such as epidemic simulations and transportation design in target regions or areas. Such data are also needed to estimate the number of victims of earthquakes or floods and to plan their rescue from places struck by disasters.