## INTRODUCTION

At a time when the effects of climate change on urban life are highly visible, some cities have become unbreathable, and greenhouse gas emissions are driven by heating and cooling networks in buildings and by extensive gasoline-powered transportation. Now that transport has become the leading emitter of CO2, we have to imagine other ways of occupying urban space. This requires a better understanding of the spatial distribution of facilities and population (1–7). The information age and the online mapping revolution enable us to examine globally the interactions of humans with their built and natural environments (8–13). Pioneering multicity studies have uncovered scaling laws that relate population to facility distribution and socioeconomic activity at the macroscopic level (3, 6, 14–16). For example, it has been argued that more populous cities are more efficient in their per capita consumption (3, 4) and that their employment diversity can be modeled as social networks embedded in space (10). However, a systematic understanding of the interplay among urban form, the distribution of facilities, and their accessibility at multiple scales remains elusive.

At the national level, Gastner and Newman (17) derived a simple two-thirds power law between the optimal density of facilities $d$ and the population density $\rho$ when maximizing the accessibility of the population to a specified number of facilities. The power law was applied to allocate 5,000 facilities across the continental United States using population data from more than 8 million census blocks. In this case, each facility covers an area roughly the size of a county (1000 km²). In a follow-up study, Um et al. (18) proposed different optimization objectives to distinguish public services, such as fire stations and public schools, from commercial facilities, such as banks and restaurants. Public facilities aim to minimize the total distance between people and facilities and follow $d \propto \rho^{2/3}$. For-profit facilities aim to maximize the number of potential customers, and the power law has an exponent close to 1, i.e., $d \propto \rho$. The authors found agreement between the analytical optimization and the empirical distributions in the United States and South Korea, confirming the 2/3 exponent for public services and the exponent of 1 for for-profit facilities. The simple power law at the city scale reflects the equilibrium of the empirical distribution of resources among cities with different populations. However, the distribution of facilities at small scales within cities, where the coverage area per facility is only a few blocks (10 km²), involves more heterogeneous settlements with different socioeconomic characteristics. Studies of accessibility within cities deserve attention for scientifically grounded land use planning and for the redistribution of public services after disasters and evacuations (19–23). Forward-looking approaches to facility planning in cities would also take into account individuals' preferences for facilities, inferred by mining mobility patterns. Zhou et al. (24) used a location-based social network dataset to infer the demand for different types of cultural resources and identified urban regions lacking venues. Efforts have been made to address the optimal allocation problem in specific cities (24–27), but from an urban science perspective, a systematic understanding of the optimal allocation of facilities is still missing.

To contribute in this direction, we propose a multicity study that measures the accessibility of city blocks to different types of facilities via their road networks and examines the role of population distribution. At large scales, travel costs can be approximated by the Euclidean distance from residents to facilities. Within cities, however, road networks and geographic constraints play an important role in human mobility (28–31). Road network properties are known to influence residents' daily trips (32–34), urban form (35, 36), and accessibility (29, 37). As a complement to most studies, which deal with commuting travel costs, in this work we analyze the road network distance from people to the nearest amenity of different types, dividing space into high-resolution blocks with a constant area of 1 km². For each city and each facility type, we optimally redistribute the existing facilities and compare the result with their empirical distribution. We find that under redistribution, accessibility improves for some blocks and worsens for others. This means that some blocks would benefit from moving facilities away from others, making the best use of the existing facilities for more equitable accessibility. At the city level, the gap between the empirical facility distribution and the optimal plan offers a way to evaluate the quality of facility planning in different cities. We also revisit the power law between facility and population density, noting that the two-thirds power law is not followed in the empirical cases and is observed in the optimal scenario only when the number of facilities is small compared with the total number of blocks in the city.

We further investigate the optimal distribution of facilities by modeling the average travel distance in different cities as a function of the number of facilities to be allocated. A model of this quantity is derived for both synthetic and real cities and, with only two free parameters, fits well across diverse cities. In addition, we obtain a universal function relating the average travel distance to the number of facilities, controlled by the urban form derived from the population distribution. As a use case, we use the proposed function to estimate the number of facilities required to achieve a given level of accessibility in 12 real cities.

## RESULTS

### Empirical distribution of facilities

We select three cities in the United States (Boston, Los Angeles (LA), and New York City (NYC)) and three cities in the Gulf Cooperation Council (GCC) countries (Doha, Dubai, and Riyadh) to study the empirical distribution of facilities. For each city, we collect the population in blocks at a spatial resolution of 30 arc seconds (about 1 km² near the equator) from LandScan (38), road networks from OpenStreetMap (39), and facilities from the Foursquare service application (40). These novel, extensive, and publicly available datasets have proven useful in transport planning (34, 41, 42), land use studies (43, 44), and the modeling of human activities (45–47). The boundary of each city is drawn to encompass its metropolitan area and includes both urban and rural regions. Figure 1 shows the road network, population density, and 10 selected types of facilities (e.g., hospitals and schools) in NYC and Doha. Statistics for the six cities are summarized in Table 1. For clarity, all variables and notations introduced in this work are summarized in note S1. The distribution of all facilities available in the Foursquare data for the six cities is shown in fig. S1. Details on the datasets are given in Materials and Methods and note S2. Figure S2 shows the distribution of the population and of the facility categories as a function of distance to the central business district, illustrating the diversity of the selected cities. In particular, Doha and Dubai have more facilities in the more densely populated areas, whereas Boston has most of its facilities near the city center, where fewer people live. Mismatches between the distributions of population and facilities can also be observed in LA, NYC, and Riyadh. These three cities have their highest population densities near the city center, but their facilities are more evenly distributed across the city.

Fig. 1

Illustration of the three datasets in NYC and Doha. (**A**) Datasets in NYC. The bottom layer shows the city's map and road network. The middle and top layers show the population density and the locations of the 10 selected types of facilities in the same region. (**B**) Datasets in Doha. Compared with the very dense NYC, Doha has a simpler road network, a smaller population, and sparser facilities.

Table 1

Note that to calculate the facility density and the total number of facilities, we first merge facilities of the same type (e.g., hospitals) located in the same block into a single facility. Hereafter, the number of facilities therefore refers to the number of blocks that host a given facility type, denoted by $N$. We define $N_{\max}$ as the total number of blocks in a city. Since nearly unpopulated blocks play no role in calculating accessibility, we define $N_{\mathrm{occ}}$ as the number of occupied blocks, that is, blocks with a population above a threshold. In real cities, we set the threshold to 500, a value commonly used to differentiate between urban and rural areas. The ratio between the number of blocks occupied by facilities $N$ and the number of occupied blocks $N_{\mathrm{occ}}$ is denoted by $D_{\mathrm{occ}}$. Table 1 lists $D_{\mathrm{occ}}$ for the 10 selected facility types in the six cities studied. For example, $D_{\mathrm{occ}}$ of hospitals in Boston equals 0.11, indicating that approximately 11% of the populated blocks host a hospital.

To quantify the population's accessibility to facilities, previous work used the Voronoi cell around each facility, with Euclidean distance as a proxy for the tendency of individuals to choose the closest facility (17, 18). Within cities, however, the distances people travel are constrained by the road infrastructure and the landscape. In this regard, the routing distance is a better measure of accessibility from home to any amenity. Figure S2C compares the distributions of routing distance for the actual and optimal facility locations with those based on Euclidean distance. Our results confirm that a strategy optimized for Euclidean distance has a cost similar to that of the actual distribution of facilities and is much less effective than the strategy optimized for routing distance.

### Optimal distribution of facilities to maximize general accessibility

Accessibility reflects the level of service that facilities provide to residents. In network science, accessibility is defined as the ease of reaching points of interest within a given cost budget (48–50). Allocating facilities to maximize overall accessibility in cities is one of the most important concerns of facility planning. With this in mind, we redistribute the facilities by minimizing the total routing distance from the population to the nearest facilities. In the following, we refer to this redistribution as the optimal scenario and to the empirical distribution of facilities as the actual scenario. Specifically, among the $N_{\max}$ blocks of a city, we refer to the $N$ blocks occupied by a given facility type in the actual scenario as facility-marked, and we redistribute the same number of facilities in the optimal scenario. The shortest distance between two blocks is calculated with the Dijkstra algorithm on the road network. The goal is to find a new set of $N$ blocks to mark as facility-marked such that the total population-weighted travel distance from all $N_{\max}$ blocks to the newly selected $N$ blocks is minimized. This optimal allocation problem on networks is known as the p-median problem and is solved here with an efficient algorithm proposed by Resende and Werneck (51) (Materials and Methods).
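The allocation step can be illustrated on a very small network. The sketch below is not the Resende and Werneck algorithm used in the study: it computes shortest-path distances with Dijkstra and then solves the p-median problem by exhaustive search, which is feasible only for tiny instances. The five-block network, the populations, and all function names are hypothetical.

```python
import heapq
from itertools import combinations

def dijkstra(adj, source):
    """Shortest-path distances from source in a graph {node: [(neighbor, weight), ...]}."""
    dist = {node: float("inf") for node in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def p_median_bruteforce(adj, population, n_facilities):
    """Try every set of n_facilities blocks; keep the one with the lowest
    population-weighted distance to the nearest selected block."""
    dist = {u: dijkstra(adj, u) for u in adj}
    best_cost, best_set = float("inf"), None
    for cand in combinations(sorted(adj), n_facilities):
        cost = sum(population[i] * min(dist[i][j] for j in cand) for i in adj)
        if cost < best_cost:
            best_cost, best_set = cost, cand
    return best_set, best_cost

# Hypothetical path network 0 - 1 - 2 - 3 - 4 with 1 km edges
adj = {
    0: [(1, 1.0)],
    1: [(0, 1.0), (2, 1.0)],
    2: [(1, 1.0), (3, 1.0)],
    3: [(2, 1.0), (4, 1.0)],
    4: [(3, 1.0)],
}
population = {0: 100, 1: 100, 2: 100, 3: 100, 4: 1000}
blocks, cost = p_median_bruteforce(adj, population, 2)
```

In this toy instance one facility goes to the heavily populated block 4 and the other to block 1, which serves the remaining blocks. Exhaustive search grows combinatorially with the number of blocks, which is why heuristic solvers are used at city scale.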

The difference in travel distance between the actual and optimal scenarios measures the quality of the facility distribution and thus the accessibility in different cities. In each scenario, each block is assigned to the facility that can be reached by the shortest route; a block is assigned to itself when it hosts a facility. Note that in the present study we do not treat facility capacity as a constraint; i.e., there is no limit on the number of people using the same facility. We group the blocks served by the same facility and define them as a service community. Taking hospitals as an example, we present the service communities in Boston in the actual and optimal scenarios in Fig. 2 (A and B). The color of each cluster shows the total population $p_j^S$ in the service community of the $j$th facility. The communities in the optimal scenario are more uniform in both area and population than those in the actual scenario. In the actual scenario in particular, communities are small in downtown Boston but large in the rural area, revealing the uneven distribution of hospitals.

Fig. 2

(**A**) Service communities of hospitals in Boston, obtained from empirical data. The points represent the blocks in the city. The color shows the population of each community. (**B**) Service communities of the optimally distributed hospitals in Boston. In this optimal scenario, both the area and the population of the communities are generally more uniform than in the actual scenario. (**C**) The block gain index $r_i$, which is dimensionless, on a logarithmic scale in Boston. The green blocks indicate that residents are better served in reality than in the optimal scenario. The red blocks indicate that their actual distance to the hospital is greater than the optimal distance and that they are underserved, for example, in the northern and southeastern areas.

To quantify the differences among blocks in the level of service for a given facility type, we compare the actual and optimal routing distances to the facilities. We define the gain index of the $i$th block as

$$r_i = \frac{l_i}{\hat{l}_i} \tag{1}$$

where $\hat{l}_i$ and $l_i$ are the shortest travel distances from the $i$th block to the nearest facility in the actual and optimal scenarios, respectively. A value $r_i > 1$ indicates that the block is better served in the actual scenario than in the optimal scenario; the residents of these blocks benefit more from the actual distribution of facilities than they would under the social optimum. Figure 2C shows $r_i$ of each block for hospitals in Boston on a logarithmic scale. The green blocks, which are close to hospitals, lie in the central, southern, and northeastern areas, whereas the red blocks, which are less accessible than in the optimal scenario, lie in the northern, southwestern, and southeastern areas. This bears some resemblance to the spatial distribution of wealth in the Boston metropolitan area (52). The actual travel distance $\hat{l}_i$ and the gain index $r_i$ of each block for hospitals in the six cities are shown in fig. S3.

Although the inequality in the distribution of facilities can be seen visually in Fig. 2C, to compare the inequality across facility types and cities, we calculate the Gini coefficient of $r_i$ over all blocks for each facility type and city, as shown in fig. S4A. The Gini coefficients of all selected facility types in Boston are similar, at around 0.5. NYC shows the largest variation in Gini coefficients across the 10 facility types: schools, parks, pharmacies, banks, and bars are more equitably distributed than the others because of their high density (see Table 1). In GCC cities, fire stations are the most evenly distributed facilities, whereas bars, hospitals, parks, and pharmacies are less evenly distributed than the others. The Lorenz curves and the values of the Gini coefficients per facility type are shown in fig. S4B. Overall, facilities in the three U.S. cities are more equitably planned than those in the GCC cities.
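As a minimal sketch of this inequality measure, the code below computes the gain index of each block as the optimal distance divided by the actual distance, and then an unweighted Gini coefficient using the standard sorted-rank formula. Whether the study weights blocks by population is not stated in this section, so the unweighted form is an assumption, as are the example distances and function names.

```python
def gain_index(l_opt, l_act):
    """r_i = l_i / l_hat_i: optimal distance divided by actual distance, per block."""
    return [lo / la for lo, la in zip(l_opt, l_act)]

def gini(values):
    """Unweighted Gini coefficient of non-negative values (sorted-rank formula)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    ranked = sum((k + 1) * x for k, x in enumerate(xs))
    return 2.0 * ranked / (n * total) - (n + 1) / n

# Hypothetical distances (km) for four blocks
l_act = [0.5, 1.0, 2.0, 4.0]   # actual scenario
l_opt = [1.0, 1.0, 1.0, 1.0]   # optimal scenario
r = gain_index(l_opt, l_act)
g = gini(r)
```

A Gini coefficient of 0 means every block has the same gain index; values closer to 1 mean that the gain is concentrated in a few blocks.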

Next, we compare accessibility across cities and facility types. Figure 3A shows the average travel distances in the actual scenario ($\hat{L}$) and the optimal scenario ($L$) for the 10 selected facility types. The first row shows the facilities with higher density in U.S. cities: banks, pharmacies, schools, parks, and bars. Next come hospitals and supermarkets, followed by concert halls, soccer fields, and fire stations, which have the lowest density. As expected, the lower the density, the longer the travel distance. Note that accessibility to parks, fire stations, and bars shows the largest differences between cities in the United States and the GCC, largely because of the lower availability of these facilities in the latter. To compare travel distances in different cities on the same scale, we show scatter plots of $\hat{L}$ and $L$ versus $D_{\mathrm{occ}}$, the ratio of $N$ to $N_{\mathrm{occ}}$, in Fig. 3 (B and C). The discrepancy in the actual travel distance $\hat{L}$ among the six cities is mainly due to differences in facility planning strategy and urban form. As expected, the optimal travel distance $L$ shows a more uniform trend than $\hat{L}$, which suggests that $L$ can be modeled as a function of the number of facilities $N$.

Fig. 3

(**A**) The average travel distance in the actual scenario, $\hat{L}$, and in the optimal scenario, $L$, for the 10 types of facilities in the six cities. (**B**) $\hat{L}$ as a function of $D_{\mathrm{occ}}$, the ratio between $N$ and $N_{\mathrm{occ}}$, in the six cities. Each point refers to one of the 10 types of facilities in a given city. Cities have different rates of decrease. (**C**) $L$ as a function of $D_{\mathrm{occ}}$ in the six cities. All cities have similar rates of decrease. (**D**) Box plot of the optimality index $R$ by facility type. The facility types are ordered by their average density in the six cities, in descending order. Among the facility types, fire stations are the most optimally distributed, and banks and schools the least. In general, lower-density facility types are better located than dense ones from the standpoint of collective utility maximization.

An interesting measure is the improvement in overall accessibility if the locations of facilities were optimally redistributed at the city scale. For this purpose, we define the optimality index $R$ for a given facility type at the city level as the ratio between the average travel distances to the nearest facilities in the optimal and actual scenarios

$$R = \frac{L}{\hat{L}} = \frac{\sum_{i=1}^{N_{\max}} p_i l_i}{\sum_{i=1}^{N_{\max}} p_i \hat{l}_i} \tag{2}$$

Here, $p_i$ is the population of the $i$th block. $R$ ranges from 0 to 1, with 1 indicating that the facilities are optimally distributed in reality. In note S3 and fig. S4C, we discuss how $R$ changes with $N/N_{\max}$ by introducing two extreme planning strategies, random and population-weighted assignment. We observe that $R$ of the actual planning lies mainly between those of the two extreme strategies, with the exception of Riyadh, where $R$ is even lower than for random assignment. This suggests an imbalance between the locations of facilities and the delivery of services in Riyadh (53). We also find that $R$ of the actual planning is highest when $N/N_{\max}$ is smallest for cities other than LA and NYC. For the two extreme strategies, $R$ is U-shaped as a function of $N$, with the exception of LA, meaning that $R$ is higher for both small and large $N$. This is because for small $N$, simply assigning facilities to the most populated blocks significantly reduces the total travel cost, whereas for large $N$, most blocks are occupied by facilities. In LA, $R$ remains stable compared with other cities, mainly because of its polycentric population distribution, which implies that a small number of facilities cannot efficiently serve most of the population.
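The optimality index is a ratio of two population-weighted sums, which the following sketch makes explicit. The block populations and distances are hypothetical values chosen for illustration, and the function name is not from the study.

```python
def optimality_index(population, l_opt, l_act):
    """R = sum(p_i * l_i) / sum(p_i * l_hat_i), as in Eq. 2."""
    optimal = sum(p * lo for p, lo in zip(population, l_opt))
    actual = sum(p * la for p, la in zip(population, l_act))
    return optimal / actual

# Hypothetical city with three blocks
population = [100, 200, 700]
l_opt = [1.0, 0.5, 0.5]   # km to nearest facility, optimal scenario
l_act = [2.0, 1.0, 1.0]   # km to nearest facility, actual scenario
R = optimality_index(population, l_opt, l_act)
```

Here every block is twice as far from its nearest facility in the actual scenario as in the optimal one, so the index is one half, meaning that optimal relocation would halve the average travel distance.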

Figure 3D shows the box plot of $R$ for the 10 types of facilities in the six cities. $R$ is generally lower at higher facility density, which indicates larger gaps between the actual and optimal distributions. For example, hospitals and fire stations have a much lower density than bars, but their $R$ values are larger. Public services need to be distributed evenly, whereas commercial services do not. The most abundant facilities are banks, pharmacies, schools, parks, and bars, with $R$ between 0.4 and 0.5 on average, showing that the average travel distance could be reduced by about 50% if all facilities were placed in the optimal locations.

### Revisiting the scaling law between facility and population density

Previous work related facility density to population density through a power law in both the actual and optimal scenarios at the national level (18). Here we analyze these power laws in the two scenarios within different cities, taking the road networks into account. We calculate both the facility density and the population density in the service communities shown in Fig. 2 (A and B). Specifically, $d_j^S = 1/a_j^S$ and $\rho_j^S = p_j^S/a_j^S$, where $a_j^S$ is approximated by the product of the number of blocks $n_j^S$ and the average block area in the city $\bar{a}$, that is, $a_j^S = n_j^S \bar{a}$. Taking hospitals as an example, Fig. 4A shows their density versus the population density of the service communities in the actual scenario in the six cities. The solid lines represent the power laws fitted by least squares to communities with more than 500 inhabitants. Cities have different exponents, and the $r^2$ values of the fits are below 0.5 in most cases. These results show that although the two-thirds power law for public facilities was found at county resolution (18), at finer resolutions, i.e., at the intracity level, no universal law can be found between facility density and population density.
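The fitting procedure can be sketched as an ordinary least-squares regression in log-log space, applied to communities with more than 500 inhabitants. The communities below are synthetic, constructed so that facility density equals 0.001 times population density to the power 2/3 exactly; the block area of 1 km² and the function name are likewise assumptions made for illustration.

```python
import math

def fit_power_law(rho, d):
    """Least-squares fit of log d = log c + beta * log rho; returns (beta, c, r2)."""
    xs = [math.log(v) for v in rho]
    ys = [math.log(v) for v in d]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    beta = sxy / sxx
    c = math.exp(my - beta * mx)
    r2 = sxy * sxy / (sxx * syy) if syy > 0 else 1.0
    return beta, c, r2

block_area = 1.0  # km^2, assumed average block area

# Synthetic service communities as (population, number of blocks),
# built so that d = 0.001 * rho**(2/3) holds exactly.
communities = []
for n_blocks in (2, 4, 8, 16):
    area = n_blocks * block_area
    rho_j = (1.0 / (0.001 * area)) ** 1.5
    communities.append((rho_j * area, n_blocks))

kept = [(p, n) for p, n in communities if p > 500]    # population threshold
rho = [p / (n * block_area) for p, n in kept]         # population density
d = [1.0 / (n * block_area) for p, n in kept]         # facility density
beta, c, r2 = fit_power_law(rho, d)
```

On this noise-free input the fit recovers the exponent 2/3. With empirical communities, the scatter around the fitted line is what lowers $r^2$.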

Fig. 4

(**A**) Actual hospital density (the inverse of the community area) versus population density at the service community level. Each colored dot represents a service community, and its size represents the community's population, as shown in Fig. 2A. The solid line represents the best-fit power law. The colored shading represents the 95% confidence interval. The low $r^2$ indicates that no clear power law can be found and that the fitted exponent deviates from the 2/3 observed empirically at larger scales (18). (**B**) Hospital density versus population density in the optimal scenario. The solid lines show clear power laws with exponents close to 2/3 in all cities. The $r^2$ values are higher and the confidence intervals narrower than in the actual scenario. The fitted exponents for other facility types are given in table S1. (**C**) Change of the fitted exponent $\beta$ with $D_{\mathrm{occ}}$ in the optimal scenario in the six cities. Each $\beta$ is calculated by optimally distributing a given number of facilities in the city. The gray dashed and solid lines indicate $\beta = 2/3$ and $D_{\mathrm{occ}} = 0.2$, respectively. (**D**) Change of the fitted exponent $\beta$ with $D_{\mathrm{occ}}$ in the optimal scenario in four toy cities.

Once the facilities are optimally distributed in the city, the service communities are reorganized accordingly. The fitted power laws between hospital density and population density in the optimal scenario for the six cities are shown in Fig. 4B. The fitted exponents are closer to 2/3, the $r^2$ values are larger, and the 95% confidence intervals are narrower than those in Fig. 4A, which depicts the actual scenario. The exponents for the 10 selected types of facilities in the actual and optimal scenarios are given in table S1. As expected, cities have different exponents in the actual and optimal scenarios. In all cases, we find that the optimal exponents deviate from the analytical value of 2/3 previously reported for facilities optimally distributed at the national scale using Euclidean distance (17). The differences stem both from the constraints introduced by the road network and from the higher density of facilities to be distributed.

To understand comprehensively when the power laws hold, we optimally allocate varying numbers of facilities $N$ in our six study cities and in synthetic cities. In Fig. 4 (C and D), we relate $\beta$ to $D_{\mathrm{occ}}$, the ratio of $N$ to $N_{\mathrm{occ}}$, and observe $\beta \approx 2/3$ when $D_{\mathrm{occ}} < 0.2$ (0.1) for the real (synthetic) cities. We simulate controlled scenarios in four synthetic, or toy, cities of 100 × 100 blocks, with the population distributions shown in Fig. 5A. Note that the population threshold used to count $N_{\mathrm{occ}}$ in toy cities is set to 50, and the total population is set to half a million, about 1/10 of that of the cities studied. We find that the curves of different cities collapse into one, which indicates that the differences in the change of $\beta$ among cities are mainly caused by their different $N_{\mathrm{occ}}$. In the toy cities, the change in $\beta$ is not monotonic. It stays around 2/3 while $D_{\mathrm{occ}}$ is below 0.1. Subsequently, $\beta$ decreases with $D_{\mathrm{occ}}$ as more facilities are allocated to the low-density regions and then increases as facilities begin to be placed again in the high-density regions. After facilities have been assigned to all high-density blocks, $\beta$ drops toward zero, which corresponds to all blocks being filled with facilities. The same fluctuation in $\beta$ is not observed in real cities because their high- and low-density regions are not as separated as in the synthetic cities. In summary, in the optimal scenario the two-thirds power law holds for a limited number of facilities but tends to disappear for larger values of $N$.

Fig. 5

(**A**) Population distribution of four selected toy cities. (**B**) Population distribution of four selected real-world cities, of which Paris is the most monocentric and Melbourne the most polycentric. (**C** and **D**) Simulated and modeled optimal travel distance $L$ versus the number of facilities $N$ in toy cities and real-world cities, respectively. The points represent the simulated $L$ with varying $N$. The lines represent the fitted model $L(N)$ in Eq. 5. Cities are ordered in the legend by urban centrality index (UCI), in descending order. (**E**) Simulated and modeled $L$ versus $\alpha N$ in toy cities. By rescaling $N$ with $\alpha$, the curves of $L$ for cities with a UCI below 0.9 collapse into a single one. (**F**) Simulated and modeled $L$ versus $\alpha N$ in real-world cities. The black line represents the function in Eq. 6, which the simulated $L$ approaches in all cities. (**G**) Relation between $\alpha$ and $N_{\mathrm{occ}}$. $\alpha$ is well fitted by $N_{\mathrm{occ}}$, suggesting that for small $N$ the decay rate of the population share in blocks without facilities is inversely proportional to the urban area of a city. (**H**) Validation of the universal function in LA and Barcelona.

### Modeling accessibility to optimally distributed facilities

In Fig. 3B, we see that $D_{\mathrm{occ}}$ is the determining factor in decreasing the average distance $\hat{L}$ to a facility, regardless of facility type and city. In Fig. 3C, we observe that these decreasing functions $L(D_{\mathrm{occ}})$ collapse for the optimal distributions in every city. Following this observation, we further investigate the relation between the travel distance in the optimal scenario $L$ and the number of facilities $N$ for various cities with different geographic constraints and population distributions. To this end, we designed 17 toy cities with different levels of urban centrality, 4 of which are shown in Fig. 5A. The population distributions of the toy cities are generated by a two-dimensional Gaussian function (e.g., cities a and b) or a mixture of several two-dimensional Gaussian functions (e.g., cities c and d). The toy cities have the same population of half a million and the same size of 100 × 100 blocks. The area of each block is set to 1 km², and the travel cost between two blocks is computed as the Euclidean distance between their centroids. We measure the centrality of a city by computing the urban centrality index (UCI) proposed by Pereira et al. (54) from its population distribution (Materials and Methods). The UCI ranges from 0 to 1, where 0 indicates complete polycentricity, with the population uniformly distributed across the city, and 1 indicates complete monocentricity, with the entire population living in a single block. In addition, we include 12 real-world cities for further exploration: the six cities above plus Paris, Barcelona, London, Dublin, Mexico City, and Melbourne. The population distributions of four selected cities are shown in Fig. 5B. Paris is the most monocentric, with a UCI of 0.50 and most inhabitants living in the urban core, whereas Melbourne is the most polycentric, with a UCI of 0.08 and inhabitants spread across the city.
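A toy city of this kind can be generated in a few lines. The sketch below builds a 100 × 100 population grid from a mixture of isotropic two-dimensional Gaussians and scales it to half a million inhabitants. The center positions, the standard deviations, and the equal weighting of mixture components are assumptions made for illustration, since the exact parameters of the toy cities are not given in this section.

```python
import math

def toy_city(size=100, centers=((50, 50),), sigma=10.0, total_pop=500_000):
    """Population grid from an equal-weight mixture of isotropic 2D Gaussians,
    rescaled so that the grid sums to total_pop."""
    weights = [[0.0] * size for _ in range(size)]
    for cx, cy in centers:
        for y in range(size):
            for x in range(size):
                dist2 = (x - cx) ** 2 + (y - cy) ** 2
                weights[y][x] += math.exp(-dist2 / (2.0 * sigma ** 2))
    norm = sum(map(sum, weights))
    return [[total_pop * w / norm for w in row] for row in weights]

# One monocentric and one polycentric example (hypothetical parameters)
mono = toy_city()
poly = toy_city(centers=((25, 25), (75, 75), (25, 75)), sigma=8.0)

# Occupied blocks, using the toy-city population threshold of 50
n_occ_mono = sum(1 for row in mono for p in row if p > 50)
n_occ_poly = sum(1 for row in poly for p in row if p > 50)
```

Travel costs between blocks in such a grid would then be Euclidean distances between block centroids, as described above.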

To estimate the optimal travel distance $L$ in each city, we first assume that the in-block travel distance is a constant, $l_{\min} = 0.5$ km, and that the average travel distance within a service community is approximately $g_j^S \sqrt{a_{j,\mathrm{occ}}^S}$, where $g_j^S$ denotes the geometric factor of the community and $a_{j,\mathrm{occ}}^S$ denotes the area of its occupied blocks (17). Then $L$ is expressed as the sum of two terms, the first for the population in the $N$ blocks with facilities and the second for the population in the $N_{\max} - N$ blocks without facilities

$$L = \frac{1}{P}\left(l_{\min}\sum_{j=1}^{N} p_j + \sum_{j=1}^{N} g_j^S\, \tilde{p}_j^S \left(a_{j,\mathrm{occ}}^S\right)^{0.5}\right) \tag{3}$$

where $P$ is the total population of the city and $\tilde{p}_j^S$ denotes the population in the service community of the $j$th facility after removing the block where the $j$th facility is located, i.e., $\tilde{p}_j^S = p_j^S - p_j$. We find that $a_{j,\mathrm{occ}}^S$ follows a power law with respect to the total area of the community $a_j^S$ in most cities, that is, $a_{j,\mathrm{occ}}^S \propto (a_j^S)^{\gamma}$ (fig. S5A). We assume that $g_j^S$ is constant within each city, written as $g_{\mathrm{city}}$, and that $a_j^S \approx \overline{a^S} = \bar{a}\cdot N_{\max}/N$, with $\bar{a}$ denoting the average block area in the city. Then we can rewrite Eq. 3 as

$$L(N) = l_{\min}\cdot p(N) + A\cdot N^{-\lambda}\cdot\left(1 - p(N)\right) \tag{4}$$

where $p(N)$ denotes the share of the population in blocks with facilities, and $A$ and $\lambda$ are both constants. Further details of this derivation are given in note S4.1.

We further investigate how the share of the population in blocks without facilities relates to the number of facilities $N$ and find that $1 - p(N) \approx e^{-\alpha N}$ when $N \ll N_{\mathrm{occ}}$ (see details in fig. S5B and notes S4.2 and S4.3). We can thus model $L$ as

$$L(N) = l_{\min}\cdot\left(1 - e^{-\alpha N}\right) + A\cdot N^{-\lambda}\cdot e^{-\alpha N} \tag{5}$$

where the number of facilities $N$ is the main variable determining $L$. While $\alpha$ controls the relation between $p(N)$ and $N$, $A$ and $\lambda$ are two free parameters to be calibrated. The model $L(N)$ captures the fact that the only two essential ingredients for modeling $L$ are the number of facilities to allocate, $N$, and the distribution of the population in space.
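Eq. 5 is straightforward to evaluate numerically. The following sketch implements it as a function, with the in-block distance of 0.5 km as the default; the parameter values in the example are hypothetical and are not taken from table S2.

```python
import math

def travel_distance_model(n, alpha, A, lam, l_min=0.5):
    """Eq. 5: L(N) = l_min * (1 - exp(-alpha N)) + A * N**(-lam) * exp(-alpha N)."""
    decay = math.exp(-alpha * n)  # modeled population share in blocks without facilities
    return l_min * (1.0 - decay) + A * n ** (-lam) * decay

# Hypothetical parameters for illustration only
alpha, A, lam = 0.001, 10.0, 0.4
curve = [travel_distance_model(n, alpha, A, lam) for n in (1, 10, 100, 1000, 10000)]
```

For small $N$ the second term dominates and the modeled distance decays roughly as a power of $N$; for large $N$ the exponential factor vanishes and the distance saturates at the in-block distance.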

Next, we numerically compute the optimal distribution of facilities for varying numbers of facilities in both toy and real-world cities. We present the average travel distance $L$ versus the number of facilities $N$ in the toy and real-world cities in log-log plots in Fig. 5 (C and D). In Fig. 5C, we see that for the same $N$, the overall travel costs in polycentric cities are larger than in monocentric ones. To validate the proposed function $L(N)$, we first calibrate $\alpha$ by fitting $1 - p(N) = e^{-\alpha N}$ for each city (fig. S5B) and then calibrate the two free parameters $A$ and $\lambda$ in Eq. 5 with the simulated $L$. All parameters are presented in table S2. The fitted $L(N)$ values are shown as lines in Fig. 5 (C and D). The simulated and modeled $L$ values are presented separately for each city in fig. S6, showing good agreement under various empirical conditions.
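One possible way to carry out this two-step calibration is sketched below. The fitting procedure used in the study is not specified in this section, so both steps are assumptions: $\alpha$ is obtained as the least-squares slope through the origin of $-\log(1 - p(N))$ against $N$, and $A$ and $\lambda$ come from a log-linear fit after isolating the second term of Eq. 5. The data are synthetic, generated from Eq. 5 with hypothetical parameters.

```python
import math

def calibrate_alpha(ns, uncovered_share):
    """Least-squares slope through the origin of -log(1 - p(N)) against N."""
    num = sum(n * -math.log(q) for n, q in zip(ns, uncovered_share))
    den = sum(n * n for n in ns)
    return num / den

def calibrate_a_lambda(ns, dists, alpha, l_min=0.5):
    """Log-linear fit of A * N**(-lambda) to the isolated second term of Eq. 5."""
    xs, ys = [], []
    for n, dist in zip(ns, dists):
        decay = math.exp(-alpha * n)
        second = (dist - l_min * (1.0 - decay)) / decay  # A * N**(-lambda) in the model
        if second > 0:
            xs.append(math.log(n))
            ys.append(math.log(second))
    m = len(xs)
    mx, my = sum(xs) / m, sum(ys) / m
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return math.exp(my - slope * mx), -slope

# Synthetic data generated from Eq. 5 with hypothetical parameters
true_alpha, true_a, true_lam = 0.002, 8.0, 0.38
ns = [1, 2, 5, 10, 20, 50, 100, 200]
uncovered = [math.exp(-true_alpha * n) for n in ns]
dists = [0.5 * (1 - u) + true_a * n ** (-true_lam) * u for n, u in zip(ns, uncovered)]

alpha_fit = calibrate_alpha(ns, uncovered)
a_fit, lam_fit = calibrate_a_lambda(ns, dists, alpha_fit)
```

On noise-free input the procedure recovers the generating parameters. With simulated city data, a nonlinear least-squares fit of Eq. 5 would be a reasonable alternative.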

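To seek a universal function that approximates the simulated $L$ in diverse cities, we treat $\lambda$ in Eq. 5 as a constant, fixing it to its average empirical value $\bar{\lambda} = 0.382$ in the 12 real-world cities. Combining the observation that $N_{\max}$ is inversely proportional to $\alpha$ with $A \approx g_{\mathrm{city}}\,\bar{a}^{\bar{\lambda}} N_{\max}^{\bar{\lambda}}$ (note S4.1), we can expect that $A \propto \alpha^{-\bar{\lambda}}$. Figure S7C confirms this, showing that $A = 1.4443\,\alpha^{-\bar{\lambda}}$. We can then rewrite Eq. 5 as

$$L(N) = l_{\min}\cdot\left(1 - e^{-\alpha N}\right) + 1.4443\cdot(\alpha N)^{-\bar{\lambda}}\cdot e^{-\alpha N} \tag{6}$$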

This function, with only one free parameter $\alpha$, suggests that we can rescale $N$ by $\alpha$ to collapse the curves of $L$ in all cities into one, as shown in Fig. 5F, which depicts Eq. 6 as a solid line. The same rescaling of $N$ in toy cities is presented in Fig. 5E, where the collapse is not as good as in the real cities because of the divergent values of $\lambda$ for toy cities in table S2. Next, we go beyond the average distance $L$ and plot the distribution of travel distances when keeping $\alpha N$ fixed (fig. S7, F and G). In all cases, the travel distance follows a gamma distribution. This universality suggests that (i) given a certain $\alpha N$, all real-world cities can reach comparable accessibility and (ii) the overall accessibility in the optimal scenario depends not only on the availability of the resources but also on the settlement of the population, independently of the road network and total area of the city.

Empirically, the decay rate $\alpha$ of the population share in blocks without facilities depends on the population distribution in space. Because unpopulated blocks are not ideal when optimizing accessibility, $N_{\mathrm{occ}}$ is a better variable to express $\alpha$. A good agreement $\alpha = 1.833/N_{\mathrm{occ}}$ ($R^2 = 0.96$) over the 12 real-world cities is shown in Fig. 5G, suggesting that $\alpha$ can be estimated from $N_{\mathrm{occ}}$. Given that $\alpha = 1.833/N_{\mathrm{occ}}$ and the universal relation $L(\alpha N)$, we can explain the collapses found in Figs. 3C and 4 (C and D).

As a concrete application of this universal model for the optimal distance to facilities (Eq. 6), we can plan for facilities by, for example, extracting how many facilities are needed for varying levels of accessibility to a given type of service. In this context, the number of facilities $N$ can be estimated with the inverse function of Eq. 6. As the second term in Eq. 6 dominates $L$ for a limited $N$, we simply invert the second term to estimate $N$, given by

$$N(L;\alpha) = \frac{\bar{\lambda}}{\alpha}\cdot W\!\left(\frac{(L/1.4443)^{-1/\bar{\lambda}}}{\bar{\lambda}}\right)$$

where $W(\cdot)$ is the ProductLog or Lambert $W$ function (note S4.5). Figure 5H presents the estimated and simulated $N$ versus $L$ for two limiting cases: LA, in which the approximation agrees well with the simulation, and Barcelona, in which the approximation underestimates $N$. The results for the other real-world cities are depicted in fig. S8, showing, in general, good agreement between the analytical approximation via the Lambert $W$ function and the numerical simulations.
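The inversion above can be sketched in a few lines. Writing $x = \alpha N$, the second term of Eq. 6 gives $(L/1.4443)^{-1/\bar{\lambda}} = x\,e^{x/\bar{\lambda}}$, which the Lambert $W$ function solves for $x/\bar{\lambda}$. The example below implements the principal branch of $W$ with a plain Newton iteration so that it needs only the standard library. This solver, the function names, and the value of `n_occ` are assumptions made for illustration and are not taken from the paper.

```python
import math

LAMBDA_BAR = 0.382  # average empirical exponent reported in the text
PREFACTOR = 1.4443  # empirical constant in Eq. 6

def lambert_w(z, tol=1e-12, max_iter=100):
    """Principal branch of the Lambert W function for z >= 0 (Newton iteration)."""
    if z < 0:
        raise ValueError("this sketch only handles z >= 0")
    w = math.log1p(z)  # starting guess at or above the root when z >= 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w

def second_term(n, alpha):
    """Second term of Eq. 6, which dominates L for a limited number of facilities."""
    x = alpha * n
    return PREFACTOR * x ** (-LAMBDA_BAR) * math.exp(-x)

def facilities_needed(target_l, alpha):
    """Approximate N(L; alpha) by inverting the second term of Eq. 6."""
    z = (target_l / PREFACTOR) ** (-1.0 / LAMBDA_BAR) / LAMBDA_BAR
    return LAMBDA_BAR / alpha * lambert_w(z)

# Hypothetical city: alpha estimated from the number of occupied blocks
n_occ = 5000           # assumed value, for illustration only
alpha = 1.833 / n_occ  # empirical relation reported in the text
for target in (2.0, 1.0, 0.5):  # target average distance, in the units of L in Eq. 6
    print(target, round(facilities_needed(target, alpha)))
```

Because the first term of Eq. 6 is ignored, the estimate is only meaningful for a limited $N$, as noted in the text. Where SciPy is available, `scipy.special.lambertw` could replace the hand-written solver.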

## DISCUSSION

As cities differ in their form, economy, and population distribution, the interplay between population and facility distributions is challenging to plan. The accessibility of facilities is constrained by their availability, the road network, and the means of transportation. While efforts are devoted to managing daily commuting and transit-oriented developments, the planning of the distribution of different urban facilities deserves attention as part of a paradigm shift toward walkable cities. We present a framework that uses publicly available data to compare the optimal and the actual accessibility of various facility types at the resolution of urban blocks. This allows us to efficiently pinpoint blocks that are underserved, i.e., those where people have to travel longer distances to reach the facilities they need compared to the social optimum. By relocating the facilities to optimize the global travel distance, we find that the relation between facility and population densities follows the scaling law $d \propto \rho^{\beta}$ only in the limit of a small number of facilities, regardless of the differences in road network structures. The observed exponent $\beta$ is generally around 2/3 if the number of facilities is dilute, i.e., less than 10% of the number of occupied blocks, and it starts to decay for larger numbers of facilities. This confirms the continuous limit for a dilute number of facilities presented at the national scale (18). We observe that the empirical conditions within cities do not follow the continuous approximation for the power law with population density because facilities are not equally planned and the number of facilities is large in comparison with the number of populated blocks.

To gain further insights when the number of facilities is large, we analytically model the average travel distance $L$ in the optimal scenario as a function of the number of facilities $N$ and three parameters. Parameter $\alpha$ represents the decay rate of the population share in blocks without facilities, and the other two parameters can be approximated as constant among cities. A universal expression $L(\alpha N)$ is verified with 17 synthetic cities and 12 real-world cities depicting diverse urban forms. Furthermore, the travel distance to optimally distributed facilities follows a gamma distribution for all cities once $\alpha N$ is fixed. This function can be applied to estimate the number of facilities needed to offer services to people at a given level of accessibility on average. The results estimated with the derived function match well the numerical simulations, which require solving for the optimal distribution of facilities. When relating $\alpha$ to the urban form, we uncover that centralized cities require fewer facilities than polycentric cities to achieve the same levels of accessibility. This framework could be applied to optimally reallocate resources that provide emergency services, such as the placement of shelters, ambulances, or mobile petrol stations in the event of natural disasters.

The optimal planning of facilities in this work assumes that all residents need the resources equally and that accessibility is measured from their places of residence. In reality, socioeconomic segregation in cities results in heterogeneous needs for resources. Cities in different social systems and at different levels of economic development also exhibit different needs for various types of facilities, which would need to be taken into account for economic considerations. Moreover, people's needs are naturally dynamic and change in time and space owing to their time-varying mobility behavior. All these factors result in complex interactions between the allocation of facilities and the settlement of residents and are promising avenues for future research. Another important avenue is to consider the limited capacity of facilities in the optimal planning. This became ever more evident when distributing health care system resources during the outbreak of a pandemic, such as COVID-19 in 2020.

**Acknowledgments:** **Funding:** This work was supported by the QCRI-CSAIL, the Berkeley DeepDrive (BDD), and the University of California Institute of Transportation Studies (UC ITS) research grants. **Author contributions:** Y.X., L.E.O., S.A., and M.C.G. conceived the research and designed the analyses. Y.X. and S.A. collected the data. Y.X. and L.E.O. performed the analyses. M.C.G. and Y.X. wrote the paper. M.C.G. supervised the research. **Competing interests:** The authors declare that they have no competing interests. **Data and materials availability:** All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.