Research in programming Wikidata/Cities
This article studies four types of cities, corresponding to four Wikidata objects: "Town", "City", "Big city" and "City with millions of inhabitants". SPARQL queries to Wikidata were used to count the instances of each object and to gather the following information:
- Population of different types of cities
- Number of cities without sister cities
- List of cities ordered by number of sister cities
- Number of cities with a given number of sister cities
- Country with the most sister cities
- Closest neighbours of Russia by number of sister cities
Item lists
"Town"
- Wikidata element: Q3957
SELECT ?city ?cityLabel WHERE {
?city wdt:P31 wd:Q3957.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
SPARQL query, 13800 records (2020).
"City"
- Wikidata element: Q515
SELECT ?city ?cityLabel WHERE {
?city wdt:P31 wd:Q515.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
SPARQL query, 20800 records (2017), 9260 records (2020).
The most complete elements include San Francisco, Berlin, Petrozavodsk, …
Almost empty elements include Madinat Zayed, Muzaffarpur, Willow River, …
According to ProWD, Singapore leads cities worldwide in the number of properties (104). Novorossiysk has 31 properties, the maximum for Russian cities.
"Big city"
- Wikidata element: Q1549591
SELECT ?city ?cityLabel WHERE {
?city wdt:P31 wd:Q1549591.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
SPARQL query, 198 records (2017), 3075 records (2020).
The most complete elements include Bern, Berlin, Geneva, …
Almost empty elements include Balanga (Nigeria), Ungaran, Kayes, …
According to ProWD, Singapore leads big cities worldwide in the number of properties (104). Moscow has 76 properties, the maximum for Russian big cities.
"City with millions of inhabitants"
- Wikidata element: Q1637706
SELECT ?city ?cityLabel WHERE {
?city wdt:P31 wd:Q1637706.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
SPARQL query, 616 records (2020).
Different types of cities
SELECT ?city ?cityLabel WHERE { # Selecting items which are ...
{ ?city wdt:P31 wd:Q3957 } UNION # ... instances of "town" ...
{ ?city wdt:P31 wd:Q515 } UNION # ... instances of "city" ...
{ ?city wdt:P31 wd:Q1549591 } UNION # ... instances of "big city" ...
{ ?city wdt:P31 wd:Q1637706 } # ... instances of "city with millions of inhabitants"
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
SPARQL query, 26751 records (2020).
Population
"Town"
Used:
- Object: town (Q3957)
- Property: instance of (P31)
- Property: population (P1082)
SELECT (SUM(?population_city) as ?sum) WHERE { # Selecting total population of items which are
SELECT (MAX(xsd:integer(REPLACE(STR(?population),"\\.",""))) as ?population_city) ?city WHERE {
?city wdt:P31 wd:Q3957. # ... instances of "town" ...
?city wdt:P1082 ?population # ... with filled property "population"
}
GROUP BY ?city
}
SPARQL query, 53,30 million people (2020).
"City"
Used:
- Object: city (Q515)
- Property: instance of (P31)
- Property: population (P1082)
SELECT (SUM(?population_city) as ?sum) WHERE { # Selecting total population of items which are
SELECT (MAX(xsd:integer(REPLACE(STR(?population),"\\.",""))) as ?population_city) ?city WHERE {
?city wdt:P31 wd:Q515. # ... instances of "city" ...
?city wdt:P1082 ?population # ... with filled property "population"
}
GROUP BY ?city
}
SPARQL query, 1 133,56 million people (2020).
"Big city"
Used:
- Object: big city (Q1549591)
- Property: instance of (P31)
- Property: population (P1082)
SELECT (SUM(?population_city) as ?sum) WHERE { # Selecting total population of items which are
SELECT (MAX(xsd:integer(REPLACE(STR(?population),"\\.",""))) as ?population_city) ?city WHERE {
?city wdt:P31 wd:Q1549591. # ... instances of "big city" ...
?city wdt:P1082 ?population # ... with filled property "population"
}
GROUP BY ?city
}
SPARQL query, 2 538,49 million people (2020).
"City with millions of inhabitants"
Used:
- Object: city with millions of inhabitants (Q1637706)
- Property: instance of (P31)
- Property: population (P1082)
SELECT (SUM(?population_city) as ?sum) WHERE { # Selecting total population of items which are
SELECT (MAX(xsd:integer(REPLACE(STR(?population),"\\.",""))) as ?population_city) ?city WHERE {
?city wdt:P31 wd:Q1637706. # ... instances of "city with millions of inhabitants" ...
?city wdt:P1082 ?population # ... with filled property "population"
}
GROUP BY ?city
}
SPARQL query, 2 118,39 million people (2020).
Analysis
Different countries use different characters (point, comma, or space) as digit-group separators, so the value of the population property can be written in different ways. The point is problematic, because in Wikidata this character separates the integer and fractional parts of a number. To disambiguate, the queries above use the REPLACE function to remove the point before the value is converted to an integer. This conversion does not affect the value itself, since a population is an integer and the separators are used solely for ease of reading.
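The effect of this conversion can be sketched outside SPARQL. The Python snippet below is an illustration only, not part of the original study: the helper names `population_to_int` and `max_population_per_city` and the sample rows are assumptions. It mirrors `xsd:integer(REPLACE(STR(?population), "\\.", ""))` and then keeps the largest value per city, as `MAX` with `GROUP BY ?city` does in the queries.

```python
import re


def population_to_int(raw: str) -> int:
    """Remove every point and parse the rest as an integer.

    Mirrors xsd:integer(REPLACE(STR(?population), "\\.", "")).
    """
    return int(re.sub(r"\.", "", raw))


def max_population_per_city(rows):
    """Keep the largest population value per city (MAX ... GROUP BY ?city)."""
    best = {}
    for city, raw in rows:
        value = population_to_int(raw)
        best[city] = max(best.get(city, value), value)
    return best


# Hypothetical sample rows, not real Wikidata values.
sample_rows = [
    ("city A", "1.234.567"),
    ("city A", "1200000"),
    ("city B", "45.300"),
]

populations = max_population_per_city(sample_rows)
total = sum(populations.values())
print(populations, total)
```

The same replacement would distort a value with a genuine fractional part (the string "12.5" would become 125), so it is safe here only because a population is an integer.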
The table below summarizes the population of each type of city and its share of the world population, which reached approximately 7,8 billion people in 2020[1]. According to these Wikidata totals, almost three quarters of the world's population live in cities. Objects assigned to several types are counted once under each type, so the total includes some double counting.
City type | Population (million people) | % of world |
---|---|---|
"Town" | 53,30 | 0,7 % |
"City" | 1 133,56 | 14,5 % |
"Big city" | 2 538,49 | 32,5 % |
"City with millions of inhabitants" | 2 118,39 | 27,1 % |
Total | 5 843,74 | 74,8 % |
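The shares in the table can be recomputed with a few lines of Python. This is a sketch under one stated assumption: it uses the rounded world population of 7,8 billion quoted in the text, while the exact figure used for the table is not given. The names `population_millions` and `share_of_world` are illustrative.

```python
# Totals per city type in million people, copied from the table above.
population_millions = {
    "Town": 53.30,
    "City": 1133.56,
    "Big city": 2538.49,
    "City with millions of inhabitants": 2118.39,
}

# Assumption: the rounded world population quoted in the text (7.8 billion).
WORLD_POPULATION_MILLIONS = 7800.0


def share_of_world(millions: float) -> float:
    """Return the share of the world population in percent."""
    return 100.0 * millions / WORLD_POPULATION_MILLIONS


total_millions = sum(population_millions.values())
shares = {name: share_of_world(value) for name, value in population_millions.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1f} %")
print(f"Total: {total_millions:.2f} million, {share_of_world(total_millions):.1f} %")
```

Because 7,8 billion is itself a rounded figure, the last digit of a share computed this way can differ by 0,1 from the value in the table.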
Sister cities
Sister cities are cities in different countries that have established permanent friendly relations with each other in order to strengthen international ties in culture, economics, the creation and management of urban infrastructure, the functioning of civil society, and so on[2].
How many cities don't have a single sister city?
Used:
- Object: town (Q3957)
- Object: city (Q515)
- Object: big city (Q1549591)
- Object: city with millions of inhabitants (Q1637706)
- Property: instance of (P31)
- Property: sister city (P190)
SELECT (COUNT(?city) as ?count) WHERE { # Counting items which are ...
{ ?city wdt:P31 wd:Q3957 } UNION # ... instances of "town" ...
{ ?city wdt:P31 wd:Q515 } UNION # ... OR instances of "city" ...
{ ?city wdt:P31 wd:Q1549591 } UNION # ... OR instances of "big city" ...
{ ?city wdt:P31 wd:Q1637706 } # ... OR instances of "city with millions of inhabitants"
FILTER NOT EXISTS { ?city wdt:P190 [] } # ... with unfilled property "sister city"
}
SPARQL query, 21479 cities (2020).
Wikidata knew 26751 cities of the four types in 2020. Thus, sister cities are recorded for only about 20% of them.
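The 20% figure follows from the two query results. A minimal Python check, with illustrative constant names:

```python
TOTAL_CITIES = 26751           # "Different types of cities" query result (2020)
WITHOUT_SISTER_CITIES = 21479  # result of the query above (2020)

with_sister_cities = TOTAL_CITIES - WITHOUT_SISTER_CITIES
share_percent = 100.0 * with_sister_cities / TOTAL_CITIES

print(f"{with_sister_cities} cities ({share_percent:.1f} %) have a sister city recorded")
```

Both counts come from UNION queries without DISTINCT, so an object assigned to several city types is counted once per type in each of them.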
List of cities ordered by number of sister cities
All
Used:
- Object: town (Q3957)
- Object: city (Q515)
- Object: big city (Q1549591)
- Object: city with millions of inhabitants (Q1637706)
- Property: instance of (P31)
- Property: sister city (P190)
SELECT ?city ?cityLabel (COUNT(?sister) AS ?sisterCount) WHERE { # Counting sister cities of cities which are ...
{ ?city wdt:P31 wd:Q3957 } UNION # ... instances of "town" ...
{ ?city wdt:P31 wd:Q515 } UNION # ... OR instances of "city" ...
{ ?city wdt:P31 wd:Q1549591 } UNION # ... OR instances of "big city" ...
{ ?city wdt:P31 wd:Q1637706 } # ... OR instances of "city with millions of inhabitants"
?city wdt:P190 ?sister. # ... with filled property "sister city"
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?city ?cityLabel # Grouping by city
ORDER BY DESC(?sisterCount) # Sorting by number of sister cities (descending)
SPARQL query, 4046 cities with sister cities (2020).
Russia
Used:
- Object: town (Q3957)
- Object: city (Q515)
- Object: big city (Q1549591)
- Object: city with millions of inhabitants (Q1637706)
- Object: Russia (Q159)
- Property: instance of (P31)
- Property: country (P17)
- Property: sister city (P190)
SELECT ?city ?cityLabel (COUNT(?sister) AS ?sisterCount) WHERE { # Counting sister cities of cities which are ...
VALUES ?cityTypes {wd:Q3957 wd:Q515 wd:Q1549591 wd:Q1637706}
?city wdt:P31 ?cityTypes. # ... instances of different types of cities ...
?city wdt:P17 wd:Q159. # ... belonging to Russia ...
?city wdt:P190 ?sister. # ... with filled property "sister city"
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?city ?cityLabel # Grouping by city
ORDER BY DESC(?sisterCount) # Sorting by number of sister cities (descending)
SPARQL query, 82 cities with sister cities (2020).
In 2020, more cities were twinned with the cultural capital of Russia (Saint Petersburg, 230 sister cities) than with the official capital (Moscow, 134 sister cities). Omsk (58), Volgograd (56) and Kaliningrad (54) had almost the same number of sister cities. Petrozavodsk, Perm, Vladimir and Belgorod each had 14 sister cities.
Number of cities with a given number of sister cities
All
Used:
- Object: town (Q3957)
- Object: city (Q515)
- Object: big city (Q1549591)
- Object: city with millions of inhabitants (Q1637706)
- Property: instance of (P31)
- Property: sister city (P190)
#defaultView:LineChart # Show the result as a line chart
SELECT ?sisterCount (COUNT(?sisterCount) AS ?FreqNSister) WHERE { # Count No. of cities having ?sisterCount sister cities
# and number of sister cities themselves
{
SELECT (COUNT(?sister) AS ?sisterCount) WHERE { # Count sister cities of cities which are ...
VALUES ?cityTypes {wd:Q3957 wd:Q515 wd:Q1549591 wd:Q1637706}
?city wdt:P31 ?cityTypes. # ... instances of different types of cities ...
?city wdt:P190 ?sister. # ... with filled property "sister city"
}
GROUP BY ?city # Group list by city
}
}
GROUP BY ?sisterCount # Group by number of sister cities
ORDER BY DESC(?sisterCount) # Order by number of sister cities (descending)
SPARQL query, 90 distinct values of the number of sister cities (2020).
A little more than four thousand cities (4046 cities) have at least one sister city, of which:
- 32% (1314 cities) have more than five sister cities;
- 18% (728 cities) have at least 11 sister cities;
- 9% (345 cities) have more than 20 sister cities;
- 2% (94 cities) have 50 or more sister cities.
The number of cities that have a given number of sister cities appears to follow a distribution close to a power law in that number.
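One hedged way to examine such a claim is to fit a straight line on a log-log scale, since a power law y = c * x^(-a) becomes a line with slope -a there. The sketch below is not from the original study: `loglog_slope` is a hypothetical helper, and the data are synthetic, generated from an exact power law, not the Wikidata result.

```python
import math


def loglog_slope(points):
    """Least-squares slope of log(y) against log(x) for (x, y) pairs."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Synthetic data: an exact power law y = 1000 * x^(-2), NOT the Wikidata result.
synthetic = [(k, 1000.0 * k ** -2) for k in range(1, 21)]
slope = loglog_slope(synthetic)
print(f"estimated exponent: {-slope:.3f}")
```

With the real result, `points` would be the (?sisterCount, ?FreqNSister) pairs returned by the query. A roughly straight log-log plot would be consistent with the power-law reading, although a formal test needs more than a line fit.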
Russia
Used:
- Object: town (Q3957)
- Object: city (Q515)
- Object: big city (Q1549591)
- Object: city with millions of inhabitants (Q1637706)
- Object: Russia (Q159)
- Property: instance of (P31)
- Property: country (P17)
- Property: sister city (P190)
#defaultView:LineChart # Show the result as a line chart
SELECT ?sisterCount (COUNT(?sisterCount) AS ?FreqNSister) WHERE { # Count No. of cities having ?sisterCount sister cities
# and number of sister cities themselves
{
SELECT (COUNT(?sister) AS ?sisterCount) WHERE { # Count sister cities of cities which are ...
VALUES ?cityTypes {wd:Q3957 wd:Q515 wd:Q1549591 wd:Q1637706}
?city wdt:P31 ?cityTypes. # ... instances of different types of cities ...
?city wdt:P17 wd:Q159. # ... belonging to Russia ...
?city wdt:P190 ?sister. # ... with filled property "sister city"
}
GROUP BY ?city # Group list by city
}
}
GROUP BY ?sisterCount # Group by number of sister cities
ORDER BY DESC(?sisterCount) # Order by number of sister cities (descending)
SPARQL query, 24 distinct values of the number of sister cities (2020).
Slightly fewer than a hundred Russian cities (82 cities) have at least one sister city, and only 48% of them (39 cities) have more than five sister cities.
Which country has the most sister cities?
Used:
- Object: town (Q3957)
- Object: city (Q515)
- Object: big city (Q1549591)
- Object: city with millions of inhabitants (Q1637706)
- Property: instance of (P31)
- Property: country (P17)
- Property: sister city (P190)
#defaultView:BubbleChart
SELECT ?countryLabel (COUNT(?sister) as ?sisterCount) WHERE { # Selecting number of distinct sister cities of particular country cities which are ...
SELECT DISTINCT ?countryLabel ?sister WHERE {
VALUES ?cityTypes {wd:Q3957 wd:Q515 wd:Q1549591 wd:Q1637706}
?city wdt:P31 ?cityTypes. # ... instances of different types of cities ...
?city wdt:P17 ?country. # ... with filled property "country" ...
?city wdt:P190 ?sister. # ... with filled property "sister city"
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
}
GROUP BY ?countryLabel
ORDER BY DESC(?sisterCount)
SPARQL query, 208 countries (2020).
In 2020, Germany had the largest number of sister cities (1375 cities).
List of countries having sister cities with Germany
Used:
- Object: town (Q3957)
- Object: city (Q515)
- Object: big city (Q1549591)
- Object: city with millions of inhabitants (Q1637706)
- Object: Germany (Q183)
- Property: instance of (P31)
- Property: country (P17)
- Property: sister city (P190)
SELECT ?country ?countryLabel (COUNT(DISTINCT ?sister) as ?sisterCount) WHERE {
# Selecting number of distinct particular country sister cities of cities which are ...
VALUES ?cityTypes {wd:Q3957 wd:Q515 wd:Q1549591 wd:Q1637706}
?city wdt:P31 ?cityTypes. # ... instances of different types of cities ...
?city wdt:P17 wd:Q183. # ... belonging to Germany ...
?city wdt:P190 ?sister. # ... with filled property "sister city" which are ...
?sister wdt:P17 ?country. # ... with filled property "country" ...
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?country ?countryLabel
ORDER BY DESC(?sisterCount)
SPARQL query, 93 countries (2020).
The table lists the ten countries with the largest number of sister cities of German cities (2020).
# | Country | Number of sister cities | % of total |
---|---|---|---|
1 | France | 247 | 18,0 % |
2 | Germany | 195 | 14,2 % |
3 | United Kingdom | 120 | 8,7 % |
4 | Italy | 86 | 6,3 % |
5 | Poland | 81 | 5,9 % |
6 | United States of America | 60 | 4,4 % |
7 | Austria | 41 | 3,0 % |
8 | Russia | 39 | 2,8 % |
9 | Hungary | 39 | 2,8 % |
10 | Belgium | 33 | 2,4 % |
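The "% of total" column can be reproduced from the counts, assuming it is taken relative to the 1375 sister cities reported for Germany above. That reading is an inference from the numbers, not something the text states, and the names in this Python sketch are illustrative.

```python
GERMANY_TOTAL = 1375  # sister cities of German cities reported above (2020)

# Counts copied from the table above.
sister_cities_by_country = {
    "France": 247,
    "Germany": 195,
    "United Kingdom": 120,
    "Italy": 86,
    "Poland": 81,
    "United States of America": 60,
    "Austria": 41,
    "Russia": 39,
    "Hungary": 39,
    "Belgium": 33,
}


def percent_of_total(count: int, total: int = GERMANY_TOTAL) -> float:
    """Share of `count` in `total`, in percent, rounded to one decimal."""
    return round(100.0 * count / total, 1)


shares = {country: percent_of_total(n) for country, n in sister_cities_by_country.items()}
for country, share in shares.items():
    print(f"{country}: {share} %")
```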
Closest neighbours of Russia by number of sister cities
Used:
- Object: town (Q3957)
- Object: city (Q515)
- Object: big city (Q1549591)
- Object: city with millions of inhabitants (Q1637706)
- Object: Russia (Q159)
- Property: instance of (P31)
- Property: country (P17)
- Property: sister city (P190)
- Property: geoshape (P3896)
#defaultView:Map
SELECT ?country ?countryLabel ?sisterCount ?shape ?layer WHERE {
{ # Selecting number of distinct particular country sister cities of cities which are ...
SELECT ?country ?countryLabel (COUNT(DISTINCT ?sister) as ?sisterCount) WHERE {
VALUES ?cityTypes {wd:Q3957 wd:Q515 wd:Q1549591 wd:Q1637706}
?city wdt:P31 ?cityTypes. # instances of different types of cities
?city wdt:P17 wd:Q159. # city belongs to Russia
?city wdt:P190 ?sister. # city has "sister city"
?sister wdt:P17 ?country. # which belongs to "country"
FILTER(?country NOT IN(wd:Q159)) # excluding Russia
SERVICE wikibase:label {bd:serviceParam wikibase:language "en"}
}
GROUP BY ?country ?countryLabel
ORDER BY DESC(?sisterCount)
}
OPTIONAL {?country wdt:P3896 ?shape.} # country has "geoshape"
BIND(
IF(?sisterCount < 5, "<5",
IF(?sisterCount <= 10, "5-10",
IF(?sisterCount <= 20, "11-20",
IF(?sisterCount <= 30, "21-30",
IF(?sisterCount <= 40, "31-40",
">40"))))) AS ?layer).
}
SPARQL query, 102 countries (2020).
Russian cities have more than twenty sister cities in each of the following countries: United States of America (46), China (46), Germany (44), Ukraine (28), Bulgaria (25), Poland (24), France (23) and Italy (22).
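The nested IF expression in the query assigns each country to a map layer. The same thresholds can be written as a small Python function. This is a sketch for illustration; the function name `layer` is an assumption, and the sample counts are taken from the sentence above.

```python
def layer(sister_count: int) -> str:
    """Map a number of sister cities to the layer label used in the query."""
    if sister_count < 5:
        return "<5"
    if sister_count <= 10:
        return "5-10"
    if sister_count <= 20:
        return "11-20"
    if sister_count <= 30:
        return "21-30"
    if sister_count <= 40:
        return "31-40"
    return ">40"


# Counts quoted in the text above.
examples = {"United States of America": 46, "Germany": 44, "Ukraine": 28, "Italy": 22}
layers = {country: layer(count) for country, count in examples.items()}
print(layers)
```

The order of the checks matters: each branch is reached only when all earlier conditions have failed, which is why the upper bounds alone are enough to define the ranges.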
Wikidata completeness and disadvantages
A city is a type of human settlement whose inhabitants are mostly not engaged in agriculture. However, different countries use different criteria when assigning city status to settlements, the main one being population. Some countries do not define the term "city" at all. In France, for example, only one geographic unit of this kind is used, the commune, regardless of how many people live in it and what they do. It can therefore be difficult to determine clearly which settlements count as cities.
In practice, a Wikidata object can be an instance of several types of cities at once. For example, Shanghai is assigned to three of the objects under study: city, big city, and city with millions of inhabitants. Such multiple assignment affects the results of SPARQL queries, in particular those using the UNION construction. This can be verified by running, for example, the SPARQL query for different types of cities: Shanghai appears in its results three times.
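The duplication can be illustrated with a few rows in Python. The rows below are illustrative: the three Shanghai assignments come from the text, while treating Petrozavodsk as having a single type is an assumption made for the example.

```python
from collections import Counter

# Illustrative (city, type) rows, shaped like the output of a UNION query.
rows = [
    ("Shanghai", "Q515"),        # city
    ("Shanghai", "Q1549591"),    # big city
    ("Shanghai", "Q1637706"),    # city with millions of inhabitants
    ("Petrozavodsk", "Q515"),    # assumed to have a single type here
]

rows_per_city = Counter(city for city, _ in rows)
distinct_cities = set(rows_per_city)

print(f"{len(rows)} rows, {len(distinct_cities)} distinct cities")
print(rows_per_city.most_common(1))
```

In SPARQL, the corresponding remedy would be to deduplicate explicitly, for example with SELECT DISTINCT or COUNT(DISTINCT ?city), in the same way that COUNT(DISTINCT ?sister) is used in the country queries above.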
Wikidata has an inheritance mechanism expressed by the subclass of property (P279). Under this mechanism, if an object is an instance of big city, then it is also an instance of city, since big city is a subclass of city. The situation with Shanghai described above can thus be resolved by keeping only one class, city with millions of inhabitants. Note, however, that replacing the UNION construction with a subclass construction does not give an equivalent query.
# Selecting items which are ...
SELECT ?city ?cityLabel WHERE {
# ... instances of "city" ...
{ ?city wdt:P31 wd:Q515 } UNION
# ... instances of "big city" ...
{ ?city wdt:P31 wd:Q1549591 } UNION
# ... instances of "city with millions of inhabitants"
{ ?city wdt:P31 wd:Q1637706 }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
# Selecting items which are ...
SELECT ?city ?cityLabel WHERE {
# ... instances of "city" subclasses
?city wdt:P31/wdt:P279* wd:Q515
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Shanghai, considered earlier, appears four times in the results of the new query. The reason is that, besides some of the objects under study, other classes also inherit from city, for example lost city, free imperial city, autonomous city and even ideal city.
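The difference between the two queries can be sketched with a toy class hierarchy in Python. Everything here is a simplified model, not Wikidata data: only "big city is a subclass of city" and "lost city inherits from city" come from the text, the parent of "city with millions of inhabitants" is an assumption, and "Hypothetical ruin" is an invented object. The toy does not reproduce the fourth Shanghai row, because the text does not name the class responsible for it.

```python
# Toy hierarchy: class -> direct parent classes ("subclass of").
SUBCLASS_OF = {
    "big city": ["city"],
    "city with millions of inhabitants": ["big city"],  # assumption
    "lost city": ["city"],
}

# Toy "instance of" statements; "Hypothetical ruin" is an invented object.
INSTANCE_OF = {
    "Shanghai": ["city", "big city", "city with millions of inhabitants"],
    "Hypothetical ruin": ["lost city"],
}

UNION_CLASSES = {"city", "big city", "city with millions of inhabitants"}


def reaches(cls: str, target: str) -> bool:
    """True if `target` is reachable from `cls` in zero or more subclass steps."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current == target:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, []))
    return False


def union_rows(item: str) -> list:
    """Classes of `item` matched by the UNION of the three listed classes."""
    return [c for c in INSTANCE_OF[item] if c in UNION_CLASSES]


def path_rows(item: str, target: str = "city") -> list:
    """Classes of `item` matched by instance of / subclass of* `target`."""
    return [c for c in INSTANCE_OF[item] if reaches(c, target)]


for item in INSTANCE_OF:
    print(item, len(union_rows(item)), len(path_rows(item)))
```

In this model the subclass query returns roughly one row per class of the object that leads to city, so it both keeps the duplicates and picks up objects, such as the invented ruin, that the UNION query never matches.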
Also, probably because of the ambiguous criteria for assigning city status, subclasses were created for specific countries: city in Chile, city in Cyprus, city of Japan and so on. Russian cities did not escape this tendency, which can be seen by comparing the results of the SPARQL query for instances of the "City" object. As of 2020, most of them belong to the city/town class.
According to the Russian Census (2010)[3] and the Crimean Federal District Census (2014)[4], the total number of Russian cities was 1117 in 2014. All cities in Russia have an article in both the Russian and the English Wikipedia.
The number of Wikidata elements that are Russian cities is 1126[5]. It can therefore be assumed that Wikidata covers at least Russian cities completely.
Future work
- Construct a graph of Russian sister cities.
- Get a list of Russian cities situated beyond the Arctic Circle.
- Find the river in Russia with the largest number of cities located on it.
- Find the country with the largest proportion of sister cities within the country relative to the number of sister cities linking it to other countries.
Addon
References
- Menshikova E. (2020). "Cities in Russia". ProWD.
- Menshikova E. (2020). "Big cities in Russia". ProWD.
Links
1. Alla Salkova (16 April 2020). "Growth is not endless: what awaits the world's population by the end of the century" (in Russian). Gazeta.Ru. Retrieved 2020-12-06.
2. The International Association "Sister cities" (2018). "Sister cities" (in Russian). Retrieved 2020-12-06.
3. Federal State Statistics Service (2010). "Russian Census - 2010" (pdf) (in Russian). Retrieved 2017-03-08.
4. Federal State Statistics Service (2014). "Crimean Federal District Census - 2014" (pdf) (in Russian). Retrieved 2020-11-05.
5. "Wikidata Query Service". Retrieved 2020-11-05.
6. Andrew Krizhanovsky, Alexander Kurbeev (2017-03-13). "Исследование городов-побратимов с использованием Викиданных - Authorea" [A study of sister cities using Wikidata] (in Russian). Retrieved 2017-05-03.