Research in programming Wikidata/Schools
A school is an educational institution for general education. This article studies schools on the basis of Wikidata.
SPARQL queries are used to study schools. These queries work with Wikidata objects of the type "school". A list of all schools described in Wikidata has been built. The article analyzes how completely Wikidata covers schools by comparing official country statistics with Wikidata. It also presents a map of the locations of Russian schools and a linear diagram showing the number of famous students of each school, both based on the information available in Wikidata. The map shows that the schools are located mainly in Moscow and St. Petersburg. According to the linear diagram, most schools have two famous students.
Instances of the "School"
- Object: school (Q3914).
- Property: instance of (P31).
Let's build a list of all schools.
#List of schools
SELECT ?school ?schoolLabel
WHERE
{
?school wdt:P31 wd:Q3914.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
SPARQL query, 18615 results.
Examples of instances of the object "School", which are fully completed:
Examples of instances of the object "School", which were poorly completed in the past:
Completeness of Wikidata
According to the Russian Federal State Statistics Service (Rosstat), as of 2016 there were 42,600 schools in Russia [1]. According to a SPARQL query, Wikidata lists only 82 schools in Russia, which is a huge gap between the Rosstat data and Wikidata.
The same situation can be observed for other countries. For example, according to another SPARQL query, Wikidata lists 20 schools in the USA, whereas official statistics report 28,220 schools as of 2008 [2].
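The per-country counts quoted above come from linked queries that are not reproduced in the text. A minimal sketch of such a count, assuming the same pattern as the list query (instance of school, Q3914) combined with the property country (P17), might look like the following. The variable names are illustrative, Q30 is used here for the United States, and the numbers returned today will differ from those quoted.

```sparql
# Sketch: number of school items per country (Russia and the United States)
SELECT ?country ?countryLabel (COUNT(DISTINCT ?school) AS ?schools)
WHERE
{
  VALUES ?country { wd:Q159 wd:Q30 }   # Russia, United States
  ?school wdt:P31 wd:Q3914 ;           # instance of school
          wdt:P17 ?country .           # country
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?country ?countryLabel
```

COUNT(DISTINCT ?school) is used so that an item with several matching statements is counted only once.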
It is difficult to estimate the completeness of Wikidata because a large number of the schools in Wikidata (4,629) are not assigned to any country (SPARQL query). Such schools account for about 25% of all schools in Wikidata. As a result, these schools cannot be attributed to a country, and the official country data cannot be compared with Wikidata for them.
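The query behind the figure for schools without a country is also only linked, not shown. One possible way to obtain such a count, sketched here under the assumption that "no country" means the item has no country (P17) statement at all, is:

```sparql
# Sketch: number of school items that have no country (P17) statement
SELECT (COUNT(DISTINCT ?school) AS ?schoolsWithoutCountry)
WHERE
{
  ?school wdt:P31 wd:Q3914 .                 # instance of school
  FILTER NOT EXISTS { ?school wdt:P17 [] }   # no country value
}
```

Dividing this result by the total number of school items gives the share mentioned above (4,629 of 18,615 is about 25%).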
Filling out Wikidata
It was decided to fill in the property "pupils (P802)" for Russian schools.
This property lists a school's famous students, that is, people who are notable in history. Consider the Derzhavinsky lyceum as an example. A famous student of this lyceum is G.I. Shirshina, a Russian political and public activist.
Let's determine how many Russian schools did not have the property "pupils (P802)" before the task was completed.
SELECT ?school WHERE {
?school wdt:P31 wd:Q3914. # school
?school wdt:P17 wd:Q159. # Russian schools
FILTER NOT EXISTS { ?school wdt:P802 []} # empty property "student"
}
According to the SPARQL query, there are 82 schools without famous students.
In addition, seven Russian schools that have famous students (according to Wikipedia) but lacked the property "country (P17)" were found on Wikidata. The property "country (P17)" was therefore filled in for these objects first. After that, the number of Russian schools in Wikidata increased by 7, to 89.
While filling in the property, it was discovered that a couple of schools had an incorrect value of Russia for the property "country (P17)". This was fixed, after which the number of Russian schools decreased.
The English label was also filled in for the famous students linked through the property "pupils (P802)".
The result of the work: the property "pupils (P802)" has been filled in for 45 Russian schools. The other schools either had no famous students or were not described in Wikipedia. Let's get the list of Russian schools with the property "pupils (P802)" filled in using the following query:
SELECT ?school ?schoolLabel (count(*) as ?countStudents) WHERE {
?school wdt:P31 wd:Q3914. # the object is the school
{ ?school wdt:P17 wd:Q34266 } UNION # Russian Empire
{ ?school wdt:P17 wd:Q15180 } UNION # Soviet Union
{ ?school wdt:P17 wd:Q159 }. # Russia
?school wdt:P802 ?student.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?school ?schoolLabel
SPARQL query, 43 objects, i.e. 43 Russian schools with famous students.
Let's construct a linear diagram of the number of famous students of each such school:
The diagram above shows the division of schools into three groups: one famous student (five schools), two famous students (36 schools) and three famous students (one school: school number 1212).
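The grouping of schools by number of famous students can also be computed directly rather than read off a diagram. The sketch below is one possible formulation, not the query used for the original diagram: an inner query counts students per school, and the outer query counts how many schools share each student count. The variable names are illustrative.

```sparql
# Sketch: how many Russian schools have 1, 2, 3, ... famous students
SELECT ?countStudents (COUNT(?school) AS ?schools)
WHERE
{
  {
    # Inner query: number of distinct students per school
    SELECT ?school (COUNT(DISTINCT ?student) AS ?countStudents)
    WHERE
    {
      VALUES ?country { wd:Q34266 wd:Q15180 wd:Q159 }  # Russian Empire, Soviet Union, Russia
      ?school wdt:P31 wd:Q3914 ;    # instance of school
              wdt:P17 ?country ;    # country
              wdt:P802 ?student .   # famous student
    }
    GROUP BY ?school
  }
}
GROUP BY ?countStudents
ORDER BY ?countStudents
```

COUNT(DISTINCT ?student) is used in the inner query so that a school carrying more than one of the three country values does not have its students counted twice.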
Let's get a map of Russian schools with famous graduates using the query below:
#defaultView:Map
SELECT ?school ?location WHERE {
#the object is the school
?school wdt:P31 wd:Q3914.
{ ?school wdt:P17 wd:Q34266 } UNION # Russian Empire
{ ?school wdt:P17 wd:Q15180 } UNION # Soviet Union
{ ?school wdt:P17 wd:Q159 }. # Russia
?school wdt:P802 ?student.
?school wdt:P625 ?location.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
On the map above, the schools are located mainly in Moscow and St. Petersburg. There are also many schools near Rostov-on-Don.
Thus, it was not possible to find information about famous graduates for 41 Russian schools. (SPARQL query)
Future work
- Output the name of the country with the largest number of schools that have a logo.
- Display a map of the schools that have existed for more than 200 years.
- Construct a graph of the domain zones of the official websites of schools.
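As a starting point for the first task, a possible query is sketched below. It assumes that "has a logo" means the item has a logo image (P154) statement and that the country is taken from country (P17). Both are choices made for this sketch, and the variable names are illustrative.

```sparql
# Sketch: country with the largest number of schools that have a logo
SELECT ?country ?countryLabel (COUNT(DISTINCT ?school) AS ?schools)
WHERE
{
  ?school wdt:P31 wd:Q3914 ;   # instance of school
          wdt:P154 ?logo ;     # logo image (assumed meaning of "logo")
          wdt:P17 ?country .   # country
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?country ?countryLabel
ORDER BY DESC(?schools)
LIMIT 1
```

Removing LIMIT 1 returns the full ranking, which is useful for checking how close the leading countries are.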
Exercises
SPARQL queries with replies:
See also
References
- [1] Number of state and municipal general education organizations 2016.
- [2] Digest of Education Statistics 2011.
- Rosstat (2016-04-18). "Number of state and municipal general education organizations". Rosstat. Retrieved 2017-11-24.
- Thomas D. Snyder (June 2011). Digest of Education Statistics (PDF). p. 61. http://files.eric.ed.gov/fulltext/ED544580.pdf.