Research in programming Wikidata/Operating systems

This article explores the Wikidata object "operating system" and its properties. SPARQL queries are used to solve the following problems: finding instances of the object "operating system", and building lists of operating systems (OS) by base, by creation time, and by the programming language in which the OS was written. A histogram is also constructed; it shows the number of programs written in each programming language and the share of them that run on each OS. Many software items do not specify the programming language in which they were developed, so the property "programming language" was added to several objects to improve the results. Wikidata plays a big role in software documentation.

Instances of the object "operating system"

Let's build a list of all the operating systems.

#added 2017-03
#List of `instances of` "operating system" 
SELECT ?os ?osLabel
WHERE
{
    ?os wdt:P31 wd:Q9135.
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

SPARQL query 510 results (January 2018), 1086 results (September 2020).

The most complete and detailed operating system items on Wikidata are: Linux, Windows, Windows 8.

Almost empty and less informative operating system items are: SPIN, JavaOS, Atari TOS, Xubuntu.

According to ProWD, the only Russian operating system on Wikidata is Miraculix, which has 7 properties. The leaders by number of properties (24 properties) among operating systems worldwide are Microsoft Windows and Windows 8.
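A similar ranking can be approximated directly in the query service. The following is a minimal sketch, assuming the `wikibase:statements` predicate exposed by the Wikidata Query Service; it counts statements rather than distinct properties, so its numbers may differ from the ProWD figures quoted above. The variable name `?statements` is chosen here for illustration.

```sparql
# Sketch: the ten operating systems with the most statements.
# Note: wikibase:statements counts statements, not distinct properties.
SELECT ?os ?osLabel ?statements
WHERE
{
    ?os wdt:P31 wd:Q9135.
    ?os wikibase:statements ?statements.
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
ORDER BY DESC(?statements)
LIMIT 10
```

Sorting in ascending order instead would list the least described items.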

List of operating systems by base

SELECT ?osLabel ?baseLabel
WHERE
{
    ?os wdt:P31 wd:Q9135.
    ?os wdt:P144 ?base.
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?osLabel ?baseLabel

SPARQL query 159 results (January 2018), 118 results (September 2020).

The query shows the relation between each OS and its base.

List of operating systems by creation time

#defaultView:Timeline
SELECT ?osLabel ?time
WHERE
{
    ?os wdt:P31 wd:Q9135. 
    ?os wdt:P571 ?time.
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?osLabel ?time
ORDER BY DESC(?time)

SPARQL query 298 results (January 2018), 238 results (September 2020).

Count of operating systems by programming language

#defaultView:BarChart
SELECT ?lang (count(*) as ?count)
WHERE 
{
    ?os wdt:P31 wd:Q9135.
    ?os wdt:P277 ?langObj .
    OPTIONAL {
        ?langObj rdfs:label ?lang
        FILTER (lang(?lang) = "en")
    }
}
GROUP BY ?lang
ORDER BY DESC(?count) ASC(?lang)

SPARQL query 35 results (January 2018), 37 results (September 2020).

The query shows (based only on the items whose "programming language" property is filled in, so the picture may not be accurate) that operating systems are predominantly written in assembly language. This is plausible, since assembly language gives direct control over the hardware and its performance. C and C++ take second and third place; although code in these languages may be slower than hand-written assembly, they are much more convenient to program in.

Programming languages used to write operating systems

It is also interesting to view the results of this query as a graph, which makes it clearly visible how many objects simply have an empty "programming language" field.

#defaultView:Graph
SELECT ?os ?osLabel ?language ?languageLabel
WHERE
{
    ?os wdt:P31 wd:Q9135.
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    OPTIONAL { ?os wdt:P277 ?language . }
}

SPARQL query 533 results (March 2017), 1117 results (September 2020)

If the same query is restricted to languages in which at least two operating systems are written, the result differs significantly from that of the previous query.

#defaultView:Graph
SELECT ?os ?osLabel ?language ?languageLabel
WHERE
{
  {
    SELECT ?language ?languageLabel
    WHERE {
      ?os wdt:P31 wd:Q9135. # ?os is an instance of operating system
      ?os wdt:P277 ?language. # ?os is written in ?language
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    GROUP BY ?language ?languageLabel
    HAVING (COUNT(?os) > 1) # keep languages in which more than one OS is written
  }
  ?os wdt:P31 wd:Q9135. # ?os is an instance of operating system
  ?os wdt:P277 ?language. # ?os is written in ?language
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

SPARQL query 118 results (October 2020)

Graph of languages used to create operating systems 2020.


Completeness of the Wikidata

According to the site www.operating-system.org, there are about 611 operating systems [1] (not including Linux distributions, whose number exceeds the number of operating systems themselves). The SPARQL query returned only about 510 operating systems. Moreover, looking through a large number of the objects returned by the query makes it clear that many of them are poorly filled in or even completely empty. From this observation we can conclude that Wikidata is incomplete.
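As a rough sketch of how this incompleteness could be quantified, the following query counts the operating systems that have no "programming language" (P277) statement at all. It uses only the class and property identifiers already used in this article; the variable name `?withoutLanguage` is chosen here for illustration.

```sparql
# Sketch: count operating systems with no "programming language" (P277) statement.
SELECT (COUNT(DISTINCT ?os) AS ?withoutLanguage)
WHERE
{
    ?os wdt:P31 wd:Q9135.
    FILTER NOT EXISTS { ?os wdt:P277 ?lang. }
}
```

Replacing P277 with another property identifier gives the same measure for that property.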

Programming languages for creating operating systems

List of operating systems and languages in which they are written

To get a list of operating systems (OS) together with the programming languages used to create them, you can run the following query:

SELECT ?osLabel ?langLabel
WHERE 
{
    ?os wdt:P31 wd:Q9135. # ?os is an instance of operating system
    ?os wdt:P277 ?lang. # ?os is written in programming language ?lang
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

SPARQL query 147 results (September 2020).

Software and operating systems on which they are used

The amount of software can be regarded as an indicator of the importance of an OS. The more users an OS has, the more software vendors will want to offer their products to that audience. Hence the conclusion: the more software is written for a system, the more significant it is. This query shows which software is supported by which OS.

SELECT ?software ?softwareLabel ?os ?osLabel
WHERE
{
    ?software wdt:P306 ?os. # OS on which the software runs
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

SPARQL query 5738 results (January 2018), 30184 results (September 2020).

To get the most popular operating systems among software developers, you can modify the previous query as follows:

#defaultView:BarChart
SELECT ?os ?osLabel (COUNT(*) as ?count)
WHERE
{
  ?software wdt:P306 ?os. # OS on which the software runs
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?os ?osLabel
ORDER BY DESC(?count)
LIMIT 10

SPARQL query

As you can see, the developers' priorities are Linux, Microsoft Windows, and Ubuntu.

Number of programming languages used to create software for each operating system

SELECT ?software ?softwareLabel ?os ?osLabel (count(*) as ?count)
WHERE
{
    ?software wdt:P306 ?os. # OS on which the software runs
    ?software wdt:P277 ?lang. # programming language in which the software is written
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?software ?softwareLabel ?os ?osLabel
ORDER BY DESC(?count)

SPARQL query 2259 results (December 2018), 6883 results (September 2020). For each pair of software and OS, the query shows the number of languages in which the software is written.

Cartesian product of OS and languages with software and languages

SELECT ?software ?softwareLabel ?os ?osLabel ?softwareLanguageLabel ?osLanguageLabel
WHERE
{
  ?software wdt:P306 ?os. # software runs on the OS
  ?software wdt:P277 ?softwareLanguage. # software is written in this programming language
  ?os wdt:P277 ?osLanguage. # OS is written in this programming language
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?software ?softwareLabel ?os ?osLabel ?softwareLanguageLabel ?osLanguageLabel
ORDER BY DESC(?softwareLabel)

SPARQL query 5336 results (January 2018), 18976 results (September 2020).

How much software was written in each language for OSs written in each programming language

SELECT (count(*) as ?count) ?osLanguageLabel ?softwareLanguageLabel
WHERE
{
  ?software wdt:P306 ?os. # software runs on the OS
  ?software wdt:P277 ?softwareLanguage. # software is written in this programming language
  ?os wdt:P277 ?osLanguage. # OS is written in this programming language
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?osLanguageLabel ?softwareLanguageLabel
ORDER BY DESC(?count) DESC(?osLanguageLabel) DESC(?softwareLanguageLabel)

SPARQL query 418 results (January 2018), 829 results (September 2020). The query shows that most of the software written for OSs written in C/C++ is also written in C/C++. On the whole, most of the software is written in C, C++, Python, Java, and Objective-C.

How much software was written for each operating system in each language

SELECT ?osLabel ?softwareLanguageLabel (count(*) as ?count)
WHERE
{
  ?software wdt:P306 ?os. # software runs on the OS
  ?software wdt:P277 ?softwareLanguage. # software is written in this programming language
  ?os wdt:P277 ?osLanguage. # OS is written in this programming language
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en"}
}
GROUP BY ?osLabel ?softwareLanguageLabel
ORDER BY DESC(?count) DESC(?osLabel)

SPARQL query 378 results (January 2018), 671 results (September 2020). The query shows that most of the software for macOS is written in C++, C, and Python; for Android, in C++ and Java; for iOS, in C++.

How much software has been written in each programming language, and what share of it runs on each operating system

The histogram shows how much software was written in each programming language and what share of it runs on each operating system.

#defaultView:BarChart
SELECT (count(*) as ?count) ?softwareLanguageLabel ?osLabel
WHERE
{
  ?software wdt:P306 ?os. # software runs on the OS
  ?software wdt:P277 ?softwareLanguage. # software is written in this programming language
  ?os wdt:P277 ?osLanguage. # OS is written in this programming language
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en"}
}
GROUP BY ?softwareLanguageLabel ?osLabel
HAVING (COUNT(*) > 50)
ORDER BY DESC(?count) DESC(?osLabel)

SPARQL query 378 results (January 2018), 671 results (September 2020).

Programming languages and the number of programs written in them, by OS (2020).



The histogram in the figure shows, for each programming language, the number of programs written in it and the operating systems on which these programs run. The graph shows that the largest numbers of programs are written in C (1084), C++ (1598), Java (526), JavaScript (242), Objective-C (252), and Python (454).

Let's look at each of these languages in more detail.

Most of the programs written in C are for macOS (472) and Linux (235). The language was developed in 1972, but it has not lost its popularity, probably because it is used to write low-level applications.

Most of the programs written in C++ are for macOS (780), Linux (265), and Android (264). C++ will probably remain in the lead for a long time, because it is currently used for solutions that require high performance, which high-level languages like Java or C# do not provide.

Most of the programs written in Java are for macOS (196) and Android (156). Java is probably popular because of code portability, i.e. Java code runs on any machine on which the JVM is installed.

Most of the programs written in JavaScript are for macOS (100), Android (60), and iOS (40). It is used to write the client side of web applications, which reduces server load and increases application speed.

Most of the programs written in Objective-C are for macOS (112) and iOS (72). For some time it was used especially by Apple.

Most of the programs written in Python are for macOS (212) and Linux (107). It is a high-level language with a low entry threshold. It is used, for example, to write web applications and for data analysis.

Looking at the histogram, we can conclude that each of these languages has taken its own "region" in the field of software development and is used for a certain range of tasks. It can also be seen that most of the programs are for macOS (2388), Linux (895), or Android (908).

Completeness of the Wikidata

Let's compare queries 2 and 3. Obviously, many software products lack the "programming language" property.
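The gap can also be measured in a single query. The sketch below is an assumption-laden illustration rather than one of the numbered queries: it counts distinct software items (not software and OS pairs, as the queries above do), and the variable names `?total`, `?described`, and `?withLanguage` are introduced here.

```sparql
# Sketch: software items that name a supported OS (P306), and how many of
# those also have a "programming language" (P277) statement.
SELECT (COUNT(DISTINCT ?software) AS ?total)
       (COUNT(DISTINCT ?withLanguage) AS ?described)
WHERE
{
    ?software wdt:P306 ?os.
    OPTIONAL {
        ?software wdt:P277 ?lang.
        BIND(?software AS ?withLanguage) # bound only when a language exists
    }
}
```

The difference between `?total` and `?described` is the number of software items still missing the property.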

Filling in the Wikidata

After the "programming language" property was filled in for 100 software products, query 3 returned 2502 results (06.11.2017, 01:40).

Software documentation

Wikidata plays a big role in software documentation. This is illustrated by the programs included in GNOME and KDE[1]. The cited article shows that while the English Wikipedia describes almost all the programs included in GNOME and KDE, the Italian and French editions contain only a subset of the articles. Documenting large projects is a well-known and difficult task. Solving it requires a centralized system, and it is in this role that the combination of Wikipedia and Wikidata acts[1].

Future work

  1. Show all those OSs that have a "logo image (P154)" property.
  2. Construct a diagram showing how many OSs were created in each country. It is permitted to use the properties developer, country, or headquarters location if the creator is a company, or country of citizenship if the creator is an individual developer.
  3. Count how many OSs were created (inception (P571)) in 1995.
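As a starting point for the last item, a minimal sketch of such a count might look as follows. It reuses the inception property (P571) from the creation-time query above; counting distinct items is an assumption made here so that an OS with several inception statements is counted once.

```sparql
# Sketch: count operating systems whose inception (P571) falls in 1995.
SELECT (COUNT(DISTINCT ?os) AS ?count)
WHERE
{
    ?os wdt:P31 wd:Q9135.
    ?os wdt:P571 ?time.
    FILTER (YEAR(?time) = 1995)
}
```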

Exercises

1 Specify the relation between the OS and its developer:

Developers: Apple, Sun Microsystems, Canonical Ltd.
Operating systems: Newton OS, JavaOS, Ubuntu Touch

2 Select the desktop image of the Fuduntu OS:

3 Select the OS on which the most other OSs are based:

Debian
Android
Ubuntu
Linux kernel


1. SPARQL query, OSs and developers

2. SPARQL query, OSs and logos

3. SPARQL query, OSs and countries

4. SPARQL query, OSs and count of "descendants"

References

  1. Documenting Software Applications on Wikidata 2020.
  • Sysoev M. "Operating systems in Russia". ProWD. Retrieved 2020-09-28.

Links