Research in programming Wikidata/Ships

This article studies ship objects in Wikidata. Three examples of well-filled and poorly filled "ship" objects are identified. SPARQL queries over objects of the "ship" type are used to list all ships of the world, as well as ships that took part in military conflicts and are associated with a given country. An estimate of the completeness of Wikidata is also given. The article presents a graph showing the relationship between ships associated with Russia and the military conflicts in which they took part.

Instances of the object "ship"

ship (Q11446) is a large marine vessel.

Wikidata properties considered in the work:

  • instance of (P31)
  • operator (P137)
  • country (P17)
  • conflict (P607)
  • including (P1012)

Let's build a list of all ships, with labels in English.

# List of ships
SELECT ?ship ?shipLabel
WHERE
{
  ?ship wdt:P31 wd:Q11446. # instance of ship
  SERVICE wikibase:label {bd:serviceParam wikibase:language "en"}
}

SPARQL-query, 19 820 results (2017), 50 681 results (2020), 71 203 results (2021).
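
The query above can also be run outside the web interface. The following is a minimal sketch, assuming the public Wikidata Query Service endpoint at https://query.wikidata.org/sparql and its JSON output format. The function names build_url and run_query and the User-Agent string are illustrative, not part of the original work.

```python
import json
import urllib.parse
import urllib.request

# Public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"

# The "list of ships" query from the listing above.
SHIP_QUERY = """
SELECT ?ship ?shipLabel
WHERE
{
  ?ship wdt:P31 wd:Q11446. # instance of ship
  SERVICE wikibase:label {bd:serviceParam wikibase:language "en"}
}
"""


def build_url(query: str) -> str:
    """Return the GET URL that asks the endpoint for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return ENDPOINT + "?" + params


def run_query(query: str) -> list:
    """Send the query and return the list of result bindings (needs network)."""
    request = urllib.request.Request(
        build_url(query),
        # Hypothetical User-Agent; replace with your own contact details.
        headers={"User-Agent": "ships-example/0.1 (illustrative script)"},
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    return data["results"]["bindings"]
```

Calling len(run_query(SHIP_QUERY)) would give the current number of results, which is expected to differ from the figures quoted above because Wikidata keeps changing.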

# List of ships from Russia, Soviet Union and Russian Empire
SELECT ?ship ?shipLabel
WHERE
{
  ?ship wdt:P31 wd:Q11446; # instance of ship
        wdt:P137/wdt:P17 ?country. # belongs to country
    
  VALUES ?country {wd:Q34266 # Russian Empire
                  wd:Q15180 # Soviet Union
                  wd:Q159}  # Russia
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

SPARQL-query, 107 results (2017), 578 results (2021).

Completeness of the Wikidata

Finding the exact number of ships in the world is a difficult task: data about some of them are top secret, and others are private vessels about which no information is available either. Suppose that the total number of ships is about 1.6 million, as indicated in the vessel database. The first query above returned only 71 203 records, which is about 4.5% of the total number of ships.

As for Russian ships, the actual civil and military fleets include 17 657 ships, while the second query above returned only 578 records, which is only 3.27% of the total number of Russian ships.

In both cases the gap between the actual number of ships and the query results is huge, which indicates that Wikidata is incomplete.
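
The two percentages can be checked with a short calculation. This is a minimal sketch using only the figures quoted in this section (71 203 and 578 query results, 1.6 million and 17 657 ships in total). The function name completeness is illustrative.

```python
def completeness(found: int, total: int) -> float:
    """Share of real-world objects that are present in Wikidata, in percent."""
    return 100.0 * found / total


# Figures taken from the text: query results (2021) vs. estimated fleet sizes.
world = completeness(71_203, 1_600_000)
russia = completeness(578, 17_657)

print(f"World:  {world:.2f}%")   # about 4.5% after rounding to one decimal
print(f"Russia: {russia:.2f}%")
```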

ProWD

Fig. 1. Filled property imbalance and Gini coefficient

Data was collected with ProWD.id, 2020. The graph and Gini coefficient show that completeness is not uniform.

According to the ProWD report, the ship Krasin (Q281147) has the largest number of properties (34). The ships Liven (Q99198666) (5 properties) and Dispatch (Q28155282) (4 properties) have the fewest.
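
To illustrate what a Gini coefficient over property counts measures, here is a minimal sketch using the standard formula for sorted values. It is not necessarily the exact procedure ProWD uses, and the three-ship sample (34, 5 and 4 properties) is only the handful of counts quoted above, not the full data set behind Fig. 1.

```python
def gini(values):
    """Gini coefficient of non-negative values: 0 means perfectly even."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i starting at 1
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Property counts of Krasin, Liven and Dispatch as quoted in the text.
sample = [34, 5, 4]
print(f"Gini coefficient of the sample: {gini(sample):.3f}")
```

A value of 0 would mean that every ship has the same number of filled properties; the closer the value is to 1, the more the filled properties are concentrated in a few objects.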

Filling the properties of warships

The task is to find and fill in a hundred ship objects that are connected with Russia and took part in military conflicts.

# List of ships with countries and war conflicts
SELECT ?ship ?shipLabel ?countryLabel ?conflict ?conflictLabel
WHERE
{
  ?ship wdt:P31 wd:Q11446;        # instance of ship
        wdt:P137/wdt:P17 ?country;# belongs to country
        wdt:P607 ?conflict.       # engaged in some conflict
  SERVICE wikibase:label {bd:serviceParam wikibase:language "en"}
}

SPARQL-query — 1400 ships (2017), 3586 ships (2020), 3567 ships (2021).

With the separator ";" it is possible to retrieve several properties of the same object in a single statement. In this query two properties were extracted: the country (P17) of the operator (P137) and the conflict (P607).

Military conflicts and military operations, which are part of wars, are different concepts. Filled-in data on ships can be roughly divided into two types:

  1. Objects in which military operations are combined with military conflicts. For example, the Soviet destroyer Gremyashchiy has 10 wars/battles (see the listing below). The number is so large because the ship took part in many Arctic convoys, which are military operations.
  2. Objects in which military operations are separated from military conflicts. For example, for the British cruiser HMS Trinidad, participation in the military campaign and in the Arctic convoy is listed as part of World War II with the qualifier including (P1012). Thus, in Wikidata this cruiser has one war/battle.

# List of military conflicts of the two ships
SELECT ?ship ?shipLabel ?conflict ?conflictLabel
WHERE
{
  VALUES ?ship {wd:Q4148613   # Soviet destroyer Gremyashchiy
                wd:Q1565575}  # United Kingdom's HMS Trinidad
  ?ship wdt:P607 ?conflict.   # conflict
  SERVICE wikibase:label {bd:serviceParam wikibase:language "en"}
}

SPARQL-query: military conflicts of the destroyer Gremyashchiy (Q4148613) and HMS Trinidad (Q1565575); 10 and 1 conflicts found respectively (2021).

# List of ships of Russia, Soviet Union and Russian Empire with war conflicts
SELECT ?ship ?shipLabel ?countryLabel ?conflict ?conflictLabel
WHERE
{
  ?ship wdt:P31 wd:Q11446;        # instance of ship
        wdt:P137/wdt:P17 ?country;# belongs to country
        wdt:P607 ?conflict.       # engaged in some conflict
  
  VALUES ?country {wd:Q34266 # Russian Empire
                   wd:Q15180 # Soviet Union
                   wd:Q159}  # Russia
  SERVICE wikibase:label {bd:serviceParam wikibase:language "en"}
}

SPARQL-query — 105 results (2017), 86 results (2020), 82 results (2021).

It is important to note that the ships returned by the last query are not necessarily connected only with Russia, the Soviet Union or the Russian Empire. For example, Kasato Maru (Q653477) is a Japanese ship, but it has several operators listed. These include Dobroflot (Q3737187), which operated the ship for some time. The same ship may therefore have different operators in different periods, and its owners may change from time to time.

Fig. 2 shows a graph relating ships associated with Russia to the military conflicts in which they took part.

You can see that most of the ships and military operations belong to the periods of the USSR and Russia. Note that this graph, like the data from which it was built, has one shortcoming. Russia can be divided into several different states by period: the Russian Empire, the Russian Socialist Federative Soviet Republic, the USSR and the post-Soviet period. In the ship objects filled in by Wikidata editors, the state assigned to a ship does not always match the period in which the ship existed. For example, Fig. 2 shows (and Wikidata confirms) that the battleship Borodino is assigned to Russia rather than to the Russian Empire, which is a mistake.

Despite these shortcomings, the graph clearly shows which ship took part in which battle. The figure shows this relationship in the context of several periods of Russian statehood. It also makes it possible to see which military event involved the most warships and, conversely, to find the ship that took part in the most wars.
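
The two counts mentioned above (ships per conflict and conflicts per ship) can also be obtained from the query results without a graph. This is a minimal sketch; the rows are made-up placeholders standing in for (ship label, conflict label) pairs returned by the query, and the function name count_both_ways is illustrative.

```python
from collections import Counter


def count_both_ways(rows):
    """Count conflicts per ship and ships per conflict from (ship, conflict) rows."""
    unique = set(rows)  # a ship may appear several times, once per operator
    conflicts_per_ship = Counter(ship for ship, _ in unique)
    ships_per_conflict = Counter(conflict for _, conflict in unique)
    return conflicts_per_ship, ships_per_conflict


# Placeholder rows, NOT real query output.
rows = [
    ("ship A", "conflict X"),
    ("ship A", "conflict Y"),
    ("ship B", "conflict X"),
    ("ship A", "conflict X"),  # duplicate row, counted once
]

per_ship, per_conflict = count_both_ways(rows)
print(per_ship.most_common(1))      # ship with the most conflicts
print(per_conflict.most_common(1))  # conflict with the most ships
```

Duplicates are removed first because the path through the operator can return the same ship and conflict more than once when a ship has several operators.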

Because the way ship objects are described in Wikidata has changed, the number of ships differs between 2017 and 2020. In 2017 each ship had the country property. In 2020 the operator property is used instead, and the country can be obtained through the operator. However, some ships are still missing because they have no operator in Wikidata (for example, Dmitrii Donskoi). Dmitrii Donskoi has another property instead: country of registry. Future work requires additional research into the possible ways of describing a ship's country in Wikidata.

Fig. 2. List of ships with countries and war conflicts in English (2017)
Fig. 3. List of ships with countries and war conflicts in English (2020)
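
One possible way to combine the three ways of describing a ship's country mentioned above (country of the operator, the country property, country of registry) is a simple fallback chain. This is only a sketch under assumptions: the dictionary keys, the order of the fallbacks and the sample records are illustrative and are not taken from Wikidata.

```python
from typing import Optional


def ship_country(ship: dict, operator_country: dict) -> Optional[str]:
    """Pick a country for a ship, trying three modelling styles in turn."""
    # 1. Country of the operator (the 2020 style: operator, then its country).
    for operator in ship.get("operators", []):
        country = operator_country.get(operator)
        if country is not None:
            return country
    # 2. Country stated directly on the ship (the 2017 style).
    if ship.get("country") is not None:
        return ship["country"]
    # 3. Country of registry, used by ships such as Dmitrii Donskoi.
    return ship.get("country_of_registry")


# Hypothetical records for illustration only.
operator_country = {"operator 1": "country P"}
ships = [
    {"name": "ship A", "operators": ["operator 1"]},
    {"name": "ship B", "country": "country Q"},
    {"name": "ship C", "country_of_registry": "country R"},
    {"name": "ship D"},
]
for ship in ships:
    print(ship["name"], ship_country(ship, operator_country))
```

A ship for which all three lookups fail, like "ship D" here, would still be missing from the results.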

Museum ships around the world

A museum ship (Q575727) is a ship that houses a museum exhibition dedicated to the history of the ship. Such ships are used for educational and memorial purposes. A ship's participation in a conflict (Q180684) may lead to the creation of a museum ship in memory of past events.

Let's build a graph of museum ships and the countries in which these ships are located. The vertices of the graph are instances of country (Q6256) and museum ship (Q575727). An edge between a ship and a country means that the ship is located in that country. An edge between two countries means that there were conflicts between these countries, and the weight of the edge is the number of such conflicts. The script in the listing below builds this graph according to these rules.

#defaultView:Graph    
SELECT ?vertex1 ?vertex1Label ?vertex2 ?vertex2Label ?edgeLabel ?image 
WHERE {
  {
    # Conflicts
    SELECT ?vertex1 ?vertex1Label ?vertex2 ?vertex2Label 
            (STR(COUNT(?conflict)) as ?edgeLabel) 
    WHERE
    {
      ?conflict wdt:P31 wd:Q180684 .
      ?conflict wdt:P710 ?vertex1, ?vertex2 .
      ?vertex1 wdt:P31 wd:Q6256 . 
      ?vertex2 wdt:P31 wd:Q6256

      FILTER (?vertex1 != ?vertex2 && STR(?vertex1) < STR(?vertex2))
    
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    GROUP BY ?vertex1 ?vertex1Label ?vertex2 ?vertex2Label
  }
  UNION
  {
    # Museum ships
    SELECT DISTINCT ?vertex1 ?vertex1Label ?vertex2 ?vertex2Label ?image
    WHERE
    {
      ?vertex2 wdt:P31 wd:Q575727 .
      {?vertex2 wdt:P17 ?vertex1} UNION # located in country
      {?vertex2 wdt:P131/wdt:P17 ?vertex1}
        
      OPTIONAL { ?vertex2 wdt:P18 ?image}
        
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en"}
    }
  }
}

SPARQL-query, 117 vertices are found (2021).
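
The country-to-country part of the graph can be illustrated in isolation. The sketch below mirrors the first subquery: every conflict contributes one unit of weight to each unordered pair of its participating countries, and sorting each pair plays the role of the FILTER on STR(?vertex1) < STR(?vertex2). The input data and the function name conflict_edges are hypothetical.

```python
from collections import Counter
from itertools import combinations


def conflict_edges(participants_by_conflict: dict) -> Counter:
    """Map each unordered country pair to the number of shared conflicts."""
    edges = Counter()
    for countries in participants_by_conflict.values():
        # sorted(set(...)) removes repeats and fixes one order per pair,
        # so (A, B) and (B, A) are never counted as two different edges.
        for first, second in combinations(sorted(set(countries)), 2):
            edges[(first, second)] += 1
    return edges


# Placeholder participants, NOT real Wikidata statements.
participants = {
    "conflict 1": ["country A", "country B"],
    "conflict 2": ["country B", "country A", "country C"],
}
for pair, weight in sorted(conflict_edges(participants).items()):
    print(pair, weight)
```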

The fragment of the graph in Fig. 4 shows that most of the museum ships belong to Germany, the USA and Australia. This "correlation" is plausible: over their history these countries have taken part in many conflicts, and all of them have access to the sea, which historically implies a fleet.


Fig. 4. Fragment of the graph of countries, museum ships and conflicts (2020)

Future work

  1. Find the "Guinness ship" (to choose from: the largest, the longest, the most capacious).
  2. Output pictures of the ships about which films were made. If there are none, output the ships about which books were written.
  3. Output the museum ships.

Exercises

1 Based on the graph of ships and military operations, which country has the most wars associated with its ships?

Soviet Union
Russia
Russian Empire

2 Based on the graph of ships and military operations, which war involved the most ships?

Russo-Japanese War
World War II
Crimean War

3 The figure shows the most famous Soviet Project 7 destroyer, awarded the title of "Guards". Name it.



References

  • Larionov D. (2020). "Ships in Russia". ProWD.

