The AIFB DataSet is a Semantic Web (RDF) dataset used as a benchmark in data mining. The dataset consists of a single file of approximately 3 megabytes. It records the organizational structure of AIFB at the University of Karlsruhe.

Prerequisites

Get the data

The dataset is distributed from https://figshare.com/articles/AIFB_DataSet/745364.

1 Download the data file. In which file format is the data encoded?

Notation3
RDF XML
JSON-LD

2 Which ontology does it use?

SWRC
FOAF
SIOC


Get context

The dataset was used in the paper "Kernel Methods for Mining Instance Data in Ontologies". Find and read the description of the dataset on page 10.

How many instances of the class "Person" does the paper record?

2,547
1,058
1,232


Python

Set up a Python environment with rdflib installed, load the AIFB file, and count the number of times the "affiliation" property is used:

from rdflib import Graph, URIRef

g = Graph()
# Graph.load() was removed in rdflib 6; Graph.parse() reads the file into the graph
g.parse('aifbfixed_complete.n3', format='n3')
# Count all triples whose predicate is the SWRC "affiliation" property
len(list(g.triples((None, URIRef("http://swrc.ontoware.org/ontology#affiliation"), None))))

The URIs of the affiliations can be obtained with:

affiliations = g.triples((None, URIRef("http://swrc.ontoware.org/ontology#affiliation"), None))
groups = set(affiliation[2] for affiliation in affiliations)

How many different affiliations are there?

Find the names of the affiliations via the property "http://swrc.ontoware.org/ontology#name".