The KB (the National Library of the Netherlands) maintains two large collections: the Deposit Collection, containing all the Dutch printed publications (one million items), and the Scientific Collection, with about 1.4 million books mainly about the history, language and culture of the Netherlands.
Each collection is described according to its own indexing system and conceptual vocabulary. On the one hand, the Scientific Collection is described using the GTT, a huge vocabulary containing 35,000 general concepts ranging from Wolkenkrabbers (Skyscrapers) to Verzorging (Care). On the other hand, the books contained in the Deposit Collection are mainly indexed against the Brinkman thesaurus, containing a large set of headings (more than 5,000) that are expected to serve as global subjects of books. Both thesauri have similar coverage (more than 2,000 concepts have exactly the same label) but differ in granularity.
For each concept, the thesauri provide the usual lexical and semantic information: preferred labels, synonyms and notes, broader and related concepts, etc. The language of both thesauri is Dutch, but a quite substantial part of the Brinkman concepts (around 60%) come with English labels. For the purpose of the alignment, the two thesauri have been represented according to the SKOS model, which provides all these features.
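To make the notion of concepts "having exactly the same label" concrete, here is a minimal sketch of how such an overlap could be counted once preferred labels have been extracted from the SKOS data. It is only an illustration, not a matching method prescribed by the task: the function name `shared_pref_labels`, the concept identifiers, and the assumption that Brinkman also contains a concept labelled Verzorging are all hypothetical.

```python
from collections import defaultdict


def shared_pref_labels(vocab_a, vocab_b):
    """Return {label: (concept ids in A, concept ids in B)} for labels found in both.

    Each vocabulary is a dict mapping a concept id to its preferred label.
    Labels are compared exactly, with no case folding or stemming.
    """
    def index(vocab):
        by_label = defaultdict(list)
        for concept_id, pref_label in vocab.items():
            by_label[pref_label].append(concept_id)
        return by_label

    index_a, index_b = index(vocab_a), index(vocab_b)
    common = index_a.keys() & index_b.keys()
    return {label: (index_a[label], index_b[label]) for label in common}


# Hypothetical identifiers and Brinkman content, for illustration only.
gtt = {"gtt:1": "Wolkenkrabbers", "gtt:2": "Verzorging"}
brinkman = {"brinkman:9": "Verzorging"}

overlap = shared_pref_labels(gtt, brinkman)
print(overlap)
```

Identical labels are only candidates for a link: since the two thesauri differ in granularity, a shared label does not by itself guarantee equivalence.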
The goal of the task is to find semantic links between the concepts of the GTT and Brinkman thesauri.
The expected alignments shall come in the format defined for the Ontology Alignment API.
As the context of the task is clearly thesaurus-based information systems, the alignment links to be produced shall be compatible with standard thesaurus semantic links.
Especially expected are alignment links found in the SKOS mapping vocabulary namespace (http://www.w3.org/2004/02/skos/mapping#, referred to as skosm: below):
- skosm:exactMatch, which denotes equivalence between two concepts
- skosm:broadMatch and skosm:narrowMatch, denoting hierarchical generalization and specialization
Additionally, an ad hoc link denoting general relatedness is allowed and will be evaluated:
- http://stitch.cs.vu.nl/mapping#relatedMatch
The other semantic relations found in the SKOS mapping vocabulary are allowed but will not be evaluated:
- skosm:minorMatch
- skosm:majorMatch
Finally, the SKOS mapping vocabulary provides three classes that allow defined concepts to be mapped. These are allowed, but not all of them will be evaluated:
- skosm:AND and skosm:OR, denoting intersection and union of concepts, will be evaluated
- skosm:NOT will not be evaluated
Note on alignment cardinality and confidence measure:
The alignments shall be sent to the organizers of the task, Antoine Isaac and Henk Matthezing.
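As an illustration of what a submitted alignment could look like, the sketch below builds a small RDF/XML document with one cell per link. The element names (Alignment, map, Cell, entity1, entity2, relation, measure) and the namespace URI are written from memory of the Alignment API format and should be checked against its documentation; the concept URIs are hypothetical, and carrying the SKOS mapping relation as a URI inside the relation element is an assumption, not something stated here.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the Alignment API format; verify before use.
ALIGN = "http://knowledgeweb.semanticweb.org/heterogeneity/alignment"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOSM = "http://www.w3.org/2004/02/skos/mapping#"


def build_alignment(cells):
    """Build an RDF/XML tree with one Cell per (entity1, entity2, relation, measure)."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("align", ALIGN)
    root = ET.Element(f"{{{RDF}}}RDF")
    alignment = ET.SubElement(root, f"{{{ALIGN}}}Alignment")
    for entity1, entity2, relation, measure in cells:
        wrapper = ET.SubElement(alignment, f"{{{ALIGN}}}map")
        cell = ET.SubElement(wrapper, f"{{{ALIGN}}}Cell")
        ET.SubElement(cell, f"{{{ALIGN}}}entity1", {f"{{{RDF}}}resource": entity1})
        ET.SubElement(cell, f"{{{ALIGN}}}entity2", {f"{{{RDF}}}resource": entity2})
        ET.SubElement(cell, f"{{{ALIGN}}}relation").text = relation
        ET.SubElement(cell, f"{{{ALIGN}}}measure").text = str(measure)
    return root


# Hypothetical concept URIs, for illustration only.
example_cells = [
    ("http://example.org/gtt/c1", "http://example.org/brinkman/c2",
     SKOSM + "exactMatch", 1.0),
]

xml_text = ET.tostring(build_alignment(example_cells), encoding="unicode")
print(xml_text)
```

A real submission would also carry the header information the format requires (for example the two vocabularies being aligned), which this sketch leaves out.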
Evaluation of the alignments will be done by members of the STITCH team with the help of domain experts. Due to the size of the vocabularies, only a sample evaluation will be carried out.
The modality of this sample evaluation will be determined, depending for example on the number of participants.
Special attention will be given to application relevance. In particular, evaluation criteria will be chosen according to specific application scenarios such as query reformulation, book annotation or unified thesaurus design.
Please note that this e-mail process is not only intended to ensure compliance with the IP situation. It will also help us keep in contact with participants, for instance if a new version of the data is produced after a complaint by a participant.
In case a participant's tool cannot read or write the proposed SKOS (-inspired) data, OWL versions are provided, and OWL alignment relationships can be evaluated. Note, however, that this amounts to making specific interpretations of the original data and produced alignments, and might reduce the quality of the final results.
The following conversions were made regarding the data:
- instances of skos:Concept are converted into instances of owl:Class
- skos:prefLabel, skos:altLabel and skos:hiddenLabel statements are converted to rdfs:label statements, which removes the subtle distinctions that exist between these different properties (in GTT for instance, many altLabels are not synonyms at all)
- skos:note statements are converted to rdfs:comment statements
- skos:broader statements are converted into rdfs:subClassOf statements
- skos:related statements are converted into rdfs:seeAlso statements
The following interpretations will be made of OWL data sent back by participants:
- instances of owl:Class will be interpreted as instances of skos:Concept
- owl:equivalentClass statements will be interpreted as skosm:exactMatch statements
- if concepts are linked (as instances of skos:Concept and not owl:Class) using owl:sameAs, these statements will also be interpreted as skosm:exactMatch statements
- rdfs:subClassOf statements will be interpreted as skosm:broadMatch statements
- rdfs:seeAlso statements will be interpreted as stitchm:relatedMatch statements (stitchm: standing for http://stitch.cs.vu.nl/mapping#)
- class intersections and unions will be interpreted as skosm:AND and skosm:OR
Disjointness-like statements (owl:disjoint, owl:differentFrom, owl:disjointWith, owl:complementOf, which could be interpreted as statements involving skosm:NOT) will be ignored, as these are not evaluated.
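The two directions described above, SKOS to OWL for the data and OWL back to mapping relations for the results, can be summarised as lookup tables. The following is a minimal sketch under simplifying assumptions: triples are plain tuples of prefixed names rather than full URIs, the function names are hypothetical, statements with predicates outside the tables are passed through (conversion) or skipped (interpretation), and skosm:AND and skosm:OR are left out. It does not reproduce the organizers' actual conversion scripts.

```python
# Property-level conversions applied to the SKOS data.
SKOS_TO_OWL = {
    "skos:prefLabel": "rdfs:label",
    "skos:altLabel": "rdfs:label",
    "skos:hiddenLabel": "rdfs:label",
    "skos:note": "rdfs:comment",
    "skos:broader": "rdfs:subClassOf",
    "skos:related": "rdfs:seeAlso",
}

# Interpretation of OWL statements sent back by participants.
# skosm:broadMatch is the name used in the list of evaluated relations.
OWL_TO_MAPPING = {
    "owl:equivalentClass": "skosm:exactMatch",
    "owl:sameAs": "skosm:exactMatch",
    "rdfs:subClassOf": "skosm:broadMatch",
    "rdfs:seeAlso": "stitchm:relatedMatch",
}

# Statements that would correspond to skosm:NOT are not evaluated.
IGNORED = {"owl:disjoint", "owl:differentFrom", "owl:disjointWith", "owl:complementOf"}


def skos_to_owl(triples):
    """Rewrite SKOS triples into their OWL/RDFS counterparts."""
    converted = []
    for subject, predicate, obj in triples:
        if predicate == "rdf:type" and obj == "skos:Concept":
            converted.append((subject, "rdf:type", "owl:Class"))
        else:
            converted.append((subject, SKOS_TO_OWL.get(predicate, predicate), obj))
    return converted


def interpret_owl(triples):
    """Turn OWL alignment statements into mapping links, dropping the rest."""
    links = []
    for subject, predicate, obj in triples:
        if predicate in IGNORED:
            continue
        relation = OWL_TO_MAPPING.get(predicate)
        if relation is not None:
            links.append((subject, relation, obj))
    return links


# Hypothetical concept identifiers and label, for illustration only.
skos_data = [
    ("gtt:c1", "rdf:type", "skos:Concept"),
    ("gtt:c1", "skos:altLabel", "example label"),
    ("gtt:c1", "skos:broader", "gtt:c0"),
]
owl_data = skos_to_owl(skos_data)

returned = [
    ("gtt:c1", "owl:equivalentClass", "brinkman:c7"),
    ("gtt:c1", "owl:disjointWith", "brinkman:c8"),
]
links = interpret_owl(returned)
print(owl_data)
print(links)
```

The first table shows where information is lost: three distinct label properties collapse into rdfs:label, so the conversion cannot be reversed.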