Challenge Description
Despite the development of several ontology reasoning optimizations, traditional methods either do not scale well or cover only a subset of the OWL 2 language constructs. As an alternative, neuro-symbolic approaches are gaining significant attention, but the existing methods cannot handle very expressive ontology languages. Some SPARQL query engines also support reasoning, but their performance, too, is still limited. To find and fix these performance bottlenecks in the reasoners, we ideally need several real-world ontologies that span a broad spectrum of size and expressivity; such ontologies, however, are rarely available. One potential reason ontology developers do not build ontologies that vary in size and expressivity is precisely the performance bottleneck of the reasoners. SemREC aims to tackle this chicken-and-egg problem.
The second edition of this challenge includes the following tasks:
- Task-1 - Ontologies. Submit a real-world ontology that is challenging in terms of the reasoning time or the memory consumed during reasoning. We expect a detailed description of the ontology, an analysis of its reasoning performance, the workarounds, if any, that were used to make the ontology less challenging (for example, dropping a few axioms or redesigning the ontology), and the (potential) applications in which the ontology could be used. We will evaluate the submitted ontologies based on the time consumed for a reasoning task, such as classification, and the memory consumed during reasoning.
- Task-2 - Systems
- Ontology/RDFS Reasoners. Submit an ontology/RDFS reasoner that uses neuro-symbolic techniques for reasoning and optimization. In terms of the technique used, submissions may fall into any of the following (or related) categories.
- Using learning-based techniques for performance optimization of traditional reasoning algorithms [6].
- Inductive reasoning techniques based on a subsymbolic representation of entities and relations, learned by maximizing an objective function over valid triples [4, 5] (see the embedding sketch after this list).
- Techniques that can learn the deductive reasoning aspect using the ontology axioms [1, 2, 3].
- Neural multi-hop reasoners for settings where multi-hop inference is required [7, 8].
- SPARQL query engines that support entailment regimes such as RDF, RDFS, or OWL 2. We expect a detailed description of the system, including an evaluation on the provided datasets.
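To make the embedding-based category above concrete, here is a minimal, self-contained sketch of a TransE-style scoring function, in which a triple (h, r, t) is considered plausible when the head embedding translated by the relation embedding lands close to the tail embedding. The entity names, relation name, and toy vectors are illustrative assumptions, not part of any reference implementation.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Minimal TransE-style scorer. A real system would learn the embeddings
 * by maximizing an objective function over valid triples; here they are
 * hand-picked toy values just to show the scoring step.
 */
public class TransEScorer {
    private final Map<String, double[]> embeddings = new HashMap<>();

    public void put(String name, double[] vector) {
        embeddings.put(name, vector);
    }

    /** Negative L2 distance of (h + r) from t: higher score = more plausible. */
    public double score(String head, String relation, String tail) {
        double[] h = embeddings.get(head);
        double[] r = embeddings.get(relation);
        double[] t = embeddings.get(tail);
        double sum = 0.0;
        for (int i = 0; i < h.length; i++) {
            double d = h[i] + r[i] - t[i];
            sum += d * d;
        }
        return -Math.sqrt(sum);
    }

    public static void main(String[] args) {
        TransEScorer scorer = new TransEScorer();
        // Toy 2-dimensional embeddings (illustrative only).
        scorer.put("Lion",       new double[]{1.0, 0.0});
        scorer.put("subClassOf", new double[]{0.0, 1.0});
        scorer.put("Animal",     new double[]{1.0, 1.0});
        scorer.put("Plant",      new double[]{5.0, 5.0});
        // (Lion, subClassOf, Animal) scores higher than (Lion, subClassOf, Plant).
        System.out.println(scorer.score("Lion", "subClassOf", "Animal")); // 0.0
        System.out.println(scorer.score("Lion", "subClassOf", "Plant"));  // about -5.66
    }
}
```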
This challenge will be co-located with the 21st International Semantic Web Conference (ISWC 2022).
We have a discussion group for the challenge where we share the latest news with the participants and discuss issues related to the evaluation rounds.
Dataset Details
We will use the following datasets for evaluating the systems submitted to Task-2.
- ORE 2015 dataset.
- Ontologies of varying sizes and complexities generated using our benchmark, OWL2Bench. It supports all the OWL 2 profiles.
- Datasets submitted to Task-1.
Submission Details
We have not yet categorized the tasks based on different profiles, reasoning techniques, or reasoning tasks (entailment, class subsumption, class membership, type prediction, and link prediction). We will decide on this based on the submissions we receive.
- Task-1. To generate the leaderboard, the submitted ontologies will be run on traditional description logic reasoners such as Konclude, ELK, and Openllet. We will use reasoning (classification) time and memory consumption as the primary metrics, with a timeout of 6 hours and a memory limit of 96 GB.
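For illustration, the following is a minimal sketch of this kind of measurement using the OWL API with the ELK reasoner. The ontology file name is a placeholder, and our actual harness (including enforcement of the timeout and memory limit) may differ.

```java
import java.io.File;
import org.semanticweb.elk.owlapi.ElkReasonerFactory;
import org.semanticweb.owlapi.apibinding.OWLManager;
import org.semanticweb.owlapi.model.OWLOntology;
import org.semanticweb.owlapi.model.OWLOntologyManager;
import org.semanticweb.owlapi.reasoner.InferenceType;
import org.semanticweb.owlapi.reasoner.OWLReasoner;

public class ClassificationBenchmark {
    public static void main(String[] args) throws Exception {
        OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
        // "ontology.owl" is a placeholder for a submitted ontology.
        OWLOntology ontology =
            manager.loadOntologyFromOntologyDocument(new File("ontology.owl"));

        OWLReasoner reasoner = new ElkReasonerFactory().createReasoner(ontology);

        Runtime rt = Runtime.getRuntime();
        rt.gc();
        long memBefore = rt.totalMemory() - rt.freeMemory();
        long start = System.nanoTime();

        // Classification: compute the full class hierarchy.
        reasoner.precomputeInferences(InferenceType.CLASS_HIERARCHY);

        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        long memAfter = rt.totalMemory() - rt.freeMemory();

        System.out.println("Classification time: " + elapsedMs + " ms");
        System.out.println("Approx. memory used: "
            + (memAfter - memBefore) / (1024 * 1024) + " MB");
        reasoner.dispose();
    }
}
```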
- Task-2.
- To generate the leaderboard, the neuro-symbolic reasoners will be provided with training, validation, and test datasets. The participants will include all the results in the submitted papers and provide their trained models/embeddings. We will further evaluate the provided models on another small test dataset. The evaluation metrics will be reasoning time, memory consumption, precision, and recall (a minimal sketch of the precision/recall computation follows this list).
- To generate the leaderboard, the SPARQL query engines will be evaluated for scalability in terms of load time, query response time, and memory consumed. We will provide the datasets, and the participants will discuss their evaluations in the submitted paper. For a fair comparison, we will re-evaluate the submitted systems on our hardware (see the Jena-based sketch after this list).
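As referenced in the first item above, here is a minimal sketch of how precision and recall could be computed over a held-out set of entailed statements. The serialized triples below are hypothetical placeholders; real runs would use the provided test split.

```java
import java.util.HashSet;
import java.util.Set;

public class PrecisionRecall {

    /** Precision = |predicted ∩ gold| / |predicted|. */
    static double precision(Set<String> predicted, Set<String> gold) {
        if (predicted.isEmpty()) return 0.0;
        Set<String> hits = new HashSet<>(predicted);
        hits.retainAll(gold);
        return (double) hits.size() / predicted.size();
    }

    /** Recall = |predicted ∩ gold| / |gold|. */
    static double recall(Set<String> predicted, Set<String> gold) {
        if (gold.isEmpty()) return 0.0;
        Set<String> hits = new HashSet<>(predicted);
        hits.retainAll(gold);
        return (double) hits.size() / gold.size();
    }

    public static void main(String[] args) {
        // Hypothetical serialized triples (illustrative only).
        Set<String> gold = Set.of("Lion subClassOf Animal",
                                  "Fern subClassOf Plant");
        Set<String> predicted = Set.of("Lion subClassOf Animal",
                                       "Lion subClassOf Plant");
        System.out.printf("precision=%.2f recall=%.2f%n",
                          precision(predicted, gold), recall(predicted, gold));
        // precision = 1/2 = 0.50, recall = 1/2 = 0.50
    }
}
```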
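And as referenced in the second item, here is a minimal sketch of the load-time and query-time measurement, using Apache Jena with its built-in RDFS inference as a stand-in for a submitted engine. The dataset file and query are placeholders.

```java
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.ResultSetFormatter;
import org.apache.jena.rdf.model.InfModel;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.riot.RDFDataMgr;

public class SparqlBenchmark {
    public static void main(String[] args) {
        // Load time: parse the dataset and wrap it in an RDFS inference model.
        long t0 = System.nanoTime();
        Model base = RDFDataMgr.loadModel("dataset.ttl"); // placeholder file
        InfModel inf = ModelFactory.createRDFSModel(base);
        long loadMs = (System.nanoTime() - t0) / 1_000_000;

        // Query response time for one SPARQL query under RDFS entailment.
        String query = "SELECT (COUNT(*) AS ?n) WHERE { ?s a ?type }";
        long t1 = System.nanoTime();
        try (QueryExecution qexec = QueryExecutionFactory.create(query, inf)) {
            ResultSetFormatter.out(qexec.execSelect());
        }
        long queryMs = (System.nanoTime() - t1) / 1_000_000;

        System.out.println("Load time: " + loadMs
            + " ms, query time: " + queryMs + " ms");
    }
}
```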