Semantic Reasoning Evaluation Challenge (SemREC'23)

22nd International Semantic Web Conference (ISWC 2023)

Challenge Description

Despite the development of several ontology reasoning optimizations, traditional methods either do not scale well or cover only a subset of OWL 2 language constructs. As an alternative, neuro-symbolic approaches are gaining significant attention. However, the existing methods cannot deal with very expressive ontology languages. In addition, some SPARQL query engines support reasoning, but their performance is still limited. To find and address these performance bottlenecks, we ideally need several real-world ontologies that span a broad spectrum of size and expressivity. However, such ontologies are often not available. One potential reason ontology developers do not build ontologies that vary in size and expressivity is the performance bottleneck of the reasoners themselves. SemREC aims to deal with this chicken-and-egg problem.
The third edition of this challenge includes the following tasks:

  • Task-1 - Ontologies. Submit a real-world ontology that is challenging in terms of the reasoning time or the memory consumed during reasoning. We expect a detailed description of the ontology along with an analysis of the reasoning performance, the workarounds, if any, that were used to make the ontology less challenging (for example, dropping a few axioms, redesigning the ontology, etc.), and the (potential) applications in which the ontology could be used. We will evaluate the submitted ontologies based on the time consumed for a reasoning task, such as classification, and the memory consumed during reasoning.

  • Task-2 - Systems
    • Ontology/RDFS Reasoners. Submit an ontology/RDFS reasoner that uses neural-symbolic techniques for reasoning and optimization. In terms of the technique used, the submissions could fall under any of the following (or related) categories.
      1. Using learning-based techniques for performance optimization of traditional reasoning algorithms [6].
      2. Techniques that can learn the deductive reasoning aspect using the ontology axioms [1, 2, 3, 5].
      3. Inductive reasoning techniques based on a subsymbolic representation of entities and relations learned through the maximization of an objective function over valid triples. This could include works related to ontology completion (predicting missing facts or relations) [4, 7, 8].
      Based on precision and recall, we will evaluate the submitted systems on the test datasets for scalability (performance evaluation on large and expressive ontologies) and transfer capabilities (ability to reason over ontologies from different domains). We expect a detailed description of the system, including an evaluation of the system on the provided datasets.
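As an illustration of the precision and recall metrics mentioned above, the following is a minimal sketch that compares a set of predicted axioms against a set of gold (expected) axioms. The triple representation, the function name `precision_recall`, and the toy class names are assumptions made for this example only; the organizers' actual evaluation script may represent and score predictions differently.

```python
from typing import Iterable, Tuple

# Hypothetical representation: an axiom or fact as a (subject, relation, object) triple.
Triple = Tuple[str, str, str]


def precision_recall(predicted: Iterable[Triple], gold: Iterable[Triple]) -> Tuple[float, float]:
    """Return (precision, recall) of predicted triples with respect to gold triples."""
    predicted_set, gold_set = set(predicted), set(gold)
    true_positives = len(predicted_set & gold_set)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    return precision, recall


# Toy data (illustrative only, not taken from the challenge datasets).
gold = [
    ("Dog", "subClassOf", "Mammal"),
    ("Mammal", "subClassOf", "Animal"),
    ("Dog", "subClassOf", "Animal"),
]
predicted = [
    ("Dog", "subClassOf", "Mammal"),
    ("Dog", "subClassOf", "Animal"),
    ("Dog", "subClassOf", "Plant"),
]

p, r = precision_recall(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f}")
```

In this toy case two of the three predictions are correct and two of the three gold axioms are recovered, so both values are 2/3.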

    This challenge will be collocated with the 22nd International Semantic Web Conference.

    We have a discussion group for the challenge where we share the latest news with the participants and discuss issues related to the evaluation rounds.

    Dataset Details

    The submitted reasoning systems can vary in terms of their support for different OWL 2 profiles, subsets of Description Logics, and reasoning tasks (such as classification, realization, or consistency checking for traditional reasoners, and entailment, class membership, class subsumption, or axiom completion for neuro-symbolic reasoners). Ideally, the datasets must cover all these cases. However, due to the uncertainty with regard to the type of system submissions, given that the challenge is being organized for the first time, we are providing the OWL 2 profile-specific datasets only. If a system only partially supports some profile, the participants can evaluate it on part of the provided datasets.

    We will use the following datasets for evaluating the reasoners submitted to Task-2.

    1. ORE 2014
    2. OWL2Bench
    3. CaLiGraph

    Submission Details

    • Task-1.

      We expect a detailed description of the ontology along with the analysis of the reasoning performance, the workarounds, if any, that were used to make the ontology less challenging (for example, dropping of a few axioms, redesigning the ontology, etc.), and the (potential) applications in which the ontology could be used.

      To generate the leaderboard, the submitted ontologies will be run on some of the traditional description logic reasoners such as Konclude, ELK, and Openllet. We will use reasoning (classification) time and the memory consumed as the primary metrics. We will use a timeout value of 6 hours, and the limit on the memory will be 96 GB.
    • Task-2.
      • To generate the leaderboard, the neural-symbolic reasoners will be provided with training, validation, and test datasets. We provide an evaluation file for the subsumption task. This is provided only to give an idea of the kind of submission we expect from the participants. Participants are requested to make the changes mentioned in the file to evaluate it on their embeddings for the supported reasoning task (e.g., class subsumption, class membership, etc.). We would require the class embeddings of your model along with a readme on the changes made to the evaluation file and how to use it. We plan to evaluate the submitted systems on a Linux-based GPU server.

      • The participants will include all the results in the submitted papers and provide their trained models/embeddings. We will further evaluate the provided models on another small test dataset. The evaluation metrics will be reasoning time, memory consumed, precision, and recall.
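For participants who want to estimate reasoning time and memory consumption before submitting, the following is a rough sketch of how a reasoner run could be timed under a timeout on a Linux machine. It is not the organizers' harness: the function name `run_with_limits`, the constants, and the placeholder command are assumptions for illustration, and a real reasoner invocation would replace the placeholder. It relies on the Unix-only `resource` module, where `ru_maxrss` is reported in KiB on Linux.

```python
import resource
import subprocess
import sys
import time
from typing import List, Tuple

TIMEOUT_SECONDS = 6 * 60 * 60        # 6-hour timeout stated for Task-1
MEMORY_LIMIT_KIB = 96 * 1024 * 1024  # 96 GB limit, expressed in KiB (assumed binary units)


def run_with_limits(command: List[str], timeout: float = TIMEOUT_SECONDS) -> Tuple[float, int, bool]:
    """Run a command; return (elapsed seconds, peak child memory in KiB, timed_out)."""
    start = time.perf_counter()
    try:
        subprocess.run(
            command,
            timeout=timeout,
            check=False,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timed_out = False
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child and waits for it before raising.
        timed_out = True
    elapsed = time.perf_counter() - start
    # Peak resident set size over all waited-for children of this process.
    # Note: this is a running maximum, so measure one reasoner per Python process.
    peak_kib = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return elapsed, peak_kib, timed_out


# Placeholder command standing in for a real reasoner invocation.
elapsed, peak_kib, timed_out = run_with_limits([sys.executable, "-c", "pass"], timeout=60)
print(f"time={elapsed:.2f}s peak={peak_kib} KiB timed_out={timed_out} "
      f"over_limit={peak_kib > MEMORY_LIMIT_KIB}")
```

This only observes peak memory after the fact; enforcing a hard memory cap would need a separate mechanism, which is outside the scope of this sketch.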

    The submissions can be either short papers of 5 pages or long papers of 10-12 pages. All submissions must be in English and follow the 1-column CEUR-ART style (Overleaf template). The proceedings will be published as a volume of CEUR-WS. Submissions should be made as a PDF document on EasyChair.

References


  1. M. Ebrahimi, M.K. Sarker, F. Bianchi, N. Xie, D. Doran, and P. Hitzler, Reasoning over RDF knowledge bases using deep learning, arXiv preprint, arXiv:1811.04132, 2018.
  2. P. Hohenecker and T. Lukasiewicz, Deep learning for ontology reasoning, CoRR, arXiv:1705.10342, 2017.
  3. B. Makni and J. Hendler, Deep learning for noise-tolerant RDFS reasoning, Semantic Web 10(5) (2019), 823–862.
  4. J. Chen, P. Hu, E. Jimenez-Ruiz, O. M. Holter, D. Antonyrajah, and I. Horrocks, OWL2Vec*: embedding of OWL ontologies. Machine Learning, 2021.
  5. S. Mondal, S. Bhatia, and R. Mutharaju. EmEL++: Embeddings for EL++ Description Logic. Spring Symposium on Combining Machine Learning and Knowledge Engineering (AAAI-MAKE), 2021.
  6. R. Mehri, V. Haarslev, and H. R. Chinaei, A machine learning approach for optimizing heuristic decision-making in Web Ontology Language reasoners. Computational Intelligence 37, doi:10.1111/coin.12404, 2020.
  7. B. Peng, Z. Lu, H. Li, and K.-F. Wong, Towards neural network-based reasoning. arXiv preprint arXiv:1508.05508, 2015.
  8. X. V. Lin, R. Socher, and C. Xiong, Multi-Hop Knowledge Graph Reasoning with Reward Shaping. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3243–3253, 2018.