Software
Overview
Open-source tools, libraries, services, and ontologies maintained by the Bio-Ontology Research Group. Source is on github.com/bio-ontology-research-group.
Ontology Embedding & Machine Learning
Libraries and methods that turn ontologies into vector representations or otherwise combine logical structure with statistical learning.
mOWL
Python library for machine learning with biomedical ontologies. Unifies projection-, axiom- and geometric-embedding methods (EL Embeddings, ELBE, BoxSquaredEL, OWL2Vec*, DL2Vec, OPA2Vec) behind one API, with first-class OWLAPI access and PyTorch integration.
Walking RDF and OWL
Original feature-learning method over RDF graphs and OWL ontologies via biased random walks; the seed implementation for many later embedding methods including OWL2Vec*.
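As a rough illustration of walk-based feature learning, the sketch below generates random walks over an edge-labelled graph and returns them as token sequences. It samples neighbours uniformly, which is a simplification of the biased walks described above; the function name, parameters, and toy triples are hypothetical and not taken from the repository.

```python
import random

def random_walks(edges, walk_length=4, walks_per_node=2, seed=0):
    """Generate walks over an edge-labelled directed graph.

    edges: iterable of (subject, predicate, object) triples.
    Returns a list of walks; each walk alternates node and edge labels.
    """
    rng = random.Random(seed)
    # Adjacency list: node -> list of (predicate, neighbour).
    adjacency = {}
    for s, p, o in edges:
        adjacency.setdefault(s, []).append((p, o))
        adjacency.setdefault(o, [])
    walks = []
    for start in sorted(adjacency):
        for _ in range(walks_per_node):
            walk = [start]
            node = start
            for _ in range(walk_length):
                if not adjacency[node]:
                    break  # dead end: stop this walk early
                # Uniform choice (assumption); a biased walk would weight edges.
                predicate, node = rng.choice(adjacency[node])
                walk.extend([predicate, node])
            walks.append(walk)
    return walks

# Hypothetical toy graph mixing annotations and ontology structure.
triples = [
    ("GeneA", "hasFunction", "GO:1"),
    ("GO:1", "subClassOf", "GO:0"),
    ("GeneB", "hasFunction", "GO:0"),
]
corpus = random_walks(triples)
```

Each walk can then be treated as a sentence and passed to a word-embedding model to obtain one vector per node.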
OPA2Vec
Combines ontology axioms with associated annotation properties (labels, synonyms, definitions) into a single corpus, then trains Word2Vec to produce semantically rich vectors for ontology classes.
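The corpus-building step can be sketched as follows. This is a minimal illustration, assuming axioms arrive as whitespace-separated strings and annotations as (property, text) pairs; the function name and data layout are hypothetical and do not reflect OPA2Vec's actual input format.

```python
def build_corpus(axioms, annotations):
    """Merge logical axioms and annotation text into one token corpus.

    axioms: list of axiom strings, e.g. "GO:1 SubClassOf GO:0".
    annotations: dict mapping class ID -> list of (property, text) pairs.
    Returns a list of token lists (one "sentence" per axiom or annotation).
    """
    sentences = [axiom.split() for axiom in axioms]
    for class_id, entries in annotations.items():
        for prop, text in entries:
            # Prefix with the class ID so the class token co-occurs
            # with the words of its label, synonym, or definition.
            sentences.append([class_id, prop] + text.lower().split())
    return sentences

# Hypothetical toy input.
toy_axioms = ["GO:1 SubClassOf GO:0"]
toy_annotations = {
    "GO:1": [("label", "Kinase activity"), ("synonym", "Phosphotransferase")],
}
sentences = build_corpus(toy_axioms, toy_annotations)
```

A list of token lists like this is the usual input shape for Word2Vec implementations.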
EL Embeddings
Reference implementation of geometric embeddings for the EL++ description logic and the predecessor of GeometrE and BoxSquaredEL. Preserves subsumption reasoning by mapping classes to convex regions.
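To make the geometric idea concrete, the sketch below assumes the convex regions are n-balls and scores a subsumption C SubClassOf D by how far ball C sticks out of ball D. The function name is hypothetical and the margin and regularisation terms of a real training loss are omitted.

```python
import math

def subsumption_violation(center_c, radius_c, center_d, radius_d):
    """Return 0.0 if ball C lies inside ball D, else the amount of overshoot.

    Ball C is contained in ball D exactly when
    distance(centers) + radius_c <= radius_d.
    """
    distance = math.dist(center_c, center_d)
    return max(0.0, distance + radius_c - radius_d)

# Hypothetical 2-D example: a small ball inside a larger one.
inside = subsumption_violation((0.0, 0.0), 1.0, (1.0, 0.0), 3.0)
outside = subsumption_violation((1.0, 0.0), 3.0, (0.0, 0.0), 1.0)
```

A violation of zero means the embedding satisfies the axiom geometrically; a positive value is what a training procedure would try to reduce.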
Onto2Vec
Representation learning for ontologies and their annotations by treating logical axioms as natural-language sentences; predecessor of OPA2Vec.
DL2Vec
Encodes description-logic axioms as a directed graph and learns embeddings via random walks; widely used for downstream gene-disease and protein-function prediction.
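A minimal sketch of the axiom-to-graph step is shown below. It assumes axioms are written in a simplified Manchester-like syntax and handles only two axiom shapes; the patterns, function name, and edge label "subClassOf" are illustrative choices, not DL2Vec's actual conversion rules.

```python
import re

# Two simple axiom shapes (simplified syntax assumed for this sketch).
EXISTENTIAL = re.compile(r"^(\S+) SubClassOf (\S+) some (\S+)$")
SUBCLASS = re.compile(r"^(\S+) SubClassOf (\S+)$")

def axioms_to_edges(axioms):
    """Turn axioms into directed, labelled edges (source, label, target)."""
    edges = []
    for axiom in axioms:
        match = EXISTENTIAL.match(axiom)
        if match:
            # "A SubClassOf R some B" becomes an edge A -R-> B.
            sub, relation, filler = match.groups()
            edges.append((sub, relation, filler))
            continue
        match = SUBCLASS.match(axiom)
        if match:
            # "A SubClassOf B" becomes an edge A -subClassOf-> B.
            sub, sup = match.groups()
            edges.append((sub, "subClassOf", sup))
        # Axioms of any other shape are skipped in this sketch.
    return edges

# Hypothetical toy axioms.
edges = axioms_to_edges([
    "Nucleus SubClassOf Organelle",
    "Nucleus SubClassOf part_of some Cell",
    "Cell EquivalentTo Something Else Entirely",
])
```

The resulting edge list can be fed to any random-walk generator to produce training sequences.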
Interpretable Learning
Generates interpretable symbolic rules from learned representations over biomedical knowledge bases.
catE
Category-theoretic, lattice-preserving embedding of ALC description-logic ontologies that retains the consequence-closure semantics of the original theory.
DELE
Deductive EL embeddings: enriches the training data with the deductive closure of an ontology before learning, so that embeddings recover entailed axioms rather than only asserted ones.
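The effect of adding entailed axioms can be illustrated with the simplest case, transitivity of SubClassOf between named classes. This sketch is a stand-in only: a full deductive closure requires a reasoner, and the function name and toy data are hypothetical.

```python
def subclass_closure(asserted):
    """Return all (sub, super) pairs entailed by transitivity of SubClassOf.

    asserted: iterable of (sub, super) pairs between named classes.
    """
    closure = set(asserted)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # From A <= B and B <= D infer A <= D.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# Hypothetical asserted hierarchy: A <= B <= C <= D.
asserted = {("A", "B"), ("B", "C"), ("C", "D")}
training_pairs = subclass_closure(asserted)
entailed_only = training_pairs - asserted
```

Training on `training_pairs` instead of `asserted` exposes the model to axioms such as ("A", "D") that follow from the ontology but were never written down.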
EL2Box
Box-shaped geometric embeddings for EL++ that strengthen the topological guarantees of EL Embeddings.
Protein Function Prediction
Deep-learning and neuro-symbolic models for predicting Gene Ontology functional annotations of proteins.
DeepGOPlus
CNN-ensemble protein-function predictor that augments sequence-based scoring with k-nearest-neighbour homology and GO axioms; one of the strongest CAFA-evaluated open models.
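One way to picture how the three signals fit together is sketched below: a weighted sum of the sequence-model score and the homology score, followed by upward propagation so that a parent GO class never scores lower than its children. The weight `alpha`, the function names, and the toy scores are assumptions made for illustration, not values or code from DeepGOPlus.

```python
def combine_scores(cnn_scores, knn_scores, alpha=0.5):
    """Weighted sum of per-class scores from two predictors."""
    terms = set(cnn_scores) | set(knn_scores)
    return {
        t: alpha * knn_scores.get(t, 0.0) + (1 - alpha) * cnn_scores.get(t, 0.0)
        for t in terms
    }

def propagate(scores, parents):
    """Raise each ancestor's score to at least the score of its descendants."""
    result = dict(scores)
    for term, score in scores.items():
        stack = list(parents.get(term, ()))
        seen = set()
        while stack:
            ancestor = stack.pop()
            if ancestor in seen:
                continue
            seen.add(ancestor)
            result[ancestor] = max(result.get(ancestor, 0.0), score)
            stack.extend(parents.get(ancestor, ()))
    return result

# Hypothetical scores and a three-level hierarchy GO:2 -> GO:1 -> GO:0.
cnn = {"GO:2": 0.8, "GO:1": 0.2}
knn = {"GO:2": 0.4}
hierarchy = {"GO:2": ["GO:1"], "GO:1": ["GO:0"]}
final = propagate(combine_scores(cnn, knn, alpha=0.5), hierarchy)
```

The propagation step is what makes the output consistent with the ontology: predicting a specific function implies predicting all of its more general ancestors.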
DeepGO
Original sequence-based, ontology-aware deep classifier for predicting Gene Ontology functional annotations; basis of the entire DeepGO family of tools.
DeepGO2
Next-generation DeepGO model with transformer protein embeddings and improved hierarchical multi-label prediction.
DeepGOZero
Zero-shot extension of DeepGO using model-theoretic EL Embeddings to predict GO classes that have never been observed during training.
DeepGOMeta
DeepGO model trained specifically for metagenomic communities; predicts functional roles of proteins recovered from environmental samples and links them to biogeochemical processes.
GO-Agent
LLM-agent framework that decomposes protein-function prediction into tool-calling sub-tasks (sequence search, structure lookup, domain reasoning) and stitches the evidence into a final GO annotation.
DeepPheno
Predicts loss-of-function organism-level phenotypes (HPO/MPO) directly from a gene's annotated functions, using a hierarchical neural classifier over phenotype ontologies.
PU-GO
Positive-unlabeled ranking of protein functions with ontology-based priors; directly addresses the partial-annotation problem in CAFA benchmarks.
PhenoGoCon
Predicts gene–phenotype associations from predicted Gene Ontology functions; bridges GO function prediction and HPO/MPO phenotype prediction.
Genomic Context
Bacterial protein-function prediction that exploits operon and genome-neighbourhood structure in addition to sequence and homology.
Variant and Disease Prioritization
Tools that combine phenotype, ontology, and sequence data to rank candidate disease-causing variants and predict gene-disease associations.
PhenomeNET-VP
Phenotype-driven variant prioritization for whole-exome and whole-genome sequencing data; a widely used implementation of phenotype-aware variant ranking.
DeepSVP
Prioritizes structural and copy-number variants by combining patient phenotype with gene-function similarity learned from biomedical ontologies.
DeepViral
Predicts virus–host protein-protein interactions from sequence and infectious-disease phenotypes; trained jointly across coronaviruses, influenza, and other RNA viruses.
EmbedPVP
Embedding-based phenotype-aware variant predictor that ranks candidate causative variants using joint sequence- and phenotype-derived representations.
STARVar
Symptom-based tool for automatic ranking of variants using evidence from the biomedical literature and population genomes; combines text mining with phenotype matching.
predCAN
Ontology-based prediction of cancer driver genes by integrating phenotype, pathway and function knowledge with somatic-variant features.
INDIGENA
Inductive prediction of disease–gene associations from phenotype ontologies; generalises to unseen diseases via ontology-aware embeddings.
GenomeLinter
AI-powered clinical decision-support tool that ingests annotated VCFs and synthesises diagnostic interpretations for rare-disease patients without requiring deep bioinformatics expertise.
Ontology Reasoning & Tooling
Reasoning-as-a-service infrastructure, ontology repositories, and utilities for working with OWL.
vec2SPARQL
Adds embedding-similarity functions to a SPARQL endpoint so that vector-space queries (k-nearest neighbours, cosine similarity) can be mixed with classical graph patterns.
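The vector-side computation behind such queries is small enough to sketch directly. The example below shows cosine similarity and a k-nearest-neighbour lookup over an in-memory table of embeddings; the function names and toy vectors are hypothetical, and how vec2SPARQL exposes these operations inside SPARQL is not shown.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest(query, embeddings, k=2):
    """Return the names of the k entities most similar to the query vector."""
    scored = [(cosine(query, vec), name) for name, vec in embeddings.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]

# Hypothetical 2-D embeddings keyed by entity identifier.
embeddings = {
    "entity:a": [1.0, 0.0],
    "entity:b": [0.9, 0.1],
    "entity:c": [0.0, 1.0],
}
neighbours = nearest([1.0, 0.0], embeddings, k=2)
```

In a combined query, the graph pattern would first select candidate entities and a function like `nearest` would then rank them by similarity.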
Onto2Graph
Generates entailment-aware graph projections of OWL ontologies suitable for downstream graph machine learning while preserving the axioms' deductive structure.
AberOWL
Ontology repository delivering OWL EL reasoning as a service: stores hundreds of bio-ontologies, exposes SPARQL with class-expression query expansion, and powers semantic search over PubMed/PMC.
AbermOWL
Cloud-deployed federated reasoning system that combines AberOWL's symbolic reasoner with mOWL embeddings, scaling reasoning over SROIQ(D) by complementing exact inference with neural approximation.
UNMIREOT
Identifies, diagnoses, and semi-automatically repairs hidden contradictions and unsatisfiable classes introduced by partial imports (MIREOT) into biomedical ontologies.
OntoFunc
Ontology-driven enrichment analysis that supports arbitrary OWL ontologies and full subsumption-aware aggregation, not only GO.
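Subsumption-aware enrichment can be sketched in two steps: propagate each gene's annotations to all ancestor classes, then test every class for over-representation in the study set. The version below uses a one-sided hypergeometric test and only named-class parent links; the function names, the choice of test, and the toy data are assumptions for illustration rather than OntoFunc's actual procedure.

```python
from math import comb

def propagate_annotations(annotations, parents):
    """Add every ancestor class to each gene's annotation set."""
    propagated = {}
    for gene, classes in annotations.items():
        closed = set()
        stack = list(classes)
        while stack:
            cls = stack.pop()
            if cls not in closed:
                closed.add(cls)
                stack.extend(parents.get(cls, ()))
        propagated[gene] = closed
    return propagated

def hypergeom_tail(k, K, n, N):
    """P(X >= k) when drawing n genes from N, of which K carry the class."""
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)

def enrichment(study, annotations, parents):
    """Return class -> p-value for over-representation in the study set."""
    propagated = propagate_annotations(annotations, parents)
    study = set(study) & set(propagated)
    all_classes = set().union(*propagated.values())
    p_values = {}
    for cls in all_classes:
        annotated = {g for g, cs in propagated.items() if cls in cs}
        p_values[cls] = hypergeom_tail(
            len(annotated & study), len(annotated), len(study), len(propagated)
        )
    return p_values

# Hypothetical ontology (C1 and C2 are subclasses of C0) and annotations.
parents = {"C1": ["C0"], "C2": ["C0"]}
annotations = {"g1": {"C1"}, "g2": {"C2"}, "g3": {"C1"}, "g4": set()}
p_values = enrichment({"g1", "g3"}, annotations, parents)
```

Because annotations are propagated first, the parent class C0 is tested with the combined evidence of C1 and C2. A real analysis would also correct the p-values for multiple testing.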
Knowledge Graphs & Drug Discovery
Biomedical knowledge graph construction, drug repurposing, drug-drug interactions, and graph-based prediction.
Multi-Drug Embedding
Drug repurposing method that learns joint embeddings of drugs, targets and diseases from biomedical knowledge graphs and the scientific literature.
NanoDesigner
Iterative refinement framework for nanobody/CDR design that explicitly models the antigen–CDR interdependence; companion code to the NanoDesigner paper.
DeepMOCCA
Graph neural network for cancer survival analysis that integrates multi-omics (mutation, expression, methylation, CNV) with a curated cancer knowledge graph.
SmuDGE
Semantic disease-gene embeddings; integrates phenotype, function and pathway ontologies into a unified vector space for downstream prediction.
PathoPhenoDB
Curated database of pathogens and the disease phenotypes they cause, distributed as an OWL ontology and an interactive web application.
Bio2Vec
Smart-analytics infrastructure for the life sciences. Integrates biomedical knowledge graphs (Bio2RDF and beyond) with semantic-similarity and embedding-based reasoning.
Teaching & Tutorials
Self-contained teaching material accompanying our courses and review articles.
Machine Learning with Ontologies
Companion code and worked examples for the Briefings in Bioinformatics tutorial review; the most-starred repository in the group.
Ontology Tutorial
Hands-on tutorial that walks new users through OWL, automated reasoning, and ontology-aware data analysis; basis for the AI in Biomedicine summer school.
mOWL Tutorial
Step-by-step worked notebooks that demonstrate every embedding family in mOWL on protein-function, gene-disease, and ontology-completion tasks.
Ontologies & Resources
Bio-ontologies and curated resources maintained by the group.
Units of Measurement Ontology (UO)
OBO Foundry ontology of units of measurement; aligned with QUDT and used across biomedical data standards.
PhenomeNET
Cross-species phenotype ontology and similarity network combining HPO, MPO, ZP and others; the substrate behind PhenomeNET-VP and DeepPheno.
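As an illustration of ontology-based phenotype similarity, the sketch below compares two phenotype sets by the Jaccard index of their ancestor-closed class sets. This is one simple measure chosen for the example; the measure actually used in the network is not stated here, and the function names and class identifiers are hypothetical.

```python
def with_ancestors(classes, parents):
    """Return the given classes together with all of their ancestors."""
    closed = set()
    stack = list(classes)
    while stack:
        cls = stack.pop()
        if cls not in closed:
            closed.add(cls)
            stack.extend(parents.get(cls, ()))
    return closed

def jaccard_similarity(a, b, parents):
    """Jaccard index of two phenotype sets after ancestor closure."""
    closed_a = with_ancestors(a, parents)
    closed_b = with_ancestors(b, parents)
    union = closed_a | closed_b
    return len(closed_a & closed_b) / len(union) if union else 0.0

# Hypothetical integrated hierarchy: P:2 and P:3 share the parent P:1.
parents = {"P:2": ["P:1"], "P:3": ["P:1"], "P:1": ["P:0"]}
similarity = jaccard_similarity({"P:2"}, {"P:3"}, parents)
```

Closing each set under ancestors lets two entities with different but related phenotypes score above zero, which is what allows comparison across species once their phenotype ontologies are integrated.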