CellMLOntologies

How we intend to integrate ontologies for CellML.

It is useful to read ModelingApproach before reading this.

CellML Ontologies

The ontologies are intended to serve several goals, one of which is model representation (see ModelRepresentation).

One problem that needs solving is how to represent the biological model that a CellML model describes. We hope to solve this by using ontologies. The following problems need to be thought through:

  • representing the biological entities and processes that CellML model objects stand for, i.e., how do we define these associations at the metamodel level, and how do we represent them in the CellML/XML serialization?

  • integration of this information into the anatomy repository.

    see CellMLAnatomyIntegration

Discussion

The relationship between CellML models and biological models is at least many to one; i.e., many CellML models can describe the same conceptual biological model. It therefore makes sense to represent biological models as separate instances of a biological modelling framework and to link CellML model structures to these. When we consider the different ways in which a biological model can be represented (SBML and BioPAX are just two examples), the relationship between a CellML model and the representations of its biological model is one to many.

The mechanism for linking CellML models to biological models is defined as an extension of the CellML metadata specification (CellMLMetadataSpecification). The extension has not yet been formally described; it will be described as part of this documentation. The basis for the linking is the cmeta:id attribute, which can be attached to any CellML element. Using RDF descriptions (the basis of the metadata specification), we can form relations between the identified CellML elements and any external model definitions, such as biological models. These relations should also work in the inverse direction, so that biological models can be used to form queries that return the relevant CellML models.
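As a rough illustration of the idea (not the extension itself, which is still unspecified), the following Python sketch reads a toy CellML fragment, collects the elements that carry a cmeta:id, and builds both a forward and an inverse index from RDF descriptions. The CellML, cmeta and RDF namespaces are the standard ones; the bio namespace, the bio:represents predicate, the example.org URIs and the function name index_bindings are all assumptions made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

CELLML_NS = "http://www.cellml.org/cellml/1.0#"
CMETA_NS = "http://www.cellml.org/metadata/1.0#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical namespace and predicate: the real linking extension
# has not been formally described yet.
BIO_NS = "http://example.org/bio-binding#"

# Toy model; the component names and target URIs are invented.
MODEL_XML = f"""
<model xmlns="{CELLML_NS}" xmlns:cmeta="{CMETA_NS}"
       xmlns:rdf="{RDF_NS}" xmlns:bio="{BIO_NS}" name="toy_glycolysis">
  <component name="hexokinase_flux" cmeta:id="hk_flux"/>
  <component name="glucose_pool" cmeta:id="glc"/>
  <rdf:RDF>
    <rdf:Description rdf:about="#hk_flux">
      <bio:represents rdf:resource="http://example.org/biomodel/glycolysis#hexokinase_reaction"/>
    </rdf:Description>
    <rdf:Description rdf:about="#glc">
      <bio:represents rdf:resource="http://example.org/biomodel/glycolysis#glucose"/>
    </rdf:Description>
  </rdf:RDF>
</model>
"""


def index_bindings(xml_text):
    """Return (elements by cmeta:id, forward index, inverse index)."""
    root = ET.fromstring(xml_text)

    # Every element carrying a cmeta:id can be the subject of a description.
    by_id = {}
    for element in root.iter():
        cmeta_id = element.get(f"{{{CMETA_NS}}}id")
        if cmeta_id is not None:
            by_id[cmeta_id] = element

    forward = defaultdict(set)  # cmeta:id -> biological model URIs
    inverse = defaultdict(set)  # biological model URI -> cmeta:ids
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        about = description.get(f"{{{RDF_NS}}}about", "")
        # Only local fragment references ("#some_id") are handled here.
        if not about.startswith("#") or about[1:] not in by_id:
            continue
        for statement in description.findall(f"{{{BIO_NS}}}represents"):
            target = statement.get(f"{{{RDF_NS}}}resource")
            if target:
                forward[about[1:]].add(target)
                inverse[target].add(about[1:])
    return by_id, forward, inverse


by_id, forward, inverse = index_bindings(MODEL_XML)
glucose = "http://example.org/biomodel/glycolysis#glucose"
for cmeta_id in sorted(inverse[glucose]):
    print(cmeta_id, by_id[cmeta_id].get("name"))
```

The inverse index is what supports the query direction described above: starting from a term in the biological model, it returns the CellML elements bound to it. A real implementation would more likely use an RDF store and query language than a hand-built dictionary.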

How we represent the terms that refer to biological entities and processes is critical to the reuse of these terms and to the association of these models with other models (perhaps in different model representations). For now we represent biological entities and processes using our own chosen terminology, while ensuring that each term is an instance whose properties identify the same term in other terminologies/ontologies such as UMLS, Terminologia, SAEL, Reactome, and Gene Ontology.
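One way to picture a local term that carries identifying properties for other terminologies is a small record with cross-references, as in the Python sketch below. The class BioTerm, its fields, the helper functions and all identifiers shown are hypothetical placeholders, not real entries from any of the terminologies named above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BioTerm:
    """A locally named term plus cross-references to external terminologies."""
    local_name: str
    kind: str           # "entity" or "process"
    xrefs: tuple = ()   # (terminology, identifier) pairs


def build_xref_index(terms):
    """Map each (terminology, identifier) pair to the local terms that cite it."""
    index = {}
    for term in terms:
        for key in term.xrefs:
            index.setdefault(key, []).append(term)
    return index


def same_concept(a, b):
    """Treat two terms as the same concept if they share any cross-reference."""
    return bool(set(a.xrefs) & set(b.xrefs))


# Placeholder identifiers only; real ones would come from each terminology.
glycolysis = BioTerm(
    "glycolysis", "process",
    (("Gene Ontology", "GO:PLACEHOLDER-1"), ("Reactome", "REACT-PLACEHOLDER-1")),
)
glycolytic_pathway = BioTerm(
    "glycolytic_pathway", "process",
    (("Gene Ontology", "GO:PLACEHOLDER-1"),),
)

index = build_xref_index([glycolysis, glycolytic_pathway])
print([t.local_name for t in index[("Gene Ontology", "GO:PLACEHOLDER-1")]])
print(same_concept(glycolysis, glycolytic_pathway))
```

Keeping the cross-references as data on the term, rather than encoding them in the term's name, is what lets two differently named local terms be recognised as the same concept.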

A biological model usually consists of individual physical entities that form a range of relations with each other. These relations may be quite complex and may represent an entire system of biological entities and sub-relations. One question is to what extent we need to bind all CellML elements to the relevant entities or relations in the biological model. For instance, we may have a biological model representing glycolysis, and an associated CellML model that refers to it. Do we also require all the associations between CellML elements and the sub-elements of the biological model to be created? It would be wise to, but how simple is this when there may not be a one-to-one correspondence between the CellML model elements and the biological model elements? We could force a modeler to add suitable abstractions to the CellML model so that these associations can be made. An example is the collection of components that make up a single reaction in some biochemical model.

A suitable abstraction may be:

  • individual models that represent the abstraction
  • encapsulation - although that is usually left for hiding interfaces
  • user defined grouping - this seems to be the best candidate
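To make the grouping option concrete, here is a minimal Python sketch in which several CellML components are collected into one user-defined group, and the group as a whole is bound to a single element of the biological model. It also reports which components and which biological elements are left without a binding. The function check_group_bindings, the data layout and all names in the example are assumptions for illustration only.

```python
def check_group_bindings(components, groups, bindings, bio_elements):
    """Report gaps in a group-based binding.

    components:   iterable of CellML component names
    groups:       dict of group name -> set of component names
    bindings:     dict of group name -> biological model element
    bio_elements: iterable of biological model elements
    """
    grouped = set().union(*groups.values())
    return {
        # Components that belong to no group and so cannot be bound.
        "ungrouped_components": sorted(set(components) - grouped),
        # Biological elements that no group is bound to.
        "unbound_bio_elements": sorted(set(bio_elements) - set(bindings.values())),
        # Bindings that refer to a group which was never defined.
        "unknown_groups": sorted(g for g in bindings if g not in groups),
    }


# Invented example: four components together make up one reaction.
components = ["substrate", "product", "enzyme", "rate_law", "membrane"]
groups = {"hexokinase_step": {"substrate", "product", "enzyme", "rate_law"}}
bindings = {"hexokinase_step": "glycolysis#hexokinase_reaction"}
bio_elements = ["glycolysis#hexokinase_reaction", "glycolysis#glucose_transport"]

report = check_group_bindings(components, groups, bindings, bio_elements)
for key, value in report.items():
    print(key, value)
```

Binding the group rather than each component sidesteps the missing one-to-one correspondence: the reaction in the biological model maps to exactly one CellML object, even though that object is spread over several components.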

Binding to other ontologies

This needs a more detailed explanation. For now, I at least wanted to give a reference to a useful piece of information about this kind of binding within OWL.