CellML/ModelML Road Map

Model representation

CellML/ModelML

CellML was originally designed to describe and exchange models of cellular and subcellular processes. The design principles that were applied to the construction of a language with this relatively narrow focus are equally applicable to the specification of a language with a much wider scope. Because of this, CellML has been found to be suitable for describing and exchanging models of a wider range of processes than is indicated by its name.

We are keen to further develop these ideas by identifying the core model description elements of CellML and using those elements to provide a base model specification markup language, ModelML.

CellML 1.0/1.1

The specifications of CellML 1.0 and 1.1 are stable and freely available on the CellML web site. These specifications are expressed in prose. In such a form they carry ambiguities and are not directly machine interpretable. Machine interpretable specifications of CellML 1.0 and 1.1 syntax are publicly available as DTDs, included as appendices in the language specification documents. Because of the limitations of the document type definition specification, these DTDs provide only a rudimentary specification of valid syntactical structure. To describe the semantics of CellML, schema languages with much greater expressiveness must be used. Two open community developments address this in different ways for particular purposes. The Semantic Web group have a number of use-cases that are relevant to CellML. The OWL language is the current development that builds on RDF/RDF-S, which forms the foundation of the CellML metadata specification. Another group, the Object Management Group (OMG), also have use-cases that are relevant to CellML. Representations useful to CellML in this case are the Unified Modelling Language (UML) and the Meta Object Facility (MOF).

We are researching the best way to represent CellML in an OWL ontology, especially the correct way to integrate with other biological ontologies, such as BioPAX. We are also researching what benefits might be gained from using OMG representations and toolsets; part of this work is being included in a joint Biochemical Pathways RFP submission with Lion Biosciences and SBML to the OMG's Lifesciences group.

No further development of CellML 1.0 is expected. The Draft CellML 1.1 and CellML Metadata specifications are ready to be elevated to full specification status.

CellML/ModelML

CellML 1.0 and 1.1 contain a mixture of basic elements, providing the building blocks for constructing models, and higher-order elements, those dealing with reactions. It has been recognised by the CellML development team that it is desirable to separate out the specialised higher-order elements. We intend to evolve from the CellML 1.1 specification, removing the higher-order elements, to create a specification for a new base modelling language, tentatively named ModelML. The OWL representation of the CellML 1.1 specification will form a useful basis for developing a CellML/ModelML ontology.

Because the base model specification language, ModelML, will have no domain-specific constructs, such information must be included in the form of metadata. Our preferred way of achieving this is to use Semantic Web languages, such as RDF and OWL, to associate typing information with ModelML constructs such as models, components, variables, mathematics, groups, and connections.
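As a rough illustration of what such typing metadata could look like, the sketch below builds RDF statements that assign an ontology class to individual model constructs and prints them in N-Triples syntax. The model URI, the ontology namespace, and the class names (CellMembrane, MembranePotential) are hypothetical and chosen only for this example; the only real identifier is the standard rdf:type property.

```python
# Standard RDF property used to state that a resource is an instance of a class.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical URIs, for illustration only.
MODEL_URI = "http://example.org/models/example_model.xml"
ONTOLOGY = "http://example.org/ontology#"


def type_triple(model_uri, construct_id, ontology_class):
    """Return a (subject, predicate, object) triple typing one model construct."""
    return (f"{model_uri}#{construct_id}", RDF_TYPE, ONTOLOGY + ontology_class)


def to_ntriples(triples):
    """Serialise URI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = [
    type_triple(MODEL_URI, "membrane", "CellMembrane"),    # a component
    type_triple(MODEL_URI, "V", "MembranePotential"),      # a variable
]
print(to_ntriples(triples))
```

Because the typing lives in separate statements that point at construct identifiers, the model description itself stays free of domain-specific elements.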

Domain-specific ontologies are currently being researched.

Interoperability

CellML is one way to represent mathematical models of biological systems. There are others, most notably the Systems Biology Markup Language (SBML). SBML is driven by a set of goals that differ a little from those of CellML, but the two certainly overlap in some areas of biological systems and modelling practice. Although the languages share some common structures (MathML and a metadata specification), it does not make sense to merge them. It makes more sense to provide tools for translating between the languages where appropriate, so that both the CellML and SBML communities can share models and application environments, such as simulation packages.

Tools currently available for translating between SBML and CellML/ModelML model representations include:

  • Maria Schilstra's CellML to SBML converter.

Tools that are required for translating between SBML and CellML/ModelML model representations include:

  • An SBML to CellML converter (this is a medium priority project).

Repositories

The main goals of the repository are to:

  • Present a resource for the community to store, retrieve, search, reference, and reuse (see 'import' construct in language description) CellML models. A publicly accessible repository of published models is available at http://www.cellml.org/models/. This site contains over 200 examples of models expressed as CellML files, together with associated metadata, citations, and figures. The repository is currently file-based and served by a web server, and it will need to evolve beyond this.
  • Provide a context for the models within the larger biological domain of data repositories. Examples of repositories, or interfaces to such repositories, are BioPax, Tambis, BioCyc, Gene Ontology NG, Physiome, aMaze, PSI, MGED, PEDRo, and Digital Anatomist. We are keen to link the CellML/ModelML programme into these efforts by developing interfaces and applications that enable information in such sites to be associated with our own model representations and development. This is part of the ontology development work (section 1.1.2).

Model use

Publication

CellML uses Content MathML to represent the underlying mathematical relationships between model variables. Specific tools allow the generation of structured documents representing the topology and equations from the mathematical representation used in the CellML model. This facility provides a way for automatically creating representations of models for publication.

Tools currently available for rendering publishable representations of CellML/ModelML models include:

  • The cellml.org MathML Renderer. This is an XSLT transform that converts the CellML subset of MathML to a LaTeX file (available from http://www.cellml.org/tools);
  • The cellml.org Equation Extractor for Viewing MathML is an XSLT transform that extracts the MathML from a CellML document and formats it into Presentation MathML (available from http://www.cellml.org/tools).
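To make the idea of rendering equations from Content MathML more concrete, here is a minimal sketch of such a conversion. It is written in Python rather than XSLT, it is not the cellml.org renderer, and it handles only a handful of elements (ci, cn, and apply with eq, plus, minus, times, and divide). The example equation and the function names are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "{http://www.w3.org/1998/Math/MathML}"

# Infix operators handled by this sketch, mapped to their LaTeX form.
INFIX = {"plus": " + ", "minus": " - ", "times": " \\cdot "}


def local_name(element):
    """Strip the MathML namespace from an element tag."""
    return element.tag.replace(MATHML_NS, "")


def to_latex(element):
    """Convert a small subset of Content MathML to a LaTeX string."""
    name = local_name(element)
    if name == "math":
        return to_latex(element[0])
    if name in ("ci", "cn"):
        return element.text.strip()
    if name != "apply":
        raise ValueError(f"unsupported element: {name}")
    # In Content MathML the first child of <apply> names the operator.
    operator = local_name(element[0])
    operands = [to_latex(child) for child in element[1:]]
    if operator == "eq":
        return " = ".join(operands)
    if operator == "divide":
        return "\\frac{" + operands[0] + "}{" + operands[1] + "}"
    if operator == "minus" and len(operands) == 1:
        return "-" + operands[0]
    if operator in INFIX:
        return "(" + INFIX[operator].join(operands) + ")"
    raise ValueError(f"unsupported operator: {operator}")


# Hypothetical equation: I = (V - E) / R
EXAMPLE = """<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply><eq/>
    <ci>I</ci>
    <apply><divide/>
      <apply><minus/><ci>V</ci><ci>E</ci></apply>
      <ci>R</ci>
    </apply>
  </apply>
</math>"""

print(to_latex(ET.fromstring(EXAMPLE)))
```

The sketch parenthesises every infix sub-expression, which is safe but verbose; a production renderer would take operator precedence into account.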

Visualisation

Tools are required to provide visual representations of models. Both generic and domain specific visualisation tools will be required. Generic tools provide visual representations of core ModelML model information (e.g. components and their connections, variables and their units, mathematics of components, grouping relationships, import hierarchies). Domain specific tools provide visual representations of additional model information provided by the ontologies associated with models (e.g. reaction pathways, anatomical/structural relationships).
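A generic view of components and their connections can be derived directly from the model document. The sketch below, which is only an illustration and not one of the tools listed in this section, reads the component and connection elements of a stripped-down CellML 1.0 document and emits a Graphviz DOT graph. The example model, its component names, and the function name model_to_dot are assumptions; a complete model would also declare variables, units, and mathematics.

```python
import xml.etree.ElementTree as ET

CELLML_NS = {"cellml": "http://www.cellml.org/cellml/1.0#"}

# Hypothetical, deliberately incomplete model used only for this example.
EXAMPLE = """<model name="example" xmlns="http://www.cellml.org/cellml/1.0#">
  <component name="membrane"/>
  <component name="sodium_channel"/>
  <connection>
    <map_components component_1="membrane" component_2="sodium_channel"/>
    <map_variables variable_1="V" variable_2="V"/>
  </connection>
</model>"""


def model_to_dot(cellml_text):
    """Return a Graphviz DOT graph of the model's components and connections."""
    model = ET.fromstring(cellml_text)
    model_name = model.get("name")
    lines = [f'graph "{model_name}" {{']
    # One node per component.
    for component in model.findall("cellml:component", CELLML_NS):
        name = component.get("name")
        lines.append(f'  "{name}";')
    # One edge per connection, labelled with the mapped variables.
    for connection in model.findall("cellml:connection", CELLML_NS):
        mapping = connection.find("cellml:map_components", CELLML_NS)
        first = mapping.get("component_1")
        second = mapping.get("component_2")
        mapped = connection.findall("cellml:map_variables", CELLML_NS)
        label = ", ".join(v.get("variable_1") for v in mapped)
        lines.append(f'  "{first}" -- "{second}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


print(model_to_dot(EXAMPLE))
```

The resulting text can be passed to a graph layout program; domain-specific views would additionally need the ontology information associated with the model.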

Tools currently available for CellML/ModelML model visualisation include:

  • Java-based biochemical pathway diagram tool;
  • The front end to a domain-specific CMISS cell simulator;
  • Two part 4 software engineering students are constructing an interactive SVG representation of a CellML 1.1 model. The goal is to produce visualizations that help a modeller understand the relationships in their model. SVG helps us to render dynamic and interactive content while maintaining an interpretation of the meaning of the elements being rendered, in this case CellML modelling structures. The focus will be on generating nice layouts for different visual purposes, and to provide dynamic context dependent information.

Creation/modification

Both generic and domain specific model editing tools are required to create and modify models. Generic tools provide facilities for creating and modifying core ModelML model information. Domain specific tools provide added facilities to edit model information provided by the ontologies associated with models.

Tools currently available, or under development, for CellML/ModelML model editing include:

  • Virtual Cell (National Resource for Cell Analysis and Modeling) is a Java-based modelling and simulation environment that imports and exports CellML. The Virtual Cell software is freely available, and simulations are run on a remote server.
  • Cellular Open Resource (Mechano-Electric Feedback lab, Oxford Cardiac Electrophysiology Group) is a Microsoft Windows application for developing and running electrophysiological cellular and multi-cellular simulations.
  • Nigel Lovell and Socrates Dokos are creating a Java-based tool for authoring and parsing CellML models;
  • The Content MathML Editor is a Java-based visual editor for creating and modifying Content MathML. It provides concurrent Content MathML, its associated XML tree, and standard visual representations of mathematical expressions. The editor can read and write Content MathML representations of mathematical expressions used in CellML/ModelML;
  • Two part 4 software engineering students are constructing an interactive CellML 1.1 model editor based upon the Java-based content MathML editor. The goal is to produce an integrated generic model editor that provides facilities to create and modify CellML 1.1 model information.

Simulation

Simulation tools provide methods to solve the CellML/ModelML representation of models given particular boundary conditions. Because of the wide variety of model types able to be represented in CellML/ModelML, all simulation tools are currently domain specific.

Tools currently available, or under development, for CellML/ModelML model simulation include:

  • Domain specific CMISS cell simulator;
  • C++ implementation of a CellML 1.0 library;
  • Python implementation of a CellML 1.1 library;
  • Nigel Lovell and Socrates Dokos' Java implementation of a CellML library.
  • Virtual Cell (National Resource for Cell Analysis and Modeling) is a Java-based modeling and simulation environment that imports and exports CellML. The Virtual Cell software is freely available, and simulations are run on a remote server.
  • Cellular Open Resource (Mechano-Electric Feedback lab, Oxford Cardiac Electrophysiology Group) is a Microsoft Windows application for developing and running electrophysiological cellular and multi-cellular simulations.

We need to develop, or help others develop, libraries for integrating CellML into simulation environments such as MATLAB, Mathematica, and BioUML.

Modelling community

People trying to build models, figure out CellML, use CellML, etc. The role of the current cellml.org site...

Developer tools

Developer-centric tools are those useful for application developers and are centred around particular implementations of libraries for CellML.

Tools that are required for CellML/ModelML application developers include:

  • The development of language- and process-independent CellML APIs (this is a high priority task);
  • The development of implementations of these APIs in specific languages (e.g. C, C++, Java, Python, Fortran, MATLAB, and Mathematica) (these are high priority tasks).

Tool development community

This area covers the management of open source development projects associated with CellML. Matt Halstead is putting together the tools required to support community-based application development. Priorities for this are:

  • Specifications of CellML APIs for
    • CellML/ModelML parsing, creation, and in memory representation
    • Model repository interfaces
  • Implementation of APIs
  • Specific tools for
    • Model authoring
    • Model visualization
    • Model repository workflow
    • Biological database integration
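As a purely hypothetical sketch of what the creation and in-memory representation side of such an API might look like, the example below defines three small classes and serialises them as CellML 1.1 XML. The class names, fields, and the to_xml method are assumptions made for illustration and do not describe any existing or planned CellML API; connections, units definitions, and mathematics are omitted.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET

CELLML_URI = "http://www.cellml.org/cellml/1.1#"
# Serialise CellML elements in the default namespace (no prefix).
ET.register_namespace("", CELLML_URI)


def qualified(tag):
    """Return the namespace-qualified form of a CellML element name."""
    return f"{{{CELLML_URI}}}{tag}"


@dataclass
class Variable:
    name: str
    units: str


@dataclass
class Component:
    name: str
    variables: list = field(default_factory=list)


@dataclass
class Model:
    name: str
    components: list = field(default_factory=list)

    def to_xml(self):
        """Serialise the in-memory model as a CellML 1.1 XML string."""
        root = ET.Element(qualified("model"), {"name": self.name})
        for component in self.components:
            node = ET.SubElement(root, qualified("component"), {"name": component.name})
            for variable in component.variables:
                attributes = {"name": variable.name, "units": variable.units}
                ET.SubElement(node, qualified("variable"), attributes)
        return ET.tostring(root, encoding="unicode")


# Hypothetical model with one component and one variable.
model = Model("example", [Component("membrane", [Variable("V", "volt")])])
print(model.to_xml())
```

Keeping the in-memory classes free of any parsing or file handling is one way to let the same representation sit behind parsers, editors, and repository interfaces.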