Authors:
Melanie Nelson (Physiome Sciences Inc.)
Warren Hedley (Bioengineering Institute, University of Auckland)
Collaborators:
David Bullivant (Bioengineering Institute, University of Auckland)
Kam-Chuen Jim (Physiome Sciences Inc.)
Scott Lett (Physiome Sciences Inc.)
Dave Nickerson (Bioengineering Institute, University of Auckland)
Poul Nielsen (Bioengineering Institute, University of Auckland)
It would be very difficult to develop a robust data model to cover all
aspects of biology. By limiting the scope of CellML, we increase our
chances of developing a robust and flexible data model that will remain
valid as biological modeling advances. The scope of CellML is the
information that should be contained within the CellML data model and
which will therefore be represented by CellML elements (i.e. elements
defined in the CellML Document Type Definitions, or DTDs). Note that
information can also be included in a CellML document by including an
independent DTD (see the section “Use Of Other Languages”).
This document details the scope of CellML, as agreed in a
teleconference on June 28, 2000. Physiome Sciences was represented by
Melanie Nelson, Kam Jim, and Scott Lett. The University of Auckland was
represented by Warren Hedley, Poul Nielsen, David Bullivant, and David
Nickerson (standing in for Peter Hunter).
The CellML language itself will be limited to the following:
Description of the organizational scheme/structure of the model. This
includes the
organization of a model into components (parts), and how these parts
are connected together to create a complete model. Note that this
description is intended primarily for use by computer programs that
want to be able to read and run a cell model.
Metadata describing the model and its parts, to be used to help users
categorize, organize, and retrieve models and model parts. This
metadata is intended for both computer programs accessing the model (so
that they can search for models and model components based on user
queries) and for human users of the model (who might want to view
portions of the metadata to evaluate whether a particular model meets
their requirements). [Technical note: The most basic set of metadata
can be handled with the Dublin Core RDF specification. (RDF is the W3C's
method for handling metadata; the Dublin Core is a set of specific
metadata elements.) Other metadata can be handled in a CellML-specific
RDF schema or in elements within CellML itself. A CellML-specific RDF
schema would be so tightly coupled with CellML that its development can
be considered part of the same project.]
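As a rough illustration of the technical note above, the following Python sketch builds a small Dublin Core record in RDF/XML for a model. The RDF and Dublin Core namespace URIs are the published ones; the function name dublin_core_record, the "#sketch_model" identifier, and the title and creator values are assumptions made up for this example, and the layout is not taken from any CellML specification.

```python
import xml.etree.ElementTree as ET

# Published namespace URIs for RDF and the Dublin Core element set.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Use conventional prefixes when the tree is serialized.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def dublin_core_record(about, title, creator):
    """Build an rdf:RDF element describing one resource (hypothetical helper)."""
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": about}
    )
    ET.SubElement(desc, f"{{{DC_NS}}}title").text = title
    ET.SubElement(desc, f"{{{DC_NS}}}creator").text = creator
    return rdf


# All values below are invented for illustration.
record = dublin_core_record("#sketch_model", "Hypothetical cell model", "A. Modeller")

# A program searching a model repository could read the metadata back like this.
titles = [e.text for e in record.iter(f"{{{DC_NS}}}title")]
creators = [e.text for e in record.iter(f"{{{DC_NS}}}creator")]

print(ET.tostring(record, encoding="unicode"))
```

The same record serves both audiences named above: a program can query the title and creator fields, and a human can read the serialized form directly.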
When a cell model document needs to contain information that is outside
of the CellML scope, that information will be represented using separate
languages or storage formats. The CellML data model will only contain the information needed
to connect this "outside information" into the biological model. For
example, we currently use an <equations> element to indicate where math belongs in the biological model.
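The division of labour described here can be sketched in Python. The example below parses a small document in which an <equations> element (the hook mentioned above) wraps MathML content, then separates elements that belong to the CellML data model from those that belong to the other language. The CellML namespace URI, the <model> and <component> element names, and the function split_by_language are assumptions for illustration only; just the MathML namespace URI and the <equations> element come from existing sources.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI for CellML elements (an assumption for this sketch).
CELLML_NS = "http://example.org/cellml-sketch"
# The MathML namespace defined by the W3C.
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# <model> and <component> are illustrative names; <equations> marks where
# the math belongs in the biological model.
DOC = f"""
<model xmlns="{CELLML_NS}" name="sketch_model">
  <component name="membrane">
    <equations>
      <math xmlns="{MATHML_NS}">
        <apply><eq/><ci>I</ci><apply><times/><ci>g</ci><ci>V</ci></apply></apply>
      </math>
    </equations>
  </component>
</model>
"""


def split_by_language(xml_text):
    """Return (cellml_tags, foreign_tags) as lists of local element names."""
    root = ET.fromstring(xml_text)
    cellml, foreign = [], []
    for elem in root.iter():  # document order
        # ElementTree stores namespaced tags as "{namespace}local".
        if elem.tag.startswith("{"):
            ns, local = elem.tag[1:].split("}", 1)
        else:
            ns, local = "", elem.tag
        (cellml if ns == CELLML_NS else foreign).append(local)
    return cellml, foreign


cellml_tags, foreign_tags = split_by_language(DOC)
print("CellML elements:", cellml_tags)
print("Other-language elements:", foreign_tags)
```

In this sketch the CellML side knows nothing about the mathematics itself; it only records where the MathML block attaches to the model.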
The CellML working group will decide which languages can be legally
included in CellML as the need to include different types of
information arises. In some cases, the CellML specification may limit
the subset of valid information from the separate language or format
that can be included in CellML.
Here is a list of information that we currently plan to
represent using other languages. Inclusion on this list means that this
information will not be represented using CellML elements. It does not
imply anything about whether or not the information can be included in
CellML documents. Some of this information will be included within
CellML documents (as we currently do with MathML), while other
information may be stored in a separate document (for instance,
rendering information may be kept separate).
Math (currently implemented via MathML)
Textual and graphical documentation of the model, intended solely for
display to human users of the model (probably implemented via HTML).
Information about the actual computation of a cell model.
Data sets.
Ontologies
[Note: An ontology is the set of rules that govern what is allowable in
a biologically correct model. We anticipate that different types of
modeling will need different ontologies. It should also be possible to
define a model without using an ontology at all (in effect, the
ontology would be inside the modeller's head). However, using
ontologies would allow some "short cuts" in the model definition
process. Also note: an ontology can be separate from the model whether
we choose to express ontology information in CellML or in some other
language.]
Information about rendering of models.
This list is not meant to be a comprehensive list of information that may be
used by CellML documents and represented in another language. Other
types of information may be included at the discretion of the CellML
working group. However, none of the items on this list will be
represented using CellML elements: they are not within the scope of the
CellML data model.