Personal tools
You are here: Home Specifications CellML 1.0
 

CellML 1.0 Specification

This Version:
          http://www.cellml.org/specifications/cellml_1_0
Latest Version:
          http://www.cellml.org/specifications
Previous Versions:
          http://www.cellml.org/specifications/archive
Authors:
          Warren Hedley (Bioengineering Research Group, University of Auckland)
          Melanie Nelson (Physiome Sciences Inc.)
Contributors:
          David Bullivant (Bioengineering Research Group, University of Auckland)
          Autumn Cuellar (Bioengineering Research Group, University of Auckland)
          Yi Ge (Physiome Sciences Inc.)
          Mark Grehlinger
          Kam Jim (Physiome Sciences Inc.)
          Scott Lett (Physiome Sciences Inc.)
          David Nickerson (Bioengineering Research Group, University of Auckland)
          Poul Nielsen (Bioengineering Research Group, University of Auckland)
          Haoyu Yu (Physiome Sciences Inc.)

Abstract

This document specifies CellMLTM 1.0, an XML-based language for describing and exchanging models of cellular and subcellular processes. MathML embedded in CellML documents is used to define the underlying mathematics of models. Models consist of a network of re-usable components, each with variables and equations manipulating those variables. Metadata may be embedded in CellML documents using RDF.

Status of this document

The 10 August 2001 version of the CellML 1.0 specification has been reviewed and endorsed by all of the authors of and contributors to the document. As of 10 August 2001, the syntax and semantics of all of the elements in the CellML 1.0 namespace are frozen.

The authors invite feedback from the public. Readers are encouraged to subscribe and send comments to the cellml-discussion mailing list. Alternatively, readers may send comments and questions via e-mail to info@cellml.org.

The latest version of the CellML specification is always available at the following URI:

The list of errata associated with this document is available at the following URI:

Contents

1  Introduction

1.1  Introduction to CellML

This document specifies CellMLTM 1.0, an XML-based language for describing and exchanging models of cellular and subcellular processes. CellML is being developed by scientists at the University of Auckland (in the Bioengineering Research Group) and at Physiome Sciences, Inc. The development of CellML is guided by an advisory board drawn from many different areas of biological modelling (see the project team page on the CellML website for more information). CellML is being developed as an open standard, and all interested parties are encouraged to send feedback to info@cellml.org, or to the cellml-discussion mailing list.

1.1.1  Purpose and scope of CellML

CellML is intended to support the definition of models of cellular and subcellular processes. CellML facilitates the re-use of models and parts of models by using a component-based architecture. Models are split into logical sub-parts called components that are connected together to form a model.

CellML separates the specification of the underlying mathematics of a model from a particular implementation of the model's solution. This makes a model independent of a particular operating system or programming language, and allows modellers to easily integrate parts of other peoples' models into their own models. CellML also allows the generation of equations for publishing from the same definition upon which the solution method is based, removing inconsistencies between the model and associated results in academic papers, and allowing others to reliably reproduce these results.

The scope of the CellML language is specifically limited to the definition of model structure. All other types of information that modellers need or want to include in a model document are incorporated using other languages. For instance, mathematics is included in CellML documents using Mathematical Markup Language (MathML). Metadata may be included using the Resource Description Framework and the Dublin Core Metadata Element Set.

1.1.2  What is XML?

The CellML language is defined in terms of a meta-language called eXtensible Markup Language (XML). XML is a standard published by the World Wide Web Consortium, the organisation responsible for defining many internet-related standards, most notably HTML. XML is essentially a means of adding structure to text documents, allowing machines to unambiguously associate text or binary data with a particular component in a document's data model.

XML is an appropriate medium for CellML because it is both human and machine readable. A model author can create a CellML document with a text editor or with CellML authoring software. XML is a well-defined and widely used specification. Many free software utilities and libraries for the processing of XML already exist, simplifying the development of CellML processing software. XML has also been designed to be usable over the internet, making CellML suitable for the interchange of models between software and databases at different locations.

A quick introduction to XML is available in the examples section of the CellML website.

1.1.3  Terminology

A model is a representation of the rules that govern the behaviour of a system. The terms in the following list provide two useful model classifications.


qualitative model

A model that defines the relationships between objects in the system, without defining any mathematics that represent the behaviour of those objects.

quantitative model

A model that defines the relationships between objects in the system, including the mathematics that represents the behaviour of those objects.

The terms defined in the following list are used in specifying the conformance of CellML documents and processing software to this specification.


may

Conforming CellML documents are permitted but not required to adhere to the limitation described. Conforming CellML software is permitted but not required to behave as described.

must

Conforming CellML documents must adhere to the limitation described. Conforming CellML software must behave as described.

for interoperability

A non-binding recommendation included to increase the chances that CellML documents will be processed in a consistent manner by different applications.

error

A violation of the rules of this specification; results are undefined. Conforming CellML software may detect and report an error and may recover from it. The recommended best practice is for software to make information about errors available to the user.

valid CellML document

A document that conforms to all of the rules in this specification.

valid CellML subset document

A valid CellML document that only uses MathML elements from the CellML subset defined in Section 4.2.3.

CellML conformant software

CellML processing software that will interpret any valid CellML subset document according to the language semantics and processor rules defined in this specification.

fully MathML capable software

Software that can correctly interpret the full set of MathML content markup elements.

1.2  Structure of the CellML Specification

The CellML specification is divided into several sections, each of which discusses a particular aspect of CellML:

  • Section 1 — Introduction — This section introduces CellML, XML, the terminology used throughout the specification, and the structure of the specification.
  • Section 2 — Fundamentals — This section explains concepts used in all other sections of the specification, such as the definition of a valid CellML identifier and the use of XML namespaces in CellML.
  • Section 3 — Model Structure — This section describes how models are organised in CellML. It includes an explanation of the use of a network of components to define a model and a discussion of variables in CellML.
  • Section 4 — Mathematics — This section describes how mathematical expressions are defined in CellML documents using MathML, and defines the CellML subset of MathML elements.
  • Section 5 — Units — This section explains the requirements for units in CellML and describes how a modeller can define arbitrary sets of units.
  • Section 6 — Grouping — This section explains how a model can be organised into logical encapsulation and geometric containment hierarchies by grouping components.
  • Section 7 — Reactions — This section introduces CellML syntax that allows the modeller to classify the involvement of the participants in the chemical expressions that make up reaction/pathway models.
  • Section 8 — Metadata Framework — This section describes how RDF is used in CellML documents to define metadata and associate it with models, model components, and other CellML elements.
  • Appendices — The appendices cover advanced and technical topics including the CellML DTD, recommendations for adding scripts to CellML documents, and units processing algorithms.

A valid CellML model can be created using nothing beyond the material covered in the fundamentals, model structure, mathematics, and units sections of the specification. The concepts in the remaining sections of the specification allow modellers to build more meaningful models.

Each section of the specification is further divided into five subsections:

  • Introduction — This subsection explains the purpose of the elements covered in the current section.
  • Basic Structure — This subsection describes the new elements and attributes introduced in the current section of the specification and how they are combined.
  • Examples — This subsection provides one or two basic examples of the correct use of the elements and attributes introduced in the current section of the specification. More extensive examples can be found in the examples section of the CellML website.
  • Rules for CellML Documents — This subsection provides formal rules for the use of the elements and attributes introduced in the current section to create valid CellML documents. These rules are specified as bulleted lists. Each rule may have an associated explanation, which appears directly after the rule, in square brackets ([]).
  • Rules for Processor Behaviour — This subsection provides some rules for correct CellML processor behaviour with regards to the elements and attributes introduced in the current section.

Throughout the CellML specification, all XML elements and attributes that occur in the text are in the CellML namespace unless explicitly stated otherwise.

2  Fundamentals

2.1  Introduction

This section of the CellML specification introduces some concepts that are used throughout the entire language and defines rules that apply to all or many of the other parts of the specification. These include the definition of names and use of namespaces in CellML.

2.2  Basic Structure

2.2.1  Definition of a valid CellML identifier

The most common use of a CellML identifier is the name attribute required on many basic elements in CellML. The value of this attribute can be used to reference that element from elsewhere in the model definition or from another model definition altogether. An object's name can generally be thought of as a unique identifier for that object. Although the XML specification defines a mechanism for specifying that the value of an attribute is unique across an entire document (with the ID attribute type), this functionality is not used in CellML 1.0 because an object's name need only be unique across its own class of objects.

The generation of computer code for running simulations is one of the target applications for CellML. The value of an object's name attribute is intended to be a suitable name for the same object when it is represented in computer code. For this reason CellML identifiers must consist of only alphanumeric characters and the underscore character ("_"), and are subject to some additional constraints outlined below. These names will generally not be the most effective way of identifying the object to humans working with CellML models as it is not possible to include whitespace or formatting. More human readable names can be defined and associated with CellML objects using the metadata functionality introduced in Section 8.

The XML specification is based on the Unicode standard, which defines a scheme for 16 bit character encoding. Thus it is possible to include, for instance, Japanese characters in a valid XML document. In the interests of making the code generation process as convenient as possible for those using mainstream programming languages, CellML identifiers are subject to the following constraints:

  • An identifier must consist only of alphanumeric characters from the US-ASCII character set and underscore characters,
  • An identifier must contain at least one alphanumeric character.

Convenient code generation is also the reason why colons, periods, and hyphens may not appear in CellML identifiers. CellML identifiers are case sensitive: a variable with an identifier of ABC is different from a variable with an identifier of abc.

2.2.2  Namespaces in CellML

Namespaces in XML is a companion specification to the XML 1.0 specification. XML namespaces add a second level of naming to elements and attributes, allowing processing software to distinguish between elements and attributes from different languages. A namespace is identified by a Uniform Resource Identifier (URI), which has the feature of being unique. The value of a namespace URI need have nothing to do with the XML document that uses it. However, it typically points to a document that defines the rules for the language. The URI may be mapped to a prefix, which may then be used in front of element and attribute names, separated by a colon. If not mapped to a prefix, the URI sets the default namespace for the current element and all of its children.

The CellML 1.0 specification defines a small number of elements and attributes and a namespace with which they must be associated. Putting CellML elements and attributes in the CellML namespace allows them to be distinguished from elements and attributes from other vocabularies with which CellML syntax might be combined in a CellML document. For instance, CellML makes use of the MathML vocabulary for the definition of equations, and all MathML elements must be placed in the MathML namespace in order for CellML processing software to recognise those elements. The use of namespaces also allows processing software to distinguish elements and attributes from different versions of the CellML specification. Applications that store their own proprietary data within a CellML document must define their own namespaces, and associate their own elements and attributes with those namespaces, as discussed in Section 2.2.3.

This specification is primarily concerned with the rules and semantics that relate to the elements and attributes in the CellML namespace, which are used in the definition of model structure. It is an error if documents contain elements and attributes in the CellML namespace that are not defined in this specification. This specification also defines how elements and attributes in the MathML, RDF and CellML Metadata namespaces can be combined with elements and attributes in the CellML namespace, and how processing software should deal with content in those namespaces. MathML is particularly important to CellML because content in this namespace is considered as fundamental as content in the CellML namespace. Metadata is defined using elements in the RDF namespace and linked to CellML elements using an id attribute in the CellML Metadata namespace as described in Section 8. Any CellML element may contain elements and attributes in other namespaces, which CellML processing software is free to ignore.

Table 1 lists the names, URIs and recommended prefixes of the namespaces referenced in this specification. For interoperability, the root element of any CellML document should set the default namespace and map the cellml prefix to the CellML 1.0 namespace URI. The latter simplifies the association of elements and attributes with the CellML namespace in regions of the document where the default namespace is not the CellML namespace. For instance, the MathML elements used to define equations are typically placed inside a <math> element that changes the default namespace to the MathML namespace. A cellml:units attribute in the CellML namespace can then be added to each of MathML's <cn> elements without having to redeclare the CellML namespace every time it is used.


Namespace Name Namespace URI Recommended Prefix
CellML "http://www.cellml.org/cellml/1.0#" cellml
CellML Metadata "http://www.cellml.org/metadata/1.0#" cmeta
MathML "http://www.w3.org/1998/Math/MathML" mathml
RDF "http://www.w3.org/1999/02/22-rdf-syntax-ns#" rdf

Table 1 The names, URIs and recommended prefixes of the namespaces referenced in this specification. See text for more details.


2.2.3  Extending CellML documents

Any namespace with a URI not defined in Table 1 is an extension namespace. Any element in an extension namespace is an extension element. Any attribute in an extension namespace is an extension attribute. Model authors and CellML processing software may store information not covered by the CellML specification in a CellML document by defining their own extension elements and extension attributes. When authors and implementors define extension namespaces, it is recommended that they use URIs under their jurisdiction. Extension elements and extension attributes may appear anywhere in a CellML document as long as the result is well-formed XML.

For interoperability CellML processing software should respect the extension elements and attributes of other applications. If a model is created in application A, which adds its own extension elements, and is subsequently edited in application B, then application B should attempt to include application A's extension elements in its output, even if these extension elements are now invalid. Applications will need to validate their own extension data if a CellML document is read in from a non-trusted location.

The namespace extension mechanism provides a convenient way to associate a small amount of application-specific information with a model defined in CellML. However, it is recommended that applications needing to store large amounts of information, such as rendering or simulation information, do so in a separate document. This will make CellML documents easier to exchange, and will prevent the loss of application-specific information if the model is passed through applications unaware of the extensions.

2.3  Examples

Figure 1 contains some example CellML elements, each of which defines a name attribute. The values of the name attribute on the first three elements are valid CellML identifiers. The values of the name attribute on the last two elements are invalid identifiers.


<!--
  The following elements have name attributes with valid values.
-->


<component name="my_favorite_component" />

<variable name="_ca2_conc" units="millimolar" />

<model name="model1345" />

<!--
  The following elements have name attributes with invalid values.
  Names may not consist purely of underscores or contain colons.
-->


<component name="___" />

<component name="my_model:my_component" />

Figure 1 XML elements defining name attributes. Valid and invalid CellML identifiers are shown, as noted in the comments.


Figure 2 contains portions of a typical CellML document that demonstrate the recommended use of namespaces. The root element sets the default namespace to the CellML namespace URI and explicitly maps the CellML namespace URI to the cellml prefix. The <math> element that encloses a set of equations inside a component element resets the default namespace to the MathML namespace. The units attribute on the <cn> element (which is in the MathML namespace) is placed in the CellML namespace by using the previously-defined cellml prefix.


<model
    
name="simple_electrophysiological_model"
    
xmlns="http://www.cellml.org/cellml/1.0#"
    
xmlns:cellml="http://www.cellml.org/cellml/1.0#">
  
  ...

  
<component name="extra_cellular_space">
    ...
    
<math xmlns="http://www.w3.org/1998/Math/MathML">
      
<apply><eq />
        
<apply><diff />
          
<bvar><ci> time </ci></bvar>
          
<ci> Na </ci>
        
</apply>
        
<apply><times />
          
<cn cellml:units="dimensionless"> -1.0 </cn>
          
<ci> I_Na </ci>
        
</apply>
       
</apply>
       ...
    
</math>  
  
</component>

  ...

</model>

Figure 2 A CellML fragment demonstrating the recommended use of namespaces in a CellML document. This fragment is taken from the simple electrophysiological model example on the CellML website.


Figure 3 demonstrates how software can embed its own information inside a valid CellML document using XML namespaces. The <model> element sets the default namespace to the CellML namespace, and maps the app prefix to an extension namespace (i.e., one not defined in Table 1). The app prefix is then used to define an <app:component_rendering_information> element and two attributes on a <component> element.


<model
    
name="cellml_model_with_extensions"
    
xmlns="http://www.cellml.org/cellml/1.0#"
    
xmlns:app="http://www.physiome.org.nz/cellml_processor"
    
xmlns:cellml="http://www.cellml.org/cellml/1.0#">

  
<app:component_rendering_information>
    cell : blue
    membrane : yellow
    channel : red
  
</app:component_rendering_information>

  
<component
      
name="cell"
      
app:component_type="cell"
      
app:render_corners="100, 100, 400, 400" />

</model>

Figure 3 A CellML document demonstrating the use of XML namespaces to embed application specific data inside a CellML document. The extension namespace URI was invented for demonstration purposes only.


2.4  Rules for CellML Documents

2.4.1  Valid CellML identifiers

  • A valid CellML identifier must consist of only letters, digits and underscores, and must contain at least one letter or digit. This can be written using Extended Backus-Naur Form (EBNF) notation as follows:
    letter     ::= 'a'...'z','A'...'Z'
    digit ::= '0'...'9'
    identifier ::= ('_')* ( letter | digit ) ( letter | '_' | digit )*
    [ The variant of EBNF used above is defined in Section 6 of the XML 1.0 Recommendation. ]

2.4.2  Proper use of the CellML namespace

  • A document must not contain elements or attributes in the CellML namespace that are not defined in this specification. [ Documents containing unknown elements or attributes in the CellML namespace are not valid CellML documents. Rules regarding the use of elements in the other namespaces defined in Table 1 are given in the appropriate sections. Note that attributes without an explicit prefix declaration are assumed to be in the same namespace as their parent element. ]

2.4.3  Extension namespaces

  • Although not explicitly stated throughout this specification, a document author may add extension elements and extension attributes to any CellML element in a CellML document without affecting the validity of the document. [ Note that attributes without an explicit prefix declaration are assumed to be in the same namespace as their parent element. ]
  • For interoperability, elements in the CellML namespace should not be defined inside extension elements. [ Specifically, applications should not define important model structure, mathematics or metadata information within extension elements, which other applications are free to ignore. ]
  • For interoperability, attributes in the CellML namespace should not be defined on extension elements.

2.4.4  Text nodes within CellML elements

  • Any characters that occur immediately within elements in the CellML namespace must be either space (#x20) characters, carriage returns (#xA), line feeds (#xD), or tabs (#x9). [ All of the elements in the CellML 1.0 namespace contain no text content. The characters listed above correspond to the definition of whitespace given in Section 2.3 of the XML Recommendation. Text content may still be included in extension elements inside CellML elements. ]

2.5  Rules for Processor Behaviour

2.5.1  Treatment of CellML identifiers

  • CellML processing software must handle identifiers in a case-sensitive manner. [ Two CellML elements of the same type may be defined with identifiers of A and a. Processing software is expected to match the identifiers in a case-sensitive manner when those elements are referenced at other places in the document. ]

2.5.2  Treatment of attribute namespaces

  • CellML processing software must treat attributes without an explicit namespace declaration as if they were in the same namespace as their parent element.

2.5.3  Treatment of extension namespaces

  • CellML processing software may ignore extension elements and extension attributes. [ If the namespace is unrecognised, then software should probably alert the user to its presence. Polite software should attempt to store non-CellML data so that it can write it out again when it exports the document. Software should validate its own non-CellML data carefully when reading documents from a non-trusted location. ]
  • CellML processing software may ignore the attributes and content of extension elements.

3  Model Structure

3.1  Introduction

Any model can be described as a network of connections between self-contained components. A component is a functional unit that may correspond to a physical compartment, event, or species or may be just a convenient modelling abstraction. A component contains variables and mathematical relationships that manipulate those variables. Connections exchange information between components. A connection contains mappings between variables in two components, allowing the value of a variable in one component to be passed to a variable in the other component.

3.2  Basic Structure

3.2.1  Definition of a model

A model is declared in CellML with a <model> element. This is the usual root element for a CellML document. The recommended best practice for specifying namespaces in a CellML document is described in Section 2.2.3.

The <model> element has a name attribute that allows this model to be unambiguously referenced by other models. For instance, this would be necessary if the model were to be combined with other models or partial models to create a larger model.

A <model> element may contain any number of the elements in the following list in any order. The recommended best practice is for elements placed within the <model> element to appear in the order given in the following list. This allows people to quickly find certain kinds of information within a CellML document.

  • <units> — A modeller can declare a set of units to use in the model, as described in Section 5.
  • <component> — Components are the smallest functional units in a model. Each component may contain variables that represent the key properties of the component and/or mathematics that describe the behaviour of the portion of the system represented by that component.
  • <group> — Groups allow the modeller to define logical and physical relationships between components. Groups are defined using the <group> element, as discussed in Section 6.
  • <connection> — Connections are used to connect components to each other and to map variables in one component to variables in another. Connections are defined using the <connection> element, as discussed in Section 3.2.4.

The <model> element (and indeed any of the elements in a CellML document) may define metadata to provide context for that object. This metadata might include documentation, citations from literature, or a modification history for the current CellML object. Adding metadata to a CellML document is discussed in detail in Section 8.

3.2.2  Definition of components

Constructing a model from multiple components encourages the re-use of components. For instance, an electrophysiological model of a cell might be organised into components that represent various ion channels. All of the mathematics that describe the behaviour of the L-type calcium channel would be defined in a single component representing this particular ion channel. If a modeller wished to re-use the portion of the model representing the L-type calcium channel in another model, he or she would only need to copy this component.

A <component> element is used to declare a CellML component. It must only be used inside a <model> element or as the root element of a CellML document. A <component> element that is the root of a CellML document does not define a complete model. It would probably be part of a library of standard components that could be imported and used in models. Eventually, CellML will include a mechanism that simplifies such re-use of components. At the present time, the component would need to be physically copied into a model document to be used in that model.

A CellML <model> may contain any number of <component> elements. Each <component> must have a name attribute, the value of which is a unique identifier for the component within the current model. The value of the name attribute is used to reference the component in other parts of the model, such as in connections and groups.

A <component> may contain any of the elements in the following list in any order. Again, recommended best practice is for elements placed within the <component> element to appear in the order given in the following list.

  • <units> — A modeller can declare a set of units to use within the component, as described in Section 5.
  • <variable> — A component may contain any number of <variable> elements, which define variables that may be mathematically related in the equation blocks contained in the component. Variables are discussed in Section 3.2.3.
  • <reaction> — A component may contain <reaction> elements, which are used to provide chemical and biochemical context for the equations describing a reaction. It is recommended that only one <reaction> element appear in any <component> element. The definition of reaction information is described in Section 7.
  • <mathml:math> — A component may contain a set of mathematical relationships between the variables declared in this component. These equations are marked up using MathML, as discussed in Section 4. The mathml prefix is used to indicate that the <math> element is in the MathML namespace.

A <component> element is also a sensible place to define metadata, using the syntax presented in Section 8.

The definitions of two <component> elements are included in the example described in Section 3.3.

3.2.3  Definition of variables

Models are usually developed to investigate the behaviour of a number of variables that have biological significance. Each variable in the model belongs to a single component, which may contain equations that modify the value of that variable. The value of a variable may be passed through connections into other components. The variable must also be declared in these components, which can then use the value of the variable in their own equations but must not modify it.

The <variable> element is used to declare a CellML variable. It can only be used inside a <component> element. Variables must define a name attribute, the value of which must be unique across all variables in the current component. The name of a variable is used when referencing variables inside connections (see Section 3.2.4) and reactions (see Section 7). All variables must also define a units attribute. The value of this attribute must correspond to one of the keywords in the CellML units dictionary or the value of the name attribute of a units element defined within the current component or model, as described in Section 5.

A <variable> element may also have the following attributes:

  • initial_value — This attribute provides a convenient means for specifying the value of a scalar real variable when all independent variables in the model have a value of 0.0. Independent variables are those with respect to which another variable is differentiated or integrated.
  • public_interface — This attribute specifies the interface exposed to components in the parent and sibling sets (see below). The public interface must have a value "in", "out", or "none". The absence of a public_interface attribute implies a value of "none".
  • private_interface — This attribute specifies the interface exposed to components in the encapsulated set (see below). The private interface must have a value "in", "out", or "none". The absence of a private_interface attribute implies a value of "none".

The name of the initial_value attribute derives from the fact that, in a model with only one independent variable, this would generally correspond to time, and so the value of the initial_value attribute sets the starting condition for a simulation which progressed from time equals 0.0. The initial values of variables need not be set in the model definition at all. When multiple simulations are to be run using the same model, initial and boundary conditions are most conveniently set in an external simulation configuration file loaded separately by CellML processing software.

Whether or not a component may obtain the value of a variable in another component depends on the public_interface and private_interface attributes on the variable declaration, and the place of the two components in the encapsulation hierarchy. Encapsulation allows the modeller to hide a complex network of components from the rest of the model and provide a single component as an interface to the hidden network. Encapsulation effectively divides the network into layers, where connections between the layers must only be made through the interface components.

The components to which any given component may connect can be divided into four distinct sets with respect to any given component (the current component). The set of all components immediately encapsulated by the current component is referred to as the encapsulated set. If the current component is encapsulated, then the encapsulating component is referred to as the parent, and the set of all other components encapsulated by the same parent is referred to as the sibling set. If the current component is not encapsulated, then it has no parent and the sibling set consists of all other components in the model that are not encapsulated. All other components, which are not available to make connections with the current component, make up the hidden set. The encapsulation hierarchy and its effects on variable mapping are described in Section 6.

When a variable is declared with either a public_interface or private_interface attribute value of "in", then the value of that variable must be imported from another component. Otherwise, a variable's value must be set and modified in the current component. The variable is then said to belong to or be owned by the current component.

Eventually, it will be possible to specify the temporal and/or spatial variation of a variable's value using FieldML. The capability to include FieldML is still under development. At the present time, all variables must have scalar real values.

3.2.4  Definition of connections

Connections provide the mechanism for mapping variables declared within one component to variables in another component, allowing information to be exchanged between the various components in the network. The mapping of variables involves the transfer of a variable's value from one component to another, a process which may involve a conversion to ensure the units match. (More information on units conversion can be found in Section 5.)

The complete set of variable mappings between any two components constitutes a connection. Only one connection may be created between any given pair of components in a model. Each connection references the two components involved in the connection, and then maps variables from each of the components together. The interface attributes of each pair of variables must be compatible — an "out" variable in one component's interface must map to an "in" variable in the other component's interface. The direction of each mapping is determined by the value of the public_interface and private_interface attributes on the two variables: the value is always passed from the variable with an interface value of "out" to the variable with an interface value of "in". The value of a variable declared with an interface value of "out" may be passed out to any number of variables in other components declared with interface values of "in". The component to which a variable belongs is found by tracing the variable back from "in" to "out" interfaces, following the model's connections.

The <connection> element is used to declare a CellML connection. It can only appear inside a <model> element.

A <connection> element must contain exactly one <map_components> element, which is used to reference the two components involved in the connection. Each <map_components> element must define component_1 and component_2 attributes, the values of which are the names of the components being referenced. In CellML 1.0, the referenced components must be defined within the current <model> element. It is anticipated that it will eventually be possible to reference components from other models, allowing models to be connected into larger models.

A <connection> element must also contain one or more <map_variables> elements, which are used to reference the variables being mapped between the two components in the connection. Each <map_variables> element must define variable_1 and variable_2 attributes, the values of which are equal to the names of variables defined in the components referenced by the component_1 and component_2 attributes on the <map_components> element, respectively. It is not necessary for the variables that are to be mapped to each other to have the same name, although this will typically be the case.

The CellML example discussed in Section 3.3 demonstrates the definition of a <connection> element.

3.3  Examples

Figure 4 contains a portion of the CellML description of the Hodgkin-Huxley squid axon model published in 1952. The excerpt contains the definitions of the components corresponding to the membrane and the sodium channel, and the connection between the two components. Most of the complexity from the full model definition has been left out for conciseness and clarity. This example is only used to demonstrate the standard use of the <component>, <variable>, and <connection> elements.


<model
    
name="hodgkin_huxley_model_excerpt"
    
xmlns="http://www.cellml.org/cellml/1.0#"
    
xmlns:cellml="http://www.cellml.org/cellml/1.0#"
    
xmlns:cmeta="http://www.cellml.org/metadata/1.0#">

  
<!--
    Units definitions which could be referenced from the <variable> elements
    would typically be inserted here. Units are discussed in Section 5.
  -->


  
<component name="membrane">
    
<!-- the following variable is used in other components -->
    
<variable
        
name="V" initial_value="-75.0"
        
public_interface="out" units="millivolt" />

    
<!-- the following variables are imported from other components -->
    
<variable name="time" public_interface="in" units="millisecond" />
    
<variable name="i_Na" public_interface="in" units="microA_per_cm2" />
    
<variable name="i_K" public_interface="in" units="microA_per_cm2" />
    
<variable name="i_L" public_interface="in" units="microA_per_cm2" />

    
<!-- the following variable is only used internally -->
    
<variable name="C" initial_value="1.0" units="microF_per_cm2" />

    
<math xmlns="http://www.w3.org/1998/Math/MathML">
      
<apply id="membrane_voltage_diff_eq"><eq />
        
<apply><diff />
          
<bvar><ci> time </ci></bvar>
          
<ci> V </ci>
        
</apply>
        
<apply><divide />
          
<apply><minus />
            
<ci> i_Na </ci>
            
<ci> i_K </ci>
            
<ci> i_L </ci>
          
</apply>
          
<ci> C </ci>
        
</apply>
      
</apply>
    
</math>
  
</component>

  
<component name="sodium_channel">
    
<!-- the following variables are used in other components -->
    
<variable name="i_Na" public_interface="out" units="microA_per_cm2" />

    
<!-- the following variables are imported from other components -->
    
<variable name="time" public_interface="in" units="millisecond" />
    
<variable name="V" public_interface="in" units="millivolt" />

    
<math xmlns="http://www.w3.org/1998/Math/MathML">
      
<apply id="i_Na_calculation"><eq />
        
<ci> i_Na </ci>
        ...  
<!-- a function of V & time -->
      
</apply>
    
</math>
  
</component>

  
<connection>
    
<map_components component_1="membrane" component_2="sodium_channel" />
    
<map_variables variable_1="V" variable_2="V" />
    
<map_variables variable_1="i_Na" variable_2="i_Na" />
  
</connection>

</model>

Figure 4 A small portion of the CellML description of the Hodgkin-Huxley squid axon model from 1952. This excerpt contains the definition of the components corresponding to the membrane and the sodium channel, and the connection between them. Much detail has been omitted, but this example clearly demonstrates the relationship between the <component>, <variable> and <connection> elements.


The membrane component declares six variables, which are divided into three categories. The first variable is called V, and it represents the membrane voltage in the model. It has a public_interface attribute value of "out", which indicates that the variable "belongs" to this component and that its value may be obtained by other components in the model via connections. It references a units definition by the name of millivolt (this definition is not included here), and is given an initial value of -75.0 millivolts.

The subsequent four variables are time, i_Na (sodium current), i_K (potassium current) and i_L (leakage current). They are all declared with a public_interface attribute value of "in", which indicates that their values are obtained from other components via connections.

Finally, a variable C (capacitance) is declared. This <variable> element defines neither a public_interface or a private_interface attribute. Both of these attributes therefore assume the default value of "none", which means that the variable belongs to the current component and is not visible to other components in the model.

After the variable declarations, a <math> element in the MathML namespace is used to define an equation relating V to the other variables. Only the values of the variables belonging to a component may be mathematically modified in that component. The equation included in Figure 4 is the well known differential equation from the Hodgkin Huxley model:

Equation: dV_dt (1)

The sodium_channel component declares three variables, all of which represent quantities that were also declared in the membrane component. The i_Na variable declared in this component has a public_interface attribute value of "out", indicating that the sodium current belongs to this component. The value of the sodium current is calculated in this component, although the actual math has been omitted.

Finally, a <connection> element references the membrane and sodium_channel components using a <map_components> element, and maps the V and i_Na variables in each component together, using two <map_variables> elements. The value of the variable_1 attribute on each <map_variables> element references the corresponding variable in the membrane component, since this is the component referenced by the component_1 attribute on the <map_components> element. Similarly, the values of the variable_2 attributes reference variables in the sodium_channel component.

3.4  Rules for CellML Documents

The following are the rules for using the <model>, <component>, <variable>, <connection>, <map_components>, and <map_variables> elements.

3.4.1  The <model> element

  1. Allowed use of the <model> element
    • A <model> element must contain only the following elements, which may appear in any order:
      • <units>, <component>, <group>, and <connection> elements in the CellML namespace,
      • <RDF> elements in the RDF namespace.
      [ The recommended best practice is to define the child elements in the CellML namespace in the order stated above. ]
    • Each <model> element must define a name attribute.
  2. Allowed values of the name attribute
    • The value of the name attribute must be a valid CellML identifier as discussed in Section 2.2.1.

3.4.2  The <component> element

  1. Allowed use of the <component> element
    • A <component> element must contain only the following elements, which may appear in any order:
      • <units>, <variable> and <reaction> elements in the CellML namespace,
      • <math> elements in the MathML namespace,
      • <RDF> elements in the RDF namespace.
      [ The recommended best practice is to define the child elements in the CellML and MathML namespaces in the order stated above. Note that a <component> element must not appear inside another <component> element. Such nesting could be intended to indicate a logical encapsulation relationship, a geometric containment relationship, or some other relationship between the two components. There is no reason to assume that the nesting hierarchy produced for one type of relationship would be consistent with the hierarchy produced for other types of relationship. Therefore, CellML defines these relationships using the <group> element, rather than nesting of <component> elements. ]
    • Each <component> element must define a name attribute.
  2. Allowed values of the name attribute
    • The value of the name attribute must be a valid CellML identifier as discussed in Section 2.2.1.
    • The value of the name attribute must be unique across all <component> elements contained in the parent <model> element.

3.4.3  The <variable> element

  1. Allowed use of the <variable> element
    • A <variable> element must contain only the following elements:
      • <RDF> elements in the RDF namespace.
    • Each <variable> element must define a name attribute and a units attribute. It may also define public_interface, private_interface, and initial_value attributes.
  2. Allowed values of the name attribute
    • The value of the name attribute must be a valid CellML identifier as discussed in Section 2.2.1.
    • The value of the name attribute of a <variable> element must be unique across all <variable> elements contained in the same <component> element. [ Two variables in the same component must not have the same name. However, two variables in different components may have the same name, and a variable may have the same name as its parent component. ]
  3. Allowed values of the units attribute
    • The value of the units attribute must either be one of the keywords defined in the standard dictionary or the value of the name attribute on a <units> element defined in the current component or model. [ The dictionary and the units element are described in Section 5. ]
  4. Allowed values of the public_interface attribute
    • If present, the value of the public_interface attribute must be "in", "out", or "none".
    • If not present, its value defaults to "none".
  5. Allowed values of the private_interface attribute
    • If present, the value of the private_interface attribute must be "in", "out", or "none".
    • If not present, its value defaults to "none".
  6. Proper use of the public_interface and private_interface attributes
    • A <variable> element must not define both public_interface and private_interface attributes with values equal to "in". [ A variable's value must only be obtained via one mapping. ]
  7. Allowed values of the initial_value attribute
    • If present, the value of the initial_value attribute must be a real number.
    • The absence of an initial_value attribute implies nothing. [ The absence of this attribute would usually mean either that the variable does not need an initial value or that this value will be supplied in a parameter file or by the user at the time simulations using the model are run. ]
  8. Proper use of the initial_value attribute
    • An initial_value attribute must not be defined on a <variable> element with a public_interface or private_interface attribute with a value of "in". [ These variables receive their value from variables belonging to another component. ]

3.4.4  The <connection> element

  1. Allowed use of the <connection> element
    • A <connection> element must contain only the following elements, which may appear in any order:
      • <map_components> and <map_variables> elements in the CellML namespace,
      • <RDF> elements in the RDF namespace.
    • Each <connection> element must contain exactly one <map_components> element.
    • Each <connection> element must contain at least one <map_variables> element. [ It does not make sense to define a connection that does not map variables together. This rule prevents software from using empty connections to imply information not defined in this specification. ]

3.4.5  The <map_components> element

  1. Allowed use of the <map_components> element
    • A <map_components> element must contain only the following elements:
      • <RDF> elements in the RDF namespace.
    • Each <map_components> element must define a component_1 attribute and a component_2 attribute.
  2. Allowed values of the component_1 attribute
    • The value of the component_1 attribute must equal the value of the name attribute of a <component> element contained within the current <model> element.
  3. Allowed values of the component_2 attribute
    • The value of the component_2 attribute must equal the value of the name attribute of a <component> element contained within the current <model> element.
  4. Proper use of the component_1 and component_2 attributes
    • The component_1 and component_2 attributes on a single <map_components> element must not have the same value. [ A connection must link two different components. ]
    • Each <map_components> element contained within <connection> elements that are contained within a given <model> element must define a unique pair of component_1 and component_2 attribute values. [ There can only be one connection between any two components in a network. This prevents setting up inconsistent, circular, or duplicate variable mappings between any two components in the network. However, it does not prevent a model author from creating inconsistent mathematical relationships between the variables. ]

3.4.6  The <map_variables> element

  1. Allowed use of the <map_variables> element
    • A <map_variables> element must contain only the following elements:
      • <RDF> elements in the RDF namespace.
    • Each <map_variables> element must define a variable_1 attribute and a variable_2 attribute.
  2. Allowed values of the variable_1 attribute
    • The value of the variable_1 attribute must equal the value of the name attribute of a <variable> element contained in the <component> element referenced by the component_1 attribute on the <map_components> element within the current <connection> element.
  3. Allowed values of the variable_2 attribute
    • The value of the variable_2 attribute must equal the value of the name attribute of a <variable> element contained in the <component> element referenced by the component_2 attribute on the <map_components> element within the current <connection> element.
  4. Proper use of the <map_variables> element to map variables to each other

    [ The rules for mapping a variable to other variables depend on the encapsulation hierarchy of the component that owns the variable. This hierarchy divides the rest of the components in the model into parent, sibling, encapsulated, and hidden sets, as described in Section 3.2.3. The public_interface attribute defines the availability of a variable to the parent component and components in the sibling set. The private_interface attribute defines the availability of a variable to components in the encapsulated set. Variables are not available to components in the hidden set. ]

    • Variables with a public_interface or private_interface attribute value of "in" must be mapped to variables with a public_interface or private_interface attribute value of "out".
    • A variable with either a private_interface or public_interface attribute value of "in" must be mapped to no more than one other variable in the model. [ Note that a similar restriction does not apply to variables with interface values of "out". An output variable can be mapped to multiple input variables in various components in the current model. ]
    • A variable with a public_interface attribute value of "in" must be mapped to a single variable owned by a component in the sibling set, provided the target variable has a public_interface attribute value of "out", or to a single variable owned by the parent component, provided the target variable has a private_interface attribute value of "out".
    • A variable with a public_interface attribute value of "out" may be mapped to variables owned by components in the sibling set, provided the target variables have public_interface attribute values of "in". It may also be mapped to variables owned by the parent component, provided the target variables have private_interface attribute values of "in".
    • A variable with a private_interface attribute value of "in" may be mapped to a single variable owned by a component in the encapsulated set, provided the target variable has a public_interface attribute value of "out".
    • A variable with a private_interface attribute value of "out" may be mapped to variables owned by components in the encapsulated set, provided the target variables have public_interface attribute values of "in".

3.5  Rules for Processor Behaviour

3.5.1  Mapping of variables

For interoperability, CellML processing software should take into account the units definitions referenced by any two variables that are mapped together. If the units references are not equivalent, as defined in Appendix C.2.1, then a conversion may be required. An algorithm for performing this conversion is proposed in Appendix C.3.5.

4  Mathematics

4.1  Introduction

CellML allows modellers to unambiguously specify the underlying mathematics of a cellular model. Model components may contain mathematical expressions that manipulate the values of variables that belong to them. These expressions are also free to use (but must not modify) the values of any other variable declared in those components.

Mathematical expressions are embedded in CellML documents using Mathematical Markup Language 2.0 (MathML), an XML-based language that encodes the underlying structure of a mathematical expression. CellML uses a subset of the elements from MathML 2.0, known as the content markup element set, which includes several deprecated elements from MathML 1.0.

CellML 1.0 does not require processing software to implement support for scripting. If software chooses to do so, some recommendations on the use of scripting are given in Appendix B.

4.2  Basic Structure

4.2.1  Definition of mathematics

All mathematical expressions defined using MathML must be placed inside a <mathml:math> element. <mathml:math> elements must only be defined in <cellml:component> or <cellml:role> elements. The mathml and cellml namespace prefixes are used throughout this section to indicate that elements are in the MathML and CellML namespaces, respectively. The <cellml:role>, <cellml:variable_ref> and <cellml:reaction> elements mentioned in this section are described in detail in Section 7 of this specification.

<mathml:math> elements that occur as child elements of <cellml:component> elements can be used to define arbitrary expressions relating the variables declared in that component. A mathematical expression may make use of any variable declared within the current component by placing the variable's name within a <mathml:ci> element. Expressions must only modify the values of variables that belong to that component. Variables that belong to a component are those that are not declared with a public_interface or private_interface attribute value of "in".

<mathml:math> elements that occur as child elements of <cellml:role> elements (these are defined within <cellml:variable_ref> elements, which are in turn defined within <cellml:reaction> elements) can be used to define expressions that modify the values of specific variables in specific ways. These expressions may make use of any variable declared in the current component but must only modify the value of the variable referenced by the ancestor <cellml:variable_ref> element, subject to further limitations that are described in Section 7.2.

CellML processing software must interpret MathML elements according to the semantics defined in the MathML 2.0 Recommendation. However, CellML 1.0 does define some restrictions on, and additions to, the MathML syntax. These are covered in the subsequent sections.

4.2.2  MathML's presentation and content markup elements

The complete set of elements defined in the MathML 2.0 Recommendation is split into two principal sub-vocabularies: the presentation markup and content markup elements. The presentation markup elements describe the visual rendering of mathematical expressions and objects. The content markup elements specify the underlying meaning of a mathematical expression or object, without regard to its presentation.

CellML is used to describe the structure and mathematics of cellular models. For this reason, valid CellML documents must only contain content markup elements within a <mathml:math> element. There is one exception: model authors may associate rendering information with a particular expression by placing MathML presentation markup elements inside a <mathml:annotation-xml> element. CellML processing software may ignore the contents of <mathml:annotation> and <mathml:annotation-xml> elements.

An example demonstrating the embedding of MathML content and presentation markup elements in a CellML document is presented in Section 4.3.

4.2.3  The CellML subset of MathML content elements

Valid CellML documents may contain any MathML content markup elements within a <mathml:math> element, as long as the arrangement of these elements follows the rules defined in the MathML 2.0 Recommendation. However, it is anticipated that it will be some time before software is able to interpret all of these elements. To encourage interoperability, this section defines a subset of the MathML content markup elements known as the CellML subset. CellML documents that only contain content markup elements from the CellML subset are known as valid CellML subset documents. CellML processing software may only call itself CellML conformant if it is able to correctly interpret all of the MathML elements in the CellML subset according to the semantics defined in the MathML 2.0 Recommendation.

The complete list of MathML elements in the CellML subset is given in Figure 5. Many of the elements in the CellML subset are included to provide facilities for the definition of algebraic and ordinary differential equations. Others (such as the trigonometric operators) have been included because they are reasonably straightforward to translate to computer code.



token elements

<cn>, <ci>

basic content elements

<apply>, <piecewise>, <piece>, <otherwise>

relational operators

<eq>, <neq>, <gt>, <lt>, <geq>, <leq>

arithmetic operators

<plus>, <minus>, <times>, <divide>, <power>, <root>, <abs>, <exp>, <ln>, <log>, <floor>, <ceiling>, <factorial>

logical operators

<and>, <or>, <xor>, <not>

calculus elements

<diff>

qualifier elements

<degree>, <bvar>, <logbase>

trigonometric operators

<sin>, <cos>, <tan>, <sec>, <csc>, <cot>, <sinh>, <cosh>, <tanh>, <sech>, <csch>, <coth>, <arcsin>, <arccos>, <arctan>, <arccosh>, <arccot>, <arccoth>, <arccsc>, <arccsch>, <arcsec>, <arcsech>, <arcsinh>, <arctanh>

constants

<true>, <false>, <notanumber>, <pi>, <infinity>, <exponentiale>

semantics and annotation elements

<semantics>, <annotation>, <annotation-xml>

Figure 5 The CellML subset of MathML content markup elements, grouped according to function. All elements in this figure are in the MathML namespace.


4.2.4  Ordering of expressions

The mathematics in a model defined using CellML 1.0 consist of a static system of expressions, which are distributed over a network of components. CellML does not define the order of evaluation of equations, as this is simulation information rather than model information.

4.2.5  Scope of expressions

Within a CellML model, all expressions are assumed to have unlimited scope with respect to the independent variables unless explicitly stated using MathML's <piecewise> construct or some other form of conditional expression. This means that if the initial conditions for a variable, the value of which is determined by a differential equation, are to be specified using an equality, the two equations should have their scope limited so that they do not contradict each other.

4.2.6  Associating units with numbers

To ensure that models are robust and portable, all variables and numbers that occur in mathematical expressions within a CellML document must have units associated with them. CellML's units framework is introduced in Section 5 and the association of units with variables is presented in Section 3.2.3. The association of units with numbers in equations requires an extension to MathML. This can be done in a manner consistent with the association of units with variables and with application-specific extensions to CellML by adding a units attribute in the CellML namespace to the <mathml:cn> element, which encloses all numbers. The example presented in Section 4.3 demonstrates this.

4.2.7  Definition of scripts

CellML 1.0 does not define a standard method by which model authors can embed scripts in CellML documents in a portable way. It is anticipated that this functionality will be defined in a subsequent version of CellML. However, the use of scripts in CellML is strongly discouraged. CellML is aimed at specifying a model in terms of its most basic governing equations. Wherever possible, mathematical equations should be used to specify the changing behaviour of a model's state variables.

If implementors do decide to add scripting functionality to CellML documents, these scripts must be defined within elements placed in an application-specific extension namespace. Implementors are advised to follow the recommendations on the best practices for embedding and executing scripts described in Appendix B. The key recommendations are summarised in the following list:

  • For interoperability, scripts should be defined using ECMAScript.
  • The <mathml:csymbol> element should be used from within MathML markup to call scripts defined using a non-MathML syntax. These elements must define a definitionURL that identifies the element containing the script, and an encoding attribute specifying the scripting language used.
  • Function names (or the identifier used to reference a script) should be valid CellML identifiers, as defined in Section 2.2.1.
  • The content of a <mathml:csymbol> element should be a human-readable identifier for the script, preferably the function name.
  • Functions must be side-effect free. That is, a function must not assign values to variables that are not local to that function. In particular, functions must not alter the values of their arguments or global variables.

4.3  Examples

The CellML fragment in Figure 6 demonstrates how MathML can be employed within CellML to define mathematical expressions. This fragment is part of the definition of a component that represents the behaviour of the n gate from the potassium channel in the Hodgkin-Huxley squid axon model of 1952. The component contains two units definitions (with syntax defined in Section 5), two variable declarations (with syntax defined in Section 3), and a block of MathML that defines an expression calculating the alpha variable of the n gate as well as the rendering of this equation, which is given in Equation 2.

Equation: alpha_n_in_hh (2)

<component
    
name="potassium_channel_n_gate"
    
xmlns="http://www.cellml.org/cellml/1.0#"
    
xmlns:cellml="http://www.cellml.org/cellml/1.0#"
    
xmlns:cmeta="http://www.cellml.org/metadata/1.0#">

  
<units name="per_millisecond">
    
<unit prefix="milli" units="second" exponent="-1" />
  
</units>
  
<units name="millivolt">
    
<unit prefix="milli" units="volt" />
  
</units>
  
<units name="per_millivolt">
    
<unit prefix="milli" units="volt" exponent="-1" />
  
</units>

  
<variable name="alpha_n" units="per_millisecond" />
  
<variable name="V" public_interface="in" units="millivolt" />

  
<math xmlns="http://www.w3.org/1998/Math/MathML">
    
<semantics>
      
<apply id="alpha_n_calculation"><eq />