CellML Umbrella Specification 1.0

Authors:
        Andrew Miller (Bioengineering Institute, The University of Auckland)

Abstract

This document specifies the CellML Umbrella format, a top-level XML-based meta-language from which languages for describing and exchanging mathematical models can be derived. CellML Umbrella does not directly provide the information needed to represent mathematical models. Instead, it provides a top-level XML element, and some general guidelines, by which mathematical models can be represented in XML. More specific formats can then be developed so that conformance with the more specific format implies conformance with the CellML Umbrella format. In order to comply with the CellML Umbrella format, a format must have been registered as specified in this standard. CellML Umbrella processing software may use the general provisions of this specification to process any such more specific format, and may determine which specific format is used.

The specification has been designed so that all CellML 1.0 and CellML 1.1 documents also comply with this specification, and it is likely that all future versions of CellML will also comply with this document.

Status of this document

This document was endorsed by the CellML team on the 20th of April, 2006.

Introduction

The CellML Umbrella format is an XML-based meta-language from which languages for describing and exchanging mathematical models can be derived. CellML Umbrella does not directly provide the information needed to represent mathematical models. Instead, it provides a top-level XML element, and some general guidelines, by which mathematical models can be represented in XML. More specific formats can then be developed so that conformance with the more specific format implies conformance with the CellML Umbrella format. In order to comply with the CellML Umbrella format, a format must have been registered as specified in this standard. CellML Umbrella processing software may use the general provisions of this specification to process any such more specific format, and may determine which specific format is used.

The specification has been designed so that all CellML 1.0 and CellML 1.1 documents also comply with this specification, and it is likely that all future versions of CellML will also comply with this document.

The CellML Umbrella Format is defined in terms of a meta-language called eXtensible Markup Language (XML). XML is a standard published by the World Wide Web Consortium, the organisation responsible for defining many Internet-related standards, including HTML. XML is essentially a means of adding structure to text documents, allowing machines to unambiguously associate text or binary data with a particular component in a document's data model.

XML is an appropriate medium for CellML Umbrella because it is both human and machine readable. A model author can create a CellML Umbrella document with a text editor or with CellML Umbrella authoring software. XML is a well-defined and widely used specification. Many free software utilities and libraries for the processing of XML already exist, simplifying the development of CellML Umbrella processing software. XML has also been designed to be usable over the Internet, making CellML Umbrella suitable for the interchange of models between software and databases at different locations.

A quick introduction to XML is available in the examples section of the CellML website.

Conformance rules

A note on conformance

This section defines conformance rules for documents and document formats. In order for a specific document format to conform with this specification, it MUST conform to the rules for document formats, and MUST be defined such that all documents conforming to the specific format also conform to this specification.

XML Conformance

1) Documents in the CellML Umbrella Format MUST be well-formed XML documents, according to either the XML 1.0 or XML 1.1 specification. Conforming document formats MAY mandate compliance with a particular choice from these two XML specifications.
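As a non-normative illustration of rule 1, the following sketch checks well-formedness with the parser from the Python standard library. The function name is_well_formed is hypothetical. The standard library parser targets XML 1.0, so this sketch only exercises XML 1.0 well-formedness; checking documents against XML 1.1 would need a different parser.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document):
    """Return True if `document` (str or bytes) parses as well-formed XML.

    Hypothetical helper: it reports only whether parsing succeeded and
    says nothing about conformance to any specific document format.
    """
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True
```

A document with mismatched tags, such as `<model><component></model>`, is rejected, while a lone `<model/>` element is accepted.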

Namespace Conformance

2) Documents in the CellML Umbrella Format MUST conform to the Namespaces in XML specification.
3) A conforming document format MUST NOT impose any additional constraints on the prefix associated with any namespace.

Basic Content Rules

4) A conforming document MUST be encoded in UTF-8.
5) A conforming document format MUST be defined such that XML comments and XML processing instructions shall have no effect on the interpretation of the document.
6) A conforming document format MUST be defined such that the insertion of additional whitespace (for the purposes of this specification, whitespace means a sequence of one or more of the characters referenced by the character references &#x9;, &#xA;, &#xD; or &#x20;) immediately after a start-tag, or immediately before an end-tag (after comments and processing instructions have been removed), shall have no effect on the interpretation of the document.
7) A conforming document format MUST be defined such that elements (and their contents) in any namespace other than:
  • http://www.w3.org/XML/1998/namespace , or,
  • http://www.w3.org/2000/xmlns/ , or,
  • http://www.w3.org/1999/02/22-rdf-syntax-ns# , or,
  • any additional namespaces specified by the document format
are available for use as extension elements, and the document format MUST require that processing software encountering extension elements in an unrecognised namespace silently ignore them.
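The extension-element behaviour in rule 7 can be sketched, non-normatively, as a tree walk that skips any subtree rooted in an unrecognised namespace. The helper names namespace_of and iter_recognised are hypothetical, and two choices here are assumptions rather than requirements of this specification: elements in no namespace are treated as unrecognised, and RDF elements are yielded whole without filtering their contents. The standard library parser discards comments and processing instructions by default, which is consistent with rule 5.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def namespace_of(tag):
    """Return the namespace of an ElementTree tag '{ns}local', or '' if none."""
    if isinstance(tag, str) and tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return ""


def iter_recognised(element, format_namespaces):
    """Yield `element` and its descendants, silently skipping extension subtrees.

    `format_namespaces` is the set of namespaces the document format defines.
    """
    ns = namespace_of(element.tag)
    if ns == RDF_NS:
        # Assumption: RDF/XML is passed on whole; its contents are not filtered.
        yield element
        return
    if ns not in format_namespaces:
        # Extension element: ignore it and everything inside it.
        return
    yield element
    for child in element:
        yield from iter_recognised(child, format_namespaces)
```

Because an entire subtree is skipped, an element in a recognised namespace that is nested inside an extension element is ignored along with its parent.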

    Document Element

    8) Every conforming document format MUST have associated with it a root namespace, and that root namespace MUST be unique to that particular document format.
    9) The document element in a document in the CellML Umbrella Format MUST have the local name "model", and MUST have a namespace corresponding to the root namespace of a conforming document format.

    Conformance to document format

    10) Every conforming document MUST conform to the document format identified by the namespace of the document element.
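Rules 8 to 10 mean that processing software can determine the specific format from the document element alone. The following non-normative sketch shows one way to do so. The function identify_format and the table KNOWN_FORMATS are hypothetical; the two namespace URIs are those used by the CellML 1.0 and CellML 1.1 specifications and are not stated in this document.

```python
import xml.etree.ElementTree as ET

# Illustrative stand-in for the registry: root namespace -> short name.
KNOWN_FORMATS = {
    "http://www.cellml.org/cellml/1.0#": "CellML 1.0",
    "http://www.cellml.org/cellml/1.1#": "CellML 1.1",
}


def identify_format(document, registry=KNOWN_FORMATS):
    """Return the short name of the format named by the document element."""
    root = ET.fromstring(document)
    if not root.tag.startswith("{"):
        raise ValueError("document element is not in any namespace")
    namespace, local_name = root.tag[1:].split("}", 1)
    if local_name != "model":
        raise ValueError("document element must have the local name 'model'")
    if namespace not in registry:
        raise ValueError("root namespace is not registered: " + namespace)
    return registry[namespace]
```

The check covers only the document element. Whether the rest of the document conforms to the identified format (rule 10) is a matter for that format's own specification.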

    Registration of document format namespaces

    11) Every conforming document format MUST appear on the CellML Umbrella Format Registry.

    Use of RDF metadata

    12) Every conforming document format MUST allow valid RDF/XML elements (with local name RDF and namespace http://www.w3.org/1999/02/22-rdf-syntax-ns#) to appear in the document. A conforming document format MAY restrict the elements in which an RDF element may appear, provided that RDF elements are permitted as children of the document element.
    13) A conforming document format MUST NOT allow RDF/XML elements which are not valid RDF/XML to appear in the document.
    14) A conforming document format MUST be defined so that the RDF graph produced as the union of all RDF triples in a document is valid, and so no context or other information is lost in this process.
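As a non-normative sketch of the first step a processor might take for rules 12 to 14, the following gathers every RDF element in a document, wherever it appears, so that the triples from all of them can be combined into a single graph. The function name collect_rdf_blocks is hypothetical. Parsing the collected RDF/XML into triples is left to an RDF library and is not shown.

```python
import xml.etree.ElementTree as ET

RDF_TAG = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}RDF"


def collect_rdf_blocks(root):
    """Return all RDF elements in document order, at any depth below `root`."""
    return list(root.iter(RDF_TAG))
```

Each returned element can be serialised with ET.tostring(block, encoding="unicode") and handed to an RDF/XML parser.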

    The CellML Umbrella Format Registry

    The CellML Umbrella Format Registry is a list of all document formats which conform to the CellML Umbrella Format. Each document format entry shall have a short name for the format, the root namespace for the document format, and a normative reference to the format.

    The registry shall be maintained and controlled by the CellML team, at the Bioengineering Institute, The University of Auckland. It shall be available for viewing at http://www.cellml.org/registries/formats

    The registry shall initially consist of entries for CellML 1.0 and CellML 1.1. New entries may be approved by the CellML team, after a two-week review period on the cellml-discussion@cellml.org list. In coming to a decision on whether to accept the registration, the CellML team shall consider any discussion on the cellml-discussion list during the review period, as well as any other information it sees fit. The CellML team may reject any registration for any reason it sees fit.

    Once an entry has been made into the registry, it shall never be removed nor altered, except to mark the document format as deprecated, or to update the location (but not the content) of the normative reference. Likewise, the normative reference shall not be altered, except to update the status of the document, or to correct errors which do not modify the syntax or semantics of the format.
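The shape of a registry entry and the limited changes permitted to it can be modelled, non-normatively, as an immutable record. Everything named here (RegistryEntry, register, mark_deprecated, relocate_reference, and the field names) is hypothetical and is not part of this specification; the registry itself is a published list, not a software interface.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RegistryEntry:
    short_name: str
    root_namespace: str
    normative_reference: str  # location of the defining document
    deprecated: bool = False


def register(registry, entry):
    """Add `entry` to `registry` (a dict keyed by root namespace).

    Each root namespace must be unique to one format, so a duplicate is refused.
    """
    if entry.root_namespace in registry:
        raise ValueError("root namespace already registered")
    registry[entry.root_namespace] = entry


def mark_deprecated(entry):
    """Return a copy of `entry` flagged as deprecated; other fields are unchanged."""
    return replace(entry, deprecated=True)


def relocate_reference(entry, new_location):
    """Return a copy of `entry` whose normative reference points to a new location."""
    return replace(entry, normative_reference=new_location)
```

Freezing the record makes any other alteration fail at runtime, mirroring the rule that the short name and root namespace of an entry never change.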

    It is the author's intention that as few entries as possible be made into the registry, in order to maximise interoperability.

    Authors wishing to update the syntax and/or semantics of registered formats have two options.
    1. They may define a new specification, such that documents which comply with the new format will also comply with the old one. In order to achieve this, the new format may restrict or deprecate part of the old format. The new format may also add some new elements in a different, previously unused, namespace. Conforming older software will treat these elements as extension elements, and so will ignore the new information. When this approach is used, the root namespace is unchanged, and documents still comply with the original specification. Therefore, a new entry in the registry is not required. This approach MUST only be used when it would make sense for existing software which is unaware of the new format to process the document.
    2. They may define a new version of the specification, and treat that as a new format. The new format will then need to be separately registered.

    Requirements for entry into the registry

    A format may only be entered into the registry if it meets all of the following requirements.
    1. It MUST meet all of the conformance rules for document formats, apart from the rule that it be already registered, and,
    2. It MUST be defined by a clear and unambiguous document, and,
    3. The document mentioned in point 2 MUST be perpetually available to the general public, without entering into any contract or paying any fee, and,
    4. The document mentioned in point 2 MUST be complete and stable (the author is free to define a later version, but the entry in the registry applies to a fixed version and later versions must be registered separately), and,
    5. The author MUST declare that they are unaware of any patents which may affect its implementation (or that the owner(s) of all such patent(s) have granted an irrevocable, royalty-free license to every person of any kind, to use the patent for any purpose).