Physiome ontologies and mark-up language standards
To link the rapidly growing knowledgebase of biological data into a physiome modelling framework, formal vocabularies need to be developed to reduce the growing heterogeneity of terms. This is especially important as models of physiological processes are developed that span multiple spatial scales (genes and proteins to cells, tissues and organs) and incorporate the parameter changes associated with disease. A formal and standardized representation of custom data structures in the many application-specific databases is also needed to provide a common interface between them. Standards must be developed for the representation of both experimental data and mathematical models of physiological processes. Ontologies that represent semantic descriptions of modelling concepts make the modelling environment richer and unambiguous. They also promote the integration of data from ontologies and databases in other areas of biological research and the building of software tools that interpret and use this information.
Roles and representation
Modelling biological systems involves at least three key roles:
1. Biologist – represents biological systems using terms from the biological domain.
2. Mathematician – represents the mathematical formulations of the biological system.
3. Computer scientist/engineer – helps develop representation languages that represent mathematical formulations as structures interpretable by software, for example abstractions of biological and mathematical entities, mathematical relationships and rules for their interpretation.
To provide an effective interplay between these roles, we need particular tools for representing and interpreting the data generated. Producing these representations is one of the roles of the computer scientist/engineer, i.e. to provide tools and structures that serve not only the biologists and mathematicians but also the machine processing of data in simulations, inference engines, and databases.
One set of tools the Physiome group is developing is visual editors. Visual building blocks and interfaces are a very natural way for biologists and mathematicians to build concepts, navigate libraries, and interpret models (Fig. 1). To develop a framework for building toolsets, the computer scientist needs to develop machine-interpretable representations of what modellers are describing. To do this, we have been using two representation languages:
1. "CellML":http://www.cellml.org – a modelling description language.
2. OWL – an ontology representation language specified by the W3C Web Ontology Working Group.
Our integration of ontologies into the Physiome Project aims to provide an unambiguous, machine-interpretable representation of concepts across the biological and mathematical domains of modelling, helping us to communicate biological models through tools for building, sharing, interpreting, simulating, and visualizing them. CellML has been developed by the Auckland Physiome group over the last 4 years, and aims to represent models at the cellular level. While a full description of the language is beyond the scope of this paper, it is useful to touch on what we have gained from it. The CellML language itself is a set of constructs that have elegant interpretations within both the computational simulation domain and the object-oriented programming domain. As a modelling description language, it is generic enough to represent mathematical descriptions of biological models at any level, not just cells, and so serves as a base for us to develop a more generalized modelling representation environment. Our evolution of such an environment so far includes 1) the integration of ontology data, which provides a machine-interpretable pathway through the roles of modelling, and 2) the further development and integration of "FieldML":http://www.bioeng.auckland.ac.nz/fieldml/pages/index.html which enables us to represent structural and continuum-based information about biological and physical entities.
To illustrate the use of CellML, it is useful to briefly describe a modelling project currently under way in the Auckland group (Nickerson, 2003). This project models cardiac electromechanics by integrating models from the cellular, tissue and organ levels. Specifically, the modelling couples electrical events at the cellular level to the propagation of electrical excitation in myocardial tissue and to the contribution that active contraction makes to the finite elasticity of the myocytes. The models at the cellular level are represented in CellML and integrated into the continuum models built using the CMISS (reference?) framework, which is able to process CellML data. Different cell models can be swapped in and out in a plug-and-play fashion, allowing different cellular theories to be tested within the electromechanics model. Cell models used so far in that project are Noble's (1998) model of the guinea-pig ventricular cell (Nickerson, 2001) and a modified version of the "FitzHugh-Nagumo Simplified Cardiac Myocyte Model.":http://www.cellml.org/examples/repository/FN_simplified_model_1961_doc.html
Nickerson, D. 2003. In Proceedings of World Congress on Medical Physics and Biomedical Engineering, August, Sydney, Australia.
Nickerson, D., Hunter, P., Smith, N. 2001. Phil. Trans. R. Soc. Lond. A, 359, 1159-1172.
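The plug-and-play swapping of cell models described above can be illustrated with a minimal Python sketch. It is not CellML or CMISS code: the `simulate` driver, the `passive_cell` stand-in and all parameter values are assumptions made for illustration, and the FitzHugh-Nagumo equations are given in their classic textbook form rather than the modified cardiac variant used in the project.

```python
from typing import Callable, List, Sequence

# A cell model is any function mapping (state, stimulus) to time derivatives.
CellModel = Callable[[Sequence[float], float], List[float]]


def fitzhugh_nagumo(state, stimulus, a=0.7, b=0.8, eps=0.08):
    """Classic FitzHugh-Nagumo equations (textbook parameters, assumed here)."""
    v, w = state
    dv = v - v ** 3 / 3.0 - w + stimulus   # fast excitation variable
    dw = eps * (v + a - b * w)             # slow recovery variable
    return [dv, dw]


def passive_cell(state, stimulus, tau=5.0):
    """Hypothetical stand-in model: a leaky membrane with no recovery dynamics."""
    v, _w = state
    return [(-v + stimulus) / tau, 0.0]


def simulate(model: CellModel, state, stimulus=0.5, dt=0.01, steps=20000):
    """Forward-Euler driver that works with any model of the CellModel shape.

    Returns the trace of the first state variable (the membrane variable).
    """
    current = list(state)
    trace = []
    for _ in range(steps):
        derivs = model(current, stimulus)
        current = [x + dt * dx for x, dx in zip(current, derivs)]
        trace.append(current[0])
    return trace


# The same driver runs either model; only the function passed in changes.
fhn_trace = simulate(fitzhugh_nagumo, [0.0, 0.0])
passive_trace = simulate(passive_cell, [0.0, 0.0])
```

Because the driver depends only on the shared call signature, a different cellular theory can be tested by passing a different function, which is the design idea behind swapping CellML cell models within a larger continuum model.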
Building and integrating ontologies
Ontologies are a vehicle for providing unambiguous descriptions of terms and their relationships to one another. To a computer scientist, they provide a formal framework for describing the properties and relationships of concepts, one that has both a formal logical foundation and a structure amenable to machine processing, interpretation and sharing. To a biologist or modeller, ontologies provide a thesaurus and a structure for understanding and binding terms, ideas, data sets, visualizations, etc.
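As a minimal sketch of what "a structure amenable to machine processing" can mean, the Python fragment below stores a few statements as (subject, predicate, object) triples and answers subsumption questions by following `is_a` links. The terms and predicate names are illustrative assumptions, not entries from any of the ontologies discussed here, and a real OWL ontology supports far richer reasoning than this.

```python
from collections import defaultdict

# (subject, predicate, object) statements; all terms are illustrative only.
TRIPLES = [
    ("L-type calcium channel", "is_a", "calcium channel"),
    ("calcium channel", "is_a", "ion channel"),
    ("ion channel", "is_a", "membrane protein"),
    ("ion channel", "participates_in", "ion transport"),
]


def build_index(triples):
    """Index triples as predicate -> subject -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, predicate, obj in triples:
        index[predicate][subject].add(obj)
    return index


def ancestors(term, index):
    """All terms reachable from `term` through is_a links (transitive closure)."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in index["is_a"].get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_a(term, candidate, index):
    """True if `term` is the same as, or a kind of, `candidate`."""
    return term == candidate or candidate in ancestors(term, index)


INDEX = build_index(TRIPLES)
```

With this index, `is_a("L-type calcium channel", "ion channel", INDEX)` holds even though no triple states it directly, which is the kind of inference a formal ontology makes available to software.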
Many different groups are constructing ontologies for various biological domains. One approach to integrating ontologies from different biological domains would be to generate a large composite ontology. However, this is not the intention of the IUPS project’s ontological framework, because the biological ontologies that currently exist do not form pieces of the same puzzle: they may have biology in common, but it usually stops there. There is currently no agreed core framework or methodology that can be used to guide the development of compatible domain-specific ontologies, but there are efforts under way to promote such a platform. The Unified Medical Language System (UMLS), for example, is attempting to bring together various ontologies from different domains into a composite ontology that fully integrates these knowledge bases (Gangemi, 1999).
The current view of the IUPS project’s ontological framework is shown in Fig. 2. Some new ontologies are being built from scratch, while some existing ones will need to be extended and integrated as both a common framework and a data source (i.e. a composite approach). The focus at present is to describe constructs for interpreting our computer-based model representations within the biological and mathematical levels of modelling. The domains of modelling theory and the ML library domains (Fig. 2) capture representations of mathematical relationships, model architectures, and component structure (both physical and abstract).
Ontologies within the data, simulation, and visualization domains provide a top-level interface to the resources they describe. The hierarchy of modelling shown earlier describes the levels at which a modeller thinks about biological terms, for example a particular organ or cell, or a particular process such as ion transport. These processes and entities are concepts within domains of biology that already have databases and associated ontologies. Instead of defining one particular interpretation of these concepts, we can use these other ontologies directly to describe any biological aspect we are referring to in a particular model or ontological concept in our domain. One of the necessary aims of groups such as TAMBIS (Transparent Access To Multiple Bioinformatics Information Stores; http://imgproj.cs.man.ac.uk/tambis), BioPAX (Biological Pathways Exchange; www.biopax.org), PSI (Protein Standards Initiative) and SBML (Systems Biology Markup Language; www.sbml.org) is that they work together to ensure that their biological concepts are compatible (Fig. 3). In the area of biochemical pathways, the CellML developers are working closely with BioPAX and SBML to establish the foundations for binding cellular domains. An example use-case of such a binding is a pathway of inference starting at concepts in the BioPAX database and ending in selections of models from the CellML database.
Relevant figures are:
- "levels of modelling":LevelsOfModelling.pdf
- "overview of ontology domains":Overview.pdf
- "use-cases":Figures.pdf
Use-case 1 -- A biologist, using an interface to the BioPAX ontology, locates the cAMP/PKA signalling cascade that participates in the regulation of L-type calcium channel activity. From this concept they locate CellML models that describe this system, and are able to run simulations, manipulate these, and visualize their behaviour.
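The path from a pathway concept to a set of models in use-case 1 can be sketched in a few lines of Python. Everything here is hypothetical: the concept names, the links between them and the model identifiers are invented for illustration and do not come from the BioPAX ontology or the CellML repository.

```python
# Hypothetical pathway ontology: concept -> concepts it contains or regulates.
PATHWAY_LINKS = {
    "cAMP/PKA signalling cascade": [
        "PKA phosphorylation",
        "L-type calcium channel activity",
    ],
    "PKA phosphorylation": [],
    "L-type calcium channel activity": [],
}

# Hypothetical repository annotations: model identifier -> annotated concepts.
MODEL_ANNOTATIONS = {
    "model_A": {"L-type calcium channel activity"},
    "model_B": {"cAMP/PKA signalling cascade", "PKA phosphorylation"},
    "model_C": {"sodium-potassium pump"},
}


def related_concepts(concept, links):
    """The concept itself plus everything reachable through the links."""
    found = {concept}
    queue = [concept]
    while queue:
        for child in links.get(queue.pop(), []):
            if child not in found:
                found.add(child)
                queue.append(child)
    return found


def models_for(concept, links, annotations):
    """Models annotated with the concept or with any concept related to it."""
    wanted = related_concepts(concept, links)
    return sorted(name for name, terms in annotations.items() if terms & wanted)


matches = models_for("cAMP/PKA signalling cascade", PATHWAY_LINKS, MODEL_ANNOTATIONS)
```

The point of the sketch is the separation of concerns: the pathway ontology and the model annotations are maintained independently, and the binding between them is only a shared vocabulary of concept identifiers.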
Use-case 2 -- A biologist locates L-type calcium channels through an anatomical navigation interface. From here they can investigate the 3D structure of the channel, its physiological function, or publications that relate to it. Each of these steps helps to gather or filter a set of models with which they can continue with simulation and visualization.
Use-case 3 -- A biologist with a protein domain motif, perhaps with identifiers from the Gene Ontology or protein interaction databases, obtains a set of models that refer to this motif. From here they can look at 3D structures, physiological function, or visualizations of its behaviour in various models.
Use-case 4 -- A biologist with a data set wants to find a model that could help them interpret data from experiments. Using navigation or query interfaces, they can find a set of models that contain the correct entities, or describe the appropriate physiological process, or use particular modelling theories. From here they enter an iterative process of model fitting and system identification, i.e. reducing the set of models to those that provide useful levels of accuracy. The parameter data sets and the raw data sets themselves can be submitted for peer review to be included in the repositories for other people to use.
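The reduction step in use-case 4 can be sketched as a single filtering pass over candidate models. This is an assumption-laden illustration: the candidate models, the synthetic data and the choice of root-mean-square error as the accuracy measure are all invented here, and no parameter fitting is performed, so a real workflow would repeat this pass after each round of fitting.

```python
import math


def rmse(model, data):
    """Root-mean-square error of a model against (time, value) samples."""
    return math.sqrt(sum((model(t) - y) ** 2 for t, y in data) / len(data))


def select_models(candidates, data, tolerance):
    """Keep the candidates whose error is within tolerance, best first."""
    scored = sorted((rmse(model, data), name) for name, model in candidates.items())
    return [(name, error) for error, name in scored if error <= tolerance]


# Synthetic "experimental" data: an exponential decay sampled on [0, 3].
DATA = [(t / 10.0, math.exp(-t / 10.0)) for t in range(0, 31)]

# Hypothetical candidate models, each a function of time.
CANDIDATES = {
    "exponential_decay": lambda t: math.exp(-t),
    "linear_decay": lambda t: max(0.0, 1.0 - 0.5 * t),
    "constant": lambda t: 0.5,
}

selected = select_models(CANDIDATES, DATA, tolerance=0.05)
```

Tightening or loosening `tolerance` changes how many candidates survive, which corresponds to choosing what counts as a "useful level of accuracy" for the experiment at hand.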
Use-case 5 -- A modeller has located a particular model. They are able to run it in a simulation, visualize its behaviour, interpret the mathematical theories it was built from, and then edit it in a model editor. They are able to submit annotations to the original model, or submit new models for peer review to be included in the repositories and ontologies.
Use-case 6 -- A modeller starts with a publication and obtains a set of models that includes both the models described in the publication and unpublished models of the same processes. They can view comparisons of these models that highlight the similarities and differences in architecture and modelling theories used by the modellers who created them.
Use-case 7 -- A modeller has a particular goal in mind, in this case coupling their model to models that describe systems at a finer physiological scale than theirs. They can find a concept of coupling scales in the navigation interface that interfaces with the modelling theory ontology. From here they see mathematical systems or examples for coupling between scales and, through these, select actual models that implement them. They now select subsystems from the library, or make up their own, and have the option of selecting model templates that help them to couple the subsystems into their model. For example, they might select various subsystems that describe the signalling pathway leading to the activation of L-type calcium channels and integrate these into their continuum model, which may couple a spatial variation of activation of β1 and β2 adrenergic receptors with the resulting spatio-temporal propagation of activation of the muscle.