Review of Units Specification
 Author: Warren Hedley (Bioengineering Institute, University of Auckland)
 Contributors:
 David Bullivant (Bioengineering Institute, University of Auckland)
 Melanie Nelson (Physiome Sciences Inc.)
 Poul Nielsen (Bioengineering Institute, University of Auckland)
 
This document summarises the history of the development of units for CellML and gives some justification for the changes made along the way. It then proposes some new ways of doing things, with particular regard to the way SBML handles units. The January 12 scheme, described in Section 3, didn't last longer than three days, however, and the subsequent changes proposed on January 15 are described in Section 4.
 
The CellML team intended to make the specification of units one of their priorities way back in the heady days of early development in 1999. Many hard years spent trying to develop working code from published models where the units had not been correctly or completely specified had led to this becoming a bit of an issue internally. So it was always clear that wherever a variable was first declared, and whenever bare numbers appeared in equations, units should be associated with these entities  —  the only real question was the best method.
 
In 1999 and early 2000, the method that had been developed used a units attribute as shown in Figure 1. Units were made up of the product of the base and derived SI units, where each base quantity was specified with a scaling prefix (e.g., m for milli) and an exponent. The notation was to put each triplet between square brackets, separating the members with commas. A missing triplet part took the corresponding value from the default triplet [0, dimensionless, 1]. The whole system was incredibly cool because it was concise and unambiguous, so you could fully specify the units everywhere. Also, the units strings could be easily split and manipulated using Perl or XSLT. (Note that in some documentation the order of the triplet was [quantity, scale, exponent] and in others it was [scale, quantity, exponent]; the latter is more readable, so this is used in the examples below.)
 
<variable name="concentration_of_A" units="[m,mol,1][,l,-1]" />
Figure 1
The preferred method of specifying units as of early 2000.
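
The triplet strings are simple enough to split mechanically. The following is a minimal Python sketch of such a parser, not code from the CellML project; the function name `parse_triplets` and the representation of the default scale as the string "0" are assumptions made for illustration, and the [scale, quantity, exponent] order follows Figure 1.

```python
import re

# Defaults from the text: missing parts take the corresponding value
# from [0, dimensionless, 1]. Storing the scale as a string is an assumption.
DEFAULT_SCALE = "0"
DEFAULT_QUANTITY = "dimensionless"
DEFAULT_EXPONENT = 1


def parse_triplets(units):
    """Split a string such as "[m,mol,1][,l,-1]" into (scale, quantity, exponent) tuples."""
    triplets = []
    for body in re.findall(r"\[([^\]]*)\]", units):
        parts = [part.strip() for part in body.split(",")]
        if len(parts) != 3:
            raise ValueError(f"expected three comma-separated parts in [{body}]")
        scale, quantity, exponent = parts
        triplets.append((
            scale or DEFAULT_SCALE,
            quantity or DEFAULT_QUANTITY,
            int(exponent) if exponent else DEFAULT_EXPONENT,
        ))
    return triplets
```

For example, `parse_triplets("[m,mol,1][,l,-1]")` yields one tuple for milli-mol and one for per-litre, with the empty scale of the second triplet filled in from the default.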
 
 
In early 2000, some people may have pointed out that this was perhaps hard to approach for non-mathematical people and could potentially make CellML documents hard to read for biologists and the like. So a system was devised for associating human-readable strings with units declarations, where these strings could then be used for associating units with variables and bare numbers. This system is shown in Figure 2.
 
 
<!-- the <units_abbreviation_table> appears inside a <model> element -->
<units_abbreviation_table>
<units abbreviation="dimensionless" expanded="[,,]" />
<units abbreviation="concentration" expanded="[n,mol,1][m,m,-3]" />
</units_abbreviation_table>
 
<!-- the <variable> element appears inside a <component> element -->
<variable name="A" units="concentration" />
Figure 2
The units abbreviation scheme proposed in early 2000.
 
 
In November, Warren met with the SBML development team at the ISCB in Tokyo, Japan. It was agreed that one of the areas where CellML and SBML should be interoperable if possible was the area of units. The SBML team considered the triplet-based form of units definition in CellML too complex to parse, and suggested breaking down the strings into their individual components, each with their own attributes. This seemed reasonable. The SBML spec at that time didn't allow a scale factor on each quantity, so this was added to SBML and then the two specs could basically agree. At the end of the November meeting, the proposed CellML specification of units looked something like that shown in Figure 3.
 
 
<!-- the <units_abbreviation_table> appears inside a <model> element -->
<units_abbreviation_table>
<units abbreviation="dimensionless" />
 
<units abbreviation="concentration">
<unit exponent="1" scale="n" type="mol" />
<unit exponent="-3" scale="m" type="m" />
</units>
</units_abbreviation_table>
 
<!-- the <variable> element appears inside a <component> element -->
<variable name="A" units="concentration" />
Figure 3
The units scheme agreed on after discussions with the SBML team.
 
 
One of the key features ensuring robustness and re-usability of CellML components and models is the requirement that all variables and bare numbers have a set of units declared for them. This allows the connection of components and models where the units on variables that are to be mapped to one another are different (assuming that they are still of the same dimensionality), and for the consistency checking of equations.
 
CellML provides a dictionary of standard units that may be used in variable declarations and attached to bare numbers in math. This dictionary consists of the base SI units, the standard set of derived SI units, and some additional units commonly found in the types of biological models defined using CellML. References to these units should make use of the actual name of the units, rather than the standard abbreviation, thus avoiding confusion between units and scale factors. The full sets of base and derived SI units are shown in Figure 8 and Figure 9 respectively and the additional units are given in Figure 10. The full list of units that any CellML processing application should understand is given in Figure 4.
 
 
    
| ampere¹ | dimensionless³ | joule² | lumen² | numberof³ | sievert² |
| becquerel² | farad² | katal² | lux² | ohm² | steradian² |
| calorie³ | gram³ | kelvin¹ | meter¹ | pascal² | tesla² |
| candela¹ | gray² | kilogram¹ | metre¹ | radian² | volt² |
| celsius³ | henry² | liter³ | mole¹ | second¹ | watt² |
| coulomb² | hertz² | litre³ | newton² | siemens² | weber² |
Figure 4
The dictionary of units keywords that CellML processing applications are expected to recognise. Keywords marked with a superscript of 1 are base SI units, those with a superscript of 2 are derived SI units, and those with a 3 are additions to the standard units defined purely for the convenience of model authors using CellML.
 
 
CellML also provides a facility whereby new units can be defined in terms of the units defined in the dictionary. This functionality allows model authors to create complex units (made up of the products of simple units), to define imperial units (which are expressed as a scaled version of an SI unit), and even to create units that require an offset (such as Fahrenheit). Authors can therefore work with whatever set of units they feel comfortable in, secure in the knowledge that their models can be integrated with those of other authors using other units.
 
New units are defined using the <units> element, which has a name attribute, the value of which is used to reference the units in variable declarations or on bare number elements. The content of a <units> element is a sequence of <unit> elements, where each unit corresponds to one of the basic quantities, the product of which will be the final units type.
Every <unit> element is empty and may have up to five attributes. The most important of these, and the only one which is required, is the type attribute. The type attribute is used to set the base quantity for the current <unit> element, and its value must correspond to a string from the standard units dictionary, or to the name of some previously defined units.
The optional offset attribute is used to shift the transformation between the current units and the base unit being referenced by the type attribute. This should only be necessary to define the Fahrenheit temperature scale. If the offset attribute is not present, it assumes a default value of 0.
The scale attribute, if present, can be used to indicate a scale factor for the unit type. If its value is a letter, it must be from the standard set of unit pre-multiplier symbols given in Figure 5. If its value is an integer, then the type is pre-multiplied by 10 to the power of this number. If no scale attribute value is specified, it is assumed that the unit type stands alone, i.e., is pre-multiplied by one.
 
    
| symbol | name | factor | symbol | name | factor |
| Y | yotta | 10²⁴ | d | deci | 10⁻¹ |
| Z | zetta | 10²¹ | c | centi | 10⁻² |
| E | exa | 10¹⁸ | m | milli | 10⁻³ |
| P | peta | 10¹⁵ | u | micro | 10⁻⁶ |
| T | tera | 10¹² | n | nano | 10⁻⁹ |
| G | giga | 10⁹ | p | pico | 10⁻¹² |
| M | mega | 10⁶ | f | femto | 10⁻¹⁵ |
| k | kilo | 10³ | a | atto | 10⁻¹⁸ |
| h | hecto | 10² | z | zepto | 10⁻²¹ |
| da | deka | 10¹ | y | yocto | 10⁻²⁴ |
Figure 5
The set of letters that may be used in the scale attribute on a <unit> element and the corresponding scale factors that will pre-multiply the unit.
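
The rule for interpreting a scale value (prefix letter, integer power of ten, or absent) can be sketched as a small lookup. This is an illustrative Python sketch only; the names `PREFIX_EXPONENTS` and `scale_factor` are assumptions, and the table values are copied from Figure 5.

```python
# Prefix symbols and their powers of ten, as listed in Figure 5.
PREFIX_EXPONENTS = {
    "Y": 24, "Z": 21, "E": 18, "P": 15, "T": 12, "G": 9, "M": 6,
    "k": 3, "h": 2, "da": 1, "d": -1, "c": -2, "m": -3, "u": -6,
    "n": -9, "p": -12, "f": -15, "a": -18, "z": -21, "y": -24,
}


def scale_factor(scale=None):
    """Return the factor that pre-multiplies a unit for a given scale value."""
    if scale is None or scale == "":
        return 1.0  # no scale attribute: the unit type stands alone
    if scale in PREFIX_EXPONENTS:
        return 10.0 ** PREFIX_EXPONENTS[scale]
    try:
        # An integer value means "10 to the power of this number".
        return 10.0 ** int(scale)
    except ValueError:
        raise ValueError(f"unrecognised scale value: {scale!r}") from None
```

Under this sketch `scale_factor("k")` and `scale_factor("3")` give the same factor, which is the equivalence the text describes between letter and integer values.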
 
The combination of scale, offset and type is then raised to some power equal to the value of the exponent attribute. This should be an integer. If no exponent attribute value is specified, it is assumed that the unit occurs once, i.e., the exponent attribute has a default value of one. Note that an exponent attribute value of "0" (zero) has the effect of removing the parent <unit> element from the current units.
Finally, a multiplier attribute can be used to pre-multiply the result so far by a further scale factor, allowing the introduction of floating point scale factors. This could be used, for instance, to define a "pound" unit in terms of the SI kilogram.
The offset attribute presents some mathematical problems in unit conversion, so some restrictions must be placed on its use. If the offset attribute is present on a <unit> element, it must be the sole <unit> element within a <units> element. That units element then defines a straightforward conversion according to the following formula (where "Type" refers to the unit being defined):
$$x\ \mathrm{Type} = \mathit{multiplier}\,(x + \mathit{offset})\,(\mathit{scale}\cdot\mathit{type})^{\mathit{exponent}}$$
 
Complex units are the product of numerous basic quantities, and are created by placing several <unit> elements inside a single <units> element. The conversion between the new units and the product of the units named in the type attributes of the <unit> elements is given by the following formula, where m, s, t and e with subscript i are the multiplier, scale, type and exponent of the i-th <unit> element:
$$x\ \mathrm{units} = \left[ m_1 (s_1 t_1)^{e_1} \right] \left[ m_2 (s_2 t_2)^{e_2} \right] \cdots \left[ m_n (s_n t_n)^{e_n} \right] x$$
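
As a worked instance of the product formula, take the concentration definition from Figure 3 (scale "n" on mol with exponent 1, scale "m" on m with exponent -3). The multipliers are assumed to take a default value of one, which the text does not state explicitly.

```latex
\begin{align*}
x\ \mathrm{concentration}
  &= \left[ 1 \cdot (10^{-9}\,\mathrm{mol})^{1} \right]
     \left[ 1 \cdot (10^{-3}\,\mathrm{m})^{-3} \right] x \\
  &= 10^{-9} \cdot 10^{9}\; x\ \mathrm{mol}\,\mathrm{m}^{-3} \\
  &= x\ \mathrm{mol}\,\mathrm{m}^{-3}
\end{align*}
```

So one nano-mol per cubic milli-metre is numerically equal to one mol per cubic metre.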
 
It is not possible to use the offset attribute on any <unit> element with an exponent attribute with value other than "1" or "-1", or that is inside a <units> element containing another <unit> element with a positive exponent value. That is, a unit whose conversion involves an offset may not appear in a product on the "top line" of a units definition. Model authors may only create complex units from previously defined simple units that involve an offset in their conversion if they follow similar rules about exponents. These rules also apply to units defined with the celsius keyword from the standard dictionary, since this unit is calculated with an offset from the base SI unit kelvin.
Here are some practical examples of the effect of these rules: it is possible to define units corresponding to degrees Fahrenheit (in fact these are defined in the CellML fragment in Figure 6). It is also possible to define, for example, inches per degree Fahrenheit, but not Fahrenheit inches or degrees Fahrenheit squared. In the latter cases, a unit involving an offset appears in a product on the top line of the units definition.
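
The restrictions above could be checked mechanically. The Python sketch below follows the "top line" reading of the rule, which is the reading that matches the examples given (inches per degree Fahrenheit allowed, Fahrenheit inches and Fahrenheit squared rejected). It is an interpretation, not the project's validator: the function names, the `OFFSET_TYPES` set, the dictionary representation of a <unit> element, and the "inch" unit are all assumptions.

```python
# Assumption: the caller tracks which unit names involve an offset in their
# conversion. "celsius" is from the standard dictionary, "fahrenheit" from Figure 6.
OFFSET_TYPES = {"celsius", "fahrenheit"}


def involves_offset(unit):
    """True if this <unit> carries a non-zero offset or references an offset unit."""
    return float(unit.get("offset", 0)) != 0.0 or unit["type"] in OFFSET_TYPES


def offset_violations(units):
    """Return a list of messages; an empty list means the definition is allowed."""
    problems = []
    for index, unit in enumerate(units):
        if not involves_offset(unit):
            continue
        exponent = int(unit.get("exponent", 1))
        if exponent not in (1, -1):
            problems.append(f"{unit['type']}: exponent {exponent} is not 1 or -1")
        elif exponent == 1 and any(
            int(other.get("exponent", 1)) > 0
            for position, other in enumerate(units)
            if position != index
        ):
            problems.append(f"{unit['type']}: appears in a product on the top line")
    return problems
```

Each <unit> element is represented here as a plain dictionary of its attributes, so `[{"type": "inch"}, {"type": "fahrenheit", "exponent": "-1"}]` stands for inches per degree Fahrenheit.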
 
The CellML fragment in Figure 6 contains the definition of two simple units. In practice, software would usually want to perform the inverse transformation: i.e., given a number in the newly defined units, software would want to convert that back into SI units so that it could be used in simulation.
 
 
<units name="l">
<unit multiplier="1000" exponent="3" scale="c" type="metre" />
</units>
 
<units name="fahrenheit">
<unit multiplier="0.5555556" offset="-32.0" type="celsius" />
</units>
Figure 6
Some examples demonstrating the use of the <units> and <unit> elements.
 
The first <units> element is used to define a litre (where we assign the new units the abbreviation "l", which doesn't clash with the keyword litre from the standard dictionary of units). In the example a litre is defined as 1000 cubic centimetres. It would also be possible to define a litre as one thousandth of a cubic metre or using any number of possible multipliers and scales. The formula we obtain from the <units> definition is:
$$x\ \mathrm{l} = 1000\, x\, (\mathrm{c}\,\mathrm{metre})^{3}$$
The second <units> element is used to define degrees Fahrenheit as a function of degrees Celsius. The formula we obtain from this <units> element is:
$$x\ \mathrm{fahrenheit} = 0.5555556\, (x - 32.0)\ \mathrm{celsius}$$
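
A quick numeric check of the Fahrenheit formula, chained with the celsius-to-kelvin relation from Figure 10, can be sketched in Python as follows. The function names are hypothetical; the constants are the attribute values from Figure 6 and the offset from Figure 10.

```python
def to_referenced_units(x, multiplier=1.0, offset=0.0):
    """Apply multiplier * (x + offset), the form of the Fahrenheit formula above."""
    return multiplier * (x + offset)


def fahrenheit_to_celsius(x):
    # multiplier and offset are the attribute values from Figure 6
    return to_referenced_units(x, multiplier=0.5555556, offset=-32.0)


def celsius_to_kelvin(x):
    # Figure 10: x celsius = (x + 273.15) kelvin
    return to_referenced_units(x, offset=273.15)


# Boiling point of water: 212 degrees Fahrenheit, converted back to SI.
boiling_kelvin = celsius_to_kelvin(fahrenheit_to_celsius(212.0))
```

Because the multiplier is a rounded value of 5/9, the result is close to 373.15 kelvin rather than exactly equal to it.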
The definition of some complex units is shown in Figure 7, where the definition of the later units is based on the earlier definitions. In the first units element, second is re-named time. In the second units element, concentration is defined as milli-moles per litre. Finally, flux is defined in terms of change of concentration with respect to time.
 
<units name="time">
<unit type="second" />
</units>
 
<units name="concentration">
<unit scale="m" type="mole" />
<unit exponent="-1" type="litre" />
</units>
 
<units name="flux">
<unit type="concentration" />
<unit exponent="-1" type="time" />
</units>
Figure 7
Further examples of units definition including the definition of complex units.
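
Because later definitions refer to earlier ones, software has to resolve a units name recursively before it can compare or convert values. The Python sketch below does this for the three definitions in Figure 7, returning a numeric factor and the exponents of the underlying SI units. It is a sketch under stated assumptions: the data structures, the function name `resolve`, and the reduced `PREFIX_FACTORS` and `DICTIONARY` tables are illustrative, and offsets are ignored entirely.

```python
# A small subset of Figure 5, enough for this example.
PREFIX_FACTORS = {"k": 1e3, "m": 1e-3, "u": 1e-6, "n": 1e-9}

# Dictionary units used here: (factor, exponents of SI units).
# litre follows Figure 10: x litre = 0.001 x metre^3.
DICTIONARY = {
    "second": (1.0, {"second": 1}),
    "mole": (1.0, {"mole": 1}),
    "metre": (1.0, {"metre": 1}),
    "litre": (1e-3, {"metre": 3}),
}

# The definitions from Figure 7, one dictionary of attributes per <unit> element.
DEFINITIONS = {
    "time": [{"type": "second"}],
    "concentration": [{"scale": "m", "type": "mole"},
                      {"exponent": "-1", "type": "litre"}],
    "flux": [{"type": "concentration"},
             {"exponent": "-1", "type": "time"}],
}


def resolve(name):
    """Return (factor, exponents) expressing one `name` in SI units."""
    if name in DICTIONARY:
        factor, exponents = DICTIONARY[name]
        return factor, dict(exponents)
    factor, exponents = 1.0, {}
    for unit in DEFINITIONS[name]:
        sub_factor, sub_exponents = resolve(unit["type"])
        power = int(unit.get("exponent", 1))
        scale = PREFIX_FACTORS.get(unit.get("scale"), 1.0)
        multiplier = float(unit.get("multiplier", 1.0))
        factor *= multiplier * (scale * sub_factor) ** power
        for base, sub_power in sub_exponents.items():
            exponents[base] = exponents.get(base, 0) + sub_power * power
    # Drop units whose exponents cancelled out.
    return factor, {base: p for base, p in exponents.items() if p != 0}
```

Two units can be mapped to one another when `resolve` gives them the same exponents; the ratio of their factors is then the conversion between them.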
 
 
Just when you think you have something good going, Poul Nielsen waltzes (struts?) in and decides he's not happy with it. Not long after we had what we thought was a complete units scheme with documentation suitable for inclusion in the CellML specification (above), Poul started recommending changes. These changes were the result of numerous close readings of the "A Short Introduction To CellML" paper being written for inclusion in the July edition of The Philosophical Transactions of the Royal Society of London (numerous other changes to CellML resulting from the writing of this paper are described in the January 15 meeting minutes). A quick summary of the changes and their justifications is as follows:
 
The scale attribute on the <unit> element was re-named the prefix attribute. This is what the people at NIST call it, and as we hope to be a respected standards organisation like NIST, we should follow existing standards ourselves wherever possible.
The type attribute on the <unit> element was re-named the units attribute. The word "type" was too ontology-like: Poul favoured a re-naming to units, Warren favoured a re-naming to name_ref, and Melanie suggested units_ref. Basically, we wanted something that could be thought of as consistent with everything else, and since we'd already broken consistency with SBML with the previous change, it didn't really matter if we became more inconsistent. Using units is consistent with its use on <variable> elements and with all referencing schemes throughout CellML.
It was decided that the values of the prefix attribute on the <unit> element should be the full names of the prefixes as they appear in the NIST table, to be consistent with the values of the units attribute. So "milli" should be used in place of "m". This also neatly sidesteps the mu/u/micro problem.
The numberof keyword was removed from the units dictionary. Poul didn't like numberof because it was a quantity (like "length") rather than a unit, and because there is implied integer behaviour that we possibly don't want. He suggested that the word items would be more technically correct. This keyword was left out of the paper, but may yet be re-introduced into the CellML specification after correspondence with the SBML folks.
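
The first three changes amount to a mechanical rewrite of existing <unit> elements. A minimal Python sketch of such a migration, using the standard library's ElementTree, is given below. The function name `migrate_units` is hypothetical, and leaving integer scale values unchanged is an assumption, since the summary above does not say how they are treated under the new prefix attribute.

```python
import xml.etree.ElementTree as ET

# Prefix symbols from Figure 5 mapped to their full names.
PREFIX_NAMES = {
    "Y": "yotta", "Z": "zetta", "E": "exa", "P": "peta", "T": "tera",
    "G": "giga", "M": "mega", "k": "kilo", "h": "hecto", "da": "deka",
    "d": "deci", "c": "centi", "m": "milli", "u": "micro", "n": "nano",
    "p": "pico", "f": "femto", "a": "atto", "z": "zepto", "y": "yocto",
}


def migrate_units(xml_text):
    """Rename scale to prefix (with full prefix names) and type to units on <unit> elements."""
    root = ET.fromstring(xml_text)
    for unit in root.iter("unit"):
        if "scale" in unit.attrib:
            scale = unit.attrib.pop("scale")
            # Assumption: values that are not prefix symbols (e.g. integers) pass through.
            unit.set("prefix", PREFIX_NAMES.get(scale, scale))
        if "type" in unit.attrib:
            unit.set("units", unit.attrib.pop("type"))
    return ET.tostring(root, encoding="unicode")
```

Applied to the concentration definition from Figure 7, this turns scale="m" type="mole" into prefix="milli" units="mole" and leaves the exponent attribute untouched.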
 
 
    
| base quantity | name | symbol |
| length | metre (or meter) | m |
| mass | kilogram | kg |
| time | second | s |
| electric current | ampere | A |
| thermodynamic temperature | kelvin | K |
| amount of substance | mole | mol |
| luminous intensity | candela | cd |
Figure 8
The SI base units.
 
 
 
    
| derived quantity | name | symbol | in base units | in other SI units |
| plane angle | radian | rad | m m⁻¹ | |
| solid angle | steradian | sr | m² m⁻² | |
| frequency | hertz | Hz | s⁻¹ | |
| force | newton | N | m kg s⁻² | J / m |
| pressure | pascal | Pa | m⁻¹ kg s⁻² | N / m² |
| energy, work, heat | joule | J | m² kg s⁻² | N m |
| power, radiant flux | watt | W | m² kg s⁻³ | J / s |
| electric charge | coulomb | C | s A | A s |
| electric potential | volt | V | m² kg s⁻³ A⁻¹ | W / A |
| capacitance | farad | F | m⁻² kg⁻¹ s⁴ A² | C / V |
| electric resistance | ohm | Ω | m² kg s⁻³ A⁻² | V / A |
| conductance | siemens | S | m⁻² kg⁻¹ s³ A² | A / V |
| magnetic flux | weber | Wb | m² kg s⁻² A⁻¹ | V s |
| magnetic flux density | tesla | T | kg s⁻² A⁻¹ | Wb / m² |
| inductance | henry | H | m² kg s⁻² A⁻² | Wb / A |
| luminous flux | lumen | lm | | cd sr |
| illuminance | lux | lx | | m⁻² cd sr |
| activity | becquerel | Bq | s⁻¹ | |
| absorbed dose | gray | Gy | m² s⁻² | J / kg |
| dose equivalent | sievert | Sv | m² s⁻² | J / kg |
| catalytic activity | katal | kat | s⁻¹ mol | |
Figure 9
The derived SI units.
 
 
 
    
| quantity | name | symbol | in SI units |
| dimensionless | dimensionless | | |
| energy, work, heat | calorie | cal | x calorie = 4.1868 x joule |
| temperature | celsius | °C | x celsius = ( x + 273.15 ) kelvin |
| mass | gram | g | x gram = 0.001 x kilogram |
| volume | litre (or liter) | l | x litre = 0.001 x metre³ |
| amount | numberof | | |
Figure 10
The additional units added to CellML's standard units dictionary to make it convenient for model authors to define basic quantities.
 