CellML Repository Observations

Some notes on what I have observed about the current state of the Physiome CellML Repository.

CellML Repository

Current Limitations:

  • Versions and variants are not distinguished properly

    • The new repository will address the naming convention.

    • Currently, not much can be done about this.

  • Multiple references/papers are not handled at all

    • A bug collapses both references into one, resulting in a misleading (i.e. wrong) reference for the model.

    • There is no remedy for the broken data until a proper CellML metadata library is available to handle it.

      • That library has been created and is in the process of being integrated with the repository.

      • Old models will need to be fixed, which will be a mostly manual process.

  • Broken RDF graphs created

    • RDF graph generation was broken because resource nodes were presented as literal nodes. A bug/feature in 4Suite compounded the problem in the way it treats such a graph, and the resulting graphs are considered broken by software that deals with RDF in a compliant way, such as PCEnv.

      • The new CellML metadata library fixes that.
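
To illustrate the resource-versus-literal distinction noted above, here is a minimal sketch in Python. It is not the repository's or 4Suite's code; the `Resource` and `Literal` classes, the helper names, and the example.org URIs are all assumptions made for illustration. It serializes one triple in N-Triples syntax, where a resource is written in angle brackets and a literal in double quotes, so a URI emitted as a literal stops being a link to another node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A node identified by a URI (hypothetical helper type)."""
    uri: str


@dataclass(frozen=True)
class Literal:
    """A plain string value (hypothetical helper type)."""
    value: str


def to_ntriples_term(node):
    """Serialize one node: <uri> for resources, "text" for literals."""
    if isinstance(node, Resource):
        return f"<{node.uri}>"
    if isinstance(node, Literal):
        # Only backslashes and double quotes are escaped in this sketch.
        escaped = node.value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    raise TypeError(f"not an RDF node: {node!r}")


def to_ntriples(subject, predicate, obj):
    """Serialize a single triple as one N-Triples line."""
    terms = (subject, predicate, obj)
    return " ".join(to_ntriples_term(t) for t in terms) + " ."


# Made-up URIs, for illustration only.
model = Resource("http://example.org/models/example_model")
cites = Resource("http://example.org/terms/reference")
paper = "http://example.org/papers/example_paper"

correct = to_ntriples(model, cites, Resource(paper))  # object is a node
broken = to_ntriples(model, cites, Literal(paper))    # object is just text
```

In `correct` the object can be followed to further statements about the paper; in `broken` it is only a string that happens to look like a URI, which is the kind of graph a compliant consumer cannot traverse.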

  • Speed

    • It should not take a minute to upload/process a 100 kB CellML model from within the same network, or even from the same computer. Inefficient code was the culprit.

      • The new CellML metadata library is significantly faster.

  • Model listing

    • Showing all 300 models in a full listing on the front page is fine, if long, but a listing of over a thousand models may start to chew up bandwidth.

      • Provide a front page with categories for models, search, and, fine, a link to the full listing of models, broken down into pages with a smaller number of models each (50–100? user-specified?).

    • Tags/Categories are a requested feature

      • CellML RDF metadata has provisions for that. This will be used.

      • Can also be presented as a searchable list.
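
The paged listing and tag filtering suggested above could look roughly like the following Python sketch. The function names, the dictionary layout for a model entry, and the default of 50 models per page are assumptions for illustration, not part of the existing repository code.

```python
def paginate(items, page, per_page=50):
    """Return (items_on_page, total_pages) for a 1-based page number.

    Out-of-range page numbers are clamped to the nearest valid page.
    """
    if per_page < 1:
        raise ValueError("per_page must be at least 1")
    total_pages = max(1, -(-len(items) // per_page))  # ceiling division
    page = min(max(1, page), total_pages)
    start = (page - 1) * per_page
    return items[start:start + per_page], total_pages


def filter_by_tag(models, tag):
    """Keep only the models that carry the given tag (case-insensitive)."""
    wanted = tag.lower()
    return [m for m in models if wanted in {t.lower() for t in m["tags"]}]


# Hypothetical model entries; real ones would come from the RDF metadata.
models = [
    {"name": f"model_{i}", "tags": ["cardiac"] if i % 2 == 0 else ["neuronal"]}
    for i in range(300)
]

cardiac = filter_by_tag(models, "Cardiac")
first_page, total_pages = paginate(cardiac, page=1, per_page=50)
```

Letting `per_page` be a parameter covers the "user-specified" option; the front page would only need the category list and a link to page 1.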

  • Model Documentation

    • There is no way to edit it other than by editing the CellML file.

      • Working on the Plone site. There is no way to reserialize the data back into the file, for lack of an RDF spec for more verbose documentation.

    • No RDF spec for verbose documentation

      • A draft will probably need to be written

  • Validation/Validator

    • Lack thereof

      • To properly integrate Jonathan Cooper's CellML validator, Zope will need to run on Python 2.4, which may require an upgrade.

  • Lack of CellML 1.1 support

    • This is simple on the surface, but to properly support the features and keep everything synchronized (such as providing proper changesets and being able to retrieve submodels or flatten a model), the relationships between the various models will be required.

      • The new repository by Ting Ting should support that properly, and an API will need to be provided.
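
As a rough sketch of what tracking relationships between models involves, the Python below reads the `import` elements of a CellML 1.1 document and walks them to collect every file a model depends on. This is not the new repository's API: the function names, the `load` callback, and the example file names are assumptions, and `xlink:href` values are treated as plain lookup keys rather than resolved as relative URIs.

```python
import xml.etree.ElementTree as ET

CELLML_1_1_NS = "http://www.cellml.org/cellml/1.1#"
XLINK_NS = "http://www.w3.org/1999/xlink"


def list_imports(cellml_text):
    """Return [(href, [component_ref, ...]), ...] for one CellML 1.1 model."""
    root = ET.fromstring(cellml_text)
    imports = []
    for imp in root.findall(f"{{{CELLML_1_1_NS}}}import"):
        href = imp.get(f"{{{XLINK_NS}}}href")
        refs = [
            comp.get("component_ref")
            for comp in imp.findall(f"{{{CELLML_1_1_NS}}}component")
        ]
        imports.append((href, refs))
    return imports


def collect_dependencies(name, load, seen=None):
    """Walk imports transitively; `load(name)` returns the model's XML text."""
    seen = set() if seen is None else seen
    for href, _refs in list_imports(load(name)):
        if href not in seen:  # also guards against circular imports
            seen.add(href)
            collect_dependencies(href, load, seen)
    return seen


# Hypothetical two-file example held in memory instead of on disk.
FILES = {
    "parent.cellml": """<model xmlns="http://www.cellml.org/cellml/1.1#"
        xmlns:xlink="http://www.w3.org/1999/xlink" name="parent">
      <import xlink:href="channel.cellml">
        <component name="na" component_ref="sodium_channel"/>
      </import>
    </model>""",
    "channel.cellml": """<model xmlns="http://www.cellml.org/cellml/1.1#"
        name="channel">
      <component name="sodium_channel"/>
    </model>""",
}

dependencies = collect_dependencies("parent.cellml", FILES.__getitem__)
```

A changeset for `parent.cellml` would have to account for everything in `dependencies`, and flattening would replace each imported component with its definition from those files.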

  • Graphs, diagrams and sessions from software such as PCEnv, JSim, COR.

    • Some way is needed to attach files to a model so that it can be used or viewed in a better light.

      • The repository will need to be upgraded. A URI reference to those files can be created, which may be more flexible than storing the files directly in the repository. However, this will scatter files all over the place and make backups difficult.

  • Model curation

    • Being worked on.

  • Archetypes

    • Not used properly. This one will have to wait, as there are various types of data in different formats, and I am not sure how to integrate the automatically generated output form with the JavaScript add-author function.