Ontology Data Model in Chado Database
The Chado database is driven by ontologies (controlled vocabularies), so modeling them properly is both central and somewhat complicated. As with any other data model, it helps to break the work down and approach it step by step. The parts of the model are described below one at a time, in dependency order: each part has to be in place before the ones that follow it.
Ontology information and namespace
The ontology namespace goes in the name column of the cv table, and the version and date values go in the value column of the cvprop table. Following the standard pattern in the Chado model, each value is also qualified with an ontology term from the cvprop ontology. The cvprop ontology therefore has to be stored first, before any other ontology can store its own properties.
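The ordering described above can be sketched in Python. This is a minimal illustration rather than a loader: it uses an in-memory SQLite database with pared-down versions of the Chado tables, and the names cv_property, internal and example_ontology, as well as the version and date values, are assumptions made up for the example.

```python
import sqlite3

# Pared-down stand-ins for Chado's cv, db, dbxref, cvterm and cvprop tables.
# Real Chado runs on PostgreSQL and carries more columns and constraints.
SCHEMA = """
CREATE TABLE cv (cv_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE db (db_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE dbxref (
    dbxref_id INTEGER PRIMARY KEY,
    db_id INTEGER NOT NULL REFERENCES db (db_id),
    accession TEXT NOT NULL
);
CREATE TABLE cvterm (
    cvterm_id INTEGER PRIMARY KEY,
    cv_id INTEGER NOT NULL REFERENCES cv (cv_id),
    name TEXT NOT NULL,
    dbxref_id INTEGER NOT NULL REFERENCES dbxref (dbxref_id)
);
CREATE TABLE cvprop (
    cvprop_id INTEGER PRIMARY KEY,
    cv_id INTEGER NOT NULL REFERENCES cv (cv_id),
    type_id INTEGER NOT NULL REFERENCES cvterm (cvterm_id),
    value TEXT
);
"""


def insert(conn, sql, params):
    """Run one INSERT and return the id of the new row."""
    return conn.execute(sql, params).lastrowid


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Step 1: store the property ontology first, so its terms can qualify values.
# "cv_property" and "internal" are assumed names for this sketch.
prop_cv = insert(conn, "INSERT INTO cv (name) VALUES (?)", ("cv_property",))
prop_db = insert(conn, "INSERT INTO db (name) VALUES (?)", ("internal",))
type_ids = {}
for prop in ("version", "date"):
    xref = insert(
        conn,
        "INSERT INTO dbxref (db_id, accession) VALUES (?, ?)",
        (prop_db, prop),
    )
    type_ids[prop] = insert(
        conn,
        "INSERT INTO cvterm (cv_id, name, dbxref_id) VALUES (?, ?, ?)",
        (prop_cv, prop, xref),
    )

# Step 2: store the ontology namespace, then its version and date in cvprop,
# each value qualified by a term from the property ontology.
onto_cv = insert(conn, "INSERT INTO cv (name) VALUES (?)", ("example_ontology",))
for prop, value in (("version", "1.0"), ("date", "2013-01-01")):  # made-up values
    insert(
        conn,
        "INSERT INTO cvprop (cv_id, type_id, value) VALUES (?, ?, ?)",
        (onto_cv, type_ids[prop], value),
    )

rows = conn.execute(
    """
    SELECT cv.name, cvterm.name, cvprop.value
    FROM cvprop
    JOIN cv ON cv.cv_id = cvprop.cv_id
    JOIN cvterm ON cvterm.cvterm_id = cvprop.type_id
    WHERE cv.name = ?
    ORDER BY cvterm.name
    """,
    ("example_ontology",),
).fetchall()
print(rows)
```

The final query reads each property back together with the term that qualifies it, which is the join a consumer of the data would typically need.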
Miscellaneous namespaces
This is a one-time setup that is reused to hold the various properties of an ontology term (cvterm), such as comments, alternate ids, database cross references (xrefs), etc.
internal
A namespace in the db table, stored in the name column, used for dbxrefs that have no defined namespace of their own.
comment
This is a cvterm used to store the comment of every cvterm. For this, a comment cvterm is created under the cvterm_property_type namespace, inside the internal db namespace. Quite naturally, it is stored in exactly the same way as any other cvterm (ontology term; details are given below).
synonym
For this, a cv namespace synonym_type is created.
alternate ids
Identical to the comment structure, except that an alt_id cvterm is created.
xrefs (database cross references)
An xref cvterm is created; the model is identical to that of comment or alt_id.
synonym types
There are four synonym types: EXACT, BROAD, NARROW and RELATED. A cvterm is created for each of them under the synonym_type namespace.
relationship property types
There are six of them …
- cyclic
- reflexive
- transitive
- anonymous
- domain
- range
These are stored in the same way as the synonym types.
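The one-time setup in this section could be scripted roughly as follows. The sketch uses an in-memory SQLite database with simplified tables instead of a real Chado instance. Several details are assumptions rather than facts from the model: every term is filed under the internal db, the term name doubles as the dbxref accession, and relationship_property_type is a hypothetical cv name chosen only because the namespace for those six terms is not given here.

```python
import sqlite3

# Simplified stand-ins for Chado's cv, db, dbxref and cvterm tables.
SCHEMA = """
CREATE TABLE cv (cv_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE db (db_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE dbxref (
    dbxref_id INTEGER PRIMARY KEY,
    db_id INTEGER NOT NULL REFERENCES db (db_id),
    accession TEXT NOT NULL,
    UNIQUE (db_id, accession)
);
CREATE TABLE cvterm (
    cvterm_id INTEGER PRIMARY KEY,
    cv_id INTEGER NOT NULL REFERENCES cv (cv_id),
    name TEXT NOT NULL,
    dbxref_id INTEGER NOT NULL UNIQUE REFERENCES dbxref (dbxref_id)
);
"""

# cv namespace -> terms to create in it.
SETUP = {
    "cvterm_property_type": ["comment", "alt_id", "xref"],
    "synonym_type": ["EXACT", "BROAD", "NARROW", "RELATED"],
    # Hypothetical cv name: the namespace for these terms is an assumption.
    "relationship_property_type": [
        "cyclic", "reflexive", "transitive", "anonymous", "domain", "range",
    ],
}


def get_or_create(conn, table, name):
    """Return the id of the named cv or db row, creating it if missing."""
    if table not in ("cv", "db"):
        raise ValueError(f"unsupported table: {table}")
    row = conn.execute(
        f"SELECT {table}_id FROM {table} WHERE name = ?", (name,)
    ).fetchone()
    if row:
        return row[0]
    return conn.execute(
        f"INSERT INTO {table} (name) VALUES (?)", (name,)
    ).lastrowid


def create_term(conn, cv_name, db_name, term_name):
    """Create a dbxref and a cvterm that points at it; return the cvterm id."""
    cv_id = get_or_create(conn, "cv", cv_name)
    db_id = get_or_create(conn, "db", db_name)
    dbxref_id = conn.execute(
        "INSERT INTO dbxref (db_id, accession) VALUES (?, ?)",
        (db_id, term_name),  # assumption: accession mirrors the term name
    ).lastrowid
    return conn.execute(
        "INSERT INTO cvterm (cv_id, name, dbxref_id) VALUES (?, ?, ?)",
        (cv_id, term_name, dbxref_id),
    ).lastrowid


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
for cv_name, term_names in SETUP.items():
    for term_name in term_names:
        create_term(conn, cv_name, "internal", term_name)

counts = dict(
    conn.execute(
        """
        SELECT cv.name, COUNT(*)
        FROM cvterm
        JOIN cv ON cv.cv_id = cvterm.cv_id
        GROUP BY cv.name
        """
    )
)
print(counts)
```

Because get_or_create reuses existing rows, the internal db and each cv namespace are created once, no matter how many terms are filed under them.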
Cvterm (Ontology term)
Term name and id
The term itself is modeled in essentially the same way as comment and alt_id.
The term's name goes in cvterm.name and the identifier (id) goes in the dbxref.accession column. The cv and db namespaces are created beforehand and are reused for every term. For a relationship term, the is_relationshiptype column is set to true (1). Terms, once stored, are generally not deleted; instead, the is_obsolete flag is set on them.
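As a rough illustration of the layout, the sketch below stores one ordinary term and one relationship term, then marks the first as obsolete. It runs against an in-memory SQLite database with simplified tables. The EX ontology, its ids and its term names are made up, and splitting the id at the colon into a db name and an accession is an assumption of this sketch.

```python
import sqlite3

# Simplified stand-ins for the Chado tables; the flag columns are integers.
SCHEMA = """
CREATE TABLE cv (cv_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE db (db_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE dbxref (
    dbxref_id INTEGER PRIMARY KEY,
    db_id INTEGER NOT NULL REFERENCES db (db_id),
    accession TEXT NOT NULL
);
CREATE TABLE cvterm (
    cvterm_id INTEGER PRIMARY KEY,
    cv_id INTEGER NOT NULL REFERENCES cv (cv_id),
    name TEXT NOT NULL,
    dbxref_id INTEGER NOT NULL REFERENCES dbxref (dbxref_id),
    is_obsolete INTEGER NOT NULL DEFAULT 0,
    is_relationshiptype INTEGER NOT NULL DEFAULT 0
);
"""


def store_term(conn, cv_id, term_id, name, is_relationship=False):
    """Store one term: the id goes to dbxref.accession, the name to cvterm.name."""
    db_name, accession = term_id.split(":", 1)  # assumed "DB:accession" shape
    # The db namespace must already exist; it is looked up, not created.
    db_id = conn.execute(
        "SELECT db_id FROM db WHERE name = ?", (db_name,)
    ).fetchone()[0]
    dbxref_id = conn.execute(
        "INSERT INTO dbxref (db_id, accession) VALUES (?, ?)",
        (db_id, accession),
    ).lastrowid
    return conn.execute(
        "INSERT INTO cvterm (cv_id, name, dbxref_id, is_relationshiptype) "
        "VALUES (?, ?, ?, ?)",
        (cv_id, name, dbxref_id, int(is_relationship)),
    ).lastrowid


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# The cv and db namespaces are created once and reused for every term.
cv_id = conn.execute(
    "INSERT INTO cv (name) VALUES (?)", ("example_ontology",)
).lastrowid
conn.execute("INSERT INTO db (name) VALUES (?)", ("EX",))

term = store_term(conn, cv_id, "EX:0000001", "example term")
store_term(conn, cv_id, "EX:part_of", "part_of", is_relationship=True)

# Retire a term by setting the flag instead of deleting the row.
conn.execute("UPDATE cvterm SET is_obsolete = 1 WHERE cvterm_id = ?", (term,))

rows = conn.execute(
    """
    SELECT db.name, dbxref.accession, cvterm.name,
           cvterm.is_relationshiptype, cvterm.is_obsolete
    FROM cvterm
    JOIN dbxref ON dbxref.dbxref_id = cvterm.dbxref_id
    JOIN db ON db.db_id = dbxref.db_id
    ORDER BY cvterm.cvterm_id
    """
).fetchall()
print(rows)
```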
Synonym
Alternate id
Comment
Xref
Relationship term properties
Term relationship
cvterm_relationship holds the relationships between terms as triplets: subject -> predicate -> object.
A separate table for relationships (graph edges) makes it possible for a child to have multiple parents.
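The triplet layout can be tried out with a small sketch. It uses an in-memory SQLite database in which cvterm is reduced to an id and a name, and the term names (child, parent_a, parent_b) are placeholders. The subject_id, type_id and object_id columns carry the subject, predicate and object of each triplet.

```python
import sqlite3

# cvterm is cut down to id and name; cvterm_relationship keeps its three
# foreign keys, one per element of the subject/predicate/object triplet.
SCHEMA = """
CREATE TABLE cvterm (
    cvterm_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE cvterm_relationship (
    cvterm_relationship_id INTEGER PRIMARY KEY,
    type_id INTEGER NOT NULL REFERENCES cvterm (cvterm_id),
    subject_id INTEGER NOT NULL REFERENCES cvterm (cvterm_id),
    object_id INTEGER NOT NULL REFERENCES cvterm (cvterm_id),
    UNIQUE (subject_id, object_id, type_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Placeholder terms: two predicates and three ordinary terms.
ids = {}
for name in ("is_a", "part_of", "child", "parent_a", "parent_b"):
    ids[name] = conn.execute(
        "INSERT INTO cvterm (name) VALUES (?)", (name,)
    ).lastrowid

# One row per edge, so the same subject can point at several objects.
triplets = [
    ("child", "is_a", "parent_a"),
    ("child", "part_of", "parent_b"),
]
for subject, predicate, obj in triplets:
    conn.execute(
        "INSERT INTO cvterm_relationship (subject_id, type_id, object_id) "
        "VALUES (?, ?, ?)",
        (ids[subject], ids[predicate], ids[obj]),
    )

# Read every parent of "child" back as (subject, predicate, object).
parents = conn.execute(
    """
    SELECT s.name, t.name, o.name
    FROM cvterm_relationship r
    JOIN cvterm s ON s.cvterm_id = r.subject_id
    JOIN cvterm t ON t.cvterm_id = r.type_id
    JOIN cvterm o ON o.cvterm_id = r.object_id
    WHERE s.name = ?
    ORDER BY o.name
    """,
    ("child",),
).fetchall()
print(parents)
```

Keeping the edges in their own table means adding another parent is just another row, with no change to the cvterm rows themselves.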