dictyBase Developers

Solving one problem at a time

Ontology Data Model in Chado Database

Chado database is driven by ontology(Controlled vocabulary) and it is very important, core and somewhat complicated to model it properly. And like any other data model, it is better to break down the steps and approach it in step by step. The various parts of the model are desribed in piece by piece in their dependent order in which the first one need to be done first and so on.

Ontology information and namespace

The ontology namespace gets in the name column and the version and date values goes in the value column of cvprop table. Following the standard pattern in chado model, each value also gets qualified with an ontology terms from cvprop ontology. The cvprop ontology has to be stored first in order for any other ontology to store any of their ontology properties.

model for ontology properties
1
2
3
4
5
SELECT cv.name,cvprop.value FROM cv
 JOIN cvprop ON cv.cv_id=cvprop.cv_id
 JOIN cvterm propterm ON propterm.type_id=proptype.cvterm_id
 JOIN cv proptype ON proptype.cv_id=propterm.cvterm_id
 WHERE proptype.name = 'cvprop';

Miscellaneous namespaces

This is a one time setup and reused to hold various properties of ontology term(cvterm) such as comment, alternate ids, database cross references(xrefs) etc.

internal

Namespace in db table, stored in name column, used in case of dbxrefs without any defined namespace

comment

This is a cvterm namespace to store comment for every cvterm.For this, a comment cvterm is created under cvterm_property_type namespace inside the internal db namespace. Quite naturally, it is stored identically to the model of a cvterm(ontology term, details given below).

synonym

For this, a cv namespace synonym_type is created.

alternate ids

Exactly similar to comment structure, except a cvterm alt_id is created.

xrefs(database cross references)

An xref cvterm is created, model is identical to comment or alt_id.

synonym types

There are four synonym types, EXACT, BROAD, NARROW and RELATED, a cvterm is created for each of them under synonym_type namespace.

relationship property types

There are six of them …

  • cyclic
  • reflexive
  • transitive
  • anonymous
  • domain
  • range

Stored in a similar fashion as that of synonym_types

Cvterm(Ontology term)

Term name and id

The idea for modeling the term itself is essentially the same as that of comment and alt_ids.

The term’s name goes in cvterm.name and the identifier(id) goes to dbxref.accession column. The cv and db namespaces are created before and are reused for every instance of term. For relationship term, is_relationship column is set to true. Terms once stored generally don’t get deleted, rather is_obsolete column flag is toggled for that.

List of term name and identifier in an ontology
1
2
3
4
5
SELECT cvterm.name name, dbxref.accession identifier
 FROM cvterm
 JOIN dbxref ON cvterm.dbxref_id=dbxref.dbxref_id
 JOIN cv ON cv.cv_id=cvterm.cv_id
 WHERE cv.name = 'gene ontology'

Synonym

Alternate id

Comment

Xref

Relationship term properties

Term relationship

cvterm_relationship holds relationship between terms in terms of triplets subject —> predicate —> object.

A seprate table for relationship(graph edges) allows to hold children with multiple parents.

All immediate children of a term
1
2
3
4
SELECT children.name FROM cvterm  children
JOIN cvterm_relatationship cvrel ON children.cvterm_id = cvrel.subject_id
JOIN cvterm parent ON parent.cvterm_id = cvrel.object_id
WHERE parent.name = 'mitochondrion';
All immediate children with type of relation(variation)
1
2
3
4
5
SELECT children.name, relation.name FROM cvterm  children
JOIN cvterm_relatationship cvrel ON children.cvterm_id = cvrel.subject_id
JOIN cvterm parent ON parent.cvterm_id = cvrel.object_id
JOIN cvterm relation ON relation.cvterm_id = cvrel.type_id
WHERE parent.name = 'mitochondrion';
All immediate parents with type of relation
1
2
3
4
5
SELECT parent.name, relation.name FROM cvterm  parent
JOIN cvterm_relatationship cvrel ON parent.cvterm_id = cvrel.object_id
JOIN cvterm children ON children.cvterm_id = cvrel.subject_id
JOIN cvterm relation ON relation.cvterm_id = cvrel.type_id
WHERE children.name = 'mitochondrion';