Then follow the basic introduction to using
Modware-Loader.
Export genome annotations
As mentioned before, annotations are exported in pieces:
first gene models (canonical, non-coding, curated and predicted), then alignments
and promoters. Exports are done with the modware-export command of Modware-Loader.
$_> modware-export help
Available commands:
commands: list the application's commands
help: display a command's help screen
chado2alignmentgff3: Export alignment from chado database in GFF3 format
chado2canonicalgff3: Export canonical gene models from chado database in GFF3 format
chado2dictycanonicalgff3: Export GFF3 with canonical gene models of Dictyostelium discoideum
chado2dictycuratedgff3: Export GFF3 with curated gene models of Dictyostelium discoideum
chado2dictynoncanonicalgff3: Export GFF3 with sequencing center gene models of Dictyostelium discoideum
chado2dictynoncanonicalv2gff3: Export GFF3 with repredicted gene models of Dictyostelium discoideum
chado2dictynoncodinggff3: Export GFF3 with non coding gene models of Dictyostelium discoideum
chado2fasta: Export fasta sequence file from chado database
Common config file
A basic yaml config file to be used for all the exports.
All exports are run with the --feature_name option, which writes the name of the
reference feature in GFF3 column 1.
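As a rough sketch of what such a shared config file might look like, the snippet below writes one from the shell. It assumes that the YAML keys mirror the long option names shown in the help output (dsn, user, password, feature_name). The file name export.yaml, the DSN string and the credentials are placeholders, not values from the original setup.

```shell
# Write a shared config file for all exports.
# Assumption: YAML keys match the long command line option names.
# All values below are hypothetical; substitute your own database details.
cat > export.yaml <<'EOF'
dsn: "dbi:Pg:dbname=chado;host=localhost"
user: chado_user
password: chado_password
feature_name: 1
EOF

# Show what was written.
cat export.yaml
```

The file would then be passed to each subcommand with -c (--configfile), so the connection details do not need to be repeated on every command line.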
Canonical
‘subcommand to export canonical gff3’ (modware-export-canonical.txt)
$_> modware-export chado2dictycanonicalgff3 [-?chlopu][long options...]
-h -? --usage --help Prints this usage information.
--reference_id reference feature name/ID/accession number. In
this case, only all of its associated
features will be dumped
-o --output Name of the output file, if absent writes to
STDOUT
--write_sequence To write the fasta sequence(s) of reference
feature(s), default is true
--attr --attribute Additional database attribute
--pass -p --password database password
--feature_name Output feature name instead of sequence id in
the seq_id field, default is off.
--dsn database DSN
--schema_debug Output SQL statements that are executed,
default to false
-u --user database user
--log_level Log level of the logger, default is error
-l --logfile Name of logfile, default goes to STDERR
-c --configfile yaml config file to specify all command line
options
It exports complete coding gene models along with contig and reference features. Each
model is either curated or predicted (sequencing center); where both exist, the curated
model takes precedence.
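One quick way to confirm that an export contains the expected mix of reference, contig and gene model features is to tally the feature types per reference. The sketch below runs on a tiny made-up GFF3 sample, not on real export output. The file name sample_canonical.gff3 and its contents are invented for illustration.

```shell
# Write a tiny, made-up GFF3 sample (tab-separated) standing in for real output.
{
    printf '##gff-version 3\n'
    printf 'chr1\t.\tchromosome\t1\t5000\t.\t.\t.\tID=chr1;Name=chr1\n'
    printf 'chr1\t.\tcontig\t1\t2500\t.\t+\t.\tID=contig1\n'
    printf 'chr1\t.\tgene\t100\t900\t.\t+\t.\tID=gene1\n'
    printf 'chr1\t.\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1\n'
    printf '##FASTA\n>chr1\nACGT\n'
} > sample_canonical.gff3

# Count features per (column 1, column 3) pair.
# Comment lines are skipped and parsing stops at the optional FASTA section.
summary=$(awk -F'\t' '
    /^##FASTA/ { exit }
    /^#/       { next }
    NF == 9    { count[$1 "\t" $3]++ }
    END        { for (key in count) print key "\t" count[key] }
' sample_canonical.gff3 | sort)
printf '%s\n' "$summary"
```

With --feature_name in effect, column 1 of a real export should show reference feature names rather than sequence IDs, which this tally makes easy to eyeball.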
There will be three exports: one for curated models, one for sequencing center models
and one for the reprediction pipeline.
‘subcommand to export sequencing center gene models’ (modware-export-noncanonical.txt)
$_> modware-export chado2dictynoncanonicalgff3 [-?chlopu][long options...]
-h -? --usage --help Prints this usage information.
--reference_id reference feature name/ID/accession
number. In this case, only all of its
associated features will be dumped
-o --output Name of the output file, if absent
writes to STDOUT
--attr --attribute Additional database attribute
--feature_name Output feature name instead of sequence
id in the seq_id field, default is off.
--pass -p --password database password
--write_sequence_region write sequence region header in GFF3
output, default if off
--source Name of database/piece of
software/algorithm that generated the
gene models. By default it is *Sequencing
Center*.
--dsn database DSN
--schema_debug Output SQL statements that are executed,
default to false
-u --user database user
--log_level Log level of the logger, default is error
-l --logfile Name of logfile, default goes to STDERR
-c --configfile yaml config file to specify all commandline options
Though we use different subcommands, their options are identical.
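Because the options are shared, the three exports can be driven from a single loop. The following is a dry-run sketch: it only prints the commands it would run. The subcommand names come from the command list above, while export.yaml and the output file names are placeholders chosen for this example.

```shell
# Dry run: the commands are collected and printed, not executed.
# export.yaml and the output file names are hypothetical.
commands=""
for pair in \
    "chado2dictycuratedgff3:curated.gff3" \
    "chado2dictynoncanonicalgff3:sequencing_center.gff3" \
    "chado2dictynoncanonicalv2gff3:repredicted.gff3"
do
    subcommand=${pair%%:*}   # text before the first colon
    output=${pair#*:}        # text after the first colon
    commands="${commands}modware-export ${subcommand} -c export.yaml -o ${output}
"
done
printf '%s' "$commands"
```

Once the printed commands look right, they can be run as they are or adapted with further options such as --reference_id.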
$_> modware-export chado2alignmentgff3 [-?chlopu][long options...]
--write_sequence_region write sequence region header in GFF3
output, default if off
-h -? --usage --help Prints this usage information.
--feature_name Output feature name instead of sequence
id in the seq_id field, default is off.
--rt --reference_type The SO type of reference feature,
default is supercontig
-o --output Name of the output file, if absent
writes to STDOUT
--feature_type SO type of alignment features to be
exported
--attr --attribute Additional database attribute
--match_type SO type of alignment feature that will be
exported in GFF3, *_match* is appended to
the feature_type by default.
--pass -p --password database password
--force_name Adds the value of GFF3 *ID* attribute to
*Name* attribute(if absent), off by
default
--add_description If present, add the GFF3 *Note*
attribute. It looks for a feature
property with *description* cvterm. Off
by default
--dsn database DSN
--property List of additional cvterms which will be
used to extract additional feature
properties
--schema_debug Output SQL statements that are executed,
default to false
-u --user database user
--log_level Log level of the logger, default is error
-l --logfile Name of logfile, default goes to STDERR
--species Name of species
--genus Name of the genus
-c --configfile yaml config file to specify all commandline options
--org --organism Common name of the organism whose genomic
features will be exported