If the feature is annotated as being in the reverse sense of a nucleic acid sequence, then that feature's sub-sequence is reverse-complemented before being written out.
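As a rough illustration of what reverse-complementing a feature means, here is a minimal Python sketch. It is not extractfeat's own code: the function names are hypothetical, positions are assumed to be 1-based and inclusive, and only the plain DNA letters A, C, G and T are complemented.

```python
# Illustrative sketch only; the names below are hypothetical, not part of EMBOSS.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the string.

    Characters other than A, C, G, T (for example N) are left unchanged.
    """
    return seq.translate(_COMPLEMENT)[::-1]


def feature_subsequence(seq: str, start: int, end: int, reverse: bool = False) -> str:
    """Return positions start..end (assumed 1-based, inclusive) of seq.

    If the feature is on the reverse strand, the sub-sequence is
    reverse-complemented before being returned.
    """
    sub = seq[start - 1:end]
    return reverse_complement(sub) if reverse else sub
```

For example, under these assumptions a forward feature at positions 2 to 4 of 'AACCGGTT' gives 'ACC', and the same feature in the reverse sense gives 'GGT'.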
It is often useful to have some information on the context of the feature. extractfeat lets you specify a number of bases or residues before and/or after the feature to include in the sequence that is written out.
If you are interested in extracting the sequence of the region around the start or end of the feature, then this can also be specified.
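The idea of adding context around a feature can be sketched in Python as follows. This covers only the simple case of non-negative flank sizes. The function name, the 1-based inclusive coordinates and the clamping at the ends of the sequence are assumptions made for illustration, not a description of how extractfeat is implemented.

```python
# Illustrative sketch only; with_context is a hypothetical helper.
def with_context(seq: str, start: int, end: int, before: int = 0, after: int = 0) -> str:
    """Return the feature at start..end (assumed 1-based, inclusive)
    together with `before` residues upstream and `after` residues downstream.

    The region is clamped to the sequence ends (an assumption of this sketch).
    """
    if before < 0 or after < 0:
        raise ValueError("this sketch only handles non-negative flank sizes")
    left = max(0, start - 1 - before)   # 0-based start of the extended region
    right = min(len(seq), end + after)  # 0-based exclusive end of the extended region
    return seq[left:right]
```

With the sequence 'ABCDEFGHIJ' and a feature at positions 4 to 6 ('DEF'), asking for 2 residues before and 1 after would give 'BCDEFG' in this sketch.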
'joined' features can either be extracted as individual sequences, or as a single concatenated sequence if the '-join' qualifier is used.
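The difference between the two ways of handling a 'joined' feature can be sketched as below. The function name and the representation of a joined feature as a list of (start, end) pairs with 1-based inclusive positions are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only; extract_joined is a hypothetical helper.
from typing import List, Tuple


def extract_joined(seq: str, segments: List[Tuple[int, int]], join: bool = False) -> List[str]:
    """Extract the components of a joined feature.

    segments holds (start, end) pairs, assumed 1-based and inclusive.
    With join=False each component is returned as its own sequence;
    with join=True the components are concatenated into a single sequence.
    """
    parts = [seq[start - 1:end] for start, end in segments]
    return ["".join(parts)] if join else parts
```

For the sequence 'AAACCCGGGTTT' and components at 1 to 3 and 7 to 9, this gives two sequences, 'AAA' and 'GGG', without joining, and the single sequence 'AAAGGG' with joining.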
Please remember that the output feature sequence is only as good as the annotation. If you rely on other people's or other programs' annotation of features, then some of it will be incorrect.
Feature tables in Swissprot, EMBL, GFF, etc. format can be added using '-ufo featurefile' on the command line.
The sequences of the specified features are written out.
The ID name of the sequence is formed from the original sequence name with the start and end positions of the feature appended to it. So if the feature came from positions 10 to 22 of a sequence with an ID name of 'XYZ', then the resulting ID name of the feature sequence will be 'XYZ_10_22'.
The name of the type of feature is added to the start of the description of the sequence in brackets, e.g.: '[exon]'.
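The naming rules for the output sequence can be sketched in Python as follows. The function names are hypothetical, and the single space placed between the bracketed feature type and the rest of the description is an assumption of this sketch.

```python
# Illustrative sketch only; these helpers are hypothetical, not part of EMBOSS.
def feature_id(seq_name: str, start: int, end: int) -> str:
    """Append the feature's start and end positions to the original sequence name."""
    return f"{seq_name}_{start}_{end}"


def feature_description(feature_type: str, description: str = "") -> str:
    """Put the feature type, in brackets, at the start of the description.

    The separating space is an assumption made for this sketch.
    """
    prefix = f"[{feature_type}]"
    return f"{prefix} {description}" if description else prefix
```

Here feature_id('XYZ', 10, 22) gives 'XYZ_10_22', and feature_description('exon') gives '[exon]'.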
The sequence is written out as a normal sequence.
If the feature is in the reverse sense of a nucleic acid sequence, then it is reverse-complemented before being written.
If you are extracting 'joined' features and one or more of the component features is in a different sequence entry, then the whole joined feature is ignored.
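The rule for joined features that span several entries can be sketched as a simple filter. The data layout used here is hypothetical: each component is an (entry_name, start, end) tuple, where entry_name is None when the component lies in the current entry.

```python
# Illustrative sketch only; the component layout and the function name are hypothetical.
from typing import List, Optional, Tuple

Component = Tuple[Optional[str], int, int]  # (entry name or None, start, end)


def is_extractable(components: List[Component], current_entry: str) -> bool:
    """Return True only if every component lies in the current sequence entry.

    A single component located in another entry causes the whole joined
    feature to be ignored.
    """
    return all(entry is None or entry == current_entry for entry, _, _ in components)
```

In this sketch, a joined feature with components (None, 1, 50) and ('ABC123', 10, 90) would be skipped when the current entry is 'XYZ', because the second component lies in another entry.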