By default the output is sent to the screen for the user to view, but it can also be written to a file.
The output highlights differences or similarities between each sequence and a reference sequence by replacing selected types of match to the reference sequence with '.' characters.
The reference sequence can be either the calculated consensus sequence (the default) or it can be one of the set of aligned sequences, specified by either the ordinal number of that sequence in the input file, or by its name.
The output sequences can be displayed in either the input order (the default) or they can be sorted in order of their similarity to the reference sequence or sorted alphabetically by their names.
By using the '-show' option, the displayed sequences can either be shown as:
A small table of the way these alignments are displayed illustrates this.
If we have a reference protein sequence of "III" and a sequence aligned
to this of "ILW", then we have an identical matching residue, then a
similar one, then a dissimilar one.
The different methods of display would give the following:
  Reference   III
  All         IlW
  Identical   I..
  Non-id      .lW
  Similar     Il.
  Dissimilar  ..W
Changing the similar matches to lowercase can optionally be disabled by using the option -nosimilarcase.
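The display modes described above can be sketched in a few lines of Python. This is only an illustration, not showalign's own code: the similarity groups, the mode names and the functions `classify` and `display` are all assumptions made for the example, whereas showalign itself decides similarity from a scoring matrix.

```python
# Hypothetical similarity groups, used only for this illustration;
# showalign derives similarity from a scoring matrix instead.
SIMILAR_GROUPS = [set("ILVM"), set("FWY"), set("KRH"), set("DE"), set("ST"), set("NQ")]

def classify(ref_res, res):
    """Return 'identical', 'similar' or 'dissimilar' for one aligned pair."""
    a, b = ref_res.upper(), res.upper()
    if a == b:
        return "identical"
    if any(a in group and b in group for group in SIMILAR_GROUPS):
        return "similar"
    return "dissimilar"

# Which classes of match stay visible in each display mode.
SHOWN = {
    "all": {"identical", "similar", "dissimilar"},
    "identical": {"identical"},
    "non-identical": {"similar", "dissimilar"},
    "similar": {"identical", "similar"},
    "dissimilar": {"dissimilar"},
}

def display(reference, sequence, mode, similar_case=True):
    """Render one aligned sequence against the reference for a display mode."""
    out = []
    for ref_res, res in zip(reference, sequence):
        kind = classify(ref_res, res)
        if kind not in SHOWN[mode]:
            out.append(".")              # hidden class of match
        elif kind == "similar" and similar_case:
            out.append(res.lower())      # similar matches go to lowercase
        else:
            out.append(res)
    return "".join(out)

for mode in SHOWN:
    print(f"{mode:14s}{display('III', 'ILW', mode)}")
```

Passing `similar_case=False` mimics the effect described for -nosimilarcase: similar residues keep their original case.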
The displayed sequence can be numbered by placing a ruler with ticks above the sequence.
The width of a line can be set. The width of a margin to the left of the sequences that shows the sequence names can be set.
Specified regions of the sequence can be displayed in uppercase to highlight them.
The output can be formatted for HTML.
If the output is being formatted for HTML, then specified regions of the sequence can be displayed in any valid HTML colours.
An uppercase consensus symbol indicates that the consensus is strong and a lowercase symbol indicates that it is weak.
The cutoff for setting the case of the consensus is set by the qualifier '-setcase'. If the number of residues at that position that match the consensus value is greater than this, then the symbol is in uppercase, otherwise the symbol is in lowercase. By default, the value of setcase is set so that if more than 50% of the residues at that position are identical to the consensus, then the consensus is in uppercase.
To put all of the consensus symbols into uppercase, set -setcase to zero; to put them all into lowercase, set it to a very large value (try 100000).
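The case rule can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the consensus residue is taken here to be the most common residue in the column (showalign uses a scoring matrix), the default cutoff is assumed to be half the number of sequences, and `consensus_symbol` is a hypothetical helper name.

```python
from collections import Counter

def consensus_symbol(column, setcase=None):
    """Return the consensus symbol for one alignment column.

    column  -- string holding the residue of every sequence at this position
    setcase -- cutoff; assumed to default to half the number of sequences
    """
    residues = column.upper()
    # Assumption: the consensus is simply the most common residue.
    symbol, count = Counter(residues).most_common(1)[0]
    if setcase is None:
        setcase = len(residues) / 2
    # Uppercase only when strictly more residues than the cutoff match.
    return symbol if count > setcase else symbol.lower()

print(consensus_symbol("AAAG"))                  # 3 of 4 match
print(consensus_symbol("AAGG"))                  # 2 of 4 match
print(consensus_symbol("AGTC", setcase=0))       # cutoff of zero
print(consensus_symbol("AAAA", setcase=100000))  # very large cutoff
```

Because the comparison is strictly "greater than", a column in which exactly half of the residues match the consensus is reported in lowercase under the assumed default.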
You can specify a file of ranges to display in uppercase by giving the '-uppercase' qualifier the value '@' followed by the name of the file containing the ranges (e.g. '-upper @myfile').
The format of the range file is:
An example range file is:
# this is my set of ranges
12 23
4 5 this is like 12-23, but smaller
67 10348 interesting region
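A range file of this kind could be read with a few lines of Python. The sketch below infers the format from the example alone (comment lines starting with '#', then a start position, an end position and optional free text), treats positions as 1-based and inclusive, and uses the hypothetical helper names `parse_ranges` and `uppercase_ranges`; none of this is taken from showalign's source.

```python
def parse_ranges(text):
    """Parse range-file text into a list of (start, end, label) tuples."""
    ranges = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        fields = line.split(None, 2)      # start, end, optional trailing text
        start, end = int(fields[0]), int(fields[1])
        label = fields[2] if len(fields) > 2 else ""
        ranges.append((start, end, label))
    return ranges

def uppercase_ranges(sequence, ranges):
    """Uppercase the given regions (assumed 1-based, inclusive positions)."""
    chars = list(sequence)
    for start, end, _label in ranges:
        for i in range(max(start - 1, 0), min(end, len(chars))):
            chars[i] = chars[i].upper()
    return "".join(chars)

example = """# this is my set of ranges
12 23
4 5 this is like 12-23, but smaller
67 10348 interesting region
"""
print(parse_ranges(example))
print(uppercase_ranges("acgtacgtacgt", [(4, 5, "")]))
```

Ranges that run past the end of the sequence are simply clipped in this sketch.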
You can specify a file of ranges to highlight in a different colour when outputting in HTML format (using the '-html' qualifier) by giving the '-highlight' qualifier the value '@' followed by the name of the file containing the ranges (e.g. '-highlight @myfile').
The format of this file is very similar to the format of the above
uppercase range file, except that the text after the start and end
positions is used as the HTML colour name. This colour name is used 'as
is' when specifying the colour in HTML in a '<FONT COLOR=xxx>'
construct (where 'xxx' is the name of the colour).
The standard names of HTML font colours are given in:
http://www.w3.org/TR/REC-html40/types.html
and
http://www.ausmall.com.au/freegraf/ncolour2.htm
and
http://mindprod.com/htmlcolours.html
(amongst other places).
An example highlight range file is:
# this is my set of ranges
12 23 red
4 5 darkturquoise
67 10348 #FFE4E1
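As a rough illustration of how such ranges might be turned into HTML, the Python sketch below wraps each region in a font tag whose colour is passed through unchanged. The exact markup that showalign emits is not reproduced here; the `<font color="...">` form, the 1-based inclusive positions and the function name `highlight_html` are assumptions for the example, and overlapping ranges are simply clipped.

```python
def highlight_html(sequence, ranges):
    """Wrap each (start, end, colour) region of the sequence in a font tag.

    Positions are assumed to be 1-based and inclusive; the colour string
    is used as is, whether it is a name or a '#RRGGBB' value.
    """
    out = []
    pos = 0                                   # next unwritten 0-based index
    for start, end, colour in sorted(ranges):
        begin = max(start - 1, pos)           # clip against earlier ranges
        stop = min(end, len(sequence))        # clip against sequence end
        if begin >= stop:
            continue
        out.append(sequence[pos:begin])       # plain text before the region
        out.append(f'<font color="{colour}">{sequence[begin:stop]}</font>')
        pos = stop
    out.append(sequence[pos:])                # plain text after the last region
    return "".join(out)

print(highlight_html("ACGTACGT", [(1, 2, "red"), (4, 5, "#FFE4E1")]))
```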
Output file format
showalign writes out a text file, optionally formatted for HTML.
Data files
showalign reads in scoring matrices to determine the consensus
sequence and to determine which matches are similar or not.
Notes
None.
References
None.
Warnings
None.
Diagnostic Error Messages
None.
Exit status
It always exits with status 0.
Known bugs
None.
Author(s)
History
Target users
Comments