This program takes two sequences and finds regions where they are identical. These regions are reported in the output file and, optionally, in GFF (General Feature Format) files.
It will not find identical regions shorter than the specified wordsize.
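As an illustration of what "identical regions no shorter than the wordsize" means, here is a minimal Python sketch. It is not the program's own algorithm, which is not described here; it is a simple diagonal scan chosen for clarity. The function name `identical_regions` and the 1-based reporting of start positions are assumptions made for this example.

```python
def identical_regions(seq_a, seq_b, wordsize):
    """Return (length, start_a, start_b) for every maximal run of identical
    characters that is at least `wordsize` long.

    Hypothetical helper for illustration only; start positions are 1-based.
    """
    if wordsize < 1:
        raise ValueError("wordsize must be at least 1")

    regions = []
    len_a, len_b = len(seq_a), len(seq_b)

    # Walk every diagonal of the comparison matrix; offset = i - j.
    for offset in range(-(len_b - 1), len_a):
        i = max(offset, 0)   # 0-based index into seq_a
        j = i - offset       # 0-based index into seq_b
        run = 0              # length of the current run of identities
        while i < len_a and j < len_b:
            if seq_a[i] == seq_b[j]:
                run += 1
            else:
                if run >= wordsize:
                    regions.append((run, i - run + 1, j - run + 1))
                run = 0
            i += 1
            j += 1
        # A run may reach the end of the diagonal.
        if run >= wordsize:
            regions.append((run, i - run + 1, j - run + 1))
    return regions
```

With the made-up sequences `"ACGTACGT"` and `"TTACGTTT"` and a wordsize of 4, this sketch reports two regions; raising the wordsize to 6 reports none, because the longest identical run is five characters.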
The output file begins with the normal 'report' header, which contains the details of the program run and the input sequences.
The data lines consist of five columns separated by spaces or TAB characters. Each line describes one identical region. The first column is the length of the match. The second column is the name of the first sequence. The third column is the start and end position of the match in the first sequence. The fourth and fifth columns are the name of the second sequence and the start and end position of the match in it.
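The five-column layout can be read with a short Python sketch like the one below. Several details are assumptions rather than facts taken from the text above: that header lines begin with `#`, that sequence names contain no whitespace, and that each position column holds two integers (written here as `start..end`). The names `Match` and `parse_match_line` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Match(NamedTuple):
    """One identical region, as described by a single data line."""
    length: int
    name_a: str
    start_a: int
    end_a: int
    name_b: str
    start_b: int
    end_b: int


def parse_match_line(line: str) -> Optional[Match]:
    """Parse one data line; return None for blank lines and '#' lines.

    Assumption: header lines start with '#', and each position column
    contains exactly two integers (for example '2..381').
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None

    fields = line.split()  # splits on spaces or TAB characters
    if len(fields) != 5:
        raise ValueError(f"expected 5 columns, got {len(fields)}: {line!r}")

    length, name_a, pos_a, name_b, pos_b = fields
    # Pull the two integers out of each position column, whatever separates them.
    start_a, end_a = (int(n) for n in re.findall(r"\d+", pos_a))
    start_b, end_b = (int(n) for n in re.findall(r"\d+", pos_b))
    return Match(int(length), name_a, start_a, end_a, name_b, start_b, end_b)
```

For a made-up line such as `380 SEQ1 2..381 SEQ2 1..380`, the sketch returns a `Match` whose `length` is 380 and whose positions are 2 to 381 in the first sequence and 1 to 380 in the second.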