The two sequences are placed on the axes of a rectangular image and wherever there is a similarity between the sequences a dot is placed on the image.
Where the two sequences have substantial regions of similarity, many dots align to form diagonal lines. It is therefore possible to see at a glance where there are local regions of similarity.
dotpath is very similar to the program dottup, which looks for places where words (tuples) of a specified length have an exact match in both sequences and draws a diagonal line over the position of these words.
Using a longer word size displays less random noise and runs extremely quickly, but is less sensitive.
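The effect of the word size can be illustrated with a small Python sketch. This is not the EMBOSS code: the function name word_matches and the hash-index approach are assumptions made purely for illustration. It lists every position pair at which a word of the given length matches exactly in both sequences, which is the set of points a word-based dotplot would draw.

```python
from collections import defaultdict

def word_matches(seq_a, seq_b, wordsize):
    """Return (i, j) start positions (0-based) where the word of length
    `wordsize` starting at seq_a[i] exactly equals the one at seq_b[j].
    Illustrative helper only; not the dottup/dotpath implementation."""
    # Index every word of seq_b by its text.
    index = defaultdict(list)
    for j in range(len(seq_b) - wordsize + 1):
        index[seq_b[j:j + wordsize]].append(j)
    # Look up every word of seq_a in that index.
    hits = []
    for i in range(len(seq_a) - wordsize + 1):
        for j in index.get(seq_a[i:i + wordsize], ()):
            hits.append((i, j))
    return hits

short_words = word_matches("ACGTACGT", "ACGT", 2)
long_words = word_matches("ACGTACGT", "ACGT", 4)
print(len(short_words), len(long_words))
```

In this toy case the shorter word size reports more points than the longer one, which mirrors the trade-off between noise and sensitivity described above.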
dotpath finds all matches of size -wordsize or greater between two sequences. It then reduces the matches found to the minimal set of long matches that do not overlap. This is a way of finding the (nearly) optimal path aligning two sequences. It is not the true optimal path as produced by the algorithms used in water or needle, but for very closely related sequences it will produce the same result and will work well with very long sequences.
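The idea of reducing all matches to a non-overlapping set can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm dotpath actually uses: the function names are hypothetical, the reduction is a plain greedy rule (longest match first, discarding any match that overlaps one already kept in either sequence), and it neither trims overlapping matches nor enforces that the kept matches lie in the same order in both sequences.

```python
def maximal_matches(seq_a, seq_b, wordsize):
    """Return (start_a, start_b, length) for every maximal exact diagonal
    run of at least `wordsize` characters. Brute force, for short inputs."""
    matches = []
    for i in range(len(seq_a)):
        for j in range(len(seq_b)):
            if seq_a[i] != seq_b[j]:
                continue
            # Skip positions that are inside a run started earlier.
            if i > 0 and j > 0 and seq_a[i - 1] == seq_b[j - 1]:
                continue
            length = 0
            while (i + length < len(seq_a) and j + length < len(seq_b)
                   and seq_a[i + length] == seq_b[j + length]):
                length += 1
            if length >= wordsize:
                matches.append((i, j, length))
    return matches

def non_overlapping(matches):
    """Greedy reduction (an assumption, not the EMBOSS rule): take matches
    longest first and keep one only if it overlaps no kept match in
    either sequence."""
    kept = []
    for a, b, n in sorted(matches, key=lambda m: (-m[2], m[0], m[1])):
        if all((a + n <= ka or ka + kn <= a) and
               (b + n <= kb or kb + kn <= b)
               for ka, kb, kn in kept):
            kept.append((a, b, n))
    return sorted(kept)

all_matches = maximal_matches("ACGTACGTTT", "ACGTTT", 3)
path = non_overlapping(all_matches)
print(all_matches, path)
```

Here the two matches found both cover the start of the second sequence, so the greedy rule keeps only the longer one. The discarded match is the kind that the -overlaps qualifier would display in red.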
If you wish to compare the path found by dotpath with the set of all matches found, the qualifier -overlaps will show all matches in red, except for the matches in the minimal path, which are shown in black as normal.
With the -data qualifier, the positions of the matches in the minimal non-overlapping set are written to a file.
This program is closely based on dottup. The difference is that, by default, it displays only the minimal set of non-overlapping matches.
This program uses the same algorithm as diffseq for finding a minimal set of very good matches between two sequences. diffseq may be more convenient if you are looking at the differences between two nearly identical sequences.