twofeat

Function

Description

twofeat reads in the feature tables of sequences and reports occurrences of pairs of specified features.

This is intended for use as a simple data-mining tool to enable you to look for instances of pairs of features that occur near to each other in the same sequence entry.

For each feature of the pair, you can specify its type name, its sense, its score and any tag/value pairs, amongst other things.

You can then specify the type of relationship that the two features should have.

You can specify the minimum and maximum distance between them.

You can specify the type of overlap allowed:
	any type of overlap or no overlap is allowed
	overlap required
	no overlaps are allowed
	overlap required but one feature must not be completely within the other
	feature A must be completely enclosed within feature B
	feature B must be completely enclosed within feature A

You can specify that the distance should be measured:
	from the nearest ends of the two features
	from the left ends
	from the right ends
	from the furthest ends

You can specify that the features should be:
	in any sense
	in the same sense
	in opposite senses

You can specify that the features should be:
	in any order
	feature A then feature B
	feature B then feature A
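
The overlap and distance options described above can be illustrated with a minimal Python sketch. This is not twofeat's own code: the Feature type, the mode names and the exact distance arithmetic are all assumptions made for illustration, with positions taken as 1-based and inclusive.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """Hypothetical feature record; positions are 1-based and inclusive."""
    start: int
    end: int


def overlaps(a, b):
    # Two features overlap if they share at least one position.
    return a.start <= b.end and b.start <= a.end


def encloses(outer, inner):
    # True if 'inner' lies completely within 'outer'.
    return outer.start <= inner.start and inner.end <= outer.end


def overlap_ok(a, b, mode):
    """Check one of the six overlap rules (mode names are made up here)."""
    if mode == "any":
        return True
    if mode == "overlap":
        return overlaps(a, b)
    if mode == "no_overlap":
        return not overlaps(a, b)
    if mode == "overlap_not_within":
        return overlaps(a, b) and not encloses(a, b) and not encloses(b, a)
    if mode == "a_within_b":
        return encloses(b, a)
    if mode == "b_within_a":
        return encloses(a, b)
    raise ValueError(f"unknown overlap mode: {mode}")


def distance(a, b, mode):
    """Measure the distance between two features (assumed definitions)."""
    if mode == "nearest":
        # Gap between the closest ends; taken as 0 when the features overlap.
        if overlaps(a, b):
            return 0
        return max(a.start, b.start) - min(a.end, b.end)
    if mode == "left":
        return abs(a.start - b.start)
    if mode == "right":
        return abs(a.end - b.end)
    if mode == "furthest":
        return max(a.end, b.end) - min(a.start, b.start)
    raise ValueError(f"unknown distance mode: {mode}")
```

With these assumed definitions, features at 10..20 and 30..40 are 10 apart by their nearest ends, 20 apart by their left ends and 30 apart by their furthest ends. A minimum and maximum distance check is then a simple range test on the returned value.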

By default the resulting pairs of features found are then written to a report file as a single feature from the first position of the left-most feature to the last position of the right-most feature. You can modify the output to report the pairs of features with no changes made to them.
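
As a rough illustration of the default output, the combined feature can be thought of as the span covering both members of the pair. The sketch below assumes features are plain (start, end) tuples and that the combined span runs from the smallest start to the largest end; merge_pair is a hypothetical helper, not part of twofeat.

```python
def merge_pair(a, b):
    """Return one (start, end) span covering both features of a pair."""
    # Smallest start position and largest end position of the two features.
    return (min(a[0], b[0]), max(a[1], b[1]))


print(merge_pair((10, 20), (30, 40)))  # (10, 40)
```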

Algorithm

For each sequence:
	identify the features that match the criteria for Feature A
	identify the features that match the criteria for Feature B
	compare all pairs of features
	if they satisfy the requirements output them to the report file
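
The steps above can be sketched in Python for a single sequence. This is only an outline under stated assumptions: features are represented as dictionaries, the matching criteria and the relationship test are passed in as caller-supplied functions, and the feature data shown is made up for the example. It is not the program's actual implementation.

```python
from itertools import product


def find_pairs(features, match_a, match_b, related):
    """Return every (A, B) pair of features that satisfies 'related'."""
    # Identify the features that match the criteria for Feature A and B.
    a_hits = [f for f in features if match_a(f)]
    b_hits = [f for f in features if match_b(f)]
    # Compare all pairs and keep those that satisfy the requirements.
    return [(a, b) for a, b in product(a_hits, b_hits) if related(a, b)]


# Made-up feature table for one sequence (illustrative values only).
features = [
    {"type": "promoter", "start": 5, "end": 15},
    {"type": "CDS", "start": 40, "end": 300},
    {"type": "CDS", "start": 900, "end": 1200},
]

pairs = find_pairs(
    features,
    match_a=lambda f: f["type"] == "promoter",
    match_b=lambda f: f["type"] == "CDS",
    # Example requirement: B starts within 100 positions after A ends.
    related=lambda a, b: 0 <= b["start"] - a["end"] <= 100,
)

for a, b in pairs:
    print(a["type"], a["start"], a["end"], b["type"], b["start"], b["end"])
```

Comparing all pairs is quadratic in the number of matching features, which is usually acceptable for the feature table of a single entry.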

Usage

Command line arguments


Input file format

twofeat reads any normal sequence USA.

Output file format

twofeat outputs a report format file. The default format is table.

Data files

None.

Notes

twofeat cannot find features that are not annotated in the input sequences, and it has no way of checking whether the input features are correct. Remember this when you are searching public databases.

References

None.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

It always exits with status 0.

Known bugs

There is a slight memory leak that must be fixed at some time. This does not affect the results.

Author(s)

History

Target users

Comments