notseq

Function

Description

When you have a set of sequences (for example, a file containing multiple sequences) and you wish to remove one or more of them from the set, use notseq.

This program was written for the case where a file containing several sequences is being used as a small database, but some of the sequences are no longer required and must be deleted from the file.

notseq splits the input sequences into those that you wish to keep and those you wish to exclude.

notseq takes a set of sequences as input together with a list of sequence names or accession numbers. It also takes the name of a new file in which to write the sequences that you want to keep and, optionally, the name of a file to receive the sequences that you want excluded from the set.

notseq then reads in the input sequences. Those that match one of the sequence names or accession numbers are written to the file of excluded sequences; those that do not match are written to the file of sequences to be kept.

Note that the names of the sequences to be excluded are not standard EMBOSS USAs. Only the name or accession number should be specified, not the database or file in which these entries may occur. The excluded sequence names are matched against the names of the input sequences. Wildcarded names may be specified by using '*'s. Any specified names of sequences to be excluded that are not found are simply ignored.
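
The splitting and wildcard matching described above can be illustrated with a short Python sketch. This is not notseq's source code: the function names, the record layout (name, accession, sequence) and the example entries are made up for illustration, and the sketch assumes that matching is case-sensitive and that '*' is the only wildcard character.

```python
import re


def name_matches(pattern, name):
    """Return True if name matches pattern; '*' matches any run of characters."""
    # Escape everything except '*', which becomes the regex '.*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, name) is not None


def split_sequences(records, exclude_patterns):
    """Partition (name, accession, sequence) records into (kept, excluded)."""
    kept, excluded = [], []
    for record in records:
        name, accession, _sequence = record
        hit = any(
            name_matches(p, name) or name_matches(p, accession)
            for p in exclude_patterns
        )
        (excluded if hit else kept).append(record)
    return kept, excluded


# Hypothetical input set; the names and accessions are invented.
records = [
    ("clone186", "H45989", "ACGT"),
    ("clone876", "H45993", "GGCC"),
    ("myc", "H46000", "TTAA"),
]

# "nosuchname" matches nothing and is simply ignored.
kept, excluded = split_sequences(records, ["clone*", "nosuchname"])
```

Here the two "clone" entries end up in the excluded list and "myc" is kept, mirroring the two output files.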

Usage

Command line arguments


Input file format

notseq reads normal sequence USAs.

The names (or accession numbers) of the sequences to be excluded can be entered as a file of such names by specifying an '@' followed by the name of the file containing the sequence names. For example: '@names.dat'.
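
As a rough illustration of the '@' convention, the following Python sketch expands an exclusion specification into a list of names. The function name is hypothetical, and two details are assumptions not stated in this document: that the list file holds one name per line, and that names given directly are separated by spaces or commas.

```python
from pathlib import Path


def read_exclude_names(spec):
    """Expand an exclusion spec into a list of names.

    '@file' reads the names from file (assumed: one per line, blank lines
    skipped). Anything else is treated as a list of names separated by
    spaces or commas (also an assumption).
    """
    if spec.startswith("@"):
        text = Path(spec[1:]).read_text()
        return [line.strip() for line in text.splitlines() if line.strip()]
    return spec.replace(",", " ").split()
```

For example, `read_exclude_names("@names.dat")` would return the names listed in names.dat, provided that file exists in the current directory.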

The names or accession numbers of the sequences to be excluded are not standard EMBOSS USAs. Only the ID name or accession number can be specified; you cannot specify the sequences as 'database:ID', 'file:accession', 'format::file', etc.

Output file format

notseq writes a normal sequence file.

Data files

None.

Notes

Note that the names or accession numbers of the sequences to be excluded are not standard EMBOSS USAs. Only the ID name or accession number can be specified; you cannot specify the sequences as 'database:ID', 'file:accession', 'format::file', etc.

References

None.

Warnings

None.

Diagnostic Error Messages

If none of the specified sequence names matches an input sequence, the message "This is a warning: No matches found." is displayed.

Exit status

notseq exits with a status of 0 unless none of the specified sequence names matches any input sequence, in which case it exits with a status of -1.
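
When scripting around this exit status, it may help to remember that POSIX systems report only the low 8 bits of a program's exit value to the parent process, so a status of -1 is normally observed as 255. The Python sketch below shows that arithmetic; the helper names are hypothetical, and the behaviour on non-POSIX platforms is not covered.

```python
def status_seen_by_parent(exit_value):
    """Return the 8-bit status that a POSIX parent process observes."""
    return exit_value & 0xFF


# The value a calling script would see for the "no matches" case (assumed POSIX).
NO_MATCHES = status_seen_by_parent(-1)


def no_matches(observed_status):
    """Return True if the observed status corresponds to an exit value of -1."""
    return observed_status == NO_MATCHES
```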

Known bugs

None.

Author(s)

History

Target users

Comments