The umgap snaptaxon command takes one or more taxon IDs. For each taxon, it searches its ancestors for any that have the specified rank (e.g., -r species) or that are among the listed taxa (e.g., -t 1598 -t 1883). If any match, the taxon is replaced by the most specific of these matching taxa. Otherwise, it is mapped to the root of the taxonomy.
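
The snapping rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the tool's actual implementation. The `snap` function, the `TOY_TAXONOMY` table and its layout are hypothetical, and the lineages in the table are heavily abridged, with most intermediate ranks left out. The sketch also assumes that a taxon counts as its own ancestor, which the example output in the Usage section suggests.

```python
from typing import Dict, FrozenSet, Optional, Tuple

ROOT = 1  # taxon ID of the root of the NCBI taxonomy

# Hypothetical, abridged toy taxonomy: taxon ID -> (parent ID, rank).
# Real lineages contain many more intermediate taxa than shown here.
TOY_TAXONOMY: Dict[int, Tuple[int, str]] = {
    1: (1, "no rank"),
    2: (1, "superkingdom"),
    1239: (2, "phylum"),
    186826: (1239, "order"),
    1598: (186826, "species"),
    1883: (2, "genus"),
}


def snap(
    taxon: int,
    taxonomy: Dict[int, Tuple[int, str]],
    rank: Optional[str] = None,
    targets: FrozenSet[int] = frozenset(),
) -> int:
    """Return the most specific ancestor of `taxon` that has the requested
    rank or is one of `targets`; return the root if nothing matches."""
    current = taxon
    while True:
        parent, current_rank = taxonomy[current]
        # Walking upwards, the first match is the most specific one.
        if current_rank == rank or current in targets:
            return current
        # Reached the top without a match: fall back to the root.
        if current == ROOT or parent == current:
            return ROOT
        current = parent


print(snap(1598, TOY_TAXONOMY, rank="order"))
print(snap(1598, TOY_TAXONOMY, targets=frozenset({1239, 2})))
print(snap(1883, TOY_TAXONOMY, targets=frozenset({1239, 2})))
```

Because the walk starts at the taxon itself and moves towards the root, no separate comparison of candidates is needed to find the most specific match.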

Usage

The input is read from standard input and may be any sequence of FASTA headers and/or lines containing a single taxon ID. The FASTA headers (if any) are copied unchanged to standard output.
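
As a rough sketch of this input handling, the Python snippet below passes header lines through and rewrites every other line with a mapping function. The names `snap_stream` and `snap_one` are hypothetical, and the treatment of blank lines (passed through untouched) is an assumption, not documented behaviour of the tool.

```python
from typing import Callable, Iterable, Iterator


def snap_stream(
    lines: Iterable[str], snap_one: Callable[[int], int]
) -> Iterator[str]:
    """Yield output lines: FASTA headers as-is, taxon IDs mapped by snap_one."""
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(">"):
            # FASTA header: copy over unchanged.
            yield line
        elif line.strip():
            # A line holding a single taxon ID: replace it by its snapped taxon.
            yield str(snap_one(int(line)))
        else:
            # Assumption: blank lines are passed through untouched.
            yield line


# Hypothetical stand-in for the real snapping step.
lookup = {1598: 186826, 1883: 85011}
for out in snap_stream([">header1\n", "1598\n", "1883\n"], lambda t: lookup[t]):
    print(out)
```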

The taxonomy to be used is passed as an argument to this command. This is a preprocessed version of the NCBI taxonomy.

$ cat input.fa
>header1
888268
186802
1598
1883
$ umgap snaptaxon 2020-04-taxons.tsv -r order < input.fa
>header1
38820
186802
186826
85011
$ umgap snaptaxon 2020-04-taxons.tsv -t 1239 2 < input.fa
>header1
1
1239
1239
2

Options

-h / --help
Prints help information
-i / --invalid
Include the invalidated taxa from the taxonomy
-V / --version
Prints version information
-r / --rank r
The rank to snap towards [possible values: superkingdom, kingdom, subkingdom, superphylum, phylum, subphylum, superclass, class, subclass, infraclass, superorder, order, suborder, infraorder, parvorder, superfamily, family, subfamily, tribe, subtribe, genus, subgenus, species group, species subgroup, species, subspecies, varietas, forma]
-t / --taxons t...
A taxon to snap towards (may be given multiple times)