About this computational tool
AltaiR: a C toolkit for alignment-free and spatial-temporal analysis of multi-FASTA data.
This toolkit provides alignment-free and spatial-temporal analysis of multi-FASTA data. It is highly flexible and designed for large-scale data, namely extensive collections of genomes or proteomes, which makes it well suited to scenarios involving many sequences from epidemic and pandemic events. AltaiR is implemented in C, uses multi-threading to increase computational speed, supports multiple applications, and has no external dependencies. The tool accepts any sequence(s) in (multi-)FASTA format.
The AltaiR toolkit contains one main menu (command: AltaiR) with six sub-menus, one for each feature it computes, namely
- average: applies a moving average filter to one column of floats in a CSV file (the column to use is a parameter);
- filter: filters FASTA reads by characteristics: alphabet, completeness, length, CG quantity, multiple string patterns and pattern absence;
- frequency: computes the alphabet frequencies for each FASTA read (it enables alphabet filtering);
- nc: computes the Normalized Compression (NC) for all FASTA reads according to a compression level;
- ncd: computes the Normalized Compression Distance (NCD) for all FASTA reads according to a reference;
- raw: computes Relative Absent Words (RAWs) with CG quantity estimation for all RAWs.
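
To make the "average" feature concrete, the following is a minimal sketch of a moving average filter in C. It is not taken from the AltaiR source: the function name moving_average, the centered window, and the truncation of the window at both ends of the series are all assumptions made for illustration, and the actual tool may define its window differently.

```c
#include <stddef.h>

/* Hypothetical helper, not part of AltaiR.
 * Smooths `in` into `out` (both of length n) with a centered window
 * covering `half` values on each side of position i.
 * Assumption: near the ends the window is truncated, so the mean is
 * taken over fewer values instead of padding the series. */
void moving_average(const double *in, double *out, size_t n, size_t half)
{
    for (size_t i = 0; i < n; i++) {
        /* Clamp the window to the valid index range [0, n - 1]. */
        size_t lo = (i >= half) ? i - half : 0;
        size_t hi = (i + half < n) ? i + half : n - 1;

        double sum = 0.0;
        for (size_t j = lo; j <= hi; j++)
            sum += in[j];

        out[i] = sum / (double)(hi - lo + 1);
    }
}
```

In practice the input array would hold the values parsed from the chosen CSV column, and a larger window gives a smoother profile at the cost of local detail.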