@@ -5,7 +5,10 @@ Pre-parsed static version of the databases are provided. Beware: if you want to
An example of the command needed to run the pipeline is found in CMD_example.
To generate the amplicons [and in silico amplicons] starting from the amplicon protein sequences, use the hmmsearch tool with the HMM profile provided in this repo:
To create the feature table, we recommend following one of the many tutorials available on the qiime2 tutorial pages for creating [feature tables] [https://docs.qiime2.org/2020.2/tutorials/moving-pictures/#obtaining-and-importing-data] from raw amplicon reads. We suggest using only the forward reads: in our experience they contain most of the information, and adding the non-overlapping reverse reads mostly introduces additional problems. For example, joining the forward and reverse reads with an N in between is not supported by many tools.
Once the reads are denoised [with DADA2] [example: https://docs.qiime2.org/2020.2/tutorials/moving-pictures/#option-1-dada2], you should be able to export both the feature-table and the feature-data to non-qiime formats. The feature-data, which contains the denoised nucleotide amplicon sequences, can be translated to protein sequences with any tool capable of it; here we used transeq from the EMBOSS suite [ftp://emboss.open-bio.org/pub/EMBOSS/EMBOSS-6.6.0.tar.gz].
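
If transeq is not available, the translation step can be approximated in plain Python. The sketch below is a stand-in, not the transeq implementation and not part of this pipeline: it assumes the standard genetic code (NCBI translation table 1), translates only the three forward frames, and marks codons containing ambiguous bases as X. The function names and the `_1`/`_2`/`_3` frame suffixes are hypothetical choices made for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate a nucleotide string starting at offset `frame` (0, 1 or 2).

    Stop codons become '*'; codons with non-ACGT characters become 'X'.
    Trailing bases that do not fill a whole codon are ignored.
    """
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def read_fasta(lines):
    """Yield (identifier, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first whitespace-delimited token of the header.
            name, chunks = (line[1:].split() or [""])[0], []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def translate_fasta(lines):
    """Yield (identifier_frame, protein) for the three forward frames."""
    for name, seq in read_fasta(lines):
        for frame in range(3):
            yield f"{name}_{frame + 1}", translate(seq, frame)
```

For example, `dict(translate_fasta([">asv1", "ATGGCCTAA"]))` gives one entry per forward frame, keyed `asv1_1`, `asv1_2` and `asv1_3`. Whether one frame or all three are needed depends on where the amplicon starts relative to the reading frame, so check the result against your own data before relying on it.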
To generate the amplicons [this includes in silico amplicons from any database or metagenome you may have paired with this data] starting from the amplicon protein sequences, use the hmmsearch tool with the HMM profile provided in this repo: