Update README.md

Commit 7f390ccc authored 5 years ago by Tracanna, Vittorio
--- a/README.md
+++ b/README.md
@@ -5,7 +5,10 @@ Pre-parsed static version of the databases are provided. Beware: if you want to

 An example of the command needed to run the pipeline is found in CMD_example.

-To generate the amplicons [and in silico amplicons] starting from the amplicons protein sequence, use hmmsearch tool with the HMM profile provided in this repo:
+To create the feature table, we recommend following one of the many tutorials on the QIIME 2 documentation pages for creating [feature tables](https://docs.qiime2.org/2020.2/tutorials/moving-pictures/#obtaining-and-importing-data) from raw amplicon reads. We suggest using only the forward reads: in our experience they contain most of the information, and adding non-overlapping reverse reads mostly introduces additional problems, for example when forward and reverse reads are joined with an N in between, which many tools do not support.
+Once the reads are denoised, for example [with DADA2](https://docs.qiime2.org/2020.2/tutorials/moving-pictures/#option-1-dada2), you should be able to export both the feature-table and the feature-data to non-QIIME formats. The feature-data, which contains the denoised nucleotide amplicon sequences, can be translated to protein sequences with any tool capable of it; here we used transeq from the [EMBOSS suite](ftp://emboss.open-bio.org/pub/EMBOSS/EMBOSS-6.6.0.tar.gz).
+
+To generate the amplicons (including in silico amplicons from any database or metagenome you have paired with this data) starting from the amplicon protein sequences, use the hmmsearch tool with the HMM profile provided in this repo:

 `hmmsearch -o /path/to/hmmsearch/output/and/filename /path/to/hmm_profile.hmm /path/to/protein/sequences.faa`
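
The export, translation and search steps described in this change could be chained roughly as in the sketch below. It is not the authors' exact procedure. The file names `rep-seqs.qza` and `table.qza` follow the QIIME 2 tutorial naming, and the `exported` directory and output names are hypothetical. `transeq` is called with its default reading frame, which may not suit your primers. By default the script only prints the commands; set `RUN=1` to execute them.

```shell
#!/usr/bin/env bash
# Sketch: print (or, with RUN=1, execute) the export -> translate -> search steps.
set -eu

# Hypothetical file names; adjust to your own data.
FEATURE_DATA="${FEATURE_DATA:-rep-seqs.qza}"     # denoised amplicon sequences
FEATURE_TABLE="${FEATURE_TABLE:-table.qza}"      # feature table
EXPORT_DIR="${EXPORT_DIR:-exported}"
HMM_PROFILE="${HMM_PROFILE:-/path/to/hmm_profile.hmm}"

# Echo the command; run it only when RUN=1.
step() {
  echo "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

pipeline() {
  # Export both artifacts to non-QIIME formats (BIOM table and FASTA).
  step qiime tools export --input-path "$FEATURE_TABLE" --output-path "$EXPORT_DIR"
  step qiime tools export --input-path "$FEATURE_DATA" --output-path "$EXPORT_DIR"
  # Translate the nucleotide amplicons to protein (default frame: an assumption).
  step transeq -sequence "$EXPORT_DIR/dna-sequences.fasta" -outseq "$EXPORT_DIR/proteins.faa"
  # Search the translated amplicons with the HMM profile from this repo.
  step hmmsearch -o "$EXPORT_DIR/hmmsearch.out" "$HMM_PROFILE" "$EXPORT_DIR/proteins.faa"
}

pipeline
```

Printing the commands first makes it easy to check the paths before anything is run on real data.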