Previously, I shared some workflows that our lab uses for denoising 16S rRNA gene sequence data. One of the examples used the popular UNOISE3 algorithm developed by Robert Edgar and implemented in USEARCH.
Below I provide scripts implementing several workflows that the Microbial Metagenomics Analysis Center (MMAC) at CCHMC uses to denoise paired-end 16S rRNA gene sequences. These scripts are written to run on the CCHMC high-performance computing (HPC) cluster.
During the Introduction to Metagenomics Summer Workshop, we discussed denoising amplicon sequence variants and worked through Ben Callahan’s DADA2 tutorial. During that session, I mentioned several other approaches and algorithms for denoising or clustering amplicon sequence data, including UNOISE3, Deblur, and mothur.