One of the major applications of Next Generation Sequencing (NGS) technology is the detection of single nucleotide polymorphisms (SNPs). Several tools have been developed to call SNPs from NGS data. However, most of them require high sequencing depth, which is too expensive to obtain. Here, we propose a novel SNP-detection program, FaSD, to call SNPs from NGS data. Evaluated on two independent datasets from The Cancer Genome Atlas (TCGA) project, with Illumina and Affymetrix SNP arrays as gold standards, FaSD showed superior performance over current state-of-the-art SNP-calling software. FaSD is particularly accurate in calling SNPs when the sequencing depth is low, achieving an area under the curve (AUC) of 95.2% at sequencing depths of 4-5, compared to 86.4% for SOAPsnp and MAQ, and 73.1% for SNVmix2, on normal tissues. FaSD finishes SNP calling within 4 hours for ten-fold human genome NGS data (91 GB in total) on a standard desktop computer.

Suggested Pipeline

Here is a suggested pipeline for analyzing pooled or individual sequencing data (paired-end reads from Illumina), for your reference. You may need to modify several steps for different situations, such as single-end reads or data from other platforms.

[Figure: evaluation pipeline]


