IntSIM: An Integrated Simulator of Next-Generation Sequencing Data
Objective: Next-generation sequencing data are widely used for DNA variant discovery and tumor studies through computational tools. Effective simulation of such data with realistic features is necessary for testing existing tools and guiding the development of new ones. Methods: We present an integrated simulation system, IntSIM, that simulates common DNA variants and generates sequencing reads for mixture genomes. IntSIM has three novel features in comparison with other simulation programs: 1) it can simulate both germline and somatic variants in the same sequence, 2) it accounts for tumor purity, so that it generates reads corresponding to heterogeneous genomes and also produces tumor-normal matched samples, and 3) it simulates correlations among SNPs and among CNVs/CNAs using hidden Markov models (HMMs) trained on real sequenced genomes, and can simulate both broad and focal CNV/CNA events. Results: The data simulated by IntSIM reflect characteristics observed in real data and are consistent with the input parameters. The IntSIM software package is freely available at http://intsim.sourceforge.net/ . Conclusion: Across a large number of experiments, IntSIM performs better than other programs in some scenarios, such as the simulation of heterozygous SNPs and CNVs/CNAs, and provides some functions that other programs do not. Significance: Simulation with IntSIM can be used to evaluate the performance of methods for detecting various types of variants and analyzing tumor samples, and especially to provide a realistic assessment of the effect of tumor purity on the identification of somatic mutations.
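To make the role of tumor purity concrete, the following is a minimal Python sketch of how purity and copy number determine the share of reads drawn from the tumor genome at a locus, and the expected variant allele frequency of a somatic mutation. It is not IntSIM's implementation: the function names, parameters, and the simple mixture model (reads sampled in proportion to DNA content, normal cells diploid by default) are assumptions made for illustration only.

```python
import random


def tumor_read_fraction(purity, tumor_cn, normal_cn=2):
    """Expected fraction of reads at a locus that originate from tumor cells.

    Assumes reads are sampled in proportion to DNA content: tumor cells make
    up `purity` of the sample and carry `tumor_cn` copies of the locus, while
    normal cells carry `normal_cn` copies (hypothetical parameter names).
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be in [0, 1]")
    tumor_dna = purity * tumor_cn
    normal_dna = (1.0 - purity) * normal_cn
    total = tumor_dna + normal_dna
    if total == 0:
        raise ValueError("locus has zero copies in the whole sample")
    return tumor_dna / total


def expected_vaf(purity, tumor_cn, mutant_copies, normal_cn=2):
    """Expected variant allele frequency of a somatic mutation.

    `mutant_copies` is the number of tumor copies carrying the mutation;
    normal cells are assumed to carry none.
    """
    if mutant_copies > tumor_cn:
        raise ValueError("mutant_copies cannot exceed tumor_cn")
    total = purity * tumor_cn + (1.0 - purity) * normal_cn
    if total == 0:
        raise ValueError("locus has zero copies in the whole sample")
    return purity * mutant_copies / total


def assign_read_origins(n_reads, purity, tumor_cn, normal_cn=2, seed=0):
    """Label each simulated read as 'tumor' or 'normal' by random sampling."""
    rng = random.Random(seed)
    frac = tumor_read_fraction(purity, tumor_cn, normal_cn)
    return ["tumor" if rng.random() < frac else "normal" for _ in range(n_reads)]
```

Under these assumptions, a heterozygous somatic mutation in a diploid region of a 50% pure sample has an expected allele frequency of 0.25 rather than 0.5, which illustrates why purity affects the identification of somatic mutations.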