
Article details

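Scientific Reports, v.6, 2016, pp.36671 - (indexed in SCI, SCIE)
This indexing information is a beta service based on the journal's indexing status; please confirm with the indexing body whether the article itself is indexed.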

Combining Multiple Hypothesis Testing with Machine Learning Increases the Statistical Power of Genome-wide Association Studies

Mieth, Bettina (Machine Learning Group, Technische Universität Berlin, Berlin, 10587, Germany); Kloft, Marius (Department of Computer Science, Humboldt University of Berlin, Berlin, 10099, Germany); Rodríguez, Juan Antonio (Institut de Biología Evolutiva (CSIC-UPF). Departament de Ciències Experimentals i de la Salut. Universitat Pompeu Fabra, Barcelona, 08003, Spain); Sonnenburg, Sören (TomTom Research, Berlin, 12555, Germany); Vobruba, Robin (Machine Learning Group, Technische Universität Berlin, Berlin, 10587, Germany); Morcillo-Suárez, Carlos (Institut de Biología Evolutiva (CSIC-UPF). Departament de Ciències Experimentals i de la Salut. Universitat Pompeu Fabra, Barcelona, 08003, Spain); Farré, Xavier (Institut de Biología Evolutiva (CSIC-UPF). Departament de Ciències Experimentals i de la Salut. Universitat Pompeu Fabra, Barcelona, 08003, Spain); Marigorta, Urko M. (School of Biology, Georgia Institute of Technology, Atlanta, 30332, GA, USA); Fehr, Ernst (Department of Economics, Laboratory for Social and Neural Systems Research, University of Zurich, Zurich, 8006, Switzerland); Dickhaus, Thorsten (Institute ); Blanchard, Gilles; Schunk, Daniel; Navarro, Arcadi; Müller, Klaus-Robert
  • Abstract

    The standard approach to the analysis of genome-wide association studies (GWAS) is based on testing each position in the genome individually for statistical significance of its association with the phenotype under investigation. To improve the analysis of GWAS, we propose a combination of machine learning and statistical testing that takes correlation structures within the set of SNPs under investigation into account in a mathematically well-controlled manner. The novel two-step algorithm, COMBI, first trains a support vector machine to determine a subset of candidate SNPs and then performs hypothesis tests for these SNPs together with an adequate threshold correction. Applying COMBI to data from a WTCCC study (2007) and measuring performance as replication by independent GWAS published within the 2008-2015 period, we show that our method outperforms ordinary raw p-value thresholding as well as other state-of-the-art methods. COMBI presents higher power and precision than the examined alternatives while yielding fewer false (i.e. non-replicated) and more true (i.e. replicated) discoveries when its results are validated on later GWAS. More than 80% of the discoveries made by COMBI on WTCCC data have been validated by independent studies. Implementations of the COMBI method are available as a part of the GWASpi toolbox 2.0.
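
The two-step idea described in the abstract (an SVM proposes candidate SNPs, then only those candidates are tested) can be illustrated with a minimal sketch. This is not the GWASpi implementation. Everything the abstract does not state is an assumption made for illustration: the function names `train_linear_svm`, `chi2_pvalue` and `combi_sketch`, the parameters `k` and `alpha`, ranking SNPs by absolute SVM weight, the Pearson chi-square test on genotype counts, and the Bonferroni threshold `alpha / k`, which is only a placeholder for the "adequate threshold correction" and is not calibrated for the fact that the same data are used to select and to test.

```python
import math
import numpy as np


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=300):
    """Fit a linear SVM (hinge loss + L2 penalty) by full-batch subgradient descent.

    X: (n, d) float array; y: (n,) array of -1/+1 labels. Returns the weight vector.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        active = y * (X @ w + b) < 1            # samples inside or beyond the margin
        grad_w = lam * w - (y[active] @ X[active]) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w


def chi2_pvalue(genotype, y):
    """Pearson chi-square test of independence: genotype (0/1/2) vs. case/control."""
    table = np.array([[np.sum((genotype == g) & (y == c)) for g in (0, 1, 2)]
                      for c in (-1, 1)], dtype=float)
    table = table[:, table.sum(axis=0) > 0]     # drop genotype classes that never occur
    df = table.shape[1] - 1
    if df == 0 or (table.sum(axis=1) == 0).any():
        return 1.0                              # no variation, nothing to test
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()
    stat = ((table - expected) ** 2 / expected).sum()
    # Closed-form chi-square survival functions for 1 and 2 degrees of freedom.
    return math.erfc(math.sqrt(stat / 2)) if df == 1 else math.exp(-stat / 2)


def combi_sketch(X, y, k=10, alpha=0.05):
    """Step 1: SVM screening of k candidate SNPs. Step 2: test only those candidates."""
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)   # standardise each SNP column
    w = train_linear_svm(Xs, y)
    candidates = np.argsort(-np.abs(w))[:k]     # assumption: rank by |SVM weight|
    threshold = alpha / k                       # assumption: Bonferroni placeholder
    discoveries = [int(j) for j in candidates if chi2_pvalue(X[:, j], y) < threshold]
    return sorted(int(j) for j in candidates), sorted(discoveries)


# Synthetic demo (made-up data): 400 individuals, 50 SNPs, SNP 7 drives the phenotype.
rng = np.random.default_rng(0)
X = rng.binomial(2, 0.3, size=(400, 50))
p_case = 1 / (1 + np.exp(-2.0 * (X[:, 7] - 0.6)))
y = np.where(rng.random(400) < p_case, 1, -1)
candidates, discoveries = combi_sketch(X, y)
print("candidates:", candidates)
print("discoveries:", discoveries)
```

The point of the design is that the SVM looks at all SNPs jointly, so correlated positions influence the ranking, while the second step falls back on ordinary per-SNP tests restricted to the screened subset. A real analysis would replace the placeholder threshold with one calibrated for the full select-then-test procedure.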

