Goldstein et al. BMC Genetics 2010, 11:49
/1471-2156/11/49
RESEARCH ARTICLE Open Access
An application of Random Forests to a
genome-wide association dataset: Methodological
considerations and new findings
Benjamin A Goldstein*(1,3), Alan E Hubbard(1), Adele Cutler(2) and Lisa F Barcellos*(1)
Abstract
Background: As computational power improves, the application of more advanced machine learning techniques to
the analysis of large genome-wide association (GWA) datasets becomes possible. While most traditional statistical
methods can only elucidate main effects of genetic variants on risk for disease, certain machine learning approaches
are particularly suited to discover higher order and non-linear effects. One such approach is the Random Forests (RF)
algorithm. The use of RF for SNP discovery related to human disease has grown in recent years; however, most work has
focused on small datasets or simulation studies which are limited.
Results: Using a multiple sclerosis (MS) case-control dataset comprising 300 K SNP genotypes across the genome,
we outline an approach and some considerations for optimally tuning the RF algorithm based on the empirical
dataset. Importantly, results show that typical default parameter values are not appropriate for large GWA datasets.
Furthermore, gains can be made by sub-sampling the data, pruning based on linkage disequilibrium (LD), and
removing strong effects from RF analyses. The new RF results are compared to findings from the original MS GWA study
and demonstrate overlap. In addition, RF analysis identifies four interesting new candidate MS genes, MPHOSPH9, CTNNA3,
PHACTR2 and IL7, which warrant further follow-up in independent studies.
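As a rough illustration of the LD-pruning step mentioned in the Results, the sketch below greedily drops any SNP whose squared genotype correlation (r^2) with a recently kept SNP reaches a threshold. It is a minimal sketch under stated assumptions, not the authors' procedure: the function names `r_squared` and `ld_prune`, the 0/1/2 genotype coding, the threshold of 0.5, the window of 50 kept SNPs, and the toy data are all hypothetical and do not come from the paper.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 coding)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treated as unlinked
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def ld_prune(genotypes, threshold=0.5, window=50):
    """Greedy LD pruning (threshold and window are assumed values).

    `genotypes` is a list of SNP columns in chromosomal order. A SNP is kept
    only if its r^2 with each of the last `window` kept SNPs is below
    `threshold`. Returns the indices of the kept SNPs.
    """
    kept = []
    for j, snp in enumerate(genotypes):
        recent = kept[max(0, len(kept) - window):]
        if all(r_squared(snp, genotypes[k]) < threshold for k in recent):
            kept.append(j)
    return kept


# Toy data (hypothetical): 4 SNPs genotyped in 6 individuals.
snps = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],  # identical to SNP 0, so r^2 = 1
    [2, 0, 1, 1, 0, 2],  # uncorrelated with SNP 0
    [0, 1, 2, 0, 1, 1],  # r^2 with SNP 0 is about 0.79
]
kept = ld_prune(snps, threshold=0.5)
print(kept)  # [0, 2]
```

The pruned SNP set would then be passed to the RF analysis in place of the full panel. In practice, dedicated genetics tools handle this step at genome-wide scale; the pure-Python version here is only meant to make the idea concrete.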
Conclusions: This