Association Testing of Complex Disease Genes While Controlling for Population Stratification
Published: 2018-10-20 16:07
[Abstract]: Genetic association studies based on unrelated individuals can produce false positive results. Methods such as the SPT method have been developed to overcome this problem. However, when the genetic background variables are computed in this method, data are often missing, which affects the accuracy of the association results to some extent. Without running additional experiments, estimating the missing values is an effective way to reduce the impact of missing data on subsequent analysis. If imputed values are substituted for the missing values, their accuracy directly affects the analysis results. Handling missing values correctly is therefore a very important preprocessing step for ensuring that data analysis and processing are correct and valid. Because the sample expression data in the SPT method take discrete values, this thesis selects the nearest neighbor samples by measuring the correlation between samples and then estimates the missing values with the EM algorithm.
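The last step described in the abstract (choosing neighbors by correlation, then estimating missing values with EM) can be illustrated with a minimal sketch. It is not the thesis's implementation. The function names, the 0/1/2 coding of the discrete values, the neighbor count `k`, the multivariate normal model used inside EM, and the small ridge term are all assumptions made for illustration.

```python
import numpy as np

def correlation_neighbors(X, i, k):
    """Indices of the k rows most positively correlated with row i,
    computed over the columns observed in both rows."""
    scores = []
    for j in range(X.shape[0]):
        if j == i:
            continue
        both = ~np.isnan(X[i]) & ~np.isnan(X[j])
        if both.sum() < 3:
            continue  # too little overlap for a meaningful correlation
        a, b = X[i, both], X[j, both]
        if a.std() == 0 or b.std() == 0:
            continue  # correlation is undefined for constant vectors
        scores.append((np.corrcoef(a, b)[0, 1], j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]

def em_impute(X, n_iter=50, ridge=1e-6):
    """EM for a multivariate normal with missing entries (NaN).
    Assumes every column has at least one observed value."""
    X = np.array(X, dtype=float)
    miss = np.isnan(X)
    n, p = X.shape
    mu = np.nanmean(X, axis=0)
    X_hat = np.where(miss, mu, X)
    sigma = np.atleast_2d(np.cov(X_hat, rowvar=False, bias=True)) + ridge * np.eye(p)
    for _ in range(n_iter):
        C = np.zeros((p, p))  # accumulated conditional covariance of missing parts
        for r in range(n):
            m = miss[r]
            if not m.any():
                continue
            o = ~m
            if not o.any():
                X_hat[r] = mu
                C += sigma
                continue
            S_mo = sigma[np.ix_(m, o)]
            B = S_mo @ np.linalg.pinv(sigma[np.ix_(o, o)])
            # E-step: conditional mean of the missing entries given the observed ones
            X_hat[r, m] = mu[m] + B @ (X_hat[r, o] - mu[o])
            C[np.ix_(m, m)] += sigma[np.ix_(m, m)] - B @ S_mo.T
        # M-step: update the mean and covariance
        mu = X_hat.mean(axis=0)
        D = X_hat - mu
        sigma = (D.T @ D + C) / n + ridge * np.eye(p)
    return X_hat

def impute(X, k=5, levels=(0, 1, 2)):
    """Fill each incomplete row using EM on the row plus its k neighbors,
    then snap each estimate to the nearest allowed discrete level."""
    X = np.array(X, dtype=float)
    out = X.copy()
    levels = np.array(levels, dtype=float)
    for i in np.where(np.isnan(X).any(axis=1))[0]:
        nbrs = correlation_neighbors(X, i, k)
        if not nbrs:
            continue  # no usable neighbor: leave the row as it is
        block = X[[i] + nbrs]
        usable = ~np.isnan(block).all(axis=0)  # columns seen at least once in the block
        est = np.full(X.shape[1], np.nan)
        est[usable] = em_impute(block[:, usable])[0]
        m = np.isnan(X[i]) & usable
        out[i, m] = levels[np.abs(est[m][:, None] - levels).argmin(axis=1)]
    return out
```

Calling `impute(X, k=2)` on a samples-by-markers array with `np.nan` marking missing entries returns a copy in which observed values are untouched. Snapping to the nearest level is one simple way to respect discrete-valued data; a categorical model inside EM would be an alternative.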
[Degree-granting institution]: Heilongjiang University
[Degree level]: Master's
[Year degree granted]: 2008
[Classification number]: R394; O213
Document ID: 2283664
Link to this article: https://www.wllwen.com/yixuelunwen/shiyanyixue/2283664.html