Uploaded 2017-09-01 (Shanghai)
Dinucleotide Weight Matrices for Predicting
Transcription Factor Binding Sites: Generalizing the
Position Weight Matrix
Rahul Siddharthan*
The Institute of Mathematical Sciences, Chennai, Tamil Nadu, India
Abstract
Background: Identifying transcription factor binding sites (TFBS) in silico is key in understanding gene regulation. TFBS are
string patterns that exhibit some variability, commonly modelled as “position weight matrices” (PWMs). Though convenient,
the PWM has significant limitations, in particular the assumed independence of positions within the binding motif; and
predictions based on PWMs are usually not very specific to known functional sites. Analysis here on binding sites in yeast
suggests that correlation of dinucleotides is not limited to near-neighbours, but can extend over considerable gaps.
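The independence assumption mentioned above can be made concrete with a minimal sketch of PWM scoring. The aligned sites, the pseudocount, and the uniform background of 0.25 are illustrative assumptions, not data or parameters from this paper. Each position adds its own log-ratio to the score, so the model cannot express any dependence between positions.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0):
    """Column-wise base probabilities from equal-length aligned sites."""
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so no probability is zero.
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: counts[b] / total for b in BASES})
    return pwm

def log_odds(pwm, seq, background=0.25):
    """Sum of per-position log-ratios against a uniform background.

    Every term depends on one position only: this is the
    independence assumption of the PWM.
    """
    return sum(math.log(col[base] / background) for col, base in zip(pwm, seq))

# Hypothetical toy alignment, made up for illustration only.
sites = ["TGACTC", "TGACTA", "TGAGTC", "TGACTC"]
pwm = build_pwm(sites)
print(log_odds(pwm, "TGACTC"), log_odds(pwm, "AAAAAA"))
```

A sequence matching the consensus scores higher than an unrelated one, but two sequences with the same per-position bases in different combinations cannot be told apart by co-occurrence.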
Methodology/Principal Findings: I describe a straightforward generalization of the PWM model that considers frequencies
of dinucleotides instead of individual nucleotides. Unlike previous efforts, this method considers all dinucleotides within an
extended binding region, and does not make an attempt to determine a priori the significance of particular dinucleotide
correlations. I describe how to use a “dinucleotide weight matrix” (DWM) to predict binding sites, dealing in particular with
the complication that its entries are not independent probabilities. Benchmarks show, for many factors, a dramatic
improvement over PWMs in precision of predicting known targets. In most cases, significant further improvement arises by
extending the commonly defined “core motifs” by about 10 bp on either side. Though this flanking sequence shows no
strong motif at the nucleotide level, the predictive power of the dinucleotide model suggests that the “signature” in DNA
sequence of protein-binding affinity extends beyond the core p
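As a rough illustration of the counting step described in the abstract, the sketch below tabulates dinucleotide frequencies for every pair of positions in a set of aligned sites, not only adjacent ones. The function name, the pseudocount, and the toy sites are assumptions made for this example. It covers the tabulation only and does not reproduce the paper's procedure for scoring with entries that are not independent probabilities.

```python
from itertools import combinations, product

BASES = "ACGT"

def dinucleotide_frequencies(sites, pseudocount=0.5):
    """Frequency of each base pair at every position pair (i, j), i < j."""
    length = len(sites[0])
    freqs = {}
    for i, j in combinations(range(length), 2):
        # One cell per ordered base pair, 16 in total.
        counts = {a + b: pseudocount for a, b in product(BASES, repeat=2)}
        for site in sites:
            counts[site[i] + site[j]] += 1
        total = sum(counts.values())
        freqs[(i, j)] = {pair: c / total for pair, c in counts.items()}
    return freqs

# Hypothetical toy alignment, made up for illustration only.
sites = ["TGACTC", "TGACTA", "TGAGTC", "TGACTC"]
freqs = dinucleotide_frequencies(sites)
print(len(freqs))           # number of position pairs for length-6 sites
print(freqs[(3, 5)]["CC"])  # frequency of C at position 3 with C at position 5
```

Keeping every position pair, including pairs separated by a gap, mirrors the abstract's point that no pair is selected a priori as significant.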