© 2011 Nature America, Inc. All rights reserved.

PROTOCOL
Nature Protocols | VOL.7 NO.1 | 2012 | 45

INTRODUCTION

To understand how cis-regulatory elements determine gene expression, the global identification of in vivo transcription factor binding sites is an invaluable tool. It is usually achieved by ChIP followed by microarray analysis (i.e., ChIP-chip) 1,2, or, more recently, by deep sequencing (ChIP-seq) 3,4. The focus of many current ChIP-seq studies is the comparison of transcription factor binding profiles across different conditions such as different developmental time points 5,6, cell types (e.g., within one cell lineage 7,8) or closely related species 9,10.

However, such comparative ChIP-seq studies are highly dependent on appropriate computational approaches, which are often still lacking. Most notably, stringent thresholds are typically used to reliably identify transcription factor binding sites. However, this method does not discriminate subthreshold binding from truly nonbound regions, and it is subject to noise, which can lead to an underestimation of the overlap in binding between two data sets.

Here we present a computational approach for the comparative analysis of ChIP-seq data that we recently developed to compare binding of the mesodermal transcription factor Twist across six closely related Drosophila species 9 (Fig. 1). We describe technical guidelines and provide code with sample data for the preprocessing and mapping of ChIP-seq reads, the translation of ChIP-seq data to a common reference genome (for cross-species analyses), approaches for a threshold-free comparison of global binding similarity, an analysis of binary presence/absence binding patterns (e.g., to estimate the conservation of binding) and the assessment of quantitative changes in binding. We also discuss functional and comparative sequence analyses of transcription factor binding. Although this pro


