全部 标题 作者
关键词 摘要

OALib Journal期刊
ISSN: 2333-9721
费用:99美元

查看量下载量

相关文章

更多...

A strand specific high resolution normalization method for chip-sequencing data employing multiple experimental control measurements

DOI: 10.1186/1748-7188-7-2

Full-Text   Cite this paper   Add to My Lib

Abstract:

We formulated such a normalization method based on linear regression and made a proof-of-concept implementation in R and C++. It was tested on simulated as well as on publicly available ChIP-seq data on binding sites for two transcription factors, MAX and FOXA1 and two control samples, Input and IgG. We applied three different peak-callers to (i) raw (un-normalized) data using statistical background models and (ii) raw data with control samples as background and (iii) normalized data without additional control samples as background. The fraction of called regions containing the expected transcription factor binding motif was largest for the normalized data and evaluation with qPCR data for FOXA1 suggested higher sensitivity and specificity using normalized data over raw data with experimental background.The proposed method can handle several control samples allowing for correction of multiple sources of bias simultaneously. Our evaluation on both synthetic and experimental data suggests that the method is successful in removing background noise.High-throughput sequencing of chromatin immunoprecipitated DNA, or ChIP-seq [1], has replaced microarray-based techniques as the standard tool for investigating protein-DNA interactions in the cell. However, the data generated will always contain noise due to sequencing biases, PCR-artefacts, low complexity regions/mappability, chromatin structure or non-specific antibody interactions in the ChIP-step. The noise appears as a background distribution of reads, or tags, which must be taken into account in downstream analyses such as peak-calling.Experimental assessments of the background read distribution is favoured over purely theoretical and therefore not experimentally validated background models [2]. One such assessment is to sequence the sonicated sample prior to immunoprecipitation (IP). The resulting read distribution is commonly referred to as 'input'. Ideally this distribution would be uniform but Kharchenko et al [2]

Full-Text

comments powered by Disqus

Contact Us

service@oalib.com

QQ:3279437679

WhatsApp +8615387084133