File Name: circuitry and dynamics of human transcription factor regulatory networks .zip
With relatively low efficiency, differentiated cells can be reprogrammed to a pluripotent state by ectopic expression of a few transcription factors. An understanding of the mechanisms that underlie data emerging from such experiments can help design optimal strategies for creating pluripotent cells for patient-specific regenerative medicine. We have developed a computational model for the architecture of the epigenetic and genetic regulatory networks which describes transformations resulting from expression of reprogramming factors. Importantly, our studies identify the rare temporal pathways that result in induced pluripotent cells. Further experimental tests of predictions emerging from our model should lead to fundamental advances in our understanding of how cellular identity is maintained and transformed.
Genomic DNase I footprinting enables the quantitative, nucleotide-resolution delineation of sites of transcription factor occupancy within native chromatin 3 , 4 , 5 , 6. However, only a small fraction of such sites have been precisely resolved on the human genome sequence 6.
Here, to enable comprehensive mapping of transcription factor footprints, we produced high-density DNase I cleavage maps from human cell and tissue types and states and integrated these data to delineate about 4. We map the fine-scale structure within about 1. Cell-context-dependent cis -regulation is chiefly executed by wholesale modulation of accessibility at regulatory DNA rather than by differential transcription factor occupancy within accessible elements.
We also show that the enrichment of genetic variants associated with diseases or phenotypic traits in regulatory regions 1 , 7 is almost entirely attributable to variants within footprints, and that functional variants that affect transcription factor occupancy are nearly evenly partitioned between loss- and gain-of-function alleles.
Unexpectedly, we find increased density of human genetic variation within transcription factor footprints, revealing an unappreciated driver of cis -regulatory evolution. Our results provide a framework for both global and nucleotide-precision analyses of gene regulatory mechanisms and functional genetic variation. Genome-encoded recognition sites for sequence-specific DNA binding proteins are the atomic units of eukaryotic gene regulation.
Currently we lack a comprehensive, nucleotide-resolution annotation of such elements and their selective occupancy in different cell types and states. Such a reference is essential both for analysis of cell-selective regulation and for systematic integration of regulation with genetic variation associated with diseases and phenotypic traits.
The advent of DNA footprinting using the non-specific nuclease DNase I 8 marked a turning point in analyses of gene regulation, and facilitated the identification of the first mammalian sequence-specific DNA binding proteins 9.
DNase I footprints pinpoint regulatory factor occupancy on DNA and can be used to discriminate sites of direct versus indirect occupancy when integrated with chromatin immunoprecipitation and sequencing (ChIP-seq) experiments 4. Cognate transcription factors (TFs) can be assigned to footprints on the basis of matching consensus sequences, enabling the TF-focused analysis of gene regulation and regulatory networks 10 and of the evolution of regulatory factor binding patterns. These, in turn, reflect both the topology and the kinetics of coincidently bound proteins.
Here we combine sampling of more than 67 billion uniquely mapping DNase I cleavages from over human cell types and states to index human genomic footprints with unprecedented accuracy and resolution, and thereby to identify the sequence elements that encode TF recognition sites within the human genome. We leverage this index to (i) systematically assign footprints to TF archetypes; (ii) define patterns of cell-selective occupancy; and (iii) analyse the distribution and effect of human genetic variation on regulatory factor occupancy and the genetics of common diseases and traits.
Collectively, we uniquely mapped On average, To identify DNase I footprints genome-wide, we developed a computational approach that incorporates both chromatin architecture and exhaustively enumerated empirical DNase I sequence preferences to determine expected per-nucleotide cleavage rates across the genome, and to derive, for each biosample, a statistical model for testing whether its observed cleavage rates at individual nucleotides deviated significantly from expectation Extended Data Fig.
We note that the derivation of cleavage variability models for each biosample individually accounts for additional sources of technical variability beyond DNase I cleavage preference. Nucleotide protection tracked closely with both the presence of known TF recognition sequences and the level of per-nucleotide evolutionary conservation Extended Data Fig. Within each biosample, footprints encompassed an average of around 7. Comparative footprinting across cell types has the potential to illuminate both the structure and function of regulatory DNA, but a systematic approach for joint analysis of genomic footprinting data has been lacking.
Given the scale and diversity of the cell types and tissues surveyed, we sought to develop a framework that could integrate hundreds of available footprinting datasets to increase the precision and resolution of footprint detection and, furthermore, to provide a scaffold for a common reference index of TF-contacted DNA genome-wide.
To accomplish this, we implemented an empirical Bayes framework that estimates the posterior probability that a given nucleotide is footprinted by incorporating a prior on the presence of a footprint determined by footprints independently identified within individual datasets and a likelihood model of cleavage rates for both occupied and unoccupied sites Fig.
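The posterior calculation described above can be illustrated with a minimal sketch of Bayes' rule at a single nucleotide. This is not the authors' implementation: the function names, the pseudo-count prior, and the assumption that the two log-likelihoods are already computed by some cleavage-rate model are all hypothetical choices made for illustration.

```python
import math


def empirical_prior(calls):
    """Prior probability of a footprint at one nucleotide.

    `calls` holds 0/1 flags, one per dataset, marking whether the nucleotide
    was independently called as footprinted in that dataset. The +1/+2
    pseudo-count (an assumption of this sketch) keeps the prior inside (0, 1).
    """
    return (sum(calls) + 1) / (len(calls) + 2)


def footprint_posterior(prior, loglik_occupied, loglik_unoccupied):
    """Posterior probability that a nucleotide is footprinted.

    The two log-likelihoods are assumed to come from models of the observed
    cleavage rate under the occupied and unoccupied states (hypothetical
    inputs; the likelihood model itself is not shown here).
    """
    log_occ = math.log(prior) + loglik_occupied
    log_unocc = math.log1p(-prior) + loglik_unoccupied
    # Subtract the maximum before exponentiating to avoid underflow.
    m = max(log_occ, log_unocc)
    occ = math.exp(log_occ - m)
    unocc = math.exp(log_unocc - m)
    return occ / (occ + unocc)
```

For example, `footprint_posterior(empirical_prior([1, 1, 0, 0]), -1.0, -5.0)` combines a prior built from four datasets with likelihoods that favour the occupied state, so the posterior rises above the prior.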
Figure 1b depicts per-nucleotide footprint posterior probabilities computed for two DHSs within a representative locus RELB across all biosamples. A notable feature of these data is the positional stability and discrete appearance of footprints seen within each DHS across tens to hundreds of biosamples. Plotting individual nucleotides scaled by their footprint prevalence across all samples precisely resolves the core recognition sequences of diverse TFs Fig.
Top, windowed DNase I cleavage density. Below, per-nucleotide cleavage and footprint posterior probabilities within two DHSs. Below, DHS sequence scaled by footprint prevalence.
To establish a reference set of TF-occupied DNA elements genome-wide, we applied the Bayesian approach to all DHSs detected within one or more of the biosamples, and applied the same consensus approach used to establish a consensus DHS index 13 to collate overlapping footprinted regions across individual biosamples into distinct high-resolution consensus footprints Supplementary Methods.
Collectively, we delineated approximately 4. As expected, consensus (that is, empirical Bayes) footprints were markedly more reproducible than footprints detected using individual datasets average Jaccard similarity between replicate biosamples 0. Most consensus footprints Collectively, consensus footprints annotated 2.
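As a rough illustration of the reproducibility metric mentioned above, the following sketch computes a Jaccard similarity between two footprint sets at nucleotide resolution. The interval representation (chromosome, start, end with half-open coordinates) and the choice to compare covered nucleotides rather than whole footprints are assumptions of this sketch, not details taken from the text.

```python
def covered_positions(intervals):
    """Expand (chrom, start, end) half-open intervals into a set of positions."""
    positions = set()
    for chrom, start, end in intervals:
        positions.update((chrom, pos) for pos in range(start, end))
    return positions


def jaccard(footprints_a, footprints_b):
    """Jaccard similarity: shared footprinted nucleotides over all footprinted nucleotides."""
    a = covered_positions(footprints_a)
    b = covered_positions(footprints_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)
```

Expanding intervals into per-nucleotide sets is simple but memory-hungry; for genome-scale data a sorted interval sweep would be the more practical design.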
Given the strong dependency of footprint detection on sequencing depth Extended Data Fig. De novo footprint detection after iteratively subsampling the most deeply sequenced DNase I libraries more than million sequenced tags showed that footprints detected increased linearly with sequencing depth Extended Data Fig.
Because the consensus approach favours footprints with support from many biosamples, the consensus footprint space reported here is likely to represent a substantial proportion of TF binding sites that are shared across many cell and tissue types. Recognition sequences now exist for all major families and subfamilies of TFs, and for a large number of individual TF isoforms. We thus sought to create a reference mapping between annotated TFs and consensus footprints by (i) compiling and clustering all publicly available motif models 15 , 16 , 17 ; (ii) creating non-redundant TF archetypes by placing closely related TF family members on a common sequence axis Extended Data Fig.
In total, To gauge the sensitivity and accuracy of the motif-to-consensus footprint mappings, we evaluated the posterior footprint probability as a metric to classify motif occupancy by using the genomic master regulator CCCTC-binding factor (CTCF). Lower CTCF motif match scores were strongly associated with false-positive footprint or motif classifications, so the incorporation of motif match strength in addition to footprint probability is expected to increase classification precision Extended Data Fig.
Overall, footprinted motifs showed an approximately 2. Because TF engagement creates subtle alterations in DNA shape and protects underlying phosphate bonds from nuclease attack via steric hindrance 6 , we investigated to what extent fluctuations in corrected DNase I cleavage rates within individual consensus footprints accurately reflected the topology of the TF-DNA interface.
Notably, previous efforts to resolve such features 4 were obscured by subtle intrinsic cleavage preferences and lacked resolving power at individual TF footprints on the genome.
Poly-zinc fingers are the most prevalent class of human TFs and have recognition interfaces that potentially cover tens of nucleotides. Transposing the average corrected per-nucleotide cleavage propensity with an extended co-crystal structure of CTCF 21 accurately traced all features of the protein-DNA interaction interface, including focal hypersensitivity within the hinge region between zinc fingers 7 and 9 5 , 22 , 23 Fig.
Critically, these topological features were evident at the level of individual TF footprints on the genome Fig. As such, the extended profile of corrected per-nucleotide DNase I cleavage across entire regulatory regions should, in principle, provide a snapshot of the primary structure of active regulatory DNA. Below, aggregate summed nuclease cleavage relative to footprinted motifs. Right, nuclease cleavage observed and expected at three footprints randomly selected across genome.
TFs compete cooperatively with nucleosomes for access to regulatory DNA 25 , However, it is unclear how steady-state chromatin accessibility is maintained by TFs in place of a canonical nucleosome, and whether this results primarily from local protein-protein interactions or the synergistic effects of independent TF-DNA binding. We reasoned that the number, relative spacing, and morphology of TF binding events within individual regulatory elements could be used to gain insight into the mechanistic basis of TF cooperativity.
Conversely, independent TF-DNA interaction events should yield compact and widely spaced footprints that contain single TF recognition sites. As such, the prevalence of cooperativity mediated by direct TF-TF interactions rather than by synergy of independent binding events should be reflected in the relative proportion of wide, multi-motif footprints compared to that of well-spaced single footprints.
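The comparison proposed above (wide multi-motif footprints versus compact single-motif footprints) can be sketched as a simple tally of how many recognition sequences overlap each footprint. This is an illustrative sketch only: the function name, the half-open (start, end) interval representation on a single chromosome, and the use of plain overlap as the matching criterion are assumptions, not the procedure used in the study.

```python
from collections import Counter


def motif_count_proportions(footprints, motifs):
    """Proportion of footprints overlapped by 0, 1, or 2+ motif matches.

    Both inputs are lists of half-open (start, end) intervals on the same
    chromosome (a simplifying assumption of this sketch).
    """
    counts = Counter()
    for f_start, f_end in footprints:
        # Two half-open intervals overlap when each starts before the other ends.
        n = sum(1 for m_start, m_end in motifs if m_start < f_end and f_start < m_end)
        counts["2+" if n >= 2 else str(n)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: counts[label] / total for label in ("0", "1", "2+")}
```

The nested loop is quadratic in the number of intervals, which is fine for a single regulatory element but would need an interval index for genome-wide use.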
Larger footprints are overwhelmingly associated with two or more recognition sequences Fig. Right, proportion of footprints uniquely overlapped by 0, 1 or 2 or more recognition sequences.
Solid lines and shaded regions indicate median and middle 50th percentile, respectively. To quantify global footprint spacing patterns, we first binned each DHS by its average accessibility across all biosamples as footprint discovery depends on total DNase I cleavage; Extended Data Fig.
The density of footprints within the most deeply sampled DHSs genome-wide plateaued at an average of 5. Together, these results are compatible with the observed lack of evolutionary constraint on the spacing and orientation 29 , 30 , 31 , 32 , 33 of TF motifs and strongly suggest that steady-state regulatory DNA accessibility is maintained chiefly by independent but synergistic TF binding modes Fig.
Footprint occupancy across all biosamples showed marked enrichment for the recognition sequences of key regulatory TFs in their cognate lineages Extended Data Fig.
In total, we identified motif models that matched footprinted sequences Supplementary Methods ; these models encompassed 64 distinct archetypal TF recognition codes Supplementary Table 2 , representing virtually all major DNA-binding domain families. For degenerate motifs where the same sequence is recognized by many distinct TFs, we observed highly cell-selective occupancy patterns that could be decomposed into coherent groups that corresponded to cell type and function Extended Data Fig.
Given that most DHSs are shared across at least two cell types or states 13 , 34 , we queried how the pattern of footprints within a DHS and hence its topology differed with cellular context.
Although differential TF occupancy can be discerned upon manual inspection 4 , systematic analysis has not been possible owing to the dominance of intrinsic DNase I cleavage propensities. To enable unbiased detection of differential footprint occupancy, we developed a statistical framework to test for differences in relative cleavage rates at individual nucleotides across many samples, analogous to methods developed for the identification of differentially expressed genes Supplementary Methods.
To estimate the proportion of differentially regulated footprints within DHSs of a given cell or tissue, we focused on the neural lineage, for which many biosamples were available. We selected 67, DHSs that were highly accessible in at least 10 nervous- and non-nervous-derived samples, and for each DHS, performed a per-nucleotide differential test Extended Data Figs. Most of these DHSs contained a single differentially regulated footprint, whereas a small fraction contained 2-4 differentially occupied elements Extended Data Fig.
Collectively, the above results indicate that the vast majority of regulatory DNA regions marked by DHSs encode a single structural topology that reflects a fixed pattern of footprint occupancy. Nonetheless, at a small minority of elements, DHSs provide a scaffold for cell-context-specific TF occupancy that is typically confined to one or a small number of footprinted elements.
Identifying genetic variants that are likely to affect regulatory function has remained challenging. Deep sequence coverage at DHSs enables de novo genotyping of regulatory variants and simultaneous characterization of their functional effect on local chromatin architecture by quantifying and comparing cleavage for each allele 2 , 4.
The biosamples we analysed were derived from individuals, and de novo genotyping (Supplementary Methods) revealed 3. Across individuals, we conservatively identified , chromatin-altering variants (CAVs) that altered DNA accessibility on individual alleles median 2. Within DHSs, CAVs were markedly enriched in core consensus footprints, even after controlling for the increased detection power (that is, sequencing depth) within this compartment Fig.
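One common way to ask whether a heterozygous variant alters accessibility on one allele is to test whether cleavage reads are split evenly between the two alleles. The sketch below uses an exact two-sided binomial test for that purpose. It is an assumption of this example, not the test described in the source, whose statistical model is not given here; the function name and the null ratio of 0.5 are likewise illustrative.

```python
import math


def allelic_imbalance_p(ref_reads, alt_reads, null_ratio=0.5):
    """Exact two-sided binomial p-value for allelic imbalance.

    Sums the probabilities of all outcomes that are no more likely than the
    observed split under the null hypothesis of a `null_ratio` reference
    fraction. Probabilities are computed in log space via lgamma so that
    large read counts do not overflow.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0
    log_p = math.log(null_ratio)
    log_q = math.log1p(-null_ratio)

    def pmf(k):
        log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        return math.exp(log_choose + k * log_p + (n - k) * log_q)

    observed = pmf(ref_reads)
    # Small relative tolerance so ties are counted despite rounding error.
    total = sum(p for p in (pmf(k) for k in range(n + 1)) if p <= observed * (1 + 1e-9))
    return min(1.0, total)
```

A plain binomial model ignores overdispersion and reference-mapping bias, both of which matter in real allelic analyses, so this should be read as a starting point rather than a calibrated test.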
Top, allelically resolved per-nucleotide DNase cleavage aggregated from 56 heterozygotes. Middle, DNase cleavage in two samples homozygous for reference or alternative alleles.
Colour indicates statistical significance (-log10 P) of per-nucleotide differential test (Methods). Variant and differentially footprinted nucleotides precisely colocalize at the NFIX element. Grey line, fitted linear model. Shown is median log-odds score (reference versus alternate allele) of all tested variants within footprinted motifs binned by allelic ratio. Error bars show 5th and 95th percentiles of log-odds motif scores in each bin. In protein-coding regions, most functional genetic variation is expected to be deleterious, with rare gain-of-function alleles. Protein-DNA recognition interfaces are likewise presumed to be susceptible to disruption at critical nucleotides, predisposing to loss-of-function alleles. Notably, we found that CAVs were nearly evenly partitioned between loss-of-function (disruption of binding) and gain-of-function (increased or de novo binding) alleles Fig.
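The log-odds motif score for reference versus alternate alleles can be sketched with a position weight matrix (PWM). Everything concrete here is hypothetical: the toy three-column PWM, the uniform background, the probability floor, and the function names are chosen for illustration and do not come from the study.

```python
import math

# Uniform background base frequencies (an assumption of this sketch).
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_odds_score(seq, pwm, background=BACKGROUND, floor=1e-3):
    """Sum of per-position log2(probability / background) for `seq`.

    `pwm` is a list of dicts mapping base to probability, one per motif
    position. Probabilities below `floor` are raised to it so that the
    logarithm stays finite.
    """
    if len(seq) != len(pwm):
        raise ValueError("sequence and PWM must have the same length")
    score = 0.0
    for base, column in zip(seq, pwm):
        p = max(column.get(base, 0.0), floor)
        score += math.log2(p / background[base])
    return score


def allele_delta(ref_seq, alt_seq, pwm):
    """Reference minus alternate score; positive means the alternate allele weakens the match."""
    return log_odds_score(ref_seq, pwm) - log_odds_score(alt_seq, pwm)


# Hypothetical three-position motif used only to exercise the functions.
TOY_PWM = [
    {"A": 0.97, "C": 0.01, "G": 0.01, "T": 0.01},
    {"A": 0.10, "C": 0.70, "G": 0.10, "T": 0.10},
    {"A": 0.01, "C": 0.01, "G": 0.97, "T": 0.01},
]
```

With this toy matrix, `allele_delta("ACG", "ATG", TOY_PWM)` is positive because the alternate base T is less favoured than C at the second position, which under this sign convention corresponds to a loss-of-function direction.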
Homozygosity for either the reference or alternative allele paralleled results from heterozygotes and further revealed that structural changes due to TF occupancy were precisely confined to the DNA sequence recognition interface Fig. In many cases, SNVs that were detected in both heterozygous and homozygous configurations showed strong agreement between allelic ratios and relative footprint strength Fig.
Regulation by competing: A hidden layer of gene regulatory networks. Abstract: Quantitative understanding of biological regulation is essential for studying natural biosystems and for constructing synthetic systems. Current studies of gene regulation build models under the assumption that each gene regulator acts on a single target or only a few targets.
Understanding how gene expression programs are controlled requires identifying regulatory relationships between transcription factors and target genes. Gene regulatory networks are typically constructed from gene expression data acquired following genetic perturbation or environmental stimulus. Single-cell RNA sequencing (scRNAseq) captures the gene expression state of thousands of individual cells in a single experiment, offering advantages in combinatorial experimental design, large numbers of independent measurements, and accessing the interaction between the cell cycle and environmental responses that is hidden by population-level analysis of gene expression. To leverage these advantages, we developed a method for scRNAseq in budding yeast (Saccharomyces cerevisiae). We pooled diverse transcriptionally barcoded gene deletion mutants in 11 different environmental conditions and determined their expression state by sequencing 38, individual cells. We benchmarked a framework for learning gene regulatory networks from scRNAseq data that incorporates multitask learning and constructed a global gene regulatory network comprising 12, interactions.
Plant responses to environmental and intrinsic signals are tightly controlled by multiple transcription factors (TFs). These TFs and their regulatory connections form gene regulatory networks (GRNs), which provide a blueprint of the transcriptional regulation underlying plant development and environmental responses. This review provides examples of experimental methodologies commonly used to identify regulatory interactions and generate GRNs. Additionally, this review describes network inference techniques that leverage gene expression data to predict regulatory interactions. These computational and experimental methodologies yield complex networks that can identify new regulatory interactions, driving novel hypotheses. Biological properties that contribute to the complexity of GRNs are also described in this review.
Becker turned around as if in a dream. "Senor Becker?" a chilling voice said. Becker stared, spellbound, at the man entering the restroom. He seemed vaguely familiar. "Soy Hulohot," the killer said.
His suit looked as though he had slept in it. Strathmore sat at a modern desk with two keyboards and a monitor set into a recess at the side. The desk was strewn with computer printouts and looked somehow out of place in the curtained room. "Rough week?" she asked. "No rougher than usual."
"If we call for help, Crypto will turn into a circus." "So what do you propose?" Susan asked. She wanted only one thing: to get out as quickly as possible. Strathmore thought for a minute.