Genomic methods for measuring DNA replication timing

 

We develop new experimental and computational approaches for measuring replication timing across the entire genome. The common principle underlying these methods is that when DNA is replicated, its copy number doubles; the earlier a genomic region replicates, the higher its average copy number in a population of replicating cells. In one method, replicating S phase cells are FACS-sorted, their DNA is extracted, and the entire genome is sequenced. Fluctuations in DNA copy number along chromosomes are then used to infer the cells' replication timing program. This method is highly accurate and flexible, but it is practical for only a handful of samples at a time. In a complementary, predominantly computational method, we analyze DNA copy number in whole-genome sequence data that were derived (knowingly or not) from cell cultures or tissues that were proliferating at the time of DNA extraction. Copy number fluctuations in these samples reflect DNA replication timing, which enables studying replication timing in hundreds of samples. We are applying these methods to study replication timing in different people, cell types, and species, both in our lab and through various collaborations. We are also developing new sequencing-based methods for high-throughput analysis of DNA replication timing in single cells.
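The copy-number principle described above can be illustrated with a minimal sketch. It assumes read counts have already been tallied in matching genomic windows for an S phase sample and for a non-replicating (G1) reference; the G1 reference, the function name, and the example counts are assumptions made for illustration and are not taken from the published pipelines.

```python
from statistics import mean, pstdev


def replication_timing_profile(s_counts, g1_counts):
    """Return z-scored S/G1 read-depth ratios, one value per window.

    Higher values indicate higher relative copy number in S phase,
    which is interpreted here as earlier replication.
    """
    if len(s_counts) != len(g1_counts):
        raise ValueError("both samples must cover the same windows")
    if any(g <= 0 for g in g1_counts):
        raise ValueError("every window needs reads in the G1 reference")

    # Divide by total depth so that sequencing depth cancels out.
    s_total = sum(s_counts)
    g1_total = sum(g1_counts)
    ratios = [(s / s_total) / (g / g1_total)
              for s, g in zip(s_counts, g1_counts)]

    # Standardize to mean 0 and standard deviation 1.
    mu = mean(ratios)
    sd = pstdev(ratios)
    if sd == 0:
        raise ValueError("no copy number variation between windows")
    return [(r - mu) / sd for r in ratios]


# Illustrative counts only: three windows, from early to late replicating.
example_profile = replication_timing_profile([200, 150, 100], [100, 100, 100])
```

In this toy example the first window has the highest relative copy number and therefore the highest (earliest) score. A real analysis would also need to handle mappability, GC content, and smoothing, none of which is attempted here.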

Massey et al., BioRxiv 2021

Koren et al., Bioinformatics 2021

Massey et al., Genes 2019

Hulke et al., Chromosome Research 2019

Siefert et al., Genome Research 2017

Koren et al., Cell 2014

Koren et al., American Journal of Human Genetics 2012

Koren et al., Genome Research 2010


The genetic basis of DNA replication timing

 

The ability to measure DNA replication timing on a population scale enables a systematic investigation of its genetic basis. By comparing hundreds of people, we found that replication timing is highly variable, with numerous genomic regions being replicated at different times in different people. Furthermore, we can compare these people's replication profiles to their genotypes, which reveals polymorphic sequences (SNPs) that influence when a genomic region replicates. We call these cis-acting “replication timing quantitative trait loci” (cis-rtQTLs). rtQTL mapping links DNA replication timing to other molecular and phenotypic traits. For instance, we showed that an rtQTL near the JAK2 gene links DNA replication timing with susceptibility to blood neoplasms and leukemia. Analysis of the epigenetic features at rtQTL sites uncovers a highly combinatorial mechanism of replication timing control, involving combinations of histone modifications and transcription factors that affect DNA replication both positively and negatively. A major focus of our lab is to fine-map many more rtQTLs in several cell types, and to use them to study the molecular causes and the cellular and phenotypic consequences of human DNA replication timing. In addition, by using samples from patients with genetic diseases and experimentally induced gene knockouts, we are studying the genes and factors that operate in trans to determine the replication timing program.
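The comparison of replication profiles with genotypes can be sketched as a simple association test. This minimal example assumes genotypes coded as allele dosages (0, 1, or 2 copies of the alternative allele) and a standardized replication timing value per person at one genomic region; it fits an ordinary least-squares line by hand. The function name and the data are hypothetical, and this is not the lab's actual mapping procedure.

```python
def rtqtl_association(dosages, timing):
    """Regress replication timing on allele dosage for one SNP and region.

    Returns (slope, r_squared): the timing change per allele copy and the
    fraction of timing variance explained by the genotype.
    """
    n = len(dosages)
    if n != len(timing) or n < 3:
        raise ValueError("need matched values for at least three people")

    mean_x = sum(dosages) / n
    mean_y = sum(timing) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in timing)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, timing))
    if sxx == 0 or syy == 0:
        raise ValueError("genotype and timing must both vary")

    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, r_squared


# Hypothetical data for six people: timing rises with each allele copy.
example_dosages = [0, 0, 1, 1, 2, 2]
example_timing = [-0.5, -0.3, 0.0, 0.2, 0.5, 0.7]
example_slope, example_r2 = rtqtl_association(example_dosages, example_timing)
```

A positive slope here means that each additional copy of the allele is associated with earlier replication of the region. Genome-wide mapping would additionally require covariates, significance testing, and correction for multiple comparisons.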

 

Caballero et al., BioRxiv 2021

Ding et al., Nature Communications 2021

Ding and Koren, Trends in Genetics 2020

Koren et al., Cell 2014

DNA replication timing shapes the mutational landscape of the genome

 

One of the most important consequences of replication timing concerns genome stability, in particular mutation rates and patterns. We and others have shown that late-replicating genomic regions have a 4- to 6-fold higher mutation rate than early-replicating regions. By analyzing large-scale sequencing data from family trios, we found a higher rate of de novo germline mutations in late-replicating genomic regions. However, this was seen only in the offspring of younger fathers: the association of mutations with replication timing decreases with paternal age. As a result, mutations from older fathers are less biased toward late-replicating genomic regions and are therefore more likely to affect genes and phenotype.
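A fold difference in mutation rate of the kind described above can be computed as follows. This is a minimal sketch that assumes each region carries a standardized replication timing score (positive for early, negative for late), a mutation count, and a length in base pairs. The split at zero, the function name, and the numbers are illustrative assumptions, not measured data.

```python
def late_vs_early_fold_change(regions):
    """Return (late mutations per bp) / (early mutations per bp).

    regions: iterable of (timing_score, mutation_count, length_bp), where a
    positive timing_score marks an early-replicating region.
    """
    early_mut = early_bp = late_mut = late_bp = 0
    for timing_score, mutation_count, length_bp in regions:
        if timing_score > 0:
            early_mut += mutation_count
            early_bp += length_bp
        elif timing_score < 0:
            late_mut += mutation_count
            late_bp += length_bp
    if early_bp == 0 or late_bp == 0 or early_mut == 0:
        raise ValueError("need early and late regions, with early mutations")
    return (late_mut / late_bp) / (early_mut / early_bp)


# Made-up counts for four 1 Mb regions, two early and two late.
example_regions = [
    (1.2, 10, 1_000_000),
    (0.8, 14, 1_000_000),
    (-0.9, 50, 1_000_000),
    (-1.4, 70, 1_000_000),
]
example_fold = late_vs_early_fold_change(example_regions)
```

With these made-up counts the late regions carry 120 mutations and the early regions 24 over the same total length, a 5-fold difference.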

Replication timing is also a major factor affecting the distribution of mutations in cancer genomes. Before this was realized, many genes were thought to be cancer drivers because they were found to be recurrently mutated in cancer genomes; many of these genes turned out to be late-replicating genes that simply have a high background mutation rate. In addition, different cancer types have different mutational profiles, which can be traced to differences in replication timing and other epigenetic variables in their cell type of origin. Furthermore, the distribution of replication origins in the genome has a major influence on mutational strand asymmetry in numerous tumors; in turn, patterns of mutational asymmetry can be used to illuminate mechanisms of DNA repair and mutagenesis during S phase in cancer cells.
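The point about background mutation rates can be made concrete with a small calculation. The sketch below assumes that mutation counts in a gene follow a Poisson distribution, and compares the same observed count against two expectations: one from a uniform genome-wide rate and one from a higher local rate typical of a late-replicating region. The Poisson model, the function name, and all numbers are assumptions chosen for illustration; it is not a description of any particular driver-detection tool.

```python
import math


def poisson_tail(observed, expected):
    """Return P(X >= observed) for X ~ Poisson(expected)."""
    below = sum(math.exp(-expected) * expected ** k / math.factorial(k)
                for k in range(observed))
    return max(0.0, 1.0 - below)


# Hypothetical gene with 12 mutations observed across a tumor cohort.
observed_mutations = 12

# Expected count if the gene mutated at a uniform genome-wide rate.
p_uniform_background = poisson_tail(observed_mutations, 4.0)

# Expected count if the gene is late replicating with a 3-fold higher rate.
p_local_background = poisson_tail(observed_mutations, 12.0)
```

Under the uniform expectation the gene looks strongly enriched for mutations, whereas under the local, replication-timing-aware expectation the same count is unremarkable.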


The evolution of DNA replication timing

 

Some of the most interesting questions concern how the DNA replication timing program has evolved, and how replication timing could in turn influence sequence evolution by modulating the mutational landscape. To address them, we are comparing replication timing among humans, chimpanzees, and rhesus macaques. We find hundreds of regions in which replication timing differs among species, including evidence for the birth and death of replication origins during human evolution. We are mapping and analyzing cis-rtQTLs and sequence evolution at these regions in order to identify the genetic basis of replication timing evolution and study its consequences.
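One way to think about the three-species comparison is to use chimpanzee and macaque together to decide whether a difference arose on the human lineage. The sketch below assumes standardized replication timing profiles (higher means earlier) over the same orthologous windows in all three species, and flags windows where human differs from both other species while those two agree with each other. The thresholds, the function name, and the values are hypothetical and are not the criteria used in our analyses.

```python
def human_specific_changes(human, chimp, macaque,
                           min_shift=1.0, max_outgroup_gap=0.5):
    """Return (advances, delays): window indices where human replicates
    earlier or later than both chimpanzee and macaque."""
    advances, delays = [], []
    for i, (h, c, m) in enumerate(zip(human, chimp, macaque)):
        # Skip windows where the two non-human species disagree, because the
        # change cannot then be assigned to the human lineage.
        if abs(c - m) > max_outgroup_gap:
            continue
        if h - c >= min_shift and h - m >= min_shift:
            advances.append(i)
        elif c - h >= min_shift and m - h >= min_shift:
            delays.append(i)
    return advances, delays


# Hypothetical profiles over four orthologous windows.
example_human = [0.1, 1.5, -1.2, 0.9]
example_chimp = [0.0, 0.2, 0.1, -0.5]
example_macaque = [0.2, 0.1, 0.3, 0.8]
example_advances, example_delays = human_specific_changes(
    example_human, example_chimp, example_macaque)
```

In this toy example window 1 is a human advance and window 2 a human delay; window 3 is skipped because chimpanzee and macaque disagree with each other.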


Our research is funded by: NSF, FPWR