Multimodal Information for Early Cancer Detection by using cfDNA TAPS Technology | Literature Review

About the corresponding author:

Dr Chunxiao Song received his PhD from the University of Chicago in 2013. From 2013 to 2016, he carried out postdoctoral research at Stanford University in the United States. Since 2016, he has been a research team leader at the Ludwig Institute for Cancer Research, University of Oxford, UK.

In 2019, Dr Chunxiao Song and his research partner Benjamin Schuster-Boeckler published an article in Nature Biotechnology reporting a new DNA methylation sequencing method, TAPS. From April 2019 to September 2021, they published six high-level articles on TAPS, covering its advantages in long-read sequencing, the detection of different types of epigenetic modifications, comparisons with bisulfite sequencing (BS) for methylation and hydroxymethylation, and more.

Today's article mainly demonstrates the feasibility of the TAPS method by constructing methylation-based models for liver and pancreatic cancer.

Literature Background

① Current cancer research is providing new treatment methods, but early detection still offers the best chance of successful treatment. Early treatment greatly improves patient survival and greatly reduces costs. However, existing detection methods cannot meet this demand. Circulating cell-free DNA (cfDNA), the DNA found free in plasma and released by dying cells from a variety of healthy and diseased tissues, holds great potential for the development of early cancer detection methods.

② Genetic information in cfDNA, such as mutations and copy number variations (CNVs), has potential utility in monitoring cancer progression and treatment. However, detecting genetic variants is challenging because the proportion of tumor-derived cfDNA (ctDNA) is low in early-stage disease. Furthermore, genetic mutations carry too little tissue-of-origin information to locate the malignancy. In contrast, broad epigenetic changes, such as DNA methylation changes in cancer cells and the tumor microenvironment, occur early in tumorigenesis. Recent studies have shown that cfDNA methylation is one of the most promising biomarkers for early-stage cancer: it provides thousands of methylation changes that can be combined to overcome detection limits, together with tissue-of-origin information that allows the cancer to be localized with high confidence.

③ DNA methylation is best measured with whole-genome, base-resolution, quantitative sequencing methods such as bisulfite sequencing. However, bisulfite sequencing severely damages DNA, yields a low mapping rate, and is expensive. Current cfDNA methylation sequencing is therefore limited to low-depth, targeted, or low-resolution and qualitative enrichment-based approaches, so the cfDNA methylome cannot be fully captured.

Technical Background

01 Bisulfite seq:

Advantages: The gold standard for methylation sample pretreatment, with a conversion rate as high as 99.5%. The technology is widely used to draw methylation maps (growth and development, tumors, complex diseases, etc.).

Disadvantages: Bisulfite treatment damages the sample, cleaving DNA into fragments smaller than 1.5 kb and causing the loss of some regions. Converting C to T also reduces sequence complexity, resulting in lower sequencing quality and a lower unique mapping rate (Figure 1).

02 EM-seq:

Advantages: Results are consistent with the gold standard, with a conversion rate of 98-99%. The conversion is enzymatic, so the reaction is mild and does not damage the fragments; data quality is high and the original sample is preserved.

Disadvantages: Converting C to T reduces sequence complexity. The conversion takes two steps, and its efficiency depends heavily on how the experimenter performs them.

03 TAPS (developed by Dr Chunxiao Song's team):

Advantages: Results are consistent with the gold standard. Only 5mC (about 4% of cytosines) is converted to T, so sequence complexity stays high. The reaction is mild and does not damage the fragments, which can reach 10 kb. Data quality and the unique mapping rate are high, sequencing cost is reduced, and the original sample is preserved, so methylation can be detected together with other variants.
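To make the difference in conversion chemistry concrete, the following minimal Python sketch simulates what a sequencer would read after each treatment. The sequence, the methylated positions, and the function names are hypothetical, and ideal 100% conversion is assumed for both methods.

```python
def bisulfite_read(seq, methylated):
    """Bisulfite / EM-seq style readout: unmodified C is read as T, 5mC stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def taps_read(seq, methylated):
    """TAPS style readout: only 5mC is read as T, unmodified C stays C."""
    return "".join(
        "T" if base == "C" and i in methylated else base
        for i, base in enumerate(seq)
    )


def changed_bases(original, converted):
    """Count positions where the read differs from the original sequence."""
    return sum(a != b for a, b in zip(original, converted))


# Hypothetical fragment; 0-based positions 2 and 12 are assumed to carry 5mC.
seq = "ACCGTTCAGCCACGTA"
methylated = {2, 12}

bs = bisulfite_read(seq, methylated)
taps = taps_read(seq, methylated)
print("bisulfite:", bs, "changed bases:", changed_bases(seq, bs))
print("TAPS:     ", taps, "changed bases:", changed_bases(seq, taps))
```

In this toy fragment the bisulfite-style readout changes four of the six cytosines, while the TAPS-style readout changes only the two methylated ones, which is why more of the original sequence complexity survives for mapping.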

Experimental design:

In this paper, cfDNA TAPS (cfTAPS) was optimized to provide a high-quality, high-depth genome-wide methylome from only 10 ng of cfDNA. A total of 87 samples were processed; 2 samples with a conversion efficiency below 85% were excluded from subsequent analysis. The average 5mC conversion rate of the remaining 85 samples (HCC: 21, PDAC: 23, control: 30, pancreatitis: 7, liver cirrhosis: 4) was 97.0%. Here HCC stands for hepatocellular carcinoma and PDAC for pancreatic ductal adenocarcinoma. The authors demonstrate that the rich information from cfTAPS enables the integration of multimodal epigenetic and genetic analyses (differential methylation, tissue of origin, and fragmentation profiles) to accurately distinguish cfDNA samples of patients with HCC and PDAC from those of controls and of patients with precancerous inflammatory conditions.


Adaptation for cfTAPS sequencing

Figure 2: Panels A and B show the experimental workflow and the conversion principle. Panel C shows the proportions of total reads, uniquely mapped reads, and deduplicated uniquely mapped reads in the 87 cfDNA TAPS libraries. Panel D shows the 5mC conversion rate and the false-positive rate in the 85 cfDNA TAPS libraries, measured at known positions carrying modified or unmodified cytosines. The false-positive rate (the conversion rate of unmodified C), estimated from the unmodified amplicon, was 0.28%, confirming that cfTAPS detects 5mC in cfDNA with high sensitivity and specificity.
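The two quality metrics in panel D can both be computed as the fraction of reads showing T at control positions whose modification status is known. The sketch below is a minimal illustration of that arithmetic; the read counts and names are hypothetical and not taken from the paper, and the 85% cut-off is the exclusion threshold mentioned in the experimental design.

```python
def fraction_read_as_t(counts):
    """counts: list of (t_reads, c_reads) tuples, one per control position."""
    t_total = sum(t for t, _ in counts)
    all_reads = sum(t + c for t, c in counts)
    return t_total / all_reads


# Hypothetical read counts (T reads, C reads) at control positions.
methylated_sites = [(970, 30), (485, 15)]   # positions known to carry 5mC
unmodified_sites = [(3, 997), (1, 999)]     # positions known to be unmodified C

conversion = fraction_read_as_t(methylated_sites)      # 5mC conversion rate
false_positive = fraction_read_as_t(unmodified_sites)  # unmodified C read as T

MIN_CONVERSION = 0.85  # libraries below this conversion rate are excluded
keep_library = conversion >= MIN_CONVERSION

print(f"5mC conversion rate: {conversion:.3f}")
print(f"false-positive rate: {false_positive:.4f}")
print(f"keep library: {keep_library}")
```

A high conversion rate corresponds to sensitivity for 5mC, while a low false-positive rate corresponds to specificity, so the two numbers together summarize library quality.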

Genome-wide DNA methylation from cfTAPS

Samples: 52% of PDAC and 67% of HCC patients were in stage I or II. Genome-wide cfDNA methylation was compared between cancer and control samples. Mean CpG methylation levels in control samples were similar to those in cancer cfDNA, and only a few liver cancer samples were globally hypomethylated. Principal component analysis (PCA) partially separated HCC and PDAC from controls. The 200 sites most strongly correlated with principal component 2 in HCC were annotated and found to be enriched in enhancers, while the 200 sites most strongly correlated with principal component 1 in PDAC were enriched in promoters.

Differential DNA methylation from cfTAPS

ROC curves were used to assess the classification performance of a model based on differentially methylated enhancers (HCC vs. controls) and of a model based on differentially methylated promoters (PDAC vs. controls). Cancer prediction scores were obtained by leave-one-out (LOO) cross-validation. The models also distinguished 3 of 4 cirrhosis cases and 6 of 7 pancreatitis cases from cancer.
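The leave-one-out scoring and ROC analysis described above can be sketched as follows. This is a minimal illustration under stated assumptions: the single-feature nearest-centroid rule is a stand-in for the paper's classifier, and the methylation values, labels, and function names are hypothetical.

```python
def loo_scores(features, labels):
    """Leave-one-out scores from a nearest-centroid rule (a stand-in model).

    For each sample, class means are computed without that sample. The score
    is how much closer the sample lies to the cancer mean than to the
    control mean (higher means more cancer-like).
    """
    scores = []
    for i, x in enumerate(features):
        train = [(f, l) for j, (f, l) in enumerate(zip(features, labels)) if j != i]
        cancer = [f for f, l in train if l == 1]
        control = [f for f, l in train if l == 0]
        mean_cancer = sum(cancer) / len(cancer)
        mean_control = sum(control) / len(control)
        scores.append(abs(x - mean_control) - abs(x - mean_cancer))
    return scores


def roc_auc(scores, labels):
    """AUC as the probability that a cancer sample outscores a control."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical mean methylation of one differentially methylated region.
features = [0.82, 0.79, 0.85, 0.80, 0.61, 0.66, 0.58, 0.78]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = control, 1 = cancer

scores = loo_scores(features, labels)
print("LOO AUC:", roc_auc(scores, labels))
```

Because each score is computed by a model that never saw the sample being scored, the resulting AUC is a less optimistic estimate than one computed on the training data, which matters with cohorts of this size.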

Tissue-of-origin model from cfTAPS

The authors collated CpG-level methylation data from 144 publicly available tissue and blood cell whole-genome bisulfite sequencing (WGBS) datasets and grouped them into 32 physiologically distinct tissue and blood cell types, including liver tumor tissue. Because tissue-specific DNA methylation is widespread in enhancer regions, they built a tissue methylation reference map from enhancer clusters and estimated the tissue contributions in each cfTAPS sample by non-negative least squares (NNLS) regression. They observed a significantly increased contribution from liver tumor in HCC samples only, and from memory T cells in PDAC samples. cfTAPS thus provides valuable tissue-of-origin information for early cancer detection.
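The deconvolution idea can be illustrated with a small example: a cfDNA methylation profile is modeled as a non-negative mixture of reference tissue profiles. The sketch below is not the authors' implementation. It solves the non-negative least squares problem with plain projected gradient descent (swapped in for a dedicated NNLS solver to keep the example dependency-free), and the reference values, tissue names, and cfDNA profile are all hypothetical.

```python
def nnls_projected_gradient(A, b, steps=2000):
    """Minimise ||A w - b||^2 subject to w >= 0 by projected gradient descent."""
    n_rows, n_cols = len(A), len(A[0])
    # A step of 1 / ||A||_F^2 is small enough for the iteration to converge.
    lr = 1.0 / sum(a * a for row in A for a in row)
    w = [1.0 / n_cols] * n_cols
    for _ in range(steps):
        residual = [sum(A[i][j] * w[j] for j in range(n_cols)) - b[i]
                    for i in range(n_rows)]
        grad = [sum(A[i][j] * residual[i] for i in range(n_rows))
                for j in range(n_cols)]
        # Gradient step, then projection onto the non-negative orthant.
        w = [max(0.0, w[j] - lr * grad[j]) for j in range(n_cols)]
    return w


# Hypothetical reference map: rows are enhancer regions, columns are tissues.
tissues = ["blood cells", "liver", "liver tumor"]
reference = [
    [0.9, 0.1, 0.1],
    [0.1, 0.9, 0.2],
    [0.1, 0.2, 0.9],
    [0.8, 0.8, 0.1],
]
# Hypothetical cfDNA methylation levels in the same four regions.
cfdna = [0.74, 0.225, 0.155, 0.765]

weights = nnls_projected_gradient(reference, cfdna)
fractions = [w / sum(weights) for w in weights]  # normalise to proportions
for name, frac in zip(tissues, fractions):
    print(f"{name}: {frac:.3f}")
```

The non-negativity constraint is what makes the weights interpretable as tissue contributions, since a tissue cannot contribute a negative share of the cfDNA.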

cfTAPS Fragmentation Model

The cfDNA fragmentation pattern detected with cfTAPS was consistent with that obtained by whole-genome sequencing (WGS): the main peak was at 167 bp, a secondary peak was at 320 bp, and the smaller peaks below 167 bp showed a 10 bp periodicity, reflecting nucleosome-associated fragmentation. Compared with controls, cancer patients had a higher proportion of fragments shorter than 150 bp and a lower proportion of 310-500 bp fragments. This further confirms that cfTAPS preserves cfDNA fragment information.

The authors developed a new method to characterize the cfDNA fragmentation profile using cfTAPS. However, fragment-based models separated cirrhosis and pancreatitis from cancer less well than methylation-based models, indicating that fragmentation carries less cancer-specific information.
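The two fragment-size proportions mentioned above are simple to compute once fragment lengths are available. The following sketch assumes hypothetical fragment lengths and a hypothetical helper name; real samples contain millions of fragments rather than ten.

```python
def fraction_in_range(lengths, low, high):
    """Share of fragments with low <= length <= high (lengths in bp)."""
    return sum(low <= n <= high for n in lengths) / len(lengths)


# Hypothetical fragment lengths in bp, for illustration only.
control = [167, 166, 170, 320, 165, 168, 330, 172, 145, 167]
cancer = [167, 140, 132, 166, 148, 320, 160, 143, 167, 125]

for name, lengths in (("control", control), ("cancer", cancer)):
    short = fraction_in_range(lengths, 0, 149)      # fragments below 150 bp
    long_ = fraction_in_range(lengths, 310, 500)    # fragments of 310-500 bp
    print(f"{name}: short={short:.2f}, long={long_:.2f}")
```

Proportions like these can then serve as input features for a fragment-based classifier.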

Multiple Cancer Detection Using cfTAPS

The top five differentially methylated regions (DMRs) from each pairwise comparison (non-cancer control vs HCC, non-cancer control vs PDAC, HCC vs PDAC) were selected as features for the multi-cancer differential methylation model. A support vector machine (SVM) model was trained to estimate the probability that a blood sample came from each group. The methylation model achieved an overall accuracy of 0.77, outperforming both the tissue contribution model and the fragmentation model.

To further improve multi-cancer detection, a combined model was built from three feature types: differential methylation, tissue of origin, and fragmentation pattern. Its overall accuracy (correct three-class classification) reached 0.86 (64/74), and its accuracy in distinguishing cancer from non-cancer reached 0.92 (68/74), highlighting the benefit of multimodal information for cancer type prediction.
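The gap between the two accuracy figures arises because a cancer sample assigned to the wrong cancer type still counts as correctly flagged as cancer. The sketch below shows both calculations on a hypothetical confusion matrix (its counts are invented for illustration and are not the paper's results), and then checks the arithmetic of the two reported fractions.

```python
def overall_accuracy(confusion):
    """confusion[true][predicted] -> fraction of samples on the diagonal."""
    total = sum(sum(row.values()) for row in confusion.values())
    correct = sum(confusion[c][c] for c in confusion)
    return correct / total


def cancer_vs_noncancer_accuracy(confusion, non_cancer="control"):
    """Collapse all cancer types into one class before scoring."""
    total = correct = 0
    for true, row in confusion.items():
        for predicted, n in row.items():
            total += n
            # Correct if true and predicted labels agree on cancer status.
            correct += n * ((true == non_cancer) == (predicted == non_cancer))
    return correct / total


# Hypothetical three-class confusion matrix: confusion[true][predicted].
confusion = {
    "control": {"control": 9, "HCC": 1, "PDAC": 0},
    "HCC": {"control": 1, "HCC": 7, "PDAC": 2},
    "PDAC": {"control": 0, "HCC": 1, "PDAC": 9},
}

print("three-class accuracy:", round(overall_accuracy(confusion), 3))
print("cancer vs non-cancer:", round(cancer_vs_noncancer_accuracy(confusion), 3))

# Arithmetic check of the fractions reported in the text.
print(round(64 / 74, 2), round(68 / 74, 2))
```

The final line prints 0.86 and 0.92, matching the reported values for 64/74 and 68/74.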


The methylation library was constructed from a low input of 10 ng of cfDNA and performed well when applied to liver cancer and pancreatic cancer. Compared with cfDNA WGBS, the study demonstrated four advantages: ① lower sample input, ② higher data quality, ③ a more complete picture of methylation sites, and ④ better retention of genetic information. Through deep sequencing with cfTAPS, the authors identified regions that are differentially methylated between cfDNA and gDNA and found methylation biomarkers for early cancer detection.

In the present study, CNVs and fragment information, which are lost in cfDNA WGBS, were extracted from cfTAPS. The authors further demonstrate that an integrated approach combining differential methylation, tissue of origin, and fragmentation profiles can improve model performance for multi-cancer detection.


The study has two main limitations. First, the method has been tested in a relatively small number of patients. The methylation-based HCC classifier could be validated on an independent cfDNA HCC WGBS dataset. However, because of the limited sample size and the lack of publicly available genome-wide cfDNA methylation data for PDAC, the results have not been validated in a larger, independent cohort.

Second, because publicly available genome-wide tissue methylation data are limited (e.g., no PDAC tissue data are available) and the tissue reference relies on WGBS databases, the current study is far from exploiting the full potential of tissue-of-origin information from cfTAPS.
