Aug 12th 2024

Challenges in analyzing data with UMI-UDIs in next-generation sequencing.

As next-generation sequencing (NGS) technologies continue to advance, researchers have increasingly sought to maximize the accuracy and efficiency of sequencing data analysis. One promising approach has been the adoption of unique molecular identifiers (UMIs) and unique dual indexes (UDIs). These molecular tags facilitate the identification and quantification of individual DNA or RNA molecules. Despite their potential, incorporating UMI-UDIs into NGS workflows comes with its own set of challenges.

UMIs are short, random nucleotide sequences added to individual DNA or RNA molecules during library preparation. Because every copy of a molecule carries the same UMI, reads that share a UMI can be traced back to one original molecule, which makes it possible to detect PCR duplicates and amplification errors. UDIs, by contrast, are pairs of unique index sequences, one at each end of the library fragment, introduced during library preparation. UDIs distinguish between samples in multiplexed sequencing experiments: a read whose two indexes do not form an expected pair can be flagged and discarded, which reduces the risk of reads being assigned to the wrong sample.
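The two ideas above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the read tuples, the index sequences, the sample names, and the function names `count_unique_molecules` and `assign_sample` are all hypothetical and were chosen only for this example.

```python
from collections import defaultdict

def count_unique_molecules(reads):
    """Group reads by (chromosome, position, UMI) and count reads per group.

    `reads` is an iterable of (chrom, pos, umi) tuples. This layout is an
    assumption made for the sketch, not a standard file format.
    Each key in the result represents one inferred original molecule.
    """
    groups = defaultdict(int)
    for chrom, pos, umi in reads:
        groups[(chrom, pos, umi)] += 1
    return dict(groups)

def assign_sample(i7, i5, expected_pairs):
    """Return the sample for an expected (i7, i5) index pair, else None.

    A None result means the two indexes do not belong together,
    so the read should not be assigned to any sample.
    """
    return expected_pairs.get((i7, i5))

# Hypothetical reads: two share position and UMI (a likely PCR duplicate).
reads = [
    ("chr1", 100, "ACGT"),
    ("chr1", 100, "ACGT"),  # same position, same UMI: duplicate
    ("chr1", 100, "TTAG"),  # same position, different UMI: distinct molecule
    ("chr2", 500, "ACGT"),  # different position: distinct molecule
]
molecules = count_unique_molecules(reads)

# Hypothetical sample sheet mapping index pairs to sample names.
expected_pairs = {
    ("AAGGCC", "CCTTGG"): "sample_A",
    ("GGAATT", "TTCCAA"): "sample_B",
}
matched = assign_sample("AAGGCC", "CCTTGG", expected_pairs)
mismatched = assign_sample("AAGGCC", "TTCCAA", expected_pairs)

print(len(molecules), matched, mismatched)
```

Here four reads collapse to three inferred molecules, and the read carrying an i7 index from one sample with an i5 index from another is left unassigned rather than being counted toward either sample.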

Error correction in UMIs and UDIs

Molecular barcodes such as UMIs and UDIs are not immune to errors introduced during PCR amplification or sequencing, especially on patterned flow cells. These errors include polymerase misincorporation, template switching, and PCR-mediated recombination. An uncorrected error in a UMI makes one molecule look like two, and an error in a UDI can leave a read unassigned or assigned to the wrong sample, so implementing efficient error correction methods for UMI-UDIs is crucial to ensure the reliability of sequencing results.
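One common family of correction strategies merges rare UMIs into an abundant UMI that differs by only one or two bases. The sketch below shows a simplified greedy version of that idea using Hamming distance. It is an illustration under stated assumptions, not the algorithm of any particular tool: the function names, the example counts, and the default threshold `max_dist=1` are hypothetical.

```python
from collections import Counter

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("UMIs must have equal length")
    return sum(x != y for x, y in zip(a, b))

def collapse_umis(umi_counts, max_dist=1):
    """Map each observed UMI to a representative UMI.

    UMIs are visited from most to least abundant. A UMI within
    `max_dist` mismatches of an existing representative is treated as
    an error copy of it; otherwise it becomes a new representative.
    """
    representatives = []
    corrected = {}
    # Sort by descending count, then alphabetically for determinism.
    ordered = sorted(umi_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for umi, _count in ordered:
        for rep in representatives:
            if hamming(umi, rep) <= max_dist:
                corrected[umi] = rep
                break
        else:
            representatives.append(umi)
            corrected[umi] = umi
    return corrected

# Hypothetical UMI counts observed at a single mapping position.
counts = Counter({"ACGTAC": 50, "ACGTAA": 2, "TTGCAA": 30, "TTGCAT": 1})
mapping = collapse_umis(counts)
n_molecules = len(set(mapping.values()))
print(mapping, n_molecules)
```

In this example the two low-count UMIs are each one mismatch away from an abundant UMI and are merged into it, so four observed UMIs are interpreted as two original molecules. A greedy scheme like this can over-merge when genuinely distinct UMIs happen to be close, which is one reason dedicated tools use more careful, count-aware methods.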

Computational complexity

The addition of UMIs and UDIs increases the computational complexity of downstream analysis. Processing these unique identifiers requires specialized software packages such as UMI-tools, AmpUMI, UMIAnalyzer or UMI-VarCal. Existing bioinformatic tools and workflows must be adapted or redesigned, which often means that users need to be comfortable working with command-line scripts.
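To give a sense of the extra processing involved, a typical early step is to move the UMI out of the read sequence and into the read name so that aligners do not try to map it. The Python sketch below illustrates that step for a single FASTQ record. It assumes an 8-base UMI at the start of the read and an underscore-separated naming scheme; both are assumptions for illustration, since the UMI location and length depend on the library design, and `extract_umi` is a hypothetical helper rather than part of any of the packages named above.

```python
def extract_umi(name, seq, qual, umi_len=8):
    """Move the first `umi_len` bases of a read into its name.

    Returns (new_name, trimmed_seq, trimmed_qual). The UMI is appended
    to the read identifier with an underscore, and any comment after
    the first space in the name is preserved.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    if len(seq) < umi_len:
        raise ValueError("read is shorter than the UMI length")
    umi = seq[:umi_len]
    read_id, _, comment = name.partition(" ")
    new_name = f"{read_id}_{umi}"
    if comment:
        new_name += f" {comment}"
    return new_name, seq[umi_len:], qual[umi_len:]

# Hypothetical FASTQ record (name, sequence, quality).
name = "@read1 1:N:0:ACGT"
seq = "AACCGGTTACGTACGT"
qual = "I" * len(seq)

new_name, new_seq, new_qual = extract_umi(name, seq, qual)
print(new_name)
print(new_seq)
print(new_qual)
```

Keeping the UMI in the read name means it travels with the read through alignment, so a later deduplication step can recover it without re-reading the original FASTQ files.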

Data storage resources

UMIs and UDIs significantly increase the volume of metadata associated with each sequencing run. Managing and storing this additional data can strain existing infrastructure, necessitating investments in more advanced data storage solutions and efficient data management practices. Ensuring data integrity and accessibility over time adds another layer of complexity to NGS projects.

Standardization and compatibility

Currently, there is no standardized approach for incorporating UMIs and UDIs into NGS workflows, resulting in a diverse array of protocols and techniques adopted by different laboratories. This complicates the comparison and integration of datasets generated from various sources.

In conclusion, while UMIs and UDIs offer significant advantages for enhancing the accuracy and reliability of NGS data analysis, their implementation presents a range of technical, computational, and practical challenges. Addressing these challenges will be essential in maximizing the potential of UMI-UDIs and facilitating their widespread adoption in NGS-based research.

