Next-generation sequencing (NGS) technologies have significantly advanced the field of agrigenomics, making genome analysis more affordable, accessible, and practical for a range of agricultural applications. With these advancements, researchers must choose between several low-coverage sequencing strategies, notably Skim-seq and low-pass whole genome sequencing (lp-WGS). Understanding the differences, strengths, and limitations of each method is essential to selecting the appropriate sequencing strategy for specific agrigenomic objectives.
Skim-seq
Definition and Sequencing Coverage: Skim-seq sequences genomes at very low coverage (typically 0.01x–0.5x), enabling cost-effective genome-wide single nucleotide polymorphism (SNP) genotyping through imputation. The name comes from “genome skimming”: sequencing so shallowly that each genome is only sparsely sampled rather than fully covered.
Common Applications: Skim-seq is used primarily for genomic selection and SNP genotyping, and it scales well to large population studies, breeding programs, and genomic prediction.
Integration with Reference Panels and Imputation: Skim-seq relies heavily on imputation (predicting missing genotypes from reference panels of previously sequenced individuals) to fill gaps in low-coverage data. Imputation accuracy depends on a high-quality reference panel that represents the study population well.
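To see why coverage in this range leaves gaps that imputation has to fill, the following minimal sketch estimates mean depth from read count and the fraction of sites expected to receive any reads. It assumes reads land uniformly at random (a Poisson model), which real libraries only approximate; the function names and the example read count, read length, and genome size are illustrative assumptions, not values from a specific study.

```python
import math

def mean_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Mean depth = total sequenced bases / genome size."""
    return n_reads * read_length / genome_size

def fraction_covered(depth: float, min_reads: int = 1) -> float:
    """Poisson probability that a site is hit by at least `min_reads` reads."""
    p_fewer = sum(
        math.exp(-depth) * depth**k / math.factorial(k) for k in range(min_reads)
    )
    return 1.0 - p_fewer

# Hypothetical sample: 1 million 150 bp reads on a 300 Mb genome.
print("mean depth:", mean_coverage(1_000_000, 150, 300_000_000))

# Expected fraction of sites seen at least once or twice, across both depth ranges.
for depth in (0.01, 0.1, 0.5, 1.0, 4.0):
    once = fraction_covered(depth)
    twice = fraction_covered(depth, min_reads=2)
    print(f"{depth:>5}x  >=1 read: {once:.3f}  >=2 reads: {twice:.3f}")
```

Under this model, roughly 1% of sites are observed at 0.01x and about 39% at 0.5x, so most genotypes in a Skim-seq sample are never read directly and must be inferred.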
Pros and cons:
Pros:
- Extremely cost-effective
- Ideal for high-throughput genotyping and breeding programs
Cons:
- Lower per-sample resolution than deeper sequencing
- Highly dependent on reference panel quality
- Less effective for structural variant detection
Low-pass Whole Genome Sequencing (lp-WGS)
Definition and Sequencing Coverage: lp-WGS sequences the entire genome at moderate-to-low coverage (typically 0.5x–4x), offering broader genomic profiling at a manageable cost.
Primary Applications: Effective for detecting copy number variations (CNVs), genomic diversity analysis, and genomic prediction tasks. It supports broad genomic surveillance, particularly useful in breeding programs that require genome-wide insights.
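The read-depth idea behind CNV detection can be sketched as follows: count reads in fixed-size genomic bins, normalize to the sample median, and flag bins whose log2 ratio departs from zero. This is a simplified illustration only. The bin counts are made up, the `gain` and `loss` thresholds are hypothetical, and real pipelines typically also correct for GC content and mappability and then segment adjacent bins.

```python
import math
from statistics import median

def log2_ratios(bin_counts):
    """Normalize per-bin read counts to the sample median; return log2 ratios."""
    med = median(bin_counts)
    # A pseudocount of 0.5 avoids log(0) for empty bins.
    return [math.log2((c + 0.5) / (med + 0.5)) for c in bin_counts]

def call_bins(ratios, gain=0.4, loss=-0.6):
    """Label each bin using hypothetical log2-ratio thresholds."""
    calls = []
    for r in ratios:
        if r >= gain:
            calls.append("gain")
        elif r <= loss:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls

# Made-up read counts for ten consecutive bins along one chromosome.
counts = [100, 98, 103, 150, 152, 149, 101, 50, 49, 99]
for count, ratio, call in zip(counts, log2_ratios(counts), call_bins(log2_ratios(counts))):
    print(f"{count:>4}  {ratio:+.2f}  {call}")
```

Because each bin pools many reads, the per-bin signal stays usable even when per-site depth is too low to call individual SNPs directly.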
Workflow Compatibility: lp-WGS integrates smoothly with standard WGS workflows, leveraging standard laboratory and analytical pipelines, making it practical and flexible.
Pros and cons:
Pros:
- Good genomic resolution for CNVs and genomic diversity
- Relatively straightforward analysis
Cons:
- Slightly higher cost compared to Skim-seq
- Lower efficiency in SNP-specific genotyping without additional imputation
Key differences
Feature | Skim-seq | lp-WGS |
---|---|---|
Sequencing Depth | 0.01x–0.5x | 0.5x–4x |
Data Type | SNP genotype/imputation | Genome-wide CNVs, genomic diversity |
Typical Applications | SNP genotyping, genomic selection | CNV detection, genomic diversity |
Reference Panel Requirement | Essential | Optional, depending on application |
Reliance on Imputation | High | Moderate to Low |
Cost | Lower | Moderate |
Role of imputation
Imputation is a statistical method used to infer missing genetic data. In Skim-seq, imputation is critical because the ultra-low coverage inherently produces sparse data. The accuracy of Skim-seq thus depends greatly on the quality of the reference panel. In lp-WGS, imputation can be used to enhance SNP-level resolution, but it is not always necessary, particularly when analyzing CNVs or larger genomic features.
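As a toy illustration of the principle, the sketch below fills in unobserved alleles by copying from whichever reference-panel haplotype best matches the few sites that were observed. This is a deliberately simplified stand-in: production imputation tools use probabilistic models that treat each sample as a mosaic of panel haplotypes, not a single best match. The function name, the panel, and the observed data are all hypothetical.

```python
def impute_haploid(observed, panel):
    """Fill missing alleles (None) from the best-matching panel haplotype."""
    def mismatches(hap):
        # Count disagreements only at sites where the sample was observed.
        return sum(1 for o, h in zip(observed, hap) if o is not None and o != h)

    best = min(panel, key=mismatches)
    # Keep observed alleles; copy the panel allele where data is missing.
    return [h if o is None else o for o, h in zip(observed, best)]

# Hypothetical reference panel: three haplotypes over six SNP sites (0/1 alleles).
panel = [
    [0, 0, 1, 1, 0, 1],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 1],
]

# Sparse low-coverage sample: only sites 1 and 3 were covered by a read.
observed = [None, 1, None, 0, None, None]

print(impute_haploid(observed, panel))  # [0, 1, 1, 0, 0, 1]
```

The example also shows why panel quality matters: if no panel haplotype resembles the sample, the copied alleles are wrong no matter how the matching is done.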
Practical considerations
Choosing between Skim-seq and lp-WGS depends on several factors:
- Project Goals: Skim-seq suits SNP-focused studies or large-scale genotyping, while lp-WGS is ideal for broader genomic investigations.
- Budget Constraints: Skim-seq is more budget-friendly for extensive population studies; lp-WGS offers a balanced cost-performance ratio for smaller or CNV-focused projects.
- Resource Availability: Quality of reference panels and computational resources should inform the choice, especially for imputation-dependent Skim-seq.
- Data Analysis Capacity: lp-WGS relies less on complex imputation, which can make it preferable for teams with limited analytical expertise, especially when the targets are CNVs or other large genomic features.
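The budget trade-off above comes down to simple arithmetic: for a fixed sequencing yield, the number of samples that can be multiplexed scales inversely with target depth. The sketch below makes that concrete. The run yield, genome size, and usable-read fraction are placeholder assumptions to be replaced with figures for your own platform and species.

```python
import math

def samples_per_run(run_yield_bp, genome_size_bp, target_depth, usable_fraction=0.8):
    """Samples that fit in one run at a given mean depth.

    usable_fraction is a hypothetical allowance for reads lost to
    duplicates, adapters, and failed mapping.
    """
    usable_bases = run_yield_bp * usable_fraction
    bases_per_sample = genome_size_bp * target_depth
    return math.floor(usable_bases / bases_per_sample)

# Placeholder values: 100 Gb of yield, 1 Gb genome.
RUN_YIELD_BP = 100_000_000_000
GENOME_SIZE_BP = 1_000_000_000

for label, depth in (("Skim-seq", 0.1), ("Skim-seq", 0.5), ("lp-WGS", 1.0), ("lp-WGS", 4.0)):
    n = samples_per_run(RUN_YIELD_BP, GENOME_SIZE_BP, depth)
    print(f"{label:<9} {depth:>4}x  {n} samples per run")
```

At the same yield, a 0.5x design fits eight times as many samples as a 4x design, which is the main reason Skim-seq is attractive for very large populations.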
Conclusion
Both Skim-seq and lp-WGS offer valuable capabilities in agrigenomics, but their suitability depends on specific research goals, budget, and analytical capacity. Carefully evaluating these considerations will help researchers select the optimal sequencing strategy to achieve precise, cost-effective, and meaningful genomic insights.