The Globe and Mail UK


DeepVariant and GLnexus: quay.io/mlin/glnexus:v1.2.7

In the fast-evolving world of genomics, precision and scalability are key. DeepVariant, developed by Google, is a variant caller that uses deep learning to identify genetic variants from next-generation sequencing (NGS) data. GLnexus is a joint-genotyping tool that merges per-sample variant calls into a single cohort-level call set. The container image quay.io/mlin/glnexus:v1.2.7 packages GLnexus release v1.2.7, the tag referenced in DeepVariant's cohort-merging documentation. Newer GLnexus releases exist, so treat v1.2.7 as a pinned version rather than the latest one.

This article will guide you through the key features, installation, and usage of DeepVariant and GLnexus, focusing on version v1.2.7.

What is DeepVariant?

DeepVariant is an open-source tool that uses a deep neural network to call variants from aligned NGS reads. For each candidate site it encodes the read pileup around that site as an image-like tensor, then applies a trained model to classify the genotype at that site, which is how it separates true variants from sequencing errors.

Key Features of DeepVariant:

  1. High Accuracy: DeepVariant achieves a high level of accuracy in identifying single nucleotide polymorphisms (SNPs) and insertions-deletions (indels).
  2. Deep Learning Models: It utilizes convolutional neural networks (CNNs) trained on a wide range of genomic datasets.
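  3. Support for Various Sequencing Platforms: DeepVariant ships separate models for different data types, including Illumina short reads, PacBio HiFi, and Oxford Nanopore, selected with the --model_type option.
  4. Versatile Input: DeepVariant takes aligned reads in BAM or CRAM format together with the reference FASTA, and writes standard VCF (Variant Call Format). It can also write a per-sample gVCF, which is the input GLnexus expects.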
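
As a rough sketch of the inputs and outputs listed above, the helper below assembles a DeepVariant Docker command for one sample. The function name deepvariant_cmd, the image tag 1.6.1, the file names, and the shard count are assumptions chosen for illustration; run_deepvariant and the flags shown come from DeepVariant's Docker quick start. The function only prints the command, so it can be reviewed before anything is run.

```bash
#!/usr/bin/env bash
# Print (not run) a DeepVariant command for one sample.
# Assumptions: the image tag, directory layout, and file names are placeholders.
deepvariant_cmd() {
  local sample="$1"            # sample name, e.g. HG002
  local data_dir="$2"          # host directory holding the reference and the BAM
  local version="${3:-1.6.1}"  # DeepVariant image tag (assumed default)
  printf '%s ' \
    docker run --rm -v "${data_dir}:/data" "google/deepvariant:${version}" \
    /opt/deepvariant/bin/run_deepvariant \
    --model_type=WGS \
    --ref=/data/reference.fasta \
    --reads="/data/${sample}.bam" \
    --output_vcf="/data/${sample}.vcf.gz" \
    --output_gvcf="/data/${sample}.g.vcf.gz" \
    --num_shards=4
  printf '\n'
}
```

Calling deepvariant_cmd HG002 /path/to/data prints a single command line. The --output_gvcf flag is the part that matters for the later GLnexus step.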

What is GLnexus?

GLnexus is an open-source tool for joint genotyping across multiple samples. It merges per-sample gVCF files into a single multi-sample call set, so that the cohort can be analyzed as one dataset.

Key Features of GLnexus:

  1. Joint Genotyping: GLnexus is tailored for multi-sample genotyping, making it a great tool for projects involving large cohorts.
  2. High Performance: The tool is optimized to scale efficiently for projects with hundreds or even thousands of samples.
  3. Compatibility: GLnexus reads gVCF files produced by several variant callers, including DeepVariant and GATK, and provides configuration presets for them, selected with the --config option.
  4. Customizable Output: It allows fine-tuning of output variant calls, ensuring researchers can control the balance between sensitivity and precision.

Overview of quay.io/mlin/glnexus

The container image quay.io/mlin/glnexus:v1.2.7 packages GLnexus release v1.2.7. It is hosted on Quay.io, a container registry service. Running GLnexus from the container avoids building it from source and sidesteps common dependency issues, and pinning the tag keeps results reproducible across computing environments.

Key Updates in v1.2.7:

  1. Improved Performance: Enhanced algorithms for faster variant aggregation and joint genotyping.
  2. Bug Fixes: Resolved minor issues from previous versions for more reliable results.
  3. Compatibility Enhancements: Better integration with variant callers like DeepVariant and optimized handling of multi-sample data.

How to Use quay.io/mlin/glnexus

GLnexus is a command-line tool. The v1.2.7 image can be pulled from the Quay.io repository and run in a containerized environment such as Docker or Kubernetes.

Step-by-Step Instructions:

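  1. Pull the Image: To pull the v1.2.7 image from Quay.io, run the following command:
    bash
    docker pull quay.io/mlin/glnexus:v1.2.7
  2. Run GLnexus: Once the image is pulled, call the glnexus_cli executable inside the container. Choose the preset that matches your variant caller with --config, list the input gVCF files, and redirect standard output to a file, because glnexus_cli writes the merged result there in BCF format:
    bash
    docker run --rm -v /path/to/data:/data quay.io/mlin/glnexus:v1.2.7 \
    /usr/local/bin/glnexus_cli --config DeepVariantWGS \
    /data/sample1.g.vcf.gz /data/sample2.g.vcf.gz > /path/to/data/cohort.bcf
  3. Input and Output:
    • Input: Provide the per-sample gVCF files to be jointly genotyped. GLnexus expects gVCF input rather than plain VCF.
    • Output: A single multi-sample BCF file. To get a compressed VCF, convert it with bcftools, for example by piping the output of bcftools view through bgzip.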
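
For cohorts with more than a handful of samples, typing every input path by hand is error-prone. The sketch below builds the GLnexus command from whatever gVCF files are found in a directory. The function name glnexus_cmd, the *.g.vcf.gz naming convention, and the cohort.bcf output name are assumptions, not part of GLnexus. As a precaution, the function prints the command instead of running it.

```bash
#!/usr/bin/env bash
# Print (not run) a GLnexus command covering every gVCF in a directory.
# Assumptions: gVCFs are named *.g.vcf.gz and the directory is mounted at /data.
glnexus_cmd() {
  local data_dir="$1"                   # host directory with per-sample gVCFs
  local config="${2:-DeepVariantWGS}"   # GLnexus preset for the upstream caller
  local inputs=()
  local f
  for f in "${data_dir}"/*.g.vcf.gz; do
    [ -e "$f" ] || continue             # skip the literal pattern if nothing matched
    inputs+=("/data/$(basename "$f")")  # path as seen inside the container
  done
  if [ "${#inputs[@]}" -eq 0 ]; then
    echo "no *.g.vcf.gz files in ${data_dir}" >&2
    return 1
  fi
  echo "docker run --rm -v ${data_dir}:/data quay.io/mlin/glnexus:v1.2.7" \
       "/usr/local/bin/glnexus_cli --config ${config} ${inputs[*]}" \
       "> ${data_dir}/cohort.bcf"
}
```

Running glnexus_cmd /path/to/data prints one command line that can be inspected and then pasted into the shell. A second argument such as DeepVariantWES overrides the preset.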

Integration with DeepVariant

DeepVariant and GLnexus are commonly used together in large genomic projects where many samples are sequenced and analyzed. Run DeepVariant on each sample with the --output_gvcf option so that it writes a gVCF alongside the VCF, then pass all of the gVCF files to GLnexus with a DeepVariant preset (for example --config DeepVariantWGS) to perform joint genotyping across the cohort.
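
Between the two stages it can help to confirm that every sample actually produced a gVCF, since a failed DeepVariant run would otherwise silently drop a sample from the cohort. The following is a minimal pre-flight check, not something either tool provides. The function name missing_gvcfs, the one-name-per-line sample list, and the <sample>.g.vcf.gz naming are all assumptions.

```bash
#!/usr/bin/env bash
# List samples whose DeepVariant gVCF is missing or empty, so that joint
# genotyping is not started on an incomplete cohort.
# Assumptions: one sample name per line in the list file; gVCFs are named
# <sample>.g.vcf.gz inside the given directory.
missing_gvcfs() {
  local sample_list="$1"   # text file, one sample name per line
  local gvcf_dir="$2"      # directory where DeepVariant wrote its gVCFs
  local sample
  local status=0
  while IFS= read -r sample || [ -n "$sample" ]; do
    [ -z "$sample" ] && continue                 # ignore blank lines
    if [ ! -s "${gvcf_dir}/${sample}.g.vcf.gz" ]; then
      echo "$sample"                             # file is missing or empty
      status=1
    fi
  done < "$sample_list"
  return "$status"
}
```

The function prints the names of incomplete samples and returns a non-zero status if there are any, so a pipeline script can stop before calling GLnexus.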