CPANG18 Computational PANGenomics
Reference genomes have become central to bioinformatics approaches, and form the core of standard analyses using contemporary sequencing data. However, the use of linear reference genomes, which provide the sequence of one representative genome for a species, is increasingly becoming a limitation as the number of sequenced genomes grows. In particular, they tend to bias us away from the observation of variation in the genomes we study.
A general solution to this problem is to use a pangenome that incorporates both sequence and variation from many individuals as our reference system. This pangenome is naturally modeled as a graph with annotations, and can provide all the functionality traditionally provided by linear reference genomes. Unlike linear reference genomes, a pangenome readily incorporates both small and large variation, allowing bias-free genotyping at known alleles.
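To make the graph model concrete, here is a minimal sketch of a pangenome graph in Python. It assumes a toy representation that is not taken from any course material or tool: nodes carry sequence, edges connect them, and a single-base variant appears as two alternative nodes between shared flanks. The names `nodes`, `edges`, and `spell_paths` are hypothetical and chosen only for illustration.

```python
# Toy sequence graph (hypothetical example data).
# Nodes 2 and 3 are alternative alleles (A vs. G) between shared flanks.
nodes = {1: "ACGT", 2: "A", 3: "G", 4: "TTCA"}
edges = {1: [2, 3], 2: [4], 3: [4], 4: []}


def spell_paths(nodes, edges, start):
    """Return every sequence spelled by a walk from `start` to a sink node."""
    successors = edges[start]
    if not successors:
        # Sink node: the walk ends here.
        return [nodes[start]]
    return [
        nodes[start] + tail
        for nxt in successors
        for tail in spell_paths(nodes, edges, nxt)
    ]


# Each walk through the graph corresponds to one possible haplotype.
haplotypes = spell_paths(nodes, edges, 1)
print(haplotypes)
```

A linear reference would store only one of these two sequences; the graph keeps both, so reads carrying either allele can be matched to a path without favoring one of them.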
In this course we will explore the use of modern bioinformatic tools that allow researchers to use pangenomes as their reference system when engaging in studies of organisms of all types. Such techniques will aid any researcher working on organisms of high genetic diversity or on organisms lacking a high-quality reference genome. This course targets all researchers interested in learning about an exciting paradigm shift in computational genomics.
Participants will first learn about the limitations of linear reference-based methods and work through a brief refresher on, or introduction to, standard approaches for processing sequencing data, including read alignment and variant calling. With these motivating examples in hand, we will use data from a variety of relevant sources to develop an intuition for pangenomic methods and a practical familiarity with the applicable tools.
You can see the detailed program through the link below.
Registration is open until February 23rd; please follow the instructions at the link below.
Contact: For any questions regarding this course, please contact GTPB (contacts below).