TURNIP Documentation

Introduction

TURNIP stands for Tracking UnResolved Nucleotide Polymorphisms.

It is a software package comprising a set of Perl scripts and modules that search for variation within unassembled genome sequences. The main difference between TURNIP and other such programs is that TURNIP has been specifically built to handle large repetitive sequences, such as the ribosomal DNA (rDNA) tandem repeats in Saccharomyces yeasts. These yeasts can have up to, and potentially in excess of, 150 tandem copies of the ~9 kb rDNA repeat. Such high-copy tandem regions are nigh on impossible to assemble and compare directly to a consensus. TURNIP allows sequencing reads to be aligned to a consensus region for a single rDNA copy (derived from other sequencing experiments, or from a previous TURNIP run used to deduce a suitable consensus) and investigated to derive fine-scale microheterogeneity such as SNPs, indels and pSNPs.

pSNPs are "partial SNPs": variants found in only a proportion of the reads stacked up (BLASTed and MUSCLE-aligned) against the single rDNA copy consensus (see Fig. 1). Whereas a SNP differs from the consensus in 100% of the subject reads, a pSNP differs in only a proportion of them.
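
The SNP/pSNP distinction can be illustrated with a minimal Python sketch. It is not TURNIP's own code (TURNIP is written in Perl), and the function name `classify_site`, its inputs, and the "match" label are hypothetical, chosen only for illustration. It assumes that the reads have already been aligned, so that one column of read bases can be compared with the consensus base at that position. Any coverage or frequency thresholds that TURNIP itself applies are not described here and are not modelled.

```python
def classify_site(consensus_base, read_bases):
    """Classify one aligned column against the consensus base.

    Hypothetical helper for illustration only.
    Returns (label, fraction_of_reads_differing), where label is
    'SNP' (all reads differ), 'pSNP' (some reads differ) or
    'match' (no reads differ).
    """
    if not read_bases:
        raise ValueError("no reads cover this position")

    # Count reads whose base differs from the consensus (case-insensitive).
    differing = sum(
        1 for base in read_bases if base.upper() != consensus_base.upper()
    )
    fraction = differing / len(read_bases)

    if differing == 0:
        label = "match"
    elif differing == len(read_bases):
        label = "SNP"   # 100% of the reads differ from the consensus
    else:
        label = "pSNP"  # only a proportion of the reads differ
    return label, fraction


if __name__ == "__main__":
    print(classify_site("A", list("GGGG")))  # ('SNP', 1.0)
    print(classify_site("A", list("AAGA")))  # ('pSNP', 0.25)
```

In this simplified model, a single differing read is enough to label a site as a pSNP; a real analysis would normally need to separate true partial variation from sequencing error.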


More information about TURNIP and pSNPs can be found in the following resources:
  • TURNIP is presented as an Applications Note in Bioinformatics: http://bioinformatics.oxfordjournals.org/content/26/22/2908
  • NRP Bioinformatics Symposium talk, John Innes Centre, 2nd Dec 2009 (PDF)
  • Annual Science Day talk, Institute of Food Research, 2008 (PDF)
  • Repetitive sequence variation and dynamics in the ribosomal DNA array of Saccharomyces cerevisiae as revealed by whole-genome resequencing. Stephen A. James, Michael J.T. O'Kelly, David M. Carter, Robert P. Davey, Alexander van Oudenaarden and Ian N. Roberts. Genome Res. 2009. 19:626-635. DOI: 10.1101/gr.084517.108

Howto

Learn how to install and use TURNIP.

Perldoc

Perldoc documentation for TURNIP can be found below:

Core modules:
Turnip.pm
HitSeries.pm
IndexDict.pm
BlastFactory.pm
Prefs.pm
Utils.pm

Helper scripts:
turnip.pl
process_fastq.pl
storable_convert.pl
matrix_gen.pl
tree_gen.pl

GBrowse Documentation

Extra Glyphs

The TURNIP suite comes with the following new custom glyphs for GBrowse to display the GFF data that is produced by TURNIP.

turnip_pie_multi.pm - this glyph shows the raw TURNIP frequency data of each variation type as a pie chart. This file needs to be placed in the same directory as the other BioPerl glyphs, e.g. /usr/local/lib/perl5/site_perl/5.8.8/Bio/Graphics/Glyph/ (Perldoc)

Configuration

A sample GBrowse configuration file is available that provides an example of how to set up GBrowse to accept TURNIP data and display it using the custom glyphs outlined above. It is usually placed in the /etc/httpd/conf/gbrowse.conf/ directory, or wherever your GBrowse conf files reside if you have a non-default installation.