Generic Genome Browser Version 2: A Tutorial for Administrators

Author: Lincoln Stein, 20 January 2010
Modified for Workshop on Comparative Genomics, North America by Sheldon McKay, July 2011

Table of Contents

  0. Expected Learning Outcome
  1. The Basics
    1. The Data File
    2. Defining Tracks
    3. Adding Descriptions to a Feature
    4. Adjusting GBrowse Name Searches
    5. Linking
    6. Adding Popup Balloons to Tracks
      1. Customizing Balloons
  2. Displaying Common Types of Features
    1. Multi-segmented features
    2. Protein-Coding Genes
    3. Reading Frames
    4. Grouped Features
    5. Quantitative Data (basic)
    6. Quantitative Data (advanced)
    7. DNA and 3-frame translations
    8. ESTs and Other Alignments
      1. Adding DNA to Alignments
  3. GBrowse Enhancements
    1. Semantic Zooming

0. Expected Learning Outcome

The objective of this learning activity is to establish familiarity with, and provide guidance for, the primary steps in configuring GBrowse as a server.

Important: this activity is designed for GBrowse 2.00 and will not work with earlier versions of the software.

1. The Basics

As a starting point it is assumed that you have successfully set up Perl, GD, BioPerl and the other GBrowse dependencies. If you have not, please see the GBrowse HOWTO. For this tutorial, we will be using the “in-memory” GBrowse database (no relational database required).

The tutorial materials are organized into two directories:

  • data_files/: DNA and features files to load into the local database.
  • conf_files/: GBrowse configuration files for you to take and modify.

On the Ubuntu filesystem, the locations of the above files are:

/var/www/gbrowse2/tutorial/data_files
/var/www/gbrowse2/tutorial/conf_files

The location of the “live” configuration files on the Ubuntu file system is:

/etc/gbrowse2

The location of the “live” data files is:

/var/www/gbrowse2/databases/volvox

We will be working with simulated Volvox genome annotation data. The database will be named “volvox” and GBrowse will be invoked with this URL:

http://localhost/cgi-bin/gb2/gbrowse/volvox

You’ll now edit the main GBrowse.conf configuration file to tell GBrowse about the new data source. Open /etc/gbrowse2/GBrowse.conf with a text editor.

$ cd /etc/gbrowse2
$ sudo gedit GBrowse.conf

Now paste in the stanza:

[volvox]
description = Tutorial database
path        = volvox.conf

Be sure to leave a blank line between the bottom of the previous stanza and the top of the new one (i.e., there should be a blank line above “[volvox]”).
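If you prefer to script this step, the blank-line rule can be sketched in Python as follows. This is only an illustration: append_stanza is a hypothetical helper, not part of GBrowse, and the [existing_source] stanza is a made-up placeholder for whatever already sits at the bottom of your GBrowse.conf.

```python
# Sketch: append a data-source stanza so that exactly one blank line
# separates it from whatever came before. append_stanza is a hypothetical
# helper written for this tutorial, not a GBrowse function.

def append_stanza(config_text, stanza):
    """Return config_text with stanza appended after exactly one blank line."""
    return config_text.rstrip("\n") + "\n\n" + stanza.strip("\n") + "\n"


# Placeholder for the existing tail of GBrowse.conf (an assumption).
existing = (
    "[existing_source]\n"
    "description = Placeholder stanza\n"
    "path        = existing.conf\n"
)

volvox = (
    "[volvox]\n"
    "description = Tutorial database\n"
    "path        = volvox.conf"
)

print(append_stanza(existing, volvox), end="")
```

The helper works on strings only, so you can inspect the result before writing anything back to /etc/gbrowse2/GBrowse.conf.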

You should now be able to view the data set. Point your web browser at http://localhost/cgi-bin/gb2/gbrowse/volvox and type “ctgA” into the search box. The result is shown in Figure 1.

Figure 1: volvox_remarks.gff3 data with volvox.conf config file.

If You are Having Problems…

If for some reason you get a blank page or an “Internal server error,” there are a couple of things to check. First, open the file volvox.conf with a text editor (“Notepad” on Windows systems, emacs, pico or vi on UNIX systems) and confirm that the path to the volvox database directory in this section is correct:

[GENERAL]
db_adaptor    = Bio::DB::SeqFeature::Store
db_args       = -adaptor memory
                -dir     '/var/www/gbrowse2/databases/volvox'

If the path to the database directory contains a space, be certain to put single quotes around the path, as shown in the example above.

Next, check that the volvox_remarks.gff3 file exists inside the volvox database directory and that it is readable by all users on your system. Similarly, check that the volvox.conf configuration file is in the same directory as yeast.conf, and that it is readable by all users on your system.
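The two checks above can be sketched in Python as follows. This is a minimal illustration, not a GBrowse tool: readable_by_all is a hypothetical helper, and the location of volvox.conf is assumed to be the live configuration directory /etc/gbrowse2 named earlier. It looks only at the file's own permission bits, so a directory higher up the path that is closed to other users would still block the web server.

```python
# Sketch: confirm that a file exists and has the "read by others"
# permission bit set. readable_by_all is a hypothetical helper.
import os
import stat


def readable_by_all(path):
    """True if path is a regular file whose mode grants read access to others."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing file, or a parent directory we cannot traverse.
        return False
    return stat.S_ISREG(mode) and bool(mode & stat.S_IROTH)


if __name__ == "__main__":
    # Paths taken from this tutorial; the volvox.conf location is an assumption.
    for path in (
        "/var/www/gbrowse2/databases/volvox/volvox_remarks.gff3",
        "/etc/gbrowse2/volvox.conf",
    ):
        status = "OK" if readable_by_all(path) else "missing or not readable by all"
        print(path, status)
```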

Microsoft Windows has an unpleasant tendency to add a .txt extension to files without warning. If something seems to be wrong with the config or GFF file and you can’t figure out what, check that the file extension hasn’t been modified. To avoid this problem, I suggest that you select All File Types from the popup menu in the File -> Save dialog. You might also want to configure your Folder display to show known file extensions.

If you’re still having no luck, check the bottom of the Apache server error log for error messages. This file is located in various places depending on how Apache is installed. Look for the file error_log, typically located in /usr/local/apache/logs, C:\Program Files\Apache Group\Apache2\logs, /var/log/www, or /var/log/httpd. The error message will usually point you in the right direction.
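As a rough sketch of that search, the following Python looks for error_log in the directories listed above and prints its last few lines. The helper names find_error_log and tail are hypothetical, and your installation may keep the log somewhere else or under a different file name; reading it may also require administrator privileges.

```python
# Sketch: find the Apache error_log in the locations named in this
# tutorial and show its last lines. Helper names are hypothetical.
import os
from collections import deque

CANDIDATE_LOG_DIRS = [
    "/usr/local/apache/logs",
    r"C:\Program Files\Apache Group\Apache2\logs",
    "/var/log/www",
    "/var/log/httpd",
]


def find_error_log(directories, filename="error_log"):
    """Return the first existing path to filename in directories, or None."""
    for directory in directories:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


def tail(path, count=20):
    """Return the last count lines of a text file as a list of strings."""
    with open(path, errors="replace") as handle:
        return list(deque(handle, maxlen=count))


if __name__ == "__main__":
    log_path = find_error_log(CANDIDATE_LOG_DIRS)
    if log_path is None:
        print("No error_log found in the usual locations")
    else:
        try:
            print("".join(tail(log_path)), end="")
        except OSError as error:
            print(f"Could not read {log_path}: {error}")
```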

If this doesn’t fix the problem, please ask an instructor for help or send an e-mail to GBrowse support. Someone will be happy to assist you.

1.1 The Data File

Let’s now look in detail at the data file we loaded. If you open the volvox_remarks.gff3 file in a text editor, you will see that it contains a series of 15 genome “features” that look like this:

ctgA example contig 1     50000 . . . Name=ctgA
ctgA example remark 1659  1984  . + . Name=f07;Note=This is an example
ctgA example remark 3014  6130  . + . Name=f06;Note=This is another example
ctgA example remark 4715  5968  . - . Name=f05;Note=Ok! Ok! I get the message.
ctgA example remark 13280 16394 . + . Name=f08
...

Each feature has a “source” of “example”, a type of “remark”, and occupies a short range (roughly 1.5k) on a contig named “ctgA.” In addition to the features themselves, there is an entry for the contig itself (type “contig”). This entry is needed to tell GBrowse what the length of ctgA is.

The load file uses a standard known as GFF3 (General Feature Format version 3). Each line of the file corresponds to a feature on the genome, and the nine columns are separated by tabs.
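To make the nine-column layout concrete, here is a minimal Python sketch that splits one feature line into named fields. It is an illustration only: parse_gff3_line is a hypothetical helper, not part of GBrowse or BioPerl, and it handles only the simple tag=value attributes used in this tutorial's data file. The column names follow the GFF3 specification.

```python
# Sketch: split one GFF3 line into its nine tab-separated columns.
# parse_gff3_line is a hypothetical helper written for this tutorial.
from urllib.parse import unquote

GFF3_COLUMNS = ("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")


def parse_gff3_line(line):
    """Return a dict for one feature line, or None for blank and comment lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    feature = dict(zip(GFF3_COLUMNS, fields))
    feature["start"] = int(feature["start"])
    feature["end"] = int(feature["end"])
    # Column 9 holds semicolon-separated tag=value pairs such as Name=f07.
    attributes = {}
    if feature["attributes"] != ".":
        for pair in feature["attributes"].split(";"):
            if pair:
                tag, _, value = pair.partition("=")
                attributes[unquote(tag)] = unquote(value)
    feature["attributes"] = attributes
    return feature


# One of the feature lines shown above, with real tabs between columns.
example = "ctgA\texample\tremark\t1659\t1984\t.\t+\t.\tName=f07;Note=This is an example"
feature = parse_gff3_line(example)
print(feature["type"], feature["start"], feature["end"], feature["attributes"]["Name"])
```

A line copied from this page may contain spaces rather than tabs, in which case the sketch raises a ValueError; this is the same kind of mistake that makes a GFF3 file fail to load.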