Introduction: Annotation With PhageNotes
PhageNotes is a tool for annotating bacteriophage genomes. Annotating a genome consists of determining where the genes are located within the genome and what those genes do. DNAMaster, an annotation program, is used to predict likely gene locations. However, DNAMaster does not always predict the best location for a gene, and additional information is needed to support the presence of a gene.
The annotator's job is to determine whether the calls made by the software are correct and to record that additional information. PhageNotes makes this easy: it provides places to enter all of the necessary information and concatenates it into the proper format for submission to GenBank, a database that publishes annotated genomes.
This instructable walks you through annotating a bacteriophage gene using PhageNotes.
Tools Required:
.fasta file of the bacteriophage genome (available on PhagesDB or GenBank)
DNAMaster (Wine Virtual Platform if using Mac)
PhageNotes
NCBI, PhagesDB, HHPred
Approximate Time to Complete: 30 minutes
Step 1: Preparing DNAMaster
1. If on a Windows computer, right-click the DNAMaster executable and select Run as Administrator. If using a virtual machine on a Mac, open it normally.
2. Once the Welcome Screen has loaded, select "Import and Annotate a New (FastA) File" and Begin.
3. When prompted, select a FastA file provided by your instructor or downloaded from an outside source such as NCBI, GenBank, or PhagesDB.
4. Once the FastA file is loaded, click the Annotate button and click Yes when asked if you want to "Erase all features and Annotate."
5. Once the annotation is complete, close the Annotation Log and Profile Windows.
6. Select the Features tab on the remaining open screen.
Step 2: Choosing the Start Codon for Your Gene
1. On the Home page of PhageNotes, enter your name on the gene you will be annotating for record purposes.
2. In DNAMaster, go to the DNA tab at the top of the screen and select Frames. A window will open that shows the putative reading frames within the genome. Press the ORFs button to show all of the genes predicted by DNAMaster. Find your gene of interest and click on the left end above the gene to select the whole gene. Now click RBS. A new window will open displaying the potential start positions for your gene of interest. Select the start codon with the least negative (largest) Final Score and the longest Open Reading Frame (ORF). This is sometimes a difficult call to make, because the best score and the longest ORF do not always belong to the same start.
3. After choosing the start position, return to the Features table and, if necessary, update the start position (5') under the Description tab. Record the SD score of the chosen start position in the SD tab on PhageNotes. Indicate how the score ranks among the other scores. If the best score was not chosen (to maintain the longest Open Reading Frame), indicate this in the notes section. Record the Z-value associated with this score as well.
4. Go to the LO tab on PhageNotes. Indicate whether you chose the longest Open Reading Frame. If you did not, explain why you made that choice (e.g. poor SD score or Z-value).
5. The SCC tab records information regarding the Start and Stop of the gene. In the Start space, input the number found in the 5' end on the DNAMaster Description tab for your gene. For the Stop codon, input the number found in the 3' end in the same tab. To determine if the gene is Forward or Reverse, look at the direction label under the description tab. To calculate the gap or overlap:
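Subtract the 3' end of the previous gene from the 5' end of the current gene. If the number is negative, this indicates an overlap: take the absolute value and add 1. Otherwise, this is a gap: subtract 1. For example, if the previous gene ends at 100 and the current gene starts at 97, the difference is -3 and the overlap is 4. Input the gap or overlap into PhageNotes.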
6. Go to the SCS tab on PhageNotes. When asked if the start codon compares with Glimmer, compare the start codon you selected to the Original Glimmer Call in the Notes section of the Description tab on DNAMaster. Repeat this process for GeneMark. If GeneMark is not mentioned in the Notes section, assume that the GeneMark call is the same as the Glimmer call.
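The gap/overlap arithmetic and the score ranking from this step can be sketched in a few lines of Python. This is only an illustration, not part of PhageNotes or DNAMaster: the function names are hypothetical, the difference is taken as the current gene's 5' end minus the previous gene's 3' end (so a negative value means overlap), and it assumes two adjacent forward genes. Treating a difference of exactly 0 as an overlap of 1 is an assumption that the steps above do not spell out.

```python
def gap_or_overlap(prev_stop, cur_start):
    """Return ("gap" or "overlap", size in bp) for two adjacent forward genes.

    prev_stop: 3' end of the previous gene.
    cur_start: 5' end of the current gene.
    """
    diff = cur_start - prev_stop
    if diff <= 0:
        # The genes share bases: absolute value plus 1.
        # diff == 0 (one shared base) is handled here by assumption.
        return ("overlap", abs(diff) + 1)
    # The genes do not share bases: subtract 1 to count the bases between them.
    return ("gap", diff - 1)


def rank_of_score(chosen, all_scores):
    """Rank of the chosen Final Score, where rank 1 is the least negative."""
    ordered = sorted(set(all_scores), reverse=True)
    return ordered.index(chosen) + 1


# Hypothetical coordinates and scores, for illustration only.
print(gap_or_overlap(100, 110))
print(gap_or_overlap(100, 97))
print(rank_of_score(-3.2, [-5.1, -3.2, -2.0]))
```

The rank helper only answers "how does the chosen score rank among the others"; it does not decide between the best score and the longest ORF, which remains a judgment call.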
Step 3: BLAST Results
1. Click on the gene you are annotating in the Features Table in DNAMaster.
2. On the right hand side, select the Product tab and copy the sequence from this tab.
3. Go to the NCBI BLASTp website https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=b... and paste the sequence in the Enter Query Sequence Box.
4. Click the BLAST button and wait for the BLAST to complete. The screen may update several times before it is complete. DO NOT refresh. If you do, your query will be sent to the bottom of the queue and the BLAST will take longer. Once the BLAST is complete, scroll down to the alignments and compare the hits. Look at the Expect value on each hit. A result with an Expect value smaller than 1e-5 is a good hit. Copy the information about the result (name of hit) into PhageNotes and record the Expect value under E value. Look at the Q and S numbers on the first hit. This is the alignment of your chosen hit. Record this in the format Q_:S_ after the name of the hit in PhageNotes. If no good hit was found, indicate this on the NCBI Blast Result Page.
5. Go to the PhagesDB BLASTp website http://phagesdb.org/blastp/ and paste the sequence into the box prompting for a FastA sequence.
6. Click the BLAST button and wait for the BLAST to finish. Scroll down to the alignments section. Look for the top hit with an E-value less than 1e-5. Record the data on this hit in the PhagesDB Blast Page on PhageNotes. If no good hit was found, indicate this on that page.
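The "good hit" rule used for both BLAST searches (the top hit with an E-value below 1e-5) and the Q_:S_ notation can be sketched as follows. This is a minimal illustration under stated assumptions: the hits are typed in by hand from the results page as dictionaries, the field names (`name`, `evalue`, `q`, `s`) are hypothetical, and the space between the hit name and the Q_:S_ part is a guess at the layout rather than a documented PhageNotes format.

```python
E_VALUE_CUTOFF = 1e-5  # threshold for a good hit, as given in the steps above


def best_good_hit(hits, cutoff=E_VALUE_CUTOFF):
    """Return the first hit, in listed order, with an E-value below the cutoff.

    Returns None when no hit qualifies, i.e. there is no good hit to record.
    """
    for hit in hits:
        if hit["evalue"] < cutoff:
            return hit
    return None


def format_hit(hit):
    """Build the text to record: hit name followed by Q_:S_."""
    return f"{hit['name']} Q{hit['q']}:S{hit['s']}"


# Hypothetical hits, for illustration only.
hits = [
    {"name": "gp5", "evalue": 0.2, "q": 3, "s": 7},
    {"name": "terminase", "evalue": 2e-40, "q": 1, "s": 1},
]
hit = best_good_hit(hits)
print(format_hit(hit) if hit else "No good hit")
```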
Step 4: Determining Function
1. Go to HHPred https://toolkit.tuebingen.mpg.de/hhpred and paste the product sequence used in the previous step into the input box. Click submit. HHPred runs several complex algorithms and predicts complex structures, so it will take several minutes for the algorithms to finish. DO NOT refresh during this time or the session will be killed.
2. Scroll down to the list of hits. Choose the top hit with an E-value < 1e-5. If no hit meets this criterion, then there is no good hit. If there was a hit, enter the information about the hit into PhageNotes. If there was no good hit, indicate that appropriately.
3. Go to the Function (F) tab on PhageNotes. If NCBI, PhagesDB, or HHPred predicted a function (that is, the listed function is not "function unknown"), select this function from the drop-down menu. If the function is not present in the menu, indicate this in the notes tab. If none of the three databases gives a known function, or the function does not appear in the drop-down list, select No Known Function (NKF).
4. Go to the Function Source (FS) tab. If a function was found, include sources (databases) that support that function call. If NKF was selected, include sources that support an unknown function call.
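The decision made in the Function and Function Source tabs can be written out as a small sketch. It is an illustration of the rule described above, not PhageNotes code: the function name, the dictionary of predictions, and the contents of the drop-down list are all hypothetical. When the databases disagree, this sketch simply takes the first listed prediction that appears in the drop-down list, which is an assumption; the steps above do not say how to resolve a conflict.

```python
NKF = "No Known Function (NKF)"


def choose_function(predictions, dropdown):
    """Return (function call, supporting sources, note).

    predictions: dict mapping a database name to its predicted function,
                 or to None / "function unknown" when it predicts nothing.
    dropdown:    functions available in the drop-down menu.
    """
    known = {src: fn for src, fn in predictions.items()
             if fn and fn.lower() != "function unknown"}
    for fn in known.values():
        if fn in dropdown:
            # Sources that support this function call.
            sources = [s for s, f in known.items() if f == fn]
            return fn, sources, ""
    # Sources that support an unknown-function call.
    unknown_sources = [s for s in predictions if s not in known]
    note = ""
    if known:
        # A function was predicted but is missing from the menu: note it.
        note = ("Predicted function not in drop-down list: "
                + ", ".join(sorted(set(known.values()))))
    return NKF, unknown_sources, note


# Hypothetical predictions and menu entries, for illustration only.
print(choose_function(
    {"NCBI": "terminase", "PhagesDB": "terminase", "HHPred": None},
    ["terminase", "lysin A"],
))
```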
Step 5: Copying Annotation to DNAMaster
1. Make sure that each section has been filled out completely in the PhageNotes Annotating Tool.
2. Go to the DNAMaster tab on PhageNotes and go to the annotation for the gene you have been working on. Here, the concatenated annotation in the format necessary for submission to GenBank will appear. Copy this annotation.
3. In DNAMaster, in the Features tab, go to the Description tab for the gene you have been annotating. In the box marked Notes, paste your annotation below the information regarding Glimmer and GeneMark. Click the Post button to save your changes.
You have completed your annotation!