Using VectorNTI Part 2


In this module we will learn some of the advanced functions of the VectorNTI package. We will begin by translating a nucleotide sequence and then use BLAST to compare the translated sequence to known sequences. Finally, we will use AlignX to align the sequences that are returned.

Translating a nucleotide sequence

Open the BaculoDirect Linear DNA sample that is 139370 bp long.

Choose the last CDS that has introns, then right-click it and choose Translate -> Feature into New Protein. Select the sub-base in which to save the protein and click OK. The protein display view will open.

The protein display view is similar to the nucleotide display with which we have worked previously.
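
For readers who want to see what the Translate command does to the sequence, here is a minimal Python sketch of translating a coding sequence codon by codon. It assumes the standard genetic code and a sequence that starts in frame. The function name `translate` and the example sequence are illustrative and are not part of VectorNTI.

```python
from itertools import product

# Standard genetic code, listed in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna, to_stop=True):
    """Translate an in-frame DNA sequence into one-letter amino acid codes."""
    dna = dna.upper()
    protein = []
    # Step through complete codons only; one or two trailing bases are ignored.
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # "X" marks an unrecognised codon
        if aa == "*" and to_stop:
            break  # stop at the first stop codon
        protein.append(aa)
    return "".join(protein)

# Illustrative sequence, not taken from the BaculoDirect sample.
print(translate("ATGGCCAAGTGGTAA"))  # MAKW
```

Passing `to_stop=False` keeps translating through stop codons and shows them as `*`, which is useful when checking a feature that may contain unspliced introns.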

Backtranslation

VectorNTI can back-translate from a protein to a suggested nucleotide sequence. Because the genetic code is degenerate (most amino acids are encoded by more than one codon), VectorNTI gives you choices of both the codon usage table and the degree of redundancy to allow in the sequence.

To back-translate a sequence, choose BackTranslate from the Analyze menu. This will open the back-translation window:

There are several things to note about this window:

  • You can select the genetic code from the pull-down menu.
  • You can alter the amount of redundancy by moving the sliding bar between most ambiguous and most probable.
  • You can toggle between 1-letter and 3-letter amino acid codes.
  • You can copy all, or part, of the nucleotide sequence.
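
The two ends of the redundancy slider can be sketched in Python as follows. This is only an illustration of the idea, assuming the standard genetic code: with no usage table every synonymous codon is kept (the "most ambiguous" end), and with a usage table only the most frequent codon is kept (the "most probable" end). The function name `back_translate` and the `toy_usage` frequencies are hypothetical; VectorNTI's own tables and algorithm may differ.

```python
from itertools import product

# Standard genetic code, listed in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Invert the code: amino acid -> list of synonymous codons.
SYNONYMS = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    SYNONYMS.setdefault(aa, []).append("".join(codon))

def back_translate(protein, usage=None):
    """Return one list of candidate codons per residue.

    usage=None keeps every synonymous codon (most ambiguous).
    A codon -> frequency mapping keeps only the most frequent codon
    (most probable). Unknown residue letters raise KeyError.
    """
    choices = []
    for aa in protein.upper():
        codons = list(SYNONYMS[aa])
        if usage is not None:
            codons = [max(codons, key=lambda c: usage.get(c, 0.0))]
        choices.append(codons)
    return choices

# Hypothetical usage frequencies, for illustration only.
toy_usage = {"CTG": 0.40, "TTA": 0.07, "AAG": 0.58, "AAA": 0.42}

print(back_translate("MLK"))             # all synonymous codons per residue
print(back_translate("MLK", toy_usage))  # [['ATG'], ['CTG'], ['AAG']]
```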

BLAST search in VectorNTI

Now we will compare this protein sequence to the NR database using BLAST:

  1. From the Tools menu, select BLAST Search.
  2. Check the whole sequence box and click OK.
  3. Choose the NCBI-BLAST server and click OK.
  4. The search window will open:
      Note that you can choose the program to run (usually BLAST2.0), the type of search (nucleotide or protein), and the database that you want to search.

      You may also edit all the parameters in the BLAST search by clicking on the Parameters and Matrix tabs in this window.

  5. Once you have set your conditions, press the Submit button.
      Note that the display in the lower pane changes to show the status of the search:
    • Initially it will say that the search is Not ready and is in progress
    • Next, it will say that the results are being retrieved
    • Finally, it will display Finished, and you may view the results
  6. Double-click the search icon to open the results.
  7. Note that there are several panes in this window:

    1. Results summary. The pane on the left contains a summary of both the query sequence and the results. Click on any result to show its alignment and an overview of the results. The folders contain summaries of all the data, including the degree of similarity and the E value.
    2. The right pane is separated into three views:
      • The sequence profile/hit distribution. This view contains a summary of the query sequence. The sequence profile, shown in green, represents how well each base or amino acid is conserved across all the hits. The hit distribution, shown in blue, is a count of the hits at each position. Hence, a region that is not very well conserved but is similar to a lot of proteins will have a low green peak and a high blue peak; a region that is well conserved over a few proteins will have a high green peak and a low blue peak; and any combination of these is possible.
      • The alignment profile. The regions of similarity between the query sequence (top line) and the database sequence (bottom line) are highlighted. If a protein matches in more than one spot (as shown in the example above), the active alignment is highlighted and the others are greyed out. You may select the alternate alignments by clicking either in this pane or on the folders on the left-hand side.
      • The hit overview (hit map graph). This shows where each of the database proteins is similar to the query sequence. It is clickable: any of the lines may be clicked to highlight that alignment.
    3. At the bottom is the sequence alignment between the query sequence and the active database sequence. Conserved residues are normally shown as red characters on a yellow background, and similar residues as black characters on a green background. Note that you can alter the appearance of this, or any of the panes, by right-clicking and choosing Display Setup.
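
The hit distribution described above is essentially a per-position coverage count. Here is a minimal Python sketch, assuming each hit has been reduced to one or more (start, end) alignment ranges on the query, 1-based and inclusive. The function name `hit_distribution` and the example ranges are hypothetical, and VectorNTI's own calculation may differ.

```python
def hit_distribution(query_length, hits):
    """Count how many hits cover each query position.

    hits: list of hits, each a list of (start, end) ranges on the query,
    1-based and inclusive. A hit is counted once per position even if
    its own ranges overlap.
    """
    counts = [0] * query_length
    for ranges in hits:
        covered = set()
        for start, end in ranges:
            # Clip each range to the query so stray coordinates are ignored.
            covered.update(range(max(start, 1), min(end, query_length) + 1))
        for pos in covered:
            counts[pos - 1] += 1
    return counts

example_hits = [
    [(1, 4)],           # hit matching a single region
    [(3, 6)],           # hit matching a single region
    [(1, 2), (7, 8)],   # one hit matching two separate regions
]
print(hit_distribution(8, example_hits))  # [2, 2, 2, 2, 1, 1, 1, 1]
```

A hit with more than one range, like the third one here, corresponds to a database protein that matches the query in more than one spot.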

    Notice that the sequence has 39 hits, and many of them appear to be repeats. This is because, when we translated the CDS, we did not splice out the introns, so intron sequence was translated as if it were coding sequence.

    We will now repeat this analysis removing the introns.

    1. Begin at the BaculoDirect Linear DNA sequence view
    2. Highlight the same ORF and select Translate -> CDS with Splicing into New Protein.
    3. Another protein window will appear. This is a smaller protein (as we have removed the introns) and we can now BLAST that against the database as before.
    4. Click Tools -> BLAST Search and click OK in the next two boxes.
    5. Click Submit to perform the BLAST search.
    6. When the results come back, open the results to view them.
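
The difference between the two translations can be illustrated with a short Python sketch: translating a feature straight through reads the intron as if it were coding sequence, whereas joining the exons first gives the spliced product. The toy gene, the exon coordinates and the function names are hypothetical and are not taken from the BaculoDirect sample; the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, listed in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate complete codons; stop codons are shown as '*'."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def splice(dna, exons):
    """Join exon ranges given as 1-based, inclusive (start, end) pairs."""
    return "".join(dna[start - 1:end] for start, end in exons)

# Toy gene: exon 1 (ATGGCC), an intron (GTAAGT), exon 2 (AAGTGGTAA).
gene = "ATGGCC" + "GTAAGT" + "AAGTGGTAA"
exons = [(1, 6), (13, 21)]

print(translate(gene))                 # MAVSKW*  (intron read as coding)
print(translate(splice(gene, exons)))  # MAKW*    (spliced product)
```

As in the tutorial, the spliced product is the shorter of the two because the intron no longer contributes residues.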

    Notice now that there are 53 hits, and each hit matches only a single region of the query protein. In the previous search, several hits matched multiple regions of the protein.

    Also notice that the sequence profile and hit distribution have changed, suggesting that the results of the second search are more meaningful than those of the first.
