Using SeqWeb - Protein Pattern Recognition

We are going to use the LacI sequence for this analysis. The crystal structure of LacI was determined in 1996, and the figure below is taken from the paper that reported it.

Figure 5. From Lewis et al., (1996) Science 271(5253):1247-1254. (A) View of the lac repressor-DNA complex monomer where the domains have been colored separately. The DNA helix-turn-helix binding domain is colored in red, the DNA binding hinge helix is shown in yellow, the NH2-terminal subdomain of the core is light blue, the COOH-terminal subdomain of the core is dark blue, and the helix of the tetramerization domain is purple. (B) Topology diagram of the lac repressor-DNA complex monomer. Helices are depicted as circles and strands as squares. The DNA binding domain consists of helix 1 (residues 6 to 12), helix 2 (17 to 25), helix 3 (32 to 45), and the hinge helix 4 (50 to 58) (red and yellow). The NH2-terminal subdomain of the core (light blue) is comprised of strand A (63 to 68), helix 5 (74 to 90), strand B (92 to 98), helix 6 (104 to 116), strand C (121 to 124), helix 7 (131 to 137), strand D (145 to 149), strand E (158 to 161), helix 13 (293 to 309), and strand K (316 to 318). The COOH-terminal subdomain of the core (dark blue) consists of helix 8 (164 to 175), strand F (182 to 185), helix 9 (192 to 205), strand G (214 to 217), helix 10 (222 to 233), strand H (240 to 244), helix 11 (247 to 259), strand I (269 to 274), helix 12 (279 to 281), strand J (287 to 290), and strand L (322 to 324). The helix of the tetramerization domain (purple) is helix 14 (340 to 357) and residues 358 to 360 are disordered. The linkers between the NH2-terminal and COOH-terminal core subdomains are highlighted in green. N-terminus, NH2-terminus; C-terminus, COOH-terminus.

We are using this sequence because many of its features are already known. Notably, there is a helix-turn-helix motif near the start of the protein. This motif is common in proteins that bind DNA, and some of the analyses below can predict it.

Helix Predictor

The HelicalWheel program plots a short region of a protein as a helical wheel, which helps you judge whether that region could form a helix. We can use this program with just the N-terminal region of LacI.

Load the sequence into SeqWeb, and run the HelicalWheel program. Your output will look something like this, although I have added amino acid numbers to make the results clearer.

The three residues in blue (G9, V10, S11) form the turn in the helix-turn-helix. However, the distribution of the other residues around the wheel is what you would expect for a helix.
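To make the idea of a helical wheel concrete, here is a minimal sketch of how residues are placed around the wheel. It assumes an ideal alpha helix (3.6 residues per turn, so 100 degrees per residue) and assumes the fragment is the 20-residue stretch that HTHScan reports for LacI, numbered from 1 within the fragment. The function name and the hydrophobic/polar split are illustrative choices, not part of SeqWeb.

```python
# Assumption: the fragment is the HTHScan hit for LacI (residues 6 to 25),
# numbered from 1 within the fragment.
FRAGMENT = "LYDVAEYAGVSYQTVSRVVN"

# Ideal alpha helix: 3.6 residues per turn, i.e. 100 degrees per residue.
DEGREES_PER_RESIDUE = 100

# Assumption: a simple hydrophobic set, chosen only for illustration.
HYDROPHOBIC = set("AVLIMFWYC")


def wheel_positions(seq, start=1):
    """Return (label, angle in degrees, is_hydrophobic) for each residue."""
    positions = []
    for i, aa in enumerate(seq):
        angle = (i * DEGREES_PER_RESIDUE) % 360
        positions.append((f"{aa}{start + i}", angle, aa in HYDROPHOBIC))
    return positions


if __name__ == "__main__":
    for label, angle, hydrophobic in wheel_positions(FRAGMENT):
        kind = "hydrophobic" if hydrophobic else "polar/other"
        print(f"{label:>4}  {angle:3d} deg  {kind}")
```

If hydrophobic residues cluster on one side of the wheel (a narrow range of angles), the region is a plausible amphipathic helix. With this numbering the glycine, valine, and serine fall at positions 9, 10, and 11 of the fragment.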

Helix-Turn-Helix Motifs

In addition to predicting individual helices, you can also predict helix-turn-helix motifs. The program used here, HTHScan, has the advantage over HelicalWheel that it can search the whole sequence.

  1. Use the full-length LacI sequence. To obtain it, add it from the database: click the sequence manager icon, then select Add and click Database, as highlighted below:
  2. Choose SWISSPROT from the pull-down menu, and enter the accession number: P03023
  3. Check the box next to the appropriate sequence and click Add selected:

  4. Now go back to SeqWeb and choose HTHScan:

  5. Select the protein, and click run:
  6. Your results should look like this:

 

HTHScan Results

HTHScan February 27, 2002 16:03

Weight matrix: htharac.dat Minimum score for H-T-Hs (threshold): 4.0
Sequence              Hit #  Score  Probability
laci_ecoli.swissprot  1      11.1   3.420E-03
> sequence: laci_ecoli
      name: laci_ecoli.swissprot  check: 1946  from: 1  to: 360

   1. 6 LYDVAEYAGVSYQTVSRVVN 25
      Score: 11.1
      Probability: 3.420E-03


  Input sequences searched: 1
  Number of sequences with predicted H-T-Hs: 1
  CPU time (sec): 0.22

This shows a predicted helix-turn-helix between positions 6 and 25, which matches helix 1 (residues 6 to 12) and helix 2 (17 to 25) in the crystal structure above.
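The report above mentions a weight matrix and a minimum score. As a rough illustration of what that kind of search involves, here is a minimal sketch of a sliding-window weight-matrix scan. The matrix below is a made-up three-column toy, not the contents of htharac.dat, and the function name scan is hypothetical. The scores it produces are not comparable to HTHScan scores.

```python
def scan(seq, matrix, threshold):
    """Slide the matrix along seq and return hits scoring >= threshold.

    Each hit is (start, end, window, score) with 1-based positions.
    matrix is a list of dicts, one per motif position, mapping a residue
    to its weight; residues not listed score 0.
    """
    width = len(matrix)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        score = sum(col.get(aa, 0.0) for col, aa in zip(matrix, window))
        if score >= threshold:
            hits.append((i + 1, i + width, window, score))
    return hits


# Hypothetical toy matrix: rewards G, then V or I, then S or T.
# These weights are invented for illustration only.
TOY_MATRIX = [
    {"G": 2.0},
    {"V": 1.5, "I": 1.5},
    {"S": 1.0, "T": 1.0},
]

if __name__ == "__main__":
    # The 20-residue hit from the report above, numbered from 1 here.
    for hit in scan("LYDVAEYAGVSYQTVSRVVN", TOY_MATRIX, threshold=4.0):
        print(hit)
```

A real matrix would have one column per position of the full motif, with weights derived from known helix-turn-helix sequences. The threshold plays the same role as the minimum score of 4.0 in the report: windows scoring below it are not listed.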

Coiled-Coil Motifs

CoilScan looks for coiled-coil motifs, regions of the protein that could be involved in leucine-zipper-type dimerization.

  1. Select CoilScan from the SeqWeb menu:

  2. Choose the LacI protein sequence, and set the window size to 14:

  3. Click run:
  4. The results will look like this:

You will also have a table that shows the probability that each amino acid is part of a coil.

Note that the program correctly predicts the C-terminal coil that is part of the tetramerization domain of LacI (helix 14, residues 340 to 357), as shown in the crystal structure above.
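To give a feel for what a window size of 14 and a per-residue table mean, here is a toy sketch of a sliding-window heptad scan. This is not the CoilScan algorithm, whose scoring is not described in this tutorial. It only assumes the textbook picture of a coiled coil: a seven-residue repeat in which the first and fourth positions (a and d) are usually hydrophobic. The hydrophobic set and the function names are illustrative assumptions.

```python
# Assumption: residues treated as hydrophobic core candidates.
HYDROPHOBIC = set("LIVMFA")
WINDOW = 14  # two heptads, matching the window size chosen above


def heptad_fraction(chunk, register):
    """Fraction of heptad a/d positions in chunk that are hydrophobic."""
    core = [aa for i, aa in enumerate(chunk) if (i + register) % 7 in (0, 3)]
    return sum(aa in HYDROPHOBIC for aa in core) / len(core)


def coil_profile(seq, window=WINDOW):
    """Per-residue toy score between 0 and 1.

    Each residue gets the best score of any window that covers it,
    where a window's score is its best heptad_fraction over all
    seven possible registers.
    """
    scores = [0.0] * len(seq)
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        best = max(heptad_fraction(chunk, r) for r in range(7))
        for j in range(start, start + window):
            scores[j] = max(scores[j], best)
    return scores


if __name__ == "__main__":
    # Synthetic example: three perfect heptads with L at positions a and d.
    demo = "LEELKKK" * 3
    for pos, (aa, score) in enumerate(zip(demo, coil_profile(demo)), start=1):
        print(f"{pos:3d} {aa} {score:.2f}")
```

The output has the same shape as the CoilScan table, one value per residue, but the values are simple fractions rather than probabilities. Run on a full protein sequence, a toy like this will flag many ordinary helices, which is why real programs use residue statistics from known coiled coils.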