Primo Optimum 3.4 Help: Optimum For Gene Expression

About Primo Optimum 3.4: Online

Primo Optimum is designed for optimal expression of genes of interest in a host organism (e.g., E. coli, yeast, plants, or mammalian cell lines). Different species have different codon usage frequencies. For high-level expression in a host, optimizing the gene sequence to match the host's codon bias increases yield and is frequently necessary. Primo Optimum provides a tool for optimizing gene expression in any host species and designs primers for the one-step or stepwise PCR synthesis of the optimal gene. In addition, it checks for a number of factors that might be detrimental to gene expression.

For more PCR primer design tools (standard PCR, multiplex PCR, profile PCR, MSP PCR, degenerate PCR, etc.), please visit Primo Home.

Browser requirements:

Primo Online runs on the following Java-enabled browsers:
  • PC: Internet Explorer 5.5 or higher and Netscape 4.08 or higher.
  • Mac (OS 9.*): Internet Explorer 5.0
  • Mac (OSX10): Early versions of IE5.1 for OSX10 have a bug: copy and paste will crash the browser. The bug should have been fixed in later versions of Mac IE.

    If you use one of the above browsers and cannot run Primo Online, please make sure Java is enabled in your browser. For Internet Explorer, go to Tools/Internet Options, click on Security Settings, scroll down to find Microsoft VM, and deselect "Disable Java". Because of the Mac OSX10 IE5.1 bug, copy/paste into a Java text field does not work, so you will not be able to import a new sequence, and attempting copy/paste may crash your browser. Stand-alone versions don't require IE to run and thus don't have this problem.
How to use:

    To start using Optimum, copy and paste your sequences into the input window. On Windows you may need to use Ctrl-C (copy) and Ctrl-V (paste). On Macs you may need to use Apple-C (copy) and Apple-V (paste). Numbers and white spaces will be ignored.

    Both nucleotide and protein sequences can be used as input. Users don't need to specify whether the sequence consists of nucleotides or amino acids; the program determines the sequence type automatically. Note that degenerate codes for nucleotides are not accepted, since the protein sequence cannot be determined uniquely from a degenerate nucleotide sequence. (The program might mistake a nucleotide sequence for a protein sequence if it contains degenerate codes.)
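    The automatic type detection described above could work roughly as follows. This is a minimal sketch, not Optimum's actual logic: the function names and the rule "only A, C, G, T, or U means nucleotide" are assumptions made for illustration. It also shows why a degenerate code such as N or R causes a nucleotide sequence to be read as protein.

```python
import re

# Assumption: anything made up solely of these letters is treated as nucleotide.
NUCLEOTIDES = set("ACGTU")


def clean_sequence(raw: str) -> str:
    """Drop digits and white space, which the input window is said to ignore."""
    return re.sub(r"[\d\s]", "", raw).upper()


def guess_sequence_type(raw: str) -> str:
    """Return 'nucleotide' if every residue is A, C, G, T, or U; otherwise 'protein'."""
    seq = clean_sequence(raw)
    if not seq:
        raise ValueError("empty sequence")
    return "nucleotide" if set(seq) <= NUCLEOTIDES else "protein"


if __name__ == "__main__":
    print(guess_sequence_type("1 atgaaa ggc 10"))  # plain DNA with numbering
    print(guess_sequence_type("MKTAYIAK"))         # amino acids
    print(guess_sequence_type("ATGNNRAAA"))        # degenerate DNA, misread as protein
```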

    For gene synthesis without optimizing codon usage, check "Disable optimization" or input your DNA sequences in the "optimized sequence" window (and leave the top-left window blank).

    Engineer restriction sites:

    Optimum can be used to add/remove restriction sites from the optimized sequences without altering the protein sequences encoded (silent mutation).

  • Remove restriction sites: remove the selected restriction sites from the full length optimized sequences.
  • Add restriction sites: add a selected restriction site in a user-specified region. Users are advised to check first with "any" restriction sites to obtain a list of restriction sites that will not alter the protein sequences encoded. Restriction sites from the list may be added one at a time.

    Figure 1. Engineering restriction sites (silent mutation)

    Step 1. Select a region where you would like to add a restriction site. You may choose the region where two primers overlap, or input specific positions (e.g. 50-100 for nucleotide positions 50 to 100). The position input will be ignored if either of the primer choices is not "N/A." In other words, to input specific positions, users need to select "N/A" for the two overlapping primers.

    Step 2. Check "Add restriction site" and select "any" from the restriction enzyme list. Press "Check" to find a list of restriction sites that can be added without altering the protein sequences encoded.

    Step 3. Uncheck "any" and select one restriction site. Press "Check" and then "Confirm." You may also need to remove any restriction sites outside the region you selected. To do so, simply check "Remove restriction site" and select the enzyme name from the enzyme list before pressing "Check."
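    The idea behind adding a site by silent mutation can be sketched as follows. This is an illustration only and not necessarily how Optimum searches: it assumes the standard genetic code, an in-frame coding sequence, and 0-based positions, and the function names are hypothetical. For each window in the chosen region it tries every combination of synonymous codons and keeps the first one that spells the recognition site.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}
SYNONYMS = {}
for _codon, _aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(_aa, []).append(_codon)


def translate(cds):
    """Translate an in-frame coding sequence into one-letter amino acids."""
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def add_site_silently(cds, site, start, end):
    """Try to create `site` inside cds[start:end] using only synonymous codons.

    Returns the modified sequence, or None if no silent change produces the site.
    """
    if len(cds) % 3:
        raise ValueError("coding sequence must be a multiple of 3 long")
    end = min(end, len(cds))
    for pos in range(start, end - len(site) + 1):
        first = pos // 3                      # first codon touched by the site
        last = (pos + len(site) - 1) // 3     # last codon touched by the site
        codons = [cds[3 * i:3 * i + 3] for i in range(first, last + 1)]
        choices = [SYNONYMS[CODON_TO_AA[c]] for c in codons]
        offset = pos - 3 * first
        for combo in product(*choices):
            window = "".join(combo)
            if window[offset:offset + len(site)] == site:
                return cds[:3 * first] + window + cds[3 * (last + 1):]
    return None


if __name__ == "__main__":
    original = "ATGGAGTTTAAA"                                # encodes M E F K
    print(add_site_silently(original, "GAATTC", 0, 12))      # EcoRI site
```

    Removing a site works the same way in reverse: look for a synonymous codon combination in which the recognition sequence no longer appears.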

    Behind the scenes:

    1. Diagnosis

    To run the diagnosis function, you must input a nucleotide sequence in the top-left window. The program will find the longest open reading frame and check for a number of features such as the Kozak translation initiation sequence, the Shine-Dalgarno sequence, rare codons, etc. It is not intended as an extensive diagnosis; rather, it serves as a checklist to remind researchers of optimization options.
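    Finding the longest open reading frame can be sketched as below. The sketch makes simplifying assumptions that may not match the program: it scans only the three forward frames, treats ATG as the only start codon, and requires a stop codon. The function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(dna):
    """Return (start, end) of the longest ATG...stop ORF on the forward strand.

    Positions are 0-based, `end` is exclusive and includes the stop codon.
    Returns None when no complete ORF is found.
    """
    dna = dna.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None                   # close it and keep scanning
    return best


if __name__ == "__main__":
    sequence = "GGAGG" + "ATGAAAGGCTAA" + "CCC"   # 5' UTR + ORF + 3' UTR
    start, end = longest_orf(sequence)
    print(start, end, sequence[start:end])
```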

    2. Optimum reverse translation

    The Optimum program will reverse translate a protein sequence to a nucleotide sequence using the "optimal codons," i.e., the codons with the highest usage frequencies in the host.

    If a nucleotide sequence is the input, the longest open reading frame will be determined, and the ORF will be reverse translated. 5' and 3' untranslated regions will not be optimized.
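    As a minimal sketch of reverse translation with optimal codons (not Optimum's actual code), the following picks, for each amino acid, the codon with the highest fraction. The `USAGE` dictionary and the function name are hypothetical; the fractions are copied from a few rows of the codon usage table shown in Figure 3, so only those amino acids are covered.

```python
# Fraction of each synonymous codon, taken from a few rows of the Figure 3 table.
USAGE = {
    "M": {"ATG": 1.00},
    "G": {"GGG": 0.16, "GGA": 0.34, "GGT": 0.21, "GGC": 0.29},
    "E": {"GAG": 0.65, "GAA": 0.35},
    "K": {"AAG": 0.52, "AAA": 0.48},
    "D": {"GAT": 0.44, "GAC": 0.56},
}


def reverse_translate(protein, usage):
    """Replace each amino acid with its most frequently used codon."""
    codons = []
    for aa in protein.upper():
        fractions = usage[aa]                       # KeyError if the residue is not in the table
        codons.append(max(fractions, key=fractions.get))
    return "".join(codons)


if __name__ == "__main__":
    print(reverse_translate("MGEKD", USAGE))
```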

    3. Gene synthesis

    To design PCR primers for gene synthesis, you may input a protein or DNA sequence in the top-left window and let the program determine the optimal sequence. Alternatively, leave the top-left window blank and input or edit your optimal sequence in the window labeled "Optimal DNA Sequence:".

    The program will find a series of overlapping PCR primers for the one-step or stepwise PCR synthesis of genes. The program will not check for primer-primer dimers, since the templates (the primers also serve as templates) are present at much higher concentration.

    There are three options for gene synthesis:

  • 5' Extension --- gene synthesis will start from the 3'-end and extend towards the 5' end.
  • 3' Extension --- gene synthesis will start from the 5'-end and extend towards the 3' end.
  • Alternating --- gene synthesis may start in both directions.

    Shown in Figure 2 are several experimental designs for gene synthesis. Note the orientations of the primers are different for different experimental designs.

    Figure 2. PCR gene synthesis uses a series of overlapping PCR primers, which also serve as templates. In the one-step approach, the outer primers are at a higher concentration than the inner primers. As a result, the full-length gene is eventually synthesized. Primers designed this way may also be used to synthesize gene fragments using two or three primers at a time, and the PCR product can then be elongated by additional rounds of PCR synthesis or by ligation. Note that the orientations of the primers are different for different experimental designs.
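    The general idea of covering a gene with overlapping primers can be sketched as follows. This is only an illustration under stated assumptions: it uses a fixed oligo length and a fixed overlap and simply alternates between sense and antisense strands, whereas Optimum chooses overlaps by melting temperature and varies primer orientation with the selected extension option. The function names and default values are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tile_oligos(gene, oligo_len=40, overlap=20):
    """Split `gene` into overlapping oligos, alternating sense and antisense.

    Consecutive oligos share `overlap` bases; the last oligo may be shorter
    than `oligo_len` but is always longer than `overlap`.
    """
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and smaller than oligo_len")
    step = oligo_len - overlap
    oligos = []
    start = 0
    index = 0
    while True:
        piece = gene[start:start + oligo_len]
        # Even-numbered oligos are sense strand, odd-numbered are antisense.
        oligos.append(piece if index % 2 == 0 else reverse_complement(piece))
        if start + oligo_len >= len(gene):
            break
        start += step
        index += 1
    return oligos


if __name__ == "__main__":
    gene = "ACGT" * 25   # placeholder 100-nt sequence
    for number, oligo in enumerate(tile_oligos(gene), start=1):
        print(number, oligo)
```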

    4. Host species

    Two species (human and E. coli) are available in Optimum Online. More species are available in the stand-alone version (see below). Users can add more host species in the stand-alone version, provided that the codon usage frequencies and genetic codes are known for the hosts.

    5. Codon usage

    The codon usage frequency relative to that of the optimal codon is shown as a diagram. Click on the diagram to show the sequence beginning at the clicked position. Note that the diagram shows the relative codon usage frequencies, while the absolute frequencies are used to determine rare codons in the diagnosis mode. The two views complement each other.
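    The relative frequency plotted in the diagram can be understood as each codon's frequency divided by that of the most used synonymous codon. A small sketch, using the glycine fractions from the Figure 3 table (the function name is hypothetical, and the rounding to two decimals is only for display):

```python
def relative_usage(fractions):
    """Scale synonymous codon frequencies so the optimal codon equals 1.0."""
    best = max(fractions.values())
    return {codon: round(value / best, 2) for codon, value in fractions.items()}


# Glycine fractions from the Figure 3 table.
GLY = {"GGG": 0.16, "GGA": 0.34, "GGT": 0.21, "GGC": 0.29}

if __name__ == "__main__":
    print(relative_usage(GLY))
```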

    6. Add new host species

    To add a new species, users need to download a codon usage table from the Codon Usage Database. Select the species, then the genetic code and the GCG style (Figure 2). The codon usage and genetic code will be combined into one table as shown in Figure 3. Copy the table (without the first row of column names) and use it as the input for the "Add Host" function of Primo Optimum. Alternatively, users may edit the "hostdata.txt" file in the "codontable" folder.
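    For users who want to inspect such a table before pasting it in, the sketch below reads rows in the layout of Figure 3. It assumes the five columns are amino acid, codon, count, frequency per thousand, and fraction among synonymous codons; the function name and dictionary keys are hypothetical, and this is not the parser used by "Add Host".

```python
def parse_gcg_table(text):
    """Parse GCG-style codon usage rows into a dict keyed by codon."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue                      # skip blank or malformed lines
        amino, codon, count, per_thousand, fraction = fields
        try:
            table[codon.upper()] = {
                "amino_acid": amino,
                "count": float(count),
                "per_thousand": float(per_thousand),
                "fraction": float(fraction),
            }
        except ValueError:
            continue                      # skip a header row of column names
    return table


# A few rows copied from the Figure 3 table.
SAMPLE = """\
Gly     GGG      7215.00     10.37      0.16
Gly     GGA     15218.00     21.86      0.34
Met     ATG     18183.00     26.12      1.00
End     TGA       756.00      1.09      0.49
"""

if __name__ == "__main__":
    for codon, row in parse_gcg_table(SAMPLE).items():
        print(codon, row)
```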

    Figure 2. Codon usage and the genetic codes can be downloaded from the Codon Usage Database. Primo Optimum uses the GCG format.

    Gly     GGG      7215.00     10.37      0.16
    Gly     GGA     15218.00     21.86      0.34
    Gly     GGT      9675.00     13.90      0.21
    Gly     GGC     12935.00     18.58      0.29
    Glu     GAG     27761.00     39.88      0.65
    Glu     GAA     15275.00     21.95      0.35
    Asp     GAT     15788.00     22.68      0.44
    Asp     GAC     19762.00     28.39      0.56
    Val     GTG     18841.00     27.07      0.44
    Val     GTA      4443.00      6.38      0.10
    Val     GTT      8958.00     12.87      0.21
    Val     GTC     10109.00     14.52      0.24
    Ala     GCG      6471.00      9.30      0.14
    Ala     GCA     10998.00     15.80      0.24
    Ala     GCT     13625.00     19.57      0.30
    Ala     GCC     14418.00     20.71      0.32
    Arg     AGG      6893.00      9.90      0.18
    Arg     AGA      9336.00     13.41      0.25
    Ser     AGT      9043.00     12.99      0.15
    Ser     AGC     14337.00     20.60      0.24
    Lys     AAG     20053.00     28.81      0.52
    Lys     AAA     18383.00     26.41      0.48
    Asn     AAT     11144.00     16.01      0.38
    Asn     AAC     18365.00     26.38      0.62
    Met     ATG     18183.00     26.12      1.00
    Ile     ATA      4800.00      6.90      0.15
    Ile     ATT     11105.00     15.95      0.34
    Ile     ATC     17031.00     24.47      0.52
    Thr     ACG      5748.00      8.26      0.15
    Thr     ACA     11238.00     16.15      0.29
    Thr     ACT      9742.00     14.00      0.25
    Thr     ACC     12646.00     18.17      0.32
    Trp     TGG      8046.00     11.56      1.00
    End     TGA       756.00      1.09      0.49
    Cys     TGT      7956.00     11.43      0.47
    Cys     TGC      9101.00     13.08      0.53
    End     TAG       249.00      0.36      0.16
    End     TAA       542.00      0.78      0.35
    Tyr     TAT      8627.00     12.39      0.40
    Tyr     TAC     12961.00     18.62      0.60
    Leu     TTG      8148.00     11.71      0.13
    Leu     TTA      4231.00      6.08      0.07
    Phe     TTT     11678.00     16.78      0.44
    Phe     TTC     14848.00     21.33      0.56
    Ser     TCG      4671.00      6.71      0.08
    Ser     TCA      9158.00     13.16      0.15
    Ser     TCT     11737.00     16.86      0.19
    Ser     TCC     12019.00     17.27      0.20
    Arg     CGG      4970.00      7.14      0.13
    Arg     CGA      4720.00      6.78      0.13
    Arg     CGT      4659.00      6.69      0.12
    Arg     CGC      7152.00     10.28      0.19
    Gln     CAG     23603.00     33.91      0.73
    Gln     CAA      8524.00     12.25      0.27
    His     CAT      7392.00     10.62      0.39
    His     CAC     11553.00     16.60      0.61
    Leu     CTG     24932.00     35.82      0.41
    Leu     CTA      4105.00      5.90      0.07
    Leu     CTT      8256.00     11.86      0.13
    Leu     CTC     11817.00     16.98      0.19
    Pro     CCG      6863.00      9.86      0.17
    Pro     CCA     11022.00     15.84      0.27
    Pro     CCT     11781.00     16.93      0.29
    Pro     CCC     10427.00     14.98      0.26
    Figure 3. Copy the data in the format shown above and use it as input for the "Add Host" function of Primo Optimum.

    7. Melting temperature and annealing temperature

    For Optimum, the melting temperature is determined from the regions where the primers overlap. Melting temperature is the temperature at which 50% of the oligo and its perfect complement are in duplex. A PCR annealing temperature a few degrees (4-6) lower than the melting temperature is usually used to increase the probability of primer binding. There are two options for calculating the melting temperature. The first uses the simple rule of 2 degrees for each A or T and 4 degrees for each C or G.

       Melting temperature = 4 * Number of G or C + 2 * Number of A or T.
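    The simple rule translates directly into code. In this short sketch the function name is hypothetical, and the annealing range simply subtracts the 4-6 degrees mentioned above:

```python
def tm_simple(oligo):
    """Melting temperature by the 4 * (G or C) + 2 * (A or T) rule, in degrees C."""
    oligo = oligo.upper()
    gc = oligo.count("G") + oligo.count("C")
    at = oligo.count("A") + oligo.count("T")
    return 4 * gc + 2 * at


if __name__ == "__main__":
    overlap = "ATGCATGCATGCATGCATGC"          # 10 G/C and 10 A/T
    tm = tm_simple(overlap)
    print("Tm:", tm)
    print("Annealing range:", tm - 6, "to", tm - 4)
```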

    The second, "Nearest N", predicts the melting temperature using the "Nearest Neighbor" model (John SantaLucia, Proc. Natl. Acad. Sci. Vol. 95, p1460-1465 (1998)). The cation concentration is assumed to be 50 mM and the primer concentration is assumed to be 200 nM. "Nearest N" is offered because it is more accurate, and other formulae can be viewed as approximations of it.
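    A nearest-neighbor calculation can be sketched as follows. This is not Optimum's implementation. The enthalpy and entropy values are the unified parameters as commonly tabulated from the SantaLucia (1998) paper and should be checked against it before use. The sketch further assumes a non-self-complementary oligo, the paper's salt correction to entropy, and that the 200 nM primer concentration is the total strand concentration; all names are hypothetical.

```python
import math

# (dH in kcal/mol, dS in cal/(mol*K)) for each nearest-neighbor pair, 5'->3'.
NN = {
    "AA": (-7.9, -22.2), "AT": (-7.2, -20.4), "TA": (-7.2, -21.3),
    "CA": (-8.5, -22.7), "GT": (-8.4, -22.4), "CT": (-7.8, -21.0),
    "GA": (-8.2, -22.2), "CG": (-10.6, -27.2), "GC": (-9.8, -24.4),
    "GG": (-8.0, -19.9),
}
# Initiation terms, applied once for each terminal base pair.
INIT = {"G": (0.1, -2.8), "C": (0.1, -2.8), "A": (2.3, 4.1), "T": (2.3, 4.1)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")
GAS_CONSTANT = 1.987  # cal/(mol*K)


def tm_nearest_neighbor(oligo, na_molar=0.05, primer_molar=200e-9):
    """Estimate Tm in degrees C for a non-self-complementary DNA oligo."""
    seq = oligo.upper()
    dh = ds = 0.0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair not in NN:
            # The same stack read from the opposite strand, e.g. TT -> AA.
            pair = pair.translate(COMPLEMENT)[::-1]
        h, s = NN[pair]
        dh += h
        ds += s
    for end in (seq[0], seq[-1]):
        h, s = INIT[end]
        dh += h
        ds += s
    # Salt correction to entropy; len(seq) - 1 approximates phosphates / 2.
    ds += 0.368 * (len(seq) - 1) * math.log(na_molar)
    kelvin = 1000.0 * dh / (ds + GAS_CONSTANT * math.log(primer_molar / 4.0))
    return kelvin - 273.15


if __name__ == "__main__":
    print(round(tm_nearest_neighbor("AGCGGATAACAATTTCACACAGGA"), 1))
```

    Unlike the 4 + 2 rule, this estimate depends on the order of the bases and on the salt and primer concentrations, which is why the two options can give noticeably different values for the same overlap.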

    Species available in the stand-alone version

    Species Name                Common Name
    Arabidopsis thaliana        Arabidopsis
    Zea mays                    Corn
    Bos taurus                  Cow
    Drosophila melanogaster     Drosophila
    E. coli                     E. coli
    Xenopus laevis              Frog
    Homo sapiens                Human
    Mus musculus                Mouse
    Rattus norvegicus           Rat
    Oryza sativa                Rice
    Saccharomyces cerevisiae    Yeast
    Danio rerio                 Zebrafish
  • Copyright 2002-2004 Chang Bioscience, Inc. All rights reserved.