FAQs list

Please send any questions to Wubin Qu [quwubin AT gmail.com] or Chenggang Zhang [zhangcg AT bmi.ac.cn].

  1. What is MFEprimer?
  2. MFEprimer is a program for checking the specificity of PCR primers based on multiple factors, including sequence similarity, stability at the 3' end of the primer, melting temperature, GC content, and number of binding sites. It helps the user select more suitable primers before running either standard or multiplex PCR reactions.

  3. Why do we need to check the specificity of PCR primers?
  4. Designing specific PCR primers is crucial for a successful PCR reaction. Typical failures include non-specific products, smeared bands, or no bands at all. Non-specific products are mostly caused by non-unique primers that amplify additional regions of the DNA template, so it is essential to check the specificity of primers before running PCR experiments.

  5. Why evaluate the specificity of PCR primers using multiple factors?
  6. Users usually run NCBI BLASTN to check primer specificity. However, considering only the sequence similarity between the primer and the DNA template is not sufficient. Other factors, such as the melting temperature (Tm) and the Gibbs free energy (∆G) of the primers, also affect the specificity of a PCR reaction. In addition, the number of binding sites, the GC content, and the number of predicted PCR products are major causes of PCR failure when a large genome, such as the human genome, is used as the DNA template.

  7. Why do we need to choose a database to run MFEprimer?
  8. As described in the FAQ on multiple-factor evaluation above, the databases are used to check the sequence similarity between the primers and the templates. The RefSeq databases are mounted as mRNA/cDNA templates, while the genomic DNA databases (reference assembly version) are mounted as genomic DNA templates.

    All of the databases are downloaded from the NCBI FTP site (ftp://ftp.ncbi.nih.gov). The user can select one database for analysis. Both the cDNA database and the genomic DNA (gDNA) database are mounted for analysis. The mitochondrial genomic DNA sequence is included in the gDNA database of the same species. The databases are named as follows (taking "Homo sapiens" as an example):

    H.sapiens - Genomic DNA

    H.sapiens - RefSeq mRNA

  9. What is the purpose of “General settings”?
  10. Please see the following FAQs (What is the "Amplicon size to display"?, What is the PPC and what is the PPC cutoff?, What is the E-value and Word size?) for details.

  11. What is the “Amplicon size to display”?
  12. In essence, MFEprimer runs a “virtual PCR” and predicts all the amplicons that a pair of primers could produce. When designing a pair of primers, the user should already know the size of the target amplicon. The user can therefore specify an expected amplicon size range to filter out amplicons whose sizes are far from that of the target (specific) amplicon(s).

  13. What is the PPC and what is the PPC cutoff?
  14. We define primer pair coverage (PPC) to score the ability of a primer pair (a forward primer coupled with a reverse primer) to bind to the DNA template, using the following formula:

    PPC = (Fm / Fl) × (Rm / Rl) × (1 - CVfr)

    Here, Fm and Rm are the lengths of the forward and reverse primer sequences that match the DNA template, and Fl and Rl are the full lengths of the forward and reverse primers, respectively. CVfr is the coefficient of variation of the matched lengths of the forward primer (Fm) and the reverse primer (Rm). The maximum value of PPC is 100%, which indicates a pair of primers of the same length that both bind to the template completely. See Fig. 1 and Fig. 2 below.

    The PPC cutoff is an empirical value used to filter out amplicons that are very unlikely to be produced in a PCR reaction. However, this empirical value may not suit all PCR conditions.

    Fig. 1 A predicted amplicon from MFEprimer with PPC = 100.0%

    Fig. 2 A predicted amplicon from MFEprimer with PPC = 28.8%
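
    The PPC score and cutoff described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes PPC to be the product (Fm / Fl) × (Rm / Rl) × (1 - CVfr), which is consistent with the definitions above, and it assumes CVfr is the sample standard deviation of Fm and Rm divided by their mean. The function names and the example amplicons are hypothetical and are not taken from MFEprimer itself.

```python
import statistics


def primer_pair_coverage(fm, fl, rm, rl):
    """Return PPC as a percentage for one predicted amplicon.

    fm, rm: matched lengths of the forward and reverse primers.
    fl, rl: full lengths of the forward and reverse primers.
    """
    if fl <= 0 or rl <= 0:
        raise ValueError("primer lengths must be positive")
    mean_match = (fm + rm) / 2
    if mean_match == 0:
        return 0.0  # neither primer binds at all
    # Assumption: CVfr = sample standard deviation / mean of (Fm, Rm).
    cv_fr = statistics.stdev([fm, rm]) / mean_match
    return (fm / fl) * (rm / rl) * (1 - cv_fr) * 100


def filter_by_ppc(amplicons, cutoff):
    """Keep amplicons (fm, fl, rm, rl) whose PPC is at least `cutoff` percent."""
    return [a for a in amplicons if primer_pair_coverage(*a) >= cutoff]


full = (20, 20, 20, 20)     # both primers bind completely
partial = (10, 20, 20, 20)  # forward primer binds over half its length
print(primer_pair_coverage(*full))
print(filter_by_ppc([full, partial], cutoff=30.0))
```

    With a cutoff of 30%, only the fully matched pair is kept in this example; the right cutoff in practice depends on the PCR conditions, as noted above.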

  15. What is the E-value and Word size?
  16. NCBI BLAST (Altschul et al., 1990) is an effective and efficient tool for sequence similarity searching. MFEprimer uses BLAST to check the sequence similarity between the primers and the DNA templates. E-value and word size are two BLAST parameters that have a major effect on BLAST performance. With a larger E-value or a smaller word size, BLAST reports more hits than with a smaller E-value or a larger word size. See the BLAST help page and "Altschul, S.F et al., (1990) Basic local alignment search tool, J Mol Biol, 215, 403-410." for details. Users who are not familiar with these parameters can ignore them.

  17. What is the purpose of “Advanced settings”?
  18. The nearest-neighbor (NN) model is used to calculate the melting temperature (from the entire binding length between primer and template, shown in green in Fig. 3) and the Gibbs free energy (∆G, from the last five residues at the 3' end of the primer, shown in red in Fig. 3). These settings are needed for the calculation.

    Fig. 3 Diagram of the calculation of Tm and Gibbs free energy. MFEprimer calculates the Tm value from the entire binding length of the primer (green), whereas the Gibbs free energy is calculated from the last five residues at the 3' end of the primer (red).

    Concentration of monovalent cations

    The millimolar concentration of salt (usually KCl) in the PCR. MFEprimer uses this argument to calculate primer melting temperatures. Default value is 50 mM.

    Concentration of divalent cations

    The millimolar concentration of divalent salt cations (usually MgCl2) in the PCR. MFEprimer converts the concentration of divalent cations to a concentration of monovalent cations using formula (7) from Ahsen et al., Clin Chem. 2001 Nov;47(11):1956-61. The default value is 1.5 mM.

    [Monovalent cations] = [Monovalent cations] + 120*(√([divalent cations] - [dNTP]))

    According to the formula, the concentration of deoxynucleotide triphosphates [dNTP] must be smaller than the concentration of divalent cations. The dNTP concentration is included in the formula because some magnesium is bound by the dNTPs. The resulting concentration of monovalent cations is used to calculate the oligo/primer melting temperature. See Concentration of dNTPs to specify the concentration of dNTPs.
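
    As a worked example of the conversion formula above, the following minimal Python sketch computes the monovalent-equivalent concentration from the three default settings (50 mM monovalent cations, 1.5 mM divalent cations, 0.25 mM dNTPs). The function name is hypothetical. Returning the monovalent value unchanged when no divalent cations are specified, and raising an error when [dNTP] is not smaller than the divalent concentration, are assumptions made for this illustration rather than documented MFEprimer behavior.

```python
import math


def effective_monovalent_mM(monovalent_mM=50.0, divalent_mM=1.5, dntp_mM=0.25):
    """Apply: [mono] = [mono] + 120 * sqrt([divalent] - [dNTP]), all in mM."""
    if divalent_mM == 0:
        # Assumption: with no divalent cations, the dNTP term is ignored.
        return monovalent_mM
    if dntp_mM >= divalent_mM:
        # The text requires [dNTP] to be smaller than [divalent cations].
        raise ValueError("dNTP concentration must be smaller than divalent concentration")
    return monovalent_mM + 120.0 * math.sqrt(divalent_mM - dntp_mM)


# Defaults: 50 + 120 * sqrt(1.5 - 0.25), roughly 184.16 mM
print(round(effective_monovalent_mM(), 2))
```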

    Concentration of dNTPs

    The millimolar concentration of deoxyribonucleotide triphosphate. This argument is considered only if Concentration of divalent cations is specified. Default value is 0.25 mM.

    Salt correction formula

    MFEprimer uses the salt correction formula (8) described in SantaLucia 1998, DOI:10.1073/pnas.95.4.1460, for the melting temperature calculation.

    Annealing Oligo Concentration

    The nanomolar concentration of annealing oligos in the PCR reaction. MFEprimer uses this argument to calculate primer melting temperatures. The default (50 nM) works well with the standard protocol used at the Whitehead/MIT Center for Genome Research: 0.5 microliters of 20 micromolar concentration for each primer oligo in a 20 microliter reaction with 10 nanograms of template, 0.025 units/microliter Taq polymerase in 0.1 mM each dNTP, 1.5 mM MgCl2, 50 mM KCl, and 10 mM Tris-HCl (pH 9.3), using 35 cycles with an annealing temperature of 56 degrees Celsius. This parameter corresponds to 'c' in equation (ii) of Rychlik, Spencer and Rhoads (Nucleic Acids Research, vol 18, num 21), where a suitable value (for a lower initial concentration of template) is "empirically determined". The value of this parameter is less than the actual concentration of oligos in the reaction because it is the concentration of annealing oligos, which in turn depends on the amount of template (including PCR product) in a given cycle. This concentration increases a great deal during a PCR; fortunately, PCR seems quite robust for a variety of oligo melting temperatures.
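
    To show how the salt and oligo concentrations above enter a nearest-neighbor Tm calculation, here is a minimal Python sketch. It assumes the commonly used two-state form Tm = ∆H / (∆S + R ln(C/4)) - 273.15 for a non-self-complementary duplex, together with an entropy salt correction of the form ∆S + 0.368 × (N - 1) × ln[Na+] as given in SantaLucia 1998. Whether MFEprimer implements exactly this form is not confirmed here. The ∆H and ∆S inputs are hypothetical totals that would normally come from summing nearest-neighbor parameters over the binding region.

```python
import math

R = 1.987  # gas constant in cal/(K*mol)


def melting_temperature_c(delta_h_kcal, delta_s_cal, n_bases,
                          monovalent_mM=50.0, oligo_nM=50.0):
    """Estimate Tm in degrees Celsius from duplex enthalpy and entropy.

    delta_h_kcal: total enthalpy in kcal/mol (negative for duplex formation).
    delta_s_cal:  total entropy in cal/(K*mol) at 1 M NaCl.
    n_bases:      length of the binding region in bases.
    """
    salt_molar = monovalent_mM / 1000.0
    # Assumed salt correction: dS + 0.368 * (N - 1) * ln[Na+]
    delta_s_corrected = delta_s_cal + 0.368 * (n_bases - 1) * math.log(salt_molar)
    oligo_molar = oligo_nM * 1e-9
    tm_kelvin = (delta_h_kcal * 1000.0) / (
        delta_s_corrected + R * math.log(oligo_molar / 4.0)
    )
    return tm_kelvin - 273.15


# Hypothetical thermodynamic totals for a 20-base binding region.
print(round(melting_temperature_c(-160.0, -430.0, 20), 1))
```

    In this form, raising either the monovalent cation concentration or the annealing oligo concentration raises the estimated Tm, which is why both settings are needed for the calculation.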

  19. What is the effect of "Number of binding sites", "Primer GC content", "Number of predicted PCR products" on PCR reaction?
  20. In the paper "Predicting failure rate of PCR in large genomes" (Reidar Andreson et al., 2008), the authors use statistical models to analyze the factors that cause PCR reactions to fail. The models show that the number of binding sites, the primer GC content, and the number of predicted PCR products are the three major factors causing PCR failure. See the paper for details. Figures A, B, and C below are adapted from this paper.

    The effect of "Number of binding sites" on PCR reaction

    The effect of "Primer GC content" on PCR reaction

    The effect of "Number of predicted PCR products" on PCR reaction

  21. FASTA format
  22. A sequence in FASTA format begins with a single-line description, followed by lines of sequence data. The description line is distinguished from the sequence data by a greater-than (">") symbol in the first column. The word following the ">" symbol is the identifier of the sequence, which must be unique, and the rest of the line is an optional description. There should be no space between the ">" and the first letter of the identifier. It is recommended that all lines of text be shorter than 80 characters. A sequence ends when another line starting with ">" appears; this indicates the start of another sequence. A simple example of four sequences in FASTA format (note that the identifier, such as Primer1, must be unique):

    >Primer1 The first primer
    CTGTTTAAGACTCACCCTGAGAC
    >Primer2 The second primer
    GGTGCAACCATGCTTCTTCA
    >Primer3 The third primer
    CTGCCGAGATCCAGCCTCTA
    >Primer4 The fourth primer
    GCATCTGCTCCAAAGTCCCC
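
    The FASTA rules above (a ">" header in the first column, no space before the identifier, unique identifiers) can be checked before submitting primers. The following is a minimal Python sketch of such a check. The function name and the error messages are hypothetical, and this is not the parser MFEprimer uses.

```python
def parse_fasta(text):
    """Return a list of (identifier, description, sequence) tuples.

    Raises ValueError on a missing or duplicated identifier, or on
    sequence data that appears before the first header line.
    """
    records = []
    seen = set()
    ident, desc, chunks = None, "", []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines between records
        if line.startswith(">"):
            if ident is not None:
                records.append((ident, desc, "".join(chunks)))
            header = line[1:]
            if not header or header[0].isspace():
                raise ValueError("identifier must directly follow '>'")
            parts = header.split(None, 1)
            ident = parts[0]
            desc = parts[1] if len(parts) > 1 else ""
            if ident in seen:
                raise ValueError("duplicate identifier: " + ident)
            seen.add(ident)
            chunks = []
        else:
            if ident is None:
                raise ValueError("sequence data before first header")
            chunks.append(line)
    if ident is not None:
        records.append((ident, desc, "".join(chunks)))
    return records


example = """>Primer1 The first primer
CTGTTTAAGACTCACCCTGAGAC
>Primer2 The second primer
GGTGCAACCATGCTTCTTCA
"""
for identifier, description, sequence in parse_fasta(example):
    print(identifier, len(sequence))
```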

References

Altschul, S.F., Gish, W., Miller, W., Myers, E.W. and Lipman, D.J. (1990) Basic local alignment search tool, J Mol Biol, 215, 403-410.

Rozen, S. and Skaletsky, H. (2000) Primer3 on the WWW for general users and for biologist programmers, Methods Mol Biol, 132, 365-386.