The real-time reverse transcription polymerase chain reaction (RT-PCR) uses fluorescent reporter molecules to monitor the production of amplification products during each cycle of the PCR. This combines the nucleic acid amplification and detection steps into one homogeneous assay and obviates the need for gel electrophoresis to detect amplification products. Use of appropriate chemistries and data analysis eliminates the need for Southern blotting or DNA sequencing for amplicon identification. Its simplicity, specificity and sensitivity, together with its potential for high throughput and the ongoing introduction of new chemistries, more reliable instrumentation and improved protocols, have made real-time RT-PCR the benchmark technology for the detection and/or comparison of RNA levels.
The real-time, fluorescence-based reverse transcription polymerase chain reaction (RT-PCR) is one of the enabling technologies of the genomic age and has become the method of choice for the detection of mRNA (Bustin 2000). Several factors have contributed to the transformation of this technology into a mainstream research tool: (i) as a homogeneous assay it avoids the need for post-PCR processing; (ii) a wide (>10⁷-fold) dynamic range allows straightforward comparison between RNAs that differ widely in their abundance; and (iii) the assay realises the inherent quantitative potential of the PCR, making it a quantitative as well as a qualitative assay (Ginzinger 2002). The recent focus on nucleic acid quantification, together with the introduction of second-generation instrumentation and alternative chemistries, has facilitated the migration of this technology into individual research laboratories. This has resulted in its extensive application to functional genomics studies, molecular medicine, forensics, virology, microbiology and biotechnology (http://www.gene-quantification.info/).
It would be reasonable to presume that such widespread penetration is the result of an established and standardised technology. However, this is not so and many limitations of this assay are the same as those recorded for conventional endpoint RT-PCR (Bustin 2002). Furthermore, some of these problems have been exacerbated by the quantitative aspirations of this technology, not least due to the intimate association between quantification and amplification efficiency (Pfaffl 2001, Pfaffl et al. 2002). Worryingly, the extent of the unreliability of quantitative RT-PCR data, and its effect on their biological validity, is still not widely appreciated or acknowledged. One example is the threshold cycle (Ct), which records the cycle when sample fluorescence exceeds a chosen threshold above background fluorescence. The Ct is used for quantifying target copy number, yet its value is entirely subjective, as the threshold can be altered at will (Bustin & Nolan 2004). This brief perspective deals with the RT and data normalisation steps, two fundamental issues in need of urgent consideration. Their unpredictability constitutes a serious obstacle to the usefulness of this assay as an accurate and meaningful description of real-life mRNA levels. Other examples are exhaustively described elsewhere (Bustin 2004).
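The dependence of Ct on the chosen threshold can be illustrated with a minimal sketch. It assumes an idealised logistic amplification curve and linear interpolation between cycles; the function names, curve parameters and threshold values are hypothetical and are not taken from any instrument's software.

```python
import math

def simulated_curve(n_cycles=40, midpoint=25.0, slope=1.5, plateau=1.0, baseline=0.01):
    """Illustrative fluorescence readings for cycles 1..n_cycles (logistic model, assumed)."""
    return [baseline + plateau / (1.0 + math.exp(-(c - midpoint) / slope))
            for c in range(1, n_cycles + 1)]

def threshold_cycle(fluorescence, threshold):
    """Fractional cycle at which fluorescence first crosses the threshold.

    Interpolates linearly between the two flanking cycles and returns None
    if the threshold is never reached.
    """
    for i in range(1, len(fluorescence)):
        lo, hi = fluorescence[i - 1], fluorescence[i]
        if lo < threshold <= hi:
            # Cycles are 1-based: fluorescence[i - 1] is the reading at cycle i.
            return i + (threshold - lo) / (hi - lo)
    return None

curve = simulated_curve()
for thr in (0.05, 0.10, 0.20):
    # The same reaction yields a different Ct for each arbitrary threshold.
    print(f"threshold={thr:.2f}  Ct={threshold_cycle(curve, thr):.2f}")
```

The same simulated reaction returns a later Ct each time the threshold is raised, which is why a Ct is only meaningful when the threshold setting is reported alongside it.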
The ostensibly small step of converting RNA into a cDNA template is an important contributor to the variability and lack of reproducibility frequently observed in RT-PCR experiments. There are several reasons for this. First, the dynamic state of cells makes it inevitable that there is inherent variation in RNA prepared from biological samples. Secondly, purified RNA may be of variable quality and, once extracted, is rather unstable. Thirdly, the efficiency of RNA-to-cDNA conversion is dependent on template abundance. It is significantly lower when target templates are rare (Karrer et al. 1995) and is negatively affected by non-specific or background nucleic acid present in the RT reaction (Curry et al. 2002, Stahlberg et al. 2004b). Little effort has been made to draw attention to the fourth reason, the different priming approaches used to synthesise cDNA. cDNA can be synthesised using random primers, oligo-dT, target gene-specific primers or a combination of oligo-dT and random primers.
Approximately 30% of cDNA priming in real-time RT-PCR assays is carried out using random primers. This approach primes the RT at multiple origins along every RNA template, hence produces more than one cDNA target per original mRNA target. Furthermore, the majority of cDNA synthesised from total RNA is ribosomal RNA (rRNA)-derived. This could create problems if the mRNA target of interest is present at low levels, as it may not be primed proportionately and its subsequent amplification may not be quantitative. Indeed, it has been demonstrated that random hexamers can overestimate mRNA copy numbers by up to 19-fold compared with a sequence-specific primer (Zhang & Byrne 1999). Another drawback is that a reaction primed by random primers is linear over a narrower range than a similar reaction primed by target-specific primers (Bustin & Nolan 2004). This has immediate consequences for the accuracy of quantification. Any correlation of this problem with specific RT remains to be determined, as does the difference between priming from total or mRNA, but for the moment it is not known how important an obstacle this constitutes to reliable and reproducible quantification.
Oligo-dT is used to prime approximately 40% of real-time RT-PCR assays. It is more specific than random priming, and is the best method to use when the aim is to obtain a faithful cDNA representation of the mRNA pool. It is also the most appropriate choice when aiming to amplify several target mRNAs from a limited RNA sample. However, as it requires full-length RNA it is not an effective choice for transcribing RNA that is likely to be fragmented, such as that typically obtained from archival material. Furthermore, the RT may fail to reach the upstream primer-binding site if secondary structures exist or if the primer-binding site is at the extreme 5′-end of a long mRNA. This may be the case if the target mRNA contains a very long untranslated 3′-region or if splice variants differ at the 5′-end of the mRNA (e.g. the MHC class II transactivator isoforms I, III and IV) (Sanderson et al. 2004). Approximately 10% of real-time RT-PCR assays use a combination of oligo-dT and random primers. However, while this may be acceptable for qualitative assays, this approach could exacerbate the problems inherent with the individual methods.
Target-specific primers are used in approximately 20% of RT-PCR assays. Their use results in the synthesis of the most specific cDNA and may provide the greatest sensitivity for quantitative assays (Lekanne Deprez et al. 2002). The main disadvantage of this method is that it requires separate priming reactions for each target; hence it is not possible to return to the same preparation and amplify other targets at a later stage. It is also wasteful if only limited amounts of RNA are available. While it is possible to amplify more than one target in a single reaction tube (multiplex) (Wittwer et al. 2001), this is not trivial and requires careful experimental design and optimisation of reaction conditions if quantitative data are expected to be an accurate reflection of target mRNA levels.
Target abundance may also influence the choice of most appropriate primer for the RT step. For example, RT using specific primers may be appropriate for a very abundant target, but random priming may be better if the target is present at very low copy numbers.
Regardless of which method is used to prime cDNA synthesis, the PCR step requires target-specific primers. These are usually designed in isolation, using single templates of very limited genetic complexity. While the specificity of individual primers may be tested using BLAST, no further consideration may be given to the influence of non-target sites that can result in suboptimal binding of the primers. Accurate quantification requires primer sets that facilitate maximum amplification efficiency. In our experience, it is usually necessary to design, synthesise and validate several primer pairs, until a set is obtained that generates no primer dimers and results in near 100% amplification efficiency. Primers are best evaluated using SYBR Green-I chemistry and melting curve analysis. A recent report describes a useful algorithm for the identification of sequence-specific primers, which applies the highest filter stringency to residues at the 3′-end of the primer and to adventitious matches with abundant non-coding RNA (Wang & Seed 2003). These authors have also established a primer database containing >100 000 primers with uniform properties specifying most human and mouse genes. Not only would their general use simplify primer design for the individual researcher, but it would also initiate a process of standardisation, which is crucial for the generation of reproducible results.
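One common way to check whether a primer pair approaches 100% amplification efficiency is to run a serial dilution of template and fit Ct against log10 of the input quantity; the efficiency is then estimated as E = 10^(-1/slope) - 1, so a slope of about -3.32 corresponds to perfect doubling. The sketch below assumes this standard-curve approach; the dilution series and Ct values are invented for illustration and the function names are hypothetical.

```python
import math

def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def amplification_efficiency(quantities, ct_values):
    """Efficiency from a standard curve: E = 10**(-1/slope) - 1 (1.0 means perfect doubling)."""
    log_q = [math.log10(q) for q in quantities]
    slope, _ = linear_fit(log_q, ct_values)
    return 10 ** (-1.0 / slope) - 1.0

# Hypothetical ten-fold dilution series (copies per reaction) with invented Ct values.
quantities = [1e6, 1e5, 1e4, 1e3, 1e2]
ct_values = [15.1, 18.5, 21.8, 25.2, 28.5]
eff = amplification_efficiency(quantities, ct_values)
print(f"estimated efficiency: {eff:.1%}")
```

A primer pair whose estimate falls well below 1.0, or whose melting curve shows additional products, would be a candidate for redesign under the approach described above.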
In summary, each of the methods used to generate cDNA differs significantly with respect to specificity as well as cDNA yield and variety. Consequently, it is important to realise that RT-PCR results are comparable only when the same priming strategy and reaction conditions are used (Stahlberg et al. 2004a). In addition, it is not widely appreciated that random priming occurs whether or not primers are present and this can lead to a lowered and variable signal in the subsequent PCR assay (Frech & Peterhans 1994) (http://www.ambion.com/catalog/CatNum.php?1740).
The principle of quantification is straightforward: the more copies of target there are at the beginning of the assay, the fewer cycles of amplification are required to generate the number of amplicons that can be detected reliably. Consequently, fewer amplification cycles are required for the fluorescence to reach the threshold level of detection (e.g. fitted line or a Ct value calculated by a mathematical algorithm) that is specific for every real-time detection instrument. In practice, the relationship between target copy number and detection is not as clear-cut. First, reproducible quantification of any low abundance target (<1000 copies) is problematic due to the inherent limitation of PCR amplification of small amounts of template contained within a complex nucleic acid mixture (Monte Carlo effect) (Karrer et al. 1995). Secondly, since many biological samples contain inhibitors of the RT and/or the PCR step, it is crucial to assess the presence of any inhibitors of polymerase activity in RT and PCR. This is most easily achieved by running a reference RT-PCR assay, to which sample RNA is added, and measuring shifts in Ct (Smith et al. 2003). Thirdly, it is essential to apply a normalisation strategy to control for the amount of starting material, variation of amplification efficiencies and differences between samples. Unfortunately, despite the suggestion of numerous normalisation strategies, this remains the most intractable problem for real-time quantification (Thellin et al. 1999).
Real-time RT-PCR experiments that rely on the extraction of RNA from complex tissue samples are averaging the data from numerous, variable subpopulations of cells of different lineage at different stages of differentiation. Cellular differences in mRNA expression patterns may well be masked by this variability, a problem exacerbated when attempting to compare mRNA levels between different individuals. For blood samples, flow cytometry (Raaijmakers et al. 2002) or antibody-coated beads (Deggerdal & Larsen 1997) can be used to sort cells and enrich for specific populations. However, even cellular subpopulations of the same pathological origin can be highly heterogeneous (Goidin et al. 2001). For solid tissue biopsies there is no practical way of sorting or counting cells without affecting the expression profile of the sample. Tumour biopsies, in particular, are made up not just of normal and cancer epithelial cells, but there may be several subclones of epithelial cancer cells together with stromal, immune and vascular components. This variability means that while it is acceptable to generate qualitative results, there must be a question mark over quantitative data. Indeed, it is worth considering whether whole tissue biopsies, whether from solid tissue or not, should be analysed quantitatively at all. Fortunately, the introduction of laser capture microdissection promises to help address this particular problem (Fink et al. 1998). This technique has the added advantage that target mRNA levels can be reported conveniently as copies per area or cell dissected.
Normalisation against high quality, accurately measured total RNA mass (Bustin 2002) has been shown to produce quantification results that are biologically relevant (Tricarico et al. 2002). This approach is crucially dependent on accurate quantification and quality assessment of the RNA. The opportune development of Agilent’s 2100 Bioanalyser and LabChip technology has provided a new standard of RNA quality control as well as permitting concomitant quantification of RNA. Similar in concept, but requiring an additional RT-PCR assay, is normalisation against one of the rRNAs (Bhatia et al. 1994, Zhong & Simons 1999). rRNA levels may vary less under conditions that affect the expression of mRNAs (Schmittgen & Zakrajsek 2000) and the use of rRNA has been claimed to be more reliable than that of several reference genes in rat livers (de Leeuw et al. 1989) and human skin fibroblasts (Mansur et al. 1993). A recent report comparing expression levels between activated and resting nucleated blood cells identified 18S rRNA as the most stable reference target. Furthermore, cytokine analysis revealed that only normalisation to 18S rRNA gave a result that satisfactorily reflected target gene mRNA expression levels per cell (Bas et al. 2004). However, normalisation against total RNA does not overcome the problem of variable subpopulations leading to inappropriate quantification and conclusions. Furthermore, total RNA levels may be elevated in highly proliferating cells and this will affect the accuracy of any comparison of copy numbers, for example between normal and tumour cells. In addition, it is not always possible to quantify total RNA, especially when dealing with very limited amounts of clinical samples. For normalisation against 18S RNA, concern has been expressed regarding rRNA transcription by a different RNA polymerase and possible imbalances in rRNA and mRNA fractions between different samples (Solanas et al. 2001). 
Furthermore, rRNA levels can be affected by biological factors and drugs (Spanakis 1993). Perhaps most importantly, the vast difference in abundance between all rRNA and most target mRNAs will result in different amplification kinetics that may generate misleading quantification data. A final drawback is that rRNA cannot be used for normalisation when quantifying targets from polyA-enriched samples.
In theory, the use of internal reference genes is the most appropriate solution for the normalisation problem. There is a constant stream of publications advocating the use of one or other individual reference genes, usually for more and more specialist applications. However, invariably other reports contradict these findings and propose their own alternatives. There are numerous publications highlighting the fact that no single gene is able to fulfil the criteria required of a universal reference gene. All are regulated to some extent and none are constitutively expressed in all cell types and under all conditions independently of experimental design. Nevertheless, scrutiny of recently published papers quantifying cellular mRNA levels reveals that many continue to use single reference genes without demonstrating their appropriateness. The obvious alternative is to use multiple internal control genes. Different methods for identifying the most suitable combination of reference genes have been proposed. One ranks reference genes according to the similarity of their expression profile using a pair-wise comparison and uses their geometric mean as a normalisation factor (Vandesompele et al. 2002). The underlying assumption is that gene pairs showing stable expression patterns relative to each other are appropriate control genes. However, this model requires extensive practical validation to identify a combination of reference genes appropriate for an individual experiment. Furthermore, it will top-rank co-expressed genes. This drawback is addressed by another model that takes into consideration not just overall expression variation, but also systematic variation across sample subgroups (Andersen et al. 2004). These authors make the point that it does not matter whether a universal reference gene exists, as most experimental designs are restricted to a few tissue types or a few different histological stages of the same tissue. 
However, while this may be acceptable for comparisons between tissue culture cell lines or cloned rodent tissue, this underestimates the huge variability seen between individuals when analysing human tissues. This is borne out by their model identifying GAPDH as one of the most suitable reference genes in colorectal cancer, when there is considerable evidence that mRNA levels of this gene vary significantly between individuals and between paired normal and cancer tissue (Bustin et al. 1999, Bustin 2000, Tricarico et al. 2002). In any case, its gene product has a number of diverse activities unrelated to its glycolytic function (Sirover 1999). Other models addressing the most appropriate method for normalising results exist (Akilesh et al. 2003, Szabo et al. 2004), but are neither straightforward nor ‘out-of-the-box’ solutions for general use. This variability emphasises the point that there are numerous ways of presenting data and that there is significant discordance between results obtained in different laboratories.
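The use of a geometric mean of several reference genes as a normalisation factor, as proposed by Vandesompele et al. (2002), can be sketched as follows. This is a simplified illustration only: it assumes that relative quantities have already been derived from the Ct values, it omits the pair-wise stability ranking used to choose the genes, and the sample values and gene names (REF_A, REF_B, REF_C) are placeholders rather than real data.

```python
import math

def geometric_mean(values):
    """Geometric mean of a non-empty list of strictly positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))

def normalised_expression(target_quantity, reference_quantities):
    """Target quantity divided by the geometric mean of the reference quantities."""
    return target_quantity / geometric_mean(reference_quantities)

# Hypothetical relative quantities; the target reads the same in both samples,
# but the reference genes suggest that sample_2 contained twice the input.
samples = {
    "sample_1": {"target": 8.0, "references": {"REF_A": 1.0, "REF_B": 4.0, "REF_C": 2.0}},
    "sample_2": {"target": 8.0, "references": {"REF_A": 2.0, "REF_B": 8.0, "REF_C": 4.0}},
}
for name, data in samples.items():
    refs = list(data["references"].values())
    print(name, round(normalised_expression(data["target"], refs), 3))
```

The geometric mean is less sensitive than the arithmetic mean to a single reference gene with an outlying abundance, but the result is still only as reliable as the validation of the reference genes for the tissue and conditions under study.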
In summary, while appropriate normalisation is critical for obtaining biologically relevant results, the question of what constitutes appropriate normalisation remains to be answered in a satisfactory manner. Clearly, no one strategy is applicable to every experimental situation and it remains up to individual researchers to identify and validate the method most appropriate for their experimental conditions.
Real-time technology has significantly extended the use and scope of RT-PCR assays, with the potential for quantification of mRNA targets a particular advantage. However, there is little appreciation of how subjective real-time RT-PCR results are and considerable doubts remain about the biological validity of quantitative data. This brief perspective has highlighted only two of the outstanding problems. There is an urgent need for universal agreement on basic issues such as quality and quantity control of RNA, guidelines for analysis and reporting of results and standardisation of protocols. There is a particular requirement for rules concerning the information relating to experimental and analytical procedures that should be made publicly available with any publication involving this technology. Until this is implemented, real-time RT-PCR will not be able to make the most of its potential beyond its current role as a research tool.
S A B would like to thank Bowel & Cancer Research for financial support. The authors declare that there is no conflict of interest that would prejudice the impartiality of this scientific work.
Bustin SA 2004 A–Z of Quantitative PCR.
Goidin D 2001 Ribosomal 18S RNA prevails over glyceraldehyde-3-phosphate dehydrogenase and beta-actin genes as internal standard for quantitative comparison of mRNA levels in invasive and noninvasive human melanoma cell subpopulations. Analytical Biochemistry 295 17–21.