Main

Taken together, oral and pharyngeal cancers are the sixth most frequent cancer worldwide, with over 482 000 new cases and 273 000 deaths in 2008 (Ferlay et al, 2010). The role of high-risk human papillomavirus (HR-HPV) in the carcinogenesis of the uterine cervix is well recognised (Bosch et al, 1995), and owing to numerous studies in the past 10 years, HR-HPV is now also a well-known risk factor in oropharyngeal squamous cell carcinomas (OPSCCs) in addition to established factors such as tobacco and alcohol exposure (Dayyani et al, 2010). Compared with other head and neck squamous cell carcinomas (HNSCCs), HPV-related OPSCCs differ in epidemiology, histopathological characteristics, therapeutic response, and clinical outcome (Shah and Patel, 2003; Fakhry and Gillison, 2006; De Vita et al, 2008; Robinson et al, 2010; Westra, 2012).

The small, non-enveloped DNA virus HPV belongs to the Papillomaviridae family and is commonly known to infect squamous epithelial cells (Doorbar et al, 2012). Cell morphology alone is insufficient to determine the presence of HPV (Lewis et al, 2012), although HPV-positive oropharyngeal cancers are often characterised histologically by a non-keratinising or basaloid morphologic pattern. Two techniques are generally used to diagnose HPV: polymerase chain reaction (PCR) and in situ hybridisation (ISH). Both have strengths and limitations. Human papillomavirus-specific PCR is not routinely available in most diagnostic laboratories; few HPV PCR tests are approved for clinical use, and the method requires a high level of technical skill and special laboratory facilities to prevent contamination. The highest sensitivity is obtained when PCR is applied to extracts made from fresh-frozen biopsy samples, but PCR analysis does not distinguish the mere presence of HPV from a clinically relevant HPV infection, in which the HPV genome is often integrated into the host genome and HPV oncoproteins are actively expressed. Detection of HPV with ISH provides evidence of viral genomes through mRNA or DNA present in the tumour nuclei and is highly specific, although less sensitive than PCR (Robinson et al, 2010). This method does not differentiate between integrated and non-integrated genomes.

The presence of HR-HPV DNA alone is insufficient to classify a tumour accurately as HPV driven, because the virus may be biologically inactive and not the cause of malignancy. Along with HPV diagnostics, immunohistochemical detection of p16 (p16-IHC) is often used as a surrogate marker for HPV infection and the activity of viral oncoproteins. P16 is a tumour suppressor protein that inhibits cyclin-dependent kinases 4 and 6. In the presence of transcriptionally active HPV, hypophosphorylated retinoblastoma protein (pRb) binds to the HPV oncoprotein E7, allowing the transcriptional activator E2F to be constitutively active while effectively stopping the negative feedback of free pRb on p16. Overexpression of p16 ensues. Independent of treatment modality, OPSCC patients with p16 overexpression have better prognosis and clinical outcome (Langendijk and Psyrri, 2010). P16-IHC is generally accessible and its technical costs are estimated to be 2–16 times lower than other HPV-specific tests (Lewis, 2012). Several studies have reported difficulties in HPV and p16 diagnostics, as there is no consensus on defining overexpression of p16 by a clear percentage cutoff level. Definitions range from 5% to 75% staining and include numerous less specific verbal definitions, for example, ‘diffuse and strong nuclear and cytoplasmatic staining’ (Smeets et al, 2007; Lewis, 2012). This may be problematic because different staining patterns can correlate differently to HPV-positive and -negative tumours, and staining patterns may ultimately distinguish transcriptionally active from transcriptionally inactive HPV infections and thereby help determine prognosis and clinical outcomes.

The aim of this systematic review was to define and categorise overexpression of p16 based on immunohistochemical staining and correlate the categories to HPV-positive and -negative OPSCCs.

Methods

Search strategy and selection criteria

One author (CGL) undertook electronic literature searches within PubMed (Medline), Embase, and the Cochrane Library. The search strategy, including MeSH terms and keywords, was as follows: ‘HPV’ or ‘papillomavirus’ or ‘papillomaviridae’ and ‘p16’ or ‘cdkn2a’ or ‘cyclin-dependent kinase inhibitor p16’ or ‘p16 genes’ and ‘oropharynx’ or ‘oropharyngeal’ or ‘palatine tonsil’ or ‘tonsil’ or ‘palatine’ or ‘tongue’ or ‘mouth’ or ‘oral’. Two authors (CGL and MG) independently reviewed the relevance of all study titles and abstracts identified through this search, and full-text copies of potentially eligible articles were assessed. Finally, one author (CGL) reviewed the reference lists of the initially included studies. The authors of studies with identical authorship were contacted to avoid including the same study population twice.
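The Boolean structure of the search can be sketched as a short script. This is only an illustration: the term lists are copied from the strategy above, but the grouping into three AND-ed concept blocks (virus, marker, site) is an assumption about how the terms were combined, and the helper names `or_block` and `build_query` are hypothetical. Each database has its own query syntax, so the output would need adapting before use.

```python
# Term lists copied from the search strategy; the grouping into three
# AND-ed concept blocks is an assumption, not stated explicitly in the text.
VIRUS_TERMS = ["HPV", "papillomavirus", "papillomaviridae"]
MARKER_TERMS = ["p16", "cdkn2a", "cyclin-dependent kinase inhibitor p16", "p16 genes"]
SITE_TERMS = ["oropharynx", "oropharyngeal", "palatine tonsil", "tonsil",
              "palatine", "tongue", "mouth", "oral"]


def or_block(terms):
    """Join terms with OR, quote multi-word phrases, and wrap in parentheses."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*blocks):
    """AND together one OR-block per concept."""
    return " AND ".join(or_block(block) for block in blocks)


query = build_query(VIRUS_TERMS, MARKER_TERMS, SITE_TERMS)
print(query)
```

Keeping the terms in lists makes it easy to rerun the same strategy across databases and to document exactly which synonyms were searched.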

We included all studies published in English from January 1980 to October 2012 regardless of funding source. The inclusion criteria were restricted to: age above 18 years, a minimum of 20 cases of site-specific OPSCCs (morphologic variants were included), and HPV and p16 results stated.
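The inclusion criteria above can be expressed as a simple screening predicate. The sketch below is illustrative only: the record type `StudyRecord` and its field names are hypothetical, and "age above 18 years" is interpreted here as the youngest reported patient being older than 18, which is an assumption.

```python
from dataclasses import dataclass
from datetime import date

# Publication window stated in the text (January 1980 to October 2012).
SEARCH_START = date(1980, 1, 1)
SEARCH_END = date(2012, 10, 31)


@dataclass
class StudyRecord:
    """Hypothetical screening record; field names are illustrative."""
    language: str
    published: date
    min_patient_age: int   # youngest patient reported in the study
    opscc_cases: int       # site-specific OPSCC cases
    reports_hpv: bool
    reports_p16: bool


def is_eligible(study: StudyRecord) -> bool:
    """Return True when a study meets every stated inclusion criterion."""
    return (
        study.language == "English"
        and SEARCH_START <= study.published <= SEARCH_END
        and study.min_patient_age > 18
        and study.opscc_cases >= 20
        and study.reports_hpv
        and study.reports_p16
    )


example = StudyRecord("English", date(2010, 5, 1), 19, 25, True, True)
print(is_eligible(example))
```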

Data synthesis

Two authors (CGL and MG) independently extracted relevant data from the included studies and entered them into a piloted data extraction form. The following information was recorded: country, year(s) of biopsy collection, demographics, number of cases, tumour site (base of tongue, palatine tonsils, or other), tumour morphology (keratinising, non-keratinising, or mixed), histopathological grade (carcinoma in situ, poor, moderate, or high differentiation), IHC staining probe, definition of p16 overexpression, biopsy preservation (fresh frozen or paraffin embedded), IHC evaluation by pathologists (yes or no), HPV results (negative or HPV-16, HPV-18, HPV-33, HPV-35, and HPV-58 positive), HPV diagnostics (HPV DNA PCR, HPV DNA ISH, HPV DNA ISH followed by PCR, HPV RNA RT–PCR, and HPV RNA ISH), and the number of p16-positive and -negative cases.

Included studies were categorised into three groups by their definition of p16 overexpression: (a) a verbal definition (e.g. ‘Cases were classified in a binary manner as either positive (any cells with nuclear and cytoplasmatic staining) or negative’), (b) 5–69% nuclear and cytoplasmatic staining, and (c) ≥70% staining.
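The three-way grouping can be sketched as a small function. This is a minimal illustration, assuming each study's definition has been reduced either to a numeric percentage cutoff or to `None` for a purely verbal definition; the function name and the group labels are hypothetical.

```python
from typing import Optional


def p16_definition_group(cutoff_percent: Optional[float]) -> str:
    """Map a study's p16 cutoff to one of the three review groups.

    None means the study gave only a verbal definition (assumed encoding).
    """
    if cutoff_percent is None:
        return "verbal"
    if not 0 <= cutoff_percent <= 100:
        raise ValueError("cutoff must be a percentage between 0 and 100")
    if cutoff_percent >= 70:
        return ">=70%"
    if cutoff_percent >= 5:
        return "5-69%"
    # Cutoffs below 5% fall outside the groups described in the text.
    raise ValueError("cutoffs below 5% are not covered by the three groups")


for cutoff in (None, 10, 50, 70, 75):
    print(cutoff, "->", p16_definition_group(cutoff))
```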

Statistical analysis

Statistics were carried out using IBM SPSS Statistics 19.0 (IBM SPSS, Chicago, IL, USA). Descriptive statistics are presented as actual numbers and percentages, or median and range where appropriate. We conducted a meta-analysis using the bivariate model (Reitsma et al, 2005). In the bivariate model, the logit-transformed sensitivities and specificities and the correlation between them across studies are modelled directly. The model accounts for sampling variability within studies and also accounts for between-study variability through the inclusion of random effects. In the preliminary meta-analyses for each definition of p16 positivity, we fitted the bivariate model separately for each definition and obtained a diagnostic odds ratio, sensitivity, and specificity. A hierarchical summary receiver operating characteristic (HSROC) curve was applied in the meta-analysis, as recommended in the current meta-analytic literature for diagnostic meta-analyses (Leeflang et al, 2013). In addition, HSROCs were plotted with 95% CIs. Afterwards, we compared the definitions in two separate models, in which the definitions used were included as covariates in a meta-regression. Variance components were estimated by restricted maximum likelihood, because of the number of studies and the heterogeneity of the included studies. Statistical analyses on meta-regression were performed in R using the reitsma function of the mada package.
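To make the inputs of the bivariate model concrete, the sketch below computes the per-study quantities it works with: logit-transformed sensitivity and specificity, their approximate within-study variances, and the diagnostic odds ratio. It does not fit the random-effects model itself (that was done with reitsma in mada). The counts are invented for illustration, the function name is hypothetical, and the 0.5 continuity correction for zero cells is a common convention assumed here rather than a detail taken from the text.

```python
import math


def logit(p: float) -> float:
    """Log-odds transform."""
    return math.log(p / (1 - p))


def bivariate_inputs(tp: float, fn: float, fp: float, tn: float) -> dict:
    """Per-study summaries from a 2x2 table (p16 result vs HPV status)."""
    # Assumed convention: add 0.5 to every cell if any cell is zero,
    # so that logits and variances stay finite.
    if 0 in (tp, fn, fp, tn):
        tp, fn, fp, tn = (x + 0.5 for x in (tp, fn, fp, tn))
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "logit_sens": logit(sens),
        "logit_spec": logit(spec),
        # Delta-method variances of the logits.
        "var_logit_sens": 1 / tp + 1 / fn,
        "var_logit_spec": 1 / tn + 1 / fp,
        "dor": (tp * tn) / (fp * fn),
    }


# Invented counts for one hypothetical study.
summary = bivariate_inputs(tp=45, fn=5, fp=10, tn=40)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Working on the logit scale keeps the pooled estimates and their confidence limits inside the 0–1 range once they are transformed back.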

Results

The initial literature search yielded a total of 778 records. From these, we manually selected 160 articles for full-text assessment, of which 112 articles were later excluded. Accordingly, 48 studies were left eligible for inclusion (Figure 1). Three additional studies were later identified by searching reference lists. The authors of studies with identical authorship were contacted, which led to the exclusion of 12 studies: 11 were confirmed as duplicates by their authors, and one was excluded because the authors did not reply. Thus, a total of 39 studies (n=3926) were included in the review (Table 1).
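The study flow can be checked with a few lines of arithmetic. The numbers are taken from the paragraph above; the variable names are illustrative.

```python
# Counts reported in the study-selection paragraph.
records_identified = 778
full_text_assessed = 160
excluded_after_full_text = 112
added_from_reference_lists = 3
excluded_after_author_contact = 12  # 11 confirmed duplicates + 1 without reply

eligible_after_full_text = full_text_assessed - excluded_after_full_text
studies_included = (
    eligible_after_full_text
    + added_from_reference_lists
    - excluded_after_author_contact
)

print(eligible_after_full_text, studies_included)
```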

Figure 1 PRISMA diagram.

Table 1 Overview of studies

In the pooled analysis of all studies with demographic information (n=3625), the majority of patients were male (n=2921, 80.6%). Age ranged from 20 to 93 years with a median of 58 years. Thirty-four studies (n=3420 subjects) were European, Australian, or US based, and five studies (n=506 subjects) were Asian. Ethnicity was reported in 22 studies (n=2265); of these patients, 69.2% (n=1568) were Caucasian, 11.9% (n=269) were of Asian origin, and 18.9% (n=428) had mixed ethnicity. Tumours were represented throughout the oropharynx, but were primarily located in the palatine tonsils (n=1420, 36.2%). Tumours at the base of the tongue (n=414, 10.5%) and tumours of unspecified location (n=2092, 53.3%) accounted for the remainder (Table 2).

Table 2 Patient characteristics

A total of 52.5% of cases (n=2062) were found to be HPV positive by PCR, ISH, or both. For HPV diagnostics, 22 studies (n=1980) used PCR, 6 studies (n=668) used ISH, and 11 studies (n=1258) used both techniques. In the PCR-based HPV-testing group, 49.6% (n=984) of cases were positive, and 59.8% of cases (n=412) were positive in the ISH group, whereas 52.9% (n=666) were positive when both diagnostic approaches were used. The definition of p16 overexpression varied, but all studies dichotomised the results as either negative or positive. In the pooled analysis, p16 overexpression was assessed with a verbal definition in 37.6% (n=1478) of subjects, with a cutoff of staining between 5 and 69% in 42.9% (n=1684) of subjects, and with a cutoff of staining equal to or exceeding 70% in 19.5% (n=764) of subjects (Table 2).
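The shares of subjects falling under each p16 definition follow directly from the reported counts. The short sketch below reproduces them from the figures in the paragraph above; the dictionary keys and the helper name are illustrative.

```python
# Subjects per p16 definition group, as reported in the pooled analysis.
TOTAL_SUBJECTS = 3926
subjects_by_definition = {"verbal": 1478, "5-69%": 1684, ">=70%": 764}


def percentage(part: int, whole: int) -> float:
    """Share of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


shares = {
    group: percentage(n, TOTAL_SUBJECTS)
    for group, n in subjects_by_definition.items()
}
print(shares)
```

The three groups partition the full sample, so the counts sum to the 3926 included subjects.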

Centres in the United States defined p16 as positive when staining was between 5 and 69% (6 centres, n=770) or equal to or exceeding 70% (4 centres, n=482). Six centres (n=507) used a verbal definition. European centres defined p16 as positive when staining was between 5 and 69% (9 centres, n=602) or exceeded 70% (3 centres, n=282). Four centres (n=562) used a verbal definition. Three centres (n=194) in Asia used a verbal definition, and two centres (n=312) defined p16 as positive when staining was between 5 and 69%. No Asian centres defined p16 as positive based on staining equal to or exceeding 70%.

Eleven studies (n=861) reported data on histopathologic grade (poorly differentiated, moderately differentiated, highly differentiated, or carcinoma in situ), and six studies (n=634) reported tumour morphology (keratinising, non-keratinising, mixed, or unknown). The limited availability of data on tumour morphology did not allow us to examine systematically to what degree non-keratinising tumours were related to the presence of HPV, as has been observed previously. We found no trends regarding publication year and definition of p16, likely because the included studies were all published in the past 10 years.

Twenty-five studies (n=2888) provided sufficient information to construct a two-by-two table of p16-negative/-positive against HPV-negative/-positive biopsies. The correlation between HPV and p16 overexpression was numerically greatest when positivity was defined as staining of 70% or more, with a sensitivity of 0.927 (95% CI: 0.793–0.974). The verbal group and the 5–69% group had sensitivities of 0.791 (95% CI: 0.608–0.888) and 0.894 (95% CI: 0.805–0.942), respectively. The false-positive rate of 0.059 (95% CI: 0.031–0.112) for the verbal group was lower than the rate of 0.201 (95% CI: 0.12–0.337) for the ≥70% group (see Figure 2).
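The trade-off between the two groups with a reported false-positive rate can be illustrated by converting the summary points into specificity (1 minus the false-positive rate) and a positive likelihood ratio (sensitivity divided by the false-positive rate). These derived values are not reported in the review; they are a rough illustration computed from the point estimates above, and they ignore the uncertainty and the between-study correlation that the bivariate model captures.

```python
# Summary point estimates quoted in the text (sensitivity, false-positive rate).
summary_points = {
    "verbal": {"sens": 0.791, "fpr": 0.059},
    ">=70%": {"sens": 0.927, "fpr": 0.201},
}


def derived_measures(sens: float, fpr: float) -> dict:
    """Specificity and positive likelihood ratio from a summary point."""
    return {
        "spec": round(1 - fpr, 3),
        "lr_positive": round(sens / fpr, 1),
    }


derived = {
    group: derived_measures(**point) for group, point in summary_points.items()
}
for group, values in derived.items():
    print(group, values)
```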

Figure 2 Hierarchical summary receiver operating characteristic (HSROC) curves of the studies from Table 1. The studies have been divided into three groups based on their definition of p16 staining: a verbal group, a 5–69% group, and a ≥70% group, including 95% CIs for the summary point. The verbal group has a lower false-positive rate, while the ≥70% group has a greater overall sensitivity and a smaller 95% CI.

Discussion

This is the first systematic review exploring the correlation between HPV infection and p16 overexpression in OPSCCs. This review shows that p16 overexpression correlates numerically better with HPV results if staining of tumour cells exceeds 70% than if positivity is based on lower percentages or on a verbal definition. The issue of determining a specific cutoff value for p16 positivity has earlier been addressed in smaller samples, supporting staining above 75%, or staining above 50% combined with >25% confluent areas, to define p16 positivity (Begum and Westra, 2008). We found no statistically significant difference in the correlation to HPV between the p16 definition groups, which may be because of the great heterogeneity among studies, including different p16 antibodies. In addition, ISH and PCR methods vary from centre to centre, leading to a loss of statistical power to detect differences. The explanation might also be that all p16 groups are equally correlated to HPV status; in that case, the level of p16 staining is less important and positivity or negativity is evident for a given staining, that is, most p16-positive tumours show staining above 70%. Histopathologic grade and morphology were insufficiently reported, and agreement on a grading scheme applicable to OPSCC and consensus on reporting data are important for future research. As to p16 antibodies, an FDA-approved recommendation might help to standardise research methods. It is widely assumed that HPV-related oropharyngeal cancers are poorly differentiated based on the immature appearance of the tumour cells, but in fact they are commonly highly differentiated as they emulate the specialised epithelium of the tonsillar crypts (Westra, 2009). Further data on this matter might clarify the challenge of interpreting p16-IHC in mixed and keratinising SCCs. In addition, it should be considered whether carcinoma in situ should be included in future similar studies.

In future studies applying p16-IHC and HPV diagnostics, the real value of IHC must be questioned once the site of the tumour is known (oropharynx) and the morphology is recognised (non-keratinising); the chance of a non-keratinising OPSCC being HPV positive is still not known.

Previous data report a prevalence of HPV in OPSCC of 51%, which is similar to our results (O'Rorke et al, 2012). Similar results were obtained regardless of whether studies used PCR, ISH, or both.

Oropharyngeal squamous cell carcinomas are characterised by a heterogeneous clinical and molecular profile (Huang et al, 2002; Shah and Patel, 2003; Bosch et al, 2004; De Vita et al, 2008) and, interestingly, have proven to have a better prognostic outcome in cases with p16 overexpression (Lewis et al, 2010; Ang et al, 2010a). P16-IHC is, however, a diagnostic method causing much debate, and concerns have been raised: p16 overexpression might be associated with functional pRb disturbances unrelated to HPV infection (Marur et al, 2010). High-risk human papillomavirus-infected OPSCCs have not necessarily lost the 9p21 allele encoding p16 (Braakhuis et al, 2004), and p16-IHC has been reported to be 100% sensitive but 79% specific for carcinomas with HPV infection (Smeets et al, 2007). P16-IHC is performed on just one slide of tumour tissue, and staining might vary, allowing false-negative results and explaining a lower specificity. Lately, cutoff values above 70 or 75% have come into wider use (Ang et al, 2010a; Evans et al, 2011; Schache et al, 2011a) as compared with, for example, values >10% as a ‘validated’ definition of p16 overexpression. In a retrospective study based on material from The Danish Society for Head and Neck Oncology (DAHANCA), the cutoff value was changed from >10 to >70% in a Letter to the Editor after publication (Lassen and Overgaard, 2012).

In conclusion, substantial differences exist in the definition of p16 overexpression and means of HPV diagnostics between studies. To achieve the highest correlation between p16-IHC and HPV results, we advise clinicians and researchers to define p16 overexpression as >70% staining of tumour cells. Future research in this field should report on p16 and HPV results, allowing a better understanding of the association between the two.