Systematic review of diagnostic accuracy of reflectance confocal microscopy for melanoma diagnosis in patients with clinically equivocal skin lesions

Background: Melanoma is a cancer of the skin and is increasing in incidence in the UK and Europe. Melanoma is a condition that is often curable if detected at an early stage, which makes accurate diagnosis vital. Reflectance confocal microscopy (RCM) is a tool used to image the skin. It gives high magnification images of the skin, which may provide more accurate diagnosis of lesions that are equivocal on clinical examination and dermoscopy. Objective: To determine the diagnostic accuracy of reflectance confocal microscopy (RCM), for melanoma diagnosis, as an add-on test to clinical examination and dermoscopy in the diagnosis of equivocal pigmented skin lesions using histopathology as the reference standard. Methods: A search was conducted of MEDLINE, EMBASE and six other electronic databases from inception to present. Forward citation searching and hand searching of reference lists were also conducted. Diagnostic accuracy studies that assess RCM in the diagnosis of melanoma were included in the review. Two contributors conducted the search, data extraction and assessment of methodological quality using QUADAS-2. Statistical analysis was performed using hierarchical bivariate random effects meta-analysis. Results: 951 titles and abstracts were screened. Five studies comprising 909 lesions were eligible for meta-analysis. Meta-analysis returned a per lesion sensitivity of 93% [95% CI 89–96] and a specificity of 76% [95% CI 68–83]. Conclusions: The utility of reflectance confocal microscopy (RCM) as an add-on test for the diagnosis of melanoma depends on the trade off between over-excising benign lesions and misdiagnosing melanoma as benign. This becomes important when considering lesions on surgically difficult or cosmetically important areas of the body.


Introduction
Melanoma is a cancer of the skin which is increasing in frequency in both the UK and Europe [1]. Cancer Research UK (CRUK) has calculated that in the 35 years from 1975 to 2010 the age-standardized incidence rate in the UK rose from 3.2 per 100,000 to 17.2 per 100,000 [2]. The biggest risk factor for developing melanoma is exposure to ultraviolet light [3].
Prognosis for melanoma depends heavily on the stage of the disease at diagnosis, so early, accurate diagnosis is crucial. The five-year survival for stage 1A melanoma is 97%. Five-year survival drops rapidly to 10-15% for stage 4 metastatic disease [4]. This rapid decline in survival with higher stage is because the only potentially curative treatment is surgical excision [5]. Adjuvant therapy for non-metastatic melanoma has not yet been demonstrated to provide a survival benefit [6], and no therapy has been proven to extend survival for metastatic melanoma [7,8].
The currently accepted best diagnostic method for melanoma is dermoscopy [9].
A recent meta-analysis of dermoscopy in the diagnosis of melanoma found a pooled sensitivity of 91% and a pooled specificity of 86% [10]. Most dermoscopy research has been conducted in white-skinned populations; however, there is some evidence that dermoscopy works equally well in non-white populations [11].
Reflectance confocal microscopy (RCM), also known as confocal laser scanning microscopy (CLSM) of the skin, was first described in the early 1990s [12]. This technology uses a near-infrared laser to obtain images of the top layers of the skin. These images are magnified such that they are "quasihistological." From the images, information can be obtained regarding cell structure and the architecture of the surrounding tissues. The images are analyzed and combinations of features are assessed to give a positive or negative diagnosis of melanoma. Several criteria have been developed to analyze RCM images [13]. The test itself takes about ten minutes for imaging and evaluation of a skin lesion.
The goal of diagnosing melanoma is to correctly identify melanomas while, at the same time, excising as few benign lesions as possible. The most appropriate first-line examination for this is dermoscopy, which has been shown to be a more accurate diagnostic tool than unaided eye examination [9]. Given the time needed to use RCM, it is most appropriate as a secondary, add-on test to dermoscopy for lesions where dermoscopy does not give a confident diagnosis. This role has been suggested previously [14,15].
There have been many narrative reviews on the use of

Assessment of methodological quality
Two authors independently assessed the methodological quality of the studies using the QUADAS-2 tool [23]. Any disagreements were resolved by discussion. The results of the quality assessment are presented as a textual methodological quality summary and a graphical representation.

Search
The search of the databases was conducted on February 8, 2012. After screening for duplicates, 951 studies were examined. A flow diagram of the search can be found in Figure 1.
After examining titles and abstracts, the full texts of 39 articles were retrieved. Five articles met the inclusion criteria. These are shown in Table 1.

Excluded studies
There were five studies that were derivation studies or studies that did not validate on a new set of patients. There were 15 descriptive correlation studies, which only described which RCM features were associated with melanoma. There were four case reports or small case series, two narrative review articles, one editorial and one study looking at observer agreement on the RCM features associated with melanoma.

Methodological quality assessment
The exclusion criteria for studies in the review included two major methodological quality criteria. The studies could not be case control studies, nor could they be studies that set a diagnostic threshold, i.e., studies that developed a scoring system. Case control studies have been demonstrated to overestimate diagnostic accuracy when compared to cohort studies that use an appropriate spectrum [24]. Studies that derive or set a threshold use multivariable analysis to derive a score. These scores are derived on a certain population. It is very often the case that these scoring systems perform worse when they are validated in another population, however similar [25].
This resulted in a low risk of bias regarding the applicability of the included patients and the appropriateness of the index test. In this study, the reporting of patient selection was generally poor; however, all domains were graded as low risk of bias. The methodological quality assessment is shown graphically in Table 2.

Findings
Five studies were identified comprising 909 lesions.The average prevalence of melanoma was 36.2% with a range from 29-39.Three studies used the RCM diagnostic scor-

Type of study
Cohort studies of diagnostic test accuracy with a predefined threshold that was established on separate data are eligible for inclusion.

Target condition
Melanoma of the skin.

Study population
Patients presenting with lesions suspicious for melanoma that were equivocal to clinical and dermoscopic diagnosis. No restriction was placed upon participant characteristics such as age, sex or ethnicity.

Index test
Reflectance confocal microscopy. There was no restriction on the type of algorithm or diagnostic process.

Reference standard
Histopathology of the excised skin lesion or long-term clinical follow-up.

Statistical analysis and data synthesis
Data were extracted by two reviewers independently. Hierarchical bivariate random-effects meta-analysis was used to perform the statistical meta-analysis, as this has been demonstrated to be the most robust method [19].
If there appeared to be no or minimal threshold differences between the studies, clinically or on the receiver operator characteristic (ROC) plot, then a summary statistic in the form of sensitivity and specificity was planned [20]. If there appeared, clinically and visually, to be a threshold effect, then the summary ROC curve was planned as the most appropriate summary measure [20].
If a study presented several sensitivity and specificity estimates on a receiver operator characteristic (ROC) curve, then the point estimate used for meta-analysis was the point chosen by the author of the article.
The results are presented graphically using RevMan5 [21]. The studies were combined in a statistical meta-analysis using the metandi command in Stata [22].
Subgroup analyses were intended for investigation of operator experience and algorithm method; however, there was an insufficient number of studies.
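As an illustration of the hierarchical bivariate random-effects approach named above, the model is commonly written as follows. This is a sketch of the standard formulation, not a statement of the exact specification the authors fitted; the symbols (Se_i, Sp_i, n_{1i}, n_{0i}, mu, Sigma) are notation assumed here for study i and do not appear in the source.

```latex
% Within-study level: true positives and true negatives are binomial
\mathrm{TP}_i \sim \mathrm{Binomial}(n_{1i}, Se_i), \qquad
\mathrm{TN}_i \sim \mathrm{Binomial}(n_{0i}, Sp_i)

% Between-study level: logit sensitivity and logit specificity are jointly normal
\begin{pmatrix} \operatorname{logit}(Se_i) \\ \operatorname{logit}(Sp_i) \end{pmatrix}
\sim \mathcal{N}\!\left(
\begin{pmatrix} \mu_{Se} \\ \mu_{Sp} \end{pmatrix},
\begin{pmatrix} \sigma_{Se}^{2} & \sigma_{SeSp} \\ \sigma_{SeSp} & \sigma_{Sp}^{2} \end{pmatrix}
\right)

% Summary operating point: back-transform the pooled means
\widehat{Se} = \operatorname{logit}^{-1}(\hat{\mu}_{Se}), \qquad
\widehat{Sp} = \operatorname{logit}^{-1}(\hat{\mu}_{Sp})
```

Here n_{1i} and n_{0i} are the numbers of melanomas and benign lesions in study i. The covariance term lets sensitivity and specificity be correlated across studies, which is how a threshold effect would appear in the model.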
Review | Dermatol Pract Concept 2013;3(4):5
Per lesion sensitivity and specificity are shown on a forest plot in Figure 2. There appears to be minimal heterogeneity per lesion in sensitivity across the studies, with more heterogeneity in the specificity.
Based upon this a hierarchical summary receiver operator characteristic (HSROC) curve was obtained using the bivariate method. The diagnostic accuracy results are quite similar in all studies and there is no evidence of a threshold effect. The plot of this is shown in Figure 3. From meta-analysis the operating point had a sensitivity of 93.3% [95% CI 88.5-96.2, range 91% to 97%] and a specificity of 75.9% [95% CI 67.9-82.5, range 68% to 86%].
The purpose of this review was to evaluate RCM as an add-on test to existing diagnostic pathways, not to evaluate it as a replacement test. It has been suggested that RCM is more sensitive than dermoscopy [13]. If all lesions that were suspicious to the unaided eye examination were examined
Given the low number of studies included in the review, statistical subgroup analysis and covariate hierarchical modeling for investigation of heterogeneity were not performed due to low statistical power.
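The confidence intervals reported for the pooled operating point are not symmetric around the point estimates on the percentage scale. A minimal sketch, assuming (this is not stated in the source) that the intervals were computed on the logit scale and back-transformed, checks whether they are roughly symmetric there. The function names are illustrative only.

```python
import math


def logit(p):
    """Log-odds of a proportion p (0 < p < 1)."""
    return math.log(p / (1.0 - p))


def logit_half_widths(estimate, lower, upper):
    """Distance from the estimate to each CI limit, measured on the logit scale."""
    centre = logit(estimate)
    return centre - logit(lower), logit(upper) - centre


# Pooled values reported in the text (as proportions).
sens_widths = logit_half_widths(0.933, 0.885, 0.962)
spec_widths = logit_half_widths(0.759, 0.679, 0.825)

print("sensitivity half-widths (logit):", sens_widths)
print("specificity half-widths (logit):", spec_widths)
```

If the two half-widths in each pair come out nearly equal, the reported intervals are consistent with a logit-scale calculation, as is usual for bivariate models.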

Discussion
When examining the use of a new diagnostic test it is important to consider whether its introduction will improve
with some monitoring procedure they may miss a melanoma.
To gauge the trade-off between the reduction in unnecessary biopsies and the missed melanoma diagnoses, the sensitivity and specificity can be applied to an estimated prevalence of melanoma in the spectrum of patients who would be selected for RCM examination.
The average prevalence of melanoma in the studies included in the review was 36%. In a 2002 systematic review of dermoscopy the mean frequency of melanoma was 28%.
Previous research has suggested a malignant to benign ratio of 1:4 with the expert use of dermoscopy [32]. This translates to a frequency of melanoma in dermoscopy positive lesions of 20%. If we assume that in real clinical practice the clearly positive melanomas would not be examined with RCM, we can estimate that the frequency of disease in dermoscopy positive lesions would be slightly lower.
Figure 4 shows a flow diagram of the impact of RCM in its proposed role as an add-on test to dermoscopy, using a sensitivity of 93% and a specificity of 76% as calculated in the meta-analysis and a melanoma frequency of 20%.
For 1000 dermoscopy positive lesions, there would be 200 melanomas. RCM would correctly identify 186 of these and miss 14. There would be 192 benign lesions excised and 608 benign lesions not excised.
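The arithmetic behind this hypothetical cohort can be reproduced with a short sketch. It assumes a melanoma frequency of 20%, a sensitivity of 93% and a specificity of 76%, which is the value that reproduces the 192 and 608 counts given in the text. The function name and dictionary keys are illustrative, not taken from the source.

```python
def add_on_test_outcomes(n_lesions, prevalence, sensitivity, specificity):
    """Expected counts when a test with the given accuracy is applied to a cohort."""
    melanomas = n_lesions * prevalence
    benign = n_lesions - melanomas
    detected = melanomas * sensitivity      # true positives
    spared = benign * specificity           # true negatives
    return {
        "melanomas": round(melanomas),
        "melanomas_detected": round(detected),
        "melanomas_missed": round(melanomas - detected),
        "benign_excised": round(benign - spared),
        "benign_not_excised": round(spared),
    }


outcomes = add_on_test_outcomes(1000, 0.20, 0.93, 0.76)
for name, count in outcomes.items():
    print(f"{name}: {count}")
```

Changing the assumed prevalence shows how the balance between missed melanomas and avoided excisions shifts with the spectrum of lesions referred for RCM.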
The only benefit of RCM in this pathway is to increase the specificity of diagnosis and reduce the number of benign lesions excised. The value of reflectance confocal microscopy as an add-on test in the diagnosis of melanoma depends on the trade-off between the harms associated with excising benign lesions and the harms associated with misdiagnosing a melanoma as benign. If RCM is to be used in clinical practice a decision has to be made weighing up the consequences
with dermoscopy and RCM then this is no doubt the case.
This, however, is not helpful for clinical practice. It takes seconds to examine a lesion with dermoscopy and minutes to examine a lesion with RCM. RCM is not going to take on the role of dermoscopy. Therefore it is not useful to compare RCM to the sensitivity and specificity of dermoscopy. Instead, it should be considered as an add-on test to the best current clinical diagnostic tool, which in this case is dermoscopy.
If RCM were to be used as an add-on test in clinical practice, the population examined with RCM would be the narrow pre-selected group of those in whom dermoscopy was neither clearly positive nor clearly negative. The population of lesions being examined with RCM in these studies was not clear and reproducible. The terms "clinically suspicious" and "equivocal" do not give the reader sufficient information. It is not certain that the lesions examined by RCM in the studies were the same as those that would be examined by RCM in clinical practice. If the lesions examined in these studies included those that were clearly melanoma, then the spectrum of disease would be different to that in clinical use, and this could bias the result, leading to an overestimate of sensitivity and specificity.
These factors, combined with the concept that diagnostic accuracy determined from laboratory condition studies may differ from diagnostic accuracy in the real-life clinical setting [31], mean that the external validity of these results has to be interpreted cautiously.
A study looking at the agreement between observers in identifying these features found high overall levels of reproducibility [16].

Weaknesses
A weakness of this review is that the current studies may not have focused on the pertinent patient populations to test the ability of RCM as an add-on test to dermoscopy. It is noted that in three of the five studies included in this review the main operators using RCM were Giovanni Pellacani and
of missing a melanoma and the harms averted by avoiding performing unnecessary excisions.
Excision of skin lesions on most areas of the body is often a quick and easy process that does not carry a great risk of morbidity. The situations where this is not the case are when lesions are on cosmetically sensitive areas of the body such as the face, head and neck, or where skin surgery becomes complex, involving the use of skin grafts or flaps. It is for these lesions that the reduction in benign lesion excision would have the most impact.
The algorithms that have been developed for use in melanoma diagnosis are based upon several features observed
RCM in the diagnosis of melanoma. These articles have focused mainly on describing the technology and discussing its potential role in melanoma diagnosis. RCM technology has advanced since the first instruments were introduced in
Per lesion data were extracted onto a study-specific data extraction sheet by two authors independently. The following data were collected: details of the study population, details of the reference standard and index test, blinding of the reference standard and the index test, prevalence of melanoma, and the information needed to complete the 2 x 2 table.
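To show how the 2 x 2 table mentioned in the data extraction step yields the per-study accuracy measures, here is a minimal sketch. The counts are hypothetical and chosen only for illustration; they are not data from any included study, and the function name is an assumption.

```python
def accuracy_from_2x2(tp, fp, fn, tn):
    """Sensitivity, specificity and prevalence from a 2 x 2 table.

    tp/fn: melanomas (per histopathology) that the index test called positive/negative.
    fp/tn: benign lesions that the index test called positive/negative.
    """
    diseased = tp + fn
    non_diseased = fp + tn
    return {
        "sensitivity": tp / diseased,
        "specificity": tn / non_diseased,
        "prevalence": diseased / (diseased + non_diseased),
    }


# Hypothetical counts for a single study (illustrative only).
example = accuracy_from_2x2(tp=90, fp=20, fn=10, tn=80)
print(example)
```

Each included study contributes one such table, and the four cell counts are the input to the bivariate meta-analysis.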
ing system developed by Pellacani 200526 (Curchin 2011, Guitera 2009, Pellacani 2007), two used a scoring system for lentigo maligna developed by Guitera 201027 (Guitera 2010, Curchin 2011) and one did not use a specific diagnostic algorithm but made RCM diagnoses based upon pre-specified melanoma associated features (Langley 2007).The operators were self identified as experienced in four out of the five studies and inexperienced in one (Curchin 2011).There were no explicit differences in the spectrum of disease of patients being examined with RCM.All studies examined equivocal skin lesions.One study was exclusively limited to equivocal skin lesions on the face (Guitera 2010) and two studies excluded lesions on the face (Pellacani 2007 Guitera 2009).

Figure 4. Proposed role of RCM in diagnostic pathway: Hypothetical example based on 1000 lesions positive with dermoscopy.[Copyright: ©2013 Stevenson et al.]