In this task, participants are asked to complete two independent binary image classification tasks that involve three unique diagnoses of skin lesions (melanoma, nevus, and seborrheic keratosis). In the first binary classification task, participants are asked to distinguish between (a) melanoma and (b) nevus and seborrheic keratosis. In the second binary classification task, participants are asked to distinguish between (a) seborrheic keratosis and (b) nevus and melanoma.
Definitions:
* Melanoma – malignant skin tumor, derived from melanocytes (melanocytic)
* Nevus – benign skin tumor, derived from melanocytes (melanocytic)
* Seborrheic keratosis – benign skin tumor, derived from keratinocytes (non-melanocytic)
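As an illustration of how the three diagnoses collapse into the two binary tasks, the following minimal sketch maps a diagnosis name to a pair of 0/1 labels. The function name `binary_labels` and the lowercase diagnosis strings are assumptions made for this example and are not part of the challenge materials.

```python
# The three diagnoses named in the task description (string spellings are assumed).
DIAGNOSES = ("melanoma", "nevus", "seborrheic keratosis")


def binary_labels(diagnosis):
    """Return (task1_label, task2_label) for a single diagnosis.

    task1_label is 1 for melanoma and 0 for nevus or seborrheic keratosis.
    task2_label is 1 for seborrheic keratosis and 0 for nevus or melanoma.
    """
    if diagnosis not in DIAGNOSES:
        raise ValueError(f"unexpected diagnosis: {diagnosis!r}")
    return (
        int(diagnosis == "melanoma"),
        int(diagnosis == "seborrheic keratosis"),
    )


for name in DIAGNOSES:
    print(name, binary_labels(name))
```

Note that a nevus is the negative class in both tasks, so it maps to (0, 0), and no lesion maps to (1, 1).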
Lesion classification data includes the original image, paired with a gold standard (definitive) diagnosis, referred to as "Ground Truth".
Training Image Data
2000 images are provided as training data: 374 "melanoma", 254 "seborrheic keratosis", and the remaining 1372 benign nevi. The training data is provided as a ZIP file, containing dermoscopic lesion images in JPEG format and a CSV file with some clinical metadata for each image.
All images are named using the scheme
<image_id> is a 7-digit unique identifier. EXIF tags in the images have been removed; any remaining EXIF tags should not be relied upon to provide accurate metadata.
The CSV file contains three columns:
image_id, identifying the image that the row corresponds to
age_approximate, containing the age of the lesion patient, rounded to 5 year intervals, or
sex, containing the sex of the lesion patient, or
Ground Truth Data
The Training Ground Truth file is a single CSV (comma-separated value) file, containing 3 columns:
* The first column of each row contains a string of the form
<image_id> matches the corresponding Training Data image.
* The second column of each row pertains to the first binary classification task (melanoma vs. nevus and seborrheic keratosis) and contains the value 0 or 1.
  * The number 1 = lesion is melanoma
  * The number 0 = lesion is nevus or seborrheic keratosis
* The third column of each row pertains to the second classification task (seborrheic keratosis vs. melanoma and nevus) and contains the value 0 or 1.
  * The number 1 = lesion is seborrheic keratosis
  * The number 0 = lesion is melanoma or nevus
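The three-column layout described above could be read along the following lines. This is a sketch under stated assumptions, not the organizers' loader: columns are read by position, any row whose label cells are not numeric is treated as a header and skipped (the description does not say whether a header row exists), and labels written as `0.0` or `1.0` are accepted. The function name `read_ground_truth` and the sample identifiers are hypothetical.

```python
import csv
import io


def read_ground_truth(handle):
    """Map each first-column identifier to (melanoma, seborrheic_keratosis) flags."""
    labels = {}
    for row in csv.reader(handle):
        if len(row) < 3:
            continue  # skip blank or short lines
        try:
            values = (float(row[1]), float(row[2]))
        except ValueError:
            continue  # assumed header row: label cells are not numeric
        if any(v not in (0.0, 1.0) for v in values):
            raise ValueError(f"labels must be 0 or 1: {row}")
        melanoma, keratosis = int(values[0]), int(values[1])
        if melanoma and keratosis:
            # A lesion has one diagnosis, so both flags cannot be set.
            raise ValueError(f"row marks both diagnoses: {row}")
        labels[row[0]] = (melanoma, keratosis)
    return labels


# Hypothetical sample data; the identifiers and header names are made up.
sample = io.StringIO(
    "id,task1,task2\n"
    "example_a,1,0\n"
    "example_b,0,1\n"
    "example_c,0.0,0.0\n"
)
truth = read_ground_truth(sample)

# A row with 0 in both label columns is a nevus.
nevi = [key for key, flags in truth.items() if flags == (0, 0)]
print(truth, nevi)
```

For a real file, the same function can be called on an opened file object, for example `open(path, newline="")`.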
Malignancy diagnosis data were obtained from expert consensus and pathology report information. Participants are not strictly required to limit development to the training data, and are free to train their algorithm using external data sources. However, any other data sources used in system development must be properly cited in the abstract.
This year, there are two phases for result submission:
An optional Validation Phase, with 150 images. Submissions to the Validation Phase are immediately evaluated and made public, allowing participants to test their submission systems and get some feedback on the performance of their submitted algorithm.
An official Test Phase, with 600 images. Submissions to the Test Phase are made against a blind held-out dataset and are immediately evaluated, but not made public until after the final submission date, as they constitute the final evaluation of participants' algorithms.
Participants may make unlimited and independent submissions to each phase, but only the most recent submission to the Test Phase will be used for official judging.
Participants will be ranked according to each category individually, as well as the average performance across both categories (giving rise to the possibility of 3 distinct "winners"). Ranks and awards will be assigned based only on area under the receiver operating characteristic curve (AUC). However, submissions will also be evaluated using a variety of common binary classification metrics, reported for scientific completeness, including:
- sensitivity at 0.5 confidence threshold
- specificity at 0.5 confidence threshold
- accuracy at 0.5 confidence threshold
- average precision evaluated at sensitivity of 100%
- specificity evaluated at a sensitivity of 82%
- specificity evaluated at a sensitivity of 89%
- specificity evaluated at a sensitivity of 95%
- area under the receiver operating characteristic curve (AUC)
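Several of the metrics above can be approximated locally with a short pure-Python sketch. It is not the official scoring code, and it rests on the following assumptions: a score greater than or equal to the threshold counts as a positive prediction, AUC is computed by pairwise comparison with ties counted as one half, and "specificity at a sensitivity of X" is taken as the best specificity over all observed score thresholds whose sensitivity is at least X. All function names and the example scores are hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(labels, scores, threshold=0.5):
    """Return (sensitivity, specificity, accuracy) at the given threshold."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(pairs)


def specificity_at_sensitivity(labels, scores, target):
    """Best specificity among observed thresholds with sensitivity >= target."""
    best = 0.0
    for t in sorted(set(scores)):
        sensitivity, specificity, _ = threshold_metrics(labels, scores, t)
        if sensitivity >= target:
            best = max(best, specificity)
    return best


# Hypothetical ground truth and confidence scores for six images.
melanoma_truth = [1, 0, 1, 0, 0, 1]
melanoma_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]
keratosis_truth = [0, 1, 0, 0, 1, 0]
keratosis_scores = [0.1, 0.8, 0.3, 0.2, 0.7, 0.4]

melanoma_auc = auc(melanoma_truth, melanoma_scores)
keratosis_auc = auc(keratosis_truth, keratosis_scores)
# Average performance across both categories, used for the combined ranking.
mean_auc = (melanoma_auc + keratosis_auc) / 2

print(melanoma_auc, keratosis_auc, mean_auc)
print(threshold_metrics(melanoma_truth, melanoma_scores))
print(specificity_at_sensitivity(melanoma_truth, melanoma_scores, 0.82))
```

The pairwise AUC is quadratic in the number of images, which is acceptable for datasets of a few hundred images. A rank-based formulation would scale better for larger sets.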