ISIC 2016 Challenge - Task 2: Lesion Dermoscopic Feature Extraction [Closed]

Goal

Participants are challenged to submit automated predictions of clinical dermoscopic features on supplied superpixel tiles.

Data

Lesion dermoscopic feature data includes the original lesion image and a corresponding superpixel mask, paired with superpixel-wise expert annotations of the presence and absence of the "globules" and "streaks" dermoscopic features.

Superpixel Overview

To reduce the variability and dimensionality of spatial feature annotations, the lesion images have been subdivided into superpixels using the SLIC0 algorithm.

A lesion image's superpixels should be semantically considered as an integer-valued label map mask image. All superpixel mask images will have the exact same X and Y spatial dimensions as their corresponding lesion image. However, to simplify storage and distribution, superpixel masks are encoded as 8-bit-per-channel 3-channel RGB PNG images. To decode a PNG superpixel image into a label map, use the following algorithm (expressed as pseudocode):

uint32 decodeSuperpixelIndex(uint8 pixelValue[3]) {
    uint8 red = pixelValue[0]
    uint8 green = pixelValue[1]
    uint8 blue = pixelValue[2]
    // "<<" is the bit-shift operator
    uint32 index = (red) + (green << 8) + (blue << 16)
    return index
}

As an actual Python function using NumPy, this algorithm is:

import numpy
def decodeSuperpixelIndex(rgbValue):
    """
    Decode an RGB representation of a superpixel label into its native scalar value.
    :param rgbValue: A single pixel, or a 3-channel image.
    :type rgbValue: numpy.ndarray of uint8, with a shape [3] or [n, m, 3]
    """
    return \
        (rgbValue[..., 0].astype(numpy.uint64)) + \
        (rgbValue[..., 1].astype(numpy.uint64) << numpy.uint64(8)) + \
        (rgbValue[..., 2].astype(numpy.uint64) << numpy.uint64(16))

# This may be used as:
from PIL import Image
image = Image.open('ISIC_0000003_superpixels.png')
assert image.mode == 'RGB'
image = numpy.array(image)
image = decodeSuperpixelIndex(image)
assert image.shape == (767, 1022)
assert image.min() == 0
assert image.max() == 990
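
Once decoded, the label map can be queried directly. The following is a minimal sketch, using a small synthetic label map in place of a real decoded mask, that counts the superpixels and builds a boolean mask for one of them. It assumes that superpixel indices are contiguous and start at 0; the helper names countSuperpixels and superpixelMask are illustrative and not part of any challenge tooling.

```python
import numpy

def countSuperpixels(labelMap):
    """Return the number of superpixels, assuming indices run from 0 to the maximum."""
    return int(labelMap.max()) + 1

def superpixelMask(labelMap, index):
    """Return a boolean image that is True where a pixel belongs to superpixel `index`."""
    return labelMap == index

# Synthetic 2x3 label map standing in for a decoded superpixel mask
labelMap = numpy.array([[0, 0, 1],
                        [2, 2, 1]], dtype=numpy.uint64)
numSuperpixels = countSuperpixels(labelMap)
maskOfOne = superpixelMask(labelMap, 1)
```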

Training Data

Download Training Data

The Training Data file is a ZIP file, containing 807 lesion images in JPEG format and 807 corresponding superpixel masks in PNG format. All lesion images are named using the scheme ISIC_<image_id>.jpg, where <image_id> is a 7-digit unique identifier. EXIF tags in the lesion images have been removed; any remaining EXIF tags should not be relied upon to provide accurate metadata. All superpixel masks are named using the scheme ISIC_<image_id>_superpixels.png, where <image_id> matches the corresponding lesion image for the superpixel mask.

Training Ground Truth

Download Training Ground Truth

The Training Ground Truth file is a ZIP file, containing 807 dermoscopic feature files in JSON format. All feature files are named using the scheme ISIC_<image_id>.json, where <image_id> matches the corresponding Training Data lesion image and superpixel mask for the feature file.
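
Because all three file types share the same <image_id>, matching files can be located by name alone. The following sketch derives the superpixel mask and feature file names from a lesion image name, based only on the naming schemes described above; the helper names are hypothetical.

```python
import re

# Lesion image names follow ISIC_<image_id>.jpg with a 7-digit identifier
IMAGE_PATTERN = re.compile(r'^ISIC_(\d{7})\.jpg$')

def imageIdFor(imageName):
    """Extract the 7-digit image ID from a lesion image file name."""
    match = IMAGE_PATTERN.match(imageName)
    if match is None:
        raise ValueError('Not a lesion image name: %r' % imageName)
    return match.group(1)

def superpixelNameFor(imageName):
    """Return the superpixel mask file name for a lesion image file name."""
    return 'ISIC_%s_superpixels.png' % imageIdFor(imageName)

def featureNameFor(imageName):
    """Return the feature file name for a lesion image file name."""
    return 'ISIC_%s.json' % imageIdFor(imageName)
```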

Each feature file contains a top-level JSON Object (key-value map) with 2 keys: globules and streaks, representing the dermoscopic features of interest. The value of each of these Object elements is a JSON Array of length N, where N is the total number of superpixels in the corresponding superpixel mask. The value at position k in the Array, where 0 <= k < N, corresponds to the region of the superpixel with decoded index k. The Array values are each JSON Numbers, equal to either:

  • 0: representing the absence of a given dermoscopic feature everywhere within the corresponding superpixel's spatial extent
  • 1: representing the presence of a given dermoscopic feature somewhere within the corresponding superpixel's spatial extent

For example, the feature file:

{
    "globules": [0, 0, 1, 0, 1, 0],
    "streaks": [1, 1, 0, 0, 0, 0]
}

would correspond to a superpixel file with 6 superpixels (encoded in PNG as R=0, G=0, B=0 through R=5, G=0, B=0). The lesion image pixels overlaid by superpixels 2 and 4 (counting from 0) would contain the "globules" dermoscopic feature, while the lesion image pixels overlaid by superpixels 0 and 1 would contain the "streaks" dermoscopic feature.
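
To make the correspondence concrete, the per-superpixel values can be expanded into a per-pixel image by indexing the feature Array with the decoded label map. The following is a sketch under stated assumptions: the 2x3 label map is synthetic (one pixel per superpixel) and stands in for a real decoded mask, and featureToPixelMask is an illustrative helper name.

```python
import json
import numpy

def featureToPixelMask(labelMap, featureValues):
    """
    Expand per-superpixel feature values into a per-pixel image.
    :param labelMap: Decoded superpixel label map, of integer type and shape [n, m].
    :param featureValues: Sequence of length N, with one value per superpixel.
    """
    featureValues = numpy.asarray(featureValues)
    # Integer-array indexing looks up the feature value of each pixel's superpixel
    return featureValues[labelMap.astype(numpy.intp)]

features = json.loads(
    '{"globules": [0, 0, 1, 0, 1, 0], "streaks": [1, 1, 0, 0, 0, 0]}')
# Synthetic label map with 6 superpixels, standing in for a decoded mask
labelMap = numpy.array([[0, 1, 2],
                        [3, 4, 5]], dtype=numpy.uint64)
globulesMask = featureToPixelMask(labelMap, features['globules'])
streaksMask = featureToPixelMask(labelMap, features['streaks'])
```

The resulting arrays have the same shape as the label map, so they can be overlaid on the lesion image directly.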

Notes

Feature data were obtained from expert superpixel-level annotations, with cross-validation from multiple evaluators.

The dermoscopic features of "globules" and "streaks" are not mutually exclusive (i.e. both may be present within the same spatial region or superpixel). Additionally, a dermoscopic feature need only be present somewhere within a superpixel region for the superpixel to be considered positive for that feature; the feature is not required to fill the entire superpixel region.

Information relevant to automatically determining the label of a superpixel tile is not necessarily confined to the tile itself; it may also involve contextual information from the surrounding region.

Participants are not strictly required to use the training data in the development of their feature detection algorithm and are free to train their algorithm using external data sources.

Dermoscopic Feature Tutorial

The following tutorial is designed to assist participants in understanding the underlying semantics of the "globules" and "streaks" dermoscopic features:

Globules and Streaks Tutorial

Submission Format

Test Data

Given the Test Data file, a ZIP file of 335 lesion images and 335 corresponding superpixel masks of the exact same formats as the Training Data, participants are expected to generate and submit a file of Test Results.

The Test Data file should be downloaded via the "Download test dataset" button below, which becomes available once a participant has signed in and opted to participate in this phase of the challenge.

Test Results

The submitted Test Results file should be in the same format as the Training Ground Truth file. Specifically, the Test Results file should be a ZIP file of 335 feature files in JSON format. Each feature file should contain the participant's best attempt at a fully automated per-superpixel detection of the globules and streaks features on the corresponding lesion image and superpixel mask in the Test Data. Each feature file should be named and encoded according to the conventions of the Training Ground Truth.
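
As an illustration, a Test Results archive could be assembled as in the following sketch. It assumes a flat ZIP layout with one ISIC_<image_id>.json file per image, since the required directory structure is not stated here; the function name writeTestResults and the example predictions are hypothetical.

```python
import io
import json
import zipfile

def writeTestResults(zipTarget, predictions):
    """
    Write one JSON feature file per image into a ZIP archive.
    :param zipTarget: A file path or a writable binary file object.
    :param predictions: Dict mapping an image ID (e.g. '0000003') to a dict with
        'globules' and 'streaks' lists of per-superpixel values.
    """
    with zipfile.ZipFile(zipTarget, 'w') as archive:
        for imageId, features in sorted(predictions.items()):
            payload = {
                'globules': [float(value) for value in features['globules']],
                'streaks': [float(value) for value in features['streaks']],
            }
            archive.writestr('ISIC_%s.json' % imageId, json.dumps(payload))

# Hypothetical predictions for a single image with 3 superpixels
examplePredictions = {
    '0000003': {'globules': [0.1, 0.9, 0.4], 'streaks': [0.0, 0.2, 0.7]},
}
buffer = io.BytesIO()
writeTestResults(buffer, examplePredictions)
```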

Note that the JSON Numbers in the submitted Test Results should not be restricted to 0.0 and 1.0; instead, they should be floating-point values in the closed interval [0.0, 1.0], where values:

  • 0.0 to 0.5: represent some confidence that the feature is absent from the lesion image anywhere within the given superpixel, with relatively lesser values indicating relatively more confidence in the absence
  • > 0.5 to 1.0: represent some confidence that the feature is present in the lesion image anywhere within the given superpixel, with relatively greater values indicating relatively more confidence in the presence

Note that arbitrary score ranges and thresholds can be converted to the range 0.0 to 1.0, with a threshold of 0.5, using the following sigmoid conversion:

1 / (1 + e^(-(a(x - b))))

where x is the original score, b is the binary threshold, and a is a scaling parameter (often the measured standard deviation on a held-out dataset).
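
A minimal Python sketch of this conversion follows. The function name scoreToConfidence and the default a=1.0 are illustrative choices; the two branches are algebraically equal to the formula above and are written this way only to avoid floating-point overflow for scores far below the threshold.

```python
import math

def scoreToConfidence(x, b, a=1.0):
    """
    Map a raw score to [0.0, 1.0] so that the binary threshold b maps to 0.5.
    :param x: The original score.
    :param b: The binary threshold on the original score scale.
    :param a: The scaling parameter.
    """
    z = a * (x - b)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # Equivalent form for negative z, which keeps the exponent non-positive
    expZ = math.exp(z)
    return expZ / (1.0 + expZ)
```

For example, a raw score equal to the threshold maps to exactly 0.5, and scores above the threshold map to values greater than 0.5.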

Submission Process

Shortly after submitting, participants will receive an email at their registered email address, either confirming that their submission was parsed and scored or notifying them that parsing of their submission failed (with a link to details on the cause of the failure). Participants should not consider their submission complete until they receive a confirmation email.

Multiple submissions may be made with absolutely no penalty. Only the most recent submission will be used to determine a participant's final score. Indeed, participants are encouraged to provide trial submissions early to ensure that the format of their submission is parsed and evaluated successfully, even if final results are not yet ready for submission.

Evaluation

Submitted Test Results feature classifications will be compared to the Test Ground Truth, which remains private until after the challenge ends. The Test Ground Truth was produced from the exact same source and methodology as the Training Ground Truth (both sets were randomly sub-sampled from a larger data pool).

Submissions will be compared using a variety of common classification metrics.

However, participants will be ranked and awards granted based only on average precision.
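
For local validation, average precision can be estimated as in the following sketch. It implements the common non-interpolated definition (the mean of the precision values at the rank of each positive label); the exact variant used for official scoring, and how scores are aggregated across features and images, are not specified here, so this should be treated as an approximation. The function name averagePrecision is illustrative.

```python
def averagePrecision(truth, scores):
    """
    Compute non-interpolated average precision for binary labels.
    :param truth: Sequence of ground truth labels, each 0 or 1.
    :param scores: Sequence of predicted confidences, one per label.
    """
    # Rank predictions from most to least confident (ties keep input order)
    ranked = sorted(zip(scores, truth), key=lambda pair: pair[0], reverse=True)
    numPositive = sum(1 for _, label in ranked if label == 1)
    if numPositive == 0:
        raise ValueError('Average precision is undefined without positive labels')
    hits = 0
    precisionSum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision among the top `rank` predictions
            precisionSum += float(hits) / rank
    return precisionSum / numPositive

# Hypothetical ground truth and confidences for 6 superpixels
perfect = averagePrecision([0, 0, 1, 0, 1, 0], [0.1, 0.2, 0.9, 0.3, 0.4, 0.05])
imperfect = averagePrecision([1, 0, 1], [0.9, 0.8, 0.7])
```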

Some useful resources for metrics computation include: