Shifts Challenge 2022

Evaluating robustness and uncertainty on real-world data


The Shifts Project is happy to announce the launch of the Shifts Challenge 2022! This will be the second iteration of the Shifts Challenge, which successfully debuted last year at NeurIPS 2021. Our intent is to raise awareness among the research community about the problems of distributional shift, robustness, and uncertainty estimation, and to identify new solutions to address them. We hope to help move the community away from small-scale image classification tasks and towards realistic, complex modalities and tasks taken from real-world, industrial applications where distributional shift is a significant problem. 

This year, the competition will consist of two new tracks:

  • Track 1: White Matter Multiple Sclerosis (MS) lesion segmentation in 3D Magnetic Resonance Imaging (MRI) of the brain

  • Track 2: Marine cargo vessel power estimation

The challenge will consist of two phases:

  • Development - here participants download the data and spend time researching and developing their proposed solutions. They can compare their solutions with everyone else's on the Development leaderboards of both tracks. Participants are also encouraged to form teams at this stage. Details on team formation will be provided later.
  • Evaluation - here participants submit their solutions to the evaluation leaderboard to compete for the top place and win.


Both tasks are high-risk ML applications strongly affected by distributional shift, have strict requirements on robustness and are of significant social relevance. 

Multiple Sclerosis (MS) is a debilitating, incurable and progressive disorder of the central nervous system that negatively impacts an individual's quality of life. Estimates suggest that a person is diagnosed with MS every five minutes, that the number of cases reached 2.8 million in 2020, and that MS is two to four times more prevalent in women than in men. Magnetic Resonance Imaging (MRI) plays a crucial role in disease diagnosis and follow-up. However, manual annotations are expensive, time-consuming, and prone to errors. Automatic, ML-based methods may introduce objectivity and labor efficiency in the tracking of MS lesions, but the availability of training images for machine learning methods is limited, and no publicly available dataset fully describes the heterogeneity of the pathology. Furthermore, changes in MRI scanner vendors, configurations, imaging software and medical personnel lead to significant variability in the imaging process. These differences, which are exacerbated when considering images collected from multiple medical centers, represent a significant distributional shift for ML-based MS detection models, reducing the applicability and robustness of automated models in real-world conditions. The development of robust MS lesion segmentation models is necessary to bring improvements in the quality and throughput of the medical care available to the growing number of MS patients.

Maritime transport delivers around 90% of the world's traded goods [Christodoulou et al., 2019] and emits almost a billion tonnes of CO2 annually, a figure that is increasing [Hilakari, 2019]. Energy consumption varies greatly depending on the chosen routes, speeds, operation and maintenance of ships, but the complex underlying relationships are not fully known or taken into account when these decisions are made, leading to significant fuel waste. Lack of predictability of fuel needs also leads to vessels carrying more fuel than necessary, which in turn costs even more fuel to carry. Training accurate consumption models, both for use on their own and with downstream route optimisation algorithms, can therefore help significantly reduce costs and emissions. However, the weather and sea conditions that affect vessel power consumption vary widely with season and geographical location and cannot all be fully measured. Furthermore, phenomena such as the accumulation of marine growth on the vessel's hull (hull fouling) cause the relationship between conditions and power to shift over time in unpredictable ways. As a result, significant distributional shifts can be expected between the real use cases of models and the data used to train and evaluate them. Inaccurate power prediction and the resultant errors in fuel planning and route optimisation can be costly and hazardous, placing the vessel, its crew and cargo at high risk. Thus, the development of uncertainty-aware and robust models is essential to enable the effective deployment of this technology to reduce the carbon footprint of global supply chains.


The task of Track 1 is the segmentation of White Matter Lesions (WML) of Multiple Sclerosis (MS) in Magnetic Resonance Images (MRIs). This involves the generation of a 3D per-voxel segmentation mask identifying each voxel as lesion or non-lesion tissue [Rovira et al., 2015][Wattjes et al., 2021].

MRIs are multi-modal images of the brain, with MS diagnosis being based mainly on:

  • T1-weighted

  • FLAIR (Fluid-Attenuated Inversion Recovery)

The objectives of this track are two-fold, as submissions will be evaluated on their:

  • Voxel-scale lesion segmentation performance under distributional shift.

  • Quality of voxel-scale uncertainty estimates for voxel-scale error detection.
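
As a rough illustration of the two objectives above, the sketch below scores a binary lesion mask with the Dice overlap and uses the binary entropy of each voxel's predicted lesion probability as an uncertainty estimate. Both choices are assumptions made for illustration and are not stated here to be the official challenge metrics; the probabilities and labels are made-up values for six voxels of a flattened volume.

```python
import math

def dice_score(pred_mask, true_mask):
    """Dice overlap between two flat binary masks (1 = lesion, 0 = non-lesion)."""
    intersection = sum(p * t for p, t in zip(pred_mask, true_mask))
    total = sum(pred_mask) + sum(true_mask)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total

def voxel_entropy(p_lesion, eps=1e-12):
    """Binary entropy (in nats) of one voxel's predicted lesion probability."""
    p = min(max(p_lesion, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

# Hypothetical lesion probabilities and ground-truth labels for six voxels.
probs = [0.95, 0.80, 0.55, 0.40, 0.10, 0.02]
truth = [1, 1, 0, 1, 0, 0]

pred = [1 if p >= 0.5 else 0 for p in probs]           # threshold at 0.5
uncertainty = [voxel_entropy(p) for p in probs]        # per-voxel uncertainty
errors = [int(p != t) for p, t in zip(pred, truth)]    # 1 where the mask is wrong

# Rank voxels from most to least uncertain.
ranked = sorted(range(len(probs)), key=lambda i: uncertainty[i], reverse=True)

print("Dice:", dice_score(pred, truth))
print("Voxels by decreasing uncertainty:", ranked)
print("Misclassified voxels:", [i for i, e in enumerate(errors) if e])
```

In this toy case the two misclassified voxels are also the two most uncertain ones, which is the behaviour that good voxel-scale uncertainty estimates should show.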

The task of Track 2 is a scalar regression task that involves predicting the power consumption of a merchant vessel given tabular features describing the operational state of the vessel and the weather and sea conditions. If a model is probabilistic, it yields a probability density over the power consumption, which is treated as a continuous random variable.
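
To make the idea of a predictive density concrete, here is a minimal sketch that assumes the model outputs a Gaussian, described by a mean and a standard deviation, for each record. The Gaussian form, the kW unit and the numbers are illustrative assumptions only. The negative log-density of the observed power under the prediction is one common way to score such an output.

```python
import math

def gaussian_nll(y, mean, std):
    """Negative log-density of y under a Normal(mean, std**2) prediction."""
    var = std ** 2
    return 0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)

# Hypothetical prediction for one record (units assumed to be kW).
pred_mean_kw = 12000.0
observed_kw = 12400.0

# Same mean, two different claimed uncertainties.
nll_wide = gaussian_nll(observed_kw, pred_mean_kw, std=500.0)
nll_narrow = gaussian_nll(observed_kw, pred_mean_kw, std=50.0)

print("NLL with std = 500 kW:", nll_wide)
print("NLL with std = 50 kW:", nll_narrow)
```

The overconfident prediction (std of 50 kW) receives a much worse score than the wider one, because the observed value lies far outside the range it claimed to be likely.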

The objectives of this track are similarly two-fold, as submissions will be evaluated on their:

  • Power consumption prediction performance under distributional shift.

  • Quality of the uncertainty estimates for error detection.
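
One way to check whether uncertainty estimates are useful for error detection is to sort predictions from most to least confident and track the average error as less confident predictions are included. The sketch below is a simplified illustration under that assumption, with made-up squared errors and predictive standard deviations; it is not presented as the challenge's official evaluation procedure.

```python
def retention_curve(errors, uncertainties):
    """Mean error over the k most confident predictions, for k = 1..n."""
    # Indices ordered from lowest to highest uncertainty.
    order = sorted(range(len(errors)), key=lambda i: uncertainties[i])
    curve, running = [], 0.0
    for k, i in enumerate(order, start=1):
        running += errors[i]
        curve.append(running / k)
    return curve

# Hypothetical squared errors and predictive standard deviations.
sq_errors = [4.0, 1.0, 9.0, 0.0]
stds = [2.0, 1.0, 3.0, 0.5]

curve = retention_curve(sq_errors, stds)
print(curve)
```

When uncertainty tracks error well, as in this toy data, the curve rises steadily: the most confident predictions have the smallest errors, and the mean error grows only as the least certain predictions are added.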


Contact Information

The best way to keep in touch with both the organisers and other participants is via our Discord Server. Here you can ask us questions, collaborate with other participants and form teams. You are also welcome to join our Mailing List and follow us on Twitter.

REFERENCES

[Amodei et al., 2016] Dario Amodei, Chris Olah, Jacob Steinhardt, Paul F. Christiano, John Schulman, and Dan Mané, “Concrete problems in AI safety,” arXiv:1606.06565, 2016.

[Gal, 2016] Yarin Gal, Uncertainty in Deep Learning, Ph.D. thesis, University of Cambridge, 2016.

[Malinin, 2019] Andrey Malinin, Uncertainty Estimation in Deep Learning with application to Spoken Language Assessment, Ph.D. thesis, University of Cambridge, 2019.

[Walton et al., 2020] Clare Walton, Rachel King, Lindsay Rechtman, Wendy Kaye, Emmanuelle Leray, Ruth Ann Marrie, Neil Robertson, Nicholas La Rocca, Bernard Uitdehaag, Ingrid van der Mei, et al., “Rising prevalence of multiple sclerosis worldwide: Insights from the atlas of MS,” Multiple Sclerosis Journal, vol. 26, no. 14, pp. 1816–1821, 2020.

[Zeng et al., 2020] Chenyi Zeng, Lin Gu, Zhenzhong Liu, and Shen Zhao, “Review of deep learning approaches for the segmentation of multiple sclerosis lesions on brain MRI,” Frontiers in Neuroinformatics, vol. 14, 2020.

[Christodoulou et al., 2019] Anastasia Christodoulou and Johan Woxenius, “Sustainable short sea shipping,” 2019.

[Hilakari, 2019] Marianna Hilakari, “Carbon footprint calculation of shipbuilding,” 2019.