The hypothalamus is a small brain structure with an essential role in sleep regulation, body temperature control, and metabolic homeostasis. Hypothalamic structural abnormalities have been reported in neuropsychiatric disorders such as schizophrenia, amyotrophic lateral sclerosis, and Alzheimer's disease. Although magnetic resonance imaging (MRI) is the standard exam for evaluating this region, hypothalamic morphological landmarks are unclear, which leads to subjectivity and high variability in manual segmentation. Because of these limitations, the literature often reports contradictory results on hypothalamic volumetry. To the best of our knowledge, only two automated methods for hypothalamus segmentation are available in the literature, the first being our previous U-Net-based method.

To support further development of hypothalamus segmentation models, we present the first public hypothalamus segmentation dataset: a diverse collection of T1-weighted MR images from 1381 subjects drawn from IXI, CC359, OASIS, and MiLI (the latter created specifically for this benchmark). All images are provided with automatically generated hypothalamus masks, and a subset also has manually annotated masks. As a baseline, we present a teacher-student-based model for fully automated segmentation of the hypothalamus on T1-weighted MR images.


The type of annotation varies by dataset. We provide five types of annotation:

  • Specialist manual annotation: Segmentation performed by a specialist, following the segmentation protocol established for the study;

  • Inexperienced rater manual annotation: Segmentation performed by an inexperienced rater. It does not follow the segmentation protocol determined by the specialist, but it increases the amount of data available for training;

  • Automated annotation: Segmentation performed by available automated tools;

  • STAPLE: Simultaneous truth and performance level estimation. Applied to cases with three segmentations available: the inexperienced rater manual annotation and two automated annotations; and

  • Consensus: Applied to cases without manual segmentation, by building a consensus of the two automated segmentation methods.
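
To make the idea of combining several segmentations concrete, the sketch below fuses masks by simple majority voting. This is an assumption made for illustration only: the page does not state how the consensus is computed, and STAPLE itself is a different, iterative algorithm that weights each rater by an estimated sensitivity and specificity. Masks are represented here as sets of (x, y, z) voxel coordinates, and the function name `majority_vote` is hypothetical.

```python
from collections import Counter


def majority_vote(masks):
    """Fuse binary masks by keeping voxels labelled in more than half of them.

    Each mask is a set of (x, y, z) voxel coordinates marked as hypothalamus.
    With two masks this reduces to their intersection (both must agree).
    NOTE: illustrative stand-in, not the STAPLE algorithm and not
    necessarily the consensus rule used to build this dataset.
    """
    counts = Counter(voxel for mask in masks for voxel in mask)
    return {voxel for voxel, n in counts.items() if 2 * n > len(masks)}


# Toy example: one manual rater and two automated tools.
rater = {(1, 1, 1), (1, 1, 2)}
auto_a = {(1, 1, 2), (1, 1, 3)}
auto_b = {(1, 1, 2), (1, 1, 3), (1, 1, 4)}

fused_three = majority_vote([rater, auto_a, auto_b])
fused_two = majority_vote([auto_a, auto_b])
```

In this toy case both calls keep only the voxels (1, 1, 2) and (1, 1, 3), since those are the ones labelled by a strict majority of the inputs.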

To support the development of models that generalize well, we provide hypothalamus labels for four datasets: CC359, IXI, OASIS, and MiLI (MICLab-LNI Initiative). The number of images and the types of labels for each dataset are shown in the table below:


To encourage the development of new methods, we maintain a leaderboard open to any researcher who wishes to participate. We evaluate four metrics:

  • Dice Coefficient

  • Average Hausdorff Distance

  • Hausdorff Distance

  • Volume Symmetry
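
As a rough guide to what these metrics measure, here is a minimal sketch using common textbook definitions. The exact formulas used by the leaderboard are not given on this page, so everything below is an assumption: in particular, the average Hausdorff distance is taken as the mean of the two directed average distances, and "Volume Symmetry" is interpreted as the volumetric similarity 1 - |Va - Vb| / (Va + Vb). Masks are non-empty sets of (x, y, z) voxel coordinates, and distances are in voxel units (a real evaluation would account for voxel spacing).

```python
import math


def dice(a, b):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))


def _directed_distances(src, dst):
    """Distance from each voxel in src to its nearest voxel in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def hausdorff(a, b):
    """Symmetric Hausdorff distance: worst nearest-neighbour distance."""
    return max(max(_directed_distances(a, b)), max(_directed_distances(b, a)))


def average_hausdorff(a, b):
    """Mean of the two directed average distances (one common definition)."""
    d_ab = _directed_distances(a, b)
    d_ba = _directed_distances(b, a)
    return 0.5 * (sum(d_ab) / len(d_ab) + sum(d_ba) / len(d_ba))


def volume_similarity(a, b):
    """Assumed reading of 'Volume Symmetry': 1 - |Va - Vb| / (Va + Vb)."""
    if not a and not b:
        return 1.0
    return 1.0 - abs(len(a) - len(b)) / (len(a) + len(b))


# Toy prediction and ground truth, three voxels each, two in common.
pred = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
truth = {(0, 0, 0), (0, 0, 1), (0, 0, 2)}

scores = {
    "dice": dice(pred, truth),
    "hausdorff": hausdorff(pred, truth),
    "average_hausdorff": average_hausdorff(pred, truth),
    "volume_similarity": volume_similarity(pred, truth),
}
```

The brute-force nearest-neighbour search is quadratic in the number of voxels, which is acceptable for a structure as small as the hypothalamus but would be replaced by a distance transform in a production pipeline.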


If you wish to have your team on our leaderboard, please send your segmentation results for the test set, as .nii or .nii.gz files in a compressed folder, to l180545@dac.unicamp.br
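
One possible way to prepare such a folder is sketched below. It assumes a ZIP archive is acceptable (the page does not name an archive format) and that all predicted masks sit directly in one directory; the function name `pack_submission` and the paths in the usage note are hypothetical.

```python
import zipfile
from pathlib import Path


def pack_submission(results_dir, archive_path):
    """Zip every .nii / .nii.gz file found directly inside results_dir.

    Returns the list of file names written to the archive.
    ZIP is an assumed format; any other compressed folder may also be fine.
    """
    results_dir = Path(results_dir)
    files = sorted(
        p for p in results_dir.iterdir()
        if p.is_file() and p.name.endswith((".nii", ".nii.gz"))
    )
    if not files:
        raise ValueError(f"no .nii or .nii.gz files found in {results_dir}")
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in files:
            # Store files flat, without the local directory structure.
            zf.write(path, arcname=path.name)
    return [p.name for p in files]
```

A call such as `pack_submission("predictions", "team_submission.zip")` would then produce a single archive to attach to the e-mail.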

The data are available in the Download section.