Featuring Every Evaluation Result Ever on Hugging Face Model Pages

Hugging Face model pages now prominently feature benchmark evaluation results, making it easier for researchers and developers to compare model performance at a glance. As part of this effort, the dataset cais/hle (Humanity's Last Exam, HLE) has been updated and highlighted on the platform.


Key Details


  • Dataset: cais/hle – a benchmark dataset used for evaluating language models.
  • Last Updated: January 20, 2026 (as of the latest refresh timestamp).
  • Usage Stats:
      • 2.5k downloads
      • 27.7k views
      • 847 likes
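For readers who want to check figures like these themselves, here is a minimal sketch, assuming the `huggingface_hub` Python package is installed and that the dataset's metadata can be read without special access. The helper names `format_stats` and `fetch_stats` are hypothetical and introduced only for illustration. The attribute names (`id`, `downloads`, `likes`, `last_modified`) follow recent versions of the library and may differ in older releases. View counts are not covered by this sketch.

```python
def format_stats(info):
    """Render the id, download, like, and last-modified fields of a dataset-info object."""
    # last_modified is assumed to be a datetime (or None when unavailable).
    modified = getattr(info, "last_modified", None)
    modified_text = modified.date().isoformat() if modified is not None else "unknown"
    return (
        f"{info.id}: {info.downloads} downloads, "
        f"{info.likes} likes, last modified {modified_text}"
    )


def fetch_stats(repo_id="cais/hle"):
    """Fetch dataset metadata from the Hub (hypothetical helper; needs network access)."""
    # Imported lazily so format_stats works even without the library installed.
    from huggingface_hub import HfApi

    return HfApi().dataset_info(repo_id)
```

Calling `print(format_stats(fetch_stats()))` would print a one-line summary for cais/hle. The numbers returned by the Hub may not match the figures listed above, since the two may be counted or rounded differently.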

The integration places up-to-date evaluation metadata directly on model pages, so users can assess model capabilities without navigating to external sources.

via Hugging Face Blog
