🚀 Launching Napolab: The Natural Portuguese Language Benchmark 📊

Napolab is here: a curated collection of Portuguese datasets designed for easy evaluation of language models.

Explore and contribute on GitHub:

https://github.com/ruanchaves/napolab

πŸ” Why Napolab?

🌿 Natural: Contains only native or professionally translated Portuguese datasets.

✅ Reliable: Provides trustworthy evaluation metrics.

🌐 Publicly Accessible: All datasets are available for public access.

πŸ‘©β€πŸ”§ Human-Annotated: Every dataset exclusively features expert human annotations.

🎓 General-Purpose: No domain-specific knowledge or advanced preparation is needed to solve dataset tasks.

Napolab also offers translated versions of all datasets in the following languages:

  • Catalan
  • English
  • Galician
  • Spanish

Get started and download with just two commands:

pip install napolab

python -m napolab
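
For anyone scripting the setup, the two commands above could be wrapped roughly as follows. This is only a sketch: the helper names `is_installed`, `build_commands`, and `run` are hypothetical and not part of Napolab, and it assumes nothing about the package beyond the two commands shown in this post.

```python
import importlib.util
import subprocess
import sys


def is_installed(package: str) -> bool:
    """Return True if `package` is importable in the current environment."""
    return importlib.util.find_spec(package) is not None


def build_commands(package: str = "napolab") -> list[list[str]]:
    """Build the commands from the post, bound to the current interpreter.

    Hypothetical helper: the install step is skipped when the package
    is already importable.
    """
    commands = []
    if not is_installed(package):
        commands.append([sys.executable, "-m", "pip", "install", package])
    commands.append([sys.executable, "-m", package])
    return commands


def run(package: str = "napolab") -> None:
    """Run each command in order, stopping at the first failure."""
    for command in build_commands(package):
        subprocess.run(command, check=True)
```

Calling `run()` would then execute `pip install napolab` (only if needed) followed by `python -m napolab`, using the same interpreter for both steps so the package ends up in the environment that runs it.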

#Napolab #PortugueseBenchmark #NLP #Datasets

--

Ruan Chaves Rodrigues

Machine Learning Engineer. MSc student in the EMLCT programme. Personal website: https://ruanchaves.github.io/