Model for arXiv text summarization

  • Description: This AI benchmark evaluates how well a model summarizes text, using the arXiv dataset. The metric is the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) score.
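
To make the metric concrete, here is a minimal sketch of a ROUGE-N style score based on n-gram overlap between a reference summary and a generated one. The page does not state which ROUGE variant or tokenizer the benchmark uses, so the function name `rouge_n`, the whitespace tokenization, and the lowercasing are all assumptions made for illustration; the reported score may come from a different variant (for example ROUGE-L) or a library implementation.

```python
from collections import Counter


def rouge_n(reference: str, candidate: str, n: int = 1) -> dict:
    """Illustrative ROUGE-N: clipped n-gram overlap between two texts.

    Assumption: simple lowercased whitespace tokenization, no stemming.
    """

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref = ngrams(reference)
    cand = ngrams(candidate)

    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((ref & cand).values())

    # Recall is measured against the reference, precision against the candidate.
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    scores = rouge_n("the cat sat on the mat", "the cat lay on the mat")
    print(scores)
```

In this toy example five of the six reference unigrams also appear in the candidate, so recall is 5/6. Scores range from 0 (no overlap) to 1 (identical n-gram content).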


Reference(s): https://doi.org/10.48550/arXiv.1905.00075, https://github.com/usnistgov/chemnlp

Model benchmarks

Model name            Dataset        ROUGE   Team name  Dataset size  Date submitted  Notes
transformers_t5_base  arxiv_summary  0.2602  ChemNLP    87148         01-14-2023      CSV, JSON, run.sh, Info