
Model for PubChem text class

  • Description: This AI benchmark evaluates how accurately text data is classified into categories, using the PubChem dataset. Models are compared by classification accuracy (ACC), measured against the ground-truth PubChem category labels.
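
As a rough illustration of the ACC metric described above, the sketch below computes the fraction of predictions that match the ground-truth labels. The function name classification_accuracy and the label values are hypothetical and chosen only for this example; they are not taken from the ChemNLP code or from the PubChem categories.

```python
from typing import Sequence


def classification_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Return the fraction of predictions that equal the ground-truth label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")
    # Count exact label matches, then normalize by the number of samples.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels for illustration only (not real PubChem categories).
truth = ["cat_a", "cat_b", "cat_a", "cat_c"]
preds = ["cat_a", "cat_b", "cat_c", "cat_c"]

acc = classification_accuracy(truth, preds)
print(f"ACC = {acc:.4f}")  # 3 of 4 labels match, so this prints ACC = 0.7500
```

An ACC of 0.9674, as reported for the top entries in the table, would then mean that about 96.7% of the evaluated samples received the correct category.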


Reference(s): https://github.com/usnistgov/chemnlp, https://doi.org/10.1093/nar/gkaa971

Model benchmarks

| Model name | Dataset | Accuracy | Team name | Dataset size | Date submitted | Notes |
|---|---|---|---|---|---|---|
| random_forest_text_title_abstract | pubchem | 0.9674 | ChemNLP | 44500 | 01-14-2023 | CSV, JSON, run.sh, Info |
| logisticreg_model_text_abstract | pubchem | 0.9276 | ChemNLP | 44500 | 01-14-2023 | CSV, JSON, run.sh, Info |
| random_forest_text_abstract | pubchem | 0.9317 | ChemNLP | 44500 | 01-14-2023 | CSV, JSON, run.sh, Info |
| logisticreg_model_text_title_abstract | pubchem | 0.9674 | ChemNLP | 44500 | 01-14-2023 | CSV, JSON, run.sh, Info |
| random_forest_text_title | pubchem | 0.9449 | ChemNLP | 44500 | 01-14-2023 | CSV, JSON, run.sh, Info |
| svc_model_text_title_abstract | pubchem | 0.94 | ChemNLP | 44500 | 01-14-2023 | CSV, JSON, run.sh, Info |
| svc_model_text_title | pubchem | 0.9458 | ChemNLP | 44500 | 01-14-2023 | CSV, JSON, run.sh, Info |
| logisticreg_model_text_title | pubchem | 0.9206 | ChemNLP | 44500 | 01-14-2023 | CSV, JSON, run.sh, Info |
| svc_model_text_abstract | pubchem | 0.94 | ChemNLP | 44500 | 01-14-2023 | CSV, JSON, run.sh, Info |