Run description: I split the query into sentences and identified the "Abstract sentences", i.e., sentences that I would expect to match the abstract of a Wikipedia document. For example, a query sentence about a movie's release date is an Abstract sentence, since the release date usually appears in the first paragraph of a movie's Wikipedia article. I used the movie tip-of-the-tongue queries and answers from the Reddit ToT dataset in https://github.com/webis-de/QPP-23
Run description: I split the query into sentences and identified the "Abstract sentences", i.e., sentences that I would expect to match the abstract of a Wikipedia document. For example, a query sentence about a movie's release date is an Abstract sentence, since the release date usually appears in the first paragraph of a movie's Wikipedia article. I searched using only the abstract sentences, and only against the abstracts of the documents. I used the movie tip-of-the-tongue queries and answers from the Reddit ToT dataset in https://github.com/webis-de/QPP-23
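A minimal sketch of this two-stage idea in Python, assuming a naive keyword heuristic in place of the run's real abstract-sentence classifier and the rank_bm25 package for retrieval; the cue words and function names below are illustrative, not the actual run code.

import re
from rank_bm25 import BM25Okapi

# Cue words standing in for the real "Abstract sentence" classifier:
# sentences about release dates, directors, or cast tend to match the
# first paragraph of a movie's Wikipedia article.
ABSTRACT_CUES = ("release", "released", "directed", "starring", "year")

def is_abstract_sentence(sentence: str) -> bool:
    s = sentence.lower()
    return any(cue in s for cue in ABSTRACT_CUES)

def tokenize(text: str) -> list:
    return re.findall(r"\w+", text.lower())

def search_abstracts(query: str, abstracts: list, k: int = 10) -> list:
    # 1) Split the query into sentences.
    sentences = re.split(r"(?<=[.!?])\s+", query)
    # 2) Keep only the abstract sentences.
    abstract_part = " ".join(s for s in sentences if is_abstract_sentence(s))
    # 3) Search only over the abstracts of the documents.
    bm25 = BM25Okapi([tokenize(a) for a in abstracts])
    scores = bm25.get_scores(tokenize(abstract_part))
    return sorted(range(len(abstracts)), key=lambda i: -scores[i])[:k]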
Run description: We used predicted sentence annotations to boost certain sentences in the query. We trained a KNN classifier on the train and dev sets to predict the sentence annotations. The degree of boosting was proportional to the classifier's predicted confidence value.
Run description: We used predicted sentence annotations to boost certain sentences in the query. We trained a KNN classifier on the train and dev sets to predict the sentence annotations.
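A minimal sketch of confidence-proportional boosting, assuming sentence-transformers embeddings, scikit-learn's KNeighborsClassifier, and Lucene-style "..."^boost query syntax; the annotation label "plot" is a placeholder, not the actual annotation scheme.

from sentence_transformers import SentenceTransformer
from sklearn.neighbors import KNeighborsClassifier

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder

def fit_annotation_knn(train_sentences, train_labels, k=5):
    # Train on annotated sentences from the train and dev sets.
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(encoder.encode(train_sentences), train_labels)
    return knn

def boosted_query(knn, query_sentences, boosted_labels=frozenset({"plot"})):
    probs = knn.predict_proba(encoder.encode(query_sentences))
    labels = knn.classes_
    parts = []
    for sent, p in zip(query_sentences, probs):
        conf = max((p[i] for i, l in enumerate(labels) if l in boosted_labels),
                   default=0.0)
        boost = 1.0 + conf  # boosting proportional to predicted confidence
        parts.append(f'"{sent}"^{boost:.2f}')
    return " ".join(parts)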
Run description: During training, we used the sentence annotations to augment the data by cropping unnecessary information. We followed [1] in selecting what counts as 'unnecessary': annotation types that lower model performance are cropped. During testing, we removed the unnecessary sentences from the query. [1] Jaime Arguello, Adam Ferguson, Emery Fine, Bhaskar Mitra, Hamed Zamani, and Fernando Diaz. Tip of the tongue known-item retrieval: A case study in movie identification. In Proc. ACM CHIIR '21, pp. 5-14, 2021. English Wikipedia and BookCorpus were used as pretraining data (via the BERT-base backbone).
Run description: During training, we used the sentence annotations to augment the data by cropping unnecessary information. We followed [1] in selecting what counts as 'unnecessary': annotation types that lower model performance are cropped. During testing, we removed the unnecessary sentences from the query. [1] Jaime Arguello, Adam Ferguson, Emery Fine, Bhaskar Mitra, Hamed Zamani, and Fernando Diaz. Tip of the tongue known-item retrieval: A case study in movie identification. In Proc. ACM CHIIR '21, pp. 5-14, 2021. I used BERT-base as my backbone model, so English Wikipedia and BookCorpus were used as pretraining data.
Run description: During training, we used the sentence annotations to augment the data by cropping unnecessary information. We followed [1] in selecting what counts as 'unnecessary': annotation types that lower model performance are cropped. [1] Jaime Arguello, Adam Ferguson, Emery Fine, Bhaskar Mitra, Hamed Zamani, and Fernando Diaz. Tip of the tongue known-item retrieval: A case study in movie identification. In Proc. ACM CHIIR '21, pp. 5-14, 2021. I used BERT-base as my backbone model, so English Wikipedia and BookCorpus were used as pretraining data.
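A minimal sketch of the cropping step shared by the three runs above, assuming each query sentence carries one annotation label and that the set of 'unnecessary' labels has already been chosen by measuring which annotation types lower performance, following [1]; the label names are placeholders.

# Labels assumed (for illustration) to hurt retrieval when kept.
UNNECESSARY = {"social", "opinion", "previous_search"}

def crop(sentences, labels):
    # Remove sentences whose annotation is in the 'unnecessary' set.
    return [s for s, l in zip(sentences, labels) if l not in UNNECESSARY]

def augment(sentences, labels):
    # Training-time augmentation: keep the original query and, when
    # cropping changes it, also the cropped copy.
    original = list(sentences)
    cropped = crop(sentences, labels)
    return [original] + ([cropped] if cropped != original else [])

# At test time, the same crop() is applied to the incoming query.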
Run description: We used the TOMT-KIS dataset (tip-of-my-tongue known-item search: https://webis.de/downloads/publications/papers/froebe_2023c.pdf) to train DeepCT for long query reduction. We removed the questions that also appear in the MS-TOT dataset, but we did not check whether other questions in our dataset that are not in MS-TOT link to the same known items.
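A minimal sketch of the query-reduction step, assuming the trained DeepCT model is wrapped in a hypothetical term_weights(query) function that returns an importance score per term; the function name and the 0.2 threshold are illustrative assumptions, not the run's actual setup.

def reduce_query(query: str, term_weights, threshold: float = 0.2) -> str:
    # Keep only the terms that the DeepCT model weights highly.
    weights = term_weights(query)  # {term: importance in [0, 1]}
    kept = [t for t in query.split() if weights.get(t.lower(), 0.0) >= threshold]
    return " ".join(kept)

# Example with a stub in place of the real model:
stub = lambda q: {"movie": 0.9, "1990s": 0.7, "the": 0.01}
print(reduce_query("the movie from the 1990s", stub))  # -> "movie 1990s"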