Runs - Podcast 2021
baseline-BM25
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: baseline-BM25
- Participant: BASELINES
- Track: Podcast
- Year: 2021
- Submission: 9/2/2021
- Type: automatic
- Task: retrieval
- MD5: 3dc8635299b59fe1058267fa48fdef30
- Run description: Baseline using Pyserini BM25
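As a point of reference for this and the other BM25-based runs, the ranking function that Pyserini's default BM25 implements can be sketched in pure Python (an illustration only, not the Lucene code Pyserini actually calls; k1=0.9 and b=0.4 are Pyserini's defaults):

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, df, n_docs, avg_len, k1=0.9, b=0.4):
    """BM25 score of one document for a query (Pyserini's default k1/b)."""
    tf = Counter(doc_terms)
    score = 0.0
    for term in set(query_terms):
        if tf[term] == 0 or df.get(term, 0) == 0:
            continue  # term absent from doc or collection contributes nothing
        idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
        norm = tf[term] * (k1 + 1) / (
            tf[term] + k1 * (1 - b + b * len(doc_terms) / avg_len))
        score += idf * norm
    return score
```

In the actual run, Lucene computes this over an inverted index rather than per-document loops.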
baseline-BM25-D
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: baseline-BM25-D
- Participant: BASELINES
- Track: Podcast
- Year: 2021
- Submission: 9/2/2021
- Type: automatic
- Task: retrieval
- MD5: f3dd830db3c2e2a2af4296bdbc2513d1
- Run description: Baseline using Pyserini BM25, including the Description field
Baseline-oneminute
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: Baseline-oneminute
- Participant: BASELINES
- Track: Podcast
- Year: 2021
- Submission: 9/3/2021
- Type: automatic
- Task: summarization
- Run description: First minute of podcast
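This baseline is simple enough to sketch directly. Assuming word-level start times in the transcript (the `word`/`start` keys here are hypothetical field names, not necessarily the dataset's schema):

```python
def first_minute_summary(words, cutoff_s=60.0):
    """Take every transcript word that starts before the cutoff (default: the first minute)."""
    return " ".join(w["word"] for w in words if w["start"] < cutoff_s)
```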
baseline-QL-D
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: baseline-QL-D
- Participant: BASELINES
- Track: Podcast
- Year: 2021
- Submission: 9/2/2021
- Type: automatic
- Task: retrieval
- MD5: 13282e5078271250d5b481b378f7f130
- Run description: Baseline run using Pyserini QL, including the Description field
baseline-QL-Q
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: baseline-QL-Q
- Participant: BASELINES
- Track: Podcast
- Year: 2021
- Submission: 9/2/2021
- Type: automatic
- Task: retrieval
- MD5: 712e62a030bbdb6159b8233348d8d8f9
- Run description: Baseline run using Pyserini QL
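Pyserini's QL ranking is query likelihood with Dirichlet smoothing (enabled via `searcher.set_qld()`; mu=1000 is Pyserini's default). A pure-Python sketch of the scoring, for illustration only:

```python
import math
from collections import Counter

def ql_dirichlet(query_terms, doc_terms, coll_tf, coll_len, mu=1000.0):
    """Query-likelihood score with Dirichlet smoothing (mu=1000 is Pyserini's default)."""
    tf = Counter(doc_terms)
    score = 0.0
    for t in query_terms:
        p_coll = coll_tf.get(t, 0) / coll_len  # background probability P(t|C)
        if p_coll == 0:
            continue  # skip terms unseen in the whole collection
        score += math.log((tf[t] + mu * p_coll) / (len(doc_terms) + mu))
    return score
```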
f_b25_coil
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: f_b25_coil
- Participant: CFDA_CLIP
- Track: Podcast
- Year: 2021
- Submission: 9/6/2021
- Type: automatic
- Task: retrieval
- MD5: de79c36f26c4a07beefe01efc6cafdf0
- Run description: Encoding: transcripts only. BM25 + tct-coil trained on a passage ranking dataset; score fusion.
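The description does not specify the fusion formula; a common recipe for fusing the scores of two runs, sketched here as an assumption rather than the team's exact method, is min-max normalization per run followed by a weighted sum:

```python
def minmax(scores):
    """Rescale a run's scores to [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0  # avoid division by zero when all scores are equal
    return {doc: (s - lo) / span for doc, s in scores.items()}

def fuse(run_a, run_b, w=0.5):
    """Weighted sum of min-max-normalized scores; docs missing from a run contribute 0."""
    a, b = minmax(run_a), minmax(run_b)
    docs = set(a) | set(b)
    return {d: w * a.get(d, 0.0) + (1 - w) * b.get(d, 0.0) for d in docs}
```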
f_b25_tct
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: f_b25_tct
- Participant: CFDA_CLIP
- Track: Podcast
- Year: 2021
- Submission: 9/6/2021
- Type: automatic
- Task: retrieval
- MD5: ad83573c2736aff405006a86422185bf
- Run description: Encoding: transcripts only. BM25 + tct trained on a document ranking dataset; score fusion.
f_coil_tct
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: f_coil_tct
- Participant: CFDA_CLIP
- Track: Podcast
- Year: 2021
- Submission: 9/6/2021
- Type: automatic
- Task: retrieval
- MD5: b2364b98564c2dc00b7a54008de70c51
- Run description: Encoding: transcripts only. tct-coil trained on passage ranking + tct trained on a document ranking dataset; score fusion.
Hotspot1
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: Hotspot1
- Participant: Spotify
- Track: Podcast
- Year: 2021
- Submission: 9/5/2021
- Type: automatic
- Task: summarization
- Run description: This run is based on hotspot detection. Each episode's audio is split into clips, where each clip corresponds to one sentence of the transcript. Speech emotion recognition is then performed on each clip, and the clips with the highest emotion scores are selected as "hotspots" and added to the summary. A SentenceBERT model is used to generate embeddings for the sentences within the first four minutes of each episode, as well as a document embedding that sums all sentence embeddings. A similarity score between each sentence embedding and the document embedding is calculated, and the sentence with the highest score is inserted at the beginning of the summary. Finally, the summary is generated in both audio and text form.
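The lead-sentence step described above can be sketched with toy vectors standing in for the SentenceBERT embeddings (in the actual run the embeddings come from a SentenceBERT model; this is only an illustration of the selection logic):

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def lead_sentence(sent_embs):
    """Index of the sentence most similar to the summed document embedding."""
    doc = [sum(dim) for dim in zip(*sent_embs)]  # document embedding = sum of sentences
    return max(range(len(sent_embs)), key=lambda i: cosine(sent_embs[i], doc))
```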
ms_mt5
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: ms_mt5
- Participant: h2oloo
- Track: Podcast
- Year: 2021
- Submission: 9/4/2021
- Type: automatic
- Task: retrieval
- MD5: c528fd2fde147b96b3367b78fb4e331d
- Run description: Pyserini Default BM25 using segments. 6-3 sliding window MaxP with a monoT5-3B trained on MS-MARCO 1K (Query Format: (Q + D))
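The "6-3 sliding window MaxP" scheme (presumably 6-sentence windows with a 3-sentence stride) can be sketched as follows, with any relevance model, monoT5 in the actual run, standing in as `score_fn`:

```python
def windows(sentences, size=6, stride=3):
    """Overlapping windows: 6 sentences wide, advancing 3 at a time ('6-3')."""
    out, start = [], 0
    while True:
        out.append(sentences[start:start + size])
        if start + size >= len(sentences):
            break
        start += stride
    return out

def maxp_score(sentences, score_fn, size=6, stride=3):
    """MaxP: a document scores as well as its best window."""
    return max(score_fn(" ".join(w)) for w in windows(sentences, size, stride))
```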
osc_tok_vec
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: osc_tok_vec
- Participant: OSC
- Track: Podcast
- Year: 2021
- Submission: 9/6/2021
- Type: automatic
- Task: retrieval
- MD5: c1e11aa5a3379c796a6c71e675af6d09
- Run description: Max normalized scores combined with Cosine similarity for embeddings in window.
osc_token
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: osc_token
- Participant: OSC
- Track: Podcast
- Year: 2021
- Submission: 9/6/2021
- Type: automatic
- Task: retrieval
- MD5: 4f2c86cafcfd659d53730ebfaa561565
- Run description: Elasticsearch's combined_fields query with field boosting to prioritize transcript
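Elasticsearch's `combined_fields` query (available since 7.13) scores several fields as if they were one combined field, with per-field boosts via the `^` syntax. A request body of the shape described might look like this (the field names and boost value are hypothetical, not OSC's actual mapping):

```python
# Hypothetical index fields; "^3" boosts the transcript over the metadata fields.
query = {
    "query": {
        "combined_fields": {
            "query": "halloween stories and chat",
            "fields": ["transcript^3", "episode_name", "episode_description"],
            "operator": "or",
        }
    }
}
```

This dict would be passed as the body of a search request against the episode-segment index.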
osc_vec_tok
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: osc_vec_tok
- Participant: OSC
- Track: Podcast
- Year: 2021
- Submission: 9/6/2021
- Type: automatic
- Task: retrieval
- MD5: a9fccc9240435ed70d29604f556baabb
- Run description: Cosine similarity on SBERT embeddings for recall
osc_vector
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: osc_vector
- Participant: OSC
- Track: Podcast
- Year: 2021
- Submission: 9/6/2021
- Type: automatic
- Task: retrieval
- MD5: f5030ce207ea53a6dda34d6ee744e43d
- Run description: Cosine similarity on SBERT embeddings
PoliTO_100_32-128
Results | Participants | Proceedings | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: PoliTO_100_32-128
- Participant: PoliTO
- Track: Podcast
- Year: 2021
- Submission: 9/3/2021
- Type: automatic
- Task: summarization
- Run description: In this run, we set the maximum number of selected sentences to 100. The abstractive summarization model was limited to produce summaries with minimum and maximum lengths of 32 and 128, respectively. Podcast transcripts were used as input both for supervised extraction and for the abstractive summarization step. Episode descriptions from the training set served as references for the supervised selection and the abstractive summarization (during training only). The audio files were not used directly; instead, we use openSMILE feature representations. Please note: the audio summaries have the same content for all our runs.
PoliTO_25_32-128
Results | Participants | Proceedings | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: PoliTO_25_32-128
- Participant: PoliTO
- Track: Podcast
- Year: 2021
- Submission: 9/3/2021
- Type: automatic
- Task: summarization
- Run description: In this run, we set the maximum number of selected sentences to 25. The abstractive summarization model was limited to produce summaries with minimum and maximum lengths of 32 and 128, respectively. Podcast transcripts were used as input both for supervised extraction and for the abstractive summarization step. Episode descriptions from the training set served as references for the supervised selection and the abstractive summarization (during training only). The audio files were not used directly; instead, we use openSMILE feature representations. Please note: the audio summaries have the same content for all our runs.
PoliTO_50_32-128
Results | Participants | Proceedings | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: PoliTO_50_32-128
- Participant: PoliTO
- Track: Podcast
- Year: 2021
- Submission: 9/3/2021
- Type: automatic
- Task: summarization
- Run description: In this run, we set the maximum number of selected sentences to 50. The abstractive summarization model was limited to produce summaries with minimum and maximum lengths of 32 and 128, respectively. Podcast transcripts were used as input both for supervised extraction and for the abstractive summarization step. Episode descriptions from the training set served as references for the supervised selection and the abstractive summarization (during training only). The audio files were not used directly; instead, we use openSMILE feature representations. Please note: the audio summaries have the same content for all our runs.
PoliTO_50_64-128
Results | Participants | Proceedings | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: PoliTO_50_64-128
- Participant: PoliTO
- Track: Podcast
- Year: 2021
- Submission: 9/4/2021
- Type: automatic
- Task: summarization
- Run description: In this run, we set the maximum number of selected sentences to 50. The abstractive summarization model was limited to produce summaries with minimum and maximum lengths of 64 and 128, respectively. Podcast transcripts were used as input both for supervised extraction and for the abstractive summarization step. Episode descriptions from the training set served as references for the supervised selection and the abstractive summarization (during training only). The audio files were not used directly; instead, we use openSMILE feature representations. Please note: the audio summaries have the same content for all our runs.
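The extractive step shared by these four runs, selecting at most N sentences before the abstractive model (with its min/max generation lengths) takes over, can be sketched as follows. The sentence scores here would come from the supervised selection model; this is an illustration of the capping logic, not PoliTO's code:

```python
def select_sentences(sentences, scores, max_sents=50):
    """Keep the max_sents highest-scoring sentences, restored to document order."""
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:max_sents]
    return [sentences[i] for i in sorted(top)]
```

The selected sentences would then be concatenated and passed to the abstractive summarizer, which is constrained to produce between the stated minimum and maximum lengths.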
s_tasb
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: s_tasb
- Participant: CFDA_CLIP
- Track: Podcast
- Year: 2021
- Submission: 9/6/2021
- Type: automatic
- Task: retrieval
- MD5: 616db4d91602953ae7c216c1a46f5b1f
- Run description: Encoding: transcripts only. tas-b model trained on the MS MARCO passage ranking dataset.
s_tct
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: s_tct
- Participant: CFDA_CLIP
- Track: Podcast
- Year: 2021
- Submission: 9/6/2021
- Type: automatic
- Task: retrieval
- MD5: 99a091911d996d5c46cd9d65bd7f6dd1
- Run description: Encoding: transcripts only. tct model trained on the MS MARCO document ranking dataset.
theTuringTest1
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: theTuringTest1
- Participant: theTuringTest
- Track: Podcast
- Year: 2021
- Submission: 9/2/2021
- Type: automatic
- Task: summarization
- Run description: Used feature engineering, including a unigram model, and metrics such as ROUGE-1, ROUGE-2, ROUGE-L, and METEOR to obtain the best possible extractive summary. Applied TOPSIS to rank each sentence based on the selected features, then iteratively removed the worst-performing sentences until the best-scoring extractive summary remained.
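TOPSIS ranks each alternative by its closeness to an ideal point in the feature space. A compact sketch, treating every criterion as a benefit criterion (an assumption; the run description does not say which criteria are costs):

```python
import math

def topsis(matrix, weights):
    """Closeness-to-ideal scores for alternatives (rows) over benefit criteria (columns)."""
    n_crit = len(weights)
    # 1. Vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # 2. Ideal best/worst per criterion.
    best = [max(col) for col in zip(*v)]
    worst = [min(col) for col in zip(*v)]
    # 3. Closeness coefficient: d_worst / (d_best + d_worst); higher is better.
    scores = []
    for row in v:
        d_best = math.sqrt(sum((a - b) ** 2 for a, b in zip(row, best)))
        d_worst = math.sqrt(sum((a - b) ** 2 for a, b in zip(row, worst)))
        total = d_best + d_worst
        scores.append(d_worst / total if total else 0.0)
    return scores
```

Here each row would hold one sentence's feature values (e.g. its ROUGE and METEOR-derived features), and sentences with the lowest closeness scores would be pruned first.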
theTuringTest2
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: theTuringTest2
- Participant: theTuringTest
- Track: Podcast
- Year: 2021
- Submission: 9/2/2021
- Type: automatic
- Task: summarization
- Run description: Used feature engineering, including a unigram model, and metrics such as ROUGE-1, ROUGE-2, ROUGE-L, and METEOR to obtain the best possible extractive summary. Applied TOPSIS to rank each sentence based on the selected features, then iteratively removed the worst-performing sentences until the best set of sentences remained. A T5 model is applied to this set to produce the final abstractive summary.
tp_mt5
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: tp_mt5
- Participant: h2oloo
- Track: Podcast
- Year: 2021
- Submission: 9/4/2021
- Type: automatic
- Task: retrieval
- MD5: db1cc52cbfd64b42a26603047fbadd6a
- Run description: Pyserini Default BM25 using segments. 6-3 sliding window MaxP with a monoT5-3B trained on MS-MARCO 1K -> 2020 TREC Podcasts Topics Transcripts (Query Format: (Q + D))
tp_mt5_f1
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: tp_mt5_f1
- Participant: h2oloo
- Track: Podcast
- Year: 2021
- Submission: 9/4/2021
- Type: automatic
- Task: retrieval
- MD5: f1a265db63711673a69fff16d916766b
- Run description: Pyserini Default BM25 using segments. 6-3 sliding window MaxP with a monoT5-3B trained on MS-MARCO 1K -> 2020 TREC Podcasts Topics Transcripts (Query Format: (Q + D)). YAMNet feature weight 1.
tp_mt5_f2
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: tp_mt5_f2
- Participant: h2oloo
- Track: Podcast
- Year: 2021
- Submission: 9/4/2021
- Type: automatic
- Task: retrieval
- MD5: 2c2a250869c4c1e1142bc9b900fbfd67
- Run description: Pyserini Default BM25 using segments. 6-3 sliding window MaxP with a monoT5-3B trained on MS-MARCO 1K -> 2020 TREC Podcasts Topics Transcripts (Query Format: (Q + D)). YAMNet feature weight 2.
TUW_hybrid_cat
Results | Participants | Proceedings | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: TUW_hybrid_cat
- Participant: TU_Vienna
- Track: Podcast
- Year: 2021
- Submission: 9/6/2021
- Type: automatic
- Task: retrieval
- MD5: 98eac69f6dc90146413637c6a1a53b8e
- Run description: This run first combines a standard BM25 (Pyserini) run and our full TAS-B run (both top-1000) and then applies a knowledge-distilled DistilBERT_Cat re-ranking model (https://huggingface.co/sebastian-hofstaetter/distilbert-cat-margin_mse-T2-msmarco) to generate the final ranking. For QE re-rankings we use a BERT-based emotion classifier trained on the go-emotions dataset, and for QS re-rankings we use a RoBERTa classifier trained on an argument/non-argument labeled dataset, combined with a simple dictionary-based subjectivity classifier.
TUW_hybrid_ws
Results | Participants | Proceedings | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: TUW_hybrid_ws
- Participant: TU_Vienna
- Track: Podcast
- Year: 2021
- Submission: 9/6/2021
- Type: automatic
- Task: retrieval
- MD5: 83c31c51fff66efbe129409d392efea7
- Run description: We use our publicly available checkpoint (https://huggingface.co/sebastian-hofstaetter/distilbert-dot-tas_b-b256-msmarco), trained on the MS MARCO passage collection v1, to encode the segments and generate a Faiss index. We also generate a BM25 sparse index (Pyserini), and using both indices we follow a hybrid sparse-dense retrieval approach (Pyserini). For QE re-rankings we use a BERT-based emotion classifier trained on the go-emotions dataset, and for QS re-rankings we use a RoBERTa classifier trained on an argument/non-argument labeled dataset, combined with a simple dictionary-based subjectivity classifier.
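Hybrid sparse-dense retrieval of this kind interpolates the BM25 and dense scores over the union of both candidate lists. A simplified sketch; the interpolation weight and the minimum-score backoff for missing documents are illustrative assumptions, not necessarily what the run used:

```python
def hybrid(sparse, dense, alpha=0.1, k=1000):
    """Interpolate BM25 and dense scores: alpha * sparse + dense.

    Docs missing from one run back off to that run's minimum observed score.
    Returns the top-k (doc, score) pairs, best first.
    """
    min_s, min_d = min(sparse.values()), min(dense.values())
    docs = set(sparse) | set(dense)
    fused = {d: alpha * sparse.get(d, min_s) + dense.get(d, min_d) for d in docs}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)[:k]
```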
TUW_tasb192_ann
Results | Participants | Proceedings | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: TUW_tasb192_ann
- Participant: TU_Vienna
- Track: Podcast
- Year: 2021
- Submission: 9/6/2021
- Type: automatic
- Task: retrieval
- MD5: 1bab5f8d3dbe880723bf05324b587088
- Run description: This TAS-Balanced-trained model (based on DistilBERT) uses a compression layer at the end to produce 192-dimensional embeddings in fp16 (an 8x reduction from the default 768-dim output in fp32); we then indexed the vectors with HNSW (using 128 neighbors per vector). For inference we use ONNX Runtime and BERT optimizations with fp16 (the resulting vectors are also fp16). For QE re-rankings we use a BERT-based emotion classifier trained on the go-emotions dataset, and for QS re-rankings we use a RoBERTa classifier trained on an argument/non-argument labeled dataset, combined with a simple dictionary-based subjectivity classifier.
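The claimed 8x reduction follows directly from the dimension and precision changes:

```python
full_bytes = 768 * 4        # default DistilBERT output: 768 dims in fp32 (4 bytes each)
compressed_bytes = 192 * 2  # compression layer output: 192 dims in fp16 (2 bytes each)
reduction = full_bytes / compressed_bytes
print(reduction)  # 8.0
```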
TUW_tasb_cat
Results | Participants | Proceedings | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: TUW_tasb_cat
- Participant: TU_Vienna
- Track: Podcast
- Year: 2021
- Submission: 9/6/2021
- Type: automatic
- Task: retrieval
- MD5: 3db5f8ed432ba1850701077adcaeb031
- Run description: We use our publicly available checkpoint (https://huggingface.co/sebastian-hofstaetter/distilbert-dot-tas_b-b256-msmarco) of our TAS-Balanced-trained DistilBERT dense retrieval model in a brute-force search configuration. For inference we use ONNX Runtime and BERT optimizations with fp16 (the resulting vectors are also fp16). For QE re-rankings we use a BERT-based emotion classifier trained on the go-emotions dataset, and for QS re-rankings we use a RoBERTa classifier trained on an argument/non-argument labeled dataset, combined with a simple dictionary-based subjectivity classifier.
UCL_audio_1
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: UCL_audio_1
- Participant: podcast2021_ucl
- Track: Podcast
- Year: 2021
- Submission: 9/3/2021
- Type: automatic
- Task: retrieval
- MD5: 089e960fb7967ffa1cd459a137254b30
- Run description: This run trains three classification models on a small manually labelled subset of the podcast dataset. The inputs are three selected eGeMAPS or YAMNet features. Either a Random Forest or a Support Vector Classifier with an RBF kernel is trained for each emotion. The initial ranked list is reranked based on the probability predicted by each emotion classification model.
UCL_audio_2
Results | Participants | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: UCL_audio_2
- Participant: podcast2021_ucl
- Track: Podcast
- Year: 2021
- Submission: 9/3/2021
- Type: automatic
- Task: retrieval
- MD5: be1a6979287516748fdb4dbd210153a0
- Run description: This run uses both eGeMAPS and YAMNet features to create three mood metrics for each emotion. By labelling a small subset of the podcast dataset, an exploratory approach is employed on a case-by-case basis to establish preliminary, "proof of concept" mood metrics.
Unicamp1
Results | Participants | Proceedings | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: Unicamp1
- Participant: Unicamp
- Track: Podcast
- Year: 2021
- Submission: 9/3/2021
- Type: automatic
- Task: summarization
- Run description: mBART adapted to a Longformer version and trained on Portuguese and English podcast transcripts and episode descriptions.
Unicamp2
Results | Participants | Proceedings | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: Unicamp2
- Participant: Unicamp
- Track: Podcast
- Year: 2021
- Submission: 9/4/2021
- Type: automatic
- Task: summarization
- Run description: This is a multilingual Longformer model capable of generating abstractive summaries. The model is mBART-50 converted to a Longformer version, able to process 4096 input tokens. It was finetuned on the XL-SUM dataset (English and Portuguese) and then finetuned on podcast transcripts and episode descriptions (English and Portuguese).
Webis_pc_abstr
Results | Participants | Proceedings | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: Webis_pc_abstr
- Participant: Webis
- Track: Podcast
- Year: 2021
- Submission: 9/3/2021
- Type: automatic
- Task: summarization
- Run description: Trained a Cola model (https://github.com/google-research/google-research/tree/master/cola) unsupervised on 10,000h of podcast audio files. Used the combined Cola embeddings and embeddings generated by a pretrained Roberta model as the basis to train a model, on 1000 manually annotated podcast snippets, that classifies how 'entertaining' a snippet is. Retrieved the 5 most entertaining sentences from the episode, added their surrounding sentences, concatenated them all, and used the result as input for a distilbart model trained on the CNN summarization dataset. Audio summary: audio clips of the sentences used as input for the summarization model.
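The "top-5 sentences plus their neighbours" selection can be sketched as follows (the entertainment scores would come from the classifier described above; the context width of one sentence on each side is an assumption):

```python
def expand_with_context(sentences, scores, top_k=5, context=1):
    """Top-k highest-scoring sentences plus their neighbours, in document order."""
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:top_k]
    keep = set()
    for i in top:
        # add the sentence and `context` neighbours on each side, clipped to bounds
        keep.update(range(max(i - context, 0), min(i + context + 1, len(sentences))))
    return [sentences[i] for i in sorted(keep)]
```

The returned sentences would then be concatenated and fed to the abstractive summarizer.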
Webis_pc_bs
Results | Participants | Proceedings | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: Webis_pc_bs
- Participant: Webis
- Track: Podcast
- Year: 2021
- Submission: 9/3/2021
- Type: automatic
- Task: retrieval
- MD5: 8825f4ece431d4376b6c0cee2e185563
- Run description: Retrieve using Elasticsearch implementation of BM25. No reranking. This run is just a baseline for our other runs.
Webis_pc_co_rob
Results | Participants | Proceedings | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: Webis_pc_co_rob
- Participant: Webis
- Track: Podcast
- Year: 2021
- Submission: 9/3/2021
- Type: automatic
- Task: retrieval
- MD5: a5b0e524df19a5bf7e0a3d5a2ffb1c79
- Run description: Retrieve using Elasticsearch implementation of BM25. Trained Cola Model (https://github.com/google-research/google-research/tree/master/cola) unsupervised on 10,000h of podcast audio files. Use combined Cola embeddings and embeddings generated by pretrained Roberta model as basis to train a classifier for every feature (entertaining, subjective, discussion) on 1000 manually annotated podcast snippets. Rerank using the generated features.
Webis_pc_cola
Results | Participants | Proceedings | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: Webis_pc_cola
- Participant: Webis
- Track: Podcast
- Year: 2021
- Submission: 9/3/2021
- Type: automatic
- Task: retrieval
- MD5: 30f4a211c00990637a7af15eaf8bca6b
- Run description: Retrieve using Elasticsearch implementation of BM25. Trained Cola Model (https://github.com/google-research/google-research/tree/master/cola) unsupervised on 10,000h of podcast audio files. Use embeddings generated by this model as basis to train a classifier for every feature (entertaining, subjective, discussion) on 1000 manually annotated podcast snippets. Rerank using the generated features.
Webis_pc_extr
Results | Participants | Proceedings | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: Webis_pc_extr
- Participant: Webis
- Track: Podcast
- Year: 2021
- Submission: 9/3/2021
- Type: automatic
- Task: summarization
- Run description: Trained Cola Model (https://github.com/google-research/google-research/tree/master/cola) unsupervised on 10,000h of podcast audio files. Use combined Cola embeddings and embeddings generated by pretrained Roberta model as basis to train a model (SVM) on 1000 manually annotated podcast snippets to classify how 'entertaining' a snippet is. Add all sentences from the episode to graph as nodes. Set starting weights to a value calculated from semantic similarity and the entertainment score of both sentences. Use Textrank algorithm to rank all sentences in episode. Use 10 highest ranked sentences as summary. Audio summary: Audio clips of extracted sentences.
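The TextRank step described above amounts to PageRank over a weighted sentence graph. A minimal power-iteration sketch (in the run, the edge weights combine semantic similarity and the entertainment scores of both sentences; here `weights` is just any symmetric weight matrix):

```python
def textrank(weights, d=0.85, iters=50):
    """PageRank scores over a weighted graph; weights[i][j] = edge weight i <-> j."""
    n = len(weights)
    scores = [1.0 / n] * n
    out_sum = [sum(row) or 1.0 for row in weights]  # guard isolated nodes
    for _ in range(iters):
        scores = [
            (1 - d) / n
            + d * sum(scores[j] * weights[j][i] / out_sum[j] for j in range(n))
            for i in range(n)
        ]
    return scores
```

The 10 sentences with the highest scores would then form the extractive summary.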
Webis_pc_rob
Results | Participants | Proceedings | Input | Summary (QD) | Summary (QE) | Summary (QR) | Summary (QS) | Appendix
- Run ID: Webis_pc_rob
- Participant: Webis
- Track: Podcast
- Year: 2021
- Submission: 9/3/2021
- Type: automatic
- Task: retrieval
- MD5: 84ad9aa522099d1bfc442ae375c2aeb3
- Run description: Retrieve using Elasticsearch implementation of BM25. Use embeddings generated by pretrained Roberta model as basis to train a classifier for every feature (entertaining, subjective, discussion) on 1000 manually annotated podcast snippets. Rerank using the generated features.