Runs - Podcast 2020
2306987O_abs_run1
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: 2306987O_abs_run1
- Participant: UoGTr
- Track: Podcast
- Year: 2020
- Submission: 8/24/2020
- Type: automatic
- Task: summarization
- Run description: For this run, a pre-trained T5 model, fine-tuned on the provided episode descriptions, was used to generate the summaries. As part of the summary generation pipeline, the model's outputs were post-processed to remove as much promotional material (links, hashtags, etc.) as possible.
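As a rough illustration of that post-processing step, here is a minimal Python sketch of promotional-material stripping; the run's actual filter rules are not published, so the regexes and the function name are assumptions:

```python
import re

def strip_promotional(summary: str) -> str:
    """Remove links, hashtags, and @-handles from a generated summary.
    A hypothetical sketch; the run's real patterns are not published."""
    summary = re.sub(r"https?://\S+|www\.\S+", "", summary)  # links
    summary = re.sub(r"[#@]\w+", "", summary)                # hashtags, handles
    return re.sub(r"\s{2,}", " ", summary).strip()           # collapse whitespace

print(strip_promotional("Great chat with Jo! Listen at https://example.com #podcast"))
# -> "Great chat with Jo! Listen at"
```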
2306987O_extabs_run2
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: 2306987O_extabs_run2
- Participant: UoGTr
- Track: Podcast
- Year: 2020
- Submission: 9/1/2020
- Type: automatic
- Task: summarization
- Run description: For this run, the first 15 sentences were extracted from the podcast transcript and fed as input to a T5 model that was fine-tuned/trained using the podcast transcripts and episode descriptions.
2306987O_extabs_run3
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: 2306987O_extabs_run3
- Participant: UoGTr
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: summarization
- Run description: For this run, the podcast transcripts were first fed through an extractive pipeline to pick out the 15 most representative sentences. This pipeline used SpanBERT to generate embeddings of the text and K-means to cluster those embeddings into 15 clusters (the number of desired sentences). The pipeline's output consists of the 15 sentences closest to the K-means cluster centroids. That output is then given to a T5 model fine-tuned on podcast transcripts and their respective episode descriptions.
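A minimal sketch of this kind of centroid-based extractive step, assuming mean-pooled SpanBERT embeddings and scikit-learn's K-means (the pooling strategy and the checkpoint name are assumptions, not details from the run):

```python
import torch
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances_argmin_min
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("SpanBERT/spanbert-base-cased")
enc = AutoModel.from_pretrained("SpanBERT/spanbert-base-cased")

def embed(sentences):
    # Mean-pooled last-layer states; the run's pooling choice is unknown.
    batch = tok(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = enc(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1).float()
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

def extract_top_sentences(sentences, k=15):
    # Cluster embeddings into k groups; keep the sentence nearest each centroid.
    X = embed(sentences)
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    nearest, _ = pairwise_distances_argmin_min(km.cluster_centers_, X)
    return [sentences[i] for i in sorted(set(nearest))]  # keep transcript order
```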
bartcnn
Results | Participants | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: bartcnn
- Participant: podcast_baselines
- Track: Podcast
- Year: 2020
- Submission: 9/3/2020
- Type: automatic
- Task: summarization
- Run description: The model inference code was used out of the box from huggingface/transformers.
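For reference, the out-of-the-box usage looks roughly like this; the checkpoint name and generation limit are assumptions, since the baseline description does not state them:

```python
from transformers import pipeline

# Off-the-shelf BART CNN/DailyMail summarizer from huggingface/transformers.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

transcript = "Welcome back to the show. Today we talk about ..."  # transcript text
summary = summarizer(transcript, max_length=250, truncation=True)[0]["summary_text"]
print(summary)
```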
bartpodcasts
Results | Participants | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: bartpodcasts
- Participant: podcast_baselines
- Track: Podcast
- Year: 2020
- Submission: 9/3/2020
- Type: automatic
- Task: summarization
- Run description: We fine-tuned the pretrained BART summarization model from huggingface/transformers using the first 1024 tokens of the transcripts as inputs and the descriptions as outputs.
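A sketch of how such training pairs might be prepared with the BART tokenizer, assuming the 1024-token input cap described above; the target-length cap is an assumption and the Trainer setup is omitted:

```python
from transformers import BartTokenizerFast

tok = BartTokenizerFast.from_pretrained("facebook/bart-large-cnn")

def make_training_example(transcript: str, description: str):
    # Inputs: first 1024 transcript tokens; labels: the episode description.
    features = tok(transcript, max_length=1024, truncation=True)
    labels = tok(text_target=description, max_length=256, truncation=True)
    features["labels"] = labels["input_ids"]
    return features
```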
BERT-DESC-Q
Results | Participants | Proceedings | Input | Summary | Appendix
- Run ID: BERT-DESC-Q
- Participant: spotify
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: retrieval
- MD5: f10d10202c6189f6ec9a2b8d5b192c20
- Run description: (1) Generate a pool of the top 50 candidates with BM25 using the queries; (2) rerank topic description-segment pairs using a BERT reranking model. The model was pre-trained on the MS MARCO passage reranking data (Nogueira et al.) and fine-tuned on automatically generated question-segment pairs.
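The two-stage retrieve-then-rerank pattern shared by these BERT-DESC runs can be sketched as below; the index path is hypothetical, and a public MS MARCO cross-encoder stands in for the team's fine-tuned BERT reranker:

```python
from pyserini.search.lucene import LuceneSearcher
from sentence_transformers import CrossEncoder

searcher = LuceneSearcher("indexes/podcast-segments")  # hypothetical index path
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # stand-in reranker

def retrieve_and_rerank(topic_description: str, pool_size: int = 50):
    hits = searcher.search(topic_description, k=pool_size)        # (1) BM25 pool
    pairs = [(topic_description, searcher.doc(h.docid).raw()) for h in hits]
    scores = reranker.predict(pairs)                              # (2) BERT rerank
    reranked = sorted(zip(hits, scores), key=lambda hs: -hs[1])
    return [(h.docid, float(s)) for h, s in reranked]
```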
BERT-DESC-S
Results | Participants | Proceedings | Input | Summary | Appendix
- Run ID: BERT-DESC-S
- Participant: spotify
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: retrieval
- MD5: e30b716ce266292d377aae5752e0c35c
- Run description: (1) Generate a pool of the top 50 candidates with BM25 using the queries; (2) rerank topic description-segment pairs using a BERT reranking model. The model was pre-trained on the MS MARCO passage reranking data (Nogueira et al.) and fine-tuned on extra topics and relevance judgments from crowdsourcing.
BERT-DESC-TD
Results | Participants | Proceedings | Input | Summary | Appendix
- Run ID: BERT-DESC-TD
- Participant: spotify
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: retrieval
- MD5: ce584a7f5c82374a7edfb52ce3dc5771
- Run description: (1) Generate a pool of the top 50 candidates with BM25 using the queries; (2) rerank topic description-segment pairs using a BERT reranking model. The model was pre-trained on the MS MARCO passage reranking data (Nogueira et al.) and fine-tuned on synthetic data from the podcast dataset: the top relevant segments within each episode were retrieved using the episode title as the query, and the episode description-segment pairs were used as reranking pairs.
BM25
Results | Participants | Input | Summary | Appendix
- Run ID: BM25
- Participant: podcast_baselines
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: retrieval
- MD5: 8f43c4bb18e80cc8ef24794d3961678e
- Run description: Traditional IR model, BM25; implemented with the Anserini toolkit and default parameters (k1=0.9 and b=0.4).
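With pyserini (Anserini's Python interface), reproducing this baseline takes a few lines; the index path and example query are hypothetical:

```python
from pyserini.search.lucene import LuceneSearcher

searcher = LuceneSearcher("indexes/podcast-segments")  # hypothetical index path
searcher.set_bm25(k1=0.9, b=0.4)                       # Anserini's default BM25 parameters
hits = searcher.search("coronavirus impact on schools", k=1000)
for hit in hits[:3]:
    print(hit.docid, round(hit.score, 3))
```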
categoryaware1
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: categoryaware1
- Participant: spotify
- Track: Podcast
- Year: 2020
- Submission: 9/3/2020
- Type: automatic
- Task: summarization
- Run description: This run is after one epoch of fine-tuning.
categoryaware2
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: categoryaware2
- Participant: spotify
- Track: Podcast
- Year: 2020
- Submission: 9/3/2020
- Type: automatic
- Task: summarization
- Run description: This run is after two epochs of fine-tuning.
coarse2fine
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: coarse2fine
- Participant: spotify
- Track: Podcast
- Year: 2020
- Submission: 9/3/2020
- Type: automatic
- Task: summarization
- Run description: We used TextRank to extract central regions (chunks of sentences) of the transcript. The most central regions (up to about 1000 tokens) were concatenated in order of appearance and used as input for fine-tuning the BART CNN/DailyMail summarization model from huggingface/transformers, with episode descriptions as output. Output summaries were constrained to a maximum of 250 tokens.
cued_speechUniv1
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: cued_speechUniv1
- Participant: cued_speechUniv
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: summarization
- Run description: Two-step approach: (1) sentence filtering based on the sentence-level attention scores of a hierarchical model; (2) BART summarisation using the filtered sentences as input at both training and inference time. We optimised BART on the maximum likelihood criterion and subsequently on a reinforcement learning (sequence-level optimisation) criterion. Finally, we ensemble BART models from different checkpoints and different data shuffles.
cued_speechUniv2
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: cued_speechUniv2
- Participant: cued_speechUniv
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: summarization
- Run description: same as Run 1 (cued_speechUniv1), the difference being that the ensemble consists of 3 models
cued_speechUniv3
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: cued_speechUniv3
- Participant: cued_speechUniv
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: summarization
- Run description: This is meant to be the most standard approach (i.e. our baseline): fine-tuning a CNN/DailyMail-trained BART model on the podcast data. If the transcription at training or inference time exceeds 1,024 tokens, it is truncated.
cued_speechUniv4
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: cued_speechUniv4
- Participant: cued_speechUniv
- Track: Podcast
- Year: 2020
- Submission: 9/3/2020
- Type: automatic
- Task: summarization
- Run description: same as Run 1 (cued_speechUniv1), with the difference being that this system is not optimised on the RL criterion and is a single-model system rather than an ensemble
hk_uu_podcast1
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: hk_uu_podcast1
- Participant: hk_uu_podcast
- Track: Podcast
- Year: 2020
- Submission: 9/3/2020
- Type: automatic
- Task: summarization
- Run description: The model was trained for 3 epochs, and the checkpoint with the best ROUGE-2 score on a created validation split was chosen. The model was trained using an input sequence length of 4096 and a target max length of 200.
hltcoe1
Results | Participants | Input | Summary | Appendix
- Run ID: hltcoe1
- Participant: hltcoe
- Track: Podcast
- Year: 2020
- Submission: 9/4/2020
- Type: automatic
- Task: retrieval
- MD5: d2ee581babda85ed80422c954a8df344
- Run description: Statistical language model with linear interpolation. Rocchio-style relevance feedback and term reweighting. Overlapping, word-spanning, character 5-gram tokenization.
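A minimal sketch of overlapping, word-spanning character 5-gram tokenization: whitespace is normalized but kept, so grams can cross word boundaries.

```python
def char_ngrams(text: str, n: int = 5):
    # Normalize whitespace, then slide an n-character window across the string;
    # keeping spaces lets grams span word boundaries.
    text = " ".join(text.lower().split())
    return [text[i:i + n] for i in range(len(text) - n + 1)]

print(char_ngrams("hello world")[:4])
# ['hello', 'ello ', 'llo w', 'lo wo']
```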
hltcoe2
Results | Participants | Input | Summary | Appendix
- Run ID: hltcoe2
- Participant: hltcoe
- Track: Podcast
- Year: 2020
- Submission: 9/4/2020
- Type: automatic
- Task: retrieval
- MD5: 9c86dc3b9eb1d0f49512f8d11ca9603c
- Run description: Statistical language model with linear interpolation. Rocchio-style relevance feedback and term reweighting. Unstemmed words used for tokenization.
hltcoe3
Results | Participants | Input | Summary | Appendix
- Run ID: hltcoe3
- Participant: hltcoe
- Track: Podcast
- Year: 2020
- Submission: 9/4/2020
- Type: automatic
- Task: retrieval
- MD5: 51b7a815b8a665f7e8dddc71327939d8
- Run description: Statistical language model with linear interpolation. No query modification or relevance feedback was employed. Unstemmed words used for tokenization.
hltcoe4
Results | Participants | Input | Summary | Appendix
- Run ID: hltcoe4
- Participant: hltcoe
- Track: Podcast
- Year: 2020
- Submission: 9/4/2020
- Type: automatic
- Task: retrieval
- MD5: 5b505126e2ac891314aa11124e7afacc
- Run description: Statistical language model with linear interpolation. Rocchio-style relevance feedback and term reweighting. Unstemmed words used for tokenization.
hltcoe5
Results | Participants | Input | Summary | Appendix
- Run ID: hltcoe5
- Participant: hltcoe
- Track: Podcast
- Year: 2020
- Submission: 9/4/2020
- Type: automatic
- Task: retrieval
- MD5: b94954bf6db0fd88d8bf3f4df4f26b77
- Run description: Independently decoded audio data (baseline transcript was not used). Statistical language model with linear interpolation. Rocchio-style relevance feedback and term reweighting. Overlapping, word-spanning character 4-gram tokenization.
LRGREtvrs-r_1
Results | Participants | Proceedings | Input | Summary | Appendix
- Run ID: LRGREtvrs-r_1
- Participant: LRG_REtrievers
- Track: Podcast
- Year: 2020
- Submission: 8/31/2020
- Type: automatic
- Task: retrieval
- MD5: f8902c4b7a18df3a165e5d2f7fa0e8a9
- Run description: We scored every podcast episode in the dataset against the user's query with BM25 and kept the top 200 podcasts. We then divided the filtered podcasts into 2-minute segments, re-ranked them with a regressive XLNet model, and returned the top 1000 results.
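A sketch of the 2-minute segmentation step used across these runs, assuming word-level ASR timestamps are available as (token, start_time) pairs; the team's exact segmentation rules are not published:

```python
def two_minute_segments(timed_words, window=120.0):
    # timed_words: iterable of (token, start_time_in_seconds) pairs.
    segments, current, seg_start = [], [], 0.0
    for token, start in timed_words:
        if current and start - seg_start >= window:
            segments.append(" ".join(current))
            current, seg_start = [], start
        current.append(token)
    if current:
        segments.append(" ".join(current))
    return segments
```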
LRGREtvrs-r_2
Results | Participants | Proceedings | Input | Summary | Appendix
- Run ID: LRGREtvrs-r_2
- Participant: LRG_REtrievers
- Track: Podcast
- Year: 2020
- Submission: 9/1/2020
- Type: automatic
- Task: retrieval
- MD5: a7c7e79b62c6c05f13afa50370b7f40e
- Run description: To tackle the problem statement, we adopt a neural re-ranking approach, using BM25 to filter episodes and RM3 for query expansion. We then split the episodes into two-minute segments. For each query-segment pair in the training set, we use a transformer-based model: we find the contextual embeddings using XLNet (keeping two layers unfrozen), compute the similarity matrix between the query and the document, and apply kernel pooling and linear layers to finally arrive at a relevance score for the document.
LRGREtvrs-r_3
Results | Participants | Proceedings | Input | Summary | Appendix
- Run ID: LRGREtvrs-r_3
- Participant: LRG_REtrievers
- Track: Podcast
- Year: 2020
- Submission: 9/1/2020
- Type: automatic
- Task: retrieval
- MD5: 25e2a11f66071c0d2e208529867ea3dc
- Run description: To tackle the problem statement, we adopt a neural re-ranking approach: we first split each episode into 2-minute segments, then use BM25 to filter episodes and RM3 for query expansion to create a curated list of 5000 segments. For each query-segment pair in the training set, we use a regression-based transformer model, after which we re-rank the documents according to the regression scores.
LRGREtvrs-r_4
Results | Participants | Proceedings | Input | Summary | Appendix
- Run ID: LRGREtvrs-r_4
- Participant: LRG_REtrievers
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: retrieval
- MD5: 3490b3763a141e7a47637f941e7a38bc
- Run description: We scored every podcast episode in the dataset against the user's query with BM25 and kept the top 400 podcasts. We then divided the filtered podcasts into 2-minute segments, re-ranked them with a regressive XLNet model, and returned the top 1000 results.
onemin
Results | Participants | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: onemin
- Participant: podcast_baselines
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: summarization
- Run description: The first minute of the transcript is extracted and used as the summary.
oudalab1
Results | Participants | Input | Summary | Appendix
- Run ID: oudalab1
- Participant: oudalab
- Track: Podcast
- Year: 2020
- Submission: 9/4/2020
- Type: automatic
- Task: retrieval
- MD5: 8ff5420f22d01244c61c617e8baf8347
- Run description: Using the above-mentioned method, this run was a trial run with a few data points. We used the top 10 closest segments based on distance to find the episodes to use for our BERT QA task. We then chose the top 3 answers with the lowest similarity scores according to BERT, removing duplicates.
QL
Results | Participants | Input | Summary | Appendix
- Run ID: QL
- Participant: podcast_baselines
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: retrieval
- MD5: 29f529f98bbbb230c34347f19fc61217
- Run description: Traditional IR model, query likelihood; implemented with the Anserini toolkit and default hyperparameters (Dirichlet smoothing, μ = 1000).
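This baseline can likewise be sketched in pyserini; the index path and query are hypothetical:

```python
from pyserini.search.lucene import LuceneSearcher

searcher = LuceneSearcher("indexes/podcast-segments")  # hypothetical index path
searcher.set_qld(mu=1000)  # query likelihood with Dirichlet smoothing, mu = 1000
hits = searcher.search("halloween stories and chat", k=1000)
```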
RERANK-DESC
Results | Participants | Input | Summary | Appendix
- Run ID: RERANK-DESC
- Participant: podcast_baselines
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: retrieval
- MD5: 178140590bd272270d8e1a52e9c04c5d
- Run description: (1) Generate a pool of the top 50 candidates with BM25 using the queries; (2) rerank topic description-segment pairs using a BERT reranking model pre-trained on the MS MARCO passage reranking data (Nogueira et al.). The model was used without any further fine-tuning.
RERANK-QUERY
Results | Participants | Input | Summary | Appendix
- Run ID: RERANK-QUERY
- Participant: podcast_baselines
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: retrieval
- MD5: 75e41ba41563c7448b30378fb13f2386
- Run description: (1) Generate a pool of the top 50 candidates with BM25 using the queries; (2) rerank topic query-segment pairs using a BERT reranking model pre-trained on the MS MARCO passage reranking data (Nogueira et al.). The model was used without any further fine-tuning.
run_dcu1
Results | Participants | Proceedings | Input | Summary | Appendix
- Run ID: run_dcu1
- Participant: DCU-ADAPT
- Track: Podcast
- Year: 2020
- Submission: 9/1/2020
- Type: automatic
- Task: retrieval
- MD5: e5eb3bb396ee53bb8894a10f77414a9b
- Run description: Nouns and proper nouns are identified automatically using the spaCy natural language processing toolkit, and those words are added to the queries. From the documents of first-pass retrieval, words relevant to the query nouns are identified using WordNet, and these words are added after being ranked by the Robertson offer weight. The queries are processed by the DPH model, and the Bo1 query expansion model is further applied.
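The first expansion step (appending nouns and proper nouns to the query) might look like this with spaCy; the model name is an assumption, and the WordNet and offer-weight stages are omitted:

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed model; the run does not name one

def expand_with_nouns(query: str) -> str:
    # Append nouns and proper nouns identified by spaCy back onto the query.
    doc = nlp(query)
    extra = [tok.text for tok in doc if tok.pos_ in ("NOUN", "PROPN")]
    return query + " " + " ".join(extra)

print(expand_with_nouns("coronavirus impact on schools"))
```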
run_dcu2
Results | Participants | Proceedings | Input | Summary | Appendix
- Run ID: run_dcu2
- Participant: DCU-ADAPT
- Track: Podcast
- Year: 2020
- Submission: 9/1/2020
- Type: automatic
- Task: retrieval
- MD5: df6299db1ef8194672883331ae579edc
- Run description: Nouns and named entities are identified automatically using the spaCy natural language processing toolkit, and those words are added to the queries. The queries are processed by the DPH model, and the Bo1 query expansion model is further applied.
run_dcu3
Results | Participants | Proceedings | Input | Summary | Appendix
- Run ID: run_dcu3
- Participant: DCU-ADAPT
- Track: Podcast
- Year: 2020
- Submission: 9/1/2020
- Type: automatic
- Task: retrieval
- MD5: 0be485c98b8ce2978fbee980e261519a
- Run description: Nouns and named entities are identified automatically using the spaCy natural language processing toolkit, and those words are added to the queries. From the documents of first-pass retrieval, words relevant to the query nouns are identified using WordNet, and these words are added after being ranked by the Robertson offer weight. The queries are processed by the DPH model, and the Bo1 query expansion model is further applied.
run_dcu4
Results | Participants | Proceedings | Input | Summary | Appendix
- Run ID: run_dcu4
- Participant: DCU-ADAPT
- Track: Podcast
- Year: 2020
- Submission: 9/1/2020
- Type: automatic
- Task: retrieval
- MD5: 6245881af107cb477116c70a8faa531e
- Run description: A collection of web text was compiled using the Google search engine. From the collection, terms relevant to the queries were found using the Robertson offer weight and added to the queries. Nouns and named entities from the query description were also added to the queries.
run_dcu5
Results | Participants | Proceedings | Input | Summary | Appendix
- Run ID: run_dcu5
- Participant: DCU-ADAPT
- Track: Podcast
- Year: 2020
- Submission: 9/1/2020
- Type: automatic
- Task: retrieval
- MD5: b439b372162c454e6abc26926983bdc0
- Run description: This is a combination of all of the query expansion approaches from the previous submissions.
textranksegments
Results | Participants | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: textranksegments
- Participant: podcast_baselines
- Track: Podcast
- Year: 2020
- Submission: 9/3/2020
- Type: automatic
- Task: summarization
- Run description: We chunked the transcript into ~50 word segments (respecting sentence boundaries), and ran TextRank, using TF-IDF cosine similarity as the edge weights, with aggregate vertex degree as the centrality measure (not PageRank). Up to ~150 words from the top segments were selected for the summary, with segments kept in order. We specified a set of stopwords, consisting of the most common terms in the whole corpus, to be ignored.
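A minimal sketch of the ranking step shared by the two TextRank baselines, assuming TF-IDF cosine edge weights and aggregate vertex degree as the centrality measure (the stopword handling and any similarity thresholding are assumptions):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def degree_centrality_ranking(units, stopwords=None):
    # Build the TF-IDF similarity graph; rows are L2-normalized by default,
    # so X @ X.T gives cosine similarity. Centrality = aggregate vertex degree.
    vec = TfidfVectorizer(stop_words=list(stopwords) if stopwords else None)
    X = vec.fit_transform(units)
    sim = (X @ X.T).toarray()
    np.fill_diagonal(sim, 0.0)           # drop self-loops
    return np.argsort(-sim.sum(axis=1))  # most central units first

def summarize(units, max_words=150):
    # Greedily take the most central units up to the word budget,
    # then emit them in their original transcript order.
    chosen, total = [], 0
    for i in degree_centrality_ranking(units):
        n = len(units[i].split())
        if total + n > max_words:
            break
        chosen.append(i)
        total += n
    return " ".join(units[i] for i in sorted(chosen))
```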
textranksentences
Results | Participants | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: textranksentences
- Participant: podcast_baselines
- Track: Podcast
- Year: 2020
- Submission: 9/3/2020
- Type: automatic
- Task: summarization
- Run description: We split the transcript into sentences using spaCy, and ran TextRank, using TF-IDF cosine similarity as the edge weights, with aggregate vertex degree as the centrality measure (not PageRank). The top sentences, up to about 150 words in total, were selected for the summary, with sentences kept in the order they appear. We specified a set of stopwords, consisting of the most common terms in the whole corpus, to be ignored.
UCF_NLP1
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: UCF_NLP1
- Participant: UCF_NLP
- Track: Podcast
- Year: 2020
- Submission: 9/3/2020
- Type: automatic
- Task: summarization
- Run description: Our summarization system (UCF_NLP1) focuses on generating abstractive summaries from podcast transcripts. It employs an encoder-decoder model to condense the first few segments of the transcript into an abstractive summary.
UCF_NLP2
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: UCF_NLP2
- Participant: UCF_NLP
- Track: Podcast
- Year: 2020
- Submission: 9/3/2020
- Type: automatic
- Task: summarization
- Run description: Our summarization system (UCF_NLP2) focuses on generating abstractive summaries from podcast transcripts. It consists of an abstractor that employs an encoder-decoder model to compose summaries and an extractor that enhances content selection by identifying summary-worthy segments from lengthy transcripts and providing them as input to the abstractor.
udel_wang_zheng1
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: udel_wang_zheng1
- Participant: udel_wang_zheng
- Track: Podcast
- Year: 2020
- Submission: 8/28/2020
- Type: automatic
- Task: summarization
- Run description: We build a model from distilBART-cnndm, fine-tuning it using the first 1024 tokens of each transcript.
udel_wang_zheng2
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: udel_wang_zheng2
- Participant: udel_wang_zheng
- Track: Podcast
- Year: 2020
- Submission: 9/1/2020
- Type: automatic
- Task: summarization
- Run description: We perform LDA on the transcript to extract the topics covered in the episode, and then select top-scoring sentences for fine-tuning.
udel_wang_zheng3
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: udel_wang_zheng3
- Participant: udel_wang_zheng
- Track: Podcast
- Year: 2020
- Submission: 9/1/2020
- Type: automatic
- Task: summarization
- Run description: Select sentences for fine-tuning.
udel_wang_zheng4
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: udel_wang_zheng4
- Participant: udel_wang_zheng
- Track: Podcast
- Year: 2020
- Submission: 9/1/2020
- Type: automatic
- Task: summarization
- Run description: We combine the outputs from our previous three submissions and generate the summary again.
UMD_ID_run4
Results | Participants | Proceedings | Input | Summary | Appendix
- Run ID: UMD_ID_run4
- Participant: UMD_IR
- Track: Podcast
- Year: 2020
- Submission: 9/3/2020
- Type: automatic
- Task: retrieval
- MD5: eb8c143b43c146995afb00785cfbdf47
- Run description: 7 systems (unstemmed LM, unstemmed LM + word2vec query expansion, stemmed weighted LM with stopwords, unstemmed TF-IDF, stemmed LM + SDM, unstemmed 5-minute-segment LM, and stemmed LM with documents expanded with metadata), each re-ranked using either T5 or BERT and then combined into a single system.
UMD_IR_run1
Results | Participants | Proceedings | Input | Summary | Appendix
- Run ID: UMD_IR_run1
- Participant: UMD_IR
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: retrieval
- MD5: b4a90a6bed2ffc141f49cb4962c7240d
- Run description: Baseline model prepared from an Indri LM with the sequential dependency model applied; the results are re-ranked using a T5 BERT model trained on the MS MARCO dataset.
UMD_IR_run2
Results | Participants | Proceedings | Input | Summary | Appendix
- Run ID: UMD_IR_run2
- Participant: UMD_IR
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: retrieval
- MD5: da9e8e1e56a3b84dcaae8da759b1df2d
- Run description: Indri LM with sequential dependency model.
UMD_IR_run3
Results | Participants | Proceedings | Input | Summary | Appendix
- Run ID: UMD_IR_run3
- Participant: UMD_IR
- Track: Podcast
- Year: 2020
- Submission: 9/3/2020
- Type: automatic
- Task: retrieval
- MD5: 75f5f2cf929466b63a7856511bf308ed
- Run description: 7 systems (unstemmed LM, unstemmed LM + word2vec query expansion, stemmed weighted LM with stopwords, unstemmed TF-IDF, stemmed LM + SDM, unstemmed 5-minute-segment LM, and stemmed LM with documents expanded with metadata) combined into a single run, which is re-ranked using 3 MS MARCO-trained models (T5 and BERT) and combined with the baseline run.
UMD_IR_run5
Results | Participants | Proceedings | Input | Summary | Appendix
- Run ID: UMD_IR_run5
- Participant: UMD_IR
- Track: Podcast
- Year: 2020
- Submission: 9/3/2020
- Type: automatic
- Task: retrieval
- MD5: 23429c601ffc40adce398db7c3450e29
- Run description: CombMNZ combination of run1 through run4.
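CombMNZ multiplies each document's summed (normalized) score by the number of runs that retrieved it; a minimal sketch, assuming min-max normalization per run (the normalization actually used here is not stated):

```python
from collections import defaultdict

def combmnz(runs):
    # runs: list of {docid: score} dicts, one per input run (run1..run4 here).
    summed = defaultdict(float)
    count = defaultdict(int)
    for run in runs:
        lo, hi = min(run.values()), max(run.values())
        for doc, score in run.items():
            summed[doc] += (score - lo) / (hi - lo) if hi > lo else 0.0
            count[doc] += 1
    fused = {doc: summed[doc] * count[doc] for doc in summed}
    return sorted(fused.items(), key=lambda kv: -kv[1])
```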
unhtrema1
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: unhtrema1
- Participant: TREMA-UNH
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: summarization
- Run description: A GAN model is used to generate abstractive summaries of chunks of the input text. The sentence-transformer method is used to embed each of these summary lines as a fixed-length vector. Another LSTM network is trained to output a summary embedding vector given the input summary embedding vectors. The generated summary lines are then sorted by the cosine similarity of their embedding vectors to this synthetic summary vector, and the top k lines are chosen as the overall summary. For this run, k=3 with a max output sequence length of 15 for the GAN model; the input text is split into 10 chunks of 1,000 words each.
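The final selection step shared by the four TREMA-UNH runs, sorting generated lines by cosine similarity to the LSTM-predicted summary embedding, can be sketched as follows; the sentence-transformer checkpoint is a stand-in, and `target_vec` is assumed to come from the LSTM:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in for the run's sentence-transformer

def select_top_lines(summary_lines, target_vec, k=3):
    # Embed each generated line, rank by cosine similarity to the synthetic
    # summary vector, and keep the top-k lines in their original order.
    E = model.encode(summary_lines, normalize_embeddings=True)
    t = np.asarray(target_vec, dtype=float)
    t /= np.linalg.norm(t)
    keep = np.argsort(-(E @ t))[:k]
    return [summary_lines[i] for i in sorted(keep)]
```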
unhtrema2
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: unhtrema2
- Participant: TREMA-UNH
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: summarization
- Run description: A GAN model is used to generate abstractive summaries of chunks of the input text. The sentence-transformer method is used to embed each of these summary lines as a fixed-length vector. Another LSTM network is trained to output a summary embedding vector given the input summary embedding vectors. The generated summary lines are then sorted by the cosine similarity of their embedding vectors to this synthetic summary vector, and the top k lines are chosen as the overall summary. For this run, k=10 with a max output sequence length of 15 for the GAN model; the input text is split into 10 chunks of 1,000 words each.
unhtrema3
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: unhtrema3
- Participant: TREMA-UNH
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: summarization
- Run description: A GAN model is used to generate abstractive summaries of chunks of the input text. The sentence-transformer method is used to embed each of these summary lines as a fixed-length vector. Another LSTM network is trained to output a summary embedding vector given the input summary embedding vectors. The generated summary lines are then sorted by the cosine similarity of their embedding vectors to this synthetic summary vector, and the top k lines are chosen as the overall summary. For this run, k=10 with a max output sequence length of 20 for the GAN model; the input text is split into 100 chunks of 100 words each.
unhtrema4
Results | Participants | Proceedings | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: unhtrema4
- Participant: TREMA-UNH
- Track: Podcast
- Year: 2020
- Submission: 9/3/2020
- Type: automatic
- Task: summarization
- Run description: A GAN model is used to generate abstractive summaries of chunks of the input text. The sentence-transformer method is used to embed each of these summary lines as a fixed-length vector. Another LSTM network is trained to output a summary embedding vector given the input summary embedding vectors. The generated summary lines are then sorted by the cosine similarity of their embedding vectors to this synthetic summary vector, and the top k lines are chosen as the overall summary. For this run, k=20 with a max output sequence length of 20 for the GAN model; the input text is split into 100 chunks of 100 words each.
UTDThesis1
Results | Participants | Input | Summary (manual) | Summary (rouge) | Appendix
- Run ID: UTDThesis1
- Participant: UTDThesis
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: summarization
- Run description: These are the abstractive summaries generated by the Dialogue Action Tokenized T5-Transformer described above.
UTDThesis_Run1
Results | Participants | Input | Summary | Appendix
- Run ID: UTDThesis_Run1
- Participant: UTDThesis
- Track: Podcast
- Year: 2020
- Submission: 9/2/2020
- Type: automatic
- Task: retrieval
- MD5: bf94389052d00735052d02c0c8324a8d
- Run description: This run collects 100 ranked documents for each query, using the previously described ranking method.