Runs - Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN) 2025

03_01_Baseline

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: 03_01_Baseline
  • Participant: SCIAI
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-19
  • Task: trec2025-dragun-repgen
  • MD5: 76ac6a4f0c9494662969833c6b066fd1
  • Run description: This run uses a single round for the IR agents and report generation step to provide a baseline.

ConvF_all-t12_5

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: ConvF_all-t12_5
  • Participant: TREMA-UNH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-23
  • Task: trec2025-dragun-qgen
  • MD5: b31942c848ff5f7683c6450b1dc75dfb
  • Run description: Run 7 incorporates the "convince false" article in all phases except question generation and report generation, which use only the original article. Query generation uses a maximum of 5 iterations and the Qwen2.5 7B model.

ConvF_all-t12_5_RG

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: ConvF_all-t12_5_RG
  • Participant: TREMA-UNH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-23
  • Task: trec2025-dragun-repgen
  • MD5: c648e9d7695c22d627d46d5b87e4c34c
  • Run description: Run 7 incorporates the "convince false" article in all phases except question generation and report generation, which use only the original article. Query generation uses a maximum of 5 iterations and the Qwen2.5 7B model.

ConvF_all_MI_5

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: ConvF_all_MI_5
  • Participant: TREMA-UNH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-23
  • Task: trec2025-dragun-qgen
  • MD5: 0acb1ffcb0797b2630f1b6ee43b251e3
  • Run description: Run 7 incorporates the "convince false" article in all phases. Query generation uses a maximum of 5 iterations and the Qwen2.5 7B model.

ConvF_all_MI_5_RG

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: ConvF_all_MI_5_RG
  • Participant: TREMA-UNH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-23
  • Task: trec2025-dragun-repgen
  • MD5: 6a53084c7d20e79c35d0ad48f72ad235
  • Run description: Run 7 incorporates the "convince false" article in all phases. Query generation uses a maximum of 5 iterations and the Qwen2.5 7B model.

cru-ablR-conf_

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: cru-ablR-conf_
  • Participant: HLTCOE
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-20
  • Task: trec2025-dragun-repgen
  • MD5: e6d503b3e2eb20ddf7192deb105293e3
  • Run description: Crucible@dragun

Original run tag: strict-filtered-crucible-retrieved_docs-most_common-retrieved-reranker.retrieved_docs.jsonl-SupportedAnswerabilityExtractorRequest. Answerability prompt: just check citation support, then rely on extraction confidence.

Crucible report generation. Guiding nuggets: most_common. Document source: nugget citations. Nugget extraction prompt 'SupportedAnswerExtractorAll' on collection "ragtime-mt". LLM: llama3.3-70b-instruct. Sentences are retained when their citations are supported according to argue_eval. Uses abstractive summarization. Only sentences that have an extraction confidence value >= 0.5, are not already selected (according to a stopped-and-stemmed match), and do not contain the expression 'source document' are retained. For each nugget, among the remaining candidate sentences, the one with the highest extraction confidence is selected. The report is chopped to 250 words. Created on 2025-08-20.
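The sentence-selection logic described above could be sketched roughly as follows. This is a minimal illustration, not Crucible's actual code: `Candidate`, its field names, and the crude stop/stem normalization are hypothetical stand-ins.

```python
import re
from dataclasses import dataclass

STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "are"}

@dataclass
class Candidate:
    nugget_id: str
    text: str
    confidence: float  # extraction confidence reported by the LLM

def normalize(text: str) -> frozenset:
    # Crude stand-in for "stopped and stemmed" matching: lowercase,
    # drop stopwords, chop a trailing 's' as a toy stemmer.
    words = re.findall(r"[a-z]+", text.lower())
    return frozenset(w.rstrip("s") for w in words if w not in STOPWORDS)

def select_sentences(candidates, max_words=250):
    report, seen = [], set()
    by_nugget = {}
    for c in candidates:
        by_nugget.setdefault(c.nugget_id, []).append(c)
    for nugget_id, cands in by_nugget.items():
        # Retain only confident, novel sentences without the forbidden phrase.
        kept = [c for c in cands
                if c.confidence >= 0.5
                and "source document" not in c.text.lower()
                and normalize(c.text) not in seen]
        if not kept:
            continue
        # Among the survivors, take the highest-confidence sentence.
        best = max(kept, key=lambda c: c.confidence)
        seen.add(normalize(best.text))
        report.append(best.text)
    # Chop the assembled report to the word budget.
    return " ".join(report).split()[:max_words]
```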


cru-ablR_

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: cru-ablR_
  • Participant: HLTCOE
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-20
  • Task: trec2025-dragun-repgen
  • MD5: 22b3090fde6dec9ff58659225829e155
  • Run description: Crucible@dragun

Original run tag: strict-filtered-covered-covextr-crucible-retrieved_docs-most_common-retrieved-reranker.retrieved_docs.jsonl-SupportedAnswerabilityExtractorRequest. Answerability prompt. Will LLM judges generalize across tracks?

Crucible report generation. Guiding nuggets: most_common. Document source: nugget citations. Nugget extraction prompt 'SupportedAnswerExtractorAll' on collection "ragtime-mt". LLM: llama3.3-70b-instruct. Sentences are retained when their citations are supported, at least one nugget covers the summary sentence, and at least one nugget covers the extractive document segment, according to argue_eval. Uses abstractive summarization. Only sentences that have an extraction confidence value >= 0.5, are not already selected (according to a stopped-and-stemmed match), and do not contain the expression 'source document' are retained. For each nugget, among the remaining candidate sentences, the one with the highest extraction confidence is selected. The report is chopped to 250 words. Created on 2025-08-20.


cru-claude

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: cru-claude
  • Participant: HLTCOE
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-20
  • Task: trec2025-dragun-qgen
  • MD5: d35a0715c9562fc872e2ef3cc3af81c7
  • Run description: Prompt-based extraction of nuggets from source article.

This prompt results in nuggets with shorter gold answers, which we will use in our crucible report generation methods.


cru-claude-chatty

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: cru-claude-chatty
  • Participant: HLTCOE
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-20
  • Task: trec2025-dragun-qgen
  • MD5: 29063f69f717b36c80d681a777d53d90
  • Run description: Prompt-based extraction of nuggets from source article.

This prompt results in gold answers that are long-winded (hence "chatty"); we usually don't like these for our Crucible report generation method, but they seem more appropriate for this task.


cru-cloch-ablR-conf_

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: cru-cloch-ablR-conf_
  • Participant: HLTCOE
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-20
  • Task: trec2025-dragun-repgen
  • MD5: 6680eff9576da8cd439326fe8d48a3d9
  • Run description: Crucible@dragun

Original run tag: strict-filtered-crucible-retrieved_docs-claudechatty-retrieved-reranker.retrieved_docs.jsonl-SupportedAnswerabilityExtractorRequest. Answerability prompt on ClaudeChatty nuggets: just check citation support, then rely on extraction confidence.

Crucible report generation. Guiding nuggets: claudechatty. Document source: nugget citations. Nugget extraction prompt 'SupportedAnswerExtractorAll' on collection "ragtime-mt". LLM: llama3.3-70b-instruct. Sentences are retained when their citations are supported according to argue_eval. Uses abstractive summarization. Only sentences that have an extraction confidence value >= 0.5, are not already selected (according to a stopped-and-stemmed match), and do not contain the expression 'source document' are retained. For each nugget, among the remaining candidate sentences, the one with the highest extraction confidence is selected. The report is chopped to 250 words. Created on 2025-08-20.


cru-clod-ablR-conf_

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: cru-clod-ablR-conf_
  • Participant: HLTCOE
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-20
  • Task: trec2025-dragun-repgen
  • MD5: d620e59953fbe44641f1c5c6dfd87920
  • Run description: Crucible@dragun

Original run tag: strict-filtered-crucible-retrieved_docs-claude-retrieved-reranker.retrieved_docs.jsonl-SupportedAnswerabilityExtractorRequest. Answerability prompt on Claude nuggets: just check citation support, then rely on extraction confidence.

Crucible report generation. Guiding nuggets: claude. Document source: nugget citations. Nugget extraction prompt 'SupportedAnswerExtractorAll' on collection "ragtime-mt". LLM: llama3.3-70b-instruct. Sentences are retained when their citations are supported according to argue_eval. Uses abstractive summarization. Only sentences that have an extraction confidence value >= 0.5, are not already selected (according to a stopped-and-stemmed match), and do not contain the expression 'source document' are retained. For each nugget, among the remaining candidate sentences, the one with the highest extraction confidence is selected. The report is chopped to 250 words. Created on 2025-08-20.


cru-confirm-ansR_

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: cru-confirm-ansR_
  • Participant: HLTCOE
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-20
  • Task: trec2025-dragun-repgen
  • MD5: 52a4ee44a286ee7f6c6c96bc11feedd1
  • Run description: Crucible@dragun

Original run tag: strict-filtered-covered-covextr-crucible-retrieved_docs-most_common-retrieved-reranker.retrieved_docs.jsonl-SupportedAnswerExtractorRequest. Question-answering prompt with answers from the request article. We are only looking for confirmation.

Crucible report generation. Guiding nuggets: most_common. Document source: nugget citations. Nugget extraction prompt 'SupportedAnswerExtractorAll' on collection "ragtime-mt". LLM: llama3.3-70b-instruct. Sentences are retained when their citations are supported, at least one nugget covers the summary sentence, and at least one nugget covers the extractive document segment, according to argue_eval. Uses abstractive summarization. Only sentences that have an extraction confidence value >= 0.5, are not already selected (according to a stopped-and-stemmed match), and do not contain the expression 'source document' are retained. For each nugget, among the remaining candidate sentences, the one with the highest extraction confidence is selected. The report is chopped to 250 words. Created on 2025-08-20.


cru-most_common

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: cru-most_common
  • Participant: HLTCOE
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-20
  • Task: trec2025-dragun-qgen
  • MD5: 1edde5a7e1040bfbfa14a144f23b36a9
  • Run description: Prompt-based extraction of nuggets from source article.

We use Crucible's standard nugget extractor "most_common". The questions are probably more boring, but the gold answers can be matched to other source documents for report generation.


CUET-DeepSeek-R1-Qwen-32B

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: CUET-DeepSeek-R1-Qwen-32B
  • Participant: CUET
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-08
  • Task: trec2025-dragun-qgen
  • MD5: 12649a47bb39289f37511ff4f15ebd9c
  • Run description: This run processes news topics from the TREC 2025 dataset to generate exactly 10 ranked investigative questions for each topic, emphasizing trustworthiness, bias, motivation, and factual accuracy. It uses a few-shot prompt template with specific examples, sends the topic title and body to a quantized LLM through LangChain, extracts clean numbered questions via regex, retries up to three times if fewer than 10 are generated, fills missing ones with “N/A,” and outputs the results in a TSV submission file (CUET_run9.tsv) formatted with topic ID, team ID, run ID, rank, and question. Duplicate questions are also identified for review.
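The parse/retry/pad loop common to these CUET runs might look like this minimal sketch. The `generate` callable standing in for the LangChain/LLM invocation, the regex, and the length cap are illustrative assumptions, not the team's code.

```python
import re

# Matches lines like "3. Who funded the study?" or "3) Who funded the study?"
NUM_Q = re.compile(r"^\s*\d+[.)]\s*(.+?\?)\s*$", re.MULTILINE)

def parse_questions(raw: str, max_len: int = 300):
    # Extract numbered questions, dropping duplicates and over-long lines.
    out, seen = [], set()
    for q in NUM_Q.findall(raw):
        q = q.strip()
        if len(q) <= max_len and q.lower() not in seen:
            seen.add(q.lower())
            out.append(q)
    return out

def ten_questions(generate, prompt, retries=3):
    # Retry the model up to `retries` times, then pad with "N/A"
    # so every topic yields exactly 10 ranked questions.
    questions = []
    for _ in range(retries):
        questions = parse_questions(generate(prompt))
        if len(questions) >= 10:
            break
    return (questions + ["N/A"] * 10)[:10]
```

Each resulting question would then be written as one TSV row (topic ID, team ID, run ID, rank, question).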

CUET-Mistral-Small-24B

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: CUET-Mistral-Small-24B
  • Participant: CUET
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-08
  • Task: trec2025-dragun-qgen
  • MD5: 0276e6cbf468e64fc2048ffbec63ed04
  • Run description: This run processes the official TREC 2025 topic file (trec-2025-dragun-topics.jsonl) to generate exactly 10 ranked investigative questions for each news article. A custom prompt template with few-shot examples is used to guide the model toward producing concise, non-redundant questions focused on evaluating trustworthiness, including aspects like bias, motivation, diversity of viewpoints, and factual accuracy. The code uses regex-based parsing to extract and deduplicate questions, retrying up to three times if fewer than 10 valid outputs are produced. Results are stored in a TSV file for submission, and duplicate detection is performed as a quality check.

CUET-qwen14B-v1

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: CUET-qwen14B-v1
  • Participant: CUET
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-08
  • Task: trec2025-dragun-qgen
  • MD5: 3132d76ea6d5a597faaba0f8a2b4d05c
  • Run description: This run uses the unsloth/Qwen3-14B-unsloth-bnb-4bit large language model to generate 10 concise, ranked, and critical questions for each topic from the TREC 2025 dataset. The prompt is richly enhanced with two few-shot examples—one inspired by PolitiFact and the other by MBFC-style analysis—which train the model to emulate high-quality fact-checking strategies. The questions aim to assess news credibility, focusing on source bias, factual accuracy, omissions, and framing. LangChain's LLMChain handles inference through a HuggingFace pipeline with sampling enabled. Each article’s title and truncated body are passed through this chain, and output is cleaned using regex. A retry mechanism ensures quality (≥10 questions) with deduplication and padding if needed. Results are saved in a TREC-compatible TSV file CUET_run6.tsv.

CUET-qwen14B-v2

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: CUET-qwen14B-v2
  • Participant: CUET
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-08
  • Task: trec2025-dragun-qgen
  • MD5: 3e7f5ce1be88003a60acb792d2a4fb98
  • Run description: This run uses the unsloth/Qwen3-14B-unsloth-bnb-4bit model to generate 10 concise and investigative questions per topic from the TREC 2025 dataset. The prompt is enhanced with two few-shot examples to guide the model in producing fact-check-style questions in the spirit of PolitiFact or MBFC. The questions aim to probe the trustworthiness and factual quality of a news article based on its title and truncated body (first 2000 characters). The model is invoked using LangChain’s LLMChain and a HuggingFace pipeline with sampling (temperature=0.7, do_sample=True) for diversity. A regex filter ensures only properly numbered and unique questions of up to 300 characters are accepted. A retry loop allows up to 3 attempts for quality control. If fewer than 10 valid questions are returned, the output is padded with "N/A". The final structured submission is saved in a TSV file named CUET_run7.tsv with fields: topic ID, team ID, run ID, question rank, and cleaned question text.

CUET-qwen14B-v3

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: CUET-qwen14B-v3
  • Participant: CUET
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-08
  • Task: trec2025-dragun-qgen
  • MD5: c00f78a1e51030c1abadbb75e6ef36a0
  • Run description: This run utilizes the unsloth/Qwen3-14B-unsloth-bnb-4bit model to generate 10 investigative and critical questions per topic from the TREC 2025 dataset. The questions are designed to help readers assess the credibility and bias of each article. The prompt includes two detailed few-shot examples modeled after PolitiFact and MBFC, guiding the model to focus on:

  • Evidence and factual integrity
  • Bias and one-sided reporting
  • Missing viewpoints or counterarguments
  • Language framing and sensationalism
  • Conflicts of interest or affiliations

LangChain’s LLMChain is used to wrap a HuggingFace text generation pipeline with settings that enable diverse outputs (temperature=0.6, top_p=0.9, do_sample=True, max_new_tokens=600). Each article’s body is truncated to the first 2000 characters to fit within the model’s 2048-token context window. A regex is used to extract properly formatted numbered questions up to 300 characters long. The model attempts up to 3 retries per topic to get at least 10 valid questions, padding with "N/A" if not enough are generated. The final output is saved in a tab-separated file named CUET_run8.tsv, with columns: topic ID, team ID, run ID, question rank, and cleaned question.


CUET-qwen14B-v5

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: CUET-qwen14B-v5
  • Participant: CUET
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-08
  • Task: trec2025-dragun-qgen
  • MD5: 8d0452112c8e166f54ee953f0eaa06ce
  • Run description: This run is designed to generate 10 investigative questions per news article to assess its trustworthiness for the TREC 2025 shared task. The code loads article topics from a JSONL file (trec-2025-dragun-topics.jsonl), and for each article, it uses a Qwen3-14B language model (through the Unsloth implementation) to generate questions that follow strict guidelines focusing on source credibility, evidence quality, origin tracing, and balance. The questions are generated using a LangChain LLMChain and a structured PromptTemplate. A retry loop ensures at least 10 valid and unique questions are produced per topic. The final output is saved in a TSV file for submission.

CUET-qwen4B-v2

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: CUET-qwen4B-v2
  • Participant: CUET
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-08
  • Task: trec2025-dragun-qgen
  • MD5: 3ff2641eb3c3ae50120b20ec96623305
  • Run description: This run performs automated question generation for the TREC 2025 dataset using the unsloth/Qwen3-4B model, enhanced with few-shot prompting. It begins by loading the dataset of news articles and sets up a detailed prompt template containing two examples of ideal outputs to guide the LLM toward generating high-quality questions. The LangChain pipeline is used with HuggingFace's pipeline integration for efficient inference. Each topic (title and body) is passed through the LLMChain up to 3 times if needed, attempting to generate at least 10 valid, critical, investigative questions. A regex is used to extract and validate the questions. If fewer than 10 questions are generated after retries, the list is padded with "N/A" placeholders. Finally, all questions are cleaned and saved in TSV format for submission as CUET_run3.tsv.

CUET-qwen4B-v3

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: CUET-qwen4B-v3
  • Participant: CUET
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-08
  • Task: trec2025-dragun-qgen
  • MD5: 41d0178cd0c93da1d038ec0a88879770
  • Run description: This run generates 10 ranked investigative questions for each topic in the TREC 2025 dataset using the unsloth/Qwen3-4B model. The prompt is enhanced with few-shot examples and explicitly instructs the model to rank questions based on importance, emphasizing critical thinking on bias, motivation, factual accuracy, and viewpoint diversity, including right-wing and centrist perspectives. The LangChain LLMChain is built around a HuggingFace pipeline with sampling enabled for generation. Each topic (title + truncated body) is passed to the model, and output is parsed using a regex to extract uniquely numbered questions up to 300 characters. The process includes a retry mechanism (up to 3 attempts) to ensure at least 10 valid questions, with padding as needed. The cleaned and deduplicated questions are saved in CUET_run4.tsv for TREC submission.

CUET-QwQ-32B

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: CUET-QwQ-32B
  • Participant: CUET
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-08
  • Task: trec2025-dragun-qgen
  • MD5: 3520231dc040f97db5dbd33d9f34575f
  • Run description: This run loads the TREC 2025 topics dataset and applies a 4-bit quantized version of the Unsloth QwQ-32B language model to generate ten critical investigative questions per news article. The questions aim to evaluate the trustworthiness of the articles by focusing on source bias, motivation, diversity of viewpoints, and factual accuracy. A carefully crafted prompt with few-shot examples guides the model. The output is parsed to extract unique questions, with multiple attempts per topic to ensure completeness. Finally, the results are formatted into a submission file for further use.

CUET-unsloth-Mistral-Small

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: CUET-unsloth-Mistral-Small
  • Participant: CUET
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-08
  • Task: trec2025-dragun-qgen
  • MD5: 25135ea796d4c68f6a445cd298a1ffd5
  • Run description: This run processes the TREC 2025 topic file (trec-2025-dragun-topics.jsonl) containing news article titles and bodies. A custom PromptTemplate is used to instruct the LLM to generate 10 concise and critical investigative questions for each article, focusing on source bias, intent, diversity of viewpoints, and factual accuracy. The model’s output is parsed using a regex pattern to extract exactly 10 unique questions per topic, which are then saved in TSV format for submission.

cursor-enhanced

Participants | Input | dragun-qgen | Appendix

  • Run ID: cursor-enhanced
  • Participant: cycraft
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-21
  • Task: trec2025-dragun-qgen
  • MD5: d151426ca13e4dd8fd2799b41002f5be
  • Run description: Automatically enhance the starter kit by providing a list of contrastive examples.

cursor-report

Participants | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: cursor-report
  • Participant: cycraft
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-21
  • Task: trec2025-dragun-repgen
  • MD5: 9eeb340939e66bd7d70b3cf398e9ef63
  • Run description: Automatically enhance the starter kit by providing a list of contrastive examples.

dragun-organizers-starter-kit-task-1

Participants | Input | dragun-qgen | Appendix

  • Run ID: dragun-organizers-starter-kit-task-1
  • Participant: coordinators
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-05
  • Task: trec2025-dragun-qgen
  • MD5: a179afed11faafd58a3c79aff9c587cd
  • Run description: https://github.com/trec-dragun/2025-starter-kit

dragun-organizers-starter-kit-task-2

Participants | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: dragun-organizers-starter-kit-task-2
  • Participant: coordinators
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-05
  • Task: trec2025-dragun-repgen
  • MD5: a29554a7fe952cb4aabdd328dece6059
  • Run description: https://github.com/trec-dragun/2025-starter-kit

feedback-rerank

Participants | Input | dragun-qgen | Appendix

  • Run ID: feedback-rerank
  • Participant: cycraft
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-21
  • Task: trec2025-dragun-qgen
  • MD5: 1bfbb9906a9b3f1f41c55a243e0f4cb7
  • Run description: Use an LLM to rerank the questions based on LLM-generated feedback.

feedbackintheloop

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: feedbackintheloop
  • Participant: WaterlooClarke
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-18
  • Task: trec2025-dragun-qgen
  • MD5: f3db57a837dd8558019b596f58246386
  • Run description: With automatically generated feedback in the loop

garag_rubric

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: garag_rubric
  • Participant: WaterlooClarke
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-19
  • Task: trec2025-dragun-repgen
  • MD5: 2c5896bd0caca0f8ba18c32e8b969f4d
  • Run description: Generate first (with open web search)

garamp_dragun_t2_q7b

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: garamp_dragun_t2_q7b
  • Participant: DUTH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-23
  • Task: trec2025-dragun-repgen
  • MD5: e85458996a32062f25606eb359433439
  • Run description: BM25 retrieval with Pyserini over the MS MARCO V2.1 (Segmented) Lucene index. For each topic we retrieve k=40 segments and keep up to 8 evidence passages after de-dup/length filtering. A single LLM pass (Qwen2.5-7B-Instruct) generates a <=250-word report in 4 sentences; each sentence cites up to 3 MS MARCO segment docids. Post-processing validates JSON, clips citations to <=3, and aligns outputs 1-to-1 with the official topics list.
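The de-dup/length-filtering step in these DUTH repgen runs could be sketched as below. The Pyserini retrieval itself is omitted; `pick_evidence`, its thresholds, and the `(docid, text)` input shape are illustrative assumptions, not DUTH's actual code.

```python
def pick_evidence(hits, max_keep=8, min_chars=20, max_chars=2000):
    # `hits` is assumed to be (docid, text) pairs already ranked by BM25
    # (e.g. the top-40 segments returned by a Pyserini searcher).
    kept, seen = [], set()
    for docid, text in hits:
        text = text.strip()
        if not (min_chars <= len(text) <= max_chars):
            continue  # length filter: drop fragments and oversized segments
        key = text.lower()
        if key in seen:
            continue  # de-duplication of repeated segment text
        seen.add(key)
        kept.append((docid, text))
        if len(kept) == max_keep:
            break  # keep at most `max_keep` evidence passages
    return kept
```

The kept passages would then be packed into the single LLM prompt that generates the cited report.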

garamp_mistral_7b

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: garamp_mistral_7b
  • Participant: DUTH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-21
  • Task: trec2025-dragun-qgen
  • MD5: 5c5491cd89cb92eac18f92118ffd21b7
  • Run description: Zero-shot with Mistral-7B-Instruct-v0.3. We create ~30 candidates/topic under strict formatting/length rules and remove compound questions, then pick the final 10 via TF-IDF MMR (α=0.7). Seed=42.
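The TF-IDF MMR selection used across these DUTH qgen runs could be sketched as follows. This is a hand-rolled illustration with a toy TF-IDF in place of a library vectorizer; the function names and whitespace tokenization are assumptions.

```python
import math
from collections import Counter

def tfidf_vectors(texts):
    # Minimal TF-IDF: term frequency weighted by inverse document frequency.
    docs = [Counter(t.lower().split()) for t in texts]
    n = len(docs)
    df = Counter(w for d in docs for w in d)
    idf = {w: math.log(n / df[w]) + 1.0 for w in df}
    return [{w: tf * idf[w] for w, tf in d.items()} for d in docs]

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u if w in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def mmr_select(topic, candidates, k=10, alpha=0.7):
    # Greedy Maximal Marginal Relevance: each step picks the candidate
    # maximizing alpha * sim(topic) - (1 - alpha) * max sim(already picked).
    vecs = tfidf_vectors([topic] + candidates)
    tvec, cvecs = vecs[0], vecs[1:]
    picked = []
    while len(picked) < min(k, len(candidates)):
        best_i, best_score = None, float("-inf")
        for i, v in enumerate(cvecs):
            if i in picked:
                continue
            redundancy = max((cosine(v, cvecs[j]) for j in picked), default=0.0)
            score = alpha * cosine(v, tvec) - (1 - alpha) * redundancy
            if score > best_score:
                best_i, best_score = i, score
        picked.append(best_i)
    return [candidates[i] for i in picked]
```

With α=0.7 the score leans toward topical relevance while still penalizing near-duplicate questions, which is why duplicated candidates fall to the bottom of the selection.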

garamp_qwen25_14b

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: garamp_qwen25_14b
  • Participant: DUTH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-21
  • Task: trec2025-dragun-qgen
  • MD5: 7156e7676439bd1bb11abae1ce434b1a
  • Run description: Same pipeline as the 7B run but with Qwen2.5-14B-Instruct. ~30 candidates/topic → cleaning (≤300 chars, single question, no compound connectors) → MMR selection of the final 10. Seed=42.

garamp_qwen25_14b_r4

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: garamp_qwen25_14b_r4
  • Participant: DUTH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-23
  • Task: trec2025-dragun-repgen
  • MD5: 3c9cbb1e1b2d1622ad488d367b4b7693
  • Run description: BM25 retrieval with Pyserini over the MS MARCO V2.1 (Segmented) Lucene index (msmarco-v2.1-doc-segmented.20240418.4f9675). For each topic we retrieve up to k=40 candidate segments and keep at most 18 evidence passages (dedup/length filtering) to fit the context window. A single LLM pass generates a ≤250-word report in 3–5 sentences; each sentence cites up to 3 segment docids. Post-processing clips citations to ≤3, validates JSON, and aligns outputs to the official topic list (one line per topic).
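The post-processing step (JSON validation plus citation clipping) might look like this minimal sketch; the field names `sentences` and `citations` are hypothetical, chosen only to illustrate the shape of the check.

```python
import json

def postprocess_line(raw_line: str, max_cites: int = 3):
    # Validate that the model output is well-formed JSON; json.loads raises
    # on malformed output, which would trigger a regeneration or repair.
    report = json.loads(raw_line)
    # Clip each sentence's citation list to the allowed maximum.
    for sentence in report.get("sentences", []):
        sentence["citations"] = sentence.get("citations", [])[:max_cites]
    return report
```

Running this once per topic, in topic order, gives the required one-line-per-topic alignment with the official list.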

garamp_qwen25_3b_t2

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: garamp_qwen25_3b_t2
  • Participant: DUTH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-23
  • Task: trec2025-dragun-repgen
  • MD5: 787753e9408cf78da3f3a98ba4cac7af
  • Run description: BM25 retrieval with Pyserini over the MS MARCO V2.1 (Segmented) Lucene index. For each topic we retrieve k=40 segments and keep up to 12 evidence passages after de-dup/length filtering. A single LLM pass (Qwen2.5-3B-Instruct) produces a ≤250-word report in ~4 sentences; each sentence cites up to 3 MS MARCO segment docids. Post-processing validates JSON, clips citations to ≤3, and aligns outputs 1:1 with the official topic list.

garamp_qwen25_72b

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: garamp_qwen25_72b
  • Participant: DUTH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-21
  • Task: trec2025-dragun-qgen
  • MD5: 087b07963489a36ac1ac1dece8a53872
  • Run description: Same structured pipeline with Qwen2.5-72B-Instruct: generate ~30 candidates/topic, apply cleaning (≤300 chars, single question, English, no compound), then choose 10 using TF-IDF MMR (α=0.7). Seed=42.

garamp_qwen25_7b_imp

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: garamp_qwen25_7b_imp
  • Participant: DUTH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-21
  • Task: trec2025-dragun-qgen
  • MD5: 471a9a032529259f84c484ebba8cc7b3
  • Run description: Zero-shot question generation for 30 topics using a fixed system prompt enforcing: ≤300 characters, one question per line, English, ends with “?”, and no compound questions (no “and/and-or”). We produce ~30 candidates per topic, clean/filter them, then select 10 via TF-IDF MMR (α=0.7). Seed=42.

garamp_yi15_9b

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: garamp_yi15_9b
  • Participant: DUTH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-21
  • Task: trec2025-dragun-qgen
  • MD5: 2133d39592c3fc1feb9eb64d7a24bec7
  • Run description: Zero-shot generation with Yi-1.5-9B-Chat. ~30 candidates per topic; enforce ≤300 chars, one question, ends with “?”, and no compound questions; select 10 using TF-IDF MMR (α=0.7). Seed=42.

garamp_yi9b_t2_v1

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: garamp_yi9b_t2_v1
  • Participant: DUTH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-23
  • Task: trec2025-dragun-repgen
  • MD5: cd349a1b2897261e9f35b6cc6a5c94de
  • Run description: BM25 retrieval with Pyserini over the MS MARCO V2.1 (Segmented) Lucene index. For each topic we retrieve k=40 segments and keep up to 8 evidence passages after de-dup/length filtering. A single LLM pass (Yi-1.5-9B-Chat) produces a ≤250-word report in ~4 sentences; each sentence cites up to 3 MS MARCO segment docids. Post-processing validates JSON, clips citations to ≤3, and aligns outputs 1:1 with the official topic list.

garamp_zephyr7b_t2

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: garamp_zephyr7b_t2
  • Participant: DUTH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-23
  • Task: trec2025-dragun-repgen
  • MD5: c15697e360c39d21a0e6b1996bddd2a8
  • Run description: BM25 retrieval with Pyserini over the MS MARCO V2.1 (Segmented) Lucene index. For each topic we retrieve k=40 segments and keep up to 10 evidence passages after de-dup/length filtering. A single LLM pass (Zephyr-7B-Beta) produces a ≤250-word report in ~4 sentences; each sentence cites up to 3 MS MARCO segment docids. Post-processing validates JSON, clips citations to ≤3, and aligns outputs 1:1 with the official topic list.

h2oloo_gpt5_orig

Participants | Input | dragun-qgen | Appendix

  • Run ID: h2oloo_gpt5_orig
  • Participant: h2oloo
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-25
  • Task: trec2025-dragun-qgen
  • MD5: 0b637c707d3239dfc27b9d2e835dd2e1
  • Run description: h2oloo's original prompt, modified from last year's.

h2oloo_gpt5_step

Participants | Input | dragun-qgen | Appendix

  • Run ID: h2oloo_gpt5_step
  • Participant: h2oloo
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-25
  • Task: trec2025-dragun-qgen
  • MD5: 670f3f8114da678025afe8b9aab31d7a
  • Run description: Stepwise prompt, modified from last year's.

h2oloo_qw3-30b_orig

Participants | Input | dragun-qgen | Appendix

  • Run ID: h2oloo_qw3-30b_orig
  • Participant: h2oloo
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-25
  • Task: trec2025-dragun-qgen
  • MD5: 764cc7dbb0d8a201712c4157a02c77f7
  • Run description: h2oloo's original prompt, modified from last year's.

h2oloo_qw3-30b_step

Participants | Input | dragun-qgen | Appendix

  • Run ID: h2oloo_qw3-30b_step
  • Participant: h2oloo
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-25
  • Task: trec2025-dragun-qgen
  • MD5: e6dda6527d1ddd2b038e89130e780c19
  • Run description: Stepwise prompt, modified from last year's.

organizer-gpt-oss-t1

Participants | Input | dragun-qgen | Appendix

  • Run ID: organizer-gpt-oss-t1
  • Participant: coordinators
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-25
  • Task: trec2025-dragun-qgen
  • MD5: b8671eff3e7e19f59ea3d7f8c1bed6ff
  • Run description: Used gpt-oss-120b as the LLM backend.

organizer-gpt-oss-t2

Participants | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: organizer-gpt-oss-t2
  • Participant: coordinators
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-25
  • Task: trec2025-dragun-repgen
  • MD5: 3cf163825b89fcf7bcbb2bda5989fbb0
  • Run description: Used gpt-oss-120b as the LLM backend.

organizer-t1-chatgpt

Participants | Input | dragun-qgen | Appendix

  • Run ID: organizer-t1-chatgpt
  • Participant: coordinators
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-25
  • Task: trec2025-dragun-qgen
  • MD5: c20dd64cc09f9f1b92cb4cf6e37422da
  • Run description: ChatGPT 5 Pro with Deep Research via its web interface.

organizer-t1-perplex

Participants | Input | dragun-qgen | Appendix

  • Run ID: organizer-t1-perplex
  • Participant: coordinators
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-25
  • Task: trec2025-dragun-qgen
  • MD5: 5749d566442f558d90a0744b28985457
  • Run description: Perplexity with Deep Research via its web interface.

SCIAI_03_02_Three

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: SCIAI_03_02_Three
  • Participant: SCIAI
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-22
  • Task: trec2025-dragun-repgen
  • MD5: 582fcfe0480c3e20a960396a92326d20
  • Run description: Three rounds

SCIAI_03_03_Five

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: SCIAI_03_03_Five
  • Participant: SCIAI
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-22
  • Task: trec2025-dragun-repgen
  • MD5: a21a7e1659c608f00b9fc94206f955e0
  • Run description: Five rounds

SCIAI_03_04_Eight

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: SCIAI_03_04_Eight
  • Participant: SCIAI
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-22
  • Task: trec2025-dragun-repgen
  • MD5: a38db40436f21bf95113224326cf1ed5
  • Run description: Eight rounds

SK_ConvinceF_MI_2

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: SK_ConvinceF_MI_2
  • Participant: TREMA-UNH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-22
  • Task: trec2025-dragun-qgen
  • MD5: a1789bb6e9cd57468179802b36425c10
  • Run description: An original article is used to generate a "convince false" article that refutes its claims. From these, in the first iteration, 10 queries are created: 5 derived from the original article and 5 from the "convince false" article. The remaining process continues as originally defined.

SK_ConvinceF_MI_2_RG

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: SK_ConvinceF_MI_2_RG
  • Participant: TREMA-UNH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-22
  • Task: trec2025-dragun-repgen
  • MD5: f3b3a6eb8ccb407e735958ec029dd521
  • Run description: An original article is used to generate a "convince false" article that refutes its claims. From these, in the first iteration, 10 queries are created: 5 derived from the original article and 5 from the "convince false" article. The remaining process continues as originally defined.

SK_Critique_MI_5

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: SK_Critique_MI_5
  • Participant: TREMA-UNH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-22
  • Task: trec2025-dragun-qgen
  • MD5: 2bc63a4f0e1403a9cab4d6afa7e93afd
  • Run description: Based on the starter kit. The first 10 queries were derived from the original article (5) and from a critique generated for it (5). The rest of the process is the same.

SK_Critique_MI_5_RG

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: SK_Critique_MI_5_RG
  • Participant: TREMA-UNH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-22
  • Task: trec2025-dragun-repgen
  • MD5: 6f43be05a9c4de40f0357d86722c528b
  • Run description: Same as run 5 of task 1.

SK_MI_1

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: SK_MI_1
  • Participant: TREMA-UNH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-22
  • Task: trec2025-dragun-qgen
  • MD5: 94de962753f7b1520074329193c93b8e
  • Run description: Baseline. This run is based on the starter kit, with the prompts tweaked to suit the small Qwen 7B model and a maximum of 1 iteration.

SK_MI_2

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: SK_MI_2
  • Participant: TREMA-UNH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-22
  • Task: trec2025-dragun-qgen
  • MD5: 3258a24ac48a081881b614cd4cf2b7ee
  • Run description: Baseline. Based on the starter kit, with tweaks in prompts and a max iteration of 2.

SK_MI_2_RG

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: SK_MI_2_RG
  • Participant: TREMA-UNH
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-22
  • Task: trec2025-dragun-repgen
  • MD5: 772333e32e4585daae69b8b58ac71736
  • Run description: Same as task 1.

Team01_Run01_Winner

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: Team01_Run01_Winner
  • Participant: SCIAI
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-17
  • Task: trec2025-dragun-repgen
  • MD5: d9573fec920f7a29ceb95e1f481a15c8
  • Run description: Our best attempt with our finalized pipeline, which automatically generates reports using only the MS MARCO data.

Team02_Run01_1000SegmentsExpansion

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: Team02_Run01_1000SegmentsExpansion
  • Participant: SCIAI
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-15
  • Task: trec2025-dragun-repgen
  • MD5: c3ecc285056b9756ae41f77a1218723f
  • Run description: A set of 60 questions is generated from the article contents via three LLM calls. These are narrowed down to 10 using a pre-trained model that ranks the questions and by removing questions too similar to one another. The questions are then used to generate additional queries. Each query retrieves the top 1000 segments from MS MARCO V2.1 (Segmented), followed by reranking and an LLM selecting the most relevant segments for each question. Finally, an LLM answers as many questions as possible using the retrieved segments before hitting the 250-word limit of the final report.

Team02_Run02_100SegmentsExpansion

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: Team02_Run02_100SegmentsExpansion
  • Participant: SCIAI
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-15
  • Task: trec2025-dragun-repgen
  • MD5: 9f5a770b48645611b49a66c240b73c70
  • Run description: A set of 60 questions is generated from the article contents via three LLM calls. These are narrowed down to 10 using a pre-trained model that ranks the questions and by removing questions too similar to one another. The questions are then used to generate additional queries. Each query retrieves the top 100 segments from MS MARCO V2.1 (Segmented), followed by reranking and an LLM selecting the most relevant segments for each question. Finally, an LLM answers as many questions as possible using the retrieved segments before hitting the 250-word limit of the final report.

Team02_Run03_100SegmentsNoExpansion

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: Team02_Run03_100SegmentsNoExpansion
  • Participant: SCIAI
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-15
  • Task: trec2025-dragun-repgen
  • MD5: cb0e321a12f7f1b7eb4ecd23fd58bd6d
  • Run description: A set of 60 questions is generated from the article contents via three LLM calls. These are narrowed down to 10 using a pre-trained model that ranks the questions and by removing questions too similar to one another. The questions are then used directly to retrieve the top 100 segments from MS MARCO V2.1 (Segmented), followed by reranking and an LLM selecting the most relevant segments for each question. Finally, an LLM answers as many questions as possible using the retrieved segments before hitting the 250-word limit of the final report.

Team02_Task1

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: Team02_Task1
  • Participant: SCIAI
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-18
  • Task: trec2025-dragun-qgen
  • MD5: 02d48540199b8147969557beebf3a266
  • Run description: For each article, 60 questions were generated. Using a model pre-trained on the rankings of questions generated in 2024, the 60 questions were sorted from best to worst, and any questions deemed too similar to others were removed. Finally, the top 10 questions are used in the final report.
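
The similarity-based pruning of a best-first ranked question list can be sketched as follows. This is an illustrative version only: token-set Jaccard similarity and the 0.6 cutoff are assumptions, since the run does not specify the similarity measure or threshold.

```python
def jaccard(a, b):
    """Token-set Jaccard similarity between two questions."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def select_top_k(ranked_questions, k=10, threshold=0.6):
    """ranked_questions is best-first; a question is dropped when it is
    too similar to one already kept (threshold is hypothetical)."""
    kept = []
    for q in ranked_questions:
        if all(jaccard(q, p) < threshold for p in kept):
            kept.append(q)
        if len(kept) == k:
            break
    return kept
```

Because the input is already sorted best-to-worst, greedily keeping the first non-redundant questions preserves the ranking model's preferences while enforcing diversity.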

UR_IW_run_1

Participants | Proceedings | Input | dragun-qgen | Appendix

  • Run ID: UR_IW_run_1
  • Participant: UR_trecking
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-22
  • Task: trec2025-dragun-qgen
  • MD5: 8be66013872e5e531618543254a4de17
  • Run description: Thirty questions per article were generated using GPT-5 nano and then filtered to remove compound questions.
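
A compound-question filter of this kind could be approximated as below. The heuristics shown (multiple question marks, or a comma followed by a coordinating conjunction) are hypothetical; the run does not document how compound questions were detected.

```python
def is_compound(question):
    """Hypothetical heuristics: more than one '?', or a comma followed
    by a coordinating conjunction, usually signal two fused questions."""
    q = question.lower()
    return (
        q.count('?') > 1
        or ', and ' in q
        or ', or ' in q
        or ' and/or ' in q
    )

def filter_questions(questions):
    """Keep only questions that do not look compound."""
    return [q for q in questions if not is_compound(q)]
```

A bare "and" inside a noun phrase ("cause and effect") is deliberately not flagged, since conjunctions alone do not make a question compound.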

UR_IW_run_1_task2

Participants | Proceedings | Input | contradictory.dragun-repgen | repgen_results | supportive.dragun-repgen | Appendix

  • Run ID: UR_IW_run_1_task2
  • Participant: UR_trecking
  • Track: Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
  • Year: 2025
  • Submission: 2025-08-22
  • Task: trec2025-dragun-repgen
  • MD5: 95ed0bd3ed4fe298a5cfc1e70e4f29c1
  • Run description: We used CoT query expansion (Jagerman et al., 2023) to transform the questions from task 1 into queries. We searched an Elasticsearch index of MS MARCO V2.1 (Segmented) with a multi-match query using the standard English analyzer, then reranked the top 1000 retrieval results with the monoT5 reranker. For up to the top 100 reranked documents we judged relevance against two conditions: whether the source of the retrieved document is trustworthy (dataset of Lin et al., 2023; PC trustworthiness score > 0.7) and whether the document is relevant (using an LLM). We used the remaining documents to generate the report: we prompted an LLM to generate answers for the questions we were able to retrieve segments for, and if more than 10 questions were answerable we used k-means to select the 10 most diverse questions based on their embeddings. A shortener was employed to reduce the report to a maximum of 250 words.
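
The k-means step for picking the 10 most diverse questions can be sketched as follows. This is a toy reconstruction under stated assumptions: a tiny hand-rolled k-means with deterministic initialization and Euclidean distance, with "one question per cluster" chosen as the question whose embedding lies closest to each centroid. The team's actual embedding model and k-means implementation are not specified.

```python
import math

def dist(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20):
    """Minimal Lloyd's k-means; first-k initialization for determinism."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda i: dist(p, centroids[i]))].append(p)
        centroids = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids

def select_diverse(questions, embeddings, k=10):
    """Pick one question per cluster: the one whose embedding is closest
    to each centroid (may return fewer than k if picks coincide)."""
    if len(questions) <= k:
        return list(questions)
    centroids = kmeans(embeddings, k)
    chosen = []
    for c in centroids:
        idx = min(range(len(questions)), key=lambda i: dist(embeddings[i], c))
        if questions[idx] not in chosen:
            chosen.append(questions[idx])
    return chosen
```

Clustering the answerable questions by embedding and sampling one per cluster spreads the 10 report questions across distinct topical regions instead of letting near-duplicates crowd the word budget.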