Run description: BM25+RM3 (Pyserini defaults) served as the initial retrieval method (denoted by ret_bm25_rm3 in the file name), retrieving 100 passages per query (denoted by k_100). The query in each turn was rewritten using (1) the conversational context and (2) the top-3 relevant PTKB statements (denoted by num_ptkb_3 in the file name). Query rewriting: in all cases, the rewritten query was constructed by appending the relevant PTKB statements to the (manually or automatically) resolved query; the query was resolved automatically with the castorini/t5-base-canard model available on HuggingFace. The relevant PTKB statements were determined automatically by re-ranking the statements with SentenceTransformers, specifically the cross-encoder/ms-marco-MiniLM-L-6-v2 model available on HuggingFace. Response generation: a response was generated from the top-3 passages retrieved with the rewritten query (denoted by num_psg_3 in the file name), using the T5 model mrm8488/t5-base-finetuned-summarize-news available on HuggingFace.
Run description: BM25+RM3 (Pyserini defaults) served as the initial retrieval method, retrieving 100 passages per query. The query in each turn was rewritten using (1) the context and (2) the top-3 relevant PTKB statements. The rewritten query was constructed by appending the relevant PTKB statements to the manually resolved query. A response was generated from the top-3 passages retrieved with the rewritten query, using the T5 model mrm8488/t5-base-finetuned-summarize-news available on HuggingFace.
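For illustration, a minimal Python sketch of the pipeline described in the two runs above, using Pyserini and HuggingFace (not the participants' code; the index path, the " ||| " history separator for t5-base-canard, and generation settings are assumptions):

    from pyserini.search.lucene import LuceneSearcher
    from sentence_transformers import CrossEncoder
    from transformers import pipeline

    searcher = LuceneSearcher('indexes/ikat')   # hypothetical index path
    searcher.set_rm3()                          # Pyserini-default RM3 expansion

    ptkb_ranker = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')
    rewriter = pipeline('text2text-generation', model='castorini/t5-base-canard')
    responder = pipeline('summarization', model='mrm8488/t5-base-finetuned-summarize-news')

    def run_turn(history, utterance, ptkb, k=100):
        # Resolve the utterance against the conversation (automatic variant);
        # the manual run substitutes the manually resolved query here.
        resolved = rewriter(' ||| '.join(history + [utterance]))[0]['generated_text']
        # Re-rank the PTKB statements against the resolved query; keep the top 3.
        scores = ptkb_ranker.predict([(resolved, s) for s in ptkb])
        top3 = [s for _, s in sorted(zip(scores, ptkb), reverse=True)[:3]]
        # Rewritten query = resolved query + appended PTKB statements.
        hits = searcher.search(' '.join([resolved] + top3), k=k)
        # Generate the response from the top-3 retrieved passages.
        passages = [searcher.doc(h.docid).raw() for h in hits[:3]]
        response = responder(' '.join(passages))[0]['summary_text']
        return hits, response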
Run description: Query rewriting: a rewriting model conditioned on all given PTKB statements. Retrieval: sparse retrieval with re-ranking by dense retrievers; the retriever was fine-tuned on QReCC augmented with synthetic statements. Response generation: generative QA models; the QA model was fine-tuned on the QReCC dataset augmented with synthetic statements.
Run description: Query rewriting: a statement-aware query-rewriting model. Retrieval: sparse retrieval with re-ranking by dense retrievers. Response generation: generative QA models.
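A minimal sketch of the retrieve-then-rerank step shared by the two runs above, using a sentence-transformers bi-encoder (the checkpoint is a placeholder; these runs use their own fine-tuned retrievers):

    from sentence_transformers import SentenceTransformer, util

    encoder = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')  # placeholder

    def dense_rerank(query, passages):
        # Embed the query and the sparse-retrieval candidates, then reorder
        # the candidates by cosine similarity to the query.
        q_emb = encoder.encode(query, convert_to_tensor=True)
        p_emb = encoder.encode(passages, convert_to_tensor=True)
        sims = util.cos_sim(q_emb, p_emb)[0]
        order = sims.argsort(descending=True).tolist()
        return [passages[i] for i in order]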
Run description: ConvGQR combines query rewriting and query expansion for query reformulation; the reformulation models are trained on the QReCC dataset and then applied to the iKAT dataset.
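ConvGQR trains two seq2seq models, one that rewrites the query and one that generates an expansion (a pseudo-answer), and combines their outputs into the retrieval query. A schematic sketch (the t5-base checkpoints are placeholders for ConvGQR's own QReCC-trained models, and the separator is an assumption):

    from transformers import pipeline

    rewriter = pipeline('text2text-generation', model='t5-base')   # placeholder
    expander = pipeline('text2text-generation', model='t5-base')   # placeholder

    def reformulate(history, utterance):
        ctx = ' [SEP] '.join(history + [utterance])
        rewrite = rewriter(ctx)[0]['generated_text']     # de-contextualized query
        expansion = expander(ctx)[0]['generated_text']   # pseudo-answer expansion
        return f'{rewrite} {expansion}'                  # concatenated retrieval query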
Run description: This is a two-shot approach: I had Llama generate a response to the user utterance (combined, in a fluent-ish manner, with the PTKBs automatically determined to be relevant), took that response, found passages classified as reliable by the TF-IDF and logistic regression model, used those passages with Llama to generate another response, found new passages from that second response, and summarized each passage (with its sentences ranked by relevance to the query) in 1-2 sentences with FastChat-T5, combining the summaries into the final response text.
Run description: This is a one-shot approach: I had Llama generate a response to the user utterance (combined, in a fluent-ish manner, with the PTKBs automatically determined to be relevant), took that response, found passages classified as reliable by the TF-IDF and logistic regression model, and summarized each passage (with its sentences ranked by relevance to the query) in 1-2 sentences with FastChat-T5, combining the summaries into the final response text.
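A minimal sketch of the passage-reliability classifier shared by the two runs above, assuming scikit-learn and reading "logarithmic regression" as logistic regression (the training data is not described in the runs, so the labelled examples here are hypothetical):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Hypothetical labelled training data: 1 = reliable passage, 0 = unreliable.
    train_passages = ['a well-sourced informative passage', 'a spammy boilerplate passage']
    train_labels = [1, 0]

    clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    clf.fit(train_passages, train_labels)

    def reliable_passages(passages):
        # Keep only the passages the classifier labels as reliable.
        return [p for p, y in zip(passages, clf.predict(passages)) if y == 1]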
Run description: This is a one-shot approach: I had Llama generate a response to the user utterance (combined, in a fluent-ish manner, with the PTKBs automatically determined to be relevant), took that response, found the top passages as scored by BM25, and summarized each passage (with its sentences ranked by relevance to the query) in 1-2 sentences with FastChat-T5, combining the summaries into the final response text.
Run ID: GRILL_BM25_T5Rewriter_T5Ranker_BARTSummariser
Participant: GRILL_Team
Track: Interactive Knowledge Assistance
Year: 2023
Submission: 9/1/2023
Type: automatic
Task: primary
MD5:b38ce956461ba62486e8a91bcc9d17ee
Run description: The pipeline consists of T5 for query rewriting, BM25 for initial retrieval, T5 for document/passage re-ranking, and a BART model for response generation. We also include a simulator in the loop, based on a small LLaMA model, that provides simulated user feedback and answers to clarification questions.
Run ID: GRILL_BM25_T5Rewriter_T5Ranker_BARTSummariser_10
Participant: GRILL_Team
Track: Interactive Knowledge Assistance
Year: 2023
Submission: 9/3/2023
Type: automatic
Task: primary
MD5:1f2f2ee1de32902159df12c25a2af9f9
Run description: T5 for query rewriting, BM25 for initial retrieval, then T5 for passage ranking. A Llama 2 based simulator in the loop provides simulated feedback for up to 10 rounds per query.
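A schematic sketch of the simulator-in-the-loop setup used in these GRILL runs; search_pipeline and simulator are hypothetical callables standing in for the retrieval pipeline and the Llama 2 based user simulator:

    def search_with_feedback(utterance, search_pipeline, simulator, max_rounds=10):
        query = utterance
        response, hits = None, []
        for _ in range(max_rounds):
            response, hits = search_pipeline(query)
            # Simulated user feedback, or an answer to a clarifying question.
            feedback = simulator(query, response)
            if feedback is None:             # simulator is satisfied: stop early
                break
            query = f'{query} {feedback}'    # fold the feedback into the next query
        return response, hits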
Run description: This run uses ColBERT-based dense retrieval for initial retrieval and generates a response from the top-three ranked passages. There is also a simulator in the loop, based on a small Llama 2 model, that provides simulated feedback and answers to clarifying questions based on the user's information needs and PTKBs.
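A minimal sketch of ColBERT-based first-stage retrieval with the stanford-futuredata ColBERT package (the experiment name, index name, and query are placeholders):

    from colbert import Searcher
    from colbert.infra import Run, RunConfig

    query = 'example rewritten query'                         # placeholder
    with Run().context(RunConfig(experiment='ikat')):         # placeholder experiment
        searcher = Searcher(index='ikat.colbert.index')       # placeholder index name
        # Dense first-stage retrieval: returns passage ids, ranks, and scores.
        pids, ranks, scores = searcher.search(query, k=100)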
Run description: The pipeline consists of multiple calls to a (small) LLaMA 2 model with different prompts and updated versions of the conversation/task state. The first call is a rewrite call: the prompt contains the conversation so far and asks the model to reformulate the latest utterance to help a search system retrieve better results. The second call is a re-ranking call: after initial retrieval with the rewrite from the first call, the prompt contains the conversation so far and a document, and asks the model to score the document's relevance to the conversation. The last call is a response-generation call: the prompt contains the top-3 documents and the conversation so far, and asks the model to generate a response that satisfies the user's information need.
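A condensed sketch of the three prompted calls (the checkpoint, the prompt wording, and the score parsing are assumptions; searcher stands in for a first-stage retriever as in the other runs):

    from transformers import pipeline

    llm = pipeline('text-generation', model='meta-llama/Llama-2-7b-chat-hf')  # placeholder

    def ask(prompt, max_new_tokens=256):
        return llm(prompt, max_new_tokens=max_new_tokens,
                   return_full_text=False)[0]['generated_text']

    def run_turn(conversation, searcher, k=100):
        # Call 1: rewrite the latest utterance given the conversation so far.
        rewrite = ask(f'Conversation:\n{conversation}\n'
                      'Reformulate the last utterance as a standalone search query:')
        # Call 2: score each retrieved document for relevance to the conversation.
        scored = []
        for hit in searcher.search(rewrite, k=k):
            raw = ask(f'Conversation:\n{conversation}\nDocument:\n{hit.raw}\n'
                      'Rate the relevance of the document from 0 to 10:',
                      max_new_tokens=4)
            try:
                score = float(raw.strip().split()[0])
            except (ValueError, IndexError):
                score = 0.0
            scored.append((score, hit))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Call 3: generate a response grounded in the top-3 documents.
        top3 = '\n'.join(hit.raw for _, hit in scored[:3])
        return ask(f'Documents:\n{top3}\nConversation:\n{conversation}\n'
                   "Write a response that satisfies the user's information need:")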
Run description: This run is based on zero-shot prompting of the Llama 7B model. Llama (zero-shot) is used for response generation and query rewriting. For PTKB selection, a pre-trained SentenceTransformers model is used. For re-ranking, the cross-encoder model ms-marco-MiniLM-L-12-v2 from HuggingFace, trained on the MS MARCO passage-ranking task, is used.
Run description: The Llama model is fine-tuned for query rewriting and response generation in this task. SentenceTransformers is used for PTKB selection, and a cross-encoder (MiniLM-L-12) from HuggingFace is used for re-ranking.
Run description: This is a manual run based on the Llama (7B) model fine-tuned on the iKAT training dataset. The Llama model is used for response generation. For BM25 retrieval and re-ranking, the manually rewritten query and the ground-truth relevant PTKB statements are used.
Run description: In this run, an initial answer is generated for each turn using GPT-4. GPT-4 is then used to generate 5 queries for each answer. The generated queries are passed to a BM25 model and a re-ranker (cross-encoder MiniLM). The top-2 documents retrieved by each query are selected and passed to GPT-4 to generate the response text.
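A minimal sketch of this flow with the OpenAI client, Pyserini BM25, and a cross-encoder re-ranker (the prompt wording, index path, and candidate-pool size are assumptions):

    from openai import OpenAI
    from pyserini.search.lucene import LuceneSearcher
    from sentence_transformers import CrossEncoder

    client = OpenAI()
    searcher = LuceneSearcher('indexes/ikat')   # hypothetical index path
    reranker = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-12-v2')

    def gpt4(prompt):
        resp = client.chat.completions.create(
            model='gpt-4', messages=[{'role': 'user', 'content': prompt}])
        return resp.choices[0].message.content

    def run_turn(utterance):
        # Step 1: generate an initial answer for the turn.
        answer = gpt4(utterance)
        # Step 2: generate 5 search queries from that answer.
        queries = gpt4('Write 5 search queries, one per line, to verify this '
                       f'answer:\n{answer}').splitlines()[:5]
        # Step 3: BM25 retrieval + cross-encoder re-ranking; keep the top-2
        # documents per query.
        evidence = []
        for q in queries:
            hits = searcher.search(q, k=50)
            scores = reranker.predict([(q, h.raw) for h in hits])
            ranked = sorted(zip(scores, hits), key=lambda p: p[0], reverse=True)
            evidence.extend(h.raw for _, h in ranked[:2])
        # Step 4: generate the final response from the selected documents.
        return gpt4('Passages:\n' + '\n'.join(evidence) + f'\n\nAnswer: {utterance}')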
Run ID: run_automatic_dense_damo_canard_16000_recall
Participant: IITD
Track: Interactive Knowledge Assistance
Year: 2023
Submission: 9/4/2023
Type: automatic
Task: primary
MD5:02ba35a3916adaeffd7d6ca40299ba78
Run description: This method is a two-step pipeline: the first step is dense retrieval, followed by re-ranking of passages. The automatic queries are rewritten using a T5-based query-rewriting module. Re-ranking is done with a T5-based model.
Run description: This method has two key components: (1) a neural model that preserves key elements during reformulation, so that it can keep track of the key items in the conversation; and (2) conventional dense retrieval followed by neural re-ranking.
Run description: This method is a two-step pipeline: the first step is dense retrieval, followed by re-ranking of passages. The automatic queries are rewritten using a custom-trained BART-based query-rewriting module, fine-tuned on the SAMSum and CANARD datasets. Re-ranking is done with a T5-based model.
Run description: This method is a two-step pipeline: the first step is dense retrieval, followed by re-ranking of passages. The automatic queries are rewritten using a custom-trained BART-based query-rewriting module, fine-tuned on the SAMSum and CANARD datasets. Re-ranking is done with a COROM model.
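A minimal inference sketch for the BART-based rewriter shared by these runs (facebook/bart-base stands in for the custom model fine-tuned on SAMSum and CANARD; the </s> history separator is an assumption):

    from transformers import BartForConditionalGeneration, BartTokenizer

    tok = BartTokenizer.from_pretrained('facebook/bart-base')                   # placeholder
    model = BartForConditionalGeneration.from_pretrained('facebook/bart-base')  # placeholder

    def rewrite(history, utterance):
        # Concatenate the dialogue history with the current utterance and
        # decode a self-contained query for retrieval.
        src = ' </s> '.join(history + [utterance])
        ids = tok(src, return_tensors='pt', truncation=True).input_ids
        out = model.generate(ids, max_new_tokens=64, num_beams=4)
        return tok.decode(out[0], skip_special_tokens=True)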
Run description: We utilized Pyserini's LuceneSearcher tool to perform initial passage retrieval for each utterance turn. Then, multiple LLMs were employed to re-rank the top-five passages retrieved in each turn via pairwise ranking; we aggregated their results to obtain a final ranking. Both the passage retrieval and re-ranking stages consider the first two relevant PTKB statements produced by our automatic runs. This is a zero-shot learning approach.
Run description: We utilized Pyserini's LuceneSearcher tool to perform passage retrieval for each utterance turn. Then, multiple LLMs (the same as in our run 1: "stabilityai/stablelm-tuned-alpha-7b", "eachadea/vicuna-13b-1.1", "jondurbin/airoboros-7b", and "TheBloke/koala-13B-HF") were employed to re-rank the top-five passages retrieved in each turn via pairwise ranking; we aggregated their results to obtain a final ranking. Neither the passage retrieval nor the re-ranking stage considered relevant PTKB statements; only our rewritten utterance was used in each turn. This is a zero-shot learning approach.
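A schematic sketch of pairwise re-ranking aggregated over several LLM judges; judge is a hypothetical callable wrapping one of the listed models and returning 0 or 1, the index of the passage it prefers:

    from itertools import combinations

    def pairwise_rerank(query, passages, judges):
        # Each judge votes on every passage pair; wins are summed across all
        # judges, and passages are ordered by total wins.
        wins = [0] * len(passages)
        for judge in judges:
            for i, j in combinations(range(len(passages)), 2):
                winner = judge(query, passages[i], passages[j])  # 0 -> i wins, 1 -> j wins
                wins[i if winner == 0 else j] += 1
        order = sorted(range(len(passages)), key=wins.__getitem__, reverse=True)
        return [passages[i] for i in order]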
Run description: We utilized Pyserini's LuceneSearcher tool to perform passage retrieval for each utterance turn. Then, we used MonoT5 for passage re-ranking. Both the passage retrieval and re-ranking stages consider the top-2 relevant PTKB statements from our automatic PTKB statement-ranking runs in each turn. This is a zero-shot learning approach.
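A minimal MonoT5 re-ranking sketch, assuming the pygaggle package (the run names MonoT5 but not a specific implementation):

    from pygaggle.rerank.base import Query, Text
    from pygaggle.rerank.transformer import MonoT5

    reranker = MonoT5()   # defaults to castorini/monot5-base-msmarco

    def monot5_rerank(query_text, hits):
        query = Query(query_text)
        texts = [Text(h.raw, metadata={'docid': h.docid}) for h in hits]
        # Score every candidate, then sort by MonoT5 relevance score.
        scored = reranker.rerank(query, texts)
        scored.sort(key=lambda t: t.score, reverse=True)
        return scored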
Run description: We utilized Pyserini's LuceneSearcher tool to perform passage retrieval for each utterance turn. Then, the GPT-3.5 model, combined with prompts and a sliding window, was employed for re-ranking. Both the passage retrieval and re-ranking stages consider the top-2 relevant PTKB statements from our automatic PTKB statement-ranking runs in each turn. This is a zero-shot learning approach.
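A schematic sketch of sliding-window re-ranking with an LLM; llm_order is a hypothetical helper that prompts GPT-3.5 with the query and a window of passages and returns a permutation of that window (the window size and stride are assumptions):

    def sliding_window_rerank(query, passages, llm_order, window=4, stride=2):
        # Slide a window from the bottom of the ranking to the top; the LLM
        # reorders each window, and overlapping windows let strong passages
        # bubble upward past the window boundary.
        ranking = list(passages)
        start = max(len(ranking) - window, 0)
        while True:
            chunk = ranking[start:start + window]
            perm = llm_order(query, chunk)                  # e.g. [2, 0, 1, 3]
            ranking[start:start + window] = [chunk[i] for i in perm]
            if start == 0:
                break
            start = max(start - stride, 0)
        return ranking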