Run description: Using the Indri Query System. 1. Build a matrix based on the baseline results and find the 25 documents nearest to the input document vector using the kNN method. 2. Use a language model (LM) to extract the 20 most relevant words from the 25 documents, and classify these words with an SVM to obtain expansion words. 3. Extract high-frequency named entities from the 25 documents, compute the KL divergence, and take the 5 nearest named entities as expansion words. 4. Add the expansion words to the original query terms.
Run description: Using the Indri Query System. 1. Build a matrix based on the baseline results and find the 25 documents nearest to the input document vector using the kNN method. 2. Use a language model (LM) to extract the 20 most relevant words from the 25 documents, and classify these words with an SVM to obtain expansion words. 3. Extract high-frequency named entities from the 25 documents, compute the KL divergence, and take the 5 nearest named entities as expansion words. 4. Add the expansion words to the original query terms.
Run description: Using the Indri Query System. 1. Build a matrix based on the baseline results and find the 25 documents nearest to the input document vector using the kNN method. 2. Use a language model (LM) to extract the 20 most relevant words from the 25 documents, and classify these words with an SVM to obtain expansion words. 3. Extract high-frequency named entities from the 25 documents, compute the KL divergence, and take the 5 nearest named entities as expansion words. 4. Add the expansion words to the original query terms.
Run description: Using the Indri Query System. 1. Build a matrix based on the baseline results and find the 25 documents nearest to the input document vector using the kNN method. 2. Use a language model (LM) to extract the 20 most relevant words from the 25 documents, and classify these words with an SVM to obtain expansion words. 3. Extract high-frequency named entities from the 25 documents, compute the KL divergence, and take the 5 nearest named entities as expansion words. 4. Add the expansion words to the original query terms.
Run description: Using the Indri Query System. 1. Build a matrix based on the baseline results and find the 25 documents nearest to the input document vector using the kNN method. 2. Use a language model (LM) to extract the 20 most relevant words from the 25 documents, and classify these words with an SVM to obtain expansion words. 3. Extract high-frequency named entities from the 25 documents, compute the KL divergence, and take the 5 nearest named entities as expansion words. 4. Add the expansion words to the original query terms.
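The runs above do not include code; as an illustration of step 1, here is a minimal sketch of finding the k documents nearest to an input document vector by cosine similarity over sparse term-weight vectors. The document names and vectors are invented for the example and are not from the runs.

```python
# Toy kNN over sparse term->weight vectors, as in step 1 of the runs
# above. All documents and weights here are illustrative.
import math

def cosine(u, v):
    """Cosine similarity between two sparse term->weight dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def knn(query_vec, doc_vecs, k):
    """Return the ids of the k documents most similar to query_vec."""
    scored = sorted(doc_vecs.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

docs = {
    "d1": {"nuclear": 2, "energy": 1},
    "d2": {"nuclear": 1, "waste": 3},
    "d3": {"football": 4},
}
print(knn({"nuclear": 1, "energy": 2}, docs, 2))  # -> ['d1', 'd2']
```

The 25 neighbours found this way would then feed the LM, SVM, and named-entity steps.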
Run description: Relevance Feature Discovery (RFD) is a run on top of the base run results, with the top N=15000 documents selected as the initial documents for RFD. These initial results were re-ranked using the relevance feedback documents provided by TREC. Knowledge was discovered from the relevant documents and the given query using the Rocchio algorithm. Using pseudo-relevance feedback, the top k documents were selected to simulate positive feedback from users and the bottom k were treated as negative feedback (where k = 10). Knowledge was discovered from those selected pseudo-relevance feedback documents using the RFD model, a pattern-based model that uses both positive and negative feedback for feature selection. The 15000 documents were finally re-ranked using the discovered knowledge, and the top 2500 documents were submitted as the result.
Run description: The Rocchio algorithm run is also built on top of the base run result, again with N=15000. These documents were re-ranked using the relevance feedback documents provided by TREC. Knowledge was discovered from the relevant documents and the query using the Rocchio algorithm. Using pseudo-relevance feedback, the top k documents were selected as positive feedback and the bottom k as negative feedback (k = 10). A text mining technique (the Rocchio model) was performed to discover knowledge from those selected pseudo-relevance feedback documents. The 15000 documents were finally re-ranked based on the discovered knowledge, and the top 2500 documents were submitted as the final result.
Run description: Relevance Feature Discovery (RFD) is a run on top of the base run results, with the top N=15000 documents selected as the initial documents for RFD. These initial results were re-ranked using the relevance feedback documents provided by TREC. Knowledge was discovered from the relevant documents and the given query using the Rocchio algorithm. Using pseudo-relevance feedback, the top k documents were selected to simulate positive feedback from users and the bottom k were treated as negative feedback (where k = 10). Knowledge was discovered from those selected pseudo-relevance feedback documents using the RFD model, a pattern-based model that uses both positive and negative feedback for feature selection. The 15000 documents were finally re-ranked using the discovered knowledge, and the top 2500 documents were submitted as the result.
Run description: The Rocchio algorithm run is also built on top of the base run result, again with N=15000. These documents were re-ranked using the relevance feedback documents provided by TREC. Knowledge was discovered from the relevant documents and the query using the Rocchio algorithm. Using pseudo-relevance feedback, the top k documents were selected as positive feedback and the bottom k as negative feedback (k = 10). A text mining technique (the Rocchio model) was performed to discover knowledge from those selected pseudo-relevance feedback documents. The 15000 documents were finally re-ranked based on the discovered knowledge, and the top 2500 documents were submitted as the final result.
Run description: Relevance Feature Discovery (RFD) is a run on top of the base run results, with the top N=15000 documents selected as the initial documents for RFD. These initial results were re-ranked using the relevance feedback documents provided by TREC. Knowledge was discovered from the relevant documents and the given query using the Rocchio algorithm. Using pseudo-relevance feedback, the top k documents were selected to simulate positive feedback from users and the bottom k were treated as negative feedback (where k = 10). Knowledge was discovered from those selected pseudo-relevance feedback documents using the RFD model, a pattern-based model that uses both positive and negative feedback for feature selection. The 15000 documents were finally re-ranked using the discovered knowledge, and the top 2500 documents were submitted as the result.
Run description: The Rocchio algorithm run is also built on top of the base run result, again with N=15000. These documents were re-ranked using the relevance feedback documents provided by TREC. Knowledge was discovered from the relevant documents and the query using the Rocchio algorithm. Using pseudo-relevance feedback, the top k documents were selected as positive feedback and the bottom k as negative feedback (k = 10). A text mining technique (the Rocchio model) was performed to discover knowledge from those selected pseudo-relevance feedback documents. The 15000 documents were finally re-ranked based on the discovered knowledge, and the top 2500 documents were submitted as the final result.
Run description: Relevance Feature Discovery (RFD) is a run on top of the base run results, with the top N=15000 documents selected as the initial documents for RFD. These initial results were re-ranked using the relevance feedback documents provided by TREC. Knowledge was discovered from the relevant documents and the given query using the Rocchio algorithm. Using pseudo-relevance feedback, the top k documents were selected to simulate positive feedback from users and the bottom k were treated as negative feedback (where k = 10). Knowledge was discovered from those selected pseudo-relevance feedback documents using the RFD model, a pattern-based model that uses both positive and negative feedback for feature selection. The 15000 documents were finally re-ranked using the discovered knowledge, and the top 2500 documents were submitted as the result.
Run description: The Rocchio algorithm run is also built on top of the base run result, again with N=15000. These documents were re-ranked using the relevance feedback documents provided by TREC. Knowledge was discovered from the relevant documents and the query using the Rocchio algorithm. Using pseudo-relevance feedback, the top k documents were selected as positive feedback and the bottom k as negative feedback (k = 10). A text mining technique (the Rocchio model) was performed to discover knowledge from those selected pseudo-relevance feedback documents. The 15000 documents were finally re-ranked based on the discovered knowledge, and the top 2500 documents were submitted as the final result.
Run description: Relevance Feature Discovery (RFD) is a run on top of the base run results, with the top N=15000 documents selected as the initial documents for RFD. These initial results were re-ranked using the relevance feedback documents provided by TREC. Knowledge was discovered from the relevant documents and the given query using the Rocchio algorithm. Using pseudo-relevance feedback, the top k documents were selected to simulate positive feedback from users and the bottom k were treated as negative feedback (where k = 10). Knowledge was discovered from those selected pseudo-relevance feedback documents using the RFD model, a pattern-based model that uses both positive and negative feedback for feature selection. The 15000 documents were finally re-ranked using the discovered knowledge, and the top 2500 documents were submitted as the result.
Run description: The Rocchio algorithm run is also built on top of the base run result, again with N=15000. These documents were re-ranked using the relevance feedback documents provided by TREC. Knowledge was discovered from the relevant documents and the query using the Rocchio algorithm. Using pseudo-relevance feedback, the top k documents were selected as positive feedback and the bottom k as negative feedback (k = 10). A text mining technique (the Rocchio model) was performed to discover knowledge from those selected pseudo-relevance feedback documents. The 15000 documents were finally re-ranked based on the discovered knowledge, and the top 2500 documents were submitted as the final result.
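The pseudo-relevance feedback step in the runs above can be sketched with the classic Rocchio update: the top-k ranked documents stand in for positive feedback, the bottom-k for negative feedback, and the query vector is moved toward the positives and away from the negatives. The alpha/beta/gamma values below are common textbook defaults, not the weights these runs used.

```python
# Minimal Rocchio update over sparse term->weight dicts. The feedback
# documents and the alpha/beta/gamma values are illustrative only.
def rocchio(query, positives, negatives, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a modified query vector (term -> weight)."""
    terms = set(query)
    for d in positives + negatives:
        terms |= set(d)
    new_q = {}
    for t in terms:
        pos = sum(d.get(t, 0.0) for d in positives) / max(len(positives), 1)
        neg = sum(d.get(t, 0.0) for d in negatives) / max(len(negatives), 1)
        w = alpha * query.get(t, 0.0) + beta * pos - gamma * neg
        if w > 0:  # negative weights are dropped, as is conventional
            new_q[t] = w
    return new_q

q = rocchio({"nuclear": 1.0},
            positives=[{"nuclear": 2.0, "reactor": 1.0}],
            negatives=[{"football": 3.0}])
print(sorted(q))  # -> ['nuclear', 'reactor']
```

In the runs, the resulting vector would be used to re-score the N=15000 candidates before cutting to the top 2500.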
Run description: As initial text preprocessing steps, all documents in subcategory B and the query terms were stemmed and stopwords were removed. To select the documents most relevant to the query terms, the Rocchio and cosine similarity methods were used to compute the relevance score of each document against the given query and to rank the documents. The top N (N=2500) documents were then selected as the results for the base run.
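A toy version of that preprocessing pipeline might look as follows: stopword removal, a crude suffix stemmer (a stand-in for a real stemmer such as Porter's), and ranking by term overlap with the query. The stopword list, stemming rule, and documents are all invented for the example.

```python
# Toy preprocessing + ranking pipeline. The stopword list and the
# trailing-'s' "stemmer" are deliberately simplistic stand-ins.
STOPWORDS = {"the", "of", "a", "to", "and", "in"}

def preprocess(text):
    """Lowercase, drop stopwords, strip a trailing 's' as a toy stem."""
    out = []
    for tok in text.lower().split():
        if tok in STOPWORDS:
            continue
        out.append(tok[:-1] if tok.endswith("s") and len(tok) > 3 else tok)
    return out

def rank(query, docs, n):
    """Score each doc by overlap with the query terms; keep the top n."""
    q = set(preprocess(query))
    scored = sorted(docs.items(),
                    key=lambda kv: len(q & set(preprocess(kv[1]))),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:n]]

docs = {"d1": "the reactors of France", "d2": "football results"}
print(rank("nuclear reactors", docs, 1))  # -> ['d1']
```

The real run would score with Rocchio-weighted cosine similarity rather than raw overlap, and keep N=2500 documents.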
Run description: Basic Rocchio feedback of the 1994 era. 25 terms are added, each of which must occur 100 times in the collection. Rocchio weights of 16,16,32 (the 32 is not used), giving equal weights to terms from the topic and from the single document.
Run description: Basic Rocchio feedback of the 1994 era. 25 terms are added, each of which must occur 100 times in the collection. Rocchio weights of 16,16,32 (the 32 is not used), giving equal weights to terms from the topic and from the single document.
Run description: Basic Rocchio feedback of the 1994 era. 25 terms are added, each of which must occur 100 times in the collection. Rocchio weights of 16,16,32 (the 32 is not used), giving equal weights to terms from the topic and from the single document.
Run description: Basic Rocchio feedback of the 1994 era. 25 terms are added, each of which must occur 100 times in the collection. Rocchio weights of 16,16,32 (the 32 is not used), giving equal weights to terms from the topic and from the single document.
Run description: Basic Rocchio feedback of the 1994 era. 25 terms are added, each of which must occur 100 times in the collection. Rocchio weights of 16,16,32 (the 32 is not used), giving equal weights to terms from the topic and from the single document.
Run description: Basic Rocchio feedback of the 1994 era. 25 terms are added, each of which must occur 100 times in the collection. Rocchio weights of 16,16,32 (the 32 is not used), giving equal weights to terms from the topic and from the single document.
Run description: Basic Rocchio feedback of the 1994 era. 25 terms are added, each of which must occur 100 times in the collection. Rocchio weights of 16,16,32 (the 32 is not used), giving equal weights to terms from the topic and from the single document.
Run description: Basic Rocchio feedback of the 1994 era. 25 terms are added, each of which must occur 100 times in the collection. Rocchio weights of 16,16,32 (the 32 is not used), giving equal weights to terms from the topic and from the single document.
Run description: Basic Rocchio feedback of the 1994 era. 25 terms are added, each of which must occur 100 times in the collection. Rocchio weights of 16,16,32 (the 32 is not used), giving equal weights to terms from the topic and from the single document.
Run description: Basic Rocchio feedback of the 1994 era. 25 terms are added, each of which must occur 100 times in the collection. Rocchio weights of 16,16,32 (the 32 is not used), giving equal weights to terms from the topic and from the single document.
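As an illustration of that expansion step, here is a sketch: candidate terms from the feedback document are filtered by a collection-frequency threshold, and the survivors are added to the query with the same fixed weight the topic terms carry. How the 25 added terms are chosen among qualifying candidates is not stated in the description; ranking them by collection frequency below is purely an assumption, and all counts are invented.

```python
# Toy expansion: keep candidate terms occurring >= min_cf times in the
# collection, add the top n_add of them at a fixed Rocchio weight.
# Ranking candidates by collection frequency is an assumption, not the
# documented selection rule; counts and terms are illustrative.
def expand(query_terms, doc_terms, coll_freq, n_add=25, min_cf=100,
           topic_w=16, doc_w=16):
    """Return term -> weight pairs for the expanded query."""
    expanded = {t: topic_w for t in query_terms}
    candidates = [t for t in doc_terms
                  if t not in expanded and coll_freq.get(t, 0) >= min_cf]
    for t in sorted(candidates, key=coll_freq.get, reverse=True)[:n_add]:
        expanded[t] = doc_w
    return expanded

cf = {"reactor": 500, "chernobyl": 120, "rare": 3}
print(expand(["nuclear"], ["reactor", "chernobyl", "rare"], cf, n_add=2))
# -> {'nuclear': 16, 'reactor': 16, 'chernobyl': 16}
```

Note how "rare" is filtered out by the frequency threshold, matching the "must occur 100 times in the collection" constraint.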
Run description: Basic vector space similarity of each document to the topic, combined with a second, equally weighted pass using the best inner-product passage similarity with the topic.
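That two-pass scoring can be sketched as an equal-weight combination of a whole-document inner product with the topic vector and the best inner product over fixed-size passages of the document. The passage length and vectors below are illustrative; the run's actual passage definition is not given.

```python
# Toy two-pass score: 0.5 * doc similarity + 0.5 * best passage
# similarity, both as inner products with the topic vector. The
# passage length and example data are assumptions for illustration.
def inner(u, v):
    return sum(w * v.get(t, 0.0) for t, w in u.items())

def two_pass_score(topic_vec, doc_tokens, passage_len=50):
    """Equal-weight mix of doc similarity and best passage similarity."""
    def tf(tokens):
        vec = {}
        for t in tokens:
            vec[t] = vec.get(t, 0) + 1
        return vec
    doc_score = inner(topic_vec, tf(doc_tokens))
    best_passage = max(
        inner(topic_vec, tf(doc_tokens[i:i + passage_len]))
        for i in range(0, max(len(doc_tokens) - passage_len + 1, 1),
                       passage_len)
    )
    return 0.5 * doc_score + 0.5 * best_passage

score = two_pass_score({"nuclear": 1},
                       ["nuclear", "waste", "football", "nuclear"],
                       passage_len=2)
print(score)  # -> 1.5
```

The passage pass rewards documents whose relevant terms are concentrated in one region rather than scattered thinly.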
Run description: In this run the 10 keywords with the highest IDF in the feedback document are used to expand the query. Then a co-occurrence matrix of the expanded query terms is computed using contiguous text windows of size 7. The co-occurrence matrix is decomposed by SVD and the principal eigenvector is used to re-rank the documents. In particular, the documents are re-ranked according to their distance from the subspace spanned by the selected eigenvector. Each document is represented as a vector of BM25 weights. The top 2500 documents retrieved by the baseline, i.e. BM25, are re-ranked.
Run description: In this run the 10 keywords with the highest IDF in the feedback document are used to expand the query. Then a co-occurrence matrix of the expanded query terms is computed using contiguous text windows of size 7. The co-occurrence matrix is decomposed by SVD and the principal eigenvector is used to re-rank the documents. In particular, the documents are re-ranked according to their distance from the subspace spanned by the selected eigenvector. Each document is represented as a vector of BM25 weights. The top 2500 documents retrieved by the baseline, i.e. BM25, are re-ranked.
Run description: In this run the 10 keywords with the highest IDF in the feedback document are used to expand the query. Then a co-occurrence matrix of the expanded query terms is computed using contiguous text windows of size 7. The co-occurrence matrix is decomposed by SVD and the principal eigenvector is used to re-rank the documents. In particular, the documents are re-ranked according to their distance from the subspace spanned by the selected eigenvector. Each document is represented as a vector of BM25 weights. The top 2500 documents retrieved by the baseline, i.e. BM25, are re-ranked.
Run description: In this run the 10 keywords with the highest IDF in the feedback document are used to expand the query. Then a co-occurrence matrix of the expanded query terms is computed using contiguous text windows of size 7. The co-occurrence matrix is decomposed by SVD and the principal eigenvector is used to re-rank the documents. In particular, the documents are re-ranked according to their distance from the subspace spanned by the selected eigenvector. Each document is represented as a vector of BM25 weights. The top 2500 documents retrieved by the baseline, i.e. BM25, are re-ranked.
Run description: In this run the 10 keywords with the highest IDF in the feedback document are used to expand the query. Then a co-occurrence matrix of the expanded query terms is computed using contiguous text windows of size 7. The co-occurrence matrix is decomposed by SVD and the principal eigenvector is used to re-rank the documents. In particular, the documents are re-ranked according to their distance from the subspace spanned by the selected eigenvector. Each document is represented as a vector of BM25 weights. The top 2500 documents retrieved by the baseline, i.e. BM25, are re-ranked.
Run description: In this run the 10 keywords with the highest IDF in the feedback document are used to expand the query. Then a co-occurrence matrix of the expanded query terms is computed using contiguous text windows of size 7. The co-occurrence matrix is decomposed by SVD and the principal eigenvector is used to re-rank the documents. In particular, the documents are re-ranked according to their distance from the subspace spanned by the selected eigenvector. Each document is represented as a vector of BM25 weights. The top 2500 documents retrieved by the baseline, i.e. BM25, are re-ranked.
Run description: In this run the 10 keywords with the highest IDF in the feedback document are used to expand the query. Then a co-occurrence matrix of the expanded query terms is computed using contiguous text windows of size 7. The co-occurrence matrix is decomposed by SVD and the principal eigenvector is used to re-rank the documents. In particular, the documents are re-ranked according to their distance from the subspace spanned by the selected eigenvector. Each document is represented as a vector of BM25 weights. The top 2500 documents retrieved by the baseline, i.e. BM25, are re-ranked.
Run description: In this run the 10 keywords with the highest IDF in the feedback document are used to expand the query. Then a co-occurrence matrix of the expanded query terms is computed using contiguous text windows of size 7. The co-occurrence matrix is decomposed by SVD and the principal eigenvector is used to re-rank the documents. In particular, the documents are re-ranked according to their distance from the subspace spanned by the selected eigenvector. Each document is represented as a vector of BM25 weights. The top 2500 documents retrieved by the baseline, i.e. BM25, are re-ranked.
Run description: In this run the 10 keywords with the highest IDF in the feedback document are used to expand the query. Then a co-occurrence matrix of the expanded query terms is computed using contiguous text windows of size 7. The co-occurrence matrix is decomposed by SVD and the principal eigenvector is used to re-rank the documents. In particular, the documents are re-ranked according to their distance from the subspace spanned by the selected eigenvector. Each document is represented as a vector of BM25 weights. The top 2500 documents retrieved by the baseline, i.e. BM25, are re-ranked.
Run description: In this run the 10 keywords with the highest IDF in the feedback document are used to expand the query. Then a co-occurrence matrix of the expanded query terms is computed using contiguous text windows of size 7. The co-occurrence matrix is decomposed by SVD and the principal eigenvector is used to re-rank the documents. In particular, the documents are re-ranked according to their distance from the subspace spanned by the selected eigenvector. Each document is represented as a vector of BM25 weights. The top 2500 documents retrieved by the baseline, i.e. BM25, are re-ranked.
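The eigenvector re-ranking step above can be sketched as follows: build a term co-occurrence matrix from sliding windows, extract its principal eigenvector, and score each document by its residual distance from the subspace spanned by that vector. Power iteration stands in for a full SVD here, which is a reasonable substitute since for a symmetric non-negative matrix the leading eigenvector and leading singular vector coincide in direction; the tokens and vocabulary are invented, and real documents would be BM25-weighted vectors.

```python
# Toy co-occurrence -> principal eigenvector -> distance-from-subspace
# pipeline. Power iteration replaces a full SVD; data is illustrative.
import math

def cooccurrence(tokens, vocab, window=7):
    """Symmetric co-occurrence counts over sliding windows of `window`."""
    n = len(vocab)
    idx = {t: i for i, t in enumerate(vocab)}
    M = [[0.0] * n for _ in range(n)]
    for s in range(len(tokens) - window + 1):
        win = [idx[t] for t in tokens[s:s + window] if t in idx]
        for i in win:
            for j in win:
                if i != j:
                    M[i][j] += 1.0
    return M

def principal_eigenvector(M, iters=100):
    """Leading eigenvector of a symmetric matrix by power iteration."""
    n = len(M)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w)) or 1.0
        v = [x / norm for x in w]
    return v

def distance_from_span(doc_vec, v):
    """||d - (d.v)v|| for unit v: the residual outside span(v)."""
    proj = sum(d * u for d, u in zip(doc_vec, v))
    resid = [d - proj * u for d, u in zip(doc_vec, v)]
    return math.sqrt(sum(x * x for x in resid))

M = cooccurrence(["a", "b", "a", "b"], ["a", "b"], window=2)
v = principal_eigenvector(M)
print(distance_from_span([1.0, 1.0], v))  # ~0: lies in the subspace
```

Documents closest to the subspace (smallest residual) would be ranked highest among the 2500 baseline candidates.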