
Proceedings - Enterprise 2007

Overview of the TREC 2007 Enterprise Track

Peter Bailey, Arjen P. de Vries, Nick Craswell, Ian Soboroff

Abstract

The goal of the enterprise track is to conduct experiments with enterprise data that reflect the experiences of users in real organizations. This year, the track introduced a new corpus intended to be more representative of real-world enterprise search, by involving actual members of the organization, performing their real work tasks, in the topic development process.

Bibtex
@inproceedings{DBLP:conf/trec/BaileyVCS07,
    author = {Peter Bailey and Arjen P. de Vries and Nick Craswell and Ian Soboroff},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Overview of the {TREC} 2007 Enterprise Track},
    booktitle = {Proceedings of The Sixteenth Text REtrieval Conference, {TREC} 2007, Gaithersburg, Maryland, USA, November 5-9, 2007},
    series = {{NIST} Special Publication},
    volume = {500-274},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2007},
    url = {http://trec.nist.gov/pubs/trec16/papers/ENT.OVERVIEW16.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/BaileyVCS07.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Exploring the Legal Discovery and Enterprise Tracks at the University of Iowa

Brian Almquist, Viet Ha-Thuc, Aditya Kumar Sehgal, Robert J. Arens, Padmini Srinivasan

Abstract

The University of Iowa Team, under coordinating professor Padmini Srinivasan, participated in the legal discovery and enterprise tracks of TREC-2007.

Bibtex
@inproceedings{DBLP:conf/trec/AlmquistHSAS07,
    author = {Brian Almquist and Viet Ha{-}Thuc and Aditya Kumar Sehgal and Robert J. Arens and Padmini Srinivasan},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Exploring the Legal Discovery and Enterprise Tracks at the University of Iowa},
    booktitle = {Proceedings of The Sixteenth Text REtrieval Conference, {TREC} 2007, Gaithersburg, Maryland, USA, November 5-9, 2007},
    series = {{NIST} Special Publication},
    volume = {500-274},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2007},
    url = {http://trec.nist.gov/pubs/trec16/papers/uiowa.legal.ent.final.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/AlmquistHSAS07.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

TREC 2007 Enterprise Track at CSIRO

Peter Bailey, Deepak Agrawal, Anuj Kumar

Abstract

The goals of CSIRO's participation in the Enterprise track were shaped by the nature of the tasks. For the expert finding task, we sought to use a variety of means to associate topical expertise with individuals previously located within the collection. For the document search task, we were primarily interested in exploring issues of result diversity based on different characterisations of documents within the collection. We completed both tasks by the submission deadline, submitting four runs for each. The algorithms for both tasks used a query-only baseline with subsequent variations, and both made use of the PADRE retrieval system [2], in which the Okapi BM25 relevance function is implemented as the core ranking component. Additional evidence, such as anchor text and other characteristics of Web documents, is incorporated in the retrieval system's default ranking formula.
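The Okapi BM25 function named above as PADRE's core ranking component has a well-known standard form; a minimal sketch follows (the function name and the `k1`/`b` defaults are illustrative, not the settings used in the track submission):

```python
import math

def bm25_score(tf, df, doc_len, avg_doc_len, n_docs, k1=1.2, b=0.75):
    """Okapi BM25 weight for one term: idf times a saturated,
    length-normalized term frequency."""
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1) / (tf + norm)
```

A document's score for a query is then the sum of `bm25_score` over the query terms it contains; the saturation in `tf / (tf + norm)` keeps repeated terms from dominating.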

Bibtex
@inproceedings{DBLP:conf/trec/BaileyAK07,
    author = {Peter Bailey and Deepak Agrawal and Anuj Kumar},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {{TREC} 2007 Enterprise Track at {CSIRO}},
    booktitle = {Proceedings of The Sixteenth Text REtrieval Conference, {TREC} 2007, Gaithersburg, Maryland, USA, November 5-9, 2007},
    series = {{NIST} Special Publication},
    volume = {500-274},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2007},
    url = {http://trec.nist.gov/pubs/trec16/papers/csiro.ent.final.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/BaileyAK07.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Query and Document Models for Enterprise Search

Krisztian Balog, Katja Hofmann, Wouter Weerkamp, Maarten de Rijke

Abstract

We describe our participation in the TREC 2007 Enterprise track and detail our language modeling-based approaches. For document search, our focus was on estimating a mixture model using a standard web collection, and on constructing query models by employing blind relevance feedback and using the example documents provided with the topics. We found that settings performing well on a web collection do not carry over to the CSIRO collection, but the use of advanced query models resulted in significant improvements. In expert search, our experiments concerned document representation, identification of candidate experts, and combinations of expert search strategies. We find no significant difference in average precision but observe small overall positive effects of the advanced models, with large differences between individual topics.
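The query models built from blind relevance feedback and example documents can be pictured as an interpolation of term distributions; a hedged sketch, where the function name, the λ value, and the maximum-likelihood estimate are illustrative rather than the paper's exact estimator:

```python
def expand_query_model(query_terms, feedback_term_probs, lam=0.5):
    """Interpolate a maximum-likelihood query term distribution with a
    feedback-based term distribution (both assumed to sum to 1)."""
    ml = {t: query_terms.count(t) / len(query_terms) for t in set(query_terms)}
    vocab = set(ml) | set(feedback_term_probs)
    return {t: lam * ml.get(t, 0.0) + (1 - lam) * feedback_term_probs.get(t, 0.0)
            for t in vocab}
```

Because the interpolation is convex, the result is again a probability distribution over the combined vocabulary.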

Bibtex
@inproceedings{DBLP:conf/trec/BalogHWR07,
    author = {Krisztian Balog and Katja Hofmann and Wouter Weerkamp and Maarten de Rijke},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Query and Document Models for Enterprise Search},
    booktitle = {Proceedings of The Sixteenth Text REtrieval Conference, {TREC} 2007, Gaithersburg, Maryland, USA, November 5-9, 2007},
    series = {{NIST} Special Publication},
    volume = {500-274},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2007},
    url = {http://trec.nist.gov/pubs/trec16/papers/uamsterdam-balog.ent.final.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/BalogHWR07.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

DUTIR at TREC 2007 Enterprise Track

Jianmei Chen, Hui Ren, Linhong Xu, Hongfei Lin, Zhihao Yang

Abstract

This paper describes our experiments on the two tasks of the TREC 2007 Enterprise track. In the data preprocessing stage we stripped non-letter characters from the documents and queries. For document search, we built indexes with Indri and Lemur, processed the query topics, and then retrieved relevant documents with both systems. For expert search, we recognized candidates in the collection, established a pool of associated documents, built indexes with Indri and Lemur, and then produced the expert list and supporting documents.

Bibtex
@inproceedings{DBLP:conf/trec/ChenRXLY07,
    author = {Jianmei Chen and Hui Ren and Linhong Xu and Hongfei Lin and Zhihao Yang},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {{DUTIR} at {TREC} 2007 Enterprise Track},
    booktitle = {Proceedings of The Sixteenth Text REtrieval Conference, {TREC} 2007, Gaithersburg, Maryland, USA, November 5-9, 2007},
    series = {{NIST} Special Publication},
    volume = {500-274},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2007},
    url = {http://trec.nist.gov/pubs/trec16/papers/dalianu.ent.final.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/ChenRXLY07.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Research on Enterprise Track of TREC 2007 at SJTU APEX Lab

Huizhong Duan, Qi Zhou, Zhen Lu, Ou Jin, Shenghua Bao, Yunbo Cao, Yong Yu

Abstract

This year we (APEX Lab, Shanghai Jiao Tong University) participated in both the Document Search and Expert Search tasks of the Enterprise Track of TREC 2007. For document search, we applied the BM25 formula separately to different fields of HTML pages: Title, Anchor, H1, H2, Keywords, and Extracted Body. Various static ranking methods were also exploited, and the scores were combined using a linear combination. Among all the techniques embedded in our system, our highlight is the static ranking approaches; besides these, some data preprocessing methods and our similarity function are also introduced. 1. Static Ranking Approaches. Page quality was our focus for this task, so we studied various static ranking methods on the enterprise corpus. Among them, PageRank [6] and Topic-Sensitive PageRank [1], which both generate similar ranks for most pages, do not work for this corpus. We then investigated HostRank [5]. The central problem of using HostRank is defining a host. After realizing that the sub-portals of an enterprise do not necessarily differ, we finally used sub-layers of sub-portals (AAA.BBB.CCC/DDD) as hosts.
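The host definition described above (sub-layers of sub-portals, AAA.BBB.CCC/DDD) amounts to grouping pages by hostname plus first path segment; a minimal sketch, where the helper name is ours and the grouping rule is inferred from the abstract:

```python
from urllib.parse import urlparse

def host_key(url):
    """Group pages by sub-layer of a sub-portal: the hostname plus the
    first path segment, falling back to the bare hostname."""
    parts = urlparse(url)
    path = parts.path.strip("/")
    first_seg = path.split("/")[0] if path else ""
    return f"{parts.hostname}/{first_seg}" if first_seg else parts.hostname
```

Pages sharing a `host_key` would then be treated as one node when computing HostRank-style static scores.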

Bibtex
@inproceedings{DBLP:conf/trec/DuanZLJBCY07,
    author = {Huizhong Duan and Qi Zhou and Zhen Lu and Ou Jin and Shenghua Bao and Yunbo Cao and Yong Yu},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Research on Enterprise Track of {TREC} 2007 at {SJTU} {APEX} Lab},
    booktitle = {Proceedings of The Sixteenth Text REtrieval Conference, {TREC} 2007, Gaithersburg, Maryland, USA, November 5-9, 2007},
    series = {{NIST} Special Publication},
    volume = {500-274},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2007},
    url = {http://trec.nist.gov/pubs/trec16/papers/sjtu-apex.ent.final.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/DuanZLJBCY07.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

York University at TREC 2007: Enterprise Document Search

Yu Fan, Xiangji Huang

Abstract

York University evaluated a preprocessing approach for this year's enterprise document search task. Using different parsing tools, we created two data sets and generated two official runs from each. The results demonstrate that the removal of raw data in the preprocessing stage has a negative impact on retrieval performance.

Bibtex
@inproceedings{DBLP:conf/trec/FanH07,
    author = {Yu Fan and Xiangji Huang},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {York University at {TREC} 2007: Enterprise Document Search},
    booktitle = {Proceedings of The Sixteenth Text REtrieval Conference, {TREC} 2007, Gaithersburg, Maryland, USA, November 5-9, 2007},
    series = {{NIST} Special Publication},
    volume = {500-274},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2007},
    url = {http://trec.nist.gov/pubs/trec16/papers/yorku.ent.final.pdf},
    timestamp = {Sun, 02 Oct 2022 01:00:00 +0200},
    biburl = {https://dblp.org/rec/conf/trec/FanH07.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

THUIR at TREC 2007: Enterprise Track

Yupeng Fu, Yufei Xue, Tong Zhu, Yiqun Liu, Min Zhang, Shaoping Ma

Abstract

We participated in the document search and expert search tasks of the Enterprise Track in TREC 2007. The motive behind the TREC Enterprise Track is to study the issues of searching for documents and experts inside an enterprise environment, which have not been sufficiently addressed in research. In document search, we focused on key overview page pre-selection methods and link analysis algorithms. In expert search, we developed methods to detect expert identifiers and experimented based on our previous PDD model.

Bibtex
@inproceedings{DBLP:conf/trec/FuXZLZM07,
    author = {Yupeng Fu and Yufei Xue and Tong Zhu and Yiqun Liu and Min Zhang and Shaoping Ma},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {{THUIR} at {TREC} 2007: Enterprise Track},
    booktitle = {Proceedings of The Sixteenth Text REtrieval Conference, {TREC} 2007, Gaithersburg, Maryland, USA, November 5-9, 2007},
    series = {{NIST} Special Publication},
    volume = {500-274},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2007},
    url = {http://trec.nist.gov/pubs/trec16/papers/tsinghuau-zhang.ent.final.pdf},
    timestamp = {Wed, 16 Sep 2020 01:00:00 +0200},
    biburl = {https://dblp.org/rec/conf/trec/FuXZLZM07.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

University of Glasgow at TREC 2007: Experiments in Blog and Enterprise Tracks with Terrier

David Hannah, Craig Macdonald, Jie Peng, Ben He, Iadh Ounis

Abstract

In TREC 2007, we participate in four tasks of the Blog and Enterprise tracks. We continue experiments using Terrier [14], our modular and scalable Information Retrieval (IR) platform, and the Divergence From Randomness (DFR) framework. In particular, for the Blog track opinion finding task, we propose a statistical term weighting approach to identify opinionated documents. An alternative approach based on an opinion identification tool is also utilised. Overall, a 15% improvement over a non-opinionated baseline is observed in applying the statistical term weighting approach. In the Expert Search task of the Enterprise track, we investigate the use of proximity between query terms and candidate name occurrences in documents.

Bibtex
@inproceedings{DBLP:conf/trec/HannahMPHO07,
    author = {David Hannah and Craig Macdonald and Jie Peng and Ben He and Iadh Ounis},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {University of Glasgow at {TREC} 2007: Experiments in Blog and Enterprise Tracks with Terrier},
    booktitle = {Proceedings of The Sixteenth Text REtrieval Conference, {TREC} 2007, Gaithersburg, Maryland, USA, November 5-9, 2007},
    series = {{NIST} Special Publication},
    volume = {500-274},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2007},
    url = {http://trec.nist.gov/pubs/trec16/papers/uglasgow.blog.ent.final.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/HannahMPHO07.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

CSIR at TREC 2007 Expert Search Task

Jiepu Jiang, Wei Lu, Dan Liu

Abstract

This is the second year of participation for the Center for Studies of Information Resources (CSIR) in the TREC Expert Search task. Rather than using a candidate-profile-based approach, our experiments used a simplified two-stage approach: documents are ranked with respect to the topic, and each expert is scored by the weights of the documents in which the expert appears. Instead of modeling expert search, we mainly focused on the effect of document filtering in expert search. In our experiments, only the top n ranked topic-relevant documents in which the expert appears contribute to the expert score. The value of n tuned for best performance on the W3C corpus is 10, which proved to be stable on the CERC corpus.
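The top-n filtering described above can be sketched as follows, assuming `ranked_docs` is already sorted by topic-relevance score in descending order (the names and the simple score-summing are illustrative; the paper's exact weighting is not reproduced here):

```python
def score_experts(ranked_docs, doc_experts, n=10):
    """Score each expert by summing the relevance scores of the top-n
    ranked documents in which that expert appears."""
    scores = {}
    for doc_id, doc_score in ranked_docs:          # descending relevance order
        for expert in doc_experts.get(doc_id, ()):
            hits = scores.setdefault(expert, [])
            if len(hits) < n:                      # only top-n docs per expert count
                hits.append(doc_score)
    return sorted(((e, sum(h)) for e, h in scores.items()),
                  key=lambda x: x[1], reverse=True)
```

With a small `n`, an expert who appears in a few highly relevant documents outranks one mentioned in many marginal ones, which is the filtering effect the abstract studies.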

Bibtex
@inproceedings{DBLP:conf/trec/JiangLL07,
    author = {Jiepu Jiang and Wei Lu and Dan Liu},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {{CSIR} at {TREC} 2007 Expert Search Task},
    booktitle = {Proceedings of The Sixteenth Text REtrieval Conference, {TREC} 2007, Gaithersburg, Maryland, USA, November 5-9, 2007},
    series = {{NIST} Special Publication},
    volume = {500-274},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2007},
    url = {http://trec.nist.gov/pubs/trec16/papers/wuhanu.ent.final.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/JiangLL07.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

UALR at TREC-ENT 2007

Hemant Joshi, Sithu D. Sudarsan, Subhashish Duttachowdhury, Chuanlei Zhang, Srini Ramaswamy

Abstract

This is the first year we participated in the enterprise track. This year's enterprise track offered completely new enterprise data and two new tasks. The data offered was the CSIRO Enterprise Research Collection corpus. The two new tasks introduced this year are expert search and document search. We participated in both tasks, though document search was our primary focus this year, and we believe that our results in the document search task might have a direct impact on the expert search task. The expert search task was to identify experts, or subject matter experts, for a particular topic, the goal being that queries regarding a certain subject could be diverted to a particular set of experts. Identifying experts from the document collection is a challenging problem: we have to assert whether a document is informative enough for the given topic and shows the mark of an expert, and we also have to find the author of the article or the relevant name or email address mentioned. The results were to be submitted as email addresses, with the documents that we believe provide expert information for the given topic as supporting evidence. Fifty new topics were provided by NIST, and evaluation for the expert search task was conducted with help from real-world CSIRO personnel. The document search task was to identify documents that are authoritative sources of information about a given topic; fifty topics were common to the document search and expert search tasks. The challenge was to determine whether a document merely contained words associated with the given topic or was indeed an authoritative source on that topic. We had to analyze the documents relevant to the given topic and rank them according to how informative they are for that particular topic. We experimented with various approaches to estimating the authoritative information content of a document; we discuss and compare these approaches later in this paper.

Bibtex
@inproceedings{DBLP:conf/trec/JoshiSDZR07,
    author = {Hemant Joshi and Sithu D. Sudarsan and Subhashish Duttachowdhury and Chuanlei Zhang and Srini Ramaswamy},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {{UALR} at {TREC-ENT} 2007},
    booktitle = {Proceedings of The Sixteenth Text REtrieval Conference, {TREC} 2007, Gaithersburg, Maryland, USA, November 5-9, 2007},
    series = {{NIST} Special Publication},
    volume = {500-274},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2007},
    url = {http://trec.nist.gov/pubs/trec16/papers/ualr.ent.final.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/JoshiSDZR07.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Enterprise Search: Identifying Relevant Sentences and Using Them for Query Expansion

Maheedhar Kolla, Olga Vechtomova

Abstract

In this paper, we discuss the experiments conducted in the context of the Document Search task of the 2007 Enterprise Search track. Our method is based on selecting sentences from the given relevant documents and using those sentences for query expansion. We observed that our method of query expansion improves the system's performance over the baseline run under various methods of comparison.
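A hedged sketch of the sentence-selection idea: keep sentences from relevant documents that mention a query term, then harvest their frequent terms as expansion candidates (the overlap test, length filter, and cutoffs here are illustrative, not the paper's exact selection criteria):

```python
import re
from collections import Counter

def expansion_terms(relevant_docs, query_terms, top_k=5):
    """Select sentences containing a query term, then return the most
    frequent non-query words from those sentences as expansion terms."""
    q = {t.lower() for t in query_terms}
    counts = Counter()
    for doc in relevant_docs:
        for sent in re.split(r"[.!?]+", doc):
            words = re.findall(r"[a-z]+", sent.lower())
            if q & set(words):                       # sentence mentions the query
                counts.update(w for w in words if w not in q and len(w) > 3)
    return [w for w, _ in counts.most_common(top_k)]
```

The expansion terms would then be appended to the original query, typically with a reduced weight, before re-running retrieval.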

Bibtex
@inproceedings{DBLP:conf/trec/KollaV07,
    author = {Maheedhar Kolla and Olga Vechtomova},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Enterprise Search: Identifying Relevant Sentences and Using Them for Query Expansion},
    booktitle = {Proceedings of The Sixteenth Text REtrieval Conference, {TREC} 2007, Gaithersburg, Maryland, USA, November 5-9, 2007},
    series = {{NIST} Special Publication},
    volume = {500-274},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2007},
    url = {http://trec.nist.gov/pubs/trec16/papers/uwaterloo-vechtomova.ent.final.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/KollaV07.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

University of Twente at the TREC 2007 Enterprise Track: Modeling Relevance Propagation for the Expert Search Task

Pavel Serdyukov, Henning Rode, Djoerd Hiemstra

Abstract

This paper describes several approaches which we used for the expert search task of the TREC 2007 Enterprise track. We studied several methods of relevance propagation from documents to related candidate experts. Instead of the one-step propagation from documents to directly related candidates used by many systems in previous years, we do not limit the relevance flow and disseminate it further through mutual document-candidate connections. We model relevance propagation using random walk principles, or in formal terms, discrete Markov processes. We experiment with infinite and finite numbers of propagation steps. We also demonstrate how additional information, namely hyperlinks among documents, the organizational structure of the enterprise, and relevance feedback, may be utilized by the presented techniques.
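The multi-step propagation can be sketched as a finite random walk on the document-candidate bipartite graph. This minimal version uses uniform transition probabilities and none of the jump probabilities or extra edge types (hyperlinks, organizational structure) the paper's Markov models would include:

```python
def propagate(doc_rel, doc_cands, steps=2):
    """Push relevance mass documents -> candidates -> documents, `steps`
    times, splitting each node's mass uniformly over its neighbours."""
    cand_docs = {}                                 # invert doc -> candidates map
    for d, cs in doc_cands.items():
        for c in cs:
            cand_docs.setdefault(c, []).append(d)
    docs, cands = dict(doc_rel), {}
    for _ in range(steps):
        cands = {}
        for d, mass in docs.items():               # documents -> candidates
            for c in doc_cands.get(d, []):
                cands[c] = cands.get(c, 0.0) + mass / len(doc_cands[d])
        docs = {}
        for c, mass in cands.items():              # candidates -> documents
            for d in cand_docs[c]:
                docs[d] = docs.get(d, 0.0) + mass / len(cand_docs[c])
    return cands
```

Each round-trip conserves the total relevance mass while redistributing it: after several steps, candidates connected (even indirectly) to many relevant documents accumulate the most mass.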

Bibtex
@inproceedings{DBLP:conf/trec/SerdyukovRH07,
    author = {Pavel Serdyukov and Henning Rode and Djoerd Hiemstra},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {University of Twente at the {TREC} 2007 Enterprise Track: Modeling Relevance Propagation for the Expert Search Task},
    booktitle = {Proceedings of The Sixteenth Text REtrieval Conference, {TREC} 2007, Gaithersburg, Maryland, USA, November 5-9, 2007},
    series = {{NIST} Special Publication},
    volume = {500-274},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2007},
    url = {http://trec.nist.gov/pubs/trec16/papers/utwente.ent.final.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/SerdyukovRH07.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Research on Enterprise Track of TREC 2007

Huawei Shen, Guoyao Chen, Haiqiang Chen, Yue Liu, Xueqi Cheng

Abstract

We (ICT-CAS team) participated in the Enterprise Track of TREC 2007. This paper reports our experimental results on this track.

Bibtex
@inproceedings{DBLP:conf/trec/ShenCCLC07,
    author = {Huawei Shen and Guoyao Chen and Haiqiang Chen and Yue Liu and Xueqi Cheng},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Research on Enterprise Track of {TREC} 2007},
    booktitle = {Proceedings of The Sixteenth Text REtrieval Conference, {TREC} 2007, Gaithersburg, Maryland, USA, November 5-9, 2007},
    series = {{NIST} Special Publication},
    volume = {500-274},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2007},
    url = {http://trec.nist.gov/pubs/trec16/papers/cas-liu.ent.final.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/ShenCCLC07.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

RMIT University at the TREC 2007 Enterprise Track

Mingfang Wu, Falk Scholer, Milad Shokouhi, Simon J. Puglisi, Halil Ali

Abstract

At TREC 2007, RMIT University participated in the document search task of the enterprise track. Our goals were to investigate: 1. Which sources of external evidence (anchor text, PageRank and indegree) are useful for improving a document-based ranking scheme for a key page finding task? 2. Should the different sources of evidence be used in isolation, or in combination? 3. Can federated search improve performance over single-collection search, for example when the collection is divided into discipline- or business-function-related categories? In this paper, we discuss our approaches to these three questions and present experimental results.

Bibtex
@inproceedings{DBLP:conf/trec/WuSSPA07,
    author = {Mingfang Wu and Falk Scholer and Milad Shokouhi and Simon J. Puglisi and Halil Ali},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {{RMIT} University at the {TREC} 2007 Enterprise Track},
    booktitle = {Proceedings of The Sixteenth Text REtrieval Conference, {TREC} 2007, Gaithersburg, Maryland, USA, November 5-9, 2007},
    series = {{NIST} Special Publication},
    volume = {500-274},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2007},
    url = {http://trec.nist.gov/pubs/trec16/papers/rmit.ent.final.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/WuSSPA07.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

WIM at TREC 2007

Jun Xu, Jing Yao, Jiaqian Zheng, Qi Sun, Junyu Niu

Abstract

This paper describes the four tracks in which the WIM Lab of Fudan University took part at TREC 2007. For the spam track, a multi-centre model was proposed, considering the characteristics of spam mails, in contrast to the traditional two-class classification methodology; incremental clustering and closeness-based classification methods were applied this year. For the enterprise track, our research mainly focused on ranking functions for experts and on selecting correct supporting documents for a given topic. For the legal track, we mainly evaluated the effects of a word distribution model in query expansion and of various corpus pre-processing methods. For the genomics track, three scoring methods were proposed to find the text snippets most relevant to a given topic. This paper gives an overview of the methods employed for each subtask and compares the results of each track.

Bibtex
@inproceedings{DBLP:conf/trec/XuYZSN07,
    author = {Jun Xu and Jing Yao and Jiaqian Zheng and Qi Sun and Junyu Niu},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {{WIM} at {TREC} 2007},
    booktitle = {Proceedings of The Sixteenth Text REtrieval Conference, {TREC} 2007, Gaithersburg, Maryland, USA, November 5-9, 2007},
    series = {{NIST} Special Publication},
    volume = {500-274},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2007},
    url = {http://trec.nist.gov/pubs/trec16/papers/fudan-niu.spam.ent.legal.geo.final.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/XuYZSN07.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

The Open University at TREC 2007 Enterprise Track

Jianhan Zhu, Dawei Song, Stefan M. Rüger

Abstract

The Multimedia and Information Systems group at the Knowledge Media Institute of the Open University participated in the Expert Search and Document Search tasks of the Enterprise Track in TREC 2007. In both tasks, we studied the effect of anchor text in addition to document content, document authority, URL length, query expansion, and relevance feedback in improving search effectiveness. In the expert search task, we continued using a two-stage language model consisting of a document relevance model and a co-occurrence model; the document relevance model is equivalent to our approach in the document search task. We used our innovative multiple-window-based co-occurrence approach, whose assumption is that there are multiple levels of association between an expert and his/her expertise. Our experimental results show that the introduction of features in addition to document content improved retrieval effectiveness.
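The multiple-window co-occurrence assumption can be illustrated with a toy scorer: an expert-query pair that co-occurs within a tight window also co-occurs within every larger one, so tighter associations accumulate more weight (the window sizes and the 1/w weighting are illustrative, not the paper's model):

```python
def window_cooccurrence(tokens, expert, query_terms, windows=(5, 20, 50)):
    """Score the expert-query association at several window sizes;
    pairs within a small window are counted at every larger size too."""
    positions_e = [i for i, t in enumerate(tokens) if t == expert]
    positions_q = [i for i, t in enumerate(tokens) if t in query_terms]
    score = 0.0
    for w in windows:
        for pe in positions_e:
            for pq in positions_q:
                if abs(pe - pq) <= w:
                    score += 1.0 / w      # tighter windows contribute more
    return score
```

Summing such scores over all documents mentioning a candidate yields a simple co-occurrence evidence term that a two-stage model could combine with document relevance.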

Bibtex
@inproceedings{DBLP:conf/trec/ZhuSR07,
    author = {Jianhan Zhu and Dawei Song and Stefan M. R{\"{u}}ger},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {The Open University at {TREC} 2007 Enterprise Track},
    booktitle = {Proceedings of The Sixteenth Text REtrieval Conference, {TREC} 2007, Gaithersburg, Maryland, USA, November 5-9, 2007},
    series = {{NIST} Special Publication},
    volume = {500-274},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2007},
    url = {http://trec.nist.gov/pubs/trec16/papers/openu.ent.final.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/ZhuSR07.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}