Runs - Spam 2005
1BASresults
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: 1BASresults
- Participant: beijingu.guo
- Track: Spam
- Year: 2005
- Submission: 9/1/2005
- Task: run
1cefhuj
- Run ID: 1cefhuj
- Participant: breyer.laird
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: This filter tests the cef tokenizer with full standard header analysis and uniform reference measure, case sensitive tokens. Same as pilot submission.
2adphu
- Run ID: 2adphu
- Participant: breyer.laird
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: This filter tests the adp tokenizer with full standard header analysis and uniform reference measure, lowercase tokens. Same as pilot submission.
2V2Bresults
- Run ID: 2V2Bresults
- Participant: beijingu.guo
- Track: Spam
- Year: 2005
- Submission: 9/1/2005
- Task: run
3adphd
- Run ID: 3adphd
- Participant: breyer.laird
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: This filter tests the adp tokenizer with full standard header analysis and dirichlet reference measure, lowercase tokens. Same as pilot submission.
3V2Eresults
- Run ID: 3V2Eresults
- Participant: beijingu.guo
- Track: Spam
- Year: 2005
- Submission: 9/1/2005
- Task: run
4adp
- Run ID: 4adp
- Participant: breyer.laird
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: This filter tests the adp tokenizer with only Subject analysis, uniform reference measure, lowercase tokens. Same as pilot submission.
4V5Bresults
- Run ID: 4V5Bresults
- Participant: beijingu.guo
- Track: Spam
- Year: 2005
- Submission: 9/1/2005
- Task: run
621SPAM1FIN
- Run ID: 621SPAM1FIN
- Participant: ibm.segal
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Type: final
- Task: filter
- Run description: Our submission consists of two core filters and an aggregator. The first is LNB, an extension of naive Bayes that partially removes the independence assumption. The second is the SMTP path analysis algorithm presented at CEAS 2005. The aggregator uses Amoeba optimization to find the optimal linear weights for combining the classifiers. Submission 1 is the aggregate classifier combining both SMTP path analysis and LNB. The second and third submissions are the SMTP path analysis and the LNB algorithm run individually. This will allow us to ascertain both how the individual algorithms perform and how well the aggregator does in combining them. NOTE: this submission contains three submissions; details on evaluation precedence are in README.txt.
621SPAM1FT1
- Run ID: 621SPAM1FT1
- Participant: ibm.segal
- Track: Spam
- Year: 2005
- Submission: 7/5/2005
- Type: pilot
- Task: filter
- Run description: SpamGuru combines multiple filtering technologies, including Bayesian filtering, SMTP path analysis, and pattern matching algorithms from the life sciences, and combines their scores using an adaptive weighting algorithm. The test filter submitted here has only the Bayesian component enabled.
621SPAM1FUL
- Run ID: 621SPAM1FUL
- Participant: ibm.segal
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
621SPAM1H25
- Run ID: 621SPAM1H25
- Participant: ibm.segal
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
621SPAM1H50
- Run ID: 621SPAM1H50
- Participant: ibm.segal
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
621SPAM1S25
- Run ID: 621SPAM1S25
- Participant: ibm.segal
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
621SPAM1S50
- Run ID: 621SPAM1S50
- Participant: ibm.segal
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
621SPAM2FUL
- Run ID: 621SPAM2FUL
- Participant: ibm.segal
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
621SPAM2H25
- Run ID: 621SPAM2H25
- Participant: ibm.segal
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
621SPAM2H50
- Run ID: 621SPAM2H50
- Participant: ibm.segal
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
621SPAM2S25
- Run ID: 621SPAM2S25
- Participant: ibm.segal
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
621SPAM2S50
- Run ID: 621SPAM2S50
- Participant: ibm.segal
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
621SPAM3FUL
- Run ID: 621SPAM3FUL
- Participant: ibm.segal
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
621SPAM3H25
- Run ID: 621SPAM3H25
- Participant: ibm.segal
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
621SPAM3H50
- Run ID: 621SPAM3H50
- Participant: ibm.segal
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
621SPAM3S25
- Run ID: 621SPAM3S25
- Participant: ibm.segal
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
621SPAM3S50
- Run ID: 621SPAM3S50
- Participant: ibm.segal
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
621SPAMFT1
- Run ID: 621SPAMFT1
- Participant: ibm.segal
- Track: Spam
- Year: 2005
- Submission: 7/5/2005
- Task: run
azeSPAM1BNS
- Run ID: azeSPAM1BNS
- Participant: uparis-sud.aze
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: Naive Bayes, stop word list, bigrams.
azeSPAM1res
- Run ID: azeSPAM1res
- Participant: uparis-sud.aze
- Track: Spam
- Year: 2005
- Submission: 9/8/2005
- Task: run
azeSPAM2CSS
- Run ID: azeSPAM2CSS
- Participant: uparis-sud.aze
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: Naive Bayes, ChiSq, stop word list, bigrams.
azeSPAMpf01
- Run ID: azeSPAMpf01
- Participant: uparis-sud.aze
- Track: Spam
- Year: 2005
- Submission: 7/5/2005
- Type: pilot
- Task: filter
- Run description: Training offline on the SpamAssassin corpus. Statistical algorithm that doesn't work :).
cao1knf
- Run ID: cao1knf
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Task: run
cao2ruk
- Run ID: cao2ruk
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Task: run
cao3knf
- Run ID: cao3knf
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Task: run
cao4hyb
- Run ID: cao4hyb
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Task: run
crmSPAM1full
- Run ID: crmSPAM1full
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAM1ham2
- Run ID: crmSPAM1ham2
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAM1ham5
- Run ID: crmSPAM1ham5
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAM1osf
- Run ID: crmSPAM1osf
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: CRM114 - variant OSBF. The OSBF algorithm is a typical Bayesian classifier, but with the OSB (Orthogonal Sparse Bigrams) feature extraction technique (see http://www.siefkes.net/papers/winnow-spam.pdf) as its front end and an intuitively derived Confidence Factor, a.k.a. "voodoo", for noise reduction and greater accuracy. This configuration uses no pre-trained information, and the messages are not preprocessed in any way, not even MIME-decoded.
crmSPAM1spm2
- Run ID: crmSPAM1spm2
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAM1spm5
- Run ID: crmSPAM1spm5
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAM2full
- Run ID: crmSPAM2full
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAM2ham2
- Run ID: crmSPAM2ham2
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAM2ham5
- Run ID: crmSPAM2ham5
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAM2spm2
- Run ID: crmSPAM2spm2
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAM2spm5
- Run ID: crmSPAM2spm5
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAM2win
- Run ID: crmSPAM2win
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: This filter variation combines the OSB (Orthogonal Sparse Bigrams) feature combination technique with the Winnow algorithm developed by Nick Littlestone. See C. Siefkes, F. Assis, S. Chhabra, and W. Yerazunis, "Combining Winnow and Orthogonal Sparse Bigrams for Incremental Spam Filtering", PKDD 2004, http://www.siefkes.net/papers/winnow-spam.pdf, for a detailed description. The classifier has not been pre-trained and input mails are not preprocessed in any way, not even MIME-decoded.
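For readers unfamiliar with Winnow, the following is a minimal sketch of the textbook algorithm over binary features, with multiplicative promotion and demotion on mistakes. It is not the CRM114 configuration used in this run: the function names, the promotion factor `alpha`, the threshold choice, and the number of epochs are all assumptions made for illustration, and the variant described in the cited paper differs in its details.

```python
def winnow_predict(weights, threshold, active):
    """Return True (spam) if the summed weights of active features reach the threshold."""
    return sum(weights[i] for i in active) >= threshold


def winnow_train(examples, n_features, alpha=2.0, epochs=5):
    """Train textbook Winnow on (active_feature_indices, is_spam) pairs.

    alpha and epochs are illustrative assumptions, not values from the run.
    """
    weights = [1.0] * n_features
    threshold = float(n_features)
    for _ in range(epochs):
        for active, is_spam in examples:
            predicted = winnow_predict(weights, threshold, active)
            if is_spam and not predicted:
                # Missed spam: promote every active feature.
                for i in active:
                    weights[i] *= alpha
            elif predicted and not is_spam:
                # False alarm: demote every active feature.
                for i in active:
                    weights[i] /= alpha
    return weights, threshold
```

Weights change only when the classifier makes a mistake, which is what makes the method suited to incremental, message-by-message training.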
crmSPAM3full
- Run ID: crmSPAM3full
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAM3ham2
- Run ID: crmSPAM3ham2
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAM3ham5
- Run ID: crmSPAM3ham5
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAM3osu
- Run ID: crmSPAM3osu
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: CRM114 - variant OSB Unique. The OSB Unique algorithm is a typical Bayesian classifier. It is a variant of the OSB option, also available in CRM114, which uses the OSB (Orthogonal Sparse Bigrams) feature extraction technique (see http://www.siefkes.net/papers/winnow-spam.pdf) but with the restriction that features are considered only once, no matter how many times they appear in a document. This configuration uses no pre-trained information, and the messages are not preprocessed in any way, not even MIME-decoded.
crmSPAM3spm2
- Run ID: crmSPAM3spm2
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAM3spm5
- Run ID: crmSPAM3spm5
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAM4full
- Run ID: crmSPAM4full
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAM4ham2
- Run ID: crmSPAM4ham2
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAM4ham5
- Run ID: crmSPAM4ham5
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAM4osb
- Run ID: crmSPAM4osb
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: CRM114 - variant OSB. The OSB algorithm is a typical Bayesian classifier, but uses OSB (Orthogonal Sparse Bigrams), a feature extraction technique derived from SBPH (Sparse Binary Polynomial Hashing) that reduces the number of features produced while keeping the same accuracy; see http://www.siefkes.net/papers/winnow-spam.pdf. This configuration uses no pre-trained information, and the messages are not preprocessed in any way, not even MIME-decoded.
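As a rough illustration of the OSB idea, the sketch below pairs each token with each of the tokens that follow it inside a sliding window and records how many tokens were skipped in between. The function name, the tuple representation, and the default window size are assumptions for illustration; the run description does not state them, and CRM114's actual tokenization and hashing are not reproduced here.

```python
def osb_features(tokens, window=5):
    """Return (left_token, skipped_count, right_token) pairs within a sliding window.

    window is a hypothetical parameter: each token is paired with up to
    window - 1 following tokens.
    """
    features = []
    for i, left in enumerate(tokens):
        for gap in range(1, window):
            j = i + gap
            if j >= len(tokens):
                break
            # gap - 1 is the number of tokens skipped between the pair.
            features.append((left, gap - 1, tokens[j]))
    return features
```

For example, `osb_features(["a", "b", "c"], window=3)` yields the pairs `("a", 0, "b")`, `("a", 1, "c")`, and `("b", 0, "c")`. Keeping the skip count inside the feature is what distinguishes "a b" from "a _ b".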
crmSPAM4spm2
- Run ID: crmSPAM4spm2
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAM4spm5
- Run ID: crmSPAM4spm5
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 9/5/2005
- Task: run
crmSPAMp1of
- Run ID: crmSPAMp1of
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 7/5/2005
- Type: pilot
- Task: filter
- Run description: The OSBF algorithm is a typical Bayesian classifier, but with the OSB (Orthogonal Sparse Bigrams) feature extraction technique (see http://www.siefkes.net/papers/winnow-spam.pdf) as its front end and an intuitively derived Confidence Factor, a.k.a. "voodoo", for noise reduction and greater accuracy. This configuration uses no pre-trained information, and the messages are not preprocessed in any way, not even MIME-decoded.
crmSPAMp2wi
- Run ID: crmSPAMp2wi
- Participant: merl.yerazunis
- Track: Spam
- Year: 2005
- Submission: 7/5/2005
- Type: pilot
- Task: filter
- Run description: This filter variation combines the OSB (Orthogonal Sparse Bigrams) feature combination technique with the Winnow algorithm developed by Nick Littlestone. See C. Siefkes, F. Assis, S. Chhabra, and W. Yerazunis, "Combining Winnow and Orthogonal Sparse Bigrams for Incremental Spam Filtering", PKDD 2004, http://www.siefkes.net/papers/winnow-spam.pdf, for a detailed description. The classifier has not been pre-trained and input mails are not preprocessed in any way.
dalSPAM1fsw
- Run ID: dalSPAM1fsw
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Type: final
- Task: filter
- Run description: Limited word-based naive Bayes with whitelist.
dalSPAM1sw1
- Run ID: dalSPAM1sw1
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 9/9/2005
- Task: run
dalSPAM1sx2
- Run ID: dalSPAM1sx2
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 9/13/2005
- Task: run
dalSPAM1sxw
- Run ID: dalSPAM1sxw
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 7/6/2005
- Type: pilot
- Task: filter
- Run description: Used part of the SpamAssassin corpus for training.
dalSPAM2f
- Run ID: dalSPAM2f
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 9/7/2005
- Task: run
dalSPAM2h25
- Run ID: dalSPAM2h25
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 9/6/2005
- Task: run
dalSPAM2h50
- Run ID: dalSPAM2h50
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 9/7/2005
- Task: run
DalSPAM2n4
- Run ID: DalSPAM2n4
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Type: final
- Task: filter
- Run description: Using CNG method with profile aging; byte n-grams (n=4)
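To make the byte n-gram idea concrete, here is a minimal sketch of building a profile of the most frequent byte n-grams and comparing two profiles with a relative-difference distance, in the spirit of the Common N-Grams (CNG) method. The function names, the profile size, and the exact distance formula are assumptions for illustration; the run description does not specify them, and "profile aging" is not modelled here.

```python
from collections import Counter


def byte_ngram_profile(data: bytes, n=4, size=1000):
    """Map the `size` most frequent byte n-grams to their normalized frequencies."""
    counts = Counter(data[i:i + n] for i in range(len(data) - n + 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.most_common(size)}


def cng_distance(profile_a, profile_b):
    """Sum of squared relative frequency differences over the union of n-grams."""
    distance = 0.0
    for gram in set(profile_a) | set(profile_b):
        fa = profile_a.get(gram, 0.0)
        fb = profile_b.get(gram, 0.0)
        # Each gram appears in at least one profile, so fa + fb > 0.
        distance += ((fa - fb) / ((fa + fb) / 2)) ** 2
    return distance
```

Under these assumptions, a message would be labelled with the class (spam or ham) whose profile is nearer to the message's own profile.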
dalSPAM2s25
- Run ID: dalSPAM2s25
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 9/8/2005
- Task: run
dalSPAM2s50
- Run ID: dalSPAM2s50
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 9/8/2005
- Task: run
dalSPAM2vla
- Run ID: dalSPAM2vla
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 7/6/2005
- Type: pilot
- Task: filter
- Run description: Later... Perl, ngram-based etc.
dalSPAM3f¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: dalSPAM3f
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 9/9/2005
- Task: run
dalSPAM3h25¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: dalSPAM3h25
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 9/9/2005
- Task: run
dalSPAM3h50¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: dalSPAM3h50
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 9/11/2005
- Task: run
DalSPAM3n5¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: DalSPAM3n5
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Type: final
- Task: filter
- Run description: Using CNG method with profile aging; byte n-grams (n=5)
dalSPAM3s25¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: dalSPAM3s25
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 9/11/2005
- Task: run
dalSPAM3s50¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: dalSPAM3s50
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 9/11/2005
- Task: run
dalSPAM3vla¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: dalSPAM3vla
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 7/6/2005
- Type: pilot
- Task: filter
- Run description: Later... Perl, ngram-based etc.
dalSPAM4fsw¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: dalSPAM4fsw
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Type: final
- Task: filter
- Run description: Limited n-gram-based naive Bayes with a whitelist
dalSPAM4sw1¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: dalSPAM4sw1
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 9/9/2005
- Task: run
dalSPAM4sx2¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: dalSPAM4sx2
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 9/13/2005
- Task: run
dalSPAM4sxw¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: dalSPAM4sxw
- Participant: dalhousieu.keselj
- Track: Spam
- Year: 2005
- Submission: 7/6/2005
- Type: pilot
- Task: filter
- Run description: Used the SpamAssassin corpus as training data
ICTSPAM1WNB¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ICTSPAM1WNB
- Participant: cas-ict.wang
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: A Winnow algorithm using a structure-based hierarchical model.
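For readers unfamiliar with Winnow, the sketch below shows a generic mistake-driven Winnow update over sparse binary features. It is not the participant's filter: the structure-based hierarchical model is not described in the source and is omitted, and the class name, the promotion factor of 2.0, and the choice to scale the threshold by the number of active features are assumptions made for illustration.

```python
class Winnow:
    """Mistake-driven linear classifier with multiplicative weight updates."""

    def __init__(self, alpha: float = 2.0, threshold: float = 1.0):
        self.alpha = alpha          # promotion / demotion factor (assumed value)
        self.threshold = threshold  # per-feature threshold (assumed variant)
        self.weights = {}           # unseen features implicitly have weight 1.0

    def score(self, features: set) -> float:
        return sum(self.weights.get(f, 1.0) for f in features)

    def predict(self, features: set) -> bool:
        """Return True for spam when the score exceeds the scaled threshold."""
        return self.score(features) > self.threshold * len(features)

    def train(self, features: set, is_spam: bool) -> None:
        """Update weights only when the current prediction is wrong."""
        if self.predict(features) == is_spam:
            return
        factor = self.alpha if is_spam else 1.0 / self.alpha
        for f in features:
            self.weights[f] = self.weights.get(f, 1.0) * factor


# Toy usage with hypothetical token features.
clf = Winnow()
clf.train({"cheap", "pills"}, is_spam=True)
clf.train({"meeting", "agenda"}, is_spam=False)
spam_guess = clf.predict({"cheap", "pills"})
ham_guess = clf.predict({"meeting", "agenda"})
```

Because weights change only on mistakes and only for features present in the message, the update cost is proportional to the message size rather than to the vocabulary size.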
ICTSPAM2WNH¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ICTSPAM2WNH
- Participant: cas-ict.wang
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: A Winnow algorithm using a multi-feature model.
ICTSPAM3NBH¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ICTSPAM3NBH
- Participant: cas-ict.wang
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: A naive Bayes (nBayes) algorithm using a structure-based hierarchical model.
ICTSPAM4NBB¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ICTSPAM4NBB
- Participant: cas-ict.wang
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: A naive Bayes (nBayes) algorithm using a multi-feature model.
ICTSPAMpWNB¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ICTSPAMpWNB
- Participant: cas-ict.wang
- Track: Spam
- Year: 2005
- Submission: 7/4/2005
- Type: pilot
- Task: filter
- Run description: A hierarchical filter that uses Winnow as the classifier.
ijsSPAM1full¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ijsSPAM1full
- Participant: jozef-stefan-inst.bratko
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
ijsSPAM1h25¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ijsSPAM1h25
- Participant: jozef-stefan-inst.bratko
- Track: Spam
- Year: 2005
- Submission: 9/13/2005
- Task: run
ijsSPAM1pm1¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ijsSPAM1pm1
- Participant: jozef-stefan-inst.bratko
- Track: Spam
- Year: 2005
- Submission: 7/5/2005
- Type: pilot
- Task: filter
- Run description: The filter uses a character-based algorithm, which attempts to evaluate the probability that a message is spam based on the observed sequence of characters. A Markov model is used, i.e. the next character is predicted from the D preceding characters (D is set to 8 in this case). The Prediction by Partial Matching algorithm is used for prediction, with escape method D and exclusion. An adaptive model is used, so that the classifier continues to update the model as it evaluates an unknown message from beginning to end. Preprocessing is minimal. It includes MIME decoding (non-text parts are excluded), whitespace "normalization" (all sequences of whitespace are truncated to a single space) and minimal preprocessing of headers (e.g. dates are removed). The model is not pre-trained.
ijsSPAM1pm8¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ijsSPAM1pm8
- Participant: jozef-stefan-inst.bratko
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Type: final
- Task: filter
- Run description: This filter models email messages with a character-level Markov model. The Prediction by Partial Matching algorithm is used for this purpose. Two models are trained and the model that best approximates the message being classified determines the classification outcome. An order-8 model is used. The model is adaptive, i.e. it is updated as the message is being processed. No particular preprocessing is used, except MIME decoding and exclusion of embedded attachments. The filter is not pre-trained (the model is empty to begin with).
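The "two models, best approximation wins" idea in this description can be sketched as follows. This is a simplified stand-in, not the participant's filter: it replaces Prediction by Partial Matching with a fixed-order character model using add-one smoothing, uses order 2 instead of 8 to keep the toy example small, and does not update the model while scoring. The class name, the 256-symbol alphabet, and the toy training strings are assumptions.

```python
import math
from collections import defaultdict


class CharMarkovModel:
    """Fixed-order character model with add-one smoothing (a stand-in for PPM)."""

    def __init__(self, order: int = 2, alphabet_size: int = 256):
        self.order = order
        self.alphabet_size = alphabet_size
        self.counts = defaultdict(lambda: defaultdict(int))
        self.totals = defaultdict(int)

    def train(self, text: str) -> None:
        padded = " " * self.order + text
        for i in range(self.order, len(padded)):
            context = padded[i - self.order:i]
            self.counts[context][padded[i]] += 1
            self.totals[context] += 1

    def bits(self, text: str) -> float:
        """Code length of `text` in bits under this model (lower is a better fit)."""
        padded = " " * self.order + text
        total_bits = 0.0
        for i in range(self.order, len(padded)):
            context = padded[i - self.order:i]
            seen = self.counts.get(context, {}).get(padded[i], 0)
            context_total = self.totals.get(context, 0)
            p = (seen + 1) / (context_total + self.alphabet_size)
            total_bits -= math.log2(p)
        return total_bits


def classify(message: str, spam: CharMarkovModel, ham: CharMarkovModel) -> str:
    """Pick the class whose model encodes the message in fewer bits."""
    return "spam" if spam.bits(message) < ham.bits(message) else "ham"


# Toy usage with hypothetical training text.
spam_model = CharMarkovModel()
ham_model = CharMarkovModel()
spam_model.train("buy cheap pills now")
ham_model.train("see you at the meeting")
label = classify("cheap pills", spam_model, ham_model)
```

Comparing code lengths is equivalent to comparing the likelihood each model assigns to the message, which is why a compression model can be used directly as a classifier.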
ijsSPAM2cw1¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ijsSPAM2cw1
- Participant: jozef-stefan-inst.bratko
- Track: Spam
- Year: 2005
- Submission: 7/5/2005
- Type: pilot
- Task: filter
- Run description: The filter uses a character-based algorithm, which attempts to evaluate the probability that a message is spam based on the observed sequence of characters. A Markov model is used, i.e. the next character is predicted from the D preceding characters (D is set to 8 in this case). The Context Tree Weighting algorithm is used for prediction, with "custom" escape and exclusion mechanisms. An adaptive model is used, so that the classifier continues to update the model as it evaluates an unknown message from beginning to end. Preprocessing is minimal. It includes MIME decoding (non-text parts are excluded), whitespace "normalization" (all sequences of whitespace are truncated to a single space) and minimal preprocessing of headers (e.g. dates are removed). The model is not pre-trained.
ijsSPAM2full¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ijsSPAM2full
- Participant: jozef-stefan-inst.bratko
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
ijsSPAM2h25¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ijsSPAM2h25
- Participant: jozef-stefan-inst.bratko
- Track: Spam
- Year: 2005
- Submission: 9/13/2005
- Task: run
ijsSPAM2pm6¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ijsSPAM2pm6
- Participant: jozef-stefan-inst.bratko
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Type: final
- Task: filter
- Run description: This filter models email messages with a character-level Markov model. The Prediction by Partial Matching algorithm is used for this purpose. Two models are trained and the model that best approximates the message being classified determines the classification outcome. An order-6 model is used. The model is adaptive, i.e. it is updated as the message is being processed. No particular preprocessing is used, except MIME decoding and exclusion of embedded attachments. The filter is not pre-trained (the model is empty to begin with).
ijsSPAM3cw2¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ijsSPAM3cw2
- Participant: jozef-stefan-inst.bratko
- Track: Spam
- Year: 2005
- Submission: 7/5/2005
- Type: pilot
- Task: filter
- Run description: The filter uses a character-based algorithm, which attempts to evaluate the probability that a message is spam based on the observed sequence of characters. A Markov model is used, i.e. the next character is predicted from the D preceding characters (D is set to 8 in this case). The Context Tree Weighting algorithm is used for prediction, with "custom" escape and exclusion mechanisms. An adaptive model is used, so that the classifier continues to update the model as it evaluates an unknown message from beginning to end. A special mechanism for handling case is employed in this version, with additional "control" characters that are encoded to signal letter case. Preprocessing is minimal. It includes MIME decoding (non-text parts are excluded), whitespace "normalization" (all sequences of whitespace are truncated to a single space) and minimal preprocessing of headers (e.g. dates are removed). The model is not pre-trained.
ijsSPAM3full¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ijsSPAM3full
- Participant: jozef-stefan-inst.bratko
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
ijsSPAM3h25¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ijsSPAM3h25
- Participant: jozef-stefan-inst.bratko
- Track: Spam
- Year: 2005
- Submission: 9/13/2005
- Task: run
ijsSPAM3pm2¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ijsSPAM3pm2
- Participant: jozef-stefan-inst.bratko
- Track: Spam
- Year: 2005
- Submission: 7/5/2005
- Type: pilot
- Task: filter
- Run description: The filter uses a character-based algorithm, which attempts to evaluate the probability that a message is spam based on the observed sequence of characters. A Markov model is used, i.e. the next character is predicted from the D preceding characters (D is set to 8 in this case). The Prediction by Partial Matching algorithm is used for prediction, with escape method D and exclusion. An adaptive model is used, so that the classifier continues to update the model as it evaluates an unknown message from beginning to end. A special mechanism for handling case is employed in this version, with additional "control" characters that are encoded to signal letter case. Preprocessing is minimal. It includes MIME decoding (non-text parts are excluded), whitespace "normalization" (all sequences of whitespace are truncated to a single space) and minimal preprocessing of headers (e.g. dates are removed). The model is not pre-trained.
ijsSPAM3pm4¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ijsSPAM3pm4
- Participant: jozef-stefan-inst.bratko
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Type: final
- Task: filter
- Run description: This filter models email messages with a character-level Markov model. The Prediction by Partial Matching algorithm is used for this purpose. Two models are trained and the model that best approximates the message being classified determines the classification outcome. An order-4 model is used. The model is adaptive, i.e. it is updated as the message is being processed. No particular preprocessing is used, except MIME decoding and exclusion of embedded attachments. The filter is not pre-trained (the model is empty to begin with).
ijsSPAM4cw2¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ijsSPAM4cw2
- Participant: jozef-stefan-inst.bratko
- Track: Spam
- Year: 2005
- Submission: 7/5/2005
- Type: pilot
- Task: filter
- Run description: The filter uses a character-based algorithm, which attempts to evaluate the probability that a message is spam based on the observed sequence of characters. A Markov model is used, i.e. the next character is predicted from the D preceding characters (D is set to 8 in this case). The Context Tree Weighting algorithm is used for prediction, with "custom" escape and exclusion mechanisms. An adaptive model is used, so that the classifier continues to update the model as it evaluates an unknown message from beginning to end. A special mechanism for handling case is employed in this version, with additional "control" characters that are encoded to signal letter case. Preprocessing is minimal. It includes MIME decoding (non-text parts are excluded), whitespace "normalization" (all sequences of whitespace are truncated to a single space) and minimal preprocessing of headers (e.g. dates are removed). The model is not pre-trained.
ijsSPAM4cw8¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ijsSPAM4cw8
- Participant: jozef-stefan-inst.bratko
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Type: final
- Task: filter
- Run description: This filter models email messages with a character-level Markov model. The Context Tree Weighting algorithm is used for this purpose. Two models are trained and the model that best approximates the message being classified determines the classification outcome. An order-8 model is used. The model is adaptive, i.e. it is updated as the message is being processed. No particular preprocessing is used, except MIME decoding and exclusion of embedded attachments. The filter is not pre-trained (the model is empty to begin with).
ijsSPAM4full¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ijsSPAM4full
- Participant: jozef-stefan-inst.bratko
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
ijsSPAM4h25¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: ijsSPAM4h25
- Participant: jozef-stefan-inst.bratko
- Track: Spam
- Year: 2005
- Submission: 9/13/2005
- Task: run
indSPAM1f4f¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: indSPAM1f4f
- Participant: indianau.yang
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Type: final
- Task: filter
- Run description: Fusion filter (Naive Bayes, Rule-based, Pattern-based, Blacklist) with large training data
indSPAM2f42¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: indSPAM2f42
- Participant: indianau.yang
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Type: final
- Task: filter
- Run description: Fusion filter (Naive Bayes, Rule-based, Pattern-based, Blacklist) with training after each classification
indSPAM3f40¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: indSPAM3f40
- Participant: indianau.yang
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Type: final
- Task: filter
- Run description: Fusion filter (Naive Bayes, Rule-based, Pattern-based, Blacklist) with training only after misclassification
indSPAM4pf4¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: indSPAM4pf4
- Participant: indianau.yang
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Type: final
- Task: filter
- Run description: Prototype Fusion filter (Naive Bayes, Rule-based, Pattern-based, Blacklist) with large training data
indSPAMpwf4¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: indSPAMpwf4
- Participant: indianau.yang
- Track: Spam
- Year: 2005
- Submission: 7/4/2005
- Type: pilot
- Task: filter
- Run description: Fusion filter that combines Bayesian filter, rule-based filter, pattern-based filter, and blacklist filter
kidSPAM1BAS¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: kidSPAM1BAS
- Participant: beijingu.guo
- Track: Spam
- Year: 2005
- Submission: 7/27/2005
- Type: final
- Task: filter
- Run description: KidultPRIS is a command-line program developed by members of the Pattern Recognition and Intelligent System lab at Beijing University of Posts and Telecommunications.
kidSPAM2V2B¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: kidSPAM2V2B
- Participant: beijingu.guo
- Track: Spam
- Year: 2005
- Submission: 7/27/2005
- Type: final
- Task: filter
- Run description: KidultPRIS is a command-line program developed by members of the Pattern Recognition and Intelligent System lab at Beijing University of Posts and Telecommunications.
kidSPAM3V2E¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: kidSPAM3V2E
- Participant: beijingu.guo
- Track: Spam
- Year: 2005
- Submission: 7/27/2005
- Type: final
- Task: filter
- Run description: KidultPRIS is a command-line program developed by members of the Pattern Recognition and Intelligent System lab at Beijing University of Posts and Telecommunications.
kidSPAM4V5B¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: kidSPAM4V5B
- Participant: beijingu.guo
- Track: Spam
- Year: 2005
- Submission: 7/27/2005
- Type: final
- Task: filter
- Run description: KidultPRIS is a command-line program developed by members of the Pattern Recognition and Intelligent System lab at Beijing University of Posts and Telecommunications.
kidSPAMpBAS¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: kidSPAMpBAS
- Participant: beijingu.guo
- Track: Spam
- Year: 2005
- Submission: 6/30/2005
- Type: pilot
- Task: filter
- Run description: KidultPRIS is a command-line program developed by members of the Pattern Recognition and Intelligent System lab at Beijing University of Posts and Telecommunications.
No2¶
Results
| Participants
| Summary
| Appendix
- Run ID: No2
- Participant: cas.nlpr.jzhao
- Track: Spam
- Year: 2005
- Submission: 7/5/2005
- Type: pilot
- Task: filter
- Run description: This is just a pilot system. Based on the ifile system developed by Jason Rennie, we incorporate other machine learning methods into this system. Please see the forthcoming final filter system and its Readme file.
None¶
Results
| Participants
| Summary
| Appendix
- Run ID: None
- Participant: cas.nlpr.jzhao
- Track: Spam
- Year: 2005
- Submission: 7/5/2005
- Task: run
p1cefhuj¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: p1cefhuj
- Participant: breyer.laird
- Track: Spam
- Year: 2005
- Submission: 7/3/2005
- Type: pilot
- Task: filter
- Run description: This filter tests the cef tokenizer with full standard header analysis and uniform reference measure, case sensitive tokens.
p2adphu¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: p2adphu
- Participant: breyer.laird
- Track: Spam
- Year: 2005
- Submission: 7/3/2005
- Type: pilot
- Task: filter
- Run description: This filter tests the adp tokenizer with full standard header analysis and uniform reference measure, lowercase tokens.
p3adphd¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: p3adphd
- Participant: breyer.laird
- Track: Spam
- Year: 2005
- Submission: 7/3/2005
- Type: pilot
- Task: filter
- Run description: This filter tests the adp tokenizer with full standard header analysis and Dirichlet reference measure, lowercase tokens.
p4adp¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: p4adp
- Participant: breyer.laird
- Track: Spam
- Year: 2005
- Submission: 7/3/2005
- Type: pilot
- Task: filter
- Run description: This filter tests the adp tokenizer with only Subject analysis, uniform reference measure, lowercase tokens.
pub1full¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pub1full
- Participant: breyer.laird
- Track: Spam
- Year: 2005
- Submission: 9/6/2005
- Task: run
pub1ham50¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pub1ham50
- Participant: breyer.laird
- Track: Spam
- Year: 2005
- Submission: 9/6/2005
- Task: run
pub1spam50¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pub1spam50
- Participant: breyer.laird
- Track: Spam
- Year: 2005
- Submission: 9/6/2005
- Task: run
pub2full¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pub2full
- Participant: breyer.laird
- Track: Spam
- Year: 2005
- Submission: 9/6/2005
- Task: run
pub3full¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pub3full
- Participant: breyer.laird
- Track: Spam
- Year: 2005
- Submission: 9/7/2005
- Task: run
pub4full¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pub4full
- Participant: breyer.laird
- Track: Spam
- Year: 2005
- Submission: 9/6/2005
- Task: run
PUC¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: PUC
- Participant: puc-rs.terra
- Track: Spam
- Year: 2005
- Submission: 7/5/2005
- Type: pilot
- Task: filter
- Run description: It uses a unigram language modelling approach with Laplace smoothing. It computes the likelihood of the message under two language models (one for ham, one for spam), then the odds of each model generating the message, and decides the class based on the odds. As prior training it uses 2500 spam messages (collected from the web) for the spam language model and 1400 AQUAINT documents for the ham language model. The background model is interpolated with new models created on the fly and updated on each new message (train). The current configuration generates 98 false negatives and 45 false positives on the SpamAssassin corpus.
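The likelihood comparison in this description can be illustrated with a minimal sketch of two unigram language models with Laplace (add-one) smoothing and a log-odds decision. The class and function names, the whitespace tokenizer, and the toy training text are assumptions. The interpolation of a background model with models built on the fly is not implemented here.

```python
import math
from collections import Counter


class UnigramModel:
    """Unigram language model with Laplace (add-one) smoothing."""

    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def update(self, text: str) -> None:
        tokens = text.lower().split()  # assumed tokenizer: whitespace split
        self.counts.update(tokens)
        self.total += len(tokens)

    def log_prob(self, token: str, vocab_size: int) -> float:
        return math.log((self.counts[token] + 1) / (self.total + vocab_size))


def log_odds(message: str, spam: UnigramModel, ham: UnigramModel) -> float:
    """Log of P(message | spam) / P(message | ham); positive favours spam."""
    # Shared vocabulary, plus one slot reserved for unseen tokens.
    vocab_size = len(set(spam.counts) | set(ham.counts)) + 1
    return sum(
        spam.log_prob(t, vocab_size) - ham.log_prob(t, vocab_size)
        for t in message.lower().split()
    )


# Toy usage with hypothetical training text.
spam_lm = UnigramModel()
ham_lm = UnigramModel()
spam_lm.update("cheap pills cheap offer")
ham_lm.update("project meeting agenda notes")
spam_score = log_odds("cheap offer", spam_lm, ham_lm)
ham_score = log_odds("meeting notes", spam_lm, ham_lm)
```

Calling `update` on the appropriate model after each judged message would give the incremental training the description refers to, and the decision threshold on the log odds (zero here) is a free parameter.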
pucrs0¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pucrs0
- Participant: puc-rs.terra
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: It is based on unigram language models, one for ham messages and another for spam messages. It computes the message likelihood given these two models and uses the values to classify the message. It uses the SpamAssassin corpus as training data.
pucrs0full¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pucrs0full
- Participant: puc-rs.terra
- Track: Spam
- Year: 2005
- Submission: 8/23/2005
- Task: run
pucrs0ham25¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pucrs0ham25
- Participant: puc-rs.terra
- Track: Spam
- Year: 2005
- Submission: 8/23/2005
- Task: run
pucrs0ham50¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pucrs0ham50
- Participant: puc-rs.terra
- Track: Spam
- Year: 2005
- Submission: 8/23/2005
- Task: run
pucrs0spam25¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pucrs0spam25
- Participant: puc-rs.terra
- Track: Spam
- Year: 2005
- Submission: 8/23/2005
- Task: run
pucrs0spam50¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pucrs0spam50
- Participant: puc-rs.terra
- Track: Spam
- Year: 2005
- Submission: 8/23/2005
- Task: run
pucrs1¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pucrs1
- Participant: puc-rs.terra
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: It is based on unigram language models, one for ham messages and another for spam messages. It computes the message likelihood given these two models and uses the values to classify the message.
pucrs1full¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pucrs1full
- Participant: puc-rs.terra
- Track: Spam
- Year: 2005
- Submission: 8/23/2005
- Task: run
pucrs1ham25¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pucrs1ham25
- Participant: puc-rs.terra
- Track: Spam
- Year: 2005
- Submission: 8/23/2005
- Task: run
pucrs1ham50¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pucrs1ham50
- Participant: puc-rs.terra
- Track: Spam
- Year: 2005
- Submission: 8/23/2005
- Task: run
pucrs1spam25¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pucrs1spam25
- Participant: puc-rs.terra
- Track: Spam
- Year: 2005
- Submission: 8/23/2005
- Task: run
pucrs1spam50¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pucrs1spam50
- Participant: puc-rs.terra
- Track: Spam
- Year: 2005
- Submission: 8/23/2005
- Task: run
pucrs2¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pucrs2
- Participant: puc-rs.terra
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: It is based on unigram language models, one for ham messages and another for spam messages. It computes the message likelihood given these two models and uses the values to classify the message.
pucrs2full¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pucrs2full
- Participant: puc-rs.terra
- Track: Spam
- Year: 2005
- Submission: 8/23/2005
- Task: run
pucrs2ham25¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pucrs2ham25
- Participant: puc-rs.terra
- Track: Spam
- Year: 2005
- Submission: 8/23/2005
- Task: run
pucrs2ham50¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pucrs2ham50
- Participant: puc-rs.terra
- Track: Spam
- Year: 2005
- Submission: 8/23/2005
- Task: run
pucrs2spam25¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pucrs2spam25
- Participant: puc-rs.terra
- Track: Spam
- Year: 2005
- Submission: 8/23/2005
- Task: run
pucrs2spam50¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: pucrs2spam50
- Participant: puc-rs.terra
- Track: Spam
- Year: 2005
- Submission: 8/23/2005
- Task: run
tamSPAM1dte¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: tamSPAM1dte
- Participant: masseyu.meyer
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: No prior training. This is SpamBayes 1.1 (see http://spambayes.org for details, including the source), using all default options, and doing train-on-everything (with both cutoffs set to 0.6). SpamBayes is a standard statistical classifier, using the now-common Fisher/chi-squared combining technique and various tokenization rules.
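For readers unfamiliar with Fisher/chi-squared combining, the following is a minimal sketch of the general technique, not SpamBayes source code. The function names and the per-token probabilities are illustrative assumptions; the 0.6 cutoff comes from the description above.

```python
import math


def chi2q(x2, df):
    """Survival function of the chi-squared distribution for even df."""
    assert df % 2 == 0
    m = x2 / 2.0
    term = math.exp(-m)
    total = term
    for i in range(1, df // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def combine(token_probs):
    """Combine per-token spam probabilities into one score in [0, 1]."""
    n = len(token_probs)
    ln_spam = sum(math.log(1.0 - p) for p in token_probs)
    ln_ham = sum(math.log(p) for p in token_probs)
    # Fisher's method: -2 * sum(ln p) follows chi-squared with 2n degrees
    # of freedom under the null hypothesis of independent uniform values.
    s = 1.0 - chi2q(-2.0 * ln_spam, 2 * n)  # evidence for spam
    h = 1.0 - chi2q(-2.0 * ln_ham, 2 * n)   # evidence for ham
    return (s - h + 1.0) / 2.0


CUTOFF = 0.6  # both cutoffs equal, so there is no "unsure" range

score = combine([0.99, 0.99, 0.99])
print("spam" if score >= CUTOFF else "ham")
```

Token probabilities must lie strictly between 0 and 1 for the logarithms to be defined; real filters clamp them for that reason.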
tamSPAM1full¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: tamSPAM1full
- Participant: masseyu.meyer
- Track: Spam
- Year: 2005
- Submission: 9/10/2005
- Task: run
tamSPAM1h25¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: tamSPAM1h25
- Participant: masseyu.meyer
- Track: Spam
- Year: 2005
- Submission: 9/10/2005
- Task: run
tamSPAM1h50¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: tamSPAM1h50
- Participant: masseyu.meyer
- Track: Spam
- Year: 2005
- Submission: 9/10/2005
- Task: run
tamSPAM1s25¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: tamSPAM1s25
- Participant: masseyu.meyer
- Track: Spam
- Year: 2005
- Submission: 9/10/2005
- Task: run
tamSPAM1s50¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: tamSPAM1s50
- Participant: masseyu.meyer
- Track: Spam
- Year: 2005
- Submission: 9/10/2005
- Task: run
tamSPAM2ber¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: tamSPAM2ber
- Participant: masseyu.meyer
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: This is SpamBayes 1.1 (see http://spambayes.org for full details, including the source), with the use_bigrams option enabled, and doing train-on-error. Both cutoffs are set to 0.6. The use_bigrams option does both unigram and bigram tokenization, and uses a windowing technique to avoid using tokens twice (i.e. once in the unigram and once in the bigram). SpamBayes is a standard statistical classifier, with various tokenization methods, and using the now-common Fisher/chi-squared combination method.
tamSPAM2full¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: tamSPAM2full
- Participant: masseyu.meyer
- Track: Spam
- Year: 2005
- Submission: 9/10/2005
- Task: run
tamSPAM2h25¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: tamSPAM2h25
- Participant: masseyu.meyer
- Track: Spam
- Year: 2005
- Submission: 9/10/2005
- Task: run
tamSPAM2h50¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: tamSPAM2h50
- Participant: masseyu.meyer
- Track: Spam
- Year: 2005
- Submission: 9/10/2005
- Task: run
tamSPAM2s25¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: tamSPAM2s25
- Participant: masseyu.meyer
- Track: Spam
- Year: 2005
- Submission: 9/10/2005
- Task: run
tamSPAM2s50¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: tamSPAM2s50
- Participant: masseyu.meyer
- Track: Spam
- Year: 2005
- Submission: 9/10/2005
- Task: run
tamSPAM3bex¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: tamSPAM3bex
- Participant: masseyu.meyer
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: SpamBayes 1.1 (see http://spambayes.org for full details, including source), with the use_bigrams option enabled (see description for run #2) and using train-to-exhaustion. Train-to-exhaustion (see Gary Robinson's blog for full details) repeatedly trains on the ham/spam corpus until all messages are correctly classified. None of the SpamBayes submissions have any prior training.
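The train-to-exhaustion loop described above might look roughly like the sketch below. It assumes that each pass trains only on misclassified messages, and it uses a hypothetical toy additive token-weight classifier in place of SpamBayes; `train_to_exhaustion`, `ToyClassifier`, and `max_rounds` are illustrative names, not part of any real API.

```python
from collections import defaultdict


class ToyClassifier:
    """Hypothetical stand-in classifier: additive token weights."""

    def __init__(self):
        self.weights = defaultdict(int)

    def classify(self, tokens):
        # True means spam; a message with no evidence defaults to ham.
        return sum(self.weights[t] for t in tokens) > 0

    def train(self, tokens, is_spam):
        delta = 1 if is_spam else -1
        for t in tokens:
            self.weights[t] += delta


def train_to_exhaustion(messages, clf, max_rounds=10):
    """Repeat passes over the corpus until a pass makes no errors.

    Returns the number of the first error-free pass, or None if the
    corpus is still not perfectly classified after max_rounds passes.
    """
    for round_no in range(1, max_rounds + 1):
        errors = 0
        for tokens, is_spam in messages:
            if clf.classify(tokens) != is_spam:
                clf.train(tokens, is_spam)
                errors += 1
        if errors == 0:
            return round_no
    return None


corpus = [
    (["cheap", "pills"], True),
    (["meeting", "notes"], False),
    (["cheap", "meeting"], False),
]
clf = ToyClassifier()
rounds = train_to_exhaustion(corpus, clf)
print(rounds)
```

The `max_rounds` cap is a safeguard added here because a corpus containing contradictory labels would otherwise never converge.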
tamSPAM3full¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: tamSPAM3full
- Participant: masseyu.meyer
- Track: Spam
- Year: 2005
- Submission: 9/10/2005
- Task: run
tamSPAM3s25¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: tamSPAM3s25
- Participant: masseyu.meyer
- Track: Spam
- Year: 2005
- Submission: 9/10/2005
- Task: run
tamSPAM4aex¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: tamSPAM4aex
- Participant: masseyu.meyer
- Track: Spam
- Year: 2005
- Submission: 7/28/2005
- Type: final
- Task: filter
- Run description: SpamBayes 1.1 (see http://spambayes.org for full details, including source), with all possible tokenization options turned on and using train-to-exhaustion (see description for run #3). See previous run descriptions for more details.
tamSPAMpplt¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: tamSPAMpplt
- Participant: masseyu.meyer
- Track: Spam
- Year: 2005
- Submission: 6/30/2005
- Type: pilot
- Task: filter
- Run description: SpamBayes 1.1. Uses the chi-squared combining statistical method, with both thresholds set at 0.6 to eliminate the unsure range. No prior training.
yorSPAM1full¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM1full
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAM1ham2¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM1ham2
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAM1ham5¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM1ham5
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAM1knf¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM1knf
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Type: final
- Task: filter
- Run description: Our pilot filter is based on the k-nearest neighbor classification method. We use a collection of public and private emails as an initial set of training data. Based on this training data set, an initial set of features is selected according to an information gain measure. The feature values are computed according to local term frequencies. Data preprocessing is conducted on both the training data and the test emails. We use a decoding program to parse the content of an email, which retains only the text part of an email including the decoded text body. The parsed email is then tokenized by first using white space as separators and then removing most non-alphanumeric characters, keeping only a few of them. When tokenizing, we remove stop words and stem the remaining words if necessary. During classification, the training data set is maintained by removing examples that consistently cause wrong classifications and adding each test example to the training set. The weighted Euclidean distance measure is used to find the nearest neighbors. To adapt to the evolution of spam, we retrain our model periodically. The retained instances are then re-represented from scratch to reflect the newly learned features.
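As a rough illustration of the nearest-neighbor step described above, the sketch below classifies term-frequency vectors with a weighted Euclidean distance and a majority vote. The feature list, the uniform weights, the value of `k`, and the toy examples are all assumptions; the description does not say how the weights are chosen, and feature selection and training-set maintenance are not shown.

```python
import math
from collections import Counter


def tf_vector(tokens, features):
    """Local term-frequency vector over a fixed, ordered feature list."""
    counts = Counter(tokens)
    return [counts[f] for f in features]


def weighted_euclidean(a, b, weights):
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def knn_classify(query, examples, weights, k=3):
    """Majority vote among the k training examples closest to the query."""
    nearest = sorted(
        examples, key=lambda ex: weighted_euclidean(query, ex[0], weights)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


features = ["cheap", "pills", "meeting"]   # assumed pre-selected features
weights = [1.0, 1.0, 1.0]                  # assumed uniform feature weights
examples = [
    (tf_vector("cheap pills".split(), features), "spam"),
    (tf_vector("cheap cheap pills".split(), features), "spam"),
    (tf_vector("meeting".split(), features), "ham"),
    (tf_vector("meeting meeting".split(), features), "ham"),
]
query = tf_vector("cheap pills now".split(), features)
print(knn_classify(query, examples, weights))
```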
yorSPAM1spa2¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM1spa2
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAM1spa5¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM1spa5
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAM2full¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM2full
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAM2ham2¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM2ham2
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAM2ham5¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM2ham5
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAM2ruk¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM2ruk
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Type: final
- Task: filter
- Run description: Our filter reuses SpamAssassin's code to do rule analysis; SA returns a preliminary result. The modified SA passes the parsed text to kNN, which then classifies the message again. The final result is determined based on both SA and kNN. In training, we use auto-learning and then train on errors.
yorSPAM2spa2¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM2spa2
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAM2spa5¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM2spa5
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAM3full¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM3full
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAM3ham2¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM3ham2
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAM3ham5¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM3ham5
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAM3knf¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM3knf
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Type: final
- Task: filter
- Run description: Our pilot filter is based on the k-nearest neighbor classification method. We use a collection of public and private emails as an initial set of training data. Based on this training data set, an initial set of features is selected according to an information gain measure. The feature values are computed according to local term frequencies. Data preprocessing is conducted on both the training data and the test emails. We use a decoding program to parse the content of an email, which retains only the text part of an email including the decoded text body. The parsed email is then tokenized by first using white space as separators and then removing most non-alphanumeric characters, keeping only a few of them. When tokenizing, we remove stop words and stem the remaining words if necessary. During classification, the training data set is maintained by removing examples that consistently cause wrong classifications and adding each test example to the training set. The weighted Euclidean distance measure is used to find the nearest neighbors. To adapt to the evolution of spam, we retrain our model periodically. The retained instances are then re-represented from scratch to reflect the newly learned features. This filter differs from the first filter in its pre-processing methods and retraining parameters.
yorSPAM3spa2¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM3spa2
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAM3spa5¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM3spa5
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAM4full¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM4full
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAM4ham2¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM4ham2
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAM4ham5¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM4ham5
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAM4hyb¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM4hyb
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 7/29/2005
- Type: final
- Task: filter
- Run description: This filter is a combination of kNN, NaiveBayes and RandomForest tree learning.
yorSPAM4spa2¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM4spa2
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAM4spa5¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAM4spa5
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 9/12/2005
- Task: run
yorSPAMpknn¶
Results
| Participants
| Proceedings
| Summary
| Appendix (aggregate results)
| Appendix (public corpus [full])
| Appendix (Mr. X Private corpus)
| Appendix (S.B. Private corpus)
| Appendix (T.M. Private corpus)
| Appendix (public corpus [five subsets])
- Run ID: yorSPAMpknn
- Participant: yorku.huang
- Track: Spam
- Year: 2005
- Submission: 7/5/2005
- Type: pilot
- Task: filter
- Run description: Our pilot filter is based on the k-nearest neighbor classification method. We use a collection of public and private emails as an initial set of training data. Based on this training data set, an initial set of features is selected according to an information gain measure. The feature values are computed according to the TF-IDF measure. Data preprocessing is conducted on both the training data and the test emails. We use a decoding program to parse the content of an email, which retains only the text part of an email including the decoded text body. The parsed email is then tokenized by using non-alphabetic characters as separators. During classification, the training data set is maintained by removing examples that consistently cause wrong classifications and adding each test example to the training set. The Euclidean distance measure is used to find the nearest neighbors.
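The tokenization and TF-IDF weighting mentioned above can be sketched as follows. The exact TF-IDF variant is not given in the description, so the common form tf * log(N / df) is assumed here; the function names and sample documents are illustrative.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split on runs of non-alphabetic characters, dropping empty pieces."""
    return [tok for tok in re.split(r"[^A-Za-z]+", text) if tok]


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: weight = term frequency * log(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        vectors.append(
            {t: tf[t] * math.log(n_docs / doc_freq[t]) for t in tf}
        )
    return vectors


docs = [tokenize("cheap pills"), tokenize("cheap meeting")]
vecs = tfidf_vectors(docs)
print(tokenize("Buy cheap-pills, now!"))
print(vecs[0])
```

A term that appears in every document, such as "cheap" in this toy corpus, receives weight zero under this variant, so it contributes nothing to the Euclidean distance between messages.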