Runs - Spam 2005

1BASresults

Results | Participants | Proceedings | Summary | Appendix (aggregate results) | Appendix (public corpus [full]) | Appendix (Mr. X Private corpus) | Appendix (S.B. Private corpus) | Appendix (T.M. Private corpus) | Appendix (public corpus [five subsets])

  • Run ID: 1BASresults
  • Participant: beijingu.guo
  • Track: Spam
  • Year: 2005
  • Submission: 9/1/2005
  • Task: run

1cefhuj

  • Run ID: 1cefhuj
  • Participant: breyer.laird
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: This filter tests the cef tokenizer with full standard header analysis and uniform reference measure, case sensitive tokens. Same as pilot submission.

2adphu

  • Run ID: 2adphu
  • Participant: breyer.laird
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: This filter tests the adp tokenizer with full standard header analysis and uniform reference measure, lowercase tokens. Same as pilot submission.

2V2Bresults

  • Run ID: 2V2Bresults
  • Participant: beijingu.guo
  • Track: Spam
  • Year: 2005
  • Submission: 9/1/2005
  • Task: run

3adphd

  • Run ID: 3adphd
  • Participant: breyer.laird
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: This filter tests the adp tokenizer with full standard header analysis and Dirichlet reference measure, lowercase tokens. Same as pilot submission.

3V2Eresults

  • Run ID: 3V2Eresults
  • Participant: beijingu.guo
  • Track: Spam
  • Year: 2005
  • Submission: 9/1/2005
  • Task: run

4adp

  • Run ID: 4adp
  • Participant: breyer.laird
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: This filter tests the adp tokenizer with only Subject analysis, uniform reference measure, lowercase tokens. Same as pilot submission.

4V5Bresults

  • Run ID: 4V5Bresults
  • Participant: beijingu.guo
  • Track: Spam
  • Year: 2005
  • Submission: 9/1/2005
  • Task: run

621SPAM1FIN

  • Run ID: 621SPAM1FIN
  • Participant: ibm.segal
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Type: final
  • Task: filter
  • Run description: Our submission consists of two core filters and an aggregator. The first is LNB, an extension of naive Bayes that partially removes the independence assumption. The second is the SMTP path analysis algorithm presented at CEAS 2005. The aggregator uses Amoeba optimization to find the optimal linear weights for combining the classifiers. Submission 1 is the aggregate classifier combining both SMTP path analysis and LNB. The second and third submissions are the SMTP path analysis and the LNB algorithm run individually. This will allow us to ascertain both how the individual algorithms perform and how well the aggregator does in combining them. NOTE: this submission contains three submissions; details on evaluation precedence are in README.txt.
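
The linear aggregation described in this run can be illustrated with a minimal Python sketch. It is not the participant's code: the example scores, the 0.5 threshold, and all function names are hypothetical, and a coarse grid search stands in for the Amoeba (downhill simplex) optimizer named in the description.

```python
from itertools import product

def combine(weights, scores):
    """Linear combination of per-classifier spam scores."""
    return sum(w * s for w, s in zip(weights, scores))

def error_count(weights, examples, threshold=0.5):
    """Count misclassified examples; each example is (scores, is_spam)."""
    return sum((combine(weights, scores) >= threshold) != is_spam
               for scores, is_spam in examples)

def grid_search(examples, steps=11):
    """Stand-in for the Amoeba search: try weight pairs on a coarse grid."""
    grid = [i / (steps - 1) for i in range(steps)]
    return min(product(grid, repeat=2),
               key=lambda w: error_count(w, examples))

# Hypothetical (path_score, lnb_score) pairs with labels, for illustration only.
examples = [((0.9, 0.2), True), ((0.8, 0.7), True),
            ((0.1, 0.6), False), ((0.2, 0.1), False)]
best = grid_search(examples)
```

Here `best` holds one weight pair that minimizes the error count on the toy examples; a real aggregator would fit the weights on held-out training data.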

621SPAM1FT1

  • Run ID: 621SPAM1FT1
  • Participant: ibm.segal
  • Track: Spam
  • Year: 2005
  • Submission: 7/5/2005
  • Type: pilot
  • Task: filter
  • Run description: SpamGuru combines multiple filtering technologies, including Bayesian filtering, SMTP path analysis, and pattern matching algorithms from the life sciences, and merges their scores using an adaptive weighting algorithm. The test filter submitted here has only the Bayesian component enabled.

621SPAM1FUL

  • Run ID: 621SPAM1FUL
  • Participant: ibm.segal
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

621SPAM1H25

  • Run ID: 621SPAM1H25
  • Participant: ibm.segal
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

621SPAM1H50

  • Run ID: 621SPAM1H50
  • Participant: ibm.segal
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

621SPAM1S25

  • Run ID: 621SPAM1S25
  • Participant: ibm.segal
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

621SPAM1S50

  • Run ID: 621SPAM1S50
  • Participant: ibm.segal
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

621SPAM2FUL

  • Run ID: 621SPAM2FUL
  • Participant: ibm.segal
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

621SPAM2H25

  • Run ID: 621SPAM2H25
  • Participant: ibm.segal
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

621SPAM2H50

  • Run ID: 621SPAM2H50
  • Participant: ibm.segal
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

621SPAM2S25

  • Run ID: 621SPAM2S25
  • Participant: ibm.segal
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

621SPAM2S50

  • Run ID: 621SPAM2S50
  • Participant: ibm.segal
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

621SPAM3FUL

  • Run ID: 621SPAM3FUL
  • Participant: ibm.segal
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

621SPAM3H25

  • Run ID: 621SPAM3H25
  • Participant: ibm.segal
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

621SPAM3H50

  • Run ID: 621SPAM3H50
  • Participant: ibm.segal
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

621SPAM3S25

  • Run ID: 621SPAM3S25
  • Participant: ibm.segal
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

621SPAM3S50

  • Run ID: 621SPAM3S50
  • Participant: ibm.segal
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

621SPAMFT1

  • Run ID: 621SPAMFT1
  • Participant: ibm.segal
  • Track: Spam
  • Year: 2005
  • Submission: 7/5/2005
  • Task: run

azeSPAM1BNS

Results | Participants | Summary | Appendix (aggregate results) | Appendix (public corpus [full]) | Appendix (Mr. X Private corpus) | Appendix (S.B. Private corpus) | Appendix (T.M. Private corpus) | Appendix (public corpus [five subsets])

  • Run ID: azeSPAM1BNS
  • Participant: uparis-sud.aze
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: Naive Bayes, stop word list, bigrams.

azeSPAM1res

  • Run ID: azeSPAM1res
  • Participant: uparis-sud.aze
  • Track: Spam
  • Year: 2005
  • Submission: 9/8/2005
  • Task: run

azeSPAM2CSS

  • Run ID: azeSPAM2CSS
  • Participant: uparis-sud.aze
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: Naive Bayes, ChiSq, stop word list, bigrams.

azeSPAMpf01

  • Run ID: azeSPAMpf01
  • Participant: uparis-sud.aze
  • Track: Spam
  • Year: 2005
  • Submission: 7/5/2005
  • Type: pilot
  • Task: filter
  • Run description: Training offline on the SpamAssassin corpus. Statistical algorithm that doesn't work.

cao1knf

  • Run ID: cao1knf
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Task: run

cao2ruk

  • Run ID: cao2ruk
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Task: run

cao3knf

  • Run ID: cao3knf
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Task: run

cao4hyb

  • Run ID: cao4hyb
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Task: run

crmSPAM1full

  • Run ID: crmSPAM1full
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAM1ham2

  • Run ID: crmSPAM1ham2
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAM1ham5

  • Run ID: crmSPAM1ham5
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAM1osf

  • Run ID: crmSPAM1osf
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: CRM114 - variant OSBF. The OSBF algorithm is a typical Bayesian classifier, but with the OSB (Orthogonal Sparse Bigrams) feature extraction technique (see http://www.siefkes.net/papers/winnow-spam.pdf) as its front-end and an intuitively derived Confidence Factor, a.k.a. "voodoo", for noise reduction and greater accuracy. This configuration uses no pre-trained info and the messages are not preprocessed in any way, not even mimedecoded.
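
As a rough illustration of the OSB feature extraction mentioned in this run, the Python sketch below pairs each token with the tokens that precede it inside a sliding window and records the gap between them. The window size of 5, the `<skip>` marker, and the function name are assumptions made for the example; CRM114's own tokenization, hashing, and the Confidence Factor are not reproduced.

```python
def osb_features(tokens, window=5):
    """Pair each token with each of the up to window-1 tokens before it.

    The gap between the two tokens is kept in the feature, so "a b" and
    "a <skip> b" are distinct features.
    """
    features = []
    for i, right in enumerate(tokens):
        # gap = number of tokens skipped between the left and right token
        for gap in range(min(window - 1, i)):
            left = tokens[i - gap - 1]
            features.append(f"{left} {'<skip> ' * gap}{right}")
    return features

pairs = osb_features(["buy", "cheap", "pills"])
```

For the three-token input above, `pairs` contains the two adjacent bigrams plus one sparse bigram that skips the middle token.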

crmSPAM1spm2

  • Run ID: crmSPAM1spm2
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAM1spm5

  • Run ID: crmSPAM1spm5
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAM2full

  • Run ID: crmSPAM2full
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAM2ham2

  • Run ID: crmSPAM2ham2
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAM2ham5

  • Run ID: crmSPAM2ham5
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAM2spm2

  • Run ID: crmSPAM2spm2
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAM2spm5

  • Run ID: crmSPAM2spm5
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAM2win

  • Run ID: crmSPAM2win
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: This filter variation combines the OSB (orthogonal sparse bigrams) feature combination technique with the Winnow algorithm developed by Nick Littlestone. See C. Siefkes, F. Assis, S. Chhabra, and W. Yerazunis, "Combining Winnow and Orthogonal Sparse Bigrams for Incremental Spam Filtering", PKDD 2004, http://www.siefkes.net/papers/winnow-spam.pdf, for a detailed description. The classifier has not been pre-trained and input mails are not preprocessed in any way, not even mimedecoded.
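
The mistake-driven, multiplicative update at the heart of Winnow can be sketched in Python as follows. This is a simplified textbook-style version, not the configuration used in this run: the promotion and demotion factors, the threshold, the averaging of weights over active features, and the class name are all assumptions chosen for illustration, and the feature strings merely stand in for OSB features.

```python
from collections import defaultdict

class Winnow:
    """Simplified Winnow: weights change only when a prediction is wrong."""

    def __init__(self, promotion=2.0, demotion=0.5, threshold=1.0):
        self.weights = defaultdict(lambda: 1.0)  # every feature starts at 1.0
        self.promotion = promotion
        self.demotion = demotion
        self.threshold = threshold

    def score(self, features):
        """Average weight of the distinct features present in the message."""
        active = set(features)
        if not active:
            return 0.0
        return sum(self.weights[f] for f in active) / len(active)

    def predict(self, features):
        """True means spam."""
        return self.score(features) > self.threshold

    def train(self, features, is_spam):
        if self.predict(features) == is_spam:
            return  # correct prediction: leave the weights alone
        factor = self.promotion if is_spam else self.demotion
        for f in set(features):
            self.weights[f] *= factor

w = Winnow()
w.train(["buy now", "now click"], True)     # missed spam: promote features
w.train(["hello bob", "bob see"], False)    # correct ham: no update
```

After these two calls the spam features carry weight 2.0 and the ham features are unchanged, so only the first message scores above the threshold.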

crmSPAM3full

  • Run ID: crmSPAM3full
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAM3ham2

  • Run ID: crmSPAM3ham2
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAM3ham5

  • Run ID: crmSPAM3ham5
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAM3osu

  • Run ID: crmSPAM3osu
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: CRM114 - variant OSB Unique. The OSB Unique algorithm is a typical Bayesian classifier. It is a variant of the OSB option, also available in CRM114, which uses the OSB (Orthogonal Sparse Bigrams) feature extraction technique (see http://www.siefkes.net/papers/winnow-spam.pdf) but with the restriction that features are considered only once, no matter how many times they appear in a document. This configuration uses no pre-trained info and the messages are not preprocessed in any way, not even mimedecoded.
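
The "considered only once" restriction can be shown with a small Python sketch that contrasts per-document feature counts with per-document feature presence. The feature strings and function names are hypothetical; how CRM114 stores and hashes features is not modeled here.

```python
from collections import Counter

def count_features(features):
    """Plain counting: a repeated feature contributes once per occurrence."""
    return Counter(features)

def unique_features(features):
    """Unique variant: each distinct feature contributes exactly once."""
    return Counter(set(features))

# Hypothetical features extracted from one message.
features = ["buy now", "buy <skip> now", "buy now"]
counted = count_features(features)
unique = unique_features(features)
```

With the toy input, the repeated feature has count 2 under plain counting and count 1 under the unique variant.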

crmSPAM3spm2

  • Run ID: crmSPAM3spm2
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAM3spm5

  • Run ID: crmSPAM3spm5
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAM4full

  • Run ID: crmSPAM4full
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAM4ham2

  • Run ID: crmSPAM4ham2
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAM4ham5

  • Run ID: crmSPAM4ham5
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAM4osb

  • Run ID: crmSPAM4osb
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: CRM114 - variant OSB. The OSB algorithm is a typical Bayesian classifier, but uses OSB (Orthogonal Sparse Bigrams), a feature extraction technique derived from SBPH (Sparse Binary Polynomial Hashing) that reduces the number of features produced while keeping the same accuracy (see http://www.siefkes.net/papers/winnow-spam.pdf). This configuration uses no pre-trained info and the messages are not preprocessed in any way, not even mimedecoded.

crmSPAM4spm2

  • Run ID: crmSPAM4spm2
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAM4spm5

  • Run ID: crmSPAM4spm5
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 9/5/2005
  • Task: run

crmSPAMp1of

  • Run ID: crmSPAMp1of
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 7/5/2005
  • Type: pilot
  • Task: filter
  • Run description: The OSBF algorithm is a typical Bayesian classifier, but with the OSB (Orthogonal Sparse Bigrams) feature extraction technique (see http://www.siefkes.net/papers/winnow-spam.pdf) as its front-end and an intuitively derived Confidence Factor, a.k.a. "voodoo", for noise reduction and greater accuracy. This configuration uses no pre-trained info and the messages are not preprocessed in any way, not even mimedecoded.

crmSPAMp2wi

  • Run ID: crmSPAMp2wi
  • Participant: merl.yerazunis
  • Track: Spam
  • Year: 2005
  • Submission: 7/5/2005
  • Type: pilot
  • Task: filter
  • Run description: This filter variation combines the OSB (orthogonal sparse bigrams) feature combination technique with the Winnow algorithm developed by Nick Littlestone. See C. Siefkes, F. Assis, S. Chhabra, and W. Yerazunis, "Combining Winnow and Orthogonal Sparse Bigrams for Incremental Spam Filtering", PKDD 2004, http://www.siefkes.net/papers/winnow-spam.pdf, for a detailed description. The classifier has not been pre-trained and input mails are not preprocessed in any way.

dalSPAM1fsw

  • Run ID: dalSPAM1fsw
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Type: final
  • Task: filter
  • Run description: Limited word-based naive Bayes with a whitelist.

dalSPAM1sw1

  • Run ID: dalSPAM1sw1
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 9/9/2005
  • Task: run

dalSPAM1sx2

  • Run ID: dalSPAM1sx2
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 9/13/2005
  • Task: run

dalSPAM1sxw

  • Run ID: dalSPAM1sxw
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 7/6/2005
  • Type: pilot
  • Task: filter
  • Run description: Used a partial SpamAssassin corpus as training data.

dalSPAM2f

  • Run ID: dalSPAM2f
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 9/7/2005
  • Task: run

dalSPAM2h25

  • Run ID: dalSPAM2h25
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 9/6/2005
  • Task: run

dalSPAM2h50

  • Run ID: dalSPAM2h50
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 9/7/2005
  • Task: run

DalSPAM2n4

  • Run ID: DalSPAM2n4
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Type: final
  • Task: filter
  • Run description: Using CNG method with profile aging; byte n-grams (n=4)
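
The byte n-gram profiles mentioned in this description can be illustrated with a minimal sketch. It assumes CNG refers to the Common N-Grams profile method, in which each class is summarized by its most frequent n-grams and a message is assigned to the class whose profile is nearest. The function names, the profile size, and the sample strings are hypothetical, and the profile aging used by this run is not specified here, so it is omitted.

```python
from collections import Counter

def profile(data: bytes, n: int = 4, size: int = 1000) -> dict:
    """Normalized frequencies of the `size` most frequent byte n-grams."""
    grams = Counter(data[i:i + n] for i in range(len(data) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.most_common(size)}

def cng_distance(p1: dict, p2: dict) -> float:
    """Sum of squared relative frequency differences over the union of n-grams."""
    d = 0.0
    for g in set(p1) | set(p2):
        f1, f2 = p1.get(g, 0.0), p2.get(g, 0.0)
        d += (2 * (f1 - f2) / (f1 + f2)) ** 2
    return d

# Hypothetical toy data, for illustration only.
spam_profile = profile(b"buy cheap pills now, buy cheap pills today")
ham_profile = profile(b"the meeting agenda for monday is attached")
message = profile(b"cheap pills now")
label = "spam" if cng_distance(message, spam_profile) < cng_distance(message, ham_profile) else "ham"
```

An n-gram missing from one profile contributes the maximum term of 4, so profiles that share few n-grams end up far apart.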

dalSPAM2s25

  • Run ID: dalSPAM2s25
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 9/8/2005
  • Task: run

dalSPAM2s50

  • Run ID: dalSPAM2s50
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 9/8/2005
  • Task: run

dalSPAM2vla

  • Run ID: dalSPAM2vla
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 7/6/2005
  • Type: pilot
  • Task: filter
  • Run description: Later... Perl, ngram-based etc.

dalSPAM3f

  • Run ID: dalSPAM3f
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 9/9/2005
  • Task: run

dalSPAM3h25

  • Run ID: dalSPAM3h25
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 9/9/2005
  • Task: run

dalSPAM3h50

  • Run ID: dalSPAM3h50
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 9/11/2005
  • Task: run

DalSPAM3n5

  • Run ID: DalSPAM3n5
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Type: final
  • Task: filter
  • Run description: Using CNG method with profile aging; byte n-grams (n=5)

dalSPAM3s25

  • Run ID: dalSPAM3s25
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 9/11/2005
  • Task: run

dalSPAM3s50

  • Run ID: dalSPAM3s50
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 9/11/2005
  • Task: run

dalSPAM3vla

  • Run ID: dalSPAM3vla
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 7/6/2005
  • Type: pilot
  • Task: filter
  • Run description: Later... Perl, ngram-based etc.

dalSPAM4fsw

  • Run ID: dalSPAM4fsw
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Type: final
  • Task: filter
  • Run description: Limited n-gram-based naive Bayes with a whitelist.

dalSPAM4sw1

  • Run ID: dalSPAM4sw1
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 9/9/2005
  • Task: run

dalSPAM4sx2

  • Run ID: dalSPAM4sx2
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 9/13/2005
  • Task: run

dalSPAM4sxw

  • Run ID: dalSPAM4sxw
  • Participant: dalhousieu.keselj
  • Track: Spam
  • Year: 2005
  • Submission: 7/6/2005
  • Type: pilot
  • Task: filter
  • Run description: Used the SpamAssassin corpus as training data.

ICTSPAM1WNB

  • Run ID: ICTSPAM1WNB
  • Participant: cas-ict.wang
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: A Winnow algorithm using a structure-based hierarchical model.

ICTSPAM2WNH

  • Run ID: ICTSPAM2WNH
  • Participant: cas-ict.wang
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: A Winnow algorithm using a multi-feature model.

ICTSPAM3NBH

  • Run ID: ICTSPAM3NBH
  • Participant: cas-ict.wang
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: A naive Bayes (nBayes) algorithm using a structure-based hierarchical model.

ICTSPAM4NBB

  • Run ID: ICTSPAM4NBB
  • Participant: cas-ict.wang
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: A naive Bayes (nBayes) algorithm using a multi-feature model.

ICTSPAMpWNB

  • Run ID: ICTSPAMpWNB
  • Participant: cas-ict.wang
  • Track: Spam
  • Year: 2005
  • Submission: 7/4/2005
  • Type: pilot
  • Task: filter
  • Run description: A hierarchical filter that uses Winnow as the classifier.

ijsSPAM1full

  • Run ID: ijsSPAM1full
  • Participant: jozef-stefan-inst.bratko
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

ijsSPAM1h25

  • Run ID: ijsSPAM1h25
  • Participant: jozef-stefan-inst.bratko
  • Track: Spam
  • Year: 2005
  • Submission: 9/13/2005
  • Task: run

ijsSPAM1pm1

  • Run ID: ijsSPAM1pm1
  • Participant: jozef-stefan-inst.bratko
  • Track: Spam
  • Year: 2005
  • Submission: 7/5/2005
  • Type: pilot
  • Task: filter
  • Run description: The filter uses a character-based algorithm, which attempts to evaluate the probability that a message is spam based on the observed sequence of characters. A Markov model is used, i.e. the next character is estimated depending on the D preceding characters (D is set to 8 in this case). The Prediction by Partial Matching algorithm is used for prediction, with escape method D and exclusion. An adaptive model is used, so that the classifier continues to update the model as it evaluates an unknown message from beginning to end. Preprocessing is minimal. It includes MIME decoding (non-text parts are excluded), whitespace "normalization" (all sequences of whitespace are truncated to a single space) and minimal preprocessing of headers (e.g. dates are removed). The model is not pre-trained.

ijsSPAM1pm8

  • Run ID: ijsSPAM1pm8
  • Participant: jozef-stefan-inst.bratko
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Type: final
  • Task: filter
  • Run description: This filter models email messages with a character-level Markov model. The Prediction by Partial Matching algorithm is used for this purpose. Two models are trained and the model that best approximates the message being classified determines the classification outcome. An order-8 model is used. The model is adaptive, i.e. it is updated as the message is being processed. No particular preprocessing is used, except MIME decoding and exclusion of embedded attachments. The filter is not pre-trained (the model is empty to begin with).
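
The two-model scheme described here can be sketched as follows. This is a simplified illustration and not the submitted filter: it swaps Prediction by Partial Matching for a fixed-order character Markov model with add-one smoothing, uses a short order for the toy data, and all names and sample strings are hypothetical. The class whose model encodes the message in fewer bits wins, and counts are adapted on a scratch overlay while the message is scored.

```python
import math
from collections import defaultdict

class CharMarkovModel:
    """Order-k byte model with add-one smoothing (a stand-in for PPM)."""

    def __init__(self, order: int = 3, alphabet: int = 256):
        self.order = order
        self.alphabet = alphabet
        self.counts = defaultdict(int)   # (context, symbol) -> count
        self.totals = defaultdict(int)   # context -> count

    def _pairs(self, data: bytes):
        for i in range(len(data)):
            yield data[max(0, i - self.order):i], data[i]

    def train(self, data: bytes) -> None:
        for ctx, sym in self._pairs(data):
            self.counts[(ctx, sym)] += 1
            self.totals[ctx] += 1

    def cost_bits(self, data: bytes) -> float:
        """Code length of data; statistics adapt during scoring, stored model is untouched."""
        extra, extra_totals = defaultdict(int), defaultdict(int)
        bits = 0.0
        for ctx, sym in self._pairs(data):
            seen = self.counts.get((ctx, sym), 0) + extra[(ctx, sym)]
            total = self.totals.get(ctx, 0) + extra_totals[ctx]
            bits -= math.log2((seen + 1) / (total + self.alphabet))
            extra[(ctx, sym)] += 1
            extra_totals[ctx] += 1
        return bits

def classify(message: bytes, spam: CharMarkovModel, ham: CharMarkovModel) -> str:
    return "spam" if spam.cost_bits(message) < ham.cost_bits(message) else "ham"

# Hypothetical toy training data, for illustration only.
spam_model, ham_model = CharMarkovModel(), CharMarkovModel()
spam_model.train(b"buy cheap pills now " * 5)
ham_model.train(b"see you at the meeting " * 5)
```

Because both models start empty, such a filter learns only from the messages it is trained on after each classification, which matches the "not pre-trained" setup described above.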

ijsSPAM2cw1

  • Run ID: ijsSPAM2cw1
  • Participant: jozef-stefan-inst.bratko
  • Track: Spam
  • Year: 2005
  • Submission: 7/5/2005
  • Type: pilot
  • Task: filter
  • Run description: The filter uses a character-based algorithm, which attempts to evaluate the probability that a message is spam based on the observed sequence of characters. A Markov model is used, i.e. the next character is estimated depending on the D preceding characters (D is set to 8 in this case). The Context Tree Weighting algorithm is used for prediction, with "custom" escape and exclusion mechanisms. An adaptive model is used, so that the classifier continues to update the model as it evaluates an unknown message from beginning to end. Preprocessing is minimal. It includes MIME decoding (non-text parts are excluded), whitespace "normalization" (all sequences of whitespace are truncated to a single space) and minimal preprocessing of headers (e.g. dates are removed). The model is not pre-trained.

ijsSPAM2full

  • Run ID: ijsSPAM2full
  • Participant: jozef-stefan-inst.bratko
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

ijsSPAM2h25

  • Run ID: ijsSPAM2h25
  • Participant: jozef-stefan-inst.bratko
  • Track: Spam
  • Year: 2005
  • Submission: 9/13/2005
  • Task: run

ijsSPAM2pm6

  • Run ID: ijsSPAM2pm6
  • Participant: jozef-stefan-inst.bratko
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Type: final
  • Task: filter
  • Run description: This filter models email messages with a character-level Markov model. The Prediction by Partial Matching algorithm is used for this purpose. Two models are trained and the model that best approximates the message being classified determines the classification outcome. An order-6 model is used. The model is adaptive, i.e. it is updated as the message is being processed. No particular preprocessing is used, except MIME decoding and exclusion of embedded attachments. The filter is not pre-trained (the model is empty to begin with).

ijsSPAM3cw2

  • Run ID: ijsSPAM3cw2
  • Participant: jozef-stefan-inst.bratko
  • Track: Spam
  • Year: 2005
  • Submission: 7/5/2005
  • Type: pilot
  • Task: filter
  • Run description: The filter uses a character-based algorithm, which attempts to evaluate the probability that a message is spam based on the observed sequence of characters. A Markov model is used, i.e. the next character is estimated depending on the D preceding characters (D is set to 8 in this case). The Context Tree Weighting algorithm is used for prediction, with "custom" escape and exclusion mechanisms. An adaptive model is used, so that the classifier continues to update the model as it evaluates an unknown message from beginning to end. A special mechanism for handling case is employed in this version, with additional "control" characters that are encoded to signal the case of letters. Preprocessing is minimal. It includes MIME decoding (non-text parts are excluded), whitespace "normalization" (all sequences of whitespace are truncated to a single space) and minimal preprocessing of headers (e.g. dates are removed). The model is not pre-trained.

ijsSPAM3full

  • Run ID: ijsSPAM3full
  • Participant: jozef-stefan-inst.bratko
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

ijsSPAM3h25

  • Run ID: ijsSPAM3h25
  • Participant: jozef-stefan-inst.bratko
  • Track: Spam
  • Year: 2005
  • Submission: 9/13/2005
  • Task: run

ijsSPAM3pm2

  • Run ID: ijsSPAM3pm2
  • Participant: jozef-stefan-inst.bratko
  • Track: Spam
  • Year: 2005
  • Submission: 7/5/2005
  • Type: pilot
  • Task: filter
  • Run description: The filter uses a character-based algorithm, which attempts to evaluate the probability that a message is spam based on the observed sequence of characters. A Markov model is used, i.e. the next character is estimated depending on the D preceding characters (D is set to 8 in this case). The Prediction by Partial Matching algorithm is used for prediction, with escape method D and exclusion. An adaptive model is used, so that the classifier continues to update the model as it evaluates an unknown message from beginning to end. A special mechanism for handling case is employed in this version, with additional "control" characters that are encoded to signal the case of letters. Preprocessing is minimal. It includes MIME decoding (non-text parts are excluded), whitespace "normalization" (all sequences of whitespace are truncated to a single space) and minimal preprocessing of headers (e.g. dates are removed). The model is not pre-trained.

ijsSPAM3pm4

  • Run ID: ijsSPAM3pm4
  • Participant: jozef-stefan-inst.bratko
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Type: final
  • Task: filter
  • Run description: This filter models email messages with a character-level Markov model. The Prediction by Partial Matching algorithm is used for this purpose. Two models are trained and the model that best approximates the message being classified determines the classification outcome. An order-4 model is used. The model is adaptive, i.e. it is updated as the message is being processed. No particular preprocessing is used, except MIME decoding and exclusion of embedded attachments. The filter is not pre-trained (the model is empty to begin with).

ijsSPAM4cw2

  • Run ID: ijsSPAM4cw2
  • Participant: jozef-stefan-inst.bratko
  • Track: Spam
  • Year: 2005
  • Submission: 7/5/2005
  • Type: pilot
  • Task: filter
  • Run description: The filter uses a character-based algorithm, which attempts to evaluate the probability that a message is spam based on the observed sequence of characters. A Markov model is used, i.e. the next character is estimated depending on the D preceding characters (D is set to 8 in this case). The Context Tree Weighting algorithm is used for prediction, with "custom" escape and exclusion mechanisms. An adaptive model is used, so that the classifier continues to update the model as it evaluates an unknown message from beginning to end. A special mechanism for handling case is employed in this version, with additional "control" characters that are encoded to signal the case of letters. Preprocessing is minimal. It includes MIME decoding (non-text parts are excluded), whitespace "normalization" (all sequences of whitespace are truncated to a single space) and minimal preprocessing of headers (e.g. dates are removed). The model is not pre-trained.

ijsSPAM4cw8

  • Run ID: ijsSPAM4cw8
  • Participant: jozef-stefan-inst.bratko
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Type: final
  • Task: filter
  • Run description: This filter models email messages with a character-level Markov model. The Context Tree Weighting algorithm is used for this purpose. Two models are trained and the model that best approximates the message being classified determines the classification outcome. An order-8 model is used. The model is adaptive, i.e. it is updated as the message is being processed. No particular preprocessing is used, except MIME decoding and exclusion of embedded attachments. The filter is not pre-trained (the model is empty to begin with).

ijsSPAM4full

  • Run ID: ijsSPAM4full
  • Participant: jozef-stefan-inst.bratko
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

ijsSPAM4h25

  • Run ID: ijsSPAM4h25
  • Participant: jozef-stefan-inst.bratko
  • Track: Spam
  • Year: 2005
  • Submission: 9/13/2005
  • Task: run

indSPAM1f4f

  • Run ID: indSPAM1f4f
  • Participant: indianau.yang
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Type: final
  • Task: filter
  • Run description: Fusion filter (Naive Bayes, Rule-based, Pattern-based, Blacklist) with large training data

indSPAM2f42

  • Run ID: indSPAM2f42
  • Participant: indianau.yang
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Type: final
  • Task: filter
  • Run description: Fusion filter (Naive Bayes, Rule-based, Pattern-based, Blacklist) with training after each classification

indSPAM3f40

  • Run ID: indSPAM3f40
  • Participant: indianau.yang
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Type: final
  • Task: filter
  • Run description: Fusion filter (Naive Bayes, Rule-based, Pattern-based, Blacklist) with training only after misclassification

indSPAM4pf4

  • Run ID: indSPAM4pf4
  • Participant: indianau.yang
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Type: final
  • Task: filter
  • Run description: Prototype Fusion filter (Naive Bayes, Rule-based, Pattern-based, Blacklist) with large training data

indSPAMpwf4

  • Run ID: indSPAMpwf4
  • Participant: indianau.yang
  • Track: Spam
  • Year: 2005
  • Submission: 7/4/2005
  • Type: pilot
  • Task: filter
  • Run description: Fusion filter that combines Bayesian filter, rule-based filter, pattern-based filter, and blacklist filter

kidSPAM1BAS

  • Run ID: kidSPAM1BAS
  • Participant: beijingu.guo
  • Track: Spam
  • Year: 2005
  • Submission: 7/27/2005
  • Type: final
  • Task: filter
  • Run description: KidultPRIS is a command-line program developed by members of the Pattern Recognition and Intelligent System lab at Beijing University of Posts and Telecommunications.

kidSPAM2V2B

  • Run ID: kidSPAM2V2B
  • Participant: beijingu.guo
  • Track: Spam
  • Year: 2005
  • Submission: 7/27/2005
  • Type: final
  • Task: filter
  • Run description: KidultPRIS is a command-line program developed by members of the Pattern Recognition and Intelligent System lab at Beijing University of Posts and Telecommunications.

kidSPAM3V2E

  • Run ID: kidSPAM3V2E
  • Participant: beijingu.guo
  • Track: Spam
  • Year: 2005
  • Submission: 7/27/2005
  • Type: final
  • Task: filter
  • Run description: KidultPRIS is a command-line program developed by members of the Pattern Recognition and Intelligent System lab at Beijing University of Posts and Telecommunications.

kidSPAM4V5B

  • Run ID: kidSPAM4V5B
  • Participant: beijingu.guo
  • Track: Spam
  • Year: 2005
  • Submission: 7/27/2005
  • Type: final
  • Task: filter
  • Run description: KidultPRIS is a command-line program developed by members of the Pattern Recognition and Intelligent System lab at Beijing University of Posts and Telecommunications.

kidSPAMpBAS

  • Run ID: kidSPAMpBAS
  • Participant: beijingu.guo
  • Track: Spam
  • Year: 2005
  • Submission: 6/30/2005
  • Type: pilot
  • Task: filter
  • Run description: KidultPRIS is a command-line program developed by members of the Pattern Recognition and Intelligent System lab at Beijing University of Posts and Telecommunications.

No2

  • Run ID: No2
  • Participant: cas.nlpr.jzhao
  • Track: Spam
  • Year: 2005
  • Submission: 7/5/2005
  • Type: pilot
  • Task: filter
  • Run description: This is just a pilot system. Building on the ifile system developed by Jason Rennie, we incorporate other machine learning methods. Please see the forthcoming final filter system and its Readme file.

None

  • Run ID: None
  • Participant: cas.nlpr.jzhao
  • Track: Spam
  • Year: 2005
  • Submission: 7/5/2005
  • Task: run

p1cefhuj

  • Run ID: p1cefhuj
  • Participant: breyer.laird
  • Track: Spam
  • Year: 2005
  • Submission: 7/3/2005
  • Type: pilot
  • Task: filter
  • Run description: This filter tests the cef tokenizer with full standard header analysis and uniform reference measure, case sensitive tokens.

p2adphu

  • Run ID: p2adphu
  • Participant: breyer.laird
  • Track: Spam
  • Year: 2005
  • Submission: 7/3/2005
  • Type: pilot
  • Task: filter
  • Run description: This filter tests the adp tokenizer with full standard header analysis and uniform reference measure, lowercase tokens.

p3adphd

  • Run ID: p3adphd
  • Participant: breyer.laird
  • Track: Spam
  • Year: 2005
  • Submission: 7/3/2005
  • Type: pilot
  • Task: filter
  • Run description: This filter tests the adp tokenizer with full standard header analysis and Dirichlet reference measure, lowercase tokens.

p4adp

  • Run ID: p4adp
  • Participant: breyer.laird
  • Track: Spam
  • Year: 2005
  • Submission: 7/3/2005
  • Type: pilot
  • Task: filter
  • Run description: This filter tests the adp tokenizer with only Subject analysis, uniform reference measure, lowercase tokens.

pub1full

  • Run ID: pub1full
  • Participant: breyer.laird
  • Track: Spam
  • Year: 2005
  • Submission: 9/6/2005
  • Task: run

pub1ham50

  • Run ID: pub1ham50
  • Participant: breyer.laird
  • Track: Spam
  • Year: 2005
  • Submission: 9/6/2005
  • Task: run

pub1spam50

  • Run ID: pub1spam50
  • Participant: breyer.laird
  • Track: Spam
  • Year: 2005
  • Submission: 9/6/2005
  • Task: run

pub2full

  • Run ID: pub2full
  • Participant: breyer.laird
  • Track: Spam
  • Year: 2005
  • Submission: 9/6/2005
  • Task: run

pub3full

  • Run ID: pub3full
  • Participant: breyer.laird
  • Track: Spam
  • Year: 2005
  • Submission: 9/7/2005
  • Task: run

pub4full

  • Run ID: pub4full
  • Participant: breyer.laird
  • Track: Spam
  • Year: 2005
  • Submission: 9/6/2005
  • Task: run

PUC

  • Run ID: PUC
  • Participant: puc-rs.terra
  • Track: Spam
  • Year: 2005
  • Submission: 7/5/2005
  • Type: pilot
  • Task: filter
  • Run description: It uses a unigram language modelling approach with Laplace smoothing. It computes the likelihood of the message under two language models (one for ham, one for spam), then the odds of each model generating the message, and decides the class based on those odds. As prior training it uses 2500 spam messages (collected from the web) for the spam language model and 1400 AQUAINT documents for the ham language model. The background model is interpolated with new models created on the fly and updated on each new message (train). The current configuration generates 98 false negatives and 45 false positives on the SpamAssassin corpus.
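
The approach described above can be illustrated with a minimal sketch, which is not the participant's code. It assumes add-one (Laplace) smoothing over a shared vocabulary and a decision threshold of zero on the log-odds; the class and function names are hypothetical, and the interpolation with a background model is left out.

```python
import math
from collections import Counter


class UnigramModel:
    """Unigram language model with add-one (Laplace) smoothing (illustrative)."""

    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def train(self, tokens):
        # Incremental update: can be called once per new message.
        self.counts.update(tokens)
        self.total += len(tokens)

    def log_prob(self, tokens, vocab_size):
        # Add-one smoothing gives unseen tokens a small non-zero probability.
        denom = self.total + vocab_size
        return sum(math.log((self.counts[t] + 1) / denom) for t in tokens)


def classify(tokens, ham, spam):
    # Shared vocabulary size, plus one slot for unseen tokens (an assumption).
    vocab_size = len(set(ham.counts) | set(spam.counts)) + 1
    log_odds = spam.log_prob(tokens, vocab_size) - ham.log_prob(tokens, vocab_size)
    # Threshold of 0 on the log-odds is an assumption, not taken from the run.
    return ("spam" if log_odds > 0 else "ham"), log_odds
```

For example, after training `ham` on "meeting agenda for monday" and `spam` on "cheap pills buy now", the message "buy cheap pills" yields positive log-odds and is labelled spam.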

pucrs0

  • Run ID: pucrs0
  • Participant: puc-rs.terra
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: It is based on unigram language models, one for ham messages and another for spam messages. It computes the message likelihood given these two models and uses the values to classify the message. Uses the SpamAssassin corpus as training data.

pucrs0full

  • Run ID: pucrs0full
  • Participant: puc-rs.terra
  • Track: Spam
  • Year: 2005
  • Submission: 8/23/2005
  • Task: run

pucrs0ham25

  • Run ID: pucrs0ham25
  • Participant: puc-rs.terra
  • Track: Spam
  • Year: 2005
  • Submission: 8/23/2005
  • Task: run

pucrs0ham50

  • Run ID: pucrs0ham50
  • Participant: puc-rs.terra
  • Track: Spam
  • Year: 2005
  • Submission: 8/23/2005
  • Task: run

pucrs0spam25

  • Run ID: pucrs0spam25
  • Participant: puc-rs.terra
  • Track: Spam
  • Year: 2005
  • Submission: 8/23/2005
  • Task: run

pucrs0spam50

  • Run ID: pucrs0spam50
  • Participant: puc-rs.terra
  • Track: Spam
  • Year: 2005
  • Submission: 8/23/2005
  • Task: run

pucrs1

  • Run ID: pucrs1
  • Participant: puc-rs.terra
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: It is based on unigram language models, one for ham messages and another for spam messages. It computes the message likelihood given these two models and uses the values to classify the message.

pucrs1full

  • Run ID: pucrs1full
  • Participant: puc-rs.terra
  • Track: Spam
  • Year: 2005
  • Submission: 8/23/2005
  • Task: run

pucrs1ham25

  • Run ID: pucrs1ham25
  • Participant: puc-rs.terra
  • Track: Spam
  • Year: 2005
  • Submission: 8/23/2005
  • Task: run

pucrs1ham50

  • Run ID: pucrs1ham50
  • Participant: puc-rs.terra
  • Track: Spam
  • Year: 2005
  • Submission: 8/23/2005
  • Task: run

pucrs1spam25

  • Run ID: pucrs1spam25
  • Participant: puc-rs.terra
  • Track: Spam
  • Year: 2005
  • Submission: 8/23/2005
  • Task: run

pucrs1spam50

  • Run ID: pucrs1spam50
  • Participant: puc-rs.terra
  • Track: Spam
  • Year: 2005
  • Submission: 8/23/2005
  • Task: run

pucrs2

  • Run ID: pucrs2
  • Participant: puc-rs.terra
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: It is based on unigram language models, one for ham messages and another for spam messages. It computes the message likelihood given these two models and uses the values to classify the message.

pucrs2full

  • Run ID: pucrs2full
  • Participant: puc-rs.terra
  • Track: Spam
  • Year: 2005
  • Submission: 8/23/2005
  • Task: run

pucrs2ham25

  • Run ID: pucrs2ham25
  • Participant: puc-rs.terra
  • Track: Spam
  • Year: 2005
  • Submission: 8/23/2005
  • Task: run

pucrs2ham50

  • Run ID: pucrs2ham50
  • Participant: puc-rs.terra
  • Track: Spam
  • Year: 2005
  • Submission: 8/23/2005
  • Task: run

pucrs2spam25

  • Run ID: pucrs2spam25
  • Participant: puc-rs.terra
  • Track: Spam
  • Year: 2005
  • Submission: 8/23/2005
  • Task: run

pucrs2spam50

  • Run ID: pucrs2spam50
  • Participant: puc-rs.terra
  • Track: Spam
  • Year: 2005
  • Submission: 8/23/2005
  • Task: run

tamSPAM1dte

  • Run ID: tamSPAM1dte
  • Participant: masseyu.meyer
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: No prior training. This is SpamBayes 1.1 (see http://spambayes.org for details, including the source), using all default options and doing train-on-everything (with both cutoffs set to 0.6). SpamBayes is a standard statistical classifier, using the now-common Fisher/chi-squared combining technique and various tokenization rules.
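
As a rough illustration of Fisher/chi-squared combining with a single 0.6 cutoff, the sketch below combines per-token spam probabilities into one score. It is written from the general description of the technique rather than copied from the SpamBayes source; the function names are hypothetical, token probabilities are assumed to lie strictly between 0 and 1, and treating a score of exactly 0.6 as spam is an assumption.

```python
import math


def chi2q(x2, v):
    """Upper tail probability of a chi-squared variable with even v degrees of freedom."""
    assert v % 2 == 0
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, v // 2):
        term *= m / i
        total += term
    # Guard against rounding pushing the sum slightly above 1.
    return min(total, 1.0)


def chi_combine(probs):
    """Combine per-token spam probabilities (each in (0, 1)) into one score in [0, 1]."""
    n = len(probs)
    if n == 0:
        return 0.5  # no evidence either way
    # Spam evidence: tests whether the probabilities are skewed towards 1.
    s = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    # Ham evidence: tests whether the probabilities are skewed towards 0.
    h = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (s - h + 1.0) / 2.0


def label(score, cutoff=0.6):
    # With the ham and spam cutoffs equal there is no "unsure" band.
    return "spam" if score >= cutoff else "ham"
```

With both cutoffs at the same value, every message is forced into ham or spam, which is what a two-class evaluation needs.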

tamSPAM1full

  • Run ID: tamSPAM1full
  • Participant: masseyu.meyer
  • Track: Spam
  • Year: 2005
  • Submission: 9/10/2005
  • Task: run

tamSPAM1h25

  • Run ID: tamSPAM1h25
  • Participant: masseyu.meyer
  • Track: Spam
  • Year: 2005
  • Submission: 9/10/2005
  • Task: run

tamSPAM1h50

  • Run ID: tamSPAM1h50
  • Participant: masseyu.meyer
  • Track: Spam
  • Year: 2005
  • Submission: 9/10/2005
  • Task: run

tamSPAM1s25

  • Run ID: tamSPAM1s25
  • Participant: masseyu.meyer
  • Track: Spam
  • Year: 2005
  • Submission: 9/10/2005
  • Task: run

tamSPAM1s50

  • Run ID: tamSPAM1s50
  • Participant: masseyu.meyer
  • Track: Spam
  • Year: 2005
  • Submission: 9/10/2005
  • Task: run

tamSPAM2ber

  • Run ID: tamSPAM2ber
  • Participant: masseyu.meyer
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: This is SpamBayes 1.1 (see http://spambayes.org for full details, including the source), with the use_bigrams option enabled and doing train-on-error. Both cutoffs are set to 0.6. The use_bigrams option does both unigram and bigram tokenization, and uses a windowing technique to avoid using tokens twice (i.e. once in a unigram and once in a bigram). SpamBayes is a standard statistical classifier, with various tokenization methods, using the now-common Fisher/chi-squared combination method.

tamSPAM2full

  • Run ID: tamSPAM2full
  • Participant: masseyu.meyer
  • Track: Spam
  • Year: 2005
  • Submission: 9/10/2005
  • Task: run

tamSPAM2h25

  • Run ID: tamSPAM2h25
  • Participant: masseyu.meyer
  • Track: Spam
  • Year: 2005
  • Submission: 9/10/2005
  • Task: run

tamSPAM2h50

  • Run ID: tamSPAM2h50
  • Participant: masseyu.meyer
  • Track: Spam
  • Year: 2005
  • Submission: 9/10/2005
  • Task: run

tamSPAM2s25

  • Run ID: tamSPAM2s25
  • Participant: masseyu.meyer
  • Track: Spam
  • Year: 2005
  • Submission: 9/10/2005
  • Task: run

tamSPAM2s50

  • Run ID: tamSPAM2s50
  • Participant: masseyu.meyer
  • Track: Spam
  • Year: 2005
  • Submission: 9/10/2005
  • Task: run

tamSPAM3bex

  • Run ID: tamSPAM3bex
  • Participant: masseyu.meyer
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: SpamBayes 1.1 (see http://spambayes.org for full details, including source), with the use_bigrams option enabled (see description for run #2) and using train-to-exhaustion. Train-to-exhaustion (see Gary Robinson's blog for full details) repeatedly trains on the ham/spam corpus until all messages are correctly classified. None of the SpamBayes submissions have any prior training.
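
The train-to-exhaustion loop can be sketched roughly as follows. This is an illustration only: `ToyClassifier` is a hypothetical stand-in rather than SpamBayes, training only on misclassified messages in each pass is an assumption about how the rounds proceed, and the `max_rounds` cap is added so that the loop always terminates.

```python
from collections import Counter


class ToyClassifier:
    """Hypothetical stand-in classifier: votes by per-class token counts."""

    def __init__(self):
        self.counts = {"ham": Counter(), "spam": Counter()}

    def train(self, tokens, label):
        self.counts[label].update(tokens)

    def classify(self, tokens):
        spam = sum(self.counts["spam"][t] for t in tokens)
        ham = sum(self.counts["ham"][t] for t in tokens)
        return "spam" if spam > ham else "ham"


def train_to_exhaustion(clf, corpus, max_rounds=10):
    """Pass over the corpus repeatedly, training on errors, until a pass has none.

    Returns the number of passes made.
    """
    for round_no in range(1, max_rounds + 1):
        errors = 0
        for tokens, label in corpus:
            if clf.classify(tokens) != label:
                clf.train(tokens, label)
                errors += 1
        if errors == 0:
            return round_no
    return max_rounds
```

The cap matters in practice because a corpus containing contradictory examples might never be classified perfectly.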

tamSPAM3full

  • Run ID: tamSPAM3full
  • Participant: masseyu.meyer
  • Track: Spam
  • Year: 2005
  • Submission: 9/10/2005
  • Task: run

tamSPAM3s25

  • Run ID: tamSPAM3s25
  • Participant: masseyu.meyer
  • Track: Spam
  • Year: 2005
  • Submission: 9/10/2005
  • Task: run

tamSPAM4aex

  • Run ID: tamSPAM4aex
  • Participant: masseyu.meyer
  • Track: Spam
  • Year: 2005
  • Submission: 7/28/2005
  • Type: final
  • Task: filter
  • Run description: SpamBayes 1.1 (see http://spambayes.org for full details, including source), with all possible tokenization options turned on and using train-to-exhaustion (see description for run #3). See previous run descriptions for more details.

tamSPAMpplt

  • Run ID: tamSPAMpplt
  • Participant: masseyu.meyer
  • Track: Spam
  • Year: 2005
  • Submission: 6/30/2005
  • Type: pilot
  • Task: filter
  • Run description: SpamBayes 1.1. Uses the chi-squared combining statistical method, with both thresholds set at 0.6 to eliminate the unsure range. No prior training.

yorSPAM1full

  • Run ID: yorSPAM1full
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAM1ham2

  • Run ID: yorSPAM1ham2
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAM1ham5

  • Run ID: yorSPAM1ham5
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAM1knf

  • Run ID: yorSPAM1knf
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Type: final
  • Task: filter
  • Run description: Our pilot filter is based on the k-nearest neighbor classification method. We use a collection of public and private emails as an initial set of training data. Based on this training data, an initial set of features is selected according to an information gain measure. The feature values are computed from local term frequencies. Data preprocessing is applied to both the training data and the test emails. We use a decoding program to parse the content of an email, retaining only the text part, including the decoded text body. The parsed email is then tokenized by first splitting on white space and then removing most non-alphanumeric characters, keeping a few of them. When tokenizing, we remove stop words and stem the remaining words if necessary. During classification, the training data set is maintained by removing examples that constantly cause wrong classifications and by adding each test example to the training set. The weighted Euclidean distance measure is used to find the nearest neighbors. To adapt to the evolution of spam, we retrain our model periodically. The retained instances are then re-represented from scratch to reflect the updated features learnt.
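
The core classification step described above can be sketched as follows: term-frequency feature vectors, a weighted Euclidean distance, and a majority vote among the k nearest training examples. The feature list, the weights, and the value of k are all assumptions for illustration, since the description does not specify them, and the function names are hypothetical.

```python
import math
from collections import Counter


def tf_vector(tokens, features):
    """Local term-frequency vector over a fixed, pre-selected feature list."""
    counts = Counter(tokens)
    return [counts[f] for f in features]


def weighted_euclidean(a, b, weights):
    """Euclidean distance with a per-feature weight on each squared difference."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def knn_classify(query, training, weights, k=3):
    """Majority vote among the k training examples nearest to the query vector.

    `training` is a list of (vector, label) pairs.
    """
    nearest = sorted(
        training, key=lambda ex: weighted_euclidean(query, ex[0], weights)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

In this sketch the weights could, for instance, be derived from each feature's information gain, but that link is a guess rather than something the description states.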

yorSPAM1spa2

  • Run ID: yorSPAM1spa2
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAM1spa5

  • Run ID: yorSPAM1spa5
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAM2full

  • Run ID: yorSPAM2full
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAM2ham2

  • Run ID: yorSPAM2ham2
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAM2ham5

  • Run ID: yorSPAM2ham5
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAM2ruk

  • Run ID: yorSPAM2ruk
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Type: final
  • Task: filter
  • Run description: Our filter reuses SpamAssassin's code to do rule analysis, and SA returns a preliminary result. The modified SA passes the parsed text to kNN, which then classifies the message again. The final result is determined based on both SA and kNN. For training, we use auto-learning and then train on errors.

yorSPAM2spa2

  • Run ID: yorSPAM2spa2
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAM2spa5

  • Run ID: yorSPAM2spa5
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAM3full

  • Run ID: yorSPAM3full
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAM3ham2

  • Run ID: yorSPAM3ham2
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAM3ham5

  • Run ID: yorSPAM3ham5
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAM3knf

  • Run ID: yorSPAM3knf
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Type: final
  • Task: filter
  • Run description: Our pilot filter is based on the k-nearest neighbor classification method. We use a collection of public and private emails as an initial set of training data. Based on this training data, an initial set of features is selected according to an information gain measure. The feature values are computed from local term frequencies. Data preprocessing is applied to both the training data and the test emails. We use a decoding program to parse the content of an email, retaining only the text part, including the decoded text body. The parsed email is then tokenized by first splitting on white space and then removing most non-alphanumeric characters, keeping a few of them. When tokenizing, we remove stop words and stem the remaining words if necessary. During classification, the training data set is maintained by removing examples that constantly cause wrong classifications and by adding each test example to the training set. The weighted Euclidean distance measure is used to find the nearest neighbors. To adapt to the evolution of spam, we retrain our model periodically. The retained instances are then re-represented from scratch to reflect the updated features learnt. This filter differs from the first one in its pre-processing methods and retraining parameters.

yorSPAM3spa2

  • Run ID: yorSPAM3spa2
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAM3spa5

  • Run ID: yorSPAM3spa5
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAM4full

  • Run ID: yorSPAM4full
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAM4ham2

  • Run ID: yorSPAM4ham2
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAM4ham5

  • Run ID: yorSPAM4ham5
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAM4hyb

  • Run ID: yorSPAM4hyb
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 7/29/2005
  • Type: final
  • Task: filter
  • Run description: This filter is a combination of kNN, NaiveBayes and RandomForest tree learning.

yorSPAM4spa2

  • Run ID: yorSPAM4spa2
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAM4spa5

  • Run ID: yorSPAM4spa5
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 9/12/2005
  • Task: run

yorSPAMpknn

  • Run ID: yorSPAMpknn
  • Participant: yorku.huang
  • Track: Spam
  • Year: 2005
  • Submission: 7/5/2005
  • Type: pilot
  • Task: filter
  • Run description: Our pilot filter is based on the k-nearest neighbor classification method. We use a collection of public and private emails as an initial set of training data. Based on this training data, an initial set of features is selected according to an information gain measure. The feature values are computed according to the TF-IDF measure. Data preprocessing is applied to both the training data and the test emails. We use a decoding program to parse the content of an email, retaining only the text part, including the decoded text body. The parsed email is then tokenized using non-alphabetic characters as separators. During classification, the training data set is maintained by removing examples that constantly cause wrong classifications and by adding each test example to the training set. The Euclidean distance measure is used to find the nearest neighbors.
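
The tokenization and TF-IDF weighting steps mentioned above might look roughly like the sketch below. The exact TF-IDF variant is not given in the description, so the raw-count times log(N/df) form used here is an assumption, as are the lowercasing step and the function names.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Split on any run of non-alphabetic characters; lowercasing is an assumption.
    return [t for t in re.split(r"[^A-Za-z]+", text.lower()) if t]


def tfidf_vectors(docs):
    """Return (vocab, vectors) with tf * log(N / df) weights, one vector per document."""
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: number of documents that contain each token.
    df = Counter(t for toks in tokenized for t in set(toks))
    n = len(docs)
    vocab = sorted(df)
    idf = {t: math.log(n / df[t]) for t in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append([tf[t] * idf[t] for t in vocab])
    return vocab, vectors
```

Note that splitting on non-alphabetic characters breaks an obfuscated word such as "V1agra" into the fragments "v" and "agra", which is one visible consequence of this tokenization rule.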