September 4th 08, 10:01 PM posted to rec.aviation.piloting
alexy
external usenet poster
 
Posts: 53
Bayesian filtering of this group's riff raff.

Tman x@x wrote:

Kind of on topic, let me tell ya.
Anyone happen across a good Usenet filter that uses Bayesian filtering
or something similar to filter out unwanted articles based upon a user's
(or a community's) preference?

For those that don't know, these are commonly used in email spam
filters. I think it would work better than a crude old kill file.
It'll identify keywords, or patterns and attributes of messages that you
like (want to see), and those that you don't. It'll learn from your
feedback and then get smart enough to classify the messages. It works
_really well_ for email, in spite of spammers trying to thwart it.
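
To make the idea concrete, here is a minimal sketch of a two-class naive Bayes filter with add-one smoothing. It is not taken from any existing news or mail filter; the class name, the labels "want" and "unwanted", and the tokenizer are all made up for illustration, and a real filter would also look at headers and use far more training data.

```python
import math
import re
from collections import Counter


class NaiveBayesFilter:
    """Tiny two-class naive Bayes text classifier (illustrative only)."""

    def __init__(self):
        # Per-label word frequencies and number of training articles.
        self.word_counts = {"want": Counter(), "unwanted": Counter()}
        self.doc_counts = {"want": 0, "unwanted": 0}

    @staticmethod
    def tokenize(text):
        # Hypothetical tokenizer: lowercase words, digits and apostrophes.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        # This is the "learn from your feedback" step.
        self.word_counts[label].update(self.tokenize(text))
        self.doc_counts[label] += 1

    def score(self, text, label):
        # Log of P(label) * product of P(word | label), add-one smoothed.
        vocab = set(self.word_counts["want"]) | set(self.word_counts["unwanted"])
        total_docs = sum(self.doc_counts.values())
        logp = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        total_words = sum(self.word_counts[label].values())
        for word in self.tokenize(text):
            count = self.word_counts[label][word]
            logp += math.log((count + 1) / (total_words + len(vocab) + 1))
        return logp

    def classify(self, text):
        # Pick whichever label explains the article better.
        return max(self.doc_counts, key=lambda label: self.score(text, label))
```

Usage would be a loop of `train(article, "want")` or `train(article, "unwanted")` calls as you read, followed by `classify(article)` on new articles.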

This could be even more powerful by having a (secure) community of users
with common interests within the subset of RAP vote to train the
classifier with more volume and less user effort. Would need to think
about vandalism and make sure it is secure, however.

Anything like this out there for NNTP / news? I'd appreciate any
leads. Guess I can just google this, and I will now, but I'm also
interested in what feedback there might be.

RAP motivates me to write a crude one and see how it works, perhaps by
adopting pieces of spamassassin (a classifier commonly used for email).

T


Popfile, which I use to classify emails, has an NNTP client proxy
component. I haven't used it, but it might be worth a try. For email,
what I like about popfile is that it allows multiple classifications
(all using Bayesian filtering), not just spam/nonspam. I use it to
classify mail as
spam/personal/business/client/shopping/bills/newsletters/othernonspam/unclassified.
Currently running about 97% accuracy, and most of the 3% misfiled are
"false negatives", i.e. "unclassified", not ones that have been
classified incorrectly.
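
For anyone curious how multiple buckets plus an "unclassified" fallback might work, here is a rough sketch. This is not Popfile's actual code or algorithm; the class name, the `min_confidence` parameter, and the 0.9 threshold are assumptions chosen for illustration. The idea is simply that a message goes to the most probable bucket only when that bucket wins clearly, and is otherwise left unclassified.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical tokenizer: lowercase words, digits and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


class BucketClassifier:
    """Multi-bucket naive Bayes with a low-confidence fallback (sketch)."""

    def __init__(self, min_confidence=0.9):
        self.words = defaultdict(Counter)   # bucket -> word frequencies
        self.docs = Counter()               # bucket -> trained messages
        self.min_confidence = min_confidence

    def train(self, text, bucket):
        self.words[bucket].update(tokenize(text))
        self.docs[bucket] += 1

    def probabilities(self, text):
        vocab = set().union(*self.words.values())
        total_docs = sum(self.docs.values())
        log_scores = {}
        for bucket in self.docs:
            total_words = sum(self.words[bucket].values())
            logp = math.log(self.docs[bucket] / total_docs)
            for word in tokenize(text):
                count = self.words[bucket][word]
                logp += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_scores[bucket] = logp
        # Turn log scores into probabilities that sum to 1.
        top = max(log_scores.values())
        weights = {b: math.exp(s - top) for b, s in log_scores.items()}
        norm = sum(weights.values())
        return {b: w / norm for b, w in weights.items()}

    def classify(self, text):
        if not self.docs:
            return "unclassified"
        probs = self.probabilities(text)
        best = max(probs, key=probs.get)
        # Below the threshold, prefer "unclassified" over a likely misfile.
        return best if probs[best] >= self.min_confidence else "unclassified"
```

Raising `min_confidence` trades more "unclassified" messages for fewer outright misfiles, which matches the kind of error mix described above.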

Find Popfile on SourceForge.

Report back if you find it or another program to work well for this.

There is another approach. Newsproxy (aka nfilter) is an old program
no longer supported by its author that provides rules-based filtering
_before_ your client, and with much more flexibility than any news
client I have seen.
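
As a rough illustration of the rules-based approach (this is not Newsproxy's rule syntax, which I am not reproducing here; the `Rule` fields and the `apply_rules` function are invented for this sketch), a filter sitting between the server and the client could check each article's headers against an ordered list of rules and act on the first match:

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    header: str    # header name to inspect, e.g. "From" or "Subject"
    pattern: str   # regular expression matched against that header
    action: str    # "drop" or "keep"


def apply_rules(headers, rules, default="keep"):
    """Return the action of the first matching rule, else the default."""
    for rule in rules:
        value = headers.get(rule.header, "")
        if re.search(rule.pattern, value, re.IGNORECASE):
            return rule.action
    return default


# Example rule set: drop one poster, always keep one topic,
# and drop anything crossposted to four or more groups.
rules = [
    Rule("From", r"troll@example\.com", "drop"),
    Rule("Subject", r"\bbayesian\b", "keep"),
    Rule("Newsgroups", r",.*,.*,", "drop"),
]
```

Because the first match wins, rule order matters: put the most specific rules first.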
--
Alex -- Replace "nospam" with "mail" to reply by email. Checked infrequently.