Sunday, February 17, 2008

IR or Spam Filter

I haven't been updating my blog lately. The commencement of the spring 2008 semester has to do something with that but the real reason is that I have been busy researching some interesting topics for my AI project.

Last semester I wanted to create an Image Spam Filter for my "Data mining & Pattern Recognition" class. My theory was to apply OCR techniques to capture text from the image, after which it really becomes a text spam-filter problem. I was thinking of using Neural Networks for Image Text recognition and Bayes' Theorem for spam-filtering. But my idea was unanimously rejected by my group and instead we developed a Handwriting Recognition system (which was still a better project to work on than our previous plan to tell time by reading an image of analogue clock.)

So now I have a chance to have another go at my Image Spam Filter project. But I am still debating about it. The reason is that I am also fond of Information Retrieval problems. I am thinking of working on a Search engine and automatic indexing of a technical book/manual. Since this is an individual project I can do whatever I want (Of course my professor has to approve it.)

If I think about it, I find both, IR and Spam Filter, interesting. So it is not a clash of interest but more of what will I gain from them and what I want to do in future.

While I try to analyze this, feel free to give me your suggestions. Maybe you can give me an insight which may eventually help me reach a decision.