Mining the Web
Tuesday, November 16, 2010
Using Language Models for Information Retrieval (Python)
Continuing my regard for Bayes' Theorem, I decided to write a small Python program that will crawl a list of URLs and then will allow t...
193 comments:
Sunday, November 14, 2010
Bayes' theorem: A Love Story
A few days ago, I saw the video of Hilary Mason presenting the history of machine learning. As far as I know, she has covered the signifi...
13 comments:
Sunday, October 17, 2010
Text Classification using Naive Bayes Classifier
I received some emails about my spam filter post. Some of them asked me to share the code for it. A very simple implementation o...
2 comments:
Sunday, September 26, 2010
DT-Tree: A Semantic Representation of Scientific Papers
This year I have been really busy working on some research projects, hence the delay in blog posts. Recently, one of my research works got a...
1 comment:
Saturday, March 20, 2010
Creating Spam Filter using Naive Bayes Classifier
A few months ago I gave a lecture to CS students about data mining. I decided to show how a spam filter can be built using simple data mining ...
80 comments:
Friday, January 29, 2010
Parsing Robots.txt File
Crawling is an essential part of search engine development. The more sites a search engine crawls, the bigger its index will be. However, a ...
8 comments:
Monday, November 23, 2009
Duplicate Detection using MD5 and Jaccard Coefficient in C#
Duplicate documents are a significant issue in the context of the Web. Detecting near-duplicate documents in a large data set, like the Web, is a ...
6 comments: