Sunday, November 14, 2010

Bayes' Theorem: A Love Story

A few days ago, I watched a video of Hilary Mason presenting the history of machine learning. As far as I know, she covered the significant developments in the field in the simplest, clearest and most casual way possible. She also shed some light on the fundamental math and algorithmic tools employed in the field.

Hilary suggested that, after the AI winter, the best thing that happened in computer science was computer scientists invading the field of statistics. That is, they started using probability theory to build applications and solve problems. She went on to give an example of how Bayes' theorem was used (and can be used) to classify text, detect spam, etc. I can very much identify with her point.
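For reference, here is how Bayes' theorem is usually written for the spam case. The symbols are illustrative and not taken from the talk: $W$ stands for the words $w_1, \dots, w_n$ in a message, and the second line is the "naive" assumption that words are independent given the class.

```latex
P(\text{spam} \mid W) = \frac{P(W \mid \text{spam})\, P(\text{spam})}{P(W)}
\qquad
P(W \mid \text{spam}) \approx \prod_{i=1}^{n} P(w_i \mid \text{spam})
```

Since $P(W)$ is the same for both classes, a classifier only needs to compare the numerators for "spam" and "not spam".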

When I started my MS in computer science, I was supposed to take courses like Software Design, Advanced DB, and Advanced Operating Systems in my first semester. I couldn't get into the Advanced Operating Systems class, so I enrolled in the Data Mining class instead. The data mining techniques, the algorithms that solve complex problems, and the simple act of finding interesting patterns in raw data just blew me away. Needless to say, taking DM was probably the best decision of my academic career. I changed my concentration from "Software Engineering" to "Machine Intelligence", worked on making applications smarter (rather than just building a smart application to pass a course), and read about more interesting concepts than were covered in class.

My first DM project was building a spam filter using Bayes' theorem. Even though I built much more complex DM applications after that, it remains one of my favorite things I have done. I remember that presenting the project was optional and earned extra credit (the project executable and report, however, accounted for 40% of the grade). I was doing pretty well in that class, but I really wanted to present my spam filter. I had never felt so excited about implementing and presenting a project, so I knew this was something special. Hence, even today, when I take on projects that require more sophisticated algorithms such as SVMs, I still run a quick test using Bayes' theorem just to get some insight. Maybe it is not the correct approach, but I feel it has helped me fine-tune the algorithms I end up using.
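To give a feel for how little code such a filter needs, here is a minimal sketch of a naive Bayes spam classifier. It is not the original class project. The class name `NaiveBayesSpamFilter`, the whitespace tokenizer, the add-one (Laplace) smoothing, and the toy training messages are all assumptions made for illustration, and the sketch assumes each class has at least one training message.

```python
import math
from collections import Counter


def tokenize(text):
    # Simplifying assumption: lowercase and split on whitespace.
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Toy naive Bayes text classifier with add-one smoothing (illustrative only)."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = {label: 0 for label in self.LABELS}

    def train(self, text, label):
        # Count one more document and its words for the given class.
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def _log_score(self, tokens, label, vocab_size):
        # log P(label) + sum of log P(word | label), using logs to avoid underflow.
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        for token in tokens:
            count = self.word_counts[label][token]
            # Add-one smoothing so an unseen word does not zero out the product.
            score += math.log((count + 1) / (total_words + vocab_size))
        return score

    def classify(self, text):
        tokens = tokenize(text)
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        scores = {
            label: self._log_score(tokens, label, len(vocab))
            for label in self.LABELS
        }
        return max(scores, key=scores.get)


# Toy usage with made-up messages.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win free money now", "spam")
spam_filter.train("free prize click now", "spam")
spam_filter.train("meeting at noon tomorrow", "ham")
spam_filter.train("project report due tomorrow", "ham")

print(spam_filter.classify("free money"))                # spam
print(spam_filter.classify("project meeting tomorrow"))  # ham
```

A real filter would need a much larger labeled corpus and a better tokenizer, but the core of the approach is just these word counts and a comparison of two log scores.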

Being a big fan of all things Bayes, I wanted the undergrad and grad students in my college to be excited about using statistics in their projects as well. So I gave a short tutorial on creating a spam filter with naive Bayes during one of the ACM open house meetings. I did that because I realized that not every CS student enrolls in data mining. The meeting was pretty successful, and I like to believe that I got at least a handful of students excited about using Bayes' theorem.

So if you are an undergrad (or even a grad student) and you are thinking about what project to work on, I suggest that you look up some machine learning techniques and see how you can use them to build applications related to your course. There is no better feeling than seeing your application run through mountains of raw (unstructured) data and then provide you with amazing insights and solutions.
