# Machine Learning: A Love Story

**Posted:** November 10, 2010

**Author:** hilary

**Filed under:** academics, blog, Presentations

**Tags:** conferences, machinelearning, presentations, video

The video from my keynote at Strange Loop 2010 is up!

You can watch the video here: Machine Learning: A Love Story

The original abstract:

Machine learning has come a long way in recent years — from a long-marginalized field so old it still has the word “machine” in the name, to the last, best hope for making sense of our massive flows of data.

The art of ‘data science’ is asking the right questions; the answers are generally trivial or impossible. This talk will focus more on questions than on answers. I’ll give a brief history of the field with a focus on the fundamental math and algorithmic tools that we use to address these kinds of problems, then walk through several descriptive and predictive scenarios.

Finally, I’ll show one example system using bit.ly data in-depth, from the backend infrastructure through the algorithms and data processing layer to show a functioning product.

Attendees should expect to hear some good stories of data gone right and data gone awry, and walk away with a few new clever tricks.

The presentation was calibrated for the audience in the room, but I’ll be happy to answer any questions in the comments below!

Please explain how “data science” differs from plain science.

The same way computer science does.

Hey Hilary,

Sounds like a great presentation! Any chance the slides will find their way to Slideshare?

Thanks,

Jeff

Very, very nice talk! Super interesting details about Twitter classification and “streamlining”; we are working on similar problems/set of (Bayesian) algorithms in FUSE Labs (http://fuse.microsoft.com). Would love to talk more with you about your ideas – please email me if you are interested!

Slides are available on the same site (see the link at the top); use

http://www.bugmenot.com/view/infoq.com

For others asking about the slides: they are pretty sparse, so the video might be the way to go.

This is so false: “The art of ‘data science’ is asking the right questions; the answers are generally trivial or impossible”.

Unfortunate. Aggregations have blurred your innovation.

I like your presentation. Do you use the system of elimination?

In the medical world, that is a really good way to make a diagnosis.

I caught only a piece of your NPR discussion. You said you collected information/data about people’s thinking on clothing styles. Can I get access to this information? Would this be something like focus groups?

When was machine learning a marginalized field?

Whenever and wherever they still call it statistics!

Actually, more seriously, it took some big historical hits for being rather too optimistic and then taking 20 years to come through on a fraction of what was promised.

I feel like machine learning is relevant these days in part *because* it embraced statistics.

I was joking a bit, since there’s a tendency for statistical techniques to become far more interesting and monetizable the moment you call them ML. In all honesty you’re absolutely right; the sorts of probability calculi that were developed in statistical fields are an enormous boon to ML, generating tons of models, providing a flexible framework to create new ones within, and even giving powerful interpretations of models that were generated without considering the probabilistic interpretation.

Any discriminator between the two fields is liable to misclassify heavily at the best of times and still be very non-robust.

Could you put links to the books you mentioned?

Is this the purple book you mentioned? http://www.amazon.com/dp/0321321367

I think the books that I mentioned in the talk were:

Toby Segaran’s Programming Collective Intelligence: http://amzn.to/fMpamC

and Data Mining: Concepts and Techniques (this one is getting a bit old): http://amzn.to/dUC1Nw

and Bishop’s Pattern Recognition and Machine Learning: http://amzn.to/hfbNpR

I haven’t read the book that you linked to above.

Thanks.

I think you forgot:

Machine Learning (Thomas Mitchell)

http://www.amazon.com/dp/0071154671

I’m glad I asked about the “data mining book with the purple cover” (your words), because I would have bought the wrong one.

Ahh yes, Tom Mitchell’s book is a classic. Thanks!

What about “The Elements of Statistical Learning”? http://www.amazon.com/The-Elements-Statistical-Learning-Prediction/dp/0387848576 – do you rate it?

Newer version of the “data mining book with the purple cover” now too http://www.amazon.com/Data-Mining-Concepts-Techniques-Management/dp/0123814790

There’s a new version of that one coming out in March 2014 http://www.amazon.com/Introduction-Data-Mining-Pang-Ning-Tan/dp/0133128903

The camera followed you the whole time and didn’t show your slides. Where is the other half of your presentation?

You can download the PDF of the slides here: http://bit.ly/k37Zpc

I’m happy to send the PPT if you prefer, just send me your e-mail address (the contact form on this site goes straight to my e-mail).

Awesome #machinelearning stuff from @hmason that held my attention from beginning to end.

[…] video is an instructional take and builds on the material I covered in my Strange Loop 2010 keynote Machine Learning: A Love Story and the Data Bootcamp I did with Joe Adler, Drew Conway, and Jake Hofman at the Strata Conference […]

[…] in watching an entertaining and insightful presentation on Machine Learning, take a look at Machine Learning: A Love Story by Hilary […]

I loved that you said that “Yes, probability and statistics should be taught. It should be taught early, it should be taught well and I would even argue that it should be taught instead of pre-calculus.” – I completely agree and in fact we were discussing this from the point of view of teaching biology and how probability, statistics and data analysis/bioinformatics should be integral to biology degrees from the start!

Totally agree with you on that. I took Bio 101 and Bio 102, then business stats. Needless to say, taking stats along with, or before, biology would have made research papers much more fun.