My NYC Python Meetup Presentation: Practical Data Analysis in Python
Posted: August 12, 2009 | Author: hilary | Filed under: blog | Tags: data, data analysis, nltk, presentations, python, spam, twitter
I gave a talk at the NYC Python Meetup on July 29 on Practical Data Analysis in Python.
I tend to use my slides for visual representations of the concepts I’m discussing, so a lot of the content from the presentation unfortunately won’t come through here.
The talk starts with the immense opportunities for deriving knowledge from data. I spent some time showing data systems ‘in the wild’ along with the appropriate algorithmic vocabulary (for example, Amazon.com’s ‘books you might like’ feature is a recommender system).
Once we can describe the problems properly, we can look for tools, and Python has many! Finally, in the fun part of the presentation, I demoed working code that uses NLTK to build a Twitter spam filter with 90% accuracy*.
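As a rough illustration of the kind of classifier a spam filter like this can be built on, here is a minimal naive Bayes sketch. It is not the code from the talk: it uses only the Python standard library rather than NLTK, the tweets in `TRAIN` are invented toy examples, and the names `tokenize`, `train`, and `classify` are hypothetical. It says nothing about how the 90% figure was reached.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training data, invented for illustration only.
TRAIN = [
    ("win a free iphone now click here", "spam"),
    ("free followers click this link now", "spam"),
    ("make money fast click here", "spam"),
    ("heading to the python meetup tonight", "ham"),
    ("great talk on data analysis today", "ham"),
    ("reading about nltk and python", "ham"),
]


def tokenize(text):
    """Lowercase the text and split on whitespace (deliberately naive)."""
    return text.lower().split()


def train(examples):
    """Count documents per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(text, model):
    """Return the label with the highest log-probability under naive Bayes."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # Log prior: fraction of training documents with this label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # skip words never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(classify("click here for a free iphone", model))
print(classify("python meetup talk tonight", model))
```

A real filter would need far more labeled tweets, better tokenization, and a held-out test set before any accuracy number means anything.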
Please let me know if you have questions or comments.
* I’ll post the code and training data shortly.