It’s easy to believe that other people use social networks in the same way that you do. Your friends largely do use them the same way, which only makes your perspective more biased.
Unfortunately, most networks don’t provide a way to explore representative communications that you’re not connected to.
Well, now you can! One random tweet, please.
Update: There were some slight technical difficulties due to hitting Twitter’s oembed rate limit. They should be repaired now.
(Note: between this and bookbookgoose.com I’m on a bit of a random kick lately. There’s a method to this madness!)
The introductory e-mail is a message where I introduce two (or more) people who have yet to meet each other. It generally takes the highly structured form, “Salutation A and B! A does X. B does Y. You should meet for reason Z. Valediction.”
I almost always do opt-in intros, where I’ll write to each party separately and make sure it’s okay if I share their information, and explain why I think it’s worth their time. I find this approach to be more respectful of people’s privacy and busy schedules.
That means that by the time they get the formal introduction, they generally know what’s going on. Still, I find these messages peculiarly stressful.
Stressful task? Check. Highly-structured output? Check. Repeating the same information over and over? Check. This calls for … a script!
You can grab the code on github here.
The first step is to set your valediction and names in the settings.py file, then to add people that you want to introduce with a brief description. Finally, you need only type something like:
python intro.py alan betty
to generate the following and copy it to your clipboard (on a Mac, anyway):
Alan & Betty, please meet. Alan is the fake director at fake company, where he does fake things. Betty is the fake person who does other fake interesting things. I think you'll find quite a lot to talk about. Cheers, Hilary
Paste it into your favorite e-mail client, send, and relax.
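To make the template concrete, here is a minimal Python 3 sketch of how such a generator could work. It is not the code from the repository: the `PEOPLE` dictionary, `VALEDICTION`, `MY_NAME`, and `make_intro` are hypothetical names standing in for whatever settings.py and intro.py actually contain.

```python
# Hypothetical stand-ins for the values configured in settings.py;
# the real file layout may differ.
MY_NAME = "Hilary"
VALEDICTION = "Cheers,"
PEOPLE = {
    "alan": ("Alan", "the fake director at fake company, where he does fake things"),
    "betty": ("Betty", "the fake person who does other fake interesting things"),
}


def make_intro(keys, people=PEOPLE):
    """Build 'Salutation A and B! A does X. B does Y. ... Valediction.'"""
    names = [people[key][0] for key in keys]
    salutation = " & ".join(names) + ", please meet."
    # One sentence per person: "<Name> is <description>."
    bios = ["%s is %s." % people[key] for key in keys]
    closing = "I think you'll find quite a lot to talk about."
    return " ".join([salutation] + bios + [closing, VALEDICTION, MY_NAME])


print(make_intro(["alan", "betty"]))
```

On a Mac, piping the output of a script like this to `pbcopy` puts the text on the clipboard.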
This is how I mentally organize introductions, but I have no idea if it’ll work for anyone else. Would you ever use something like this? What does it need to be useful for you?
Data scientists need data, and good data is hard to find. I put together this bitly bundle of research quality data sets to collect as many useful data sets as possible in one place. The list includes such exciting and diverse things as spam, belly buttons, item pricing, social media, and face recognition, so you know there’s something that will intrigue anyone.
Have one to add? Let me know!
(I’ve shared the bundle before, but this post can act as unofficial homepage for it.)
There must be a better way to explore books.
A random way to explore books would be a good way to start.
Hence, bookbookgoose. Browse randomly. Enjoy!
Hint: use the ‘n’ key to go forward quickly. I find about 0.2% of the books are awesome.
Update: you can now find @bookbookgoose on Twitter, sharing one random book per hour.
Update: Dustin Kurtz at Melville House had an eloquent writeup of the beauty in this random literature.
DataGotham is a celebration of the NYC data community, and will bring together professionals from all industries in New York that are built around data, from finance to fashion and from startups to the Fortune 500 and government. The event is September 13th – 14th at NYU, with tutorials and The Great Data Extravaganza Show (with cocktails!) at the Tribeca Rooftop Thursday evening, and a single track conference Friday. Our speakers and sponsors are all amazing. You can register now.
While DataGotham is definitely a labor of love, there are numerous reasons to do it. I believe that New York has a distinct data philosophy, the study of human behavior, that should be celebrated. We have a large population of local badass data hackers, and our community will only grow stronger if we can build relationships across the industry divides. Finally, there’s an opportunity for all of us to influence the future of data science, and this event will highlight some voices that might not otherwise be heard.
I hope to see you there!
(Also, anyone who made it this far can register with code “dataGothamist” for 25% off.)
Several months ago I was looking for a command-line solution for group bookmark sharing. I couldn’t find one, so I coded up a quick python script that runs on top of git. It’s very much a hack that takes advantage of git to manage users, preserve the URL, the tags, the description of the URL (in the commit message) and also includes the content itself (so it’s grep-able later). If you put it on github, you get the additional commenting and collaboration features. You can check out my original code here.
I’m very excited that Far McKon has picked up the project and has a great vision for where it can go. If you’re interested in hacking on it with him, let him know!
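For a sense of how a bookmark can ride on top of git, here is a rough Python 3 sketch of the idea described above. It is not the original script: the function names, the file-naming scheme, and the commit message layout are all assumptions made for illustration.

```python
import hashlib
import subprocess
from pathlib import Path


def bookmark_filename(url):
    """Stable file name for a URL (hypothetical naming scheme)."""
    return hashlib.sha1(url.encode("utf-8")).hexdigest() + ".txt"


def commit_message(url, tags, description):
    """Put the description, URL, and tags in the commit message."""
    return "%s\n\n%s\n\ntags: %s" % (description, url, ", ".join(tags))


def save_bookmark(repo, url, tags, description, content):
    """Store the page content in the repo (so it is grep-able) and commit it."""
    path = Path(repo) / bookmark_filename(url)
    path.write_text(content, encoding="utf-8")
    subprocess.run(["git", "add", path.name], cwd=repo, check=True)
    subprocess.run(
        ["git", "commit", "-m", commit_message(url, tags, description)],
        cwd=repo,
        check=True,
    )
    return path
```

Because each bookmark is an ordinary commit, git itself records who added it and when, and `git log --grep` or plain `grep` can search it later.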
bc is a command-line calculator that’s fast and has the capacity to do some fairly complex math.
Try it out on the command line:
echo '100 / 10' | bc -l
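The same one-liner can be driven from a script. The sketch below is one way to do it in Python 3, assuming `bc` is installed and on the PATH; `bc_eval` is a hypothetical helper name and this is not the tweetbc code.

```python
import shutil
import subprocess


def bc_eval(expression):
    """Send an expression to `bc -l` on stdin and return its output as text."""
    result = subprocess.run(
        ["bc", "-l"],
        input=expression + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Only try it when bc is actually available on this machine.
if shutil.which("bc"):
    print(bc_eval("100 / 10"))
```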
I released the code under GPL, and it’s available on github: http://github.com/hmason/tweetbc.
John Cook mentions the bot and makes some great observations in his post three surprises with bc.
Welcome! I’ve gotten several hundred e-mails about my e-mail management code. I do want to share it as soon as possible. Here are the answers to the most common questions.
Why separate scripts?
My philosophy is based on the unix command-line tool model: each script should be simple and useful alone, but when combined together they become extremely powerful.
Why don’t we have the code yet?!
I had no idea the talk would be shared beyond the couple hundred people in the audience or that it would be so popular! I started my position at bit.ly the same day I gave that IgniteNYC presentation, and I also have some other awesome projects that are competing for time.
I have to admit that the trained classifiers are all based on my personal data and were also trained mostly through tweaking in ipython. I need to finish a generic framework for people to train their own filters before I can publish that piece of the system. I promise, I’m working on it.
Keep nagging me — nagging works!
Are you going to commercialize your scripts / can I invest?
I have certainly thought about commercializing the application, but I’m uncomfortable asking people to give me access to their personal e-mail data (even if there are very interesting things to be learned by aggregate analysis).
Just imagine how much more creative, interesting work could be done if we could partially free the world from the e-mail workload… that alone is worth making the code open.
How does it work? What tech are you using?
The scripts run on my gmail account through IMAP (and should work with any IMAP interface, though I’m sure there is debugging to be done). They live on a Linode VPS and run individually via cron jobs.
I primarily use the gmail web interface (though I’ve flipflopped between Mail.app and Thunderbird for a while), and the only cost is that I have to manually reload the page to see new labels and new drafts appear.
Do your scripts go mad with power and e-mail inappropriately? Are you some kinda robot?
I have all of the scripts deposit suggested responses in the draft folder, and then I use the gmail “multiple inboxes” feature to keep the draft folder up in the UI. It’s very easy to go through and modify or delete responses before they are sent.
Of course, I only thought of that after one of the scripts DID go a bit mad. I’m still sorry about that, Mom.
I’m not a robot, though of course I would say that anyway! The point of the automation is to remove the stupid parts of e-mail and leave me free to personally address the interesting messages.
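As an illustration of the drafts-instead-of-sending approach, here is a minimal Python 3 sketch using the standard library's `imaplib`. It is not the unreleased code: `build_draft` and `save_draft` are hypothetical names, and the `[Gmail]/Drafts` mailbox name and `imap.gmail.com` host are assumptions about the account setup that I have not tested against a live account here.

```python
import imaplib
import time
from email.message import EmailMessage


def build_draft(sender, recipient, subject, body):
    """Compose a suggested reply as a standard e-mail message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def save_draft(username, password, msg, mailbox="[Gmail]/Drafts"):
    """Append the message to the drafts mailbox instead of sending it."""
    conn = imaplib.IMAP4_SSL("imap.gmail.com")  # assumed host
    try:
        conn.login(username, password)
        conn.append(
            mailbox,
            "\\Draft",
            imaplib.Time2Internaldate(time.time()),
            msg.as_bytes(),
        )
    finally:
        conn.logout()
```

Nothing leaves the account until a human opens the draft and hits send, which is the whole point of the design.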
If you’ve read this far, there are a few things I would love your feedback on:
What’s a kickass name for this project?
More important, which features/scripts are you most interested in seeing first? The nag script is about ready to go, but I’d like to know where to focus my time.
Over at NYC Resistor, it was getting cold, and we needed a doorbell so visitors wouldn’t be stranded outside when the building was locked. A standard wireless model didn’t work reliably (the space is on the fifth floor, just out of range), so various members generally resorted to writing their phone numbers on a sign on the front door when they were expecting guests.
Since almost everyone has a mobile phone already, an SMS-based solution seemed appropriate. In order to implement this we need two things:
- An SMS shortcode
- A system to notify when the shortcode is triggered
It’s irritating and expensive to acquire your own shortcode, but there are several services that will allow you to use one in exchange for a small fee or advertisements in your messages. TextMarks is my favorite (I used TextMarks for my WhereAmI project). While TextMarks markets their service as a system for mobile mailing lists, they allow you to reserve a keyword and define a behavior (that can include pulling data from a URL!) to occur when that keyword is triggered.
Sign up for TextMarks and choose a keyword. Configure the keyword to respond with the “First 120 characters on web page”, and point it at the future home of your script (you can always come back and modify this later).
Note the � as the value of the msg parameter — this instructs TextMarks to send along any additional message contents as the value of that parameter. That means if someone were to text 41411 “doorbell hi this is hilary”, TextMarks would call the script with the param msg=hi this is hilary. This can be quite useful.
This script is written in Python, but you can use any scripting language you like. This particular script just sends an e-mail to an account when the ‘doorbell’ is rung, but you could have it do pretty much anything up to and including ringing a real bell (which may be coming soon!).
#!/usr/bin/env python
# encoding: utf-8
"""
doorbell.py

Created by Hilary Mason, feel free to use this code in your own projects.
"""

import sys, os
import smtplib
import cgi
import cgitb; cgitb.enable()

class Doorbell(object):
    GMAIL_USERNAME = 'YOURGMAILACCOUNT@gmail.com'
    GMAIL_PASSWORD = 'YOURPASSWORD'

    def __init__(self, msg):
        # Headers start on the first line; a blank line separates them from the body.
        message = """From: YOURGMAILACCOUNT@gmail.com
To: YOURGMAILACCOUNT@gmail.com
Subject: KNOCK KNOCK, someone is at the door!

%s
""" % msg

        # Send the notification through Gmail's SMTP server over TLS.
        server = smtplib.SMTP('smtp.gmail.com:587')
        server.ehlo()
        server.starttls()
        server.ehlo()
        server.login(self.GMAIL_USERNAME, self.GMAIL_PASSWORD)
        server.sendmail('YOURGMAILACCOUNT@gmail.com', ['YOURGMAILACCOUNT@gmail.com'], message)
        server.quit()

        # This text is what TextMarks sends back to the person who texted.
        print("You knocked! You can also call us at 347-586-9270. <3, NYC Resistor")

if __name__ == '__main__':
    print("Content-Type: text/plain\n\n")
    form = cgi.FieldStorage()
    if 'msg' in form:
        w = Doorbell(form['msg'].value)
    else:
        w = Doorbell('There is an anonymous monkey at the door.')
And that's it! Provided you have your keyword configured to point at your script, and the script living at an accessible address, you'll get an e-mail whenever your SMS doorbell is rung and the person who sent the message will get back a cute response confirming their action.
This setup can be easily extended such that a message containing 'doorbell hilary' could e-mail only me, or be forwarded to my phone.
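One way that per-person routing might look is sketched below in Python 3. The `MEMBERS` directory, the `EVERYONE` address, and the `route` function are hypothetical, and the addresses are placeholders.

```python
# Hypothetical member directory; names and addresses are placeholders.
MEMBERS = {
    "hilary": "hilary@example.com",
}
EVERYONE = "everyone@example.com"


def route(msg, members=MEMBERS, default=EVERYONE):
    """Return (recipient, remaining_text) for an incoming doorbell message.

    If the first word of the message names a known member, the e-mail goes
    only to that member; otherwise it goes to the default address.
    """
    text = (msg or "").strip()
    words = text.split(None, 1)
    if words and words[0].lower() in members:
        rest = words[1] if len(words) > 1 else ""
        return members[words[0].lower()], rest
    return default, text
```

With this in place, the script would pass the `msg` parameter through `route` and use the returned address as the recipient of the e-mail.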
I'm curious to see if having a remotely accessible 'doorbell' will encourage pranksters -- we might need to add a password.