Hack ‘n Jill Initial Impressions
My girlfriend has been learning how to program lately. She first completed the wonderful Intro to Computer Science course on Udacity and has recently moved on to exploring web development and design with the same folks. This weekend she decided to participate in her first “hackathon” to test her new skills! The hackathon is sponsored by Hack ‘n Jill, a group working toward eliminating the disgusting gender imbalance in tech. I love the idea, and furthermore, I love the “punny” name of the event, “Hacksgiving”.
I’ll be helping her along as best I can. Her group has decided to improve the discoverability of projects on the charity website, Donors Choose. They’re putting together an interface that lets donors better explore projects.
Their team wanted to build a recommendation engine from scratch. DJ Patil has a great essay on building data projects, and one of its great points is “spend the shortest amount of time/effort learning if your users want your product.” In a 24-hour hackathon, one is obviously in the limiting case of maximizing efficiency. Maybe we can take a peek at the data and gather some evidence that our idea is reasonable.

Above is a quick plot showing some support for our hypothesis. Each circle is a user, and the axes show the number of donations for the user (x) vs the number of projects donated to (y). Users on the y = 1 line are those who have donated to only a single project (but possibly many times). Those falling on the y = x line donate to a different project with every donation. The dashed line shows the line of best fit, lending some credibility to the idea that users who donate to multiple projects often spread their donations around. THEREFORE(!) a better method for discovery is somewhat reasonable.
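For the curious, here is a rough sketch of how the points and the dashed line in a plot like this could be computed. It is not the code behind the actual figure: the `(donor, project)` pair format, the function names `donor_points` and `best_fit_line`, and the tiny sample below are all made up for illustration, and the real Donors Choose data will have its own layout.

```python
from collections import defaultdict


def donor_points(donations):
    """Map each donor to (number of donations, number of distinct projects).

    `donations` is an iterable of (donor_id, project_id) pairs, one pair per
    donation. This input format is an assumption for the sketch.
    """
    counts = defaultdict(int)
    projects = defaultdict(set)
    for donor, project in donations:
        counts[donor] += 1
        projects[donor].add(project)
    return {donor: (counts[donor], len(projects[donor])) for donor in counts}


def best_fit_line(points):
    """Ordinary least-squares slope and intercept for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up sample: one (donor, project) pair per donation.
donations = [
    ("ann", "p1"), ("ann", "p1"), ("ann", "p1"),                  # sits on y = 1
    ("bob", "p1"), ("bob", "p2"),                                 # sits on y = x
    ("cat", "p3"), ("cat", "p4"), ("cat", "p5"), ("cat", "p5"),   # in between
    ("dan", "p2"),
]

points = donor_points(donations)
slope, intercept = best_fit_line(list(points.values()))
print(points)
print(slope, intercept)
```

A slope near 1 would mean donors pick a new project almost every time they give, while a slope near 0 would mean they keep coming back to the same one. Anything clearly above 0 is at least weak evidence that people spread their donations around.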