In data mining, predictive modeling is used to forecast future outcomes and behaviors, and people are paid big bucks to peer into their crystal balls for guidance. A few years ago, there was a surge in predictive modeling start-ups forecasting everything from market conditions and news events to presidential elections. Those start-ups included Hubdub, NewsFutures, Spigit, Inkling, and InTrade. (See an old interview: "Predictive markets get buzz over polls.")
Those start-ups have since evolved and pivoted, but today there's a new start-up on the block that's capturing the attention of data groupies and venture capitalists alike. It's called Kaggle (a name that is not based on any mathematical equation), a recently launched start-up with $11 million in Series A funding from Index Ventures and Khosla Ventures, as well as Stanford University's endowment, PayPal co-founder Max Levchin, Google's chief economist Hal Varian and now Google AdSense co-founder Gil Elbaz. Kaggle is solving real-world problems through competitions among the world's biggest brains.
Indeed, the competitive nature of the site reminds one of Zynga's playfulness. But the players aren't just playing Words with Friends or FarmVille; they're solving pretty hardcore intellectual problems, which makes Kaggle more like Zynga meets Mensa.
Kaggle brings together super smart data scientists, or just plain hyper-intelligent folks who love data and number crunching, with data sets and prizes for anyone who can make the most accurate predictions, from who will be admitted to hospitals next year, to how far each country will progress during the World Cup, to my favorite (because it's totally something I spend my days thinking about, not!): measuring distortions in galaxy images caused by dark matter.
Kaggle was the brainchild of data-loving Anthony Goldbloom, who worked on interest rate and fiscal policy forecasting models for the Australian government before sitting down and coding what would become Kaggle. Goldbloom soon partnered with Jeremy Howard, one of the leading contributors on Kaggle, who then became the company's chief scientist and president.
Both men moved to Silicon Valley from Australia and visited me at our studios. Watch our interview to hear how the company works, how it came to be, and what it's for.
Here are some highlights:
- There are 21,000 members, ranging from PhD data scientists to amateurs. Apparently, there's a lot of brain power to spare, and people who would otherwise be doing Sudoku puzzles are now setting their sights on the data sets at Kaggle.
- There have been 50 competitions so far on Kaggle. Check out the list of competitions.
- The way it works is that organizations set up competitions, such as predicting "Bodily Injury Liability Insurance claim payments based on the characteristics of the insured’s vehicle." The organizer of the competition gives Kaggle all the data from an actual "past" event. Kaggle then releases half the data to its members. The members try to determine what the outcome was based on the data given. Since Kaggle has the answer of what actually happened, it can compare that answer with the members' predictions. Whoever comes closest to the answer wins the prize.
- The beauty of Kaggle is that data is just data. So someone with expertise in studying glaciers can apply the same processes to studying distant galaxies, which means a lot more knowledge gets shared across fields.
- There's no typical company on Kaggle. Competition organizers range from insurance companies to healthcare providers to hedge fund managers.
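For the data-curious, the competition mechanics described above can be sketched in a few lines of Python. This is only one loose reading of the process, not Kaggle's actual scoring code: the function names, the sample numbers, and the choice of mean absolute error as the measure of "closest" are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def split_half(records: List[Tuple[str, float]]):
    """Split (id, outcome) records into a released half and a withheld half."""
    midpoint = len(records) // 2
    return records[:midpoint], records[midpoint:]


def mean_absolute_error(predicted: Dict[str, float], actual: Dict[str, float]) -> float:
    """Average absolute gap between predictions and the withheld outcomes."""
    return sum(abs(predicted[key] - actual[key]) for key in actual) / len(actual)


def pick_winner(submissions: Dict[str, Dict[str, float]], actual: Dict[str, float]) -> str:
    """Return the member whose predictions come closest to the real outcomes."""
    return min(
        submissions,
        key=lambda member: mean_absolute_error(submissions[member], actual),
    )


# Made-up claim payments, for illustration only (hypothetical data).
records = [("a", 100.0), ("b", 250.0), ("c", 400.0), ("d", 50.0)]
released, withheld = split_half(records)
answers = dict(withheld)  # only the organizer's side sees these outcomes

# Each member submits a guess for every withheld record (hypothetical members).
submissions = {
    "member_1": {"c": 380.0, "d": 70.0},
    "member_2": {"c": 300.0, "d": 55.0},
}
winner = pick_winner(submissions, answers)
```

Here the member with the smallest average miss wins. A real competition could just as well use a different yardstick for accuracy; the point is only that the withheld answers make an objective comparison possible.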
Stay tuned for our other interviews - like how does Kaggle make money? - coming up soon.