The Mole is a retiring and self-effacing sort, quick to minimize his own achievements (such as they are), so he will stipulate without hesitation that Dr. Alexander Gray is more accomplished and more knowledgeable about machine learning than he is. But the Mole has also known quite a few members of the National Academies -- and even one Nobel laureate -- who have not found it necessary to proclaim their intellectual superiority quite so sneeringly.
The Mole has also seen and heard people describe their technical innovations in ways more compelling than a blend of self-congratulation and hand-waving. At the very least, a table of timings showing product X running much faster than product Y would tend to induce a belief that the presenter had actually run the two products head-to-head.
So whence this little jeremiad? The Big Data Gurus meetup last night at Samsung, titled "Real-World Machine Learning on Big Data: Which Method(s) Should You Use?". The taxonomic analyses (ML tasks and their corresponding methods, parametric vs. non-parametric methods) were interesting -- sufficiently so that the Mole will go digging through YouTube to find a version of the talk and copy the information down, since the speaker indicated an intention not to share his slides. And a couple of hints dropped during Q&A appear well worth investigating further: a forthcoming report from the National Academies on massive data (did he mean "New Tools for the Analysis of Massive Data", which is a year and a half overdue?) and the CMU TETRAD project led by philosopher Peter Spirtes.
Overall it was a very good pitch for BigML, or wise.io, or Precog, or Alteryx, or any other competitor to Skytree -- and good motivation to go contribute to the open source big data machine learning tools so derisively dismissed.