The Mole is a retiring and self-effacing sort, quick to minimize his own achievements (such as they are), so he will stipulate without hesitation that Dr. Alexander Gray is more accomplished and more knowledgeable about machine learning than he is. But the Mole has also known quite a few members of the National Academies -- and even one Nobel laureate -- who have not found it necessary to proclaim their intellectual superiority quite so sneeringly.
The Mole has also seen and heard people describe their technical innovations in ways that were more compelling than a blend of self-congratulation and hand-waving. At the very least, a table of timings showing product X running much faster than product Y would tend to induce a belief that the presenter had run the two products head-to-head.
So whence this little jeremiad? The Big Data Gurus meetup last night at Samsung, titled "Real-World Machine Learning on Big Data: Which Method(s) Should You Use?". The taxonomic analyses (ML tasks and their corresponding methods, parametric vs. non-parametric methods) were interesting -- sufficiently so that the Mole will go digging through YouTube to find a version of the talk and copy the information down, since the speaker indicated an intention not to share his slides. And a couple of hints dropped during Q&A appear to be well worth investigating further: a forthcoming report from the National Academies on massive data (did he mean "New Tools for the Analysis of Massive Data", which is a year and a half overdue?) and the CMU TETRAD project led by philosopher Peter Spirtes.
Overall it was a very good pitch for BigML, or wise.io, or Precog, or Alteryx, or any other competitor to Skytree, and good motivation to go contribute to the open source big data machine learning tools so derisively dismissed.
Last night the Mole attended a pleasant talk by Mark Noworolski on Streetline's smart parking system. (The talk was presented through the good offices of the IoTSiliconValley meetup, capably organized by Drew Johnson and Elle Wood, and hosted by Hacker Dojo. </shoutouts>) While the talk was not particularly technically demanding, the Mole was quite impressed by the expansiveness of Streetline's vision.
Stipulating up front that the devil is in the details and with a tip of the hat to Streetline's six years of hard work, the general technical architecture was exactly what the Mole expected:
Two strategic decisions are particularly interesting:
The verticality is interesting because it indicates a belief that the increased development costs will be repaid by the improvements made possible on the feature side (better reliability, control of enhancements, etc.). This runs against the grain of trendy lean startup MVP notions, reflecting some combination of the grownup big company backgrounds of Streetline's founders and the stolidity of the customers.
The service play is interesting because of the capital exposure Streetline faces. The Mole would love to see their internal numbers: either they're in a position to loan substantial amounts of money to municipalities for an extended period or they've laid their hands on the goose that lays the golden eggs. If they can recover the costs of installing a network quickly their margins will tilt sharply upwards the next day. That allows them to balance strong cash flows with significant ongoing product R&D.
So where is the "expansive vision"? Streetline have thought deeply about the needs of both their paying customers (i.e., municipalities, private parking operators, etc.) and the end users of the system. They're even evolving best-practice guidance for how to use the technology (the Mole asked about this), advising cities, for example, not to zero meters when cars pull out. Although technically possible, the bad feeling engendered among drivers (who feel that finding some time on a meter is a "God-given right") isn't worth the extra revenue.
They're also working towards a broader sensor ecosystem that will incorporate both Streetline technology and others'. In addition to creating additional functionality such as traffic flow sensors and management tools they are thinking about opening up (the Mole hopes he got this right...) a network level API that would allow other sensors to ship their data into the Streetline cloud.
In the medium term the Mole predicts moderate chaos as vertically integrated players like Streetline jostle not only with direct competitors (FastPrk, Fybr) and niche endeavors (SFpark), and with lateral moves from nearby domains (Sensys Networks, Iteris), but also with parallel innovations (Parkopedia). Plan to install multiple parking apps on your smartphone...
And one final nugget of great value: in the context of advocating "version early, version everything", Mark pointed the crowd to Tom Preston-Werner's Semantic Versioning. Read it. Follow it.
The Mole's grandmother was born in another country, one where English is not the native language. We will return to this point later. Our immediate point of departure, however, is that the Mole is currently burnishing his credentials in pursuit of a return to more technical work and has, to that end, been taking online courses on machine learning and other "data science" topics. The first course he took was quite theoretical, abstract, and mathematical, and the normal equation method of solving a linear regression problem was presented as
$(X^{T}X)^{-1}X^{T}y$
using the universally familiar notation of linear algebra.
As the Mole completes homework assignments he has implemented this several times, in several languages, and expects to have to do it at least once more in at least one more language. Each reimplementation requires a wholly uninstructive detour into syntax, a cost that the Mole would justify as an investment if he could settle with some confidence on one of these languages as the one he was likely to use primarily in the near future. But the language wars rage on -- it was always thus, and always will be -- and the poor Mole cannot predict what will be most useful. So far, the Mole has written
pinv(X' * X) * X' * y
in Octave and
coef(lm(y~X[,2]+X[,3]))
and
solve(t(X) %*% X) %*% t(X) %*% y
in R. He fully expects to add Python to the list very shortly. Then what? Julia? Java? Javascript?
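For comparison, a Python rendering of the same normal-equation computation might look like the sketch below. It assumes NumPy is installed; the function name normal_equation and the toy data are invented for illustration and do not come from the Mole's homework. It also solves the linear system directly rather than forming the inverse, which is a common numerical preference but a slightly different route from the literal inverse-then-multiply forms above.

```python
import numpy as np


def normal_equation(X, y):
    """Least-squares coefficients from the normal equation (X^T X)^{-1} X^T y.

    X is an (n, p) design matrix (include a column of ones for an intercept)
    and y is a length-n response vector.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Solve (X^T X) theta = X^T y instead of computing the inverse explicitly.
    return np.linalg.solve(X.T @ X, X.T @ y)


# Hypothetical toy data: an intercept column plus one feature, with y = 1 + 2x.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
theta = normal_equation(X, y)
```

On this toy data theta should recover an intercept of 1 and a slope of 2, up to floating-point error. When X^T X is singular or badly conditioned, np.linalg.lstsq or np.linalg.pinv would be the closer analogue of the Octave pinv version.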
The Mole's grandmother spoke excellent, nearly unaccented, English. Yet traces of her native tongue remained: when she counted silently to herself (while knitting, for example), she reverted to her first language. So it is with the Mole, who leaves as an exercise for the reader determining why his mind first imagines this code:
((⌹((⍉X)+.×X))+.×⍉X)+.×Y
The Mole found that the writers of the soon-to-be-released movie Larry Crowne displayed a perfectly tuned sense of poor UX design in this clip showing how to set basic preferences on the mythical Map Genie GPS.
The Mole is emerging from a long sojourn, a period of inactivity rooted in multiple causes.