A colleague mentioned the hazard of biasing a research investigation towards the data that are readily available over the data that are most desirable. That seemed a good cue to post a link to this old chestnut. http://quoteinvestigator.com/2013/04/11/better-light/
The Perils of Metrics
On the general inevitability of Goodhart’s Law
Ask a stupid question, get a stupid answer
“How deeply a problem is understood and how concretely it is defined sets an upper bound for the quality of any subsequent solutions.” https://medium.com/intercom-inside/growth-hacking-is-bullshit-60aae95f9caa
"But we don’t WANT to teach ’em, we want to LEARN ’em…"
True progress is hard; it is always easier to move the goalposts than to score more goals. Human beings are very good at rationalization and self-deception, so adjustments made in ambiguous or complicated situations for varied motivations are not always obviously directly aimed at shifting the reference in order to inflate the metric. On…
There’s no such thing as bad publicity
“It’s easier to measure if that change led to increased engagement than to measure if it also made your users hate you” –Benedict Evans https://twitter.com/BenedictEvans/status/673633094113484800
Turtles, questions, and fleas
Metrics are used to answer questions, and questions are like fleas:
So, naturalists observe, a flea
Has smaller fleas that on him prey;
And these have smaller still to bite ’em,
And so proceed ad infinitum.
(Jonathan Swift, On Poetry: A Rhapsody, 1733)
So a change in a single metric should be…
Two’s company, three’s a crowd
Counting things is generally much easier than defining sharply the boundary between what is to be counted and what is not. Focus often lands on that problem late, when a decision is to be made based on (or at least informed by) the metric and a shift in the definition could change the outcome. …
Testing, testing, 1,2,2,2,2,3
“Salesforce, you see, refuses to release code unless there’s 75% test coverage. A contract developer programming on a deadline looked at that requirement and said …” http://thedailywtf.com/articles/at-least-there-s-tests
Just step across the threshold
“In our analytics-obsessed world, it’s tempting to first ask how to measure whether something is a view, but if we take a step back and just ask what a “view” is, the answer becomes clearer. What is a view? It’s when someone watches the video. And Facebook counts views significantly before people could be said…
Is that working for you?
“Whether it’s unpaid time waiting around at the beginning or end of a shift, spending time on tasks that are unavoidable but don’t officially count, or being forced to absorb the costs of uncertainties like weather delays and sub-par sales, workers are paying the price for new technologies of measurement in the workplace.” The Future…
Doctoring the numbers
“When the statistics were publicized, some talented surgeons with higher-than-expected mortality statistics lost their operating privileges, while others, whose risk aversion had earned them lower-than-predicted rates, used the report cards to promote their services in advertisements.” Gathering and analyzing the statistics is nonetheless a good idea. Refining the comparison cohorts would be an improvement, but…