What is the definition of the term “data scientist”…?
In my previous post, Painting by numbers, I offered a shorthand definition of data science based on what I could synthesise from the interwebs. Namely, it is the combination of statistics, computer programming, and domain expertise to generate insight. It follows, then, that the definition of data scientist is someone who has those skill sets.
In this post I intended to articulate my observation that in the real world, incredibly few people could be considered masters of all three disciplines. I was then going to suggest that rather than seeking out these unicorns, employers should build data science teams comprising experts with complementary talents. I say “was” because I subsequently read this CIO article by Thor Olavsrud in which he quotes Bob Rogers saying, well… that.
Given Thor and Bob have stolen my thunder (18 months ago!), I think the only value I can add now is to draw a parallel with pop culture. I will do so with the geeky HBO sitcom Silicon Valley.
If you aren’t familiar with this series, the plot revolves around the trials and tribulations of a start-up called Pied Piper. Richard is the awkward brainiac behind a revolutionary data compression algorithm, and he employs a sardonic network engineer, Gilfoyle, and another nerdy coder, Dinesh, to help bring it to market. The other team members are the ostentatious Erlich – in whose incubator (house) the group can work rent-free in exchange for a 10% stake – and Jared, a mild-mannered economics graduate who could have been plucked from the set of Leave It to Beaver.
The three code monkeys are gifted computer scientists, but they have zero business acumen. They are entirely dependent on Jared to write up their budgets and forecasts and all the other tickets required to play in the big end of town. Gilfoyle and Dinesh’s one attempt at a SWOT analysis is self-serving and, to be generous, NSFW.
Conversely, Jared would struggle to spell HTML.
Arguably, the court jester, Erlich, is the smartest guy in the room. Despite his OTT bravado and general buffoonery, he proves his programming ability when he rolls up his sleeves and smashes out code to rescue the start-up from imploding, and he repeatedly uses his savvy to shepherd the fledgling business through the corporate jungle.
Despite the problems and challenges the start-up encounters throughout the series, it succeeds not because it is a team of unicorns, but because it comprises specialists and a generalist who work together.
And so the art of Silicon Valley shows us how unlikely we are in real life to recruit an expert statistician / computer programmer / business strategist. Each is a career in its own right that demands years of education and practice to develop. A jack-of-all-trades will inevitably be a master of none.
That is not to say a statistician can’t code, or a programmer will be clueless about the business. My point is, a statistician will excel at statistics, a computer programmer will excel at coding, while a business strategist will excel at business strategy. And I’m not suggesting the jack-of-all-trades is useless; on the contrary, he or she will be the glue that holds the specialists together.
So that raises the question… which one is the data scientist?
Since each is using data to inform business decisions, I say they all are.