The era of the citizen data scientist has arrived

Data scientists are invaluable, which is a challenge for any business outside of Google, Facebook, and Apple. CIOs who have been lucky enough to poach them from big tech companies or lure them out of academia beam with pride as they talk about all the business insights they are going to generate with their data gurus.

IBM expects demand for data scientists to increase 28% by 2020, and that figure could be conservative. To address the talent shortage, companies are creating software that does the heavy lifting for businesses, making “citizen” data scientists out of company employees who work outside IT.

Citizen data science includes capabilities and practices that allow users to extract predictive and prescriptive insights from data while working in positions outside of statistics and analytics, according to research firm Gartner. Citizen data scientists are “power users”, such as business analysts who have no computer training but can perform simple to moderately sophisticated analytical tasks that previously would have required more expertise, explains Carlie Idoine, analyst at Gartner, in a blog post. She adds that such experienced users can help narrow the current skills gap.

“The increased availability of tools, technologies, data and models enables the dissemination of information to people who would not normally be able to fend for themselves,” said Brandon Purcell, analyst at Forrester Research.

Data science democratized for (almost) everyone

Technology always finds a way to democratize access to information. So what has changed? In the traditional model, still practiced by most companies, business analysts leaned on a computer scientist and a data scientist for months to plan models intended to generate predictive information, with the data scientist often building the model from scratch.

Now, thanks to tools like IBM’s SPSS and Alteryx, citizen data scientists, many of whom have little or no coding experience, drag and drop data models onto a kind of software canvas to generate insights. Such tools allow “business analysts to manipulate data much more easily than in Excel”, explains Purcell.

General Motors, for example, created Maxis, an analytics platform that allows business users to perform Google-like queries to get a window into sales forecasts and operational metrics like supply chain performance. GM may be an outlier now, but it will soon have plenty of company, experts say.

Data science is a critical focus for oil giant Shell, where employees navigate petabytes of company data to generate operational and business insights. With self-service software, employees who otherwise might not have been able to access analytics can now do so without technical assistance, says Daniel Jeavons, general manager of the Center of Excellence for Data Science at Shell. For example, Shell uses Alteryx self-service software to help run predictive models that anticipate when thousands of parts of oil rig machines could fail.


“Data science tools democratize the low end of data science, so pretty much anyone can do it,” Jeavons says. But at the other end of the spectrum, Shell uses “powerful engines” such as Google TensorFlow and the MXNet deep learning library, as well as the programming languages Python and R. “There will always be a spectrum covering the citizen data scientist and the professional data scientist, and we must support both.”

The citizen data scientist bridges the gap between self-service analytics conducted by business users and the advanced analytics attributed to data scientists. Professional data scientists, by contrast, create and adapt data models and algorithms across the enterprise, explains Purcell of Forrester.

Won over by the now widely held maxim that data is the new oil, many companies have become “enthralled with the glamor of complex analytics,” says Joe DosSantos, senior vice president of corporate information at TD Bank Group. The reality is, data science is no longer about wizards and mythical unicorns.

TD Bank uses a wide range of basic to sophisticated analytical tools to better align historical and current customer data, as well as to perform fraud analysis, explains DosSantos. For example, the bank uses software from AtScale to help business users query live data from the bank’s Hadoop data lake and get results quickly. TD Bank analysts view the data in Tableau’s self-service visualization software.

Data scientists: still in demand

Other software companies are accelerating the trend towards the democratization of data, often using machine learning (ML) and artificial intelligence (AI) capabilities to create automated models. Salesforce, for example, offers Einstein Prediction Builder, which allows business analysts to create custom AI models, adding variables to any custom Salesforce field or object to predict outcomes such as the likelihood of a customer unsubscribing or the lifetime value of an account. Adobe Sensei, another ML software tool, helps marketers create marketing campaigns in minutes rather than hours.

More than 40% of data science tasks will likely be automated by 2020, according to Gartner. “This [automated ML approach] is the next generation of data science,” says Purcell.

Of course, not all big data challenges are easily handled by a citizen data scientist. Businesses still need statisticians, data scientists, actuaries and other experts with advanced mathematical skills, says Bill Roberts, managing director of cognitive and analytical practice at Deloitte Consulting. Such specialists can fill in data gaps and missing fields, tasks for which citizen data scientists are ill-suited.

Further, Roberts notes that self-service tools can serve a business well when they work, but what if they don’t? What if something is wrong and the calculations don’t check out? There might be a problem with the algorithm itself. “When there is a traffic jam or a problem, you need someone with an advanced education or degree who can fix it,” says Roberts.

Sean N. Ayres