Not so long ago, big data was all the rage. However, the amount of data is increasing at an alarming rate, to the point of overwhelming many organizations. With so much data generated, how do you group it, analyze it and use it wisely?
“What can you do with data? You can get it, you can look beyond it, or you can use it to power automation,” said Cassie Kozyrkov, chief decision scientist at Google, during the recent Rev3 conference held in New York.
This is where data science – and data scientists – come in. In fact, throngs of data scientists attended the Rev3 conference, sponsored by Domino Data Lab, to get a glimpse of their roles. For many attendees, Rev3 was one of their first – if not their first – in-person events since the COVID-19 pandemic shut down conferences or moved them online.
A career as a data scientist is one of the most desirable careers in the United States, according to Glassdoor. After factoring in earning potential, overall job satisfaction and number of job openings, Glassdoor found data scientist to be the third best job in America.
What is data science and what is the role of a data scientist?
Data science, in a nutshell, is the discipline of making data useful, according to Kozyrkov. However, Kjell Carlsson, head of data science strategy and evangelism at Domino Data Lab, finds defining the field a difficult task.
“When it comes to data science, it’s all over the place in terms of what people are referring to, and I think it’s right behind AI in terms of nebulosity and the amount of things it can accomplish depending on who you talk to,” Carlsson said. “So it’s a tough topic to cover and a tough topic to prepare for.”
“The term ‘data scientist’ means different things to different people,” added Nina Zumel, vice president of data science practice at Wallaroo, which helps organizations with the last mile of their machine learning journey: integrating ML into their production environments. “Part of the problem may be that as the term has become popular, people in other data analytics-related professions may have started to call themselves ‘data scientists,’ because that’s what the job description called for. That makes the term quite diffuse.”
For Zumel, co-author of the book Practical Data Science with R, “A data scientist is someone who can extract useful patterns from data and turn those patterns into repeatable, automatable, data-driven decision processes.”
Tool makers are stepping up their efforts to make the work of data scientists easier, Kozyrkov noted. One example, she said, is Domino Data Lab, which showcased its Domino 5.2 enterprise MLOps platform at Rev3 and previewed its Domino Nexus hybrid architecture. The company says Domino 5.2 will support data scientists by increasing the flexibility of data science teams while reducing infrastructure costs and complexity, and Nexus will enable hybrid machine learning operations that run on-premises and across cloud providers, all controlled from a unified interface.
“True advocates don’t just sell buzzwords; they care about making data scientists more efficient, about making their own work experience wonderful, awesome,” Kozyrkov said.
“Think of an Olympic skier,” she continued. “You wouldn’t ask an Olympic skier to spend most of their time trudging uphill. You would build a ski lift for that mountain of data science chores so data scientists’ time could be put to better use.”
And that’s what companies like Domino do. At Rev3, Domino CEO Nick Elprin told ITPro Today that a guiding principle for his company is to accelerate “model velocity” – the speed at which companies can build and deploy machine learning models – and to give data scientists the freedom and flexibility to use the resources they need, while giving IT control and security.
Help Wanted: More Data Scientists
Now that organizations have the tools, Kozyrkov asked, why do we need data scientists? Is there anything else for them to do? The answer, of course, is yes, as evidenced by the number of job postings for data scientists.
According to Carlsson’s latest research, Global 2000 companies employed around 98,000 data scientists and had more than 60,000 open data scientist positions.
In an article for ITPro Today sister site InformationWeek, Carlsson wrote, “It’s no exaggeration to say that every rapidly growing organization needs more data scientists. They are the crucial ingredient for transforming raw data into innovative new products and services, and for transforming business based on data. … With so many companies competing for data science talent, adopting an inclusive data science strategy isn’t just good for business – and more ethical – it’s a necessity. Data is not the strategic resource of the 21st century, data scientists are.”
Zumel agreed that there is a great need for more diversity in the field: “The world is diverse; decisions have to be made to serve diverse populations, so the field should reflect that.”
Specialization is now the name of the data science game
What should organizations look for in data scientist candidates?
“It’s time to embrace specialization,” Kozyrkov said, even though many data scientists still want to be the person who does everything with data, despite that person being a myth.
Carlsson of Domino Data Lab agreed. Data science is no longer an everything-but-the-kitchen-sink job, and it is rare to find a data scientist who takes care of everything, he said.
“Every data science team I spoke to sounded like The A-Team,” Carlsson said. “Mission: Impossible is a better example. ‘For this assignment, we recommend this person because he has a background in makeup and such, and this person because he is a specialist in demolitions,’ etc.”
There is no standard data scientist, Carlsson added. “Few people have all the necessary skills,” he said. “And the ones that do are really expensive. They’re hard to keep, and they’re usually not very good at any of those individual components.”
Wallaroo’s Zumel said she never thought of data science as a one-person job. “Data science projects have always been a collaboration between business stakeholders, data scientists, IT and operations,” she said. “It has always been vital that data scientists have empathy for their colleagues in other functional teams.”
This does not mean that data scientists have to be experts in business issues or IT, for example. It does mean, however, that they need to appreciate the issues in those areas and have the communication skills to develop relationships with their teammates, Zumel said.
The future role of the data scientist
If data scientists are indeed, as Carlsson says, a strategic resource of the 21st century, what should we expect in the future?
With new tools created to improve their work and give them more flexibility, data scientists – who once spent 80% of their time preparing and engineering data – can now devote a considerable amount of their time to DevOps, Carlsson said.
When Zumel started her career as a data scientist, she said, the work was often about automating an analyst’s results. “Now data science involves a lot more concerns about running at scale,” she said. “I expect this trend to continue as more companies begin to integrate AI and ML processes into their operations.”
About the Author
Rick Dagley is a managing editor at ITPro Today, covering IT operations and management, cloud computing, edge computing, software development and IT careers. Previously, he was a longtime editor at PCWeek/eWEEK, with stints at Computer Design and Telecommunications magazines before that.