Data Scientist Uyi Stewart Talks Global Health and Social Impact
On Wednesday, November 16, Dr. Uyi Stewart, Chief Data and Technology Officer at Data.org, delivered a lecture at Robertson Hall entitled “Data Science: The New Frontier in Global Health and Development.” The talk explored the potential of data science to address a wide range of contemporary global problems, including infectious diseases, economic vulnerabilities, drug treatment, and access to health care, among other areas of urgent importance.
Data.org is a non-profit organization that strives to “democratize and reinvent the use of data to address society’s greatest challenges and improve lives around the world,” according to their website.
Stewart previously served as Director of Global Development Strategy, Data and Analytics at the Bill and Melinda Gates Foundation. He also co-founded IBM Research – Africa, where he served as chief scientist and used big data to help bring the 2014 Ebola outbreak in West Africa under control.
The Daily Princetonian sat down with Stewart to discuss his past work and how data science can be used to solve pressing global humanitarian issues. This interview has been edited for clarity and conciseness.
The Daily Princetonian: As Chief Data and Technology Officer at Data.org, what is your nonprofit’s goal?
Dr. Uyi Stewart: In the area of social impact, there is a lot of fragmentation. Social impact organizations are those that use data and other means to fight inequalities in the world. But what you find is that everyone does their own thing, so there is a lot of fragmentation. And though they intend to do good, the good is not realized. It’s in small pieces. What is needed is a way to combine all of these efforts while staying true to their goals and achieving economies of scale. This is the problem and the ambition. We are a coordination platform on the ground that drives partnerships around data for social impact. That’s really our goal. And in the process, we aim to train one million motivated data scientists in 10 years.
DP: You played a pivotal role in using big data to fight the Ebola outbreak in West Africa during your tenure as Chief Scientist of IBM Research – Africa, which you also co-founded. What was this experience like, and what were the main challenges you encountered?
US: It was one of the hardest things I have had to go through as a data scientist. Today it is very common to see data used to model the rate of infection or the trajectory of disease spread, as we have seen [with COVID-19]. So you look at infection rates, infection patterns, peaks and troughs. But in 2014, when Ebola hit West Africa, the use of big data to model the disease was very nascent. It was a feat of epidemiology, and even getting around to doing it was difficult at the time. To answer your question, there was access to data, but the biggest challenge was the stigma associated with Ebola.
The dominant culture in West Africa is such that funeral practices involve a rite of passage. When relatives die, there is a rite of passage, and this passage involves the washing of clothes. But Ebola is a disease that is transmitted by touch. If you touch an infected person, you will contract Ebola. So it was a clash between a disease that is spread by contact and people’s traditional customs. So the big challenge was: how do you use technology and innovation to influence behavior change? That was the big problem. And that’s the work we were able to do successfully in Sierra Leone.
DP: What do you see as the most pressing issues facing humanity today that you believe data science is uniquely positioned to address?
US: A lot… I can talk about gender inequality. Not only in the [United States], but worldwide. In India and in countries in Africa, there is just huge gender inequality – especially in education and even healthcare – staring us in the face.
Then there is the climate. In fact, the climate crisis is a health crisis. And I think we’re not up to it. There are also inequalities in the way knowledge is disseminated around the world. The World Wide Web is dominated by English, and as a result billions of people are excluded from this information corridor. I can go on and on, but these are some of the challenges we are facing in the world right now.
DP: Yes, these appear to be very broad challenges that represent a global phenomenon, not just problems confined to a particular area or location.
US: That’s true. I singled them out because you asked where data can help. So I think data can help address gender inequities, health inequities, and the inequities that we see in language, by creating better translation models.
DP: Could you explain some of the new data science techniques and technologies that have emerged recently and how they could be deployed to identify disease transmission mechanisms, catalyze drug development, and improve access to healthcare resources, among other applications?
US: I can talk about the ability to unlock new data sets, which is truly phenomenal. One of the things that’s happened in the last five to ten years is taking a data set, like health information, and overlaying it with what we call geospatial data, meaning data on movement. As a result, we can now start identifying specific pockets, or what we call heatmaps. Before, you could only apply a blanket treatment to a problem. Let’s say, for example, that there is dengue fever in an entire region. The fact is that there is no one-size-fits-all solution, because dengue fever can be intense in a specific area. But previously there was no geospatial movement data. You only had dengue data. Now you have data on the social determinants of health, and you also have movement data, from cell towers and cell phones.
When you overlay geospatial data, you can begin to achieve a degree of precision in your health intervention. Rather than an all-encompassing treatment, you can now be more specific – what we call targeted intervention. This is a recent innovation. I think it’s a great way to help us achieve better health outcomes through targeted interventions, for example.
DP: Finally, in a world saturated with data, do you think that innovative data science techniques are enough to deal with the overabundance of data? What challenges remain to be overcome?
US: We are still on the verge of causal inference. Many of the analyses we see today look for correlations by looking for explanatory features in the dataset. The next big thing we are looking for, at least in the social impact sector, is causal inference. If I see X and Y, can I infer the cause from the data?
For example, boys travel five miles to school (I’m just making that up) and girls travel seven miles to school. At the end of the year, the test results come back: the boys are doing better, the girls not as well. The question is how to explain these results. Is there a causal relationship? Can you conclude that those extra two miles are a reason why girls score lower on tests? Right now, data science can’t yet give you that conclusion with any degree of precision. It is a question of probability; we can tell you that in all likelihood it is because of the two miles. But we cannot say that with certainty. So we’re still looking to be much more prescriptive from the data in terms of causal inference, for example.
Implicit in what I have just explained is this big question of how to move from insights to action. If I find that “Wow, because the girls walked two extra miles, it’s impacting their ability to do well in school,” then what’s the recommendation for action? What do we do about it? This is called ‘impact’: how do you mitigate this? Analysis is good, but it is not enough. For us to complete the arc and truly say that data makes a difference, we need to go further and turn insights into action. And right now, that’s more of an art than a science.
Amy Ciceu is a senior writer who often covers research and developments related to COVID-19.
Please direct any requests for corrections to corrections at dailyprincetonian.com.