A 5by5 conversation with Matthew Robison, a data scientist, about how access is changing based on algorithms and what we can do to get it right.
Interview by Twisha Shah-Brandenburg & Thomas Brandenburg
“Data science suffers from the same lack of diversity as most STEM fields: too few women and minorities to represent more diverse viewpoints and experiences.” —Matthew Robison
Is it possible to create a completely unbiased algorithm?
No. Any decisions made in this process, from collecting, cleaning and transforming the data to choosing variables to writing the final line of code, are made by humans, and humans have biases, especially ones that they can’t or won’t see. Even filtering out spam removes some information that someone could find useful. The best we can hope for is integrity and mindfulness on the part of data scientists.
What is the role of diversity in the planning and creation process of algorithms?
Data science suffers from the same lack of diversity as most STEM fields: too few women and minorities to represent more diverse viewpoints and experiences. Diversity plays the same role in data science as it does in design thinking, policy making or product development. The more diverse the team that creates an algorithm, the greater the chance that mistakes and biases will be caught before the model goes into production.
Blind orchestra auditions were introduced to keep biases in check so that the focus is on the music and not the demographic information that can make the decision-making process subjective. What might we learn from this as we design algorithms that make decisions?
Stripping all personal data from a dataset before building a model is completely possible, and often very beneficial. That would allow the algorithm to model people’s behaviors, beliefs and actions with less external bias, whether from preconceived stereotypes on the part of the data scientist or assumed knowledge about an industry or a product on the part of a domain expert. Certainly, more data will help make a model more accurate, but it can also create opportunities for unethical people to misuse data science.
The misuse of personal information and bias in algorithm design are ethical issues that must be dealt with. How do we keep what we might call “bad actors” from using algorithms to steal, cheat or unethically influence people’s opinions? I don’t have an answer for that.
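To make the idea of a "blind" dataset concrete, here is a minimal Python sketch of stripping personal fields before any model sees the data. The field names, the sample records and the helper `strip_personal_fields` are all hypothetical, and deciding which fields count as personal is a judgment call that depends on the project.

```python
# Hypothetical field names; which columns count as "personal" is a
# judgment call that depends on the dataset and the project.
PERSONAL_FIELDS = {"name", "gender", "age", "ethnicity", "zip_code"}


def strip_personal_fields(records, personal_fields=PERSONAL_FIELDS):
    """Return copies of the records with the personal fields removed."""
    return [
        {key: value for key, value in record.items() if key not in personal_fields}
        for record in records
    ]


# Made-up records, for illustration only.
applicants = [
    {"name": "A. Lee", "gender": "F", "age": 34,
     "years_experience": 9, "audition_score": 8.7},
    {"name": "B. Cruz", "gender": "M", "age": 41,
     "years_experience": 12, "audition_score": 8.1},
]

# Only the non-personal fields are passed on to model building.
blind_applicants = strip_personal_fields(applicants)
print(blind_applicants)
```

The original records are left untouched, so the personal fields can still be stored separately if they are needed for a later audit.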
What are the long-term effects of feedback loops? How might data scientists, designers and engineers think about and monitor their data?
Feedback loops tend to reinforce whatever biases users have, along with whatever biases their makers have, and they aren’t going away anytime soon. In the long term, it seems they are either good or bad depending on how the actors – users, marketers, coders, etc. – behave. With so much personal information available on social media sites (users’ demographics, preferences and behaviors), it has become imperative that the ethics of data science be stressed as much as the coding, mathematics and statistics.
Data scientists need to check periodically how the algorithm is behaving and look for any unintended actions or biases that they have coded into it. They also need to consider ethical issues at the start of any project, not just before it goes online or after something bad happens.
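One way such a periodic check might look in practice is sketched below in Python: it compares how often a model approves cases from different groups and flags a large gap for human review. The decision log, the group labels, the helper names and the 0.2 threshold are all made up for illustration, and a gap in approval rates is only one of many possible signals of bias.

```python
from collections import defaultdict


def approval_rate_by_group(decisions):
    """Map each group to the share of its cases the model approved."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, was_approved in decisions:
        totals[group] += 1
        if was_approved:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def largest_gap(rates):
    """Difference between the highest and lowest group approval rates."""
    return max(rates.values()) - min(rates.values())


# Made-up log of (group, model_decision) pairs, for illustration only.
decision_log = [
    ("group_a", True), ("group_a", True), ("group_a", False), ("group_a", True),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]

# Hypothetical threshold; a real project would need to choose and justify its own.
ALERT_THRESHOLD = 0.2

rates = approval_rate_by_group(decision_log)
gap = largest_gap(rates)
needs_review = gap > ALERT_THRESHOLD
print(rates, gap, needs_review)
```

A check like this could be run on a schedule against recent decisions, with a flagged gap prompting a person to look at why the groups are being treated differently.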
What is the future of data science? What signals are you looking at that are making you excited and worried?
Data science is going through a phase that search, design and personal finance went through earlier in the digital era. Processes for data analysis and visualization, model building, and the handling of Big Data are being made into simple, easy-to-use services – Data Science as a Service – that allow anyone to use them with a little training. It’s exciting because it democratizes a process that can be a “black box” for many companies, but it’s also worrisome because it calls into question the expertise and experience needed to understand the complexity of a problem and the algorithms that could be used to help solve it.
Interested in this topic? Register to be part of a larger community at the Design Intersections conference in Chicago May 24-25, 2018.