What is Algorithmic Bias?

Ayyüce Kızrak, Ph.D.
Published in Heartbeat
Jan 26, 2022 · 11 min read


“Sometimes respecting people means making sure your systems are inclusive such as in the case of using AI for precision medicine, at times it means respecting people’s privacy by not collecting any data, and it always means respecting the dignity of an individual.” — Joy Buolamwini

What is an algorithm?

We can define an algorithm as a procedure designed to achieve a specific purpose, such as solving a problem. To give a more technical definition, algorithms form the basis of computer programs: they define the steps of the workflow that solves the problem.

Algorithms matter because software runs in the background of the technological devices that have become indispensable in our lives, and that software is built from algorithms. Today, not only the devices we use but also the platforms we interact with rely on smart algorithms.

These platforms can be social media tools or search engines, advertisements, movie/music recommendations, human resources applications, or banking applications that help determine individuals' credit scores. In general, it would not be wrong to describe such smart platforms as Artificial Intelligence (AI) powered applications. All of these applications are fed by data; that is, they make decisions by making sense of data.

As a result, we are talking about applications that learn and make predictions by extracting patterns and information from data in the area where we want a solution. They can talk to us, perhaps drive our vehicles, and diagnose our loved ones' diseases early.

But long story short, the data can become biased, which can negatively impact the end results of our model. Let’s examine the definition of bias, where it can occur, and how we can avoid this problem.

Are There Biases in Algorithms?

Of course, just as with humans, we encounter bias in algorithms; we technically call this algorithmic bias. Bias in algorithms often stems from non-representative or incomplete data. However, that is not the only cause. We can also inadvertently create bias when we design our systems on imperfect information that reflects historical inequalities; this is human bias. Worst of all, once we start using a biased algorithm, we may trigger a feedback loop in which the bias keeps growing.

An example of bias in action: the automated risk assessments used by US judges to set bail and sentences have produced inaccurate results, leading to longer prison sentences for certain groups (people of color, especially Black men) or higher bail for non-white defendants.

Can We Illustrate the Biases Encountered in Algorithms with Examples?

Bias in online advertisements is an example almost all of us can observe in daily life. By tracking information such as your gender, your ethnicity, the products you like or shop for, your movements across online platforms, and the rest of your internet footprint, advertisements are designed and presented specifically for you. This is called microtargeting.

I can recommend a documentary movie that you can watch for details on this subject: The Social Dilemma.


Bias in word associations is another common type. It is an algorithmic bias caused by under-representation in the data, and it can surface in applications such as search engines and machine translation. Princeton University researchers built an AI application to analyze word embeddings covering 2.2 million words. They found that European names were perceived as more pleasant than African-American names, and that the words “woman” and “girl” were more likely to be associated with the arts than with science and mathematics.

In analyzing these word associations in the training data, the AI algorithm picked up existing racial and gender biases displayed by humans. If the learned associations of these algorithms were used to generate word suggestions as part of a search-engine ranking algorithm or an autocomplete tool, the effect could be cumulative, reinforcing racial and gender biases.
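To make the idea concrete, here is a minimal sketch of how such associations can be measured: comparing how close a word's vector sits to two different target concepts. The tiny 4-dimensional vectors below are invented purely for illustration; the Princeton study worked with full pretrained embeddings, not toy values like these.

```python
# A minimal sketch (not the original study's code) of measuring word-embedding
# associations. The vectors are toy values chosen only to illustrate the idea.
import numpy as np

embeddings = {
    "woman":   np.array([0.8, 0.1, 0.3, 0.0]),
    "man":     np.array([0.1, 0.9, 0.2, 0.1]),
    "arts":    np.array([0.7, 0.2, 0.1, 0.3]),
    "science": np.array([0.2, 0.8, 0.4, 0.2]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def association(word, target_a, target_b):
    # Positive -> the word sits closer to target_a; negative -> closer to target_b.
    return cosine(embeddings[word], embeddings[target_a]) - \
           cosine(embeddings[word], embeddings[target_b])

for word in ("woman", "man"):
    print(word, "arts-vs-science association:",
          round(association(word, "arts", "science"), 3))
```

With real embeddings trained on web text, the same kind of comparison is what surfaces the gendered and racial associations described above.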

Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings

Bias in online recruitment tools is another example worth mentioning. Approximately 60% of the global workforce and 74% of management positions are held by men, and a model trained on that history inherits the imbalance. This is why Amazon stopped using its recent AI-assisted recruitment algorithm: because of the biased algorithm, female candidates were not recommended even when they met the requirements for managerial positions.

I mentioned that bias has also been noticed in criminal justice algorithms. According to a ProPublica report, racial bias was found in a decision-support tool used by judges in the US. The algorithm assigns a risk score to a defendant's likelihood of committing a crime in the future, based on the voluminous data available on arrest records, defendant demographics, and other variables. Compared with white defendants who were equally likely to re-offend, African-Americans were more likely to receive higher risk scores, resulting in longer detention periods while awaiting trial. This is a case of historical human bias being recorded in data and then reproduced by ongoing, data-driven systems.
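A hedged sketch of the kind of analysis behind such findings: even when two groups re-offend at the same rate, a risk model can produce very different false positive rates for them. The handful of records below is fabricated purely for illustration and does not come from the real case.

```python
# Fabricated example: two groups with identical re-offense rates, but the
# model flags innocent members of group A as "high risk" far more often.
import pandas as pd

records = pd.DataFrame({
    "group":      ["A", "A", "A", "A", "B", "B", "B", "B"],
    "reoffended": [0,   0,   1,   1,   0,   0,   1,   1],   # ground truth
    "high_risk":  [1,   0,   1,   1,   0,   0,   1,   0],   # model's label
})

for group, df in records.groupby("group"):
    negatives = df[df["reoffended"] == 0]          # people who did not re-offend
    fpr = (negatives["high_risk"] == 1).mean()     # wrongly labeled high risk
    print(f"group {group}: false positive rate = {fpr:.2f}")
```

Disparities like this can stay invisible if only overall accuracy is reported, which is exactly why per-group error rates matter.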

In another example of racial discrimination, Amazon made a corporate decision to exclude certain neighborhoods from its same-day Prime delivery service. The decision was based on these factors:

  • Whether there were enough Prime members in a particular zip code,
  • Whether that zip code was near a warehouse,
  • Whether there were enough people willing to deliver to that zip code.

While these factors aligned with the company's profitability model, they resulted in the exclusion of poor, predominantly Black neighborhoods, turning these data points into proxies for racial classification. The result was unintentional discrimination against the racial and ethnic minorities who were left out.

Accuracy across the spectrum

Another well-known example is facial recognition technology, and here too we encounter bias. MIT researcher Joy Buolamwini found that the algorithms powering three commercially available facial recognition systems performed far worse on darker-skinned faces.

Overall, it is estimated that the datasets used for most facial recognition systems are more than 75 percent male and more than 80 percent white. When the person in the photo was a white male, the software identified the gender correctly 99 percent of the time. According to Buolamwini's research, the overall error rates for the three products were generally less than 1 percent; for darker-skinned women, however, the error rate exceeded 20 percent for one product and 34 percent for the other two.
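The methodological lesson from this work is to report accuracy per subgroup rather than a single aggregate number. Below is a minimal, illustrative sketch of such a disaggregated audit; the sample predictions are made up and do not come from any real system.

```python
# Fabricated mini-audit: accuracy computed separately for each
# (gender, skin tone) subgroup instead of one overall score.
from collections import defaultdict

samples = [
    # (true gender, skin tone, predicted gender)
    ("male",   "lighter", "male"),
    ("male",   "lighter", "male"),
    ("female", "lighter", "female"),
    ("female", "darker",  "male"),    # misclassified
    ("female", "darker",  "female"),
    ("male",   "darker",  "male"),
]

totals, correct = defaultdict(int), defaultdict(int)
for gender, tone, predicted in samples:
    key = (gender, tone)
    totals[key] += 1
    correct[key] += int(predicted == gender)

for key in sorted(totals):
    print(key, f"accuracy = {correct[key] / totals[key]:.0%}")
```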

In response to Buolamwini's facial analysis findings, both IBM and Microsoft committed to improving the accuracy of their recognition software for darker-skinned faces. I can recommend another documentary in which Joy Buolamwini's work is explained in detail: Coded Bias.


The point I would like to emphasize here is this: recognizing the possibility and the cause of bias is the first step toward any bias-remediation approach.

See Also: Digital Policing Tools ‘Reinforce’ Racial Bias, UN Panel Warns

Is There an Intention in Creating Algorithmic Bias?

While these examples of bias are not exhaustive, they show that these problems are empirical realities, not just theoretical concerns. They also show how such results come about, in most cases without any malicious intent on the part of the algorithm's designers or implementers.

Although the discussion mainly focuses on how large and how representative the training data is, human and sometimes unconscious biases also find their way into algorithms. Worst of all, AI models reproduce this bias as they are used, which can make it grow over time.

Of course, there are also systems in which this is done deliberately and with a specific motivation. In the Facebook case, we all followed the sale of Facebook data to Cambridge Analytica around the previous US elections, and the manipulation through profiling and microtargeting that followed. For those who did not follow the story and are curious, I can recommend another documentary: The Great Hack.


What Should Be Done to Prevent Biases?

Recognizing and accepting biases is the first step toward prevention. There are both technical and social methods for this, and the work of preventing bias is called algorithmic hygiene. Here are a few of the ways we can practice algorithmic hygiene:

First, ensure diversity in the data. Beyond that:

  • Test the system in different environments.
  • Involve people in the execution and evaluation processes and follow up on the results.
  • Pay attention to privacy and security when working with sensitive data.
  • Establish clear principles about which error rates should be equalized across groups in order to be fair (a small check of this kind is sketched below).
  • To avoid reinforcing bias, algorithm developers should look for ways to reduce the differences between groups without compromising the overall performance of the model.
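As mentioned in the list above, here is one hedged way to turn "equalize error rates" into a repeatable check: compute the error rate per group and warn when the gap exceeds a chosen tolerance. The labels, predictions, and tolerance below are illustrative assumptions, not values from any real system.

```python
# Hedged sketch of an "algorithmic hygiene" check on per-group error rates.
def error_rate_gap(y_true, y_pred, groups):
    """Return (largest gap between group error rates, per-group error rates)."""
    rates = {}
    for g in set(groups):
        pairs = [(t, p) for t, p, gg in zip(y_true, y_pred, groups) if gg == g]
        rates[g] = sum(t != p for t, p in pairs) / len(pairs)
    return max(rates.values()) - min(rates.values()), rates

# Fabricated labels and predictions for illustration only.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

gap, per_group = error_rate_gap(y_true, y_pred, groups)
print("per-group error rates:", per_group)
if gap > 0.10:  # tolerance chosen for illustration
    print(f"WARNING: error-rate gap {gap:.2f} exceeds tolerance, review the model")
```

A check like this can run automatically every time a model is retrained, so that a growing disparity is noticed before the system goes back into use.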

Algorithmic hygiene practices are also being developed within international mechanisms that follow this work.

“There are three big steps to avoiding bias in AI; first is to create an ethical culture, then to ensure transparency, and finally to take action to eliminate exclusion and discrimination, whether in your datasets or algorithms.”

— Kathy Baxter

What Mechanisms Are Needed?

In the feasibility study published by the Ad Hoc Committee on Artificial Intelligence (CAHAI), established within the Council of Europe, AI systems were defined as socio-technical systems, and it was emphasized that effective technical, social, and legal measures require multi-disciplinary and international cooperation. There are ongoing international efforts to develop standards for the use of AI and for its ethical governance. The Artificial Intelligence Principles of the Organisation for Economic Co-operation and Development (OECD) are also worth following. These principles are:

  • Human-centered values and fairness.
  • Privacy and data governance.
  • Transparency and explainability.
  • Robustness, security, and safety.
  • Inclusive growth, sustainable development, and well-being.
  • Accountability.

Dr. Atty. Basak Ozan Ozparlak offered the following suggestion in her recent publication: cooperation between competition law, data protection law, and labor law, together with effective sanctions in these areas, should support one another so that a fair and sustainable employment policy can be pursued while benefiting from artificial intelligence systems.

Increasing awareness of privacy and security in order to reduce bias will have a significant impact on political incentives and legal requirements. Concrete steps should be taken, such as supporting technical and social research that complies with laws and ethical principles, and determining legal responsibilities.


To put AI ethics into practice: A 12-step guide (WEF)

  1. Justify the choice of introducing an AI-powered service
  2. Adopt a multi-stakeholder approach
  3. Consider relevant regulations and build on existing best practices
  4. Apply risks/benefits assessment frameworks across the lifecycle of AI-powered services
  5. Adopt a user-centric and use-case based approach
  6. Clearly lay out a risk prioritization scheme
  7. Define performance metrics
  8. Define operational roles
  9. Specify data requirements and flows
  10. Specify lines of accountability
  11. Support a culture of experimentation
  12. Create educational resources

What Can We Do Individually?

It is commonly said that "algorithms that are great for efficiency do not yet know what fairness is."

Justice is a human determination based on shared ethical beliefs rather than a mathematical formula. Therefore, algorithmic decisions that could have serious consequences for humans require human participation. We call this human inclusion, that is, human-in-the-loop. In short, it is important to keep humans involved in the decision process.
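As a rough illustration of what human-in-the-loop can mean in practice, the sketch below routes low-confidence or high-impact decisions to a human reviewer instead of acting on them automatically. The confidence threshold and the notion of a "high impact" case are illustrative assumptions, not part of any standard.

```python
# Hedged sketch: route risky or uncertain automated decisions to a person.
from dataclasses import dataclass

@dataclass
class Decision:
    case_id: str
    score: float        # model's confidence in the proposed decision
    high_impact: bool   # e.g., affects bail, credit, or employment

def route(decision: Decision, confidence_threshold: float = 0.9) -> str:
    # A person makes (or confirms) the call whenever stakes or uncertainty are high.
    if decision.high_impact or decision.score < confidence_threshold:
        return "human_review"
    return "automated"

for d in [Decision("loan-001", 0.97, False),
          Decision("loan-002", 0.72, False),
          Decision("bail-003", 0.99, True)]:
    print(d.case_id, "->", route(d))
```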

Understanding Artificial Intelligence Ethics and Safety

I think bringing together experts from various disciplines and industries, including engineering, law, marketing, strategy, and communications, will help establish accountability standards and strategies for reducing AI bias online.

In addition, I think users should be taught technology literacy so that they can give feedback on these systems. Government agencies that regulate bias should also work to increase literacy in AI and other data-centric technologies as part of their mission. After all, the individuals most vulnerable to biased decision-making, in both the public and private sectors, are the users.

Book Suggestion: Weapons of Math Destruction by Cathy O'Neil




AI Specialist @Digital Transformation Office, Presidency of the Republic of Türkiye | Academics @Bahçeşehir University | http://www.ayyucekizrak.com/