What is Algorithmic Bias?

Ayyüce Kızrak, Ph.D.
Published in Heartbeat
Jan 26, 2022 · 11 min read


“Sometimes respecting people means making sure your systems are inclusive such as in the case of using AI for precision medicine, at times it means respecting people’s privacy by not collecting any data, and it always means respecting the dignity of an individual.” — Joy Buolamwini

What is an algorithm?

We can define an algorithm as a procedure designed to achieve a specific purpose, such as solving a problem. To give a more technical definition, algorithms form the basis of computer programs: they define the steps of the workflow that solves the problem.

Algorithms matter because software runs in the background of the technological devices that have become indispensable in our lives, and that software is built from algorithms. Today, not only the devices we use but also the platforms we interact with rely on smart algorithms.

These platforms can be social media tools or search engines, advertisements, movie/music recommendations, human resources applications, or banking applications that help determine individuals' credit scores. In general, it would not be wrong to describe such smart platforms as Artificial Intelligence (AI) powered applications. All of these applications are fed by data; that is, they make decisions by making sense of data.

As a result, we are talking about applications that learn and make predictions by extracting patterns and information from data in the area where we want a solution. They can talk to us, perhaps drive our vehicles, and diagnose our loved ones' diseases early.

But long story short, the data can become biased, which can negatively impact the end results of our model. Let’s examine the definition of bias, where it can occur, and how we can avoid this problem.

Are There Biases in Algorithms?

Of course, just as with humans, we encounter bias in algorithms; we technically call this algorithmic bias. Bias in algorithms often stems from non-representative or incomplete data. However, that is not the only cause. We can also inadvertently create bias when we design our systems on imperfect information that reflects historical inequalities; this is human bias. Worst of all, once we start using a biased algorithm, we may trigger a feedback loop in which the bias keeps growing.

An example of bias in action: the automated risk assessments used by US judges to set bail and sentences have produced inaccurate results, leading to longer prison sentences for certain groups (people of color, especially Black men) or higher bail for non-white defendants.

Can We Illustrate the Biases Encountered in Algorithms with Examples?

Bias in online advertisements is an example almost all of us can observe in daily life. By tracking information such as your gender, your ethnicity, the products you like or shop for, your movements across online platforms, and the rest of your internet footprint, advertisements are designed and presented specifically for you. This is called microtargeting.

I can recommend a documentary movie that you can watch for details on this subject: The Social Dilemma.


Bias in word associations is another common type. It is an algorithmic bias caused by under-representation in the data, and it can surface in applications such as search engines and machine translation. Princeton University researchers built an AI application to analyze word embeddings covering 2.2 million words. They found that European names were perceived as more pleasant than African-American names, and that the words “woman” and “girl” were more likely to be associated with the arts than with science and mathematics.

In analyzing these word associations in the training data, the AI algorithm picked up existing racial and gender biases displayed by humans. If the learned associations of these algorithms were used to generate word suggestions as part of a search-engine ranking algorithm or an autocomplete tool, the effect could be cumulative, reinforcing racial and gender biases.
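To make the idea concrete, here is a minimal sketch of how such associations can be measured: comparing how close a word's vector sits to two different target concepts. The tiny 4-dimensional vectors below are invented purely for illustration; the Princeton study worked with full pretrained embeddings, not toy values like these.

```python
# A minimal sketch (not the original study's code) of measuring word-embedding
# associations. The vectors are toy values chosen only to illustrate the idea.
import numpy as np

embeddings = {
    "woman":   np.array([0.8, 0.1, 0.3, 0.0]),
    "man":     np.array([0.1, 0.9, 0.2, 0.1]),
    "arts":    np.array([0.7, 0.2, 0.1, 0.3]),
    "science": np.array([0.2, 0.8, 0.4, 0.2]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def association(word, target_a, target_b):
    # Positive -> the word sits closer to target_a; negative -> closer to target_b.
    return cosine(embeddings[word], embeddings[target_a]) - \
           cosine(embeddings[word], embeddings[target_b])

for word in ("woman", "man"):
    print(word, "arts-vs-science association:",
          round(association(word, "arts", "science"), 3))
```

With real embeddings trained on web text, the same kind of comparison is what surfaces the gendered and racial associations described above.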

Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings

Bias in online recruitment tools is another example worth mentioning. Approximately 60% of the global workforce and 74% of management positions are held by men, and a model trained on that history inherits the imbalance. This is why Amazon stopped using its recent AI-assisted recruitment algorithm: because of the biased algorithm, female candidates were not recommended even when they met the requirements for managerial positions.

I mentioned that bias has also been noticed in criminal justice algorithms. According to a ProPublica report, racial bias was found in a decision-support tool used by judges in the US. The algorithm assigns a risk score to a defendant's likelihood of committing a crime in the future, based on the voluminous data available on arrest records, defendant demographics, and other variables. Compared with white defendants who were equally likely to re-offend, African-Americans were more likely to receive higher risk scores, resulting in longer detention periods while awaiting trial. This is a case of historical human bias being recorded in data and then reproduced by ongoing, data-driven systems.
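A hedged sketch of the kind of analysis behind such findings: even when two groups re-offend at the same rate, a risk model can produce very different false positive rates for them. The handful of records below is fabricated purely for illustration and does not come from the real case.

```python
# Fabricated example: two groups with identical re-offense rates, but the
# model flags innocent members of group A as "high risk" far more often.
import pandas as pd

records = pd.DataFrame({
    "group":      ["A", "A", "A", "A", "B", "B", "B", "B"],
    "reoffended": [0,   0,   1,   1,   0,   0,   1,   1],   # ground truth
    "high_risk":  [1,   0,   1,   1,   0,   0,   1,   0],   # model's label
})

for group, df in records.groupby("group"):
    negatives = df[df["reoffended"] == 0]          # people who did not re-offend
    fpr = (negatives["high_risk"] == 1).mean()     # wrongly labeled high risk
    print(f"group {group}: false positive rate = {fpr:.2f}")
```

Disparities like this can stay invisible if only overall accuracy is reported, which is exactly why per-group error rates matter.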

In another example of racial discrimination, Amazon made a corporate decision to exclude certain neighborhoods from its same-day Prime delivery service. The decision was based on these factors:

  • Whether there were enough Prime members in a particular zip code,
  • Whether that zip code was near a warehouse,
  • Whether there were enough people willing to deliver to that zip code.

While these factors aligned with the company's profitability model, they resulted in the exclusion of poor, predominantly Black neighborhoods, turning these data points into proxies for racial classification. The result was unintentional discrimination against the racial and ethnic minorities who were left out.

Accuracy across the spectrum

Another well-known example is facial recognition technology, and here too we encounter bias. MIT researcher Joy Buolamwini found that the algorithms powering three commercially available facial recognition systems performed far worse on darker-skinned faces.

Overall, it is estimated that the datasets used for most facial recognition systems are more than 75 percent male and more than 80 percent white. When the person in the photo was a white male, the software identified the gender correctly 99 percent of the time. According to Buolamwini's research, the overall error rates for the three products were generally less than 1 percent; for darker-skinned women, however, the error rate exceeded 20 percent for one product and 34 percent for the other two.
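The methodological lesson from this work is to report accuracy per subgroup rather than a single aggregate number. Below is a minimal, illustrative sketch of such a disaggregated audit; the sample predictions are made up and do not come from any real system.

```python
# Fabricated mini-audit: accuracy computed separately for each
# (gender, skin tone) subgroup instead of one overall score.
from collections import defaultdict

samples = [
    # (true gender, skin tone, predicted gender)
    ("male",   "lighter", "male"),
    ("male",   "lighter", "male"),
    ("female", "lighter", "female"),
    ("female", "darker",  "male"),    # misclassified
    ("female", "darker",  "female"),
    ("male",   "darker",  "male"),
]

totals, correct = defaultdict(int), defaultdict(int)
for gender, tone, predicted in samples:
    key = (gender, tone)
    totals[key] += 1
    correct[key] += int(predicted == gender)

for key in sorted(totals):
    print(key, f"accuracy = {correct[key] / totals[key]:.0%}")
```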

In response to Buolamwini's facial analysis findings, both IBM and Microsoft committed to improving the accuracy of their recognition software for darker-skinned faces. I can recommend another documentary in which Joy Buolamwini's work is explained in detail: Coded Bias.


The point I would like to emphasize here is this: recognizing the possibility and the cause of bias is the first step toward any bias-remediation approach.

See Also: Digital Policing Tools ‘Reinforce’ Racial Bias, UN Panel Warns

Is There an Intention in Creating Algorithmic Bias?

While these examples of bias are not exhaustive, they show that these problems are empirical realities, not just theoretical concerns. They also show how such results come about, in most cases without any malicious intent on the part of the algorithm's designers or implementers.

Although the discussion mainly focuses on how large and how representative the training data is, human and sometimes unconscious biases also find their way into algorithms. Worst of all, AI models reproduce this bias as they are used, which can make it grow over time.

Of course, there are also systems in which this is done deliberately and with a specific motivation. In the Facebook case, we all followed the sale of Facebook data to Cambridge Analytica around the previous US elections, and the manipulation through profiling and microtargeting that followed. For those who did not follow the story and are curious, I can recommend another documentary: The Great Hack.


What Should Be Done to Prevent Biases?

Recognizing and accepting biases is the first step toward prevention. There are both technical and social methods for this, and the work of preventing bias is called algorithmic hygiene. Here are a few of the ways we can practice algorithmic hygiene:

First, ensure diversity in the data. Beyond that:

  • Test the system in different environments.
  • Involve people in the execution and evaluation processes and follow up on the results.
  • Pay attention to privacy and security when working with sensitive data.
  • Establish clear principles about which error rates should be equalized across groups in order to be fair (a small check of this kind is sketched below).
  • To avoid reinforcing bias, algorithm developers should look for ways to reduce the differences between groups without compromising the overall performance of the model.
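As mentioned in the list above, here is one hedged way to turn "equalize error rates" into a repeatable check: compute the error rate per group and warn when the gap exceeds a chosen tolerance. The labels, predictions, and tolerance below are illustrative assumptions, not values from any real system.

```python
# Hedged sketch of an "algorithmic hygiene" check on per-group error rates.
def error_rate_gap(y_true, y_pred, groups):
    """Return (largest gap between group error rates, per-group error rates)."""
    rates = {}
    for g in set(groups):
        pairs = [(t, p) for t, p, gg in zip(y_true, y_pred, groups) if gg == g]
        rates[g] = sum(t != p for t, p in pairs) / len(pairs)
    return max(rates.values()) - min(rates.values()), rates

# Fabricated labels and predictions for illustration only.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

gap, per_group = error_rate_gap(y_true, y_pred, groups)
print("per-group error rates:", per_group)
if gap > 0.10:  # tolerance chosen for illustration
    print(f"WARNING: error-rate gap {gap:.2f} exceeds tolerance, review the model")
```

A check like this can run automatically every time a model is retrained, so that a growing disparity is noticed before the system goes back into use.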

Algorithmic hygiene practices are also being developed within international mechanisms that follow this work.

“There are three big steps to avoiding bias in AI; first is to create an ethical culture, then to ensure transparency, and finally to take action to eliminate exclusion and discrimination, whether in your datasets or algorithms.”

— Kathy Baxter

What Mechanisms Are Needed?

In the feasibility study published by the Ad Hoc Committee on Artificial Intelligence (CAHAI), established within the Council of Europe, AI systems were defined as socio-technical systems, and it was emphasized that effective technical, social, and legal measures require multi-disciplinary and international cooperation. There are ongoing international efforts to develop standards for the use of AI and for its ethical governance. The Artificial Intelligence Principles of the Organisation for Economic Co-operation and Development (OECD) are also worth following. These principles are:

  • Human-centered values and fairness.
  • Privacy and data governance.
  • Transparency and explainability.
  • Robustness, security, and safety.
  • Inclusive growth, sustainable development, and well-being.
  • Accountability.

Dr. Atty. Basak Ozan Ozparlak offered the following suggestion in her recent publication: cooperation between competition law, data protection law, and labor law, together with effective sanctions in these areas, should support one another so that a fair and sustainable employment policy can be pursued while benefiting from artificial intelligence systems.

Increasing awareness of privacy and security in order to reduce bias will have a significant impact on political incentives and legal requirements. Concrete steps should be taken, such as supporting technical and social research that complies with laws and ethical principles, and determining legal responsibilities.


To put AI ethics into practice: A 12-step guide (WEF)

  1. Justify the choice of introducing an AI-powered service
  2. Adopt a multi-stakeholder approach
  3. Consider relevant regulations and build on existing best practices
  4. Apply risks/benefits assessment frameworks across the lifecycle of AI-powered services
  5. Adopt a user-centric and use-case based approach
  6. Clearly lay out a risk prioritization scheme
  7. Define performance metrics
  8. Define operational roles
  9. Specify data requirements and flows
  10. Specify lines of accountability
  11. Support a culture of experimentation
  12. Create educational resources

What Can We Do Individually?

It is commonly said that "algorithms that are great for efficiency do not yet know what fairness is."

Justice is a human determination based on shared ethical beliefs rather than a mathematical formula. Therefore, algorithmic decisions that could have serious consequences for humans require human participation. We call this human inclusion, that is, human-in-the-loop. In short, it is important to keep humans involved in the decision process.
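As a rough illustration of what human-in-the-loop can mean in practice, the sketch below routes low-confidence or high-impact decisions to a human reviewer instead of acting on them automatically. The confidence threshold and the notion of a "high impact" case are illustrative assumptions, not part of any standard.

```python
# Hedged sketch: route risky or uncertain automated decisions to a person.
from dataclasses import dataclass

@dataclass
class Decision:
    case_id: str
    score: float        # model's confidence in the proposed decision
    high_impact: bool   # e.g., affects bail, credit, or employment

def route(decision: Decision, confidence_threshold: float = 0.9) -> str:
    # A person makes (or confirms) the call whenever stakes or uncertainty are high.
    if decision.high_impact or decision.score < confidence_threshold:
        return "human_review"
    return "automated"

for d in [Decision("loan-001", 0.97, False),
          Decision("loan-002", 0.72, False),
          Decision("bail-003", 0.99, True)]:
    print(d.case_id, "->", route(d))
```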

Understanding Artificial Intelligence Ethics and Safety

I think bringing together experts from various disciplines and industries, including engineering, law, marketing, strategy, and communications, will help establish accountability standards and strategies for reducing AI bias online.

In addition, I think users should be taught technology literacy so that they can give feedback on these systems. Government agencies that regulate bias should also work to increase literacy in AI and other data-centric technologies as part of their mission. After all, the individuals most vulnerable to biased decision-making, in both the public and private sectors, are the users.

Book Suggestion: Weapons of Math Destruction by Cathy O'Neil




AI Specialist @Digital Transformation Office, Presidency of the Republic of Türkiye | Academics @Bahçeşehir University | http://www.ayyucekizrak.com/