I recently spoke at UCSD CSE’s faculty meeting as part of our “Inclusion Minutes.” This is an activity we have adopted in which a faculty member shares a short experience/story or tips/nuggets from our efforts to enhance Diversity, Equity, and Inclusion (DEI) in computing. It is presented as a short (<5 min) opening talk. The topic I chose is one that has proven meaningful in my own practice: inclusion in CS examples.

Inclusion and Diversity

When we talk about DEI, we recognize the importance of ensuring that people from all backgrounds feel welcome and included, in accordance with UCSD’s Principles of Community. This spans differences across many attributes, including race/ethnicity, sex and gender identity, sexual orientation, disability status, language, national origin, religion, political beliefs, economic status, etc. However, in spite of our best intentions, we often end up unintentionally furthering the erasure and marginalization of historically marginalized and/or underrepresented groups in CS. In this article, I focus on one small but highly visible domain where this issue manifests: the examples of people we use when teaching CS concepts. This is structured as a conversation to provoke introspective thought, not as a prescription.

People in CS Examples

One might wonder what DEI has to do with examples of CS concepts, which are often highly technical, with heavy doses of mathematics and engineering. The answer is simple: many CS concepts tackle people-related problems and matter in people-facing settings. For instance, we often use illustrations/icons of the users/developers of software in CS examples. Likewise, many CS examples use people’s names, genders, pronouns, or countries. But it is a mistake to think that this concerns only the CS areas that overlap intellectually with the social sciences and humanities (e.g., HCI, user interfaces, ubiquitous computing, etc.). I contend that inclusion in examples matters in almost all areas of CS, as well as in data science.

Examples of (Unintentionally) Exclusionary or Indifferent CS Examples

To prove my point, let me walk you through a few technical examples that are at best indifferent and at worst exclusionary/hurtful (likely unintentionally so). These examples are sampled from multiple areas of CS. My intent here is not to cast blame but to raise awareness of the importance of inclusion.

(Copyright Disclaimer: Some of the images below are not owned by me. I did not name their sources for obvious reasons. But if you own one of these images and want me to take it down, let me know and I will oblige.)

1. Machine Learning

Do you see any issues with the above example of binary classification?

Perhaps not at first glance; most people may not notice anything either. But for many people from intersex, transgender, third gender, two-spirit, agender, genderqueer, and other non-binary gender backgrounds, such a binary characterization of gender is emblematic of the erasure and discrimination they often face, sometimes combined with violence. Gender binarism is unscientific and anachronistic. Many countries, including India, now legally recognize a third/non-binary gender. How much effort is it really to add a third category instead of perpetuating this false dichotomy of male vs. female? Or why not just use a truly binary attribute, e.g., clicked vs. did not click?
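To show how little effort either fix takes, here is a minimal sketch in plain Python (the records, categories, and attribute names are my own hypothetical examples): adding a third gender category to a one-hot encoding is a one-token change, and the prediction target can be a genuinely binary attribute like clicked vs. did not click.

```python
# Hypothetical training records: the gender feature includes a
# nonbinary category alongside female and male -- no extra modeling effort.
records = [
    {"gender": "female",    "clicked": 1},
    {"gender": "male",      "clicked": 0},
    {"gender": "nonbinary", "clicked": 1},
]

# Adding a category is literally one more string in this list.
categories = ["female", "male", "nonbinary"]

def one_hot(value, categories):
    """Map a categorical value to a 0/1 indicator vector."""
    return [1 if value == c else 0 for c in categories]

features = [one_hot(r["gender"], categories) for r in records]
# A genuinely binary target: clicked vs. did not click.
labels = [r["clicked"] for r in records]

print(features)  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(labels)    # [1, 0, 1]
```

Any downstream classifier consumes the encoded vectors unchanged; the model never needed the false dichotomy in the first place.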

2. Algorithms/Theory

Do you see any issues with this example of the stable matching problem?

This one is likely jarring in today’s world, since same-sex marriage is now legal in the US and many other nations. How do you think LGBT students will feel looking at this example? Although I am now an out gay man, such exclusionary examples in my classes may have reinforced my unconscious yet self-destructive behavior of being closeted for many years. Besides, surely I am not the only one who finds it inappropriate/awkward for faculty to ask students to rank their marriage preferences! What’s next: Algorithms instructors playing “The Bachelor” in class?! :) Why not just use more immediate examples of matchings, e.g., courses and lecture rooms?
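The courses-and-rooms suggestion is easy to act on because Gale–Shapley is agnostic to what the two sides are. Here is a minimal sketch (the course/room names and preference lists are hypothetical) with courses proposing and rooms accepting:

```python
from collections import deque

# Hypothetical preference lists: courses rank rooms, rooms rank courses.
course_prefs = {
    "CSE101": ["RoomA", "RoomB"],
    "CSE132": ["RoomA", "RoomB"],
}
room_prefs = {
    "RoomA": ["CSE132", "CSE101"],
    "RoomB": ["CSE101", "CSE132"],
}

def stable_match(proposer_prefs, acceptor_prefs):
    """Gale-Shapley: proposers (courses) propose; acceptors (rooms) keep their best offer."""
    rank = {a: {p: i for i, p in enumerate(prefs)}
            for a, prefs in acceptor_prefs.items()}
    free = deque(proposer_prefs)           # courses not yet assigned a room
    next_choice = {p: 0 for p in proposer_prefs}
    engaged = {}                           # room -> course
    while free:
        course = free.popleft()
        room = proposer_prefs[course][next_choice[course]]
        next_choice[course] += 1
        current = engaged.get(room)
        if current is None:
            engaged[room] = course         # room was free; accept
        elif rank[room][course] < rank[room][current]:
            engaged[room] = course         # room prefers the new course
            free.append(current)           # the displaced course proposes again
        else:
            free.append(course)            # room rejects; course tries its next room
    return engaged

print(stable_match(course_prefs, room_prefs))
# {'RoomA': 'CSE132', 'RoomB': 'CSE101'}
```

The algorithm and its stability guarantee are identical; only the (needlessly loaded) framing of the example changes.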

3. Databases

This is perhaps more subtle, but do you see any issues with this example of database integrity constraints?

Again, it may not be immediately apparent to most people. But to students with same-sex parents or from single-parent households, such examples can feel like a gut punch. As if the bullying such kids face from ignorant peers is not enough, CS textbooks also make them feel excluded. Indeed, when I first came across such examples, I myself wondered whether it would ever be societally feasible for people like me and my husband to raise kids together. :( Why not just name it Parent1 (and optionally Parent2)? Or use a different example altogether, e.g., Netflix ratings with user and movie IDs? After all, most people like movies or TV shows. :)
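The ratings variant teaches the same integrity-constraint concepts. As a sketch (the schema and table names are my own, using SQLite via Python for concreteness): the key constraints, foreign keys, and a CHECK constraint all live on user and movie IDs, with no assumptions about family structure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Integrity constraints over user and movie IDs -- no assumptions about families.
conn.executescript("""
    CREATE TABLE Users  (uid INTEGER PRIMARY KEY);
    CREATE TABLE Movies (mid INTEGER PRIMARY KEY);
    CREATE TABLE Ratings (
        uid   INTEGER NOT NULL REFERENCES Users(uid),
        mid   INTEGER NOT NULL REFERENCES Movies(mid),
        stars INTEGER NOT NULL CHECK (stars BETWEEN 1 AND 5),
        PRIMARY KEY (uid, mid)            -- one rating per user per movie
    );
""")
conn.execute("INSERT INTO Users  VALUES (1)")
conn.execute("INSERT INTO Movies VALUES (10)")
conn.execute("INSERT INTO Ratings VALUES (1, 10, 5)")

# A rating for a nonexistent movie violates the foreign-key constraint.
try:
    conn.execute("INSERT INTO Ratings VALUES (1, 99, 3)")
except sqlite3.IntegrityError as err:
    print("rejected:", err)
```

Every classical lesson — primary keys, referential integrity, domain constraints — survives the change of example intact.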

4. Cryptography

Finally, do you see any issues with the above illustration of perhaps the most famous “people” in CS examples: Alice and Bob (ok, and Eve too)?

Again, this one is perhaps subtle. While this example is not exclusionary, it is indifferent to a major dimension of hurtful discrimination: colorism (discrimination based on skin tone within a racial/ethnic group). Most Indians are aware of the discriminatory “fair skin good; dark skin bad” bias that permeates Indian society. For instance, a recent “Miss India” contest was derided for overvaluing fair skin, and many movie stars tastelessly endorse skin-bleaching creams. Colorism also plagues other Asian cultures, African cultures, and even Black Americans. How much effort is it really to show at least one of the three figures as dark-skinned, instead of inadvertently perpetuating the bias that only fair-skinned people get represented by default? Or why not just use abstract icons without skin colors?

Examples of Inclusive and Diverse CS Examples

Phew. To cynical eyes, all this may seem like a no-win situation. How can we possibly represent all groups and their intersections perfectly in CS examples? Someone or other will surely feel left out. I do not have a perfect solution, only this take: the goal is not perfect representation but to avoid being exclusionary and to show inclusive intent. That is it. I suspect this will achieve much the same DEI benefits, because most marginalized groups empathize with each other’s struggles. We should not let the perfect become the enemy of the good.

With the above in mind, I now present some examples of inclusive and diverse CS examples from my own talks and lectures. This is not to “toot my own horn” or “get on my high horse” but only because I am able to talk about the intent behind them and their impact. I am sure there will be more and better examples from others.

1. My Academic Job Talk

This is from my 2016 academic job talk (video from UCSD CSE). In my research, I often interact with data scientists and software/ML/data engineers. Thus, I use people icons to represent them in my talks. In this case, I consciously chose to use female (or feminine-presenting) and androgynous icons with different skin tones. Not a groundbreaking DEI effort, but still noticed by some in the audience; a senior professor at a school where I interviewed later praised my choice in our face-to-face meeting and joked: “You have single-handedly raised the representation of women in CS slides!”

From the same talk, note the Gender attribute where I used an “Other” category beyond just “Female” and “Male”. This was no extra effort for me. But it warmed my heart when I later received emails from two people — a gay man and a transgender person — thanking me for this representation instead of perpetuating gender binarism. So, even seemingly trivial efforts in CS examples can make a difference to the experiences of people from marginalized/underrepresented groups in CS. Later at UCSD, I also learned from our LGBT Resource Center that California now has a law recognizing a “Nonbinary” gender category (abbreviation ‘X’ akin to ‘M’/‘F’). So, “Nonbinary” would be more appropriate than “Other” here. Many people also identify with no particular gender (“Agender”).

2. My Database Systems Class

This is an example for a classical task at the intersection of databases and ML/AI called “entity matching,” wherein one disambiguates multiple records of real-world entities, often people. The person’s name is obviously a major signal for this prediction task. Instead of reusing overused names (“Joe”, “John”, “Alice”, etc.), I chose the name “Aisha Williams”. Why? See the Google image search results below.
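To make the task concrete, here is a toy version of the matching predicate (the records, tokenization, and threshold are my own hypothetical simplifications; real entity matching systems use far richer signals): normalize each name into tokens and compare their overlap.

```python
def name_tokens(name):
    """Lowercase a name and split it into a set of tokens."""
    return set(name.lower().replace(".", " ").split())

def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    return len(a & b) / len(a | b)

def same_entity(rec1, rec2, threshold=0.5):
    """Predict whether two records refer to the same person via name overlap."""
    return jaccard(name_tokens(rec1["name"]), name_tokens(rec2["name"])) >= threshold

r1 = {"name": "Aisha Williams"}
r2 = {"name": "Aisha B. Williams"}
r3 = {"name": "John Smith"}
print(same_entity(r1, r2))  # True: two of three tokens overlap
print(same_entity(r1, r3))  # False: no tokens in common
```

The matching logic is entirely name-agnostic, which is precisely why there is no technical excuse for recycling the same handful of Anglo names.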

Sticking with examples from data integration, I chose a geographically diverse set of names for the above question. My students also grasped such diversity easily because, well, over three-quarters of the students in this specific class were international graduate students from China and India. :)

Finally, one last example from my course is on a perhaps under-appreciated dimension of inclusion: political beliefs. Most people in CS and academia, certainly in the US, are liberal-leaning. So, people with conservative views may feel sidelined. As someone who identifies as a freethinker (not the left-right dichotomy), I think educators have a duty to be inclusive to people of all political beliefs as long as they abide by our Principles of Community. So, when I explain the buzzword “Big Data” to my class, I draw analogies to both “Big Oil” (a term popular on the left) and “Big Government” (a term popular on the right) to clarify that the “Big” refers not just to size (a common misconception) but also to connotations of undue power and complexity.

Takeaways and Caveats

Overall, I hope this conversation on exclusionary/indifferent vs inclusive/diverse CS examples is helpful to people interested in DEI in CS. This is admittedly a small and low-effort part of the enormous work we have to do to improve the state of DEI in CS. But as the cliche goes, little drops of water do make a mighty ocean. :) My key takeaways are two-fold:

  1. Use CS examples that make more people feel included and seen/heard, especially historically marginalized and/or underrepresented groups.
  2. Be vigilant about not being (unintentionally) exclusionary in your CS examples.

If you do plan to adopt and build upon some of the ideas I moot above, please also be mindful of two key caveats:

  1. Guard against the dangers of stereotyping, since it can be counterproductive and alienate the very groups you hope to help.
  2. If you do plan to use names from languages/cultures unfamiliar to you, put in the effort to pronounce/spell/remember them accurately. In other words, do not end up being a sorry spectacle like John Travolta! :)

If you have more examples of exclusionary/indifferent or inclusive/diverse CS examples that you have encountered, please do post them in a comment and let me know your thoughts.

ACKs: Thank you to some of my faculty colleagues for encouraging me to blog about my talk. Thank you to my spouse for reviewing my first draft. Thank you to Shaun Travers of the UCSD LGBT Resource Center for helpful feedback on this post. All opinions expressed here are entirely my own and do not necessarily reflect the views of any of these people, CSE, or UCSD.

Associate Professor at UC San Diego CSE and HDSI. Research on data management and machine learning systems. Freethinker. Poet. Memester. Gay. He/him.