Dr. Jahna Otterbacher (female) is an Assistant Professor at the Open University of Cyprus, where she is the academic coordinator of the M.Sc. in Social Information Systems. She received her doctorate in Information from the University of Michigan (Ann Arbor – USA), where she was a member of the Computational Linguistics and Information Retrieval (CLAIR) research group. She previously served as Assistant Professor in the Lewis College of Human Sciences at the Illinois Institute of Technology (Chicago, USA) (2010-2012) and Visiting Lecturer of Management Information Systems at the University of Cyprus (2006-2009).
Dr. Otterbacher’s research and teaching interests lie at the intersection of social computing, communication science and data science. She analyses behavioural and language traces left by users of information systems in order to better facilitate their interactions with others, as well as their access to information.
A new paper authored by Assistant Professor Jahna Otterbacher (Cyprus Center for Algorithmic Transparency, Open University of Cyprus), Dr. Styliani Kleanthous (Cyprus Center for Algorithmic Transparency and RISE), and Pınar Barlas and Kyriakos Kyriakou (RISE) has recently been accepted for publication and presentation at the 7th AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2019, October 28–30, Washington State). On this occasion, we asked Dr. Otterbacher to share with us what drove her to choose this area of focus for her work. In the following text she explains how often we come across such biases today compared to a decade ago, and what she hopes her work will achieve.
The full version of Dr. Otterbacher’s latest paper, “How Do We Talk About Other People? Group (Un)Fairness in Natural Language Image Descriptions,” will be available online soon and will be linked to this blog post.
Question: What event motivated you to further explore this area of focus?
Answer:
My motivation for studying the social biases in algorithmic systems actually came from my personal life. Several years back, my 8-year-old announced to me one afternoon that her school had hired a new “doctor,” and pointed him out to me in the school yard (the previous nurse, a woman, had recently retired). Probing her a bit on this, I realized that because he was a man, she assumed that he was a doctor. That got me thinking about all of the images in the media that perpetuate social stereotypes in one way or another. My PhD was in the area of information retrieval, so I initially started looking into how search engines like Google might inadvertently reinforce gender- and race-based stereotypes through the results returned to users. For instance, I looked at image searches and the way that results were gendered for queries on character traits such as “sensitive person” (heavily skewed toward returning images of children and women) or “intelligent person” (heavily skewed toward images of men). It’s easy to find such examples, but one research challenge is to develop a way to systematically measure biases, so that we can make meaningful comparisons and/or track them over time. Another challenge is to bring into the equation the behavior of the user, since user behavior feeds back into the engine’s behavior (through personalization algorithms). [1]
Question: How often do you come across biases today, compared with ten years ago?
Answer:
Much more often today, because of the increasing influence of machine learning and “Big Data” in our everyday lives. Big Data is full of biases, and not just social biases. For instance, we can consider reporting bias in social media. We’re much more likely to report something unusual or impressive than we are to mention something mundane. So, if a machine tried to build a model of “an average human day” by collecting and processing social media posts, it could end up missing some boring details, which are still essential to accurately reflect a typical day. Another consideration that is especially relevant to social media as a data source is that the population of social media users is not representative of the population as a whole (e.g., the very old and the very young are underrepresented).
Question: What can you tell us about the recent work of the TAG MRG?
Answer:
This year, with the TAG MRG, we have been studying computer vision algorithms, which are used extensively today in multimedia applications. Many of the big Internet companies (e.g., Google, Microsoft, Amazon) offer their algorithms as services: a developer can upload a set of photos and, within seconds, obtain textual descriptions of the image content, which can then support various functionality within the app being developed (e.g., searching and tagging images, recognizing one’s friends in them, modeling users’ aesthetic preferences). However, no one knows how the algorithms work; they are proprietary, complex “black boxes.” When processing images that depict people – particularly in light of the GDPR – we need to ensure that the descriptions are not inaccurate or discriminatory towards particular individuals or social groups. In some initial experiments, for example, we found that images of Black people are less likely to be described by the algorithms as physically attractive, compared to Whites or Asians. [2]
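To make that workflow concrete, here is a minimal sketch of how a developer might request textual labels for a single photo from one such commercial service, using Google’s Cloud Vision Python client. The filename is a placeholder and credential setup is omitted; this is a generic illustration of the kind of integration described above, not the specific setup used in the TAG MRG’s experiments.

```python
# pip install google-cloud-vision  (requires Google Cloud credentials to be configured)
from google.cloud import vision

client = vision.ImageAnnotatorClient()

# Read a local photo and wrap it in the client's Image type.
with open("photo.jpg", "rb") as f:
    image = vision.Image(content=f.read())

# Ask the service for descriptive labels of the image content.
response = client.label_detection(image=image)

# Each label comes back as opaque text plus a confidence score;
# how the model arrived at it is not exposed to the developer.
for label in response.label_annotations:
    print(f"{label.description}: {label.score:.2f}")
```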
So again, the challenge is to develop systematic procedures for auditing system behaviors, which can be replicated over time and/or by other researchers. These audits often take into consideration whether the system respects human values, such as group fairness, meaning that minority groups are treated no differently than the majority, or the population as a whole. [3]
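As a rough illustration of the group-fairness idea (not the specific measures used in the publications cited below), one might compare how often an algorithm assigns a given descriptive tag to images of different demographic groups, and report the largest gap between groups. The tag, the group labels, and the data layout in the following sketch are all hypothetical.

```python
from collections import defaultdict

def per_group_rate(records, tag):
    """Rate at which `tag` appears in the descriptions, per demographic group.

    `records` is a list of (group_label, set_of_tags_returned) pairs.
    """
    counts, hits = defaultdict(int), defaultdict(int)
    for group, tags in records:
        counts[group] += 1
        hits[group] += int(tag in tags)
    return {g: hits[g] / counts[g] for g in counts}

def parity_gap(rates):
    """Largest difference in rates across groups (0 means equal treatment)."""
    return max(rates.values()) - min(rates.values())

# Hypothetical toy data: each record pairs a group label with the tags an algorithm returned.
records = [("A", {"person", "smile"}), ("A", {"person", "attractive"}),
           ("B", {"person"}), ("B", {"person", "smile"})]

rates = per_group_rate(records, "attractive")
print(rates, parity_gap(rates))  # {'A': 0.5, 'B': 0.0} 0.5
```

An audit in this spirit can be re-run on the same system at a later date, or by other researchers, which is what makes the comparisons over time mentioned above possible.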
However, detecting discriminatory behaviors of algorithms is not enough. Of course, we’d also like to know the source of the bias. To this end, we are also active in crowdsourcing research. Most of the important datasets used worldwide for training and evaluating computer vision algorithms (e.g., ImageNet, MS COCO) have been built using online crowdsourcing platforms like Amazon Mechanical Turk or Figure Eight, where one can hire, train, and pay “workers” to annotate data. For instance, a worker might be given an image or a video and asked to describe its content, its sentiment, or some facts of interest (e.g., how many objects are pictured, or the dominant color).
We have published two articles at the leading international conference on human computation and crowdsourcing (AAAI HCOMP) that examine the kinds of systematic biases that arise when crowd workers are asked to describe images of people in their own words. In the 2018 experiments, I showed that, as expected, it’s very easy to manipulate workers’ behaviors. In particular, when the task instructions and set-up convey more information about the audience for whom the data is intended, workers are more likely to give interpretive (and often stereotypical) descriptions, rather than a strict description of the image. This means that we end up with training datasets that reflect our own social biases and the stereotypes about women and minorities that are prevalent in society. [4]
In October, with the TAG team, we’ll present a second article, which builds on the previous work, but is more focused on measuring group fairness in the descriptions that crowd workers provide on images of people.
Question: What changes do you hope to generate with your work and to what extent?
Answer:
Intelligent systems are socio-technical systems; they are not strictly technical in nature. Every step of the development and evaluation process involves human judgment. Training and evaluation datasets attempt to capture some aspects of the state of the world, and learning mechanisms are applied to create a model. However, given the diversity of the world, its complexity and messiness, it is unsurprising that algorithmic systems reflect human biases.
In other words, while we can take measures to reduce the perpetuation of social biases in algorithmic systems and data, we certainly cannot eliminate them! So, I hope that our work will help various groups of people (end-users, developers, policy makers, educators, etc.) understand that today we live in an ecosystem of smart technologies, and that we all have a responsibility to develop a healthy relationship with them.
We would like to thank Dr. Otterbacher for her time and valuable information shared with us!
Don’t forget to bookmark this blog, email it to a friend or colleague, or sign up to get instantly notified of every new blog post. Of course, we’d love to get your feedback on RISE and this blog, and to hear what you want to see discussed (and showcased) in the future.
Relevant Publications:
[1] https://zenodo.org/record/2670015
[2] https://zenodo.org/record/3333361
[3] https://zenodo.org/record/3326950
[4] https://zenodo.org/record/2670019