Using Gender Data to Encourage Expression

Dec 3, 2014   |   Matt Stempeck

I ended November on a high note: I got to spend a day and a half working with a wide range of amazing people to creatively use data to help ensure that people who identify as female have a more equal voice online and in the media. Microsoft New York hosted a gender data workshop in our Times Square offices, where we connected with one another to identify common challenges and new opportunities to work together.

Women are underrepresented in corporate leadership, Wikipedia, the technology industry, and in journalism (as sources, writers, op-ed contributors, and obituary subjects). But there’s really great work being done to leverage technology and data, in addition to advocacy, to improve the gender balance* in each of these arenas. That’s why we were thrilled to host a workshop bringing these designers, advocates, developers, and journalists together.

For his Master’s thesis, my MIT Media Lab colleague Nate Matias built a variety of carefully designed technology interventions to computationally measure and then improve representation of women in new and traditional media. You can read Nate’s thesis here. As part of that work, Nate used the OpenGenderTracker library to automatically analyze the gender of names used in newspaper archives, Twitter followings, and other corpora. In his research and conversations, he got plugged into existing communities using or hoping to use gender data towards the same goals. Nate being the most diligent human being I know, he worked with Willow Brugh, the Knight Foundation, and DataKind to convene an incredible group of people to collaborate in our Times Square offices (although this was potentially an elaborate General Exams procrastination technique).
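For readers curious what name-based gender analysis looks like in practice, here is a minimal sketch in Python. It is not OpenGenderTracker's API or Nate's method: the tiny `NAME_TABLE`, the `guess_gender` and `tally_bylines` functions, and the sample bylines are all hypothetical and exist only to illustrate the general idea of looking up first names and counting the results. A real tool would draw on large name-frequency datasets and would still leave many names unclassified.

```python
from collections import Counter

# Hypothetical, tiny name-to-gender table for illustration only.
# Real tools rely on large name-frequency datasets instead.
NAME_TABLE = {
    "maria": "female",
    "sara": "female",
    "james": "male",
    "david": "male",
}


def guess_gender(full_name):
    """Return 'female', 'male', or 'unknown' based on the first name."""
    parts = full_name.strip().split()
    if not parts:
        return "unknown"
    return NAME_TABLE.get(parts[0].lower(), "unknown")


def tally_bylines(bylines):
    """Count how many bylines fall into each category."""
    return Counter(guess_gender(name) for name in bylines)


if __name__ == "__main__":
    # Made-up bylines, not real data.
    sample = ["Maria Lopez", "James Smith", "Sara Chen", "Taylor Reed"]
    print(tally_bylines(sample))
```

Keeping an explicit "unknown" bucket matters: a name lookup can only guess, and many names do not map cleanly onto either category.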

Participants represented a wide range of backgrounds: academics, journalists, online communities, activists, and various combinations of each. They were united by a common cause that transcended field-specific jargon and silos. Many, if not all, of the projects are too early to announce here, but I’d expect to see some exciting projects in the future that:

  • help institutions conduct internal analyses to measure which genders are being heard and published
  • measure reactions to women’s contributions to a variety of user-generated communities
  • respond to online harassment in a variety of creative and effective ways
  • plot the trajectories of successful female influencers with an eye towards replicating inflection points like mentorships and fellowships
    • The Op-Ed Project is leading this initiative and will be using MediaCloud’s media analysis tools. MediaCloud has been used by Yochai Benkler and yours truly to reverse-engineer the growth of major news stories like SOPA/PIPA and Trayvon Martin, respectively. This project hopes to reverse-engineer how female thought leaders built their public influence in order to create more of them.

On the morning of the second day, Nate told us an inspirational story about Microsoft’s bathroom signs. Specifically, the active accessibility icon. This update to the passive wheelchair icon was created by Sara Hendren, a graduate student at the time, working with stencils donated by a small community of support. What started as a guerilla street art project targeting existing signs eventually became New York state law for all new signage. Sara’s work shows that small, principled projects with very little budget have the potential to take off and create huge impact. I wouldn’t be at all surprised to see one of the projects initiated at this workshop take off in a similar way.

*Unlike code, gender is not a binary (and maybe code won’t always be, either). Unfortunately, many of the gender data projects we see have to treat gender as simply “male” or “female”. Sometimes this is because we’re working with limited data that has already collapsed a continuum of options into these two. Companies like Facebook have begun offering their users many more options, which in turn improves the resolution of the data with which researchers and designers work.
