Measuring the Impact of Negative Language on FOSS Participation (Part I)

A recent academic paper, “Differentiating Communication Styles of Leaders on the Linux Kernel Mailing List”, showed clear differences in the communication styles of two of the top Linux kernel developers. One leader is much more likely to say “thank you”, while the other is more likely to jump into a conversation with a “well, actually”.

Many open source contributors have stories of their patches being harshly rejected. Some people are able to “toughen up” and continue participating, while others move on to a different project. The question is, how many people end up leaving a project due to harsh language? Are people who experience positive language more likely to contribute more to a project? Just how positive do core open source contributors need to be in order to attract newcomers and grow their community? Which community members are good at mentoring newcomers and helping them step into leadership roles?

I’ve been having a whole lot of fun coming up with scientific research methods to answer these questions, and I’d like to thank Mozilla for funding that research through their Participation Experiment program.

How do you measure positive and negative language?

The Natural Language Processing (NLP) field tries to teach computers to parse and derive meaning from human language. When you ask your phone a question like, “How old was Ada Lovelace when she died?”, somewhere a server has to run a speech-to-text algorithm. NLP allows that server to parse the text into a subject (“Ada Lovelace”) and other sentence parts, which allows the server to respond with the correct answer: “Ada Lovelace died at the age of 36.”
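
To make that concrete, here’s a minimal sketch of the parsing step using NLTK (a real question-answering system does far more than this, and the exact tagger output may vary by NLTK version):

    # Tokenize a question and tag each word with its part of speech using NLTK.
    import nltk

    # One-time downloads of the tokenizer and part-of-speech tagger models.
    nltk.download("punkt")
    nltk.download("averaged_perceptron_tagger")

    question = "How old was Ada Lovelace when she died?"

    tokens = nltk.word_tokenize(question)   # split the question into words
    tagged = nltk.pos_tag(tokens)           # tag each word (NNP = proper noun, etc.)
    print(tagged)
    # The words tagged NNP ("Ada", "Lovelace") identify the subject of the question.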

Several open source NLP libraries, including the Natural Language Toolkit (NLTK) and Stanford CoreNLP, also include sentiment analysis. Sentiment analysis attempts to determine the “tone” and objectiveness of a piece of text. I’ll do more of a deep dive into sentiment analysis next month in part II of this blog post. For now, let’s talk about a more pressing question.
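
As a quick taste of what sentiment analysis output looks like, here’s a small sketch using NLTK’s built-in VADER analyzer (just one of several possible approaches; part II will dig into the details):

    # Score a couple of review-style comments with NLTK's VADER sentiment analyzer.
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon")  # one-time download of the VADER lexicon

    sia = SentimentIntensityAnalyzer()
    for comment in ["Thanks for the patch, this looks great!",
                    "Well, actually, this is completely wrong."]:
        # polarity_scores() returns negative/neutral/positive scores plus a
        # combined "compound" score from -1 (very negative) to +1 (very positive).
        print(comment, sia.polarity_scores(comment))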

How do you define open source participation?

On the surface, this question seems so simple. If you look at any github project page, Linux Foundation kernel report, or OpenStack statistics page, you’ll see a multitude of graphs analyzing code contribution statistics. How many lines of code do people contribute? How frequently? Did we have new developers contribute this year? Which companies had the most contributions?

You’ll notice a particular emphasis here, a bias if you will. All of these measurements are about how much code an individual contributor got merged into a code base. However, open source developers don’t act alone to create a project. They are part of a larger system of contributors who work together.

In order for code or documentation to be merged, it has to be reviewed. In open source, we encourage peer review in order to make sure the code is maintainable and (mostly) free of bugs. Some reports measure the work maintainers do, but they often lack recognition for the efforts of code reviewers. Bug reports are seen as bad, rather than proof that the project is being used and its features are being tested. People may measure the number of closed vs open bug reports, but very few measure and acknowledge the people who submit issues, gather information, and test fixes. Open source projects would be constantly crashing without the contribution of bug reporters.

All of these roles (reviewer, bug reporter, debugger, maintainer) are valuable ways to contribute to open source, but no one measures them because the bias in open source is towards developers. We talk even less about the vital non-coding contributions people make (conference planning, answering questions, fundraising, etc.). Those are invaluable but harder to measure and attribute.

For this experiment, I hope to measure some of the less talked-about ways to contribute. I would love to extend this work to the many different contribution methods and tools that open source communities use to collaborate. However, it’s important to start small and develop a good framework for testing hypotheses like mine about negative language impacting open source participation.


How do you measure open source participation?

For this experiment, I’m focusing on open source communities on github. Why? The data is easier to gather than for projects that take contributions over mailing lists, because the discussion around a contribution is all in one place and it’s easy to attribute replies to the right people. Plus, there are a lot of libraries in different languages that provide github API wrappers. I chose to work with the github3.py library because it looked to be actively maintained and it had good documentation.
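
As an illustration, fetching a repository’s issues and their comments with github3.py looks roughly like this (a sketch assuming the github3.py 1.x API; the token is a placeholder):

    # Walk every issue (open and closed) in a repository and its discussion thread.
    import github3

    gh = github3.login(token="YOUR_GITHUB_TOKEN")  # placeholder token
    repo = gh.repository("rust-lang", "rust")

    for issue in repo.issues(state="all"):
        print(issue.number, issue.title)
        for comment in issue.comments():
            # Each comment records who said what, and when.
            print("  ", comment.user.login, comment.created_at)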

Of course, gathering all the information from github isn’t easy when you want to do sentiment analysis over every single community interaction. When you do, you’ll quickly run into their API request rate limit of 5,000 requests per hour. There are two projects that archive the “public firehose” of all github events: http://githubarchive.org and http://ghtorrent.org. However, those projects only archive events that happened after 2011 or 2012, and some of the open source communities I want to study are older than that. Plus, downloading and filtering through several terabytes of data would probably take just as long as slurping just the data I need through a smaller straw (and would allow me to avoid awkward conversations with my ISP).
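
One way to cope with that rate limit is to watch the rate-limit headers github sends back with every response and sleep until the hourly quota resets. Here’s a rough sketch using the requests library (error handling and pagination omitted; the helper name is just for illustration):

    # Make a GitHub API request, pausing when the hourly rate limit is exhausted.
    import time
    import requests

    def github_get(url, token):
        resp = requests.get(url, headers={"Authorization": "token " + token})
        remaining = int(resp.headers.get("X-RateLimit-Remaining", 1))
        reset_at = int(resp.headers.get("X-RateLimit-Reset", time.time()))
        if remaining == 0:
            # Sleep until GitHub says the quota resets, plus a small buffer.
            time.sleep(max(reset_at - time.time(), 0) + 1)
        return resp.json()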

For my analysis, I wanted to pull down all open and closed issues and pull requests, along with their comments. For a community like Rust, which has been around since 2010, the data (as of a week or two ago) looks like this:

  • 18,739 issues
  • 18,464 pull requests
  • 182,368 comments on issues and pull requests
  • 31,110 code review comments

Because of some oddities with the github API (did you know that an issue’s JSON data can represent either an issue or a pull request?), it took about 20 hours to pull down the information I needed.
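
For the curious: the Issues API lists pull requests alongside plain issues, and per github’s documentation a pull request shows up with an extra “pull_request” key in its issue JSON. A tiny helper (the names here are just for illustration) to split the two:

    # Separate plain issues from pull requests in a list of issue JSON dicts.
    def is_pull_request(issue_json):
        """True if this 'issue' from the Issues API is actually a pull request."""
        return "pull_request" in issue_json

    def split_issues_and_pulls(items):
        issues = [i for i in items if not is_pull_request(i)]
        pulls = [i for i in items if is_pull_request(i)]
        return issues, pulls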

I’m still sorting through how exactly I want to graph the data and measure participation over time. I hope to have more to share in a week!

*Edit* The code is available on github, and the reports for various open source communities are also available.

Comments

  1. I suspect Github would be willing to increase your API quotas for a project like this.

    Awesome work, thank you!

    1. Yep, I’ve already talked to github about an API limit increase. At this point, I have all the data I need for the small set of communities I want to study.

  2. I wrote a 7-piece series of articles against verbal abuse and Torvalds for root.cz, a Linux and open source specialized website. I am a co-author of the Twibright Links browser and the author of the open source Ronja wireless optical link. I also cite this study there. They published 3 of the 7 articles and said they would stop there. I have published 33 articles with this company and they have never before refused an article of mine. These articles about verbal abuse have very high comment counts, and I believe they are well written, with all arguments and facts based on citations of studies from renowned sources. Now they have cost me the royalties for 4 articles. I get the impression that the culture of verbal abuse is really ingrained in IT.

    I am arguing in the articles that verbal abuse is violence, that bullying is endemic in IT, and that verbal abuse causes both physical and psychiatric damage to the human brain (the cited studies are supported by NIMH, the biggest psychiatry research centre in the world). I am giving examples where it leads to loss of productivity and team damage on large open source projects: Sarah Sharp leaving Linux, and Theo de Raadt being kicked out of NetBSD.

    https://www.root.cz/serialy/dopousti-se-linus-torvalds-verbalniho-zneuzivani/

    I think these articles are among the best I have ever written for this company, yet they are talking about them as if they were low quality and as if I “can write better ones”. My articles about verbal abuse in particular reference scientific studies constantly and have a clear structure of introduction, definitions, examples of harm, and proposed solutions, so they are somewhat in the style of a scientific paper.

    They are claiming the readers are receiving the articles negatively (no wonder, with so many perpetrators among them; cf. the prevalence of bullying in IT). I don’t read the reactions and comments because they are full of verbal abuse themselves.

    I perceive this as a kind of censorship and an unfair rejection. The articles are apparently being rejected because they show a reality that perpetrators of verbal abuse don’t want to see.

    I am disgusted by this.
