It’s fascinating to see bots show up in the contributor statistics. For example, if you look at GitHub users who comment on issues in the Rust community, you’ll quickly notice two contributors who interact a lot:
bors is a bot that runs pull requests through the Rust continuous integration test suite and automatically merges the code into the master branch if it passes. bors responds to commands issued in pull request comments (of the form ‘@bors r+ [commit ID]’) by community members with permission to merge code into rust-lang/rust.
rust-highfive is a bot that recommends a reviewer based on the contents of the pull request. It then adds a comment that tags the reviewer, who will get a GitHub notification (and possibly an email, if they have that set up).
Both bots have been set up by the Rust community in order to make pull request review smoother. bors is designed to cut down the amount of time developers need to spend running the test suite on code that’s ready to be merged. rust-highfive is designed to make sure the right person is aware of pull requests that may need their experienced eye.
But just how effective are these GitHub bots? Are they really helping the Rust community, or are they just causing more noise?
Chances of a successful pull request
bors merged its first pull request on 2013-02-02. The year before bors was introduced, only 330 out of 503 pull requests were merged. The year after, 1574 out of 2311 pull requests were merged. So the Rust community had more than four times as many pull requests to review.
Assuming that the tests bors used were some of the same tests Rust developers were running manually, we would expect that pull requests would be rejected at about the same rate (or maybe rejected more, since the automatic CI system would catch more bugs).
To test that assumption, we turn to a statistical method called the chi-squared test. It helps answer the question, “Is there a difference in the success rates of two samples?” In our case, it helps us answer the question, “After bors was used, did the percentage of accepted pull requests change?”
It looks like there’s no statistical difference in the chances of getting a random pull request merged before or after bors started participating. That’s pretty good, considering the number of pull requests submitted quadrupled.
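As a rough check, the test can be re-run on the merged and total counts quoted above. This is a minimal sketch, not the analysis code behind this post: the function name `chi_squared_2x2` is made up for illustration, and it assumes a plain Pearson chi-squared test without a continuity correction, which may differ from what the original analysis used.

```python
import math

def chi_squared_2x2(merged_a, total_a, merged_b, total_b):
    """Pearson chi-squared test (no continuity correction) on a 2x2 table.

    Returns (statistic, p_value) for one degree of freedom.
    """
    a, b = merged_a, total_a - merged_a  # sample A: merged, not merged
    c, d = merged_b, total_b - merged_b  # sample B: merged, not merged
    n = total_a + total_b
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For one degree of freedom, the chi-squared survival function
    # reduces to the complementary error function.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Counts quoted in this post: the year before and the year after bors.
statistic, p_value = chi_squared_2x2(330, 503, 1574, 2311)
print(f"merge rate before: {330 / 503:.1%}, after: {1574 / 2311:.1%}")
print(f"chi-squared = {statistic:.2f}, p = {p_value:.2f}")
```

Under these assumptions the statistic comes out near 1.18 with a p-value around 0.28, far above the 0.01 cutoff that a 99% confidence level implies, so the two merge rates are not distinguishable.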
Now, what about rust-highfive? Since the bot is supposed to recommend pull request reviewers, we would hope that pull requests would have a higher chance of getting accepted. Let’s look at the chances of getting a pull request merged for the year before and the year after rust-highfive was introduced (2014-09-18).
So yes, it does seem like rust-highfive is effective at getting the right developer to notice a pull request they need to review and merge.
Impact on time a pull request is open
One of the hopes of a programmer who designs a bot is that it will cut down on the amount of time that the developer has to spend on simple repetitive tasks. A bot like bors is designed to run the CI suite automatically, leaving the developer more time to do other things, like review other pull requests. Maybe that means pull requests get merged faster?
To test the impact of bors on the amount of time a pull request is open, we turn to the two-means hypothesis test. It tells you whether there’s a statistically significant difference between the means of two different data sets. In our case, we compare the length of time a pull request is open. The two populations are the pull requests from the year before and the year after bors was introduced.
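For readers who want to try this kind of comparison themselves, here is a minimal sketch of a two-means test. It assumes the large-sample (z-test) form with a two-sided p-value, which may not be the exact variant used for this post. The function name `two_means_z_test` is hypothetical, and the sample durations are made-up numbers, not Rust data.

```python
import math
import statistics

def two_means_z_test(sample_a, sample_b):
    """Large-sample two-means test.

    Returns (difference of means, z statistic, two-sided p-value).
    """
    mean_a = statistics.fmean(sample_a)
    mean_b = statistics.fmean(sample_b)
    # Standard error of the difference, using each sample's own variance.
    std_err = math.sqrt(
        statistics.variance(sample_a) / len(sample_a)
        + statistics.variance(sample_b) / len(sample_b)
    )
    z = (mean_b - mean_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return mean_b - mean_a, z, p_value

# Made-up open times in days, purely for illustration (NOT the Rust data).
days_before = [1.0, 2.0, 3.0, 4.0, 5.0] * 20
days_after = [2.0, 3.0, 4.0, 5.0, 6.0] * 20

diff, z, p_value = two_means_z_test(days_before, days_after)
print(f"difference in means: {diff:.1f} days, z = {z:.2f}, p = {p_value:.2g}")
```

A p-value below 0.01 would indicate a difference in mean open time at the 99% confidence level; a larger one means the data can’t distinguish the two periods.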
We would hope to see the average open time of a pull request go down after bors was introduced, but that’s not what the data shows. The graph shows the length of time actually increased, with an increase of 1.1 days.
What about rust-highfive? We would hope that a bot that recommends a reviewer would cause pull requests to get closed sooner.
The graph shows there’s no statistical evidence that rust-highfive made a difference in the length of time pull requests were open.
These results seemed odd to me, so I did a little bit of digging to generate a graph of the average time a pull request is open for each month:
The length of time pull requests are open has been increasing for most of the Rust project’s history. That explains why comparing pull request age before and after bors showed an increase in the wait time to get a pull request merged. The second line shows the point at which rust-highfive was introduced, and we do see a decline in the wait time. Since the decrease is almost symmetrical with the increase the year before, the average was the same for the two years.
What can we conclude about GitHub bots from all these statistics?
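A monthly average like this can be sketched in a few lines. The code below is an illustration under assumptions, not the script used for the graph: it assumes each pull request is available as an (opened, closed) pair of datetimes and that pull requests are bucketed by the month they were opened. The function name and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def mean_open_days_by_month(pull_requests):
    """Map 'YYYY-MM' (the month a pull request was opened) to mean days open."""
    buckets = defaultdict(list)
    for opened, closed in pull_requests:
        days_open = (closed - opened).total_seconds() / 86400
        buckets[opened.strftime("%Y-%m")].append(days_open)
    # Sort by month so the result reads as a time series.
    return {month: sum(days) / len(days) for month, days in sorted(buckets.items())}

# Hypothetical records, purely for illustration (NOT the Rust data).
sample = [
    (datetime(2015, 1, 1), datetime(2015, 1, 3)),
    (datetime(2015, 1, 10), datetime(2015, 1, 14)),
    (datetime(2015, 2, 1), datetime(2015, 2, 2)),
]
monthly = mean_open_days_by_month(sample)
print(monthly)
```

Bucketing by the month a pull request was closed instead would shift long-running pull requests into later months, so the choice is worth stating alongside any graph built this way.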
At the 99% confidence level, we found no statistical evidence that adding the bors bot to automatically merge changes after they pass the CI tests affected the chances of a random pull request getting merged.
We can say with 99% confidence that rust-highfive increases a Rust developer’s chances of getting code merged, by as much as 11.7%. The bot initially helped lower the amount of time developers had to wait for their pull requests to be merged, but something else changed in May 2015 that caused the wait time to increase again. I’ll note that Rust version 1.0 came out in May 2015. Rust developers may have been more cautious about accepting pull requests after the API was frozen, or the volume of pull requests may have increased. It’s unclear without further study.
This is awesome, can I help?
If you’re interested in metrics analysis for your community, please leave a note in the comments or drop an email to my consulting business, Otter Tech. I could use some help identifying the GitHub usernames for bots in other communities I’m studying:
- angular.js – bug report and metrics
- .net – bug report and metrics
- elm – bug report and metrics
- react – bug report and metrics
- fsharp – bug report and metrics
- idris – bug report and metrics
- jquery – bug report and metrics
- node.js – bug report and metrics
- ruby on rails – bug report and metrics
- bootstrap – bug report and metrics
This blog post is part of a series on open source community metrics analysis: