Deciphering the Story of Software Development through Frequent Pattern Mining

Nicolas Bettenburg and Andrew Begel

Queen's University, Canada; Microsoft Research, USA

Track: New Ideas and Emerging Results
Session: Supporting Tomorrow's Developer
Software teams record their work progress in task repositories which often require them to encode their activities in a set of edits to field values in a form-based user interface. When others read the tasks, they must decode the schema used to write the activities down. We interviewed four software teams and found out how they used the task repository fields to record their work activities. However, we also found that they had trouble interpreting task revisions that encoded for multiple activities at the same time. To assist engineers in decoding tasks, we developed a scalable method based on frequent pattern mining to identify patterns of frequently co-edited fields that each represent a conceptual work activity. We applied our method to our two years of our interviewee's task repositories and were able to abstract 83,000 field changes into just 27 patterns that cover 95% of the task revisions. We used the 27 patterns to render the teams' tasks in web-based English newsfeeds and evaluated them with the product teams. The team agreed with most of our patterns and English interpretations, but outlined a number of improvements that we will incorporate into future work.