We conducted a between-group user study with 10 participants. They were randomly divided into two groups, a control and a treatment. We gave the participants a code base that they were not familiar with, and asked them to answer questions about the history of the code base. The participants in the control group used only EGit, and those in the treatment group used only Tempura to explore code history.
The study involved two sessions, with the entire study lasting about 1 hour. During the first session, participants were given a version of the LANSimulation project (adapted and modified from the original LANSimulation project from the Refactoring Lab Session exercise developed at LORE) and asked to study and understand the code base in 15 minutes. In the second session, immediately following the first, participants were given the same project after it had undergone 20 revisions.
Changes made to LANSimulation project:
1. Encapsulate fields in Message class
2. Encapsulate fields in Node class
3. Non-code changes
4. Extract a new method called log in Network class
5. Rename Message class to Packet
6. Non-code changes
7. Inline printAccounting method in Network class
8. Add getter and setter methods for firstNode field, and getter method for workstations field in Network class
9. Move DefaultExample method from Network class to LANSimulation class
10. Move log method from Network class to Node class
11. Move printDocument method from Network class to Node class
12. Non-code changes
13. Add Printer and Workstation classes that extend Node class, and remove type field from Node class
14. Extract isAtDestination method in Network class
15. Rename printDocument method in Node class to printJobStatus
16. Add LANSimulationUtil.jar that contains NetworkPrinter hierarchy, and deprecate previous print methods in Network class
17. Fix assertEquals calls in LANTests class
18. Clean up try-catch statements in LANTests class
19. Non-code changes
20. Add empty test methods for testing simple, XML, and HTML print functions
Participants answered a set of questions regarding the changes. Both groups were given a written user guide (EGit, Tempura) for the tool they used prior to the user study, and were also allowed to refer to the user guide at any point during the study.
Questions given to user study subjects:
1. What happened to the Message class?
2. What happened to the private Network.printAccounting method?
3. What happened to the Network.printDocument method?
4. Can you identify any other methods that were previously defined in Network class?
5. What are the changes made to/in the Node class?
6. Implement the bodies of testPrint, testHTMLPrint, and testXMLPrint methods in the LANTests class
Grading rubric:
1. Renamed to Packet (2pts), Other changes (1pt)
2. Inlined (2pts), Other changes (1pt)
3. Moved from Network to Node (1pt), Renamed (1pt)
4. DefaultExample (1pt), log (1pt)
5. Encapsulation (1pt), log from Network (1pt), Added NetworkPrinter hierarchy (1pt), printDocument from Network (1pt), printJobStatus (renamed from printDocument) (1pt)
6. Implement test methods using NetworkPrinter classes (2pts per test method)
Maximum possible score: 21
The results were scored following the grading rubric shown above. We also measured the time it took participants to answer the questions. More specifically, each user study session was recorded using screencast software, and the recordings were analyzed after the study to determine the time that participants spent using the designated tools. Tool usage was counted whenever any window or interface of the tool was in focus. We concentrated on tool usage time, as opposed to the time it took participants to finish the user study, in order to eliminate as many variables as possible, for example, participants' experience with Eclipse and speed of programming. We also calculated the rate of information acquirement by dividing the raw score by the tool usage time, to obtain a more precise indication of how efficiently the tools help developers gain an understanding of code history.
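The rate calculation can be sketched as follows. This is a minimal illustration, not the scripts used in the study: it assumes the raw score is the number of rubric points (out of 21) and that tool usage time is converted from seconds to minutes, which reproduces the "Score per Min." values reported in the results tables. The function names are hypothetical.

```python
MAX_SCORE = 21  # maximum possible rubric score stated in the study


def raw_score(score_percent: float, max_score: int = MAX_SCORE) -> int:
    """Recover rubric points from a percentage score (assumes whole points)."""
    return round(score_percent / 100 * max_score)


def score_per_minute(points: float, tool_seconds: float) -> float:
    """Rate of information acquirement: rubric points per minute of tool usage."""
    if tool_seconds <= 0:
        raise ValueError("tool usage time must be positive")
    return points / (tool_seconds / 60)


# Two rows taken from the results tables: C1 (EGit) and T5 (Tempura).
c1 = score_per_minute(raw_score(28.6), 1387)
t5 = score_per_minute(raw_score(71.4), 843)

print(f"C1: {c1:.2f} points/min")
print(f"T5: {t5:.2f} points/min")
```

Dividing by tool usage time rather than total session time keeps the rate tied to the tool itself, in line with the motivation given above.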
On average, the participants using Tempura scored 17 percentage points higher than the participants using EGit (64.8% versus 47.6%). Participants also used Tempura for a shorter period of time than EGit (1083 s versus 1201 s on average), suggesting that they were able to learn about code history more quickly with Tempura. The higher average rate of information acquirement for participants using Tempura also corroborates this conjecture: participants using Tempura showed roughly 50% higher efficiency in terms of rate of information acquirement than participants using EGit (0.79 versus 0.53 points per minute).
Control group, using EGit
Participant | Score (%) | Time (s) | Score per Min. | Answers | Video* |
C1 | 28.6 | 1387 | 0.26 | answers | C1.mov |
C2 | 57.1 | 1341 | 0.54 | answers | C2.mov |
C3 | 42.9 | 961 | 0.56 | answers | C3.mov |
C4 | 52.4 | 797 | 0.83 | answers | C4.mov |
C5 | 57.1 | 1521 | 0.47 | answers | C5.mov |
Average: | 47.6 | 1201 | 0.53 |
Treatment group, using Tempura
Participant | Score (%) | Time (s) | Score per Min. | Answers | Video* |
T1 | 42.9 | 595 | 0.91 | answers | T1.mov |
T2 | 66.7 | 1202 | 0.70 | answers | T2.mov |
T3 | 66.7 | 1393 | 0.60 | answers | T3.mov |
T4 | 76.2 | 1384 | 0.69 | answers | T4.mov |
T5 | 71.4 | 843 | 1.07 | answers | T5.mov |
Average: | 64.8 | 1083 | 0.79 |
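
The group-level comparison can be checked directly from the reported averages. The sketch below is illustrative only; the dictionary keys and variable names are assumptions, and the input values are the "Average" rows of the two tables above.

```python
# Reported group averages (from the "Average" rows of the results tables).
egit = {"score_pct": 47.6, "time_s": 1201, "rate": 0.53}
tempura = {"score_pct": 64.8, "time_s": 1083, "rate": 0.79}

# Difference in average score, in percentage points.
score_gap_pp = tempura["score_pct"] - egit["score_pct"]

# Relative reduction in average tool usage time, in percent.
time_saving_pct = (egit["time_s"] - tempura["time_s"]) / egit["time_s"] * 100

# Relative gain in average score per minute, in percent.
rate_gain_pct = (tempura["rate"] - egit["rate"]) / egit["rate"] * 100

print(f"Score gap: {score_gap_pp:.1f} percentage points")
print(f"Tool usage time reduction: {time_saving_pct:.1f}%")
print(f"Score-per-minute gain: {rate_gain_pct:.0f}%")
```

With five participants per group, these figures describe the sample only; the sketch makes no claim about statistical significance.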