Exercise 1.3: Identifying a performance bottleneck
Before you begin, you must complete Exercise 1.2: Collecting performance and coverage data.
Performance bottlenecks are areas within the code that slow down or halt execution. In this procedure, you use the Performance Call Graph view to identify a bottleneck within the sort program.
To find bottlenecks:
- In the Profiling Monitor, right-click the Profiling resource, and then select Open With > Performance Call Graph.
The Performance Call Graph view, by default, displays the following information:
- The 20 most time-consuming nodes in the profiling run, plus the Process node that represents the cumulative time of the entire run. In our case, according to the Visible statistics on the status line, there is a total of only 17 nodes, and all 17 are displayed. A node can represent a method, a process, or a thread.
- The dynamic call structure of the program during the profiling run, shown by connecting lines (arcs) linking the nodes. Thicker lines indicate the more time-consuming call paths.
Tip: Right-click a node to display a menu that allows you to focus the display on the node and its descendants (the Subtree), or to manipulate the display in other ways. These menu commands allow you to simplify the mass of data that you collect for even a small application.
- Note that in the Highlight field, located above the graph, Max Path to Root is selected.
Max Path to Root highlighting shows you the single most time-consuming call path in the current run of the application. Specifically, it changes the call graph display in the following ways:
- The node in your program that consumed the most time is selected. In our example, this node is the method quick in class Sort.
- The call path from the selected node to the Process node, which represents the total time for the entire run, is highlighted.
- Note that there is also a bsort method in the call graph, representing the bubble sort algorithm. It's clear from the thickness of the lines that bsort performed better than the quick sort in this run.
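The idea behind Max Path to Root can be sketched in a few lines of Java. This is only an illustration of the concept, not the profiler's actual implementation: the class MaxPathSketch, the helper names, the Sort.main node, and all timing values are hypothetical and made up for the example. The sketch selects the node with the largest time and then repeatedly follows the most expensive incoming arc until it reaches the Process node.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: not the profiler's real algorithm or data model.
class MaxPathSketch {
    // Hypothetical arc: time attributed to calls from caller to callee.
    static final class Arc {
        final String caller, callee;
        final double seconds;
        Arc(String caller, String callee, double seconds) {
            this.caller = caller; this.callee = callee; this.seconds = seconds;
        }
    }

    static List<String> maxPathToRoot(Map<String, Double> time, List<Arc> arcs, String root) {
        // Step 1: select the most time-consuming node other than the root.
        String current = null;
        for (Map.Entry<String, Double> e : time.entrySet()) {
            if (e.getKey().equals(root)) continue;
            if (current == null || e.getValue() > time.get(current)) current = e.getKey();
        }
        List<String> path = new ArrayList<>();
        if (current == null) return path;
        path.add(current);
        // Step 2: walk toward the root along the most expensive incoming arc,
        // skipping callers already on the path (for example, recursive self-calls).
        while (!current.equals(root)) {
            Arc best = null;
            for (Arc a : arcs) {
                if (a.callee.equals(current) && !path.contains(a.caller)
                        && (best == null || a.seconds > best.seconds)) {
                    best = a;
                }
            }
            if (best == null) break; // no caller found; stop early
            current = best.caller;
            path.add(current);
        }
        return path;
    }

    // Made-up numbers; a real run reports its own values.
    static List<String> demoPath() {
        Map<String, Double> time = new HashMap<>();
        time.put("Process", 0.0);
        time.put("Sort.main", 0.01);
        time.put("Sort.Qsort", 0.02);
        time.put("Sort.quick", 2.5);
        time.put("Sort.bsort", 0.4);
        List<Arc> arcs = Arrays.asList(
            new Arc("Process", "Sort.main", 3.0),
            new Arc("Sort.main", "Sort.Qsort", 2.5),
            new Arc("Sort.Qsort", "Sort.quick", 2.5),
            new Arc("Sort.quick", "Sort.quick", 2.4),
            new Arc("Sort.main", "Sort.bsort", 0.4));
        return maxPathToRoot(time, arcs, "Process");
    }

    public static void main(String[] args) {
        System.out.println(demoPath());
    }
}
```

With the made-up data above, the walk starts at Sort.quick and ends at Process, which mirrors the highlighted path described in this step.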
Getting additional performance information
Both the Performance Call Graph view and other views in the platform provide additional details about your application's performance.
You can get additional performance data in the following ways:
- To learn more about how Sort.quick performed, pause your cursor over the Sort.quick node. A tool-tip appears with statistics for the method. Note that the method makes many calls.
- To get a detailed graphical display of the data for the method, double-click the node. The Method Detail view opens.
- In the Method Detail view, look at the Callers pane. Note that the method is called by Sort.Qsort once, but that it calls itself several thousand times. This in itself is not suspicious; a quick sort is typically heavily recursive. However, the relatively large amount of time for the calls is suspicious.
- To examine a sortable list of all methods, right-click the profiling resource and select Open With > Method Statistics.
Click the column header Base Time to sort the methods according to the amount of time spent within each method during the current run. You can see that quick is considerably slower than the bubble sort method bsort.
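To see why a quick sort method can be called once by its caller yet call itself thousands of times, consider the following minimal sketch. It is not the source of the tutorial's Sort class: the class QuickCallCountSketch, its counters, and the input size are assumptions chosen only to illustrate the call pattern reported in the Callers pane.

```java
import java.util.Random;

// Illustrative sketch only: a generic recursive quick sort with call counters.
class QuickCallCountSketch {
    static long outerCalls = 0; // calls made by the wrapper method
    static long selfCalls = 0;  // calls quick makes to itself

    // Wrapper that plays the role of the single external caller.
    static void qsort(int[] a) {
        outerCalls++;
        quick(a, 0, a.length - 1);
    }

    // Recursive quick sort using a middle pivot and Hoare-style partitioning.
    static void quick(int[] a, int lo, int hi) {
        if (lo >= hi) return; // zero or one element: nothing to sort
        int pivot = a[lo + (hi - lo) / 2];
        int i = lo, j = hi;
        while (i <= j) {
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j) {
                int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
                i++;
                j--;
            }
        }
        selfCalls += 2; // two recursive calls per partition step
        quick(a, lo, j);
        quick(a, i, hi);
    }

    // Sorts n pseudo-random integers and returns the sorted array.
    static int[] demo(int n) {
        outerCalls = 0;
        selfCalls = 0;
        Random rng = new Random(42); // fixed seed for repeatable input
        int[] data = new int[n];
        for (int k = 0; k < n; k++) data[k] = rng.nextInt(1_000_000);
        qsort(data);
        return data;
    }

    public static void main(String[] args) {
        demo(5000);
        System.out.println("calls from wrapper: " + outerCalls);
        System.out.println("recursive self-calls: " + selfCalls);
    }
}
```

A high self-call count like this is normal for the algorithm. What a profiler adds is the time attached to those calls, which is the figure to compare against other methods.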
You have now verified that the quick method consumes more time than was expected. You have also seen how to get performance information from the Performance Call Graph and Method Detail views. This information will be important when you inspect the code and figure out where you need to make modifications.
Before inspecting the code, however, we should also find out if there were any methods in the code that were not executed. This check will give us a better understanding of the scope of the application, and will also indicate whether there are any alternative paths this program might contain.
You are ready to begin Exercise 1.4: Checking code coverage, to determine if there are any unexecuted methods and alternative paths.