Overhead calculation
--------------------
The overhead can be shown in two columns as 'Children' and 'Self' when
perf collects callchains.  The 'self' overhead is simply calculated by
adding all period values of the entry - usually a function (symbol).
This is the value that perf shows traditionally and the sum of all the
'self' overhead values should be 100%.

The 'children' overhead of a function is calculated by adding all
period values of its child functions to its own 'self' value, so that
it can show the total overhead of the higher level functions even if
they don't directly execute much.  'Children' here means functions
that are called from another (parent) function.

It might be confusing that the sum of all the 'children' overhead
values exceeds 100% since each of them is already an accumulation of
'self' overhead of its child functions.  But with this enabled, users
can find which function has the most overhead even if samples are
spread over the children.

Consider the following example; there are three functions like below.

-----------------------
void foo(void) {
    /* do something */
}

void bar(void) {
    /* do something */
    foo();
}

int main(void) {
    bar();
    return 0;
}
-----------------------

In this case 'foo' is a child of 'bar', and 'bar' is an immediate
child of 'main' so 'foo' is also a child of 'main'.  In other words,
'main' is a parent of 'foo' and 'bar', and 'bar' is a parent of 'foo'.

Suppose all samples are recorded in 'foo' and 'bar' only.  When they
are recorded with callchains, the output will show something like below
in the usual (self-overhead-only) output of perf report:

----------------------------------
Overhead  Symbol
........  .....................
  60.00%  foo
          |
          --- foo
              bar
              main
              __libc_start_main

  40.00%  bar
          |
          --- bar
              main
              __libc_start_main
----------------------------------

When the --children option is enabled, the 'self' overhead values of
child functions (i.e. 'foo' and 'bar') are added to the parents to
calculate the 'children' overhead.  In this case the report could be
displayed as:

-------------------------------------------
Children      Self  Symbol
........  ........  ....................
 100.00%     0.00%  __libc_start_main
          |
          --- __libc_start_main

 100.00%     0.00%  main
          |
          --- main
              __libc_start_main

 100.00%    40.00%  bar
          |
          --- bar
              main
              __libc_start_main

  60.00%    60.00%  foo
          |
          --- foo
              bar
              main
              __libc_start_main
-------------------------------------------

In the above output, the 'self' overhead of 'foo' (60%) was added to the
'children' overhead of 'bar', 'main' and '\_\_libc_start_main'.
Likewise, the 'self' overhead of 'bar' (40%) was added to the
'children' overhead of 'main' and '\_\_libc_start_main'.

So '\_\_libc_start_main' and 'main' are shown first since they have
the same (100%) 'children' overhead (even though they have zero 'self'
overhead) and they are the parents of 'foo' and 'bar'.

Since v3.16 the 'children' overhead is shown by default and the output
is sorted by its values. The 'children' overhead can be disabled by
specifying the --no-children option on the command line or by adding
'report.children = false' or 'top.children = false' to the perf config
file.
