azakai's blog: massdiff - Diff for Massif Snapshots

Friday, March 18, 2011

massdiff - Diff for Massif Snapshots

Massif (part of Valgrind) is a super-useful tool to find out how a program allocates memory. It gives very detailed graphs about exactly which lines of code do allocations that are not free'd later. For example,

96.42% (22,771,482B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.

->39.75% (9,388,032B) 0x8058AC1: arena_malloc_large (jemalloc.c:3831)

| ->39.75% (9,388,032B) 0x8058DB9: arena_malloc (jemalloc.c:3856)

|   ->34.74% (8,204,288B) 0x8058E98: imalloc (jemalloc.c:3866)

|   | ->34.27% (8,093,696B) 0x805D833: malloc (jemalloc.c:5882)

|   | | ->12.23% (2,887,680B) 0x6BBB63F: sqlite3MemMalloc (sqlite3.c:14221)

...

This shows a snapshot at a particular point in time, during which the whole program allocated ~22MB of memory, of which almost 3MB was due to SQLite.

One limitation though is you can't easily see what changes from one snapshot to another, which is important in order to see if there are any gradual memory increases (these may or may not be actual leaks). So I wrote a small python script, massdiff, which diffs two Massif snapshots. Here is an example of the output:

Diffing snapshots 30 50 

- (heap allocation functions) malloc/new/new[], --alloc-fns, etc. - 22,286,738

    [ diff: +502,088 ]

+ (heap allocation functions) malloc/new/new[], --alloc-fns, etc. - 22,788,826 

-   0x8058927: arena_malloc_small (jemalloc.c:3794) -  7,494,594

      [ diff: +499,552 ]

+   0x8058927: arena_malloc_small (jemalloc.c:3794) -  7,994,146 

-     0x8058D9D: arena_malloc (jemalloc.c:3854) -  7,494,594

        [ diff: +499,552 ]

+     0x8058D9D: arena_malloc (jemalloc.c:3854) -  7,994,146 

-       0x8058E98: imalloc (jemalloc.c:3866) -  6,055,702

          [ diff: +499,232 ]

+       0x8058E98: imalloc (jemalloc.c:3866) -  6,554,934 

-         0x805D833: malloc (jemalloc.c:5882) -  5,552,542

            [ diff: +499,232 ]

+         0x805D833: malloc (jemalloc.c:5882) -  6,051,774 

-           0x52D2FA3: js_malloc (jsutil.h:213) -  1,635,252

              [ diff: +432,064 ]

+           0x52D2FA3: js_malloc (jsutil.h:213) -  2,067,316 

-             0x52D9D27: JSRuntime::malloc(unsigned int, JSContext*) (jscntxt.h:1358) -  1,635,252

                [ diff: +432,064 ]

+             0x52D9D27: JSRuntime::malloc(unsigned int, JSContext*) (jscntxt.h:1358) -  2,067,316 

-               0x52D9DEC: JSContext::malloc(unsigned int) (jscntxt.h:2027) -  1,600,532

                  [ diff: +431,904 ]

+               0x52D9DEC: JSContext::malloc(unsigned int) (jscntxt.h:2027) -  2,032,436 

-                 0x5D5B41D: JSObject::allocSlots(JSContext*, unsigned int) (jsobj.cpp:4032) -    161,456

                    [ diff: +344,832 ]

+                 0x5D5B41D: JSObject::allocSlots(JSContext*, unsigned int) (jsobj.cpp:4032) -    506,288 

-                   0x5D5B594: JSObject::growSlots(JSContext*, unsigned int) (jsobj.cpp:4078) -    161,456

                      [ diff: +344,832 ]

+                   0x5D5B594: JSObject::growSlots(JSContext*, unsigned int) (jsobj.cpp:4078) -    506,288

The diff only shows what changed between the two revisions, and shows that in in tree format, just like Massif's ms_print does. Each group of three lines shows a line from the first snapshot on top, the corresponding line from the later snapshot on bottom, and between them the difference.

The output here is from loading about:blank 500 times in Fennec. There is overall ~500K of additional allocation (so, ~1K per page load), of which JSObject::growSlots is responsible for ~340K (this is later all deallocated at once, presumably due to GC being run).

So far this has been useful in helping discover one specific case, bug 641663, and I'm still investigating some additional issues. Hopefully it can be a useful tool for other people too. To use it,

Run Massif on your program, see here and here
Run ms_print on the output
Run massdiff on that file, with the two snapshot numbers you want to diff as parameters

8 comments:

UnknownMarch 22, 2011 at 4:08 PM
The script doesn't work when you take 100 or more snapshot as it matches one or more spaces at the beginning of the snapshot title - instead it should match zero or more spaces.

i.e.

# Snapshot title
- m = re.match(' +([..regex elided...]')
+ m = re.match(' *([..regex elided...]')
if m:
snapshot = Snapshot()
for i in ['n', 'time', 'total', 'useful_heap', 'extra_heap', 'stacks']:
ReplyDelete
Replies
UnknownMarch 30, 2011 at 5:59 AM
Your script is interesting, but doesn't work when valgrind collects more than 100 snapshots. The problem is on line 42 - when you attempt to match the snapshot title it should have zero or more spaces at the beginning, not one or more.
ReplyDelete
Replies
azakaiMay 2, 2011 at 10:14 AM
Thanks Oliver!
ReplyDelete
Replies
UnknownDecember 17, 2018 at 1:45 PM
Very useful script.
My use case is slightly different: my program loops 100 times over a memory consuming api and I want the details. For that, I use the macro VALGRIND_MONITOR_COMMAND("detailed_snapshot") at the beginning of the api, and it generates 100 snapshot files.
In the same idea, I want to compare 2 snapshot files. Inspired from your script, here is mine. Thanks for the idea.
ReplyDelete
Replies
UnknownDecember 17, 2018 at 1:46 PM
'''
Parse 2 massif snapshot files
============================

Usage: this_script.py SNAPSHOT_1 SNAPSHOT 2

where

SNAPSHOT_1 and 2 are massif snapshots (not a ms_print dump)
typically generated by the macro in the code VALGRIND_MONITOR_COMMAND("detailed_snapshot")
Here is a sample of such a file:

>>>>>>>>>>>>>>
desc: --time-unit=ms --threshold=0.0
cmd: /my_prog.sh
time_unit: ms
#-----------
snapshot=0
#-----------
time=4078176
mem_heap_B=765203759
mem_heap_extra_B=58898945
mem_stacks_B=0
heap_tree=detailed
n29988: 765203759 (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
n187: 230733840 0x1A303B27: ??? (in /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77)
n3: 121525532 0x1B7CCA75: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.390.77)
n2: 123451053 0x1B7B7B6B: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.390.77)
n1: 123451050 0x1B89A702: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.390.77)
n1: 123451050 0x1B89C079: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.390.77)
n2: 123451050 0x1B4FFE67: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.390.77)
n2: 116592620 0x1B503A2B: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.390.77)
n2: 109734190 0x1B4D387A: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.390.77)
>>>>>>>>>>>>>>

As you can see, there is only one snapshot for the whole file, some header
lines, followed by n-lines describing the allocations in a tree-like structure
where indentation is relevant to the call stacks. The first n-line is the
allocated memory for the whole snapshot.
'''
ReplyDelete
Replies
UnknownDecember 17, 2018 at 1:46 PM
import os, sys, re

# define custom parameters

# depth of the tree, 0 means only global memory
max_depth = 1
print 'maximum depth =', max_depth
# tolerance in octets for comparison, 1000=1ko
diff_threshold = 1000000
print 'tolerance =', diff_threshold, 'bytes'

# Read files

class Snapshot: pass
class SnapshotLine:
def __str__(self):
return ('%s%s - %d - %10s' % (' '*self.indent, self.addr, self.mem, self.text))[:130]

snapshots = []

def read_file(file):
print 'reading file', file, '...'
snapshot = Snapshot()
snapshot.file = file
snapshot.lines = []
snapshots.append(snapshot)

started = False
title_read = False
above_max_depth_ws='^' + ' '*(max_depth+1)
for line in open(file, 'r').readlines():
#print line
line = line.replace('\n', '')
if 'snapshot=0' in line:
started = True
continue
if not started: continue
if '#----' in line: continue

# Snapshot title lines
if not title_read:
found = False
for i in ['time', 'mem_heap_B', 'mem_heap_extra_B', 'mem_stacks_B']:
expr = '^' + i + '=(?P[\d]+)'
m = re.match(expr, line)
if m:
setattr(snapshot, i, int(m.group('value')))
found = True
break
if found: continue;
title_read = True

# Snapshot detail line
m = re.match(above_max_depth_ws, line)
if m: continue;

m = re.match('(?P[ ]*)n(?P[\d]+): (?P[\d]+) (?P0x[0-9A-F]*): (?P.*)', line)
if not m:
m = re.match('(?P[ ]*)n(?P[\d]+): (?P[\d]+) (?P.*)', line)
if m:
snapshot_line = SnapshotLine()
snapshot_line.indent = len(m.group('indent'))
snapshot_line.n = int(m.group('n'))
snapshot_line.mem = int(m.group('mem'))
try:
snapshot_line.addr = m.group('addr')
except:
snapshot_line.addr = 0
snapshot_line.text = m.group('text')
#print snapshot_line.__dict__
if snapshot_line.indent <= max_depth:
snapshot.lines.append(snapshot_line)

read_file(sys.argv[1])
read_file(sys.argv[2])

# Generate tree structure

print 'generating tree structure...'
for snapshot in snapshots:
snapshot.roots = {}
for i in range(snapshot.lines.__len__()):
line = snapshot.lines[i]
line.children = {}
indent = line.indent
if indent == 0: snapshot.root = line
elif indent == 1: snapshot.roots[line.addr] = line
else:
# Find parent
j = i-1
while snapshot.lines[j].indent != indent-1: j -= 1
snapshot.lines[j].children[line.text] = line

#print snapshot.file, snapshot.roots

# Dump tree

def dump_lines(lines):
def dump_line(line):
print line
for child in line.children.values():
dump_line(child)
for line in lines:
dump_line(line)

def dump_tree():
print 'Tree:'
for snapshot in snapshots:
print snapshot.file, snapshot.root, snapshot.roots
dump_lines(snapshot.roots.values())

#dump_tree()
ReplyDelete
Replies
UnknownDecember 17, 2018 at 1:47 PM
# Diff two snapshots

print 'diffing...'
def commify(x):
sign = '+' if x >= 0 else '-'
ret = list(str(abs(x))[::-1])
for i in range(len(ret)):
if i % 3 == 2 and i != len(ret)-1:
ret[i] += ','
return sign+(''.join(ret))[::-1]

def diff_dicts(d1, d2):
keys = list(set(d1.keys() + d2.keys()))
data = [[key, 0] for key in keys]

for datum in data:
key = datum[0]
if key not in d2:
datum[1] = -d1[key].mem
elif key not in d1:
datum[1] = d2[key].mem
else:
datum[1] = d2[key].mem - d1[key].mem

data.sort(lambda x, y: y[1]-x[1])

for datum in data:
key = datum[0]
diff = datum[1]
if key not in d2:
print "-", d1[key]
elif key not in d1:
print "+", d2[key]
else:
if abs(diff) > diff_threshold:
print "-/+", d1[key]
print '[diff: %s]' % commify(diff)
diff_dicts(d1[key].children, d2[key].children)

print '-', snapshots[0].file, snapshots[0].root
print '[diff: %s]' % commify(snapshots[1].root.mem - snapshots[0].root.mem)
print '+', snapshots[1].file, snapshots[1].root
print '-------------'
diff_dicts(snapshots[0].roots, snapshots[1].roots)
ReplyDelete
Replies
UnknownDecember 17, 2018 at 1:48 PM
Sorry I had to make several posts because the full script is too large.
ReplyDelete
Replies

Pages

Friday, March 18, 2011

massdiff - Diff for Massif Snapshots

8 comments: