Comments on azakai's blog: massdiff - Diff for Massif Snapshots
Updated 2023-12-12T01:10:23.246-08:00 by azakai (http://www.blogger.com/profile/00792138494525424175)

Anonymous (https://www.blogger.com/profile/01842458736830654183), 2018-12-17T13:48:00.909-08:00:

Sorry, I had to make several posts because the full script is too large.

Anonymous (https://www.blogger.com/profile/01842458736830654183), 2018-12-17T13:47:07.075-08:00:

# Diff two snapshots

print 'diffing...'
def commify(x):
    # Format a signed integer with thousands separators, e.g. +1,234
    sign = '+' if x >= 0 else '-'
    ret = list(str(abs(x))[::-1])
    for i in range(len(ret)):
        if i % 3 == 2 and i != len(ret)-1:
            ret[i] += ','
    return sign+(''.join(ret))[::-1]

def diff_dicts(d1, d2):
    keys = list(set(d1.keys() + d2.keys()))
    data = [[key, 0] for key in keys]

    for datum in data:
        key = datum[0]
        if key not in d2:
            datum[1] = -d1[key].mem
        elif key not in d1:
            datum[1] = d2[key].mem
        else:
            datum[1] = d2[key].mem - d1[key].mem

    # Largest increase first (Python 2 cmp-style sort)
    data.sort(lambda x, y: y[1]-x[1])

    for datum in data:
        key = datum[0]
        diff = datum[1]
        if key not in d2:
            print "-", d1[key]
        elif key not in d1:
            print "+", d2[key]
        else:
            if abs(diff) > diff_threshold:
                print "-/+", d1[key]
                print '[diff: %s]' % commify(diff)
                diff_dicts(d1[key].children, d2[key].children)

print '-', snapshots[0].file, snapshots[0].root
print '[diff: %s]' % commify(snapshots[1].root.mem - snapshots[0].root.mem)
print '+', snapshots[1].file, snapshots[1].root
print '-------------'
diff_dicts(snapshots[0].roots, snapshots[1].roots)
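For readers on Python 3, the core of the diff step in the comment above can be sketched as follows. This is only an illustration under stated assumptions, not the commenter's code: `diff_mem` is a hypothetical helper that works on plain dictionaries of byte counts rather than on `SnapshotLine` objects, and it replaces the cmp-style `sort` (which Python 3 no longer accepts) with a key function.

```python
def commify(x):
    """Format a signed integer with thousands separators, e.g. +1,234."""
    return '{:+,}'.format(x)


def diff_mem(before, after, threshold=0):
    """Return (key, delta) pairs, largest increase first.

    `before` and `after` map a key (for example an address) to a byte count.
    A key missing on one side counts as 0 bytes there and is always reported;
    a key present on both sides is reported only if abs(delta) > threshold.
    (Hypothetical helper, not part of the original script.)
    """
    rows = []
    for key in set(before) | set(after):
        delta = after.get(key, 0) - before.get(key, 0)
        if key in before and key in after and abs(delta) <= threshold:
            continue
        rows.append((key, delta))
    # Sort on a key function: descending delta, then key for a stable order.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows


if __name__ == '__main__':
    old = {'a': 100, 'b': 50, 'c': 10}
    new = {'a': 300, 'c': 10, 'd': 20}
    for key, delta in diff_mem(old, new, threshold=5):
        print(key, commify(delta))
```

The set union of the two key views stands in for `d1.keys() + d2.keys()`, which fails on Python 3 because `keys()` no longer returns a list.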
Anonymous (https://www.blogger.com/profile/01842458736830654183), 2018-12-17T13:46:46.455-08:00:

import os, sys, re


# define custom parameters

# depth of the tree, 0 means only global memory
max_depth = 1
print 'maximum depth =', max_depth
# tolerance in bytes for comparison, 1000 = 1 kB
diff_threshold = 1000000
print 'tolerance =', diff_threshold, 'bytes'


# Read files

class Snapshot: pass
class SnapshotLine:
    def __str__(self):
        return ('%s%s - %d - %10s' % (' '*self.indent, self.addr, self.mem, self.text))[:130]

snapshots = []

def read_file(file):
    print 'reading file', file, '...'
    snapshot = Snapshot()
    snapshot.file = file
    snapshot.lines = []
    snapshots.append(snapshot)

    started = False
    title_read = False
    # Lines indented deeper than max_depth start with this many spaces
    above_max_depth_ws = '^' + ' '*(max_depth+1)
    for line in open(file, 'r').readlines():
        #print line
        line = line.replace('\n', '')
        if 'snapshot=0' in line:
            started = True
            continue
        if not started: continue
        if '#----' in line: continue

        # Snapshot title lines
        if not title_read:
            found = False
            for i in ['time', 'mem_heap_B', 'mem_heap_extra_B', 'mem_stacks_B']:
                expr = '^' + i + '=(?P<value>[\d]+)'
                m = re.match(expr, line)
                if m:
                    setattr(snapshot, i, int(m.group('value')))
                    found = True
                    break
            if found: continue
            title_read = True

        # Snapshot detail line
        m = re.match(above_max_depth_ws, line)
        if m: continue

        m = re.match('(?P<indent>[ ]*)n(?P<n>[\d]+): (?P<mem>[\d]+) (?P<addr>0x[0-9A-F]*): (?P<text>.*)', line)
        if not m:
            # Lines without an address, such as the first n-line
            m = re.match('(?P<indent>[ ]*)n(?P<n>[\d]+): (?P<mem>[\d]+) (?P<text>.*)', line)
        if m:
            snapshot_line = SnapshotLine()
            snapshot_line.indent = len(m.group('indent'))
            snapshot_line.n = int(m.group('n'))
            snapshot_line.mem = int(m.group('mem'))
            try:
                snapshot_line.addr = m.group('addr')
            except IndexError:
                # The fallback regex has no 'addr' group
                snapshot_line.addr = 0
            snapshot_line.text = m.group('text')
            #print snapshot_line.__dict__
            if snapshot_line.indent <= max_depth:
                snapshot.lines.append(snapshot_line)

read_file(sys.argv[1])
read_file(sys.argv[2])


# Generate tree structure

print 'generating tree structure...'
for snapshot in snapshots:
    snapshot.roots = {}
    for i in range(len(snapshot.lines)):
        line = snapshot.lines[i]
        line.children = {}
        indent = line.indent
        if indent == 0: snapshot.root = line
        elif indent == 1: snapshot.roots[line.addr] = line
        else:
            # Find parent: the nearest previous line one level up
            j = i-1
            while snapshot.lines[j].indent != indent-1: j -= 1
            snapshot.lines[j].children[line.text] = line

    #print snapshot.file, snapshot.roots


# Dump tree

def dump_lines(lines):
    def dump_line(line):
        print line
        for child in line.children.values():
            dump_line(child)
    for line in lines:
        dump_line(line)

def dump_tree():
    print 'Tree:'
    for snapshot in snapshots:
        print snapshot.file, snapshot.root, snapshot.roots
        dump_lines(snapshot.roots.values())

#dump_tree()
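The line matching in the comment above relies on named groups (`value`, `indent`, `n`, `mem`, `addr`, `text`), which the script reads back with `m.group(...)`. A minimal Python 3 sketch of the same idea is shown here. The function names `parse_detail` and `parse_header` are hypothetical, and folding the address into one optional group (instead of trying two regexes in turn) is a design choice of this sketch, not something the original script does.

```python
import re

# Detail ("n") line: leading spaces, child count, byte count, then an
# optional "0x...: " address followed by free text.
DETAIL_RE = re.compile(
    r'(?P<indent> *)n(?P<n>\d+): (?P<mem>\d+) '
    r'(?:(?P<addr>0x[0-9A-Fa-f]+): )?(?P<text>.*)'
)

# Header line such as "mem_heap_B=765203759".
HEADER_RE = re.compile(
    r'(?P<key>time|mem_heap_B|mem_heap_extra_B|mem_stacks_B)=(?P<value>\d+)$'
)


def parse_detail(line):
    """Return a dict for a detail line, or None if the line is not one."""
    m = DETAIL_RE.match(line)
    if m is None:
        return None
    return {
        'indent': len(m.group('indent')),
        'n': int(m.group('n')),
        'mem': int(m.group('mem')),
        'addr': m.group('addr'),  # None when the line carries no address
        'text': m.group('text'),
    }


def parse_header(line):
    """Return (key, int_value) for a known header line, else None."""
    m = HEADER_RE.match(line)
    return (m.group('key'), int(m.group('value'))) if m else None


if __name__ == '__main__':
    print(parse_header('mem_heap_B=765203759'))
    print(parse_detail(' n187: 230733840 0x1A303B27: ??? (in libGLX_nvidia.so)'))
```

With an optional group, a missing address comes back as `None` from `m.group('addr')`, so no exception handling is needed for lines without one.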
Anonymous (https://www.blogger.com/profile/01842458736830654183), 2018-12-17T13:46:30.361-08:00:

'''
Parse 2 massif snapshot files
============================

Usage: this_script.py SNAPSHOT_1 SNAPSHOT_2

where

 SNAPSHOT_1 and SNAPSHOT_2 are massif snapshots (not a ms_print dump),
 typically generated by the macro VALGRIND_MONITOR_COMMAND("detailed_snapshot") in the code.
 Here is a sample of such a file:

>>>>>>>>>>>>>>
desc: --time-unit=ms --threshold=0.0
cmd: /my_prog.sh
time_unit: ms
#-----------
snapshot=0
#-----------
time=4078176
mem_heap_B=765203759
mem_heap_extra_B=58898945
mem_stacks_B=0
heap_tree=detailed
n29988: 765203759 (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
 n187: 230733840 0x1A303B27: ??? (in /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77)
 n3: 121525532 0x1B7CCA75: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.390.77)
 n2: 123451053 0x1B7B7B6B: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.390.77)
 n1: 123451050 0x1B89A702: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.390.77)
 n1: 123451050 0x1B89C079: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.390.77)
 n2: 123451050 0x1B4FFE67: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.390.77)
 n2: 116592620 0x1B503A2B: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.390.77)
 n2: 109734190 0x1B4D387A: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.390.77)
>>>>>>>>>>>>>>

 As you can see, there is only one snapshot for the whole file: some header
 lines, followed by n-lines describing the allocations in a tree-like structure
 where the indentation reflects the depth in the call stacks. The first n-line is the
 allocated memory for the whole snapshot.
'''
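To illustrate how indentation maps to a tree, here is a small Python 3 sketch that rebuilds parent/child links from (indent, mem, text) triples using a stack. It is a simplified stand-in, not the commenter's code: the `Node` class and `build_tree` function are hypothetical names, and the sketch assumes the first entry has indent 0 and that each line is at most one level deeper than the line before it.

```python
class Node:
    """One n-line of a snapshot (hypothetical helper class)."""

    def __init__(self, indent, mem, text):
        self.indent = indent
        self.mem = mem
        self.text = text
        self.children = []


def build_tree(entries):
    """Build a tree from (indent, mem, text) triples given in file order.

    Assumes the first entry has indent 0 and that indentation never grows
    by more than one level between consecutive entries.
    """
    root = None
    stack = []  # stack[d] is the most recent node seen at depth d
    for indent, mem, text in entries:
        node = Node(indent, mem, text)
        del stack[indent:]  # forget nodes at this depth or deeper
        if stack:
            stack[-1].children.append(node)  # nearest shallower node is the parent
        elif root is None:
            root = node
        else:
            raise ValueError('more than one top-level entry')
        stack.append(node)
    return root


if __name__ == '__main__':
    sample = [(0, 100, 'root'), (1, 60, 'a'), (2, 40, 'a1'),
              (2, 20, 'a2'), (1, 40, 'b')]
    tree = build_tree(sample)
    print([child.text for child in tree.children])
```

Keeping a stack of the current ancestors avoids scanning backwards through earlier lines to find each parent.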
Anonymous (https://www.blogger.com/profile/01842458736830654183), 2018-12-17T13:45:20.668-08:00:

Very useful script.
My use case is slightly different: my program loops 100 times over a memory-consuming API and I want the details. For that, I use the macro VALGRIND_MONITOR_COMMAND("detailed_snapshot") at the beginning of the API, and it generates 100 snapshot files.
Along the same lines, I want to compare 2 snapshot files. Inspired by your script, here is mine. Thanks for the idea.

azakai (https://www.blogger.com/profile/00792138494525424175), 2011-05-02T10:14:21.355-07:00:

Thanks Oliver!

Unknown (https://www.blogger.com/profile/06766643514947591311), 2011-03-30T05:59:15.026-07:00:

Your script is interesting, but doesn't work when valgrind collects more than 100 snapshots. The problem is on line 42 - when you attempt to match the snapshot title it should have zero or more spaces at the beginning, not one or more.

Unknown (https://www.blogger.com/profile/06766643514947591311), 2011-03-22T16:08:49.012-07:00:

The script doesn't work when you take 100 or more snapshots, as it matches one or more spaces at the beginning of the snapshot title - instead it should match zero or more spaces.

i.e.

    # Snapshot title
-   m = re.match(' +([..regex elided...]')
+   m = re.match(' *([..regex elided...]')
    if m:
      snapshot = Snapshot()
      for i in ['n', 'time', 'total', 'useful_heap', 'extra_heap', 'stacks']:
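The difference between the two quantifiers can be shown with a short Python 3 sketch. The real title regex is elided in the comment above, so the two patterns below are hypothetical, simplified stand-ins that keep only the leading-space part and the snapshot number, and the two sample rows are made up for illustration rather than copied from actual tool output.

```python
import re

# Hypothetical stand-ins for the elided title regex: only the quantifier
# on the leading spaces differs between the two patterns.
ONE_OR_MORE = re.compile(r' +(\d+) ')
ZERO_OR_MORE = re.compile(r' *(\d+) ')


def snapshot_number(pattern, line):
    """Return the leading snapshot number, or None if the line does not match."""
    m = pattern.match(line)
    return int(m.group(1)) if m else None


# Made-up rows: a two-digit number padded with a space, and a three-digit
# number that fills the column and therefore has no leading space.
padded = ' 99  1,000  2,000'
unpadded = '100  1,000  2,000'

if __name__ == '__main__':
    for name, pattern in (("' +'", ONE_OR_MORE), ("' *'", ZERO_OR_MORE)):
        print(name, snapshot_number(pattern, padded),
              snapshot_number(pattern, unpadded))
```

Because `re.match` anchors at the start of the string, a pattern beginning with ` +` cannot match a row whose first character is a digit, which is the failure both commenters describe.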