Slowdowns due to Dynamic Binary Instrumentation when
generating Basic Block Vectors for the SPEC CPU Benchmarks
These results were generated on our domori cluster. The machines
are all dual-processor, dual-core Intel Pentium D systems running at
3.46GHz with 4GB of RAM. The operating system was Linux 2.6.23.9-perfmon,
and filesystem access went over NFSv3 on gigabit Ethernet.
The benchmarks were run multiple times to account for noise in the
environment, and the fastest result was chosen.
The benchmarks were all compiled as 32-bit x86 Linux binaries using
gcc version 4.1.0 with -O2 optimization on a SuSE 10.1 box.
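The fastest-of-several-runs selection described above can be sketched as follows. This is a minimal illustration only: the function name `fastest_run` and the sample timings are hypothetical and do not come from the actual measurement scripts.

```python
def fastest_run(times_in_seconds):
    """Return the smallest wall-clock time from repeated runs of one benchmark."""
    if not times_in_seconds:
        raise ValueError("need at least one timing")
    return min(times_in_seconds)


# Hypothetical timings (in seconds) for three runs of the same benchmark.
runs = [201.4, 198.0, 199.7]
print(fastest_run(runs))  # 198.0
```

Taking the minimum rather than the mean treats environmental noise (for example NFS traffic) as something that can only add time to a run.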
Summary Results
Average Slowdown Factor
DBI tool | CPU 2000 Total | CPU 2000 Int | CPU 2000 FP | CPU 2006 Total | CPU 2006 Int | CPU 2006 FP |
Pin | 15.33 | 19.36 | 6.47 | 11.33 | 13.41 | 7.69 |
Qemu | 27.71 | 27.92 | 27.25 | 24.27 | 19.61 | 32.44 |
Valgrind | 38.61 | 41.91 | 31.35 | 31.10 | 30.24 | 32.69 |
Detailed Results
Pin
Using the Pinpoints tool from Pin kit
pin-2.0-10520-gcc.4.0.0-ia32-linux
Pin SPEC CPU 2000
Note: in their 2005 PLDI paper,
the Pin developers attribute the slowdowns in gcc
and perlbmk to a lack of code reuse and a high number
of indirect jumps.
benchmark | native time | DBI'd time | slowdown factor |
swim | 3:18 | 10:16 | 3.11 |
applu | 8:18 | 28:18 | 3.41 |
lucas | 4:50 | 17:09 | 3.55 |
mgrid | 3:46 | 15:27 | 4.10 |
mcf | 2:45 | 14:22 | 5.22 |
equake | 1:27 | 7:59 | 5.51 |
art.470 | 1:52 | 10:38 | 5.70 |
sixtrack | 4:40 | 27:08 | 5.81 |
facerec | 3:53 | 24:11 | 6.23 |
fma3d | 5:21 | 33:35 | 6.28 |
ammp | 4:35 | 29:43 | 6.48 |
apsi | 4:28 | 30:16 | 6.78 |
art.110 | 1:27 | 9:55 | 6.84 |
vpr.route | 1:30 | 10:53 | 7.26 |
mesa | 3:13 | 24:36 | 7.65 |
twolf | 3:15 | 34:17 | 10.55 |
perlbmk.makerand | 0:04 | 0:47 | 11.75 |
perlbmk.diffmail | 0:32 | 6:44 | 12.62 |
galgel | 2:23 | 30:15 | 12.69 |
vpr.place | 1:07 | 14:27 | 12.94 |
wupwise | 3:20 | 43:12 | 12.96 |
eon.kajiya | 0:54 | 13:09 | 14.61 |
bzip2.source | 0:47 | 11:40 | 14.89 |
bzip2.graphic | 0:59 | 15:12 | 15.46 |
eon.rushmeier | 0:31 | 8:06 | 15.68 |
crafty | 1:42 | 26:41 | 15.70 |
gzip.log | 0:17 | 4:28 | 15.76 |
eon.cook | 0:37 | 9:52 | 16.00 |
bzip2.program | 0:48 | 13:00 | 16.25 |
parser | 3:21 | 56:54 | 16.99 |
gzip.graphic | 0:35 | 10:00 | 17.14 |
gzip.random | 0:28 | 8:35 | 18.39 |
gzip.program | 0:53 | 17:22 | 19.66 |
vortex.3 | 1:03 | 20:51 | 19.86 |
vortex.1 | 0:57 | 19:01 | 20.02 |
vortex.2 | 1:01 | 21:01 | 20.67 |
gzip.source | 0:28 | 9:47 | 20.96 |
gap | 1:33 | 33:21 | 21.52 |
perlbmk.perfect | 0:14 | 5:20 | 22.86 |
gcc.166 | 0:12 | 5:07 | 25.58 |
gcc.200 | 0:34 | 14:30 | 25.59 |
perlbmk.704 | 0:23 | 10:37 | 27.70 |
perlbmk.957 | 0:35 | 16:10 | 27.71 |
perlbmk.850 | 0:37 | 17:56 | 29.08 |
perlbmk.535 | 0:21 | 10:23 | 29.67 |
gcc.scilab | 0:20 | 10:02 | 30.10 |
gcc.integrate | 0:05 | 2:31 | 30.20 |
gcc.expr | 0:05 | 2:33 | 30.60 |
Average total slowdown: 15.33 (48 total)
Average int slowdown: 19.36 (33 total)
Average fp slowdown: 6.47 (15 total)
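The slowdown factor in these tables is the DBI'd time divided by the native time. A minimal sketch of that arithmetic follows, assuming the factors are derived from the displayed `m:ss` or `h:mm:ss` values (the page does not say whether finer-grained timings were used, so a recomputed factor may differ in the last digit). The helper names `to_seconds` and `slowdown` are hypothetical.

```python
def to_seconds(stamp):
    """Convert a time written as 'm:ss' or 'h:mm:ss' into whole seconds."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def slowdown(native, instrumented):
    """Ratio of instrumented (DBI'd) run time to native run time."""
    return to_seconds(instrumented) / to_seconds(native)


# swim under Pin, SPEC CPU 2000: 3:18 native, 10:16 instrumented.
print(f"{slowdown('3:18', '10:16'):.2f}")  # 3.11
```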
Pin SPEC CPU 2006
benchmark | native time | DBI'd time | slowdown factor |
lbm | 20:48 | 34:46 | 1.67 |
cactusADM | 35:53 | 1:14:57 | 2.09 |
gromacs | 37:45 | 1:47:33 | 2.85 |
GemsFDTD | 25:00 | 1:15:16 | 3.01 |
zeusmp | 31:47 | 2:06:16 | 3.97 |
bwaves | 21:38 | 1:28:39 | 4.10 |
milc | 21:40 | 1:30:55 | 4.20 |
leslie3D | 25:11 | 1:56:14 | 4.62 |
mcf | 16:36 | 1:23:13 | 5.01 |
astar.BigLakes | 9:44 | 1:00:04 | 6.17 |
astar.rivers | 17:38 | 1:52:23 | 6.37 |
soplex.pds-50 | 9:26 | 1:00:25 | 6.40 |
gcc.expr2 | 4:28 | 30:09 | 6.75 |
xalancbmk | 39:19 | 4:28:42 | 6.83 |
h264ref.foreman_baseline | 5:51 | 42:38 | 7.29 |
namd | 17:10 | 2:10:37 | 7.61 |
wrf | 32:49 | 4:09:43 | 7.61 |
sjeng | 55:08 | 7:03:15 | 7.68 |
soplex.ref | 6:12 | 48:58 | 7.90 |
gamess.triazolium | 28:00 | 3:53:39 | 8.34 |
omnetpp | 17:19 | 2:27:05 | 8.49 |
gamess.h2ocu2 | 6:19 | 59:09 | 9.36 |
hmmer.retro | 24:48 | 4:10:18 | 10.09 |
sphinx3 | 26:24 | 4:29:29 | 10.21 |
tonto | 24:58 | 4:15:19 | 10.23 |
bzip2.combined | 4:13 | 43:41 | 10.36 |
povray | 11:23 | 2:05:30 | 11.02 |
bzip2.liberty | 4:02 | 44:49 | 11.11 |
bzip2.chicken | 2:03 | 24:54 | 12.15 |
bzip2.source | 4:13 | 53:20 | 12.65 |
gcc.c-typeck | 1:47 | 23:15 | 13.04 |
gobmk.trevorc | 3:06 | 42:21 | 13.66 |
gobmk.nngs | 7:33 | 1:43:18 | 13.68 |
gcc.166 | 1:16 | 17:21 | 13.70 |
gobmk.13x13 | 3:00 | 42:18 | 14.10 |
perlbench.checkspam | 2:07 | 30:02 | 14.19 |
gobmk.trevord | 4:04 | 57:56 | 14.25 |
gcc.g23 | 2:26 | 35:34 | 14.62 |
calculix | 1:04:09 | 15:56:31 | 14.91 |
hmmer.nph3 | 7:48 | 1:58:57 | 15.25 |
gcc.200 | 2:00 | 30:34 | 15.28 |
gobmk.score2 | 3:44 | 57:09 | 15.31 |
gcc.s04 | 2:00 | 30:43 | 15.36 |
libquantum | 28:19 | 7:20:24 | 15.55 |
gamess.cytosine | 9:02 | 2:23:51 | 15.92 |
bzip2.program | 4:08 | 1:07:25 | 16.31 |
gcc.scilab | 0:52 | 14:10 | 16.35 |
bzip2.html | 5:15 | 1:26:45 | 16.52 |
gcc.expr | 1:22 | 22:48 | 16.68 |
gcc.cp-decl | 1:12 | 21:15 | 17.71 |
dealII | 13:47 | 4:05:31 | 17.81 |
perlbench.diffmail | 3:33 | 1:06:48 | 18.82 |
h264ref.foreman_main | 2:19 | 44:07 | 19.04 |
h264ref.sss_main | 19:41 | 7:39:57 | 23.37 |
perlbench.splitmail | 4:14 | 1:47:54 | 25.49 |
Average total slowdown: 11.33 (55 total)
Average int slowdown: 13.41 (35 total)
Average fp slowdown: 7.69 (20 total)
Qemu
Qemu version 0.9.3 with my qemusim-0.4 patchset
Qemu SPEC CPU 2000
benchmark | native time | DBI'd time | slowdown factor |
mcf | 2:45 | 14:32 | 5.28 |
perlbmk.makerand | 0:04 | 0:31 | 7.75 |
vpr.route | 1:30 | 20:06 | 13.40 |
art.470 | 1:52 | 27:27 | 14.71 |
gzip.log | 0:17 | 4:17 | 15.12 |
perlbmk.diffmail | 0:32 | 8:23 | 15.72 |
bzip2.source | 0:47 | 13:35 | 17.34 |
gzip.program | 0:53 | 15:31 | 17.57 |
bzip2.graphic | 0:59 | 17:47 | 18.08 |
gzip.source | 0:28 | 8:27 | 18.11 |
facerec | 3:53 | 1:11:21 | 18.37 |
art.110 | 1:27 | 27:02 | 18.64 |
gzip.random | 0:28 | 8:57 | 19.18 |
bzip2.program | 0:48 | 15:38 | 19.54 |
gzip.graphic | 0:35 | 11:24 | 19.54 |
lucas | 4:50 | 1:34:40 | 19.59 |
applu | 8:18 | 2:53:13 | 20.87 |
gcc.integrate | 0:05 | 1:49 | 21.80 |
fma3d | 5:21 | 2:04:33 | 23.28 |
swim | 3:18 | 1:16:52 | 23.29 |
ammp | 4:35 | 1:50:27 | 24.10 |
equake | 1:27 | 35:55 | 24.77 |
gcc.expr | 0:05 | 2:06 | 25.20 |
gcc.166 | 0:12 | 5:09 | 25.75 |
galgel | 2:23 | 1:01:42 | 25.89 |
vpr.place | 1:07 | 30:27 | 27.27 |
parser | 3:21 | 1:33:21 | 27.87 |
gcc.200 | 0:34 | 16:55 | 29.85 |
mgrid | 3:46 | 1:53:41 | 30.18 |
perlbmk.535 | 0:21 | 10:42 | 30.57 |
gcc.scilab | 0:20 | 10:20 | 31.00 |
perlbmk.704 | 0:23 | 12:01 | 31.35 |
mesa | 3:13 | 1:45:27 | 32.78 |
perlbmk.perfect | 0:14 | 7:49 | 33.50 |
perlbmk.957 | 0:35 | 19:44 | 33.83 |
perlbmk.850 | 0:37 | 20:57 | 33.97 |
apsi | 4:28 | 2:38:45 | 35.54 |
twolf | 3:15 | 1:57:09 | 36.05 |
vortex.1 | 0:57 | 38:08 | 40.14 |
gap | 1:33 | 1:02:41 | 40.44 |
vortex.3 | 1:03 | 42:46 | 40.73 |
eon.rushmeier | 0:31 | 21:43 | 42.03 |
eon.kajiya | 0:54 | 39:22 | 43.74 |
wupwise | 3:20 | 2:29:04 | 44.72 |
crafty | 1:42 | 1:16:48 | 45.18 |
vortex.2 | 1:01 | 45:56 | 45.18 |
eon.cook | 0:37 | 30:24 | 49.30 |
sixtrack | 4:40 | 4:02:57 | 52.06 |
Average total slowdown: 27.71 (48 total)
Average int slowdown: 27.92 (33 total)
Average fp slowdown: 27.25 (15 total)
Qemu SPEC CPU 2006
benchmark | native time | DBI'd time | slowdown factor |
mcf | 16:36 | 1:30:06 | 5.43 |
gcc.expr2 | 4:28 | 33:11 | 7.43 |
astar.BigLakes | 9:44 | 1:14:21 | 7.64 |
astar.rivers | 17:38 | 2:15:55 | 7.71 |
xalancbmk | 39:19 | 5:22:06 | 8.19 |
soplex.pds-50 | 9:26 | 1:52:01 | 11.87 |
sjeng | 55:08 | 11:02:04 | 12.01 |
hmmer.retro | 24:48 | 5:11:17 | 12.55 |
bzip2.liberty | 4:02 | 50:58 | 12.64 |
bzip2.combined | 4:13 | 59:58 | 14.22 |
bzip2.chicken | 2:03 | 29:21 | 14.32 |
gcc.g23 | 2:26 | 35:26 | 14.56 |
gcc.166 | 1:16 | 19:09 | 15.12 |
soplex.ref | 6:12 | 1:34:30 | 15.24 |
omnetpp | 17:19 | 4:43:42 | 16.38 |
gcc.c-typeck | 1:47 | 31:09 | 17.47 |
bzip2.source | 4:13 | 1:16:13 | 18.08 |
milc | 21:40 | 6:31:54 | 18.09 |
h264ref.foreman_baseline | 5:51 | 1:47:29 | 18.37 |
hmmer.nph3 | 7:48 | 2:24:56 | 18.58 |
libquantum | 28:19 | 8:50:15 | 18.73 |
gcc.200 | 2:00 | 40:31 | 20.26 |
gcc.scilab | 0:52 | 17:41 | 20.40 |
gcc.cp-decl | 1:12 | 24:36 | 20.50 |
zeusmp | 31:47 | 11:03:03 | 20.86 |
gcc.s04 | 2:00 | 42:25 | 21.21 |
leslie3D | 25:11 | 9:02:33 | 21.54 |
perlbench.checkspam | 2:07 | 46:46 | 22.09 |
bzip2.program | 4:08 | 1:33:33 | 22.63 |
gobmk.score2 | 3:44 | 1:25:10 | 22.81 |
GemsFDTD | 25:00 | 9:37:38 | 23.11 |
bwaves | 21:38 | 8:37:15 | 23.91 |
gobmk.trevorc | 3:06 | 1:15:30 | 24.35 |
gobmk.13x13 | 3:00 | 1:13:07 | 24.37 |
gobmk.nngs | 7:33 | 3:15:30 | 25.89 |
gobmk.trevord | 4:04 | 1:47:33 | 26.45 |
wrf | 32:49 | 15:16:59 | 27.94 |
tonto | 24:58 | 12:24:16 | 29.81 |
gcc.expr | 1:22 | 40:46 | 29.83 |
h264ref.foreman_main | 2:19 | 1:10:05 | 30.25 |
h264ref.sss_main | 19:41 | 10:04:59 | 30.74 |
gromacs | 37:45 | 20:02:24 | 31.85 |
calculix | 1:04:09 | 34:16:43 | 32.06 |
gamess.h2ocu2 | 6:19 | 3:27:16 | 32.81 |
perlbench.splitmail | 4:14 | 2:21:53 | 33.52 |
perlbench.diffmail | 3:33 | 1:59:18 | 33.61 |
gamess.triazolium | 28:00 | 15:49:24 | 33.91 |
lbm | 20:48 | 12:17:54 | 35.48 |
povray | 11:23 | 6:49:55 | 36.01 |
bzip2.html | 5:15 | 3:19:04 | 37.92 |
gamess.cytosine | 9:02 | 5:49:01 | 38.64 |
cactusADM | 35:53 | 24:55:28 | 41.68 |
namd | 17:10 | 13:06:47 | 45.83 |
sphinx3 | 26:24 | 27:14:45 | 61.92 |
dealII | 13:47 | 15:13:01 | 66.24 |
Average total slowdown: 24.27 (55 total)
Average int slowdown: 19.61 (35 total)
Average fp slowdown: 32.44 (20 total)
Valgrind
Valgrind version 3.3.0 with my exp-bbv-0.5 tool
Valgrind SPEC CPU 2000
benchmark | native time | DBI'd time | slowdown factor |
mcf | 2:45 | 22:01 | 8.01 |
perlbmk.makerand | 0:04 | 0:43 | 10.75 |
art.470 | 1:52 | 20:08 | 10.79 |
art.110 | 1:27 | 18:15 | 12.59 |
lucas | 4:50 | 1:23:06 | 17.19 |
vpr.route | 1:30 | 29:12 | 19.47 |
applu | 8:18 | 3:12:13 | 23.16 |
facerec | 3:53 | 1:30:02 | 23.18 |
swim | 3:18 | 1:19:14 | 24.01 |
ammp | 4:35 | 1:52:03 | 24.45 |
perlbmk.diffmail | 0:32 | 13:43 | 25.72 |
fma3d | 5:21 | 2:25:38 | 27.22 |
gzip.log | 0:17 | 7:43 | 27.24 |
equake | 1:27 | 41:24 | 28.55 |
bzip2.source | 0:47 | 23:03 | 29.43 |
bzip2.graphic | 0:59 | 30:04 | 30.58 |
gzip.source | 0:28 | 15:20 | 32.86 |
bzip2.program | 0:48 | 26:50 | 33.54 |
mgrid | 3:46 | 2:06:23 | 33.55 |
gzip.program | 0:53 | 29:57 | 33.91 |
gzip.graphic | 0:35 | 19:54 | 34.11 |
parser | 3:21 | 1:55:06 | 34.36 |
gzip.random | 0:28 | 16:22 | 35.07 |
gcc.integrate | 0:05 | 3:05 | 37.00 |
gcc.expr | 0:05 | 3:15 | 39.00 |
twolf | 3:15 | 2:09:56 | 39.98 |
apsi | 4:28 | 3:06:16 | 41.70 |
vpr.place | 1:07 | 48:03 | 43.03 |
galgel | 2:23 | 1:43:08 | 43.27 |
gcc.166 | 0:12 | 9:08 | 45.67 |
perlbmk.perfect | 0:14 | 10:51 | 46.50 |
gcc.200 | 0:34 | 28:09 | 49.68 |
gcc.scilab | 0:20 | 16:48 | 50.40 |
perlbmk.704 | 0:23 | 19:30 | 50.87 |
perlbmk.535 | 0:21 | 17:53 | 51.10 |
wupwise | 3:20 | 2:51:27 | 51.44 |
gap | 1:33 | 1:20:34 | 51.98 |
mesa | 3:13 | 2:51:15 | 53.24 |
eon.rushmeier | 0:31 | 27:54 | 54.00 |
perlbmk.957 | 0:35 | 31:58 | 54.80 |
sixtrack | 4:40 | 4:21:14 | 55.98 |
eon.kajiya | 0:54 | 50:49 | 56.46 |
vortex.1 | 0:57 | 53:49 | 56.65 |
vortex.3 | 1:03 | 59:37 | 56.78 |
perlbmk.850 | 0:37 | 35:38 | 57.78 |
crafty | 1:42 | 1:44:35 | 61.52 |
eon.cook | 0:37 | 38:09 | 61.86 |
vortex.2 | 1:01 | 1:03:56 | 62.89 |
Average total slowdown: 38.61 (48 total)
Average int slowdown: 41.91 (33 total)
Average fp slowdown: 31.35 (15 total)
Valgrind SPEC CPU 2006
benchmark | native time | DBI'd time | slowdown factor |
zeusmp | 31:47 | n/a | n/a |
mcf | 16:36 | 2:29:00 | 8.98 |
astar.BigLakes | 9:44 | 2:04:52 | 12.83 |
gcc.expr2 | 4:28 | 57:27 | 12.86 |
xalancbmk | 39:19 | 8:38:28 | 13.19 |
astar.rivers | 17:38 | 3:54:15 | 13.28 |
gamess.triazolium | 28:00 | 7:44:00 | 16.57 |
soplex.pds-50 | 9:26 | 2:48:04 | 17.82 |
sjeng | 55:08 | 17:15:40 | 18.78 |
milc | 21:40 | 7:02:59 | 19.52 |
tonto | 24:58 | 8:32:11 | 20.51 |
calculix | 1:04:09 | 24:05:09 | 22.53 |
bzip2.liberty | 4:02 | 1:33:03 | 23.07 |
bzip2.combined | 4:13 | 1:37:31 | 23.13 |
omnetpp | 17:19 | 6:51:32 | 23.77 |
soplex.ref | 6:12 | 2:31:18 | 24.40 |
hmmer.retro | 24:48 | 10:07:43 | 24.50 |
gcc.166 | 1:16 | 31:36 | 24.95 |
bzip2.chicken | 2:03 | 51:45 | 25.24 |
lbm | 20:48 | 8:45:20 | 25.26 |
gcc.g23 | 2:26 | 1:04:03 | 26.32 |
leslie3D | 25:11 | 11:19:50 | 27.00 |
GemsFDTD | 25:00 | 11:21:50 | 27.27 |
h264ref.foreman_baseline | 5:51 | 2:39:46 | 27.31 |
gcc.c-typeck | 1:47 | 52:45 | 29.58 |
bzip2.source | 4:13 | 2:06:26 | 29.98 |
bwaves | 21:38 | 10:53:03 | 30.19 |
gcc.expr | 1:22 | 41:55 | 30.67 |
gcc.scilab | 0:52 | 27:13 | 31.40 |
gcc.s04 | 2:00 | 1:03:50 | 31.92 |
perlbench.checkspam | 2:07 | 1:08:54 | 32.55 |
gcc.200 | 2:00 | 1:05:35 | 32.79 |
gcc.cp-decl | 1:12 | 39:45 | 33.12 |
libquantum | 28:19 | 16:41:12 | 35.36 |
hmmer.nph3 | 7:48 | 4:38:09 | 35.66 |
gobmk.trevorc | 3:06 | 1:52:18 | 36.23 |
gromacs | 37:45 | 22:53:46 | 36.39 |
gobmk.score2 | 3:44 | 2:16:06 | 36.46 |
bzip2.html | 5:15 | 3:11:35 | 36.49 |
gobmk.13x13 | 3:00 | 1:49:39 | 36.55 |
wrf | 32:49 | 20:40:43 | 37.81 |
bzip2.program | 4:08 | 2:36:33 | 37.88 |
sphinx3 | 26:24 | 17:03:26 | 38.77 |
gobmk.nngs | 7:33 | 4:54:05 | 38.95 |
gobmk.trevord | 4:04 | 2:38:25 | 38.95 |
cactusADM | 35:53 | 26:25:07 | 44.17 |
gamess.h2ocu2 | 6:19 | 4:40:35 | 44.42 |
povray | 11:23 | 8:26:23 | 44.48 |
h264ref.foreman_main | 2:19 | 1:43:37 | 44.73 |
h264ref.sss_main | 19:41 | 15:15:25 | 46.51 |
dealII | 13:47 | 10:46:29 | 46.90 |
namd | 17:10 | 13:31:57 | 47.30 |
perlbench.diffmail | 3:33 | 2:54:15 | 49.08 |
gamess.cytosine | 9:02 | 7:29:23 | 49.75 |
perlbench.splitmail | 4:14 | 3:54:15 | 55.33 |
Average total slowdown: 31.10 (54 total)
Average int slowdown: 30.24 (35 total)
Average fp slowdown: 32.69 (19 total)
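The Valgrind CPU 2006 averages cover 54 benchmarks rather than 55 because zeusmp has no result. One way to compute an arithmetic mean that skips such missing entries is sketched below. The function name `average_slowdown` is hypothetical, and the sample is only the first three rows of the table above, not the full data set.

```python
def average_slowdown(factors):
    """Arithmetic mean of the measured factors, ignoring missing (None) entries.

    Returns a (mean, count) pair so the number of benchmarks that
    contributed to the average can be reported alongside it.
    """
    measured = [f for f in factors if f is not None]
    if not measured:
        raise ValueError("no measured slowdown factors")
    return sum(measured) / len(measured), len(measured)


# First three rows of the Valgrind CPU 2006 table; None marks the n/a entry.
sample = {"zeusmp": None, "mcf": 8.98, "astar.BigLakes": 12.83}
avg, count = average_slowdown(sample.values())
print(f"{avg:.1f} ({count} total)")
```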