perf_event self-monitoring overhead
compared against perfctr and perfmon2
The new Linux perf_event performance counter interface has more
self-monitoring overhead than the earlier perfctr and perfmon2 interfaces.
On this page I compare the overhead of various
Linux hardware performance counter implementations.
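As a rough illustration of what "self-monitoring overhead" means here, the C sketch below times the three calls a program makes to measure itself with perf_event: start (enable), stop (disable) and read. It is not taken from the tools used for the results on this page. The names measure_overhead, struct overhead_ns and OVERHEAD_SKETCH_MAIN are hypothetical, and timing with clock_gettime() is an assumption made for portability. The event is opened with exclude_kernel set so that it is more likely to work for an unprivileged user; on systems without accessible hardware counters the function simply reports failure.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for syscall() */
#endif
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical result type: wall-clock cost of each call, in nanoseconds. */
struct overhead_ns {
    int64_t start_ns;   /* PERF_EVENT_IOC_ENABLE ioctl */
    int64_t stop_ns;    /* PERF_EVENT_IOC_DISABLE ioctl */
    int64_t read_ns;    /* read() of the counter value */
    uint64_t count;     /* user-space instructions counted while enabled */
};

static int64_t elapsed_ns(const struct timespec *a, const struct timespec *b)
{
    return (int64_t)(b->tv_sec - a->tv_sec) * 1000000000LL
         + (int64_t)(b->tv_nsec - a->tv_nsec);
}

/*
 * Time start/stop/read on one hardware event with no work in between.
 * Returns 0 on success, -1 if the event cannot be opened or read
 * (for example in a VM or with a restrictive perf_event_paranoid setting).
 */
static int measure_overhead(struct overhead_ns *out)
{
    struct perf_event_attr attr;
    struct timespec t0, t1, t2, t3;
    ssize_t got;
    int fd;

    assert(out != NULL);
    memset(out, 0, sizeof(*out));
    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_INSTRUCTIONS;
    attr.disabled = 1;          /* start stopped; enabled explicitly below */
    attr.exclude_kernel = 1;    /* assumption: count user space only */
    attr.exclude_hv = 1;

    /* glibc has no wrapper for perf_event_open, so use syscall(2).
     * Arguments: this thread (pid 0), any CPU (-1), no group (-1), no flags. */
    fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
    if (fd < 0)
        return -1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
    clock_gettime(CLOCK_MONOTONIC, &t2);
    got = read(fd, &out->count, sizeof(out->count));
    clock_gettime(CLOCK_MONOTONIC, &t3);
    close(fd);

    if (got != (ssize_t)sizeof(out->count))
        return -1;

    /* Each interval also includes the cost of one clock_gettime() call. */
    out->start_ns = elapsed_ns(&t0, &t1);
    out->stop_ns = elapsed_ns(&t1, &t2);
    out->read_ns = elapsed_ns(&t2, &t3);
    return 0;
}

#ifdef OVERHEAD_SKETCH_MAIN
int main(void)
{
    struct overhead_ns o;

    if (measure_overhead(&o) != 0) {
        fprintf(stderr, "perf_event counter not available\n");
        return 1;
    }
    printf("start %lld ns, stop %lld ns, read %lld ns\n",
           (long long)o.start_ns, (long long)o.stop_ns, (long long)o.read_ns);
    return 0;
}
#endif
```

To try it standalone, compile with -DOVERHEAD_SKETCH_MAIN on a Linux machine. A single run is noisy, so a real comparison would repeat the measurement many times and look at the whole distribution rather than one value.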
A summary of my findings as of January 2015 can be found in
my ISPASS 2015 paper:
- V.M. Weaver.
"Self-monitoring Overhead of the Linux perf_event
Performance Counter Interface",
IEEE International Symposium on Performance Analysis of
Systems and Software (ISPASS 2015), Philadelphia, Pennsylvania.
- The IEEE would prefer you obtain this paper through their
IEEE Xplore interface (link to this when available)
- You can also view my personal copy of the paper. Warning!
IEEE Copyright rules apply!
- Here are the slides from the talk I gave at ISPASS.
A preliminary version of this work was presented in April 2013 at the
2013 FastPath workshop.
The tools used to generate the data and plots can be obtained with:
git clone git://github.com/deater/perfevent_overhead.git
The outliers are caused by L2 cache misses as well as DTLB misses.
Older results comparing amd0f/atom/core2/nehalem
Some even older results