perf_event ABI breakage
perf_event has a unique design philosophy.
Part of that involves having the source code for the flagship
perf tool included in the Linux kernel source tree.
The claim is that developing the tool and the kernel together
will enhance the ABI.
In practice the ABI is frequently broken, and as long as the
userspace perf tool still works none of the kernel developers
seem to notice or care.
The sysfs layout is particularly troublesome. Some of the
files are documented in Documentation/ABI/testing/,
but since that directory is labeled "testing" the developers don't
feel the need to keep things stable.
What follows is a list of recent ABI breakage.
I haven't included things from the early pre-2.6.34 days (perf support
started with 2.6.31) as any new interface takes a while to shake out.
Those early kernels did have problems though, and PAPI still carries
the workaround for them.
- Before, a value of "0" in /sys/devices/cpu/rdpmc meant rdpmc support
was disabled, and "1" meant enabled for everyone.
Starting with 4.0 "1" means only processes with active events
can read it, and "2" means enabled for everyone.
16 April 2015:
[patch 10/10] perf_event_open.2: 4.0 update rdpmc documentation
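
A tool that reads /sys/devices/cpu/rdpmc has to interpret the value differently depending on the kernel it is running on. The following is a minimal sketch of that logic, not code from any existing tool. The function name, the policy strings, and the (major, minor) version tuple are hypothetical, and it assumes "0" still means disabled on 4.0 and later.

```python
def rdpmc_policy(value, kernel_version):
    """Interpret the contents of /sys/devices/cpu/rdpmc.

    value          -- text (or int) read from the sysfs file
    kernel_version -- (major, minor) tuple, e.g. (3, 19) or (4, 0)

    Returns one of the hypothetical labels "disabled",
    "active-events-only" or "everyone".
    """
    value = int(str(value).strip())
    new_semantics = tuple(kernel_version) >= (4, 0)

    if value == 0:
        # Assumed unchanged across the 4.0 transition.
        return "disabled"
    if value == 1:
        # The meaning of "1" is what changed in 4.0.
        return "active-events-only" if new_semantics else "everyone"
    if value == 2 and new_semantics:
        return "everyone"
    raise ValueError("unexpected rdpmc value: %r" % value)
```

For example, rdpmc_policy("1", (3, 19)) and rdpmc_policy("2", (4, 0)) both describe the same "enabled for everyone" setting.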
- Before, when polling, code would get a result=0/WIFEXITED()
return when a child with events being polled exited.
Now instead I get a result=1/POLLHUP return.
Programs broken: Some perf_event_test testcases
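
Code that has to run on kernels with either behavior can treat an empty poll result and a POLLHUP result as two separate cases. The sketch below only classifies the list of (fd, revents) pairs that Python's select.poll().poll() returns. The function name and the return labels are hypothetical, and opening the perf_event file descriptors is left out.

```python
import select


def classify_poll(events):
    """Classify the (fd, revents) pairs returned by poll.poll().

    Returns "timeout" when nothing was reported (the old behavior
    when the child exited; the caller would then check the child
    with waitpid), "hangup" when any descriptor reports POLLHUP
    (the new behavior), and "ready" otherwise.
    """
    if not events:
        return "timeout"
    for _fd, revents in events:
        if revents & select.POLLHUP:
            return "hangup"
    return "ready"
```

A caller would treat both "timeout" followed by a successful waitpid() and "hangup" as a sign that the monitored child has exited.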
- PERF_EVENT_IOC_PERIOD changes take effect immediately rather than
after the next stop/start.
ARM had different behavior than the rest of the architectures
from 3.7 until 3.14 (see commit 3581fe0ef37ce12ac7a4f74831168352ae848edc).
Introduced by: bad7192b842c83e580747ca57104dd51fe08c223
Programs broken: ??
- In /sys/bus/event_source/devices/*/events/ files started
appearing with decimal (not hex) values, contrary
to the Documentation/ABI/testing/ documentation.
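
A parser for the sysfs event files therefore cannot assume hex. The sketch below accepts both forms. It assumes the file contents are a comma-separated list of name=value terms such as "event=0x3c,umask=0x01", and that a bare term with no value means 1. The function name is hypothetical.

```python
def parse_event_terms(text):
    """Parse a sysfs event description into a {name: int} dict.

    Values with a 0x prefix are read as hex, everything else as
    decimal, so both "event=0x3c" and "event=60" are accepted.
    """
    terms = {}
    for term in text.strip().split(","):
        term = term.strip()
        if not term:
            continue
        name, sep, raw = term.partition("=")
        if not sep:
            # Assumption: a bare term (no "=value") stands for 1.
            terms[name] = 1
            continue
        raw = raw.strip()
        base = 16 if raw.lower().startswith("0x") else 10
        terms[name.strip()] = int(raw, base)
    return terms
```

With this, "event=0x3c" and "event=60" parse to the same value.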
- Move of AMD Fam15h Northbridge events from being part of the
generic CPU events to being a separate PMU.
feature originally added in
Programs-broken: PAPI, libpfm4
Notes: Support for NB was only added in 3.9 so the hope was no one
would notice. But now PAPI has to carry support for both.
Peter Zijlstra apologizes, blames poor judgement
due to illness.
- The PERF_EVENT_IOC_PERIOD ioctl as originally designed updates
the period on the *next* refresh.
In 3.7 in commit 3581fe0ef37ce12ac7a4f74831168352ae848edc
this was changed on ARM to update the *current* period.
See the 3.14 changes for when this was "fixed" on the rest
of the architectures.
- Raw access to Offcore Response events added.
This support was previously disabled (offcore response
was merged in 2.6.39 but raw access was disabled) but
the interface indicated success even if support was not there.
It would even return plausible garbage values as results.
Until support was finally added in 3.3 it was impossible to
tell if support was working or not.
- If there was not enough room for the event,
ENOSPC was returned. Linus did not like this, and this was
changed to EINVAL. ENOSPC is still returned if you try to read
results into too small a buffer.
Discussion-threads: To find? The link in the commit
- Prior to this, an exec'd process with active perf_event overflows
did not get signals. After, it did.
Mailing list discussions:
- PERF_EVENT_IOC_REFRESH behavior
Which events count as POLL_IN vs POLL_HUP changed.
What happens when you pass PERF_EVENT_IOC_REFRESH a value
other than 1 changed.
Programs broken: PAPI
- The file /sys/devices/system/cpu/perf_events was used by some
to detect the existence of perf_event support. It was removed.
Programs-broken: Various internal tools
Notes: This is when /proc/sys/kernel/perf_event_paranoid
was made the official ABI-sanctioned way of detecting perf_event.
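
Detection along these lines can be sketched as below. The function names are hypothetical, and the path is a parameter only so the logic can be exercised without a perf_event-enabled kernel.

```python
import os

PARANOID_PATH = "/proc/sys/kernel/perf_event_paranoid"


def perf_event_available(path=PARANOID_PATH):
    """Return True if the perf_event_paranoid file exists."""
    return os.path.exists(path)


def read_paranoid_level(path=PARANOID_PATH):
    """Return the paranoid setting as an int, or None if the file
    is missing, unreadable, or does not contain an integer."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None
```

A tool would call perf_event_available() once at startup instead of looking for /sys/devices/system/cpu/perf_events.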
- Starting in 2.6.37 the NMI watchdog can steal an event counter.
If you try to use the full number of counters, perf_event won't
notice that the NMI watchdog is stealing one until start time,
rather than at perf_event_open() time. EventSets that used to work
now do not, and an extra test has to be done before telling the
user an EventSet will work.
Notes: Unlikely to ever be fixed.
- Context switch events were reported as happening in userspace before,
now reported as kernelspace.
Built-in Kernel Event Changes