Simple PAPI Instrumentation
Creating a Small Sample Program in C
For this example we will be creating a small sample program in C to
instrument. The instrumentation we do can be applied to larger programs.
Let's use Hello World:
#include <stdio.h>
int main(int argc, char **argv) {
    printf("Hello World\n");
    return 0;
}
If you save this as "hello_world.c" then
on Linux you can compile with a command something like this:
gcc -O2 -Wall -o hello_world hello_world.c
To run the program type ./hello_world and it should
run and print the message.
Setting up PAPI
Setting up PAPI is beyond the scope of this document. It might already be
installed on your machine, either as a package from your operating system
or by your system administrator.
You can install it yourself from the
PAPI website
but if you are using a locally compiled version you will have to do a few
extra steps to link against the library.
Instrumenting the Code
Initializing PAPI
First add the line
#include <papi.h>
to the top of your file with the other includes. This will let your
program know about the PAPI interface. If you are linking against
your own locally compiled version of PAPI you'll want to put papi.h in double
quotes rather than angle brackets.
Next add the PAPI library initialization code:
int retval;
retval=PAPI_library_init(PAPI_VER_CURRENT);
if (retval!=PAPI_VER_CURRENT) {
    fprintf(stderr,"Error initializing PAPI! %s\n",
        PAPI_strerror(retval));
    return 1;
}
This initializes the PAPI library. The PAPI_VER_CURRENT argument lets PAPI
check that the library you are running against matches the papi.h header
you compiled with.
Now recompile your program. You'll have to add -lpapi (that's
a lower-case L) to the command line to tell the compiler to link against
the PAPI library:
gcc -O2 -Wall -o hello_world hello_world.c -lpapi
If you're using a self-compiled version of PAPI
you might need to use the -I option to tell the compiler
where the PAPI include file is, and you will probably want to link against
the libpapi.a static version of PAPI. Something like:
gcc -O2 -Wall -Ipath_to_papi/src -o hello_world hello_world.c path_to_papi/src/libpapi.a
where path_to_papi is the path to where you compiled PAPI.
Creating an Eventset
Before measuring things with PAPI you need to create an Eventset.
You do that with the following code:
int eventset=PAPI_NULL;
retval=PAPI_create_eventset(&eventset);
if (retval!=PAPI_OK) {
    fprintf(stderr,"Error creating eventset! %s\n",
        PAPI_strerror(retval));
}
Adding Events to an Eventset
Next, add the event you want to measure. In this example
we will use PAPI_TOT_CYC (total cycles) but you can add any of the
events that the utilities papi_avail and
papi_native_avail show are available on your system.
You can add multiple events to an eventset and if the underlying system
supports that many they can be measured at the same time. If they can't
(for example, hardware often has a limit for how many events can be
measured at once)
you can enable multiplexing which will switch events in and out and let
you estimate the event totals.
Here is code to add PAPI_TOT_CYC:
retval=PAPI_add_named_event(eventset,"PAPI_TOT_CYC");
if (retval!=PAPI_OK) {
    fprintf(stderr,"Error adding PAPI_TOT_CYC: %s\n",
        PAPI_strerror(retval));
}
Actually Instrumenting the Code
Now we need to start and stop the event around the code we care about.
In our example, let's put it around the printf call.
Before the printf call put
long long count;
PAPI_reset(eventset);
retval=PAPI_start(eventset);
if (retval!=PAPI_OK) {
    fprintf(stderr,"Error starting: %s\n",
        PAPI_strerror(retval));
}
and after put
retval=PAPI_stop(eventset,&count);
if (retval!=PAPI_OK) {
    fprintf(stderr,"Error stopping: %s\n",
        PAPI_strerror(retval));
}
else {
    printf("Measured %lld cycles\n",count);
}
PAPI_reset() resets all counters in the eventset to 0, and
PAPI_start() starts counting with the given eventset.
When your code is done, call PAPI_stop(), which stops
the measurements and also reads out the values. It writes the values
into an array of 64-bit (long long) values, one per event in the
eventset. In our example we are only
measuring one value so we just pass a pointer to the count variable.
Assuming everything went well, the number of cycles we measured
should be in the count variable and we can print it.
Cleanup
That is all it takes to measure.
However, to be fully correct you might want to clean up the eventset
now that we are done with it.
PAPI_cleanup_eventset(eventset);
PAPI_destroy_eventset(&eventset);
While this is good practice, if you are exiting the program anyway it's
not strictly necessary to do this.
Using other components
The directions given assume the default CPU component for events.
Depending how PAPI is configured on your machine there are many
other sources of performance events.
The papi_component_avail utility can show which event sources
are available on your system.
In general, using these other components as event sources works exactly the
same way as using CPU events. One thing to know though is it is not possible
to create eventsets that include events from different components.
You will need to have a separate eventset for each component.
You can force an eventset to belong to a component with a call to
PAPI_assign_eventset_component() but in theory that should not be
necessary with modern PAPI.
What if things go Wrong
Performance counters can be tricky and things can go wrong.
Hopefully PAPI will print an error message, but the messages can sometimes
be confusing.
One of the first things to do is check to make sure PAPI is working OK
on your system. Use the papi_component_avail tool to see
what PAPI components are available and their status.
One common thing that can go wrong on Linux is that for security reasons
performance counters are disabled. In that case papi_component_avail
may tell you to check the value of /proc/sys/kernel/perf_event_paranoid.
If it is too strict you will have to have your sysadmin adjust the value
for you.
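The paranoid setting can also be checked from a program before any measurement is attempted. The sketch below does not use PAPI at all. Both helper names are hypothetical, and the threshold of 2 used for the hint is an assumption based on common kernel defaults, so check what your distribution documents.

```c
#include <stdio.h>

/* Hypothetical helper: read the integer stored in a
 * perf_event_paranoid style file.
 * Returns 0 and stores the value on success, -1 otherwise. */
int read_paranoid_level(const char *path, int *level) {
    FILE *f = fopen(path, "r");
    int ok;

    if (f == NULL) {
        return -1;              /* not Linux, or file not readable */
    }
    ok = (fscanf(f, "%d", level) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Hypothetical helper: print the current setting with a rough hint.
 * Treating values above 2 as "too strict" is an assumption. */
void report_paranoid_level(void) {
    int level;

    if (read_paranoid_level("/proc/sys/kernel/perf_event_paranoid",
                            &level) != 0) {
        printf("Could not read perf_event_paranoid\n");
        return;
    }
    printf("perf_event_paranoid = %d\n", level);
    if (level > 2) {
        printf("This may block counter access; ask your sysadmin\n");
    }
}

/* Example use from main():
 *     report_paranoid_level();
 */
```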