Simple PAPI Instrumentation

Creating a Small Sample Program in C

For this example we will instrument a small sample program written in C. The same instrumentation can be applied to larger programs. Let's use Hello World:
#include <stdio.h>

int main(int argc, char **argv) {

	printf("Hello World\n");

	return 0;
}

If you save this as "hello_world.c" then on Linux you can compile with a command something like this:
gcc -O2 -Wall -o hello_world hello_world.c

To run the program type ./hello_world and it should run and print the message.

Setting up PAPI

Setting up PAPI is beyond the scope of this document. It might already be installed on your machine, either as a package from your operating system or by your system administrator.

You can install it yourself from the PAPI website but if you are using a locally compiled version you will have to do a few extra steps to link against the library.

Instrumenting the Code

Initializing PAPI

First add the line
#include <papi.h>
to the top of your file with the other includes. This will let your program know about the PAPI interface. If you are linking against a self-compiled version of PAPI you'll want to put papi.h in double quotes rather than angle brackets.

Next add the PAPI library initialization code:

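	int retval;

	/* Returns PAPI_VER_CURRENT on success */
	retval=PAPI_library_init(PAPI_VER_CURRENT);
	if (retval!=PAPI_VER_CURRENT) {
		fprintf(stderr,"Error initializing PAPI! %s\n",
			PAPI_strerror(retval));
		return 0;
	}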
This initializes the PAPI library. Passing PAPI_VER_CURRENT makes sure that the version of the PAPI library you are running with is the same one that you compiled against.

Now recompile your program. You'll have to add -lpapi (that's a lower-case L) to the command line to tell the compiler to link against the PAPI library:
gcc -O2 -Wall -o hello_world hello_world.c -lpapi

If you're using a self-compiled version of PAPI you might need to use the -I option to tell the compiler where the PAPI include file is, and you will probably want to link against libpapi.a, the static version of PAPI. Something like:
gcc -O2 -Wall -Ipath_to_papi/src -o hello_world hello_world.c path_to_papi/src/libpapi.a
where path_to_papi is the path to where you compiled PAPI.

Creating an Eventset

Before measuring things with PAPI you need to create an Eventset. You do that with the following code:
int eventset=PAPI_NULL;

retval=PAPI_create_eventset(&eventset);
if (retval!=PAPI_OK) {
   fprintf(stderr,"Error creating eventset! %s\n",
      PAPI_strerror(retval));
}

Adding Events to an Eventset

You will want to add the event you want to measure. In this example we will use PAPI_TOT_CYC (total cycles), but you can add any of the events that the utilities papi_avail and papi_native_avail show are available on your system.

You can add multiple events to an eventset and if the underlying system supports that many they can be measured at the same time. If they can't (for example, hardware often has a limit for how many events can be measured at once) you can enable multiplexing which will switch events in and out and let you estimate the event totals.

Here is code to add PAPI_TOT_CYC:
retval=PAPI_add_named_event(eventset,"PAPI_TOT_CYC");
if (retval!=PAPI_OK) {
	fprintf(stderr,"Error adding PAPI_TOT_CYC: %s\n",PAPI_strerror(retval));
}

Actually Instrumenting the Code

Now we need to start and stop the event around the code we care about. In our example, let's put it around the printf call.

Before the printf call put
long long count;

PAPI_reset(eventset);
retval=PAPI_start(eventset);
if (retval!=PAPI_OK) {
	fprintf(stderr,"Error starting: %s\n",
		PAPI_strerror(retval));
}
and after put
retval=PAPI_stop(eventset,&count);
if (retval!=PAPI_OK) {
   fprintf(stderr,"Error stopping:  %s\n",
      PAPI_strerror(retval));
}
else {
      printf("Measured %lld cycles\n",count);
}
PAPI_reset() resets all counters in the eventset to 0.
PAPI_start() starts counting with the given eventset.

Your code then runs. When it is done, call PAPI_stop(), which stops the measurements and reads out the values. It puts the values into an array of 64-bit (long long) values. In our example we are only measuring one value so we just pass a pointer to the count variable.

Assuming everything went well, the number of cycles we measured should be in the count variable and we can print it.


That is all it takes to measure. However, to be fully correct you might want to clean up the eventset now that we are done with it, using PAPI_cleanup_eventset() and PAPI_destroy_eventset().
While this is good practice, if you are exiting the program anyway it's not strictly necessary to do this.

Using other components

The directions given assume the default CPU component for events. Depending on how PAPI is configured on your machine there are many other sources of performance events. The papi_component_avail utility can show which event sources are available on your system.

In general, using these other components as event sources works exactly the same way as using CPU events. One thing to know, though, is that an eventset cannot include events from different components. You will need a separate eventset for each component.

You can force an eventset to belong to a component with a call to PAPI_assign_eventset_component() but in theory that should not be necessary with modern PAPI.

What if things go Wrong

Performance counters can be tricky and things can go wrong. Hopefully PAPI will print an error message, but sometimes the messages can be confusing.

One of the first things to do is check to make sure PAPI is working OK on your system. Use the papi_component_avail tool to see what PAPI components are available and their status.

One common thing that can go wrong on Linux is that for security reasons performance counters are disabled. In that case papi_component_avail may tell you to check the value of /proc/sys/kernel/perf_event_paranoid. If it is too strict you will have to have your sysadmin adjust the value for you.
