Add example that uses shared memory
Add example that uses texture memory
Add 2D/3D example

Make vector_add use threads rather than blocks?
Check whether to add -O2 / -Wall to CFLAGS

