1) The idea behind cache memory is that RAM is much slower than the CPU. If the CPU had to wait on RAM for every new byte of instructions or data, it would be painfully slow. Ideally, of course, your ENTIRE RAM would sit on the CPU chip itself, running at the same speed as the CPU's core, but that would be very expensive.
To work around this, manufacturers add a small amount of fast RAM as a "cache". The first time the CPU needs some data it still has to wait for main RAM, but when that data finally arrives, a copy is kept in the cache. The next time the CPU needs data from a nearby address, there is a good chance it is already in the cache, which means no delay reading it into the CPU.
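As a rough illustration of that "nearby address" effect (my own sketch, not from the original answer), here's a Python timing comparison of walking a 2D array in the order it sits in memory versus jumping between rows. The effect is far more dramatic in a language like C with contiguous arrays, since Python lists store pointers, but the row-order walk still tends to be faster:

```python
import time

# Hypothetical demo of spatial locality: elements of each row are
# stored near each other, so visiting them in order reuses cached
# data more often than hopping a whole row's worth of memory per step.
N = 1000
matrix = [[1] * N for _ in range(N)]

def sum_row_major(m):
    # Visit elements in storage order: cache-friendly.
    total = 0
    for row in m:
        for x in row:
            total += x
    return total

def sum_col_major(m):
    # Stride across rows on every step: cache-unfriendly.
    total = 0
    for j in range(len(m[0])):
        for i in range(len(m)):
            total += m[i][j]
    return total

t0 = time.perf_counter()
a = sum_row_major(matrix)
t1 = time.perf_counter()
b = sum_col_major(matrix)
t2 = time.perf_counter()

# Both orders compute the same sum; only the access pattern differs.
assert a == b == N * N
print(f"row-major: {t1 - t0:.4f}s, col-major: {t2 - t1:.4f}s")
```

Exact timings will vary by machine and interpreter; the point is only that access order, not the amount of work, changes how often the cache helps you.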
The bigger the cache, the more it can hold, but the more expensive it is. CPUs normally have a small "L1" cache that runs at full CPU core speed, a larger but slower "L2" cache, or both.
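If you're curious what your own machine's L1/L2 sizes are, on Linux glibc exposes them through sysconf. A hedged sketch (the `SC_LEVEL*` names are Linux-specific and may be absent elsewhere, so this looks them up defensively rather than assuming they exist):

```python
import os

# Query L1 data cache and L2 cache sizes, if the platform exposes them.
# On non-Linux systems these sysconf names typically don't exist, so a
# missing name is reported as None instead of raising.
sizes = {}
for name in ("SC_LEVEL1_DCACHE_SIZE", "SC_LEVEL2_CACHE_SIZE"):
    key = os.sysconf_names.get(name)
    sizes[name] = os.sysconf(key) if key is not None else None

print(sizes)
```

You can also just read the same numbers from `lscpu` or `/sys/devices/system/cpu/cpu0/cache/` on Linux.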
2) L1 and L2 are normally not used in the context of GPUs, though I believe newer GPUs like Fermi have added that feature. See: HPCwire: NVIDIA Takes GPU Computing to the Next Level