Multi-threading has been around for quite a while. If the application and its data are structured so that one thread doesn't have to wait on results from another thread, the OS scheduler will assign the threads to the available resources (the cores).
If you want to see another app that runs flat out at 100% core usage almost the whole time during video encoding, look at FAVC. On a quad core, its author splits the video into 4 pieces and runs one copy of the video encoder on each piece. Then the pieces are all reassembled into the video stream.
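Here is a rough Python sketch of that split / encode / reassemble idea. It is not FAVC's actual code: the names `split_into_slices`, `encode_slice` and `encode_parallel` are made up for illustration, the "frames" are just integers, and `encode_slice` is a stand-in for a real encoder. It uses separate processes rather than threads because that is how CPU-bound work usually gets spread across cores in Python.

```python
from concurrent.futures import ProcessPoolExecutor


def split_into_slices(frames, n):
    """Cut the frame list into n contiguous slices of near-equal length."""
    size, extra = divmod(len(frames), n)
    slices = []
    start = 0
    for i in range(n):
        # The first `extra` slices take one additional frame each.
        end = start + size + (1 if i < extra else 0)
        slices.append(frames[start:end])
        start = end
    return slices


def encode_slice(frames):
    """Stand-in for the encoder (hypothetical): each frame is handled
    independently of every other frame, so slices never wait on each other."""
    return [(f * f) % 251 for f in frames]


def encode_parallel(frames, workers=4):
    """Encode one slice per worker process, then stitch the results together."""
    slices = split_into_slices(frames, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input slices,
        # so the pieces can simply be concatenated afterwards.
        encoded = list(pool.map(encode_slice, slices))
    return [x for part in encoded for x in part]


if __name__ == "__main__":
    video = list(range(1000))
    result = encode_parallel(video, workers=4)
    print(result == encode_slice(video))
```

Because no slice needs anything from another slice, all four workers can stay busy until their own piece is done.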
If each step has to wait for the result of the previous operation on the data before it can proceed, the work can't be spread out, and the cores spend a lot of their time idle.
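For contrast, a minimal sketch of work that has this kind of dependency. The function name `chained_digest` is hypothetical; the point is only that every iteration needs the output of the one before it, so there is nothing to hand to a second core.

```python
import hashlib


def chained_digest(blocks):
    """Hash each block together with the previous result.

    Step k cannot start until step k-1 has finished, so this loop runs
    on one core no matter how many are available.
    """
    state = b""
    for block in blocks:
        state = hashlib.sha256(state + block).digest()
    return state
```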