I'll also speak to the 'elitist' point in the top comment. One of the first times I felt prejudice was when I wrote a very good piece of code for their video card that accelerated some video processing by 4X or more. Seriously, they (the Japanese) were surprised and started treating me like a miracle talking dog, as they just couldn't seem to get their minds around me being able to figure out their systems and contribute something impactful. They kept asking me - how did you do that?
Alternatively, maybe they just didn't think a 4x speedup would be possible? Or their internal culture discouraged people from making improvements to code they didn't originally author? Or they thought you did impressive work?
Mostly by eliminating a lot of individual copying steps. I also worked out how the texture swizzling worked and wrote directly to that format, combined all the steps, and wrote very good code that used MMX. These types of optimizations often add speed at the expense of generality, but I still think they are useful.
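
For anyone wondering what "writing directly to the swizzled format" can look like, here is a rough sketch in C. It is not the original code. It assumes a Morton (Z-order) swizzle on a square power-of-two texture and an 8-bit grayscale to RGB565 conversion, and the names `spread_bits`, `swizzle_offset`, and `gray_to_swizzled_rgb565` are made up for illustration. Real hardware swizzle layouts differ, and this version is plain scalar C with no MMX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Spread the low 16 bits of v so that a zero bit sits between each
   original bit (bit i moves to bit 2*i). */
static uint32_t spread_bits(uint32_t v) {
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

/* Assumed layout: Morton order, x in the even bits, y in the odd bits. */
static uint32_t swizzle_offset(uint32_t x, uint32_t y) {
    return spread_bits(x) | (spread_bits(y) << 1);
}

/* Fused pass: convert each linear 8-bit gray pixel to RGB565 and store it
   straight at its swizzled position. There is no intermediate linear
   RGB565 buffer and no separate "swizzle" copy afterwards. */
static void gray_to_swizzled_rgb565(const uint8_t *src, uint16_t *dst,
                                    uint32_t size) {
    /* The sketch only handles square, power-of-two textures. */
    assert(size != 0 && (size & (size - 1)) == 0 && size <= 4096);

    for (uint32_t y = 0; y < size; y++) {
        for (uint32_t x = 0; x < size; x++) {
            uint8_t g = src[(size_t)y * size + x];
            uint16_t pixel = (uint16_t)(((g >> 3) << 11) |  /* 5-bit red   */
                                        ((g >> 2) << 5)  |  /* 6-bit green */
                                         (g >> 3));         /* 5-bit blue  */
            dst[swizzle_offset(x, y)] = pixel;
        }
    }
}
```

The trade-off mentioned above shows up directly: the function only works for one source format, one destination format, and one swizzle layout, which is exactly what lets it skip the generic copy steps.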