In very simple words, think of a shader as a very simple processor.
Now, in their latest GTX 280 GPU, Nvidia has 240 simple shaders. Each one can do a multiply-add plus a multiply per clock, which is counted as 3 floating-point operations, so the total number of operations per clock is 720. The shaders are clocked at about 1300 MHz (1296 MHz to be exact), giving a total theoretical performance of around 933 gigaflops.
ATI, in their latest HD 4870 GPU, has 160 complex shaders, each made up of 5 ALUs. That gives 800 ALUs in total, and ATI prefers to call this the number of stream processors. Each ALU can do a multiply-add per clock, which is counted as 2 floating-point operations, and they operate at a speed of 750 MHz, giving a total theoretical performance of 1200 gigaflops.
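To make the arithmetic behind those two figures easy to check, here is a rough sketch in Python. The helper name `peak_gflops` is just made up for this post. It assumes the usual marketing way of counting: 3 floating-point operations per clock per shader on the GTX 280 (multiply-add plus multiply) and 2 per ALU on the HD 4870 (multiply-add), with the GTX 280 shader clock taken as 1296 MHz.

```python
def peak_gflops(alus, flops_per_clock, clock_mhz):
    """Theoretical peak = ALUs x flops per ALU per clock x clock rate (MHz -> GFLOPS)."""
    return alus * flops_per_clock * clock_mhz / 1000.0

# GTX 280: 240 scalar shaders, assumed 3 flops per clock each, 1296 MHz shader clock
gtx280 = peak_gflops(alus=240, flops_per_clock=3, clock_mhz=1296)

# HD 4870: 160 shader units x 5 ALUs each, assumed 2 flops per clock per ALU, 750 MHz
hd4870 = peak_gflops(alus=160 * 5, flops_per_clock=2, clock_mhz=750)

print(f"GTX 280: {gtx280:.0f} GFLOPS")
print(f"HD 4870: {hd4870:.0f} GFLOPS")
```

These are paper numbers only. They assume every ALU is kept busy every clock, which real games rarely manage, especially with the 5-wide units on the ATI side.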
So ATI basically wins in terms of raw performance, and they also have better technology, since the HD 4870 die size (the size of the chip, which is one of the most important economic factors) is around half of the GTX 280 die size. However, the GTX 280 usually performs better because games nowadays are not that shader-bound yet, and texture fillrate is higher on the GTX 280. Also, Nvidia pays money to game developers to code their game engines to work better on their GPUs. You don't see the "The Way It's Meant to Be Played" logo for no reason.
