After some heated discussion came up in the "what are you working on" thread about how anal one should be about shaving those tris down, I guess it's better to move the topic here.
This was the presented model; it's fairly low-poly already, and has beveled edges in a fashion that allows UV mirroring and gives nice smooth normals (and is therefore good for baking and runtime interpolation).
The proposed version, while using fewer tris, will have some issues on the interpolation side of things. (paintover!!)
Now, the major issue was "less = better", which is not always right. Given the way the driver/API works, batches are the limiting factor in a single frame, that is, drawcalls. The fewer drawcalls the better; the triangle count per batch doesn't really matter much, and below certain thresholds makes no difference at all. A very good paper explaining the side effects and the phenomenon is this paper by nvidia.
Be aware that the "high-end" card was a gf5 back then, and even the gf2 was saturated at 130 tris, i.e. it made no difference if you sent fewer triangles per batch. A slightly newer version of that paper (wow, radeon9600) shows that there is basically no difference at all whether you send 10 or 200 triangles. And this "threshold" below which it doesn't matter is constantly rising (those numbers are from 2004/2005). Today's engines are all about crunching as much as possible into as few drawcalls as possible, so the triangle count of individual objects becomes less of an issue. There are certain types of objects, rendered with "instancing" setups or "skinning", which are supposed to have fewer vertices than other stuff, but even there the technical variety is just too huge.
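To make the drawcall argument concrete, here is a tiny toy cost model in plain C++ (all the numbers are invented for illustration, they are not taken from the nvidia papers): the same total triangle count costs wildly different amounts depending on how many batches it is split into.

```cpp
// Toy model: frame cost dominated by per-drawcall CPU overhead.
// All numbers are made up for illustration only.
#include <algorithm>
#include <cstdio>

int main() {
    const double usPerDraw        = 30.0;  // hypothetical CPU/driver cost per drawcall
    const double usPerMillionTris = 500.0; // hypothetical GPU triangle throughput cost

    auto frameCostUs = [&](int draws, int trisPerDraw) {
        double cpu = draws * usPerDraw;
        double gpu = draws * trisPerDraw * usPerMillionTris / 1e6;
        // CPU and GPU work in parallel; the slower side limits the frame.
        return std::max(cpu, gpu);
    };

    // Same 200k triangles total, split into different batch sizes:
    std::printf("2000 draws x  100 tris: %.0f us\n", frameCostUs(2000, 100));
    std::printf("1000 draws x  200 tris: %.0f us\n", frameCostUs(1000, 200));
    std::printf("  50 draws x 4000 tris: %.0f us\n", frameCostUs(50, 4000));
    return 0;
}
```

The point: below the GPU's saturation threshold the per-batch triangle count barely registers, while every extra drawcall adds a fixed chunk of CPU/driver overhead.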
In short: make it look good and don't go insane about shaving tris away, unless you're doing PSP/NDS or whatever mobile platform, or RTS instanced units. Any modern card can crunch so many triangles that this is not the real limiting factor.
If you run the Crysis sandbox editor, you will see there are well over a million tris per frame, and it still runs rather smooth (and the editor is even slower than the game). In EnemyTerritory the whole 3d terrain is rendered fully... What matters is the total drawcall count per frame, and not just meshes but postfx and so on are part of that, yes, even GUI elements. A lot of the cost comes from shading the pixels, and no matter how low-res you make your box, if texture and size on screen remain the same, the shading costs will be identical. The few extra vertices that need to be calculated are negligible.
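A quick back-of-envelope sketch of that pixel-shading point (again with invented cycle counts, just to show the proportions): the shading bill depends on the screen pixels covered, so it is identical for a low-poly and a high-poly version of the same object.

```cpp
// Back-of-envelope: pixel shading cost vs. vertex cost for the same object.
// Cycle counts are invented for illustration only.
#include <cstdio>

int main() {
    const long long pixelsOnScreen  = 300LL * 300LL; // object covers ~90k pixels either way
    const long long cyclesPerPixel  = 80;            // hypothetical pixel shader cost
    const long long cyclesPerVertex = 40;            // hypothetical vertex shader cost

    const long long lowPolyVerts  = 24;  // a simple box
    const long long highPolyVerts = 600; // the same box with generous bevels

    const long long pixelCost = pixelsOnScreen * cyclesPerPixel; // identical for both meshes
    std::printf("pixel shading  : %lld cycles\n", pixelCost);
    std::printf("low-poly verts : %lld cycles\n", lowPolyVerts * cyclesPerVertex);
    std::printf("high-poly verts: %lld cycles\n", highPolyVerts * cyclesPerVertex);
    // ~7.2M cycles of pixel work vs. ~1k-24k cycles of vertex work:
    // the extra vertices vanish in the noise.
    return 0;
}
```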
There are reasons it takes Crytek, Epic and other guys with tons of experience to get the most out of the hardware: they do a lot of performance hunting and profile how CPU/GPU are stressed in a scene. I mean, Rorshach wasn't telling you guys this for no reason; after all, their studio works at the real "limits" and gets more out of the hardware than most. Hence some "veterans'" views will surprise you, simply because they are better informed and work for the "top studios". I don't want to diss anyone's opinion; it's not like people made these rules up themselves, it's just that times move on very quickly. Modern cards can crunch an insane workload, but it's all about the feeding mechanism...
So yes, you might be doing some MMO which has to run on all kinds of crap hardware, but even there the hardware principles stay the same, and looking at Steam's hardware survey, there is a lot of capable hardware around...
After all this, I don't mean to imply that optimization is bad or not needed; if you can get the work done with less without sacrificing quality, then do it. But there is a certain level where the work/effect payoff just isn't there anymore. (This will differ for older platforms (PS2, mobiles).)
There is another caveat though, and that is the problem with "micro/thin" triangles. If you have triangles that hardly ever cover more than a few pixels on screen, they kill the parallelism of the GPU. This should be taken into account, i.e. do not go overboard with micro-beveling.
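Here is a rough sketch of why slivers hurt, assuming the usual 2x2-quad rasterization GPUs do (the strip dimensions are illustrative):

```cpp
// Rough sketch: GPUs shade pixels in 2x2 quads, so every quad a triangle
// touches pays for 4 pixel-shader invocations even if only one pixel is
// actually covered. Numbers are illustrative.
#include <cstdio>

int main() {
    // A diagonal sliver triangle covering a 1-pixel-wide, 100-pixel-long strip:
    const int coveredPixels = 100;
    const int touchedQuads  = 100;              // roughly one 2x2 quad per pixel of the sliver
    const int shadedLanes   = touchedQuads * 4; // all 4 lanes of each quad run the shader

    std::printf("useful pixels: %d, shader invocations: %d (%.0f%% wasted)\n",
                coveredPixels, shadedLanes,
                100.0 * (shadedLanes - coveredPixels) / shadedLanes);
    // A fat triangle covering the same 100 pixels as a 10x10 block touches
    // ~25 quads => ~100 invocations, close to zero waste.
    return 0;
}
```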
Make sure to read Kevin Johnstone's posts.