[Technical Talk] - FAQ: Game art optimisation (do polygon counts really matter?)

CrazyButcher polycounter lvl 14
after some heated discussion came up in the "what are you working on" thread about how obsessive one should be about shaving those tris down, I guess it's better to move the topic here.

this was the presented model. it's fairly low already, and has beveled edges in a fashion that allows UV mirroring and gives nice smooth normals (therefore it's good for baking and runtime interpolation).

[image: Tacklebox.gif]

the proposed version, while using fewer tris, will have some issues on the interpolation side of things. (paintover!)

[image: c33e83818e.jpg]


now the major issue was "less = better", which is not always right. basically, the way the driver/API works, batches are the limiting factor in a single frame, that is, drawcalls. the fewer the better; the triangle count per batch doesn't really matter, and below certain thresholds it makes no difference at all. a very good paper explaining the side effects and phenomena is this one by nvidia:
http://developer.nvidia.com/docs/IO/8230/BatchBatchBatch.pdf
be aware that the "high-end" card was a gf5 back then, and even the gf2 was saturated at 130 tris, i.e. it made no difference if there were fewer triangles per batch. a slightly newer version of that paper (wow, radeon 9600):
http://http.download.nvidia.com/develope...ptimization.pdf
shows how there is basically no difference at all whether you send 10 or 200 triangles. and this "threshold" where it stops mattering is constantly rising (those numbers are from 2004/2005). today's engines are all about crunching as much as possible into as few drawcalls as possible, so the triangle count of single objects becomes less of an issue. there are certain types of objects that are rendered using "instancing" setups or "skinning", which are supposed to have fewer vertices than other stuff, but even there the technical variety is just too huge.
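a toy model of that batch-vs-triangle tradeoff. all the numbers here are made up just to illustrate the shape of the cost, they are not from the nvidia paper:

```python
# Toy frame-cost model: each drawcall pays a fixed CPU/driver overhead,
# and below a saturation threshold the per-batch triangle count is
# hidden inside that overhead. Numbers are illustrative assumptions.

DRAWCALL_OVERHEAD_US = 30.0   # fixed cost per batch (assumed)
TRI_COST_US = 0.005           # marginal GPU cost per triangle (assumed)
SATURATION_TRIS = 300         # below this, tri count per batch is "free"

def batch_cost_us(tris):
    # only triangles beyond the saturation point show up in frame time
    return DRAWCALL_OVERHEAD_US + max(0, tris - SATURATION_TRIS) * TRI_COST_US

# a 10-tri box and a 200-tri box cost the same per drawcall:
print(batch_cost_us(10), batch_cost_us(200))
# halving the batch count matters far more than halving triangles:
print(1000 * batch_cost_us(200), "us for 1000 batches")
print(500 * batch_cost_us(400), "us for 500 batches")
```

under this (crude) model, merging batches wins even when the merged batches carry more triangles each.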

in short: make it look good, don't go insane about shaving tris away, unless you target PSP/NDS or other mobile hardware, or RTS-style instanced units. any modern card can crunch so many triangles that this is not the real limiting factor.

if you run the crysis sandbox editor, you will see it pushes more than a million tris per frame, and it still runs rather smoothly (the editor is quite a bit faster than the game). in EnemyTerritory the whole 3d terrain is rendered fully... what matters is the total drawcalls per frame, and not just meshes but postfx and so on are part of that, yes, even gui elements. a lot of the cost comes from shading the pixels, and no matter how low-res you make your box, if the texture and size on screen remain the same, the shading cost will be identical. the few extra vertices that need to be calculated are negligible.
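a quick sketch of why the low-res and high-res box cost about the same once pixel shading dominates (illustrative numbers, assumed costs):

```python
# Sketch: pixel-shading cost depends on screen coverage and shader cost,
# not on how many triangles the mesh was built from. The per-triangle
# vertex term is a comparatively tiny addition. All numbers assumed.

def frame_cost(pixels_covered, shader_cost_per_pixel, tri_count,
               vertex_cost_per_tri=3.0):
    # same texture + same size on screen => same pixel work
    return pixels_covered * shader_cost_per_pixel + tri_count * vertex_cost_per_tri

lowres = frame_cost(pixels_covered=50_000, shader_cost_per_pixel=8.0, tri_count=12)
hires  = frame_cost(pixels_covered=50_000, shader_cost_per_pixel=8.0, tri_count=300)
print(hires / lowres)  # barely above 1.0: the extra verts are noise
```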

there are reasons it takes crytek, epic and other guys with tons of experience to get the most out of the hardware: they do a lot of performance hunting and profile how cpu/gpu are stressed in a scene. I mean, Rorshach wasn't telling you guys this for no reason; after all, their studio works at the real "limits" and gets more out of the hardware than most. hence some "veterans'" views will surprise you, simply because they are better informed and work for the "top studios". I don't want to diss anyone's opinion, it's not like people made this stuff up themselves, it's just that times move on very quickly. modern cards can crunch an insane workload, but it's all about the feeding mechanism...

so yes, you might be doing some mmo which runs on every crap hardware, but even there the hardware principles stay the same, and looking at steam's hardware survey, there is a lot of capable hardware around...

after all, this should not imply that optimization is bad or not needed; if you can get the work done with less and not sacrifice quality, then do it. but there is a certain level where the work/effect payoff just isn't there anymore. (this will differ for older platforms like ps2 and mobiles.)


----

there is another caveat though, and that is the problem with "micro/thin" triangles. if you have triangles that hardly ever occupy many pixels on screen, it kills the parallelism of the GPU. which should be taken into account, i.e. do not go overboard with micro-bevelling.
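a crude way to picture the micro-triangle problem: GPUs shade pixels in 2x2 "quads", so a triangle covering only a couple of pixels still pays for whole quads, and partially covered quads are wasted shading lanes. this is a rough model, not an exact hardware simulation:

```python
import math

# Rough model of quad overshading: assume (pessimistically) that each
# tiny triangle touches its own set of 2x2 quads, plus one partial quad.

def shaded_pixels(covered_pixels, tri_count):
    quads_per_tri = math.ceil(covered_pixels / tri_count / 4) + 1
    return tri_count * quads_per_tri * 4

big   = shaded_pixels(10_000, tri_count=10)     # a few large triangles
micro = shaded_pixels(10_000, tri_count=5_000)  # a swarm of ~2-pixel tris
print(big, micro)  # same screen coverage, far more shading work for micro tris
```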


make sure to read Kevin Johnstone's posts
http://boards.polycount.net/showpost.php?p=762412&postcount=5

Replies

  • MoP
    nice writeup and info here crazybutcher, but i notice you're only referring to "runtime" performance of stuff.
    what about loading things into memory, or collision calculations etc.?
    if you have 100 duplicates of an object and it's not instanced, the extra polygons are going to add up quite quickly on the memory side of things.
    any more info on stuff like that, rather than just brute-force rendering of tris, would be cool :)

    you also have to bear in mind stencil shadows, more silhouette polys lead to greater stress there. not every engine uses lightmaps.

    good thread tho :D
  • perna
    What this translates to is that even on ancient hardware you can actually ADD polies to that box and maintain the exact same performance rendering it... which in turn means that, yes, there IS such a thing as "insignificant optimization".

    Now, Ror has the brains to guard his words, but I don't, so I'll say to you earlier optimization-worshippers: When a whole bloody torrent of highly experienced professionals tell you you're wrong, maybe you should stop being so bullheaded and actually try to research the validity of your claims.
  • CrazyButcher
    yeah mop, you are right, stencil shadows are definitely an ugly case where more tris = more work. but I don't think stencil shadows have much of a future, personal opinion though ;)
    or if silhouette extraction is moved to the GPU, the tri count will not be soooo important anymore.

    loading into memory: well, vertex counts and texture memory are the limiting factors. I remember having made a similar thread once, stating that vertices weigh significantly less, too. a triangle in a strip weighs just 4 or 2 bytes, and in the non-strip case three times as much.

    collision is software stuff and might use a dedicated lowest-lod geometry. it's a whole different story I would not want to touch, but yes, here also less is better. however, dynamic objects are often approximated with primitives such as boxes, spheres... to keep costs down. and static environment stuff is also fairly optimized to cope with a few more tris.

    about "instancing": engines will instance stuff anyway; I mostly meant higher-level techniques that do the rendering in fewer drawcalls. you will always load that box into memory only once, and have a lightweight representation for every time you use it (pos, rot, scale in world + a handle to the actual geometry). it's just that non-instanced rendering means a drawcall every time you render the box at another position.

    of course you might make totally unique box variants, like a mesh deformed permanently, but that would be exactly like modelling two boxes.

    after all, we are talking about a very optimized model already; it's not like the box is 1000 tris. it's just that the quality/speed tradeoff for < 300 or so simply isn't worth it, considering the rendering pipeline.
  • Kevin Johnstone
    I spent a solid few months optimizing polys, lightmap UV channels and collision meshes for everything in UT, and the act of stripping 2 million polys out of a level generally improved the FPS by 2 or 3 frames.

    Polycount is not the huge issue people were previously sure it was. The bigger issue now is texture resolution, because all assets carry 3 textures as standard (normal map, diffuse and spec), and that's before you have additional mask/light textures for emissives and reflection or whatever other stuff you are supporting in the shader.

    Shader complexity is also a bigger issue now because it requires longer rendering time.
    Section counts are a bigger issue too: meshes carrying 2 of each texture thus require a 2nd rendering pass.

    I can't explain things technically enough, but the coders have explained to me a couple of times that just beating everything down with optimization on things like polycount doesn't affect things as much, because different things are CPU or GPU bound.

    Mesh count is a big issue now that everything is a static mesh rather than the majority being BSP. BSP is terribly inefficient compared to mesh rendering also.

    A mesh is pretty much free for the 1st 600 polys, beyond that its cost can be reduced dramatically by using lightmaps for generating self shadowing and so on rather than vertex lighting.

    The reason I was also saying I wouldn't take out the horizontal spans on this piece was largely because, as an environment artist, you have to be thinking about the crimes against scale the level designers will often commit with your work to make a scene work.

    Just because I know it's a box doesn't mean it won't get used as something else much larger, so I always try to make sure it can hold up, whatever it is, at 4 times the scale!

    Butcher mentioned instancing; this is another feature that we relied upon much more heavily to gain performance.
    Due to textures/BSP being more expensive now and polycounts cheaper, we made things modular, very very modular.

    For instance, I made a square 384x384 straight wall using a BSP tiling texture, and generated about 60 modular lego pieces that use the same texture and all fit with themselves and each other, to replace BSP shelling-out of levels.

    This led to lots of optimizations in general, quick and easy shelling of levels, and it gave our levels a baseline for the addition of new forms to the base geometry.

    I doubt I'm changing anyone's opinion here; maybe making a normal-map-driven next-gen game will convince you though.
    And by next gen I just mean the current new technology like id's new engine, the crysis engine, UE3 etc., because they are normal map driven and the press likes fancy names for a simple progression of technology.
    r.
  • MoP
    Heh, Rorshach, in our engine you can't scale static models, so everything stays the same size unless someone explicitly exports a different version of the model file :)

    And I would expect a good level designer would know not to madly scale up an object obviously designed to be used as a small prop :/

    But yes good info on all fronts, cheers guys.
  • CrazyButcher
    I should add that triangles are stored as "indexed" lists. that means you get a huge array of all vertices, and a triangle is made from either 3 indices into that array, or just 1 new index in a strip.

    say for a quad you have verts = [A,B,C,D]. and now you store the triangle info (indices starting at 1):
    as list: [1,2,3, 2,3,4]
    as strip: [1,2,3, 4]

    each index normally weighs as much as half an uncompressed pixel (2 bytes). and that's it, triangles really just add some more indices to that index list. their memory is sooo tiny compared to the rest... (except for collision stuff or stencils, where you need per-face normals as well)
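    the quad example above, worked out in bytes, assuming 16-bit indices (2 bytes each, the "half an uncompressed pixel"):

```python
import struct

# The quad ABCD as an indexed triangle list vs a triangle strip.
verts = ["A", "B", "C", "D"]
as_list  = [1, 2, 3,  2, 3, 4]   # 3 indices per triangle
as_strip = [1, 2, 3,  4]         # 1 new index per extra triangle

# "H" = unsigned 16-bit integer, the common index size
list_bytes  = len(struct.pack(f"{len(as_list)}H",  *as_list))
strip_bytes = len(struct.pack(f"{len(as_strip)}H", *as_strip))
print(list_bytes, strip_bytes)  # 12 vs 8 bytes for the two triangles
```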

    the texture memory mentioned by ror is the one thing to really optimize for, as you can only crunch x megs of texture memory into your graphics card per frame.

    the other thing that may reside in memory is static meshes. note that static here means we don't want to change vertices individually "by hand"; we change them through shaders (bones), or through spatial placement as a whole. which is nearly all the vertices we normally see in a frame. the opposite is data that is generated/manipulated more fundamentally every frame, think particles.
    now a game vertex is mostly like 32 or 64 bytes. which means a 512x512 compressed texture gives you about 8192 "small" vertices or 4096 "fat" vertices. their weight depends on how accurate and how much extra data you need per vertex (second UVs, vertex color...); call it a bit less than 4k for an even fatter vertex format.

    now the third texture-memory-eating thing is the special effects textures, which are uncompressed and can suck up quite some megs depending on window resolution.
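    a back-of-envelope check of those vertex-vs-texture numbers, assuming a DXT5-style compressed texture at 1 byte per pixel (an assumption; DXT1 would be half that):

```python
# 512x512 compressed texture vs vertex memory, using the sizes above.
TEX_BYTES  = 512 * 512 * 1   # ~256 KiB at 1 byte/pixel (assumed DXT5-like)
SMALL_VERT = 32               # pos + normal + one UV, roughly
FAT_VERT   = 64               # extra UVs, color, tangents...

print(TEX_BYTES // SMALL_VERT)  # 8192 "small" vertices per texture
print(TEX_BYTES // FAT_VERT)    # 4096 "fat" vertices per texture
```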


    now the "other" memory costs are on the application side, in regular RAM: collision info, "where object x is", relationships... that memory is mostly never a big problem on PC. just on consoles you need to be more clever about not loading everything into RAM, as they are more limited.

    so we have X memory of what fits into the graphics card, which is "texture memory", "framebuffer memory + effect textures", "static meshes" and "shaders" (which are ultra tiny compared to the rest).

    now of course, say we have a giant world and want to move around; it will be impossible to preload every texture/mesh into graphics memory. so we must cleverly unload/load some while we move around, so no one notices. the amount we can send over without hiccups is not a lot (keep in mind that the rest of your scene is still being rendered). so one must be very clever about swapping out textures. hence the modularity and "reuse" rorschach mentioned is even more important:
    not only does it allow per-frame batching of "same state" objects, but in the long run it means less swapping occurs.

    now what happens if I have more active textures in the scene than my graphics card can actually hold? then the driver will kick in and send a cached copy over (very ugly stalls). the driver also optimizes when to reload stuff, as most loading from RAM to video RAM is done asynchronously (i.e. calling the function doesn't mean it happens right now; you let the driver do it when it wants to). so now we have the driver in the memory equation as well. some clever strategies by the driver optimization gurus at AMD/NVidia might create hiccups in "non common" situations. but what is "common"? if some new major game comes out with a very specific new way to tackle a problem, we see new drivers magically appearing making for smoother rides; of course they might optimize a lot more in those drivers, but anyway...

    you get a brief idea of the complexity of all this, and why the most common reply on these boards regarding polycounts and whatnot is "it depends".
  • Kevin Johnstone
    Mop: sure, but there are always cases where they need to fill in gaps, and anything will do when you can scale things.

    Clearly a box prop isn't the best example; i was trying to describe a general attitude toward environment asset creation to substitute for the "everything must go" all-purpose optimization attitude.

    Also another reason for extra polys in UE3 is smoothing groups. We try to use one smoothing group because it renders a better normal map.

    The optimization-at-all-costs method led me to find out that reducing polycount and controlling the smoothing with more smoothing groups costs as much as using more polys to create a better single smoothing group, because the engine doesn't actually have smoothing groups: it just doubles the edge count where the smoothing group changes.

    Which costs as much or more than adding additional chamfered edges to support a cleaner smoothing group.

    I still go back and forth on this issue myself, but generally the consensus at Epic is that it's better to use more polys and process out a normal map that will render well with light hitting it from any angle.

    If you go the purist optimization route (as I did at the beginning of the project) and optimize with smoothing groups to control things and have half the polycount, you end up with normals that look good only when hit by light from certain angles, and it's still just as expensive.

    Again, I doubt anyone who has a different view is going to be changed by this information. I didn't change my opinions until I had it re-proven to me dozens of times.

    r.
  • MoP
    Yep, I totally agree that for normalmapped stuff it's way better to have some extra polys and keep a single smoothing group, than to use separate smoothing angles or groups, since as you say they pretty much amount to the same vertex memory anyway, and the former gives a better normalmap.

    One of our senior programmers told me that some graphics cards can have a harder time rendering long thin tris at long range (such as those you'd get by having thin bevels instead of smoothing groups), but I don't know to what extent this would impact performance; not much, it seems.
  • Rick Stirling
    Great to see this all written down with some proper technical backing to it. while there is no excuse whatsoever to piss away polygons for the sake of using them, and those verts do add up, it's ALWAYS textures, textures, textures that are the main bottleneck.

    As for collision information, in most cases there is little issue with optimising that mesh down. I know we don't use the full-resolution mesh for collision; in many/most cases the LOD is used for collision. For characters it's pretty much a series of boxes, because let's be honest: we might have modelled fingers and the inside of the mouth, but when it comes to collision all you care about is the hand and the head.

    This bit interested me:

    [ QUOTE ]
    A mesh is pretty much free for the 1st 600 polys, beyond that its cost can be reduced dramatically by using lightmaps for generating self shadowing and so on rather than vertex lighting.


    [/ QUOTE ]

    Is that because you don't have to store that data on a per vert basis?
  • perna
    It only makes sense that the objects you're likely to scale in a map are also the ones you instance a lot. Rocks, ruins, vegetation, etc. You're going to see good performance on that so the polycount isn't of the highest importance. Especially taking into account that most of those objects will be lowpoly to start with.

    I never use cages for rendering normal maps anymore. It turns out to be horrible workflow-wise: they most often break when you start editing the original mesh, and in the case of 3ds max they restrict the processing rays to straight paths instead of bending them nicely for a smoother result. Instead of ray cages, you're mostly better off simply adding more geo to the object. You can always make that geo count visually, as well as cleaning up your normal map results.

    As for temporarily adding geo to a copy of the object for baking, then using the object without that added geo for your exports, I haven't done that in a long time, but am sure it would be useful in cases still.

    I'll use smoothing groups at times because clients can be very insistent on respecting polycounts and nothing else.


    Basically I believe a lot of people are so insistent on the anal poly-reduction because:
    -they're comfortable with it
    -it's very straightforward and easy to understand
    -they've developed poly-optimization skills they're proud of and don't like to hear that those skills aren't as important as they believed ;)

    I think most of us that have been modeling for a while mastered poly optimization a long time ago. What defines you as a 3d artist now is how good you can make stuff look. It's not like the "old days" (of just a few years ago) when pretty much all games looked really bad and your success was defined by how well you could get stuff to run.

    Take the polycount your art lead gives you and use ALL of it. Don't try to impress anyone by saying you used half your budget. If you NEEDED only half the budget... then he would have given you half!
  • perna
    I'm not sure this has really been given much attention: it's always possible that an engine will join meshes and send them in one batch. This depends on texture use/reuse, the degree to which the overhead is worth it, and so on. What that results in is pushing the polycount of the final object past the safe point (the 600 used by Epic, as a reference). Then you've broken the barrier and now start seeing polycount relevance in render times; you may be up to 10000 tris in one batch now. However, the fact that you're merging several batches is going to save performance anyway... that is, after all, why you do it in the first place... which cancels out the fact that you're now operating with a higher polycount.

    It gets complicated, and the performance result is individual to each scenario, each engine, each game. Nobody will expect a 3d artist to know these things intimately. But it pretty much boils down to this: if it'll benefit your model a lot to go 200 tris beyond the budget you were given, go ahead and do it. You're not going to make performance drop to 15 FPS. You've got to remember that the budget you were given is pretty much pure guesswork to begin with ;)

    That doesn't mean you shouldn't optimize; it means you should know what gives you the most bang for your buck.
  • Joao Sapiro
    amazing info here guys, keep it coming :) i have a question:

    since smoothing groups are basically a detaching of faces (hence the increase in vert count, since the verts are duplicated), if you have one continuous mesh and one with smoothing groups, which one would be faster to render? my assumption is the continuous one, since there isn't any overlapping vertex, but i would like to know more about the implementation of smoothing groups on assets: when are they a must, and when is it better to manage smoothing via polygons?

    i dont make sense.
  • Kevin Johnstone
    You only learn when and where to break the rules once you've spent months doing it. I am sorry that this sounds like a cheap answer, but it's the truth.

    I've seen my processing, with the cage, with multiple smoothing groups, then switching to 1 smoothing group on a basic non-chamfered wall shape so the smoothing is REALLY stretched and showing lots of horrible black-to-white gradients in max.

    When I take that ingame, the smoothing forces the engine to bend the normals, so when the level is rebuilt you get a LOT more normals popping out.

    Rick: that might be it; i don't remember all the technical reasons for each thing working as it does. i remember more what works, simply from habit now, as there are so many more rules and whatnot to bear in mind.

    k.
  • perna
    Johny, you make sense.
    3d hardware has no concept of smoothing groups. A vertex can only contain one normal, so what we call "smoothing groups" is actually just "split geometry".
    So here are the real vertex counts, with the object in the middle having smoothing groups.
    [image: per128_smoothinggroup_vertexcount.jpg]

    Keep in mind that the same goes for UV coordinates. Just one coordinate per vertex.
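    a small sketch of how "smoothing groups" become split geometry: the GPU vertex is the unique tuple of ALL its attributes, so a position reused with two different normals counts twice (vertex names and attribute labels here are made up for illustration):

```python
# Count GPU vertices as unique (position, normal, uv) tuples.
def gpu_vertex_count(faces):
    unique = set()
    for tri in faces:
        for vert in tri:
            unique.add(vert)
    return len(unique)

# One quad, fully smooth: both triangles share normals at the seam.
smooth = [[("A", "n", "uv"), ("B", "n", "uv"), ("C", "n", "uv")],
          [("B", "n", "uv"), ("C", "n", "uv"), ("D", "n", "uv")]]
# Same quad with a hard edge: B and C each appear with two normals.
hard   = [[("A", "n1", "uv"), ("B", "n1", "uv"), ("C", "n1", "uv")],
          [("B", "n2", "uv"), ("C", "n2", "uv"), ("D", "n2", "uv")]]

print(gpu_vertex_count(smooth))  # 4
print(gpu_vertex_count(hard))    # 6, the "split" costs two extra verts
```

    the same counting applies to UV seams, since the UV coordinate is just another field in the tuple.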
  • EarthQuake
    More split edges (smoothing groups) are going to give you a higher # of verts, and this will always be slower. How much slower it actually is in reality, I have no idea; probably not much.
  • perna
    UVmapping: in this example, the 3d model for the leg on the left will use fewer verts than the one on the right. It'll have more uv-distortion, but with uv-relax this is distributed throughout the UV island and is not an issue, especially when normalmapping.

    edit: lesson learned is: Keep your UVmaps as continuous as possible. If you're mapping a box, all six sides should be connected. Also keep in mind that if you bevel instead of using smoothing groups, it will still increase your vert count if there's a UV seam on the beveled area.
    [image: per128_uvmapping_vertexcount.jpg]
  • CrazyButcher
    [ QUOTE ]
    A mesh is pretty much free for the 1st 600 polys, beyond that its cost can be reduced dramatically by using lightmaps for generating self shadowing and so on rather than vertex lighting.

    [/ QUOTE ]

    do you mean a unique prebaked AO map? in theory, if you use a second UV for a unique map, that's 2 shorts = 4 bytes, which is as much as a vertex color, and therefore more expensive (you send the same amount per vertex, but still need to sample the AO map). it has to do with internal ut3-specific setups.

    looking at the ut3 demo shaders, I actually found that you guys probably have some very complex baked lighting stuff, which is better stored in textures. it seems to be more than just a color value. in fact, if the effect is done per vertex, 3 x float4 are sent, compared to the single float2 for a texture coordinate. which is a lot more; that is "not normal" for oldschool lightmapping ;) but probably some fancy quality thing you do. I haven't really reverse engineered the effect, but as a per-vertex effect it's indeed very fat. but maybe you mean realtime shadows and not baked stuff at all... edit: after some more diving into it, it's directional lightmapping like in hl2.

    anyway, this example shows that "it depends"; a magic value like the 600 has to do with vertex formats and effects, i.e. it's very engine specific.

    what the batch article by nvidia showed, however, is that there are engine-independent "limits", i.e. below say 300 today it makes no difference whether it's 1 tri or 300. (the numbers back then were around 200; I simply guessed 300 for today's cards)
  • EarthQuake
    I think that's referring to using lightmaps as opposed to stencil shadows? Not actually ambocc-type lightmaps.
  • hobodactyl
    Really cool thread! Per, I had a question:

    [ QUOTE ]
    I never use cages for rendering normal maps anymore. It turns out to be horrible workflow-wise: they most often break when you start editing the original mesh, and in the case of 3ds max they restrict the processing rays to straight paths instead of bending them nicely for a smoother result. Instead of ray cages, you're mostly better off simply adding more geo to the object. You can always make that geo count visually, as well as cleaning up your normal map results.

    [/ QUOTE ]

    I was confused by you saying you never use cages for rendering normal maps; I thought that was the only way to render normal maps? Sorry if this is a stupid question; do you just mean you don't use Max's cages?
  • perna
    hobo: terminology is a bit loose, so it can be confusing. The cages I'm talking about are usually copies of your lowpoly mesh which are deformed to control the length and direction of the normal map processing rays. You can generate a normal map just fine without a cage; it'll use the vertex normals and a configurable ray length.

    edit: in max, you can turn off the cage being shown in the projection modifier rollout, and disable it entirely in the render-to-texture menu (click Options, then Use Cage, now define a ray length)
  • EarthQuake
    You can use a cage, or you can simply use the "offset" function in max.
  • oXYnary
    Can we sticky this or add it to a PC wiki or something?

    One question:
    [ QUOTE ]

    edit: lesson learned is: Keep your UVmaps as continuous as possible. If you're mapping a box, all six sides should be connected. Also keep in mind that if you bevel instead of using smoothing groups, it will still increase your vert count if there's a UV seam on the beveled area.

    [/ QUOTE ]

    So anytime you have a texture seam it will detach the vertices in the engine?
  • Xenobond
    [ QUOTE ]
    Can we sticky this or add it to a PC wiki or something?

    One question:
    [ QUOTE ]

    edit: lesson learned is: Keep your UVmaps as continuous as possible. If you're mapping a box, all six sides should be connected. Also keep in mind that if you bevel instead of using smoothing groups, it will still increase your vert count if there's a UV seam on the beveled area.

    [/ QUOTE ]

    So anytime you have a texture seam it will detach the vertices in the engine?

    [/ QUOTE ]

    Yes. UV splits and smoothing group edges will split the vertices. I remember reading a pretty good article about this in a gd mag some years ago; I'll try to dig up that article on gamasutra.
  • perna
    oxy: yes, unfortunately. To help you visualize why, you can imagine the data associated with a vertex... the structure is always the same size, so you'll only get one UV coordinate. In max it seems you can have several uvs and several normals per vert, but even 3d modeling programs break stuff up; it's just done transparently to you. When you select and move one vert like that, you're actually moving several.

    Well, I'll ask CB to give an example of such a structure. It ties in with what he said earlier about triangles just indexing a list of verts.
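    one hypothetical example of such a fixed-size vertex structure, modeled with Python's struct module. this layout (position + normal + one UV) is just a plausible 32-byte arrangement, real engine formats vary:

```python
import struct

# position (3 floats) + normal (3 floats) + one UV (2 floats) = 32 bytes
VERTEX_FMT = "3f 3f 2f"
print(struct.calcsize(VERTEX_FMT))  # 32

# Every attribute lives in this one fixed-size record, which is why a
# second normal or a second UV for the "same" vertex forces a full copy.
packed = struct.pack(VERTEX_FMT, 0, 0, 0,  0, 0, 1,  0.5, 0.5)
print(len(packed))  # 32
```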
  • Rick Stirling
    A C&P from a half-written tech doc I was working on about uvs (and smoothing groups) breaking the tri-strips:

    [ QUOTE ]

    Many artists take the number of polygons in the model as the basis for model performance, but this is only a guideline. The real factor is the number of vertices in the model. As an artist, your 3d software will count the number of verts in the model; however, this is rarely the same number of verts that a game engine thinks there are.


    Put simply, certain modeling techniques break the triangle stripping routine, making the vert count in the game engine higher than the one reported in your 3d software. These attributes physically break the mesh into separate parts, and thus break triangle stripping algorithms.


    The most common of these are:
    Smoothing groups
    Material IDs
    UV seams


    [/ QUOTE ]
  • Xenobond
    Haha. Why am I not surprised.

    http://www.ericchadwick.com/examples/provost/byf1.html
    http://www.ericchadwick.com/examples/provost/byf2.html

    Part 2 talks more on the whole uv/smoothing/mat splits issue.
  • perna
    Rick: tristrips aren't relevant to your vertex count; they work differently. The idea is that a tri can be defined by one vert, "re-using" two from the previously drawn tri. This just reduces data traffic; the amount of vertex data remains exactly the same as without tristripping.

    Material IDs mean state changes, treating the "broken off" chunk as a separate object, which yes, will split the border verts.
  • CrazyButcher
    eric hosts that gamedesign paper, too. I am sure we are just minutes or hours away from him posting the links again ;)

    you get a fixed set of vertex attributes. Think position, color, normal + some extras like UV channels and tangent stuff. Simply due to pipelining, each vertex has no knowledge of the triangle it is part of, nor of anything else (okay, untrue for the latest geometry shaders). So a vertex cannot have 2 normals, or 2 UVs for the same UV channel, hence the split. There might be more splits that are not visible to you (mirrored UVs might be connected in Max, but broken for tangent-space stuff). Whenever such a split occurs, all other attributes are copied over, so the normal will stay the same, the color... but the cost is a full new vertex.
    A good deal of "viewport" performance depends on converting the internal 3d data (which is organized differently) into those graphics hardware vertices. Hence pure modelling apps, being "less complex" on the vertex/triangle level, can take more shortcuts and benefit in speed.
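
    CrazyButcher's splitting rule can be sketched in a few lines. This is a made-up illustration, not any engine's actual exporter: every unique (position, normal, uv) combination per triangle corner becomes one hardware vertex, so a hard edge or UV seam duplicates the verts along it:

```python
def gpu_vertex_count(corners):
    # One (position, normal, uv) tuple per triangle corner; corners
    # that differ in ANY attribute become separate hardware vertices.
    return len(set(corners))

# Two triangles sharing an edge, smooth shaded: the shared corners
# carry identical attributes, so they collapse into shared vertices.
smooth = [
    ((0,0,0), (0,0,1), (0,0)), ((1,0,0), (0,0,1), (1,0)), ((0,1,0), (0,0,1), (0,1)),
    ((1,0,0), (0,0,1), (1,0)), ((1,1,0), (0,0,1), (1,1)), ((0,1,0), (0,0,1), (0,1)),
]
# The same two triangles with a hard edge: the second triangle uses a
# different normal, so the verts along the shared edge get duplicated.
hard = [
    ((0,0,0), (0,0,1), (0,0)), ((1,0,0), (0,0,1), (1,0)), ((0,1,0), (0,0,1), (0,1)),
    ((1,0,0), (0,1,0), (1,0)), ((1,1,0), (0,1,0), (1,1)), ((0,1,0), (0,1,0), (0,1)),
]
print(gpu_vertex_count(smooth))  # 4
print(gpu_vertex_count(hard))    # 6
```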
  • monkeyscience
    monkeyscience polycounter lvl 12
    Before you go optimizing smoothing groups, consult your friendly neighborhood engine programmer: shared vertices may or may not be used at all by the engine, and your model exporter may ignore your hard work. The engineering term is vertex indexing, and there are some reasons not to always use it. ALL vertex data has to be the same for a vertex to be shared: position, normal, UVs, any shader parameters all have to match. If this doesn't happen often enough, indexing is wasteful.

    Also, graphics hardware still renders triangles, with no regard to shared data. Pur's example meshes would get rendered as 4, 4, and 6 triangles, or 12, 12, and 18 vertices. Indexing is only a way to compress data in memory and help transfer rates of meshes to the GPU. If transfer rate isn't the limiting factor but the computed vertex count is, smoothing groups won't help; neither will converting to quads or triangle strips. This usually happens with expensive vertex shaders like skeletal animation skinning or stencil shadow edge finding.
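
    monkeyscience's "indexing is wasteful if sharing is rare" point can be put into rough numbers. The byte sizes and meshes below are made-up assumptions (32-byte vertices, 16-bit indices), purely for illustration:

```python
VERT_BYTES, INDEX_BYTES = 32, 2  # assumed: pos+normal+uv, 16-bit indices

def non_indexed_bytes(num_tris):
    # Without indexing, every triangle corner is a full vertex.
    return 3 * num_tris * VERT_BYTES

def indexed_bytes(num_tris, unique_verts):
    # With indexing: one copy of each unique vertex, plus the index buffer.
    return unique_verts * VERT_BYTES + 3 * num_tris * INDEX_BYTES

# A well-shared grid mesh: 200 tris, ~121 unique verts -> indexing wins.
print(non_indexed_bytes(200), indexed_bytes(200, 121))
# Fully split particle quads: 200 tris, 600 unique verts -> it loses.
print(non_indexed_bytes(200), indexed_bytes(200, 600))
```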

    For everything else though, vertex count optimization just won't get you as far as it used to. Most normal-mapped games with fancy-pants shaders are fill rate or texture lookup limited. The three little computations to figure out where a triangle ends up on the screen are just prep work for the potentially thousands of pixels that need to be computed.

    "Fill rate limited", btw, means fill rate is way slower than other work the graphics card is doing, so it's best to start optimizing there. It does NOT mean all other optimization work should be neglected. That's common n00b programmer talk.

    If you do optimize polycount, do it only on shit that matters. Optimize either your half-million-poly models, or models that will be visible in large counts all at once. Spending time on a 10-poly reduction to a tool chest is only justified if somewhere in your game there's a big stack of tool chests visible all at once, and you actually shave thousands of polys from that scene.
  • JKMakowka
    Awesome info, thanks guys!

    One thing is still confusing me a bit... how does an engine differentiate between "regular" triangles and quad strips?
    CB already explained that strips are stored much more efficiently, but how can I influence that?
    Sorry if that is a stupid question wink.gif
  • perna
    perna ngon master
    JK: you can't, leave it to the engine/programmers smile.gif Well, actually you can... make strips! You can look up on Wikipedia how they work, and then you'll understand how to make geometry that'll split up better into strips.
    But in general, try to keep things nice and clean with quads. That's sufficient, and has many other advantages anyway. I mean, I think even those of us who understand this stuff intimately still don't go too far out of our way to create super-efficient meshes; it's just not a good way to spend your time. Focus on the main issues (don't make any of the huge basic mistakes), and make good looking art. That's really all you need to do. If the programmers want strips out of you, or anything else, they should tell you.
  • Kevin Johnstone
    Kevin Johnstone polycounter lvl 14
    Per: 'Just make good art' lol

    In the end this stuff gets so damn anus-bleedingly technical that 'Just make good art' and 'leave me the hell alone!' is really what this thread will boil down to for anyone attempting to see it through smile.gif

    Bottom line for me at this point is that UT3 is out and you can see exactly what I did there to work around things. Though obviously there's a lot of things I messed up, as some of that stuff is 3 years old to me now and pretty embarrassing.

    One key thing I feel I will have to point out about editing UT3 environment assets is that lightmaps are crucial.

    Lightmaps are a uniquely unwrapped 2nd set of UV coordinates that you unwrap for the engine to calculate self-shadowing on objects, and to reduce the cost of anything over 600 tris.

    They are required because most meshes have optimized texture UV layouts that reuse mirrored sections, and if the engine used those to calculate the self-shadowing it would look like ass, because it would try to apply shadows on both sides of a mirrored section when only one was in darkness.

    The lightmap UVs need huge amounts of space around each chunk, because the resolution of the lightmap will generally be 32x32 or 64x64, instead of the 1024x1024 resolution of the actual textures.

    You also need to leave a large space around the edge of the unwrap, 1 texel I am told. This is because when the lighting is rebuilt, all those 32 or 64 lightmap squares are compiled into large 1024 or 2048 sheets of lightmap information, so if you do not leave a space around the perimeter of the lightmap UVs, different lightmaps will bleed subtly into each other when compiled on the big sheet, creating a subtle shadow gradient artifact leaking out from the edges where the bleed occurs.
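
    Kevin's 1-texel gutter rule translates into UV units like this; a small made-up calculation (the function name is mine), showing how much empty UV space one texel of padding costs at typical lightmap resolutions:

```python
def gutter_uv(lightmap_res, texels=1):
    # Width of the requested padding in 0..1 UV space, at the
    # resolution the lightmap is actually baked at.
    return texels / lightmap_res

# At 32x32, one texel of padding is ~3% of the UV range per edge;
# at 64x64 it is half that.
print(gutter_uv(32))   # 0.03125
print(gutter_uv(64))   # 0.015625
```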

    You also need to split the lightmap UVs in each location where the normals are mirrored, so it doesn't bleed between the mirrored halves.

    When mirroring normals on the unwrap, you need to have the center point mirrored over the X axis horizontally, like a Rorschach, rather than mirroring vertically like a calendar page.

    This is because the normals are calculated from the combination of 3 tangents in code.

    r.
  • CrazyButcher
    CrazyButcher polycounter lvl 14
    JK: you don't need to worry about quads, strips and all that; the exporter or engine pipeline tools will take care of it. I just wanted to show the principle of how it's sent to the graphics card (ie as vertex indexed lists)

    MonkeyScience: when would you actually not use indexed lists? I can only think of very chunked meshes, like particle billboards or classic BSP brush sides, with a very low "sharing" ratio, but other than that it's kinda unlikely to not benefit from reuse, I think. Also, the performance papers I've read suggest using indexed primitives (like this one, even if a bit aged; I think indexed triangle lists are the most optimized way of rendering: http://ati.amd.com/developer/gdc/PerformanceTuning.pdf ). Of course the lists have to be ordered to make best use of the vertex cache, but drawing non-indexed takes away the benefit of the vertex cache completely.
    so for most "artist"-created triangles, I think it will always be indexed lists, no?
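
    The vertex-cache ordering CrazyButcher mentions can be sketched with a toy simulation. The cache size, FIFO policy and index orders below are made up for illustration (real hardware caches differ), but the effect is the same: the same triangles cost more re-transformed vertices when their indices jump around.

```python
from collections import deque

def cache_misses(indices, cache_size=4):
    # Toy FIFO post-transform cache: a miss means the vertex must be
    # transformed (again); a hit means it is reused from the cache.
    cache = deque(maxlen=cache_size)
    misses = 0
    for i in indices:
        if i not in cache:
            misses += 1
            cache.append(i)
    return misses

# The same 8-triangle ribbon, indexed in strip order vs. scattered:
strip_order = [0,1,2, 1,3,2, 2,3,4, 3,5,4, 4,5,6, 5,7,6, 6,7,8, 7,9,8]
scattered   = [0,1,2, 6,7,8, 1,3,2, 7,9,8, 2,3,4, 4,5,6, 3,5,4, 5,7,6]
print(cache_misses(strip_order))  # 10 (each vertex transformed once)
print(cache_misses(scattered))    # 17
```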
  • hobodactyl
    hobodactyl polycounter lvl 13
    Per: Thanks for the quick response! I thought that might have been what you were talking about since I'd seen it in Mudbox. I can see how that would be more time-efficient.
  • MoP
    MoP polycounter lvl 14
    Stickied because this thread is really good.
  • eld
    eld polycounter lvl 11
    (on the titles: triangle optimization is not always worth it, but vertex optimization always is wink.gif )

    I always work my optimizations with vertices, and I always consider other things: drawcalls, splits, and when you have to do more drawcalls due to different materials/textures.

    For consoles especially it's memory versus vertex-drawing power; then you also have to keep fillrate in mind to a slight degree (overlapping geometry).


    As for baking with a cage, as mentioned above: if you're having a hard time with the cage bake and your fingers are itching for the bevel buttons, just do a combination; bake one with a cage, and one without for the straight rays.

    It's just a texture, so you can combine the parts of the different renders that you like, so that you get correct edge normals and straight renders on surface details that might've gone perspective-skewed.


    And now again, memory, which should be a big part of this thread too, since you usually only have a small part of the 360's memory to work with smile.gif not even half of it in some cases.

    Reuse surfaces; don't mirror, but ROTATE!
  • perna
    perna ngon master
    [ QUOTE ]
    (on the titles: triangle optimization is not, but vertex is always wink.gif )

    [/ QUOTE ]
    Careful now buddy smile.gif Read the whole thread, particularly the opening post. It's nowhere near as simple as that, either, which is something we're trying to communicate here.

    We started this topic in the first place because a lot of people forget that the job of an artist is to make assets look great. Nobody will pat your back for making extreme optimizations, whether polygon or vertex based. With games now running more than a million polygons on-screen, it's all about priorities.

    The hard-to-accept fact for a lot of you is that you simply won't get by on mostly technical skills for long. The more time passes, the fewer technical restrictions there are for artists. Some of you may have struggled a long time to learn to optimize a few polies off a mesh, and are now told it's not so important anymore? Of course you're not going to like that.

    I see a great deal of threads and discussions on technical topics on these boards. But it gets silly... people go on about edgeflow... for a model that looks absolutely terrible. How is better edgeflow going to help? How is better use of polies going to help? The model will still look bad.

    It's just that those technical things can be learned by anyone, so it's easy for people to shoot off at the mouth about them, but learning to make stuff look great is a major challenge, one that far less are prepared for.

    It should give some food for thought that the people here who are the most technically capable and knowledgeable are also the people who care the least about those skills.
  • eld
    eld polycounter lvl 11
    Per, for me optimization means looking the same but costing less. I wouldn't sacrifice the looks for a bit of juice, but there's a lot that can be done to make things cost less without actually removing any looks; THAT's optimization.

    It's about knowing how stuff works as well, knowing a bit of tech as an artist.

    There's always a visible hardware barrier, and we're always hitting it, way more in some games than others, and games on modern consoles are still struggling to maintain framerate.


    While you're fully correct, Per, I can still see the headache that can come from a big team with only a few people thinking about optimizations smile.gif
  • Noren
    Noren polycounter lvl 14
    [ QUOTE ]
    In case of 3ds max (the cage ) restricts the processing rays to straight paths instead of bending them nicely creating a smoother result.

    [/ QUOTE ]

    Hi Per, can you elaborate on this please? Sounds wrong to me, but I might have misunderstood you.
  • perna
    perna ngon master
    Noren: There are two types of normal map processing cages: one that limits only length, and one that controls direction and length (like in Max).

    Controlling direction like that means the generator is not able to tweak the results ideally.

    Here's a test you can try: push a ray cage in Max out X units and render the normal map... then disable the cage and use an offset/ray length of the same X units instead. You should get the exact same result, right? But you don't; the non-cage output is going to be significantly better in most cases. Someone may have time to provide some screenshot examples.
  • Ruz
    Ruz greentooth
    In the first post the optimised version still looked pretty good, so what's the problem?
    Surely it's also about modelling 'just' enough detail to support the extra detail you are trying to bring out with the normal map.
    More about efficient modelling, really.
    If you can make something look good with 1000 polys, why make it with 1200?
  • JordanW
    JordanW sublime tool
    Ruz, I'm not sure the optimized version is an actual mesh with those changes made; I think it's a paintover, so the implications of the inaccurate normals are not shown.
  • perna
    perna ngon master
    Ruz: please bother to actually read the full thread. It's kind of dispiriting when a lot of us put in all this work to share some info and it goes in one ear and out the other.

    Did you read the bit that said a 100 poly mesh will not render faster than a 200 poly mesh? There's a limit where, if you optimize below it, you are just wasting your time.
  • Noren
    Noren polycounter lvl 14
    Per: I'm a 3ds Max user myself; that's why I got curious in the first place, because my experience with the cage has been different.
    And even now, if I render the test case you proposed, I get two exactly identical normal maps (Max 8, supersampling activated). I used a simple box here (one SG), and I almost always use the cage, except for occasions like those described by eld. So it can very well be that something slipped under my radar here, and I would be very interested if someone could provide an example of cage vs. no cage not matching up (cage just pushed, of course, not manipulated further).
    A big plus for me with the cage is that if you happen to work with smoothing groups, it will still interpolate the casting rays, so you don't wind up with missing parts in the map, while the normals stay correct.
  • eld
    eld polycounter lvl 11
    [ QUOTE ]
    ...the non-cage output is going to be significantly better in most cases. Someone may have time to provide some screenshot examples...

    [/ QUOTE ]

    You can use both, though, as a combined result (it's a texture after all smile.gif), as non-cage renders will usually shoot and miss their target on corners and such, while cages will usually do crazy renders on a big flat surface that has to have details rendered onto it.
  • Ruz
    Ruz greentooth
    Per, don't get downhearted.
    What I said was that you should be modelling just enough detail to support the model you are making.
    My point was that you shouldn't add more detail just for the sake of it; I didn't say anything about rendering speed.
    Personally I would keep taking out loops until I thought it was degrading too much in quality. It's about common sense a lot of the time.

    You guys seem to be talking mainly about high-end, next-gen stuff like the Unreal engine / Doom engine.

    What about MMOs or similar? I am sure that in the grand scale of things, polycount might have more of an impact there.
  • CrazyButcher
    CrazyButcher polycounter lvl 14
    Yes, a paintover. And Ruz, the discussion is more about sacrificing quality for a "few tris", which is not worth it. The discussion is about those very last ultra bits of optimizing. It should not imply that optimizing isn't needed at all; it's just that there is a grey zone where the amount of lowered quality, or the time spent on it, isn't worth the benefit in speed. So it's not about "adding more", but about "removing too much".
    And those "few tris" are actually getting more and more with time. The hardware is still similar for MMOs as well; after all, the performance pdfs mentioned are like 3 years old, which should mean that's the PC low-end of today.
  • EarthQuake
    [ QUOTE ]
    per ,don't get downhearted.
    What I said was that you should be modelling just enough detail to support the model you are making.
    My point was that you shouldn't add more detail just for the sake of it. I did n't say anything about rendering speed.
    personally I would keep taking out loops until I thought it was degrading too much in quality. Its about commmon sense a lot of the time.

    you guys seem to be talking mainly about high end , next gen stuff like unreal engine/ doom engine.

    What about MMO's or similar . I am sure that in the grand scale of things, polycount might have more of an impact.

    [/ QUOTE ]

    The example was obviously not for a low-end MMO, it was for a current generation project. It would be too much work to cover every single platform, every single engine, every hardware level in one thread. We're talking about current tech here, mostly how current generation hardware handles rendering. Of course if you're making a model for Warcraft 3 you're not going to want to follow these guidelines, so take some of your own advice and use *common sense*.
  • Ruz
    Ruz greentooth
    Yeah, I sometimes kinda forget that you are 'removing' polys rather than adding them.
    It just confused me in the example, because the optimised version of the box still had a decent bevel along the edges, and I thought it would still look correct with a normal map on it.
    To me that extra row of loops adds nothing to the silhouette, but what do I know, I'm a character artist :)
    TBH I would experiment, and if it looked ok I would trust my instinct to say yeah, that looks right: the silhouette's ok and there are no weird shading artefacts, which there shouldn't be, because the box has beveled edges.
  • Ged
    Ged interpolator
    Interesting read. Do you guys think these principles hold for online 3D? Director with Havok, Flash CS3, Java etc.? Or are those engines not powerful enough to experience benefits from good hardware? I'm just thinking of all the really low-poly online games out there and was wondering whether this is due to performance, bandwidth, software, or just the developers?

    Some of this thread has just gone over my head, but I'm currently working on a Director 3D game, and it's my first 3D game, so this is a relevant topic as there's just 2 of us making the assets.