
Polygon count, shaders and render times

Hello all,

I have a quick question about render times: how do polygon count and shaders affect them?

From what I have researched so far, my conclusion is that shaders are multiplicative in terms of render time, since they run calculations on each polygon. What I mean is that:

1m polygon object without shaders rendered in: 1 second
1m polygon object with shaders rendered in: 1.5 seconds
concluding that the shader multiplies render time by 1.5x
therefore,
a 2m polygon object with shaders would render in 3 seconds
20 objects of 100k polygons each, with shaders, would render in 3 seconds
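
In code form, the model I'm assuming looks something like this (the numbers are just the made-up ones above):

    # a sketch of my assumption: render time scales linearly with polygon count,
    # and shaders multiply that time by a fixed factor (all numbers are made up)
    def assumed_render_time(polygons, shader_multiplier=1.5,
                            seconds_per_million_polys=1.0):
        return (polygons / 1_000_000) * seconds_per_million_polys * shader_multiplier

    print(assumed_render_time(1_000_000, shader_multiplier=1.0))  # 1.0 s, no shaders
    print(assumed_render_time(1_000_000))                         # 1.5 s with shaders
    print(assumed_render_time(2_000_000))                         # 3.0 s with shaders
    print(20 * assumed_render_time(100_000))                      # ~3.0 s for 20 x 100k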

Is this conclusion correct?

I know about things like normal maps and instancing to reduce render times, but they are not always applicable, and I just want to know whether the above conclusion is correct.

Thanks!

Replies

  • .nL
    That's incorrect. If you're thinking about realtime/game rendering, here's a very simplified overview of how things break down...

    A shader is a set of instructions telling the GPU how to draw a given set of vertices/polygons, using certain parameters (defined by the programmer). This means that the number and complexity of the instructions are entirely up to the person writing the shader.

    The speed with which the instructions are performed is determined by the degree to which the programmer has optimized their shader, and by the capabilities of the graphics card processing the vertices sent to it.

    Different GPUs are faster at different things (though if a GPU is exceptional at one thing, there's a good chance it's good at all the others). Some GPUs can process larger sets of vertices in a shorter amount of time, but may be slower at running certain instructions. Others transfer the model data from your computer's memory to its own much faster, or can store much more of it. Still others are designed with power consumption in mind.

    So it's pointless to assign a fixed processing cost on a per-vertex/polygon basis.
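
    To put rough, entirely made-up numbers on that: the cost of shading a model is closer to "vertices x shader instructions / how fast this particular GPU runs them" than to a fixed cost per polygon, so the same model can cost very different amounts on different hardware:

        # purely illustrative: real GPUs don't reduce to a single throughput number
        def vertex_shading_time(vertex_count, shader_instructions, gpu_instructions_per_second):
            # total instructions executed across all vertices, divided by how
            # quickly this particular card chews through them
            return (vertex_count * shader_instructions) / gpu_instructions_per_second

        # same 1m-vertex model, same 200-instruction shader, two hypothetical cards
        print(vertex_shading_time(1_000_000, 200, 2e11))  # faster card: 0.001 s
        print(vertex_shading_time(1_000_000, 200, 2e10))  # slower card: 0.01 s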

    As a general rule of thumb, the fewer polygons/vertices you use, the better. If you have a lot of polygons, the model will take up more space in your computer's memory and leave less for everything else in the game. So use only as many as the piece of art you're working on justifies.

    As you gain a better understanding of how modelling and games work, you'll be able to identify where you can break this rule, but you're far from there yet and you shouldn't concern yourself with creating your own polygon budgets. Use approximate budgets based on lists like this instead.

    And if you're confused by my use of polygon/vertex:

    Game engines break every model you make down into triangles, and draw each triangle based on a list of points (vertices) in the engine's memory.
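
    A rough picture of what that data looks like (a hypothetical indexed triangle list, not any particular engine's format):

        # a quad stored the way engines typically store meshes: a list of unique
        # vertex positions, plus an index list where every 3 indices form one triangle
        vertices = [
            (0.0, 0.0, 0.0),   # 0
            (1.0, 0.0, 0.0),   # 1
            (1.0, 1.0, 0.0),   # 2
            (0.0, 1.0, 0.0),   # 3
        ]
        indices = [0, 1, 2,    # first triangle
                   0, 2, 3]    # second triangle (shares two vertices with the first)

        for t in range(0, len(indices), 3):
            print("triangle:", [vertices[i] for i in indices[t:t + 3]])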
  • nomotonous
    I still don't quite get it. I understand polygon budgets and all, and the main reason most people say the fewer polygons/vertices used the better is that the computer has to link up fewer vertices to build the model.
    What I am mainly focusing on is the shader. From what I know, the vertex shader calculates how to draw every single vertex of the model. If there are 4 shaders, the engine would draw every single vertex with those 4 shaders. Thus, the render time would increase based on how many polygons/vertices there are.
    If by assumption that
    1 polygon with 4 shaders = 1 second to render
    then
    1000 polygons with 4 shaders = 1000 seconds to render
    and
    0 polygons with 4 shaders = 0 seconds to render

    I understand polygon count budget and it is not about that.

    What I am ultimately asking is this: If I want to go absolutely crazy with shaders, is there any reason for me to limit my polygon count because of the shaders?
  • CrazyButcher
    going crazy with shaders "depends". On many mobile chips you would want to keep vertex-shading cost and vertex count down, given their approach to rendering.

    in a basic pipeline you have two phases. First, when you draw your mesh, all vertices go through the current vertex-shader.
    Second, your triangles get rasterized; simply put, every pixel of a triangle then runs through the pixel-shader instructions. There are optimizations to avoid this, for example when portions of your triangle are behind what has already been rendered.
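
    in (heavily simplified, made-up numbers) code that two-phase cost looks roughly like this. note that the pixel phase scales with how many pixels your triangles cover on screen, not with how many triangles you have:

        # toy two-phase cost model (arbitrary throughput number, not any real GPU)
        def frame_time(vertex_count, vs_instructions,
                       shaded_pixel_count, ps_instructions,
                       instructions_per_second=2e11):
            vertex_phase = vertex_count * vs_instructions / instructions_per_second
            # shaded_pixel_count is what depth culling / early-Z reduces:
            # hidden pixels never run the pixel shader
            pixel_phase = shaded_pixel_count * ps_instructions / instructions_per_second
            return vertex_phase + pixel_phase

        # 1m vertices with a cheap vertex shader vs a full-HD screen of pixels with
        # a heavy pixel shader: the pixel phase dominates despite the high vertex count
        print(frame_time(1_000_000, 50, 1920 * 1080, 500))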

    nomotonous, start by having a look at
    http://simonschreibt.de/gat/renderhell/

    here is an excerpt from an article I wrote
    pixeljetstream.blogspot.de/2015/02/life-of-triangle-nvidias-logical.html


    GPUs are super parallel work distributors

    Why all this complexity? In graphics we have to deal with data amplification that creates lots of variable workloads. Each drawcall may generate a different number of triangles. The number of vertices after clipping is different from what our triangles were originally made of. After back-face and depth culling, not all triangles may need pixels on the screen. The screen size of a triangle can mean it requires millions of pixels or none at all.
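
    (A quick toy illustration of that last point, my addition rather than the article's; the coordinates are arbitrary:)

        # the same single triangle can cost ~zero pixels or millions of pixels,
        # depending purely on its size on screen
        def raster_pixel_estimate(p0, p1, p2):
            # screen-space area of a 2D triangle = |cross product| / 2
            area = abs((p1[0] - p0[0]) * (p2[1] - p0[1]) -
                       (p2[0] - p0[0]) * (p1[1] - p0[1])) / 2.0
            return int(area)

        print(raster_pixel_estimate((0, 0), (2, 0), (0, 2)))        # tiny: ~2 pixels
        print(raster_pixel_estimate((0, 0), (1920, 0), (0, 2160)))  # huge: ~2m pixels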

    As a consequence modern GPUs let their primitives (triangles, lines, points) follow a logical pipeline, not a physical pipeline. In the old days before G80's unified architecture (think DX9 hardware, ps3, xbox360), the pipeline was represented on the chip with the different stages and work would run through it one after another. G80 essentially reused some units for both vertex and fragment shader computations, depending on the load, but it still had a serial process for the primitives/rasterization and so on. With Fermi the pipeline became fully parallel, which means the chip implements a logical pipeline (the steps a triangle goes through) by reusing multiple engines on the chip.

    Let's say we have two triangles A and B. Parts of their work could be in different logical pipeline steps. A has already been transformed and needs to be rasterized. Some of its pixels could be running pixel-shader instructions already, while others are being rejected by depth-buffer (Z-cull), others could be already being written to framebuffer, and some may actually wait. And next to all that, we could be fetching the vertices of triangle B. So while each triangle has to go through the logical steps, lots of them could be actively processed at different steps of their lifetime. The job (get drawcall's triangles on screen) is split into many smaller tasks and even subtasks that can run in parallel. Each task is scheduled to the resources that are available, which is not limited to tasks of a certain type (vertex-shading parallel to pixel-shading).

    Think of a river that fans out: parallel pipeline streams that are independent of each other, each on its own timeline, some branching more than others. If we were to color-code the units of a GPU based on the triangle or drawcall they're currently working on, it would be multi-color blinkenlights :)
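
    (And to make that picture a little more concrete, here is a toy sketch of my own, not from the article: two triangles whose stages run on a small pool of shared workers, so their progress interleaves instead of one finishing completely before the next starts.)

        import time
        from concurrent.futures import ThreadPoolExecutor

        # the "logical pipeline" stages a triangle walks through
        STAGES = ["fetch vertices", "vertex shade", "rasterize", "pixel shade", "blend/write"]

        def run_triangle(name):
            for stage in STAGES:
                time.sleep(0.01)                     # pretend this stage takes time
                print(f"triangle {name}: {stage}")

        # with two workers the printed log interleaves: A and B are at
        # different stages of their lifetime at the same moment
        with ThreadPoolExecutor(max_workers=2) as pool:
            list(pool.map(run_triangle, ["A", "B"]))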
  • nomotonous
    Well, talk about information overload. Thanks for the very thorough answer. I will definitely look through the links. Meanwhile, I believe my initial assumption to be somewhat true after reading your answer and will continue to model based on that.

    Thanks!