Draw Calls and batching

Spur · Feb 2013

So I've been lucky enough to get some paid work for for a game and at this point I really want a better understanding of how things work, which leads me to draw calls. So a few questions I have with a simple example.

My example is a simple telephone pole that I am dropping into an environment. The model includes a single mesh and a single texture/ material. So that would be one draw call?

If I combine all 10 into a single mesh , then import into game, it will still be one draw call.

If I drop 10 copies of it in game, it will be 10 draw calls, without batching.

If I drop 10 copies in game and the engine can batch, then it will be 1 draw call again since they all share the common material, right?

monster · Feb 2013

What game engine are you using?

Ten of the same object should be one draw call. Make sure they are Static objects and not playing a default animation.

Eric Chadwick · Feb 2013

In most cases, yes to all four questions (I think, I'm not a programmer).

Depends though, like if the mesh exceeds the vertex limit for a single batch, then it has to be split by the renderer into more than one batch, which would cause more draw calls.

Also in most games you would not want to combine all the telephone poles into a single model, because then the renderer couldn't cull the unseen meshes, so ALL the triangles would have to be loaded.

Hopefully a programmer will chime in here with a more informed opinion.

Spur · Feb 2013

Thanks for the replies!

@monster - its actually a proprietary engine but does allow for batching. Yes everything is static, no animations. So 10 of the same so long as they all share the same material will be batched. I think I'm good on that part thought it does lead me to another question later.

@Eric - Yeah that was the simple explanation given to me by my programmer. In turn it leads me to other questions but he is literally swamped and I hate bombarding him with stupid questions. In doing my own research I've found lots of great technical explanations but they are all over my head. So I'm trying to approach it in terms that I can understand.

Makes sense about not combining the model since it wont be able to cull unseen meshes. The next project I'm getting ready to tackle involves creating large urban and residential areas. Feature assets close to the viewer location followed by simple blocks of combined housing and factories etc.

So all this leads me to my next question.

Lets say the 10 telephone poles I am placing are all different styles of poles so they are all different meshes but still share the same material. In that case since they are differing meshes they could not be batched?

Eric Chadwick · Feb 2013

I think that's correct, not batched. But they save room in memory, since they use the same bitmaps. And they save the time it takes to send and fetch textures to and from memory, which can be slow sometimes, which can improve framerate.

Spur · Feb 2013

Eric Chadwick wrote: »

I think that's correct, not batched. But they save room in memory, since they use the same bitmaps. And they save the time it takes to send and fetch textures to and from memory, which can be slow sometimes, which can improve framerate.

So that would basically be indexing right?

Ashaman73 · Feb 2013

Depends though, like if the mesh exceeds the vertex limit for a single batch, then it has to be split by the renderer into more than one batch, which would cause more draw calls.

Most modern game engine should support instancing. In this case you have one copy of the model on the video cards and just a list of position etc. to place the same model all over the place. In fact one drawcall without any memory limitations.

Lets say the 10 telephone poles I am placing are all different styles of poles so they are all different meshes but still share the same material. In that case since they are differing meshes they could not be batched?

I think that's correct, not batched. But they save room in memory, since they use the same bitmaps. And they save the time it takes to send and fetch textures to and from memory, which can be slow sometimes, which can improve framerate.

Sharing materials and textures is important. Most engines will process meshes in this order:
1. group all objects with the same shader/material
2. group all objects with the same texture
3. group all objects with the same mesh.

Even if using different meshes and producing multiple draw calls , you can improve performance by avoiding multiple shaders, aka material, (this motivates uber-shaders, shaders which are really big, but flexible, though you need to take care, that the internal overhead is not getting too large) and to group multiple textures on a single texture to avoid texture swapping (this is called texture atlas).

E.g. if you have one model with 3 materials (shaders) and 3 different textures, the engine might need to render 9 sub-mesh with 9 different rendering calls, instead of one rendering call when using a single material (or uber shader) and texture atlas.

So that would basically be indexing right?

Do you mean instancing (render the same mesh multiple times, but keep only one copy in memory ? )

Elyaradine · Feb 2013

Um. The way I understand it is a draw call is a chunk of state information that's going to be rendered. So it's one piece of material data (shader and input values/textures), and one set of vertex data.

When an engine performs batching, it looks at everything in the frame that shares the same material, and groups all of those meshes up into one chunk of data, until that chunk is filled up. So what you're effectively doing is sacrificing a bit of CPU (to build the mesh groups) and a bit of memory (because both the original meshes and the new, grouped mesh will be in memory) to send a whole set of data (a draw call) at once.

Therefore, if you had a bunch of different telephone poles, in a bunch of different shapes and sizes, but they were using the same material, then they WOULD batch, provided that they are also static, and have few enough vertices that they can all fit into one chunk of data that gets sent to the gpu.

My terminology might be a bit wrong (I'm not a graphics programmer), but I'm quite sure the essence is correct, because I've been quite heavily involved in art optimization at our studio.

--
The analogy, I suppose, might be kind of like having a schoolbus that takes people places depending on the uniform they wear (material). Lots of different kids of all different shapes and sizes (meshes) can get on the bus if they wear the same uniform (batching), until the bus is full. If the bus is too full, then it needs to make more than one trip (draw call), but that's still much faster than making one trip for each kid.

If kids with different uniforms want to get on, they have to wait at the bus stop until the bus is ready to go to their particular school.

CrazyButcher · Feb 2013

Most thing said are correct, the main point is always: it depends on the engine and the platform.

So any fancy tricks could be applied or not. Don't assume anything, if it's important to you just talk with the guy responsible for it.

Batching is motivated by two reasons:
- GPUs love big work loads, thousands of threads want to do work

That's why we can render those crazy tessellated scenes with millions of triangles these days, the amount of polys/vertices the chips can handle is insane compared to the old days.

- Rendering (dx,gl) is done via state machine (set shader, set geometry buffer, set textures, draw stuff, change blending, draw stuff...). Changing those states introduces various costs, driver has to do validation, the GPU's state has to be updated, sometimes that means it has to wait for work to be completed, some state allows multiple drawcalls in parallel...

The hardware instancing (1 drawcall, rendering same geometry memory N times) support is often used for particles, plants, things that are "meant to be heavy instanced". For generic scene models, I don't think it's popular, as the logistics to provide the "instance" specific data, would be a bit ugly, compared to stuff that is really designed for it (particles...).

And they save the time it takes to send and fetch textures to and from memory, which can be slow sometimes, which can improve framerate.

Textures normally live for the entire "level" or in case of streaming, for many frames. The "binding" costs within a frame when you bind texture a, b back to a.. for a drawcall, has normally nothing to do with the texture data transfer.

When an engine performs batching, it looks at everything in the frame...

Again there is no "standard" how batching is implemented, for some programmers the mere grouping of drawcalls could be considered batching, others would grow workloads and copy meshes together, as you describe here.

However, one would likely not build those batch-buffers every frame, but try to reuse for many frames. Or ideally keep completely static. Say your geometry is just a few vertices, it would be cheaper to just replicate the mesh with vertices being moved to proper location in the world (assuming static scene). Whilst other geometry is already "complex" enough, that it's cheap to just change the position in the world through shader parameter, then draw the same geometry at different locations with distinct drawcalls.

There is endless possibilties...

Here is some slides I did (starting from slide 100, previous ones are from Mark Kilgard) on some of this

http://www.slideshare.net/slideshow/embed_code/14019812?startSlide=100

Draw Calls and batching

Replies