I mean something like SpeedTree etc., with "cards" rotating toward the camera vector. Do they have to be separate objects, a gazillion of them really? Can it work at sub-object level, with groups of such rotating "cards" being one object?
If you have cards/billboards there aren't that many of them, so not every leaf (that would look strange anyway). But yes, basically every billboard is its own (sub-)object. Different technical implementations might do some things a bit differently, but basically that's it.
So every billboard is its own object with its own pivot, right? Isn't that still too many of them for an engine to sort, with lots of draw calls etc.? I recall in ancient times we did something like 2D sprites rendered on screen and sorted/scaled based on depth. Looked weird though.
Draw calls depend on the technical implementation, and so does sorting. But yes, they all have their own pivot in terms of 3D transforms, because they have to rotate around this pivot when billboarding.
I believe SpeedTree uses batching to draw all those billboards in one draw call. They're all using the same material, so all the vertices are sent as one "call" to be rendered.
Thanks Eric. Did they publish any details about that batching? I assume the sorting of a gazillion objects with alpha blend could still be a bottleneck.
I am trying to figure out which way would be less FPS-hungry: individual billboards vs. static geo "cards" in X or Y shapes combined into bigger groups + LODs.
Grass, for example. Some games do billboards, others static ones. Has anybody ever measured which approach is more efficient?
Not many games use billboards anymore. Just assign a separate material, so a shader can be assigned. Element pivot positions are usually stored in foliage, so rotating the elements around that point is easy. But they use this for complex foliage movement, not for turning them towards the camera.
So how does it in fact work? Are elements separate objects using their own object pivots, or is each element a vertex/poly group or something, using some transformation technique in the shader? Are the element pivots a separate point cloud or something?
I guess it might be possible to rotate a triangle (for grass) within a bigger mesh based on its middle point, averaged at run time, and transform the 3 vertices accordingly, or is that a crazy idea?
Not crazy. It's a shader all the time. Pivots can be stored in various ways: they can be baked into UVs, vertex colors, or tiny data textures. The Unreal Engine Pivot Painter implementation uses data textures, for example, and even data textures can be used in SIMD fashion. This still requires an extra UV map though, where one element covers one pixel of the data texture. Then take the pivot point and rotate the vertices around it. For 3ds Max, there is a Pivot Painter plugin that comes with Unreal. It's fairly simple to use. You can take the texture it spits out and do anything with it inside the material editor.
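For illustration, here is a minimal CPU-side NumPy sketch of that idea, not Pivot Painter's actual texture format; `pivot_tex`, `rotate_vertex` and the element IDs are made up for the example. The real version does the fetch and the rotation in the vertex shader, with the extra UV channel acting as the element index:

```python
# Minimal sketch: per-element pivots stored in a tiny "data texture",
# plus a CPU stand-in for the rotation the vertex shader would do.
import numpy as np

n_elements = 4                      # e.g. 4 grass blades baked into one mesh
pivot_tex = np.zeros((1, n_elements, 3), dtype=np.float32)          # 1xN RGB float texture
pivot_tex[0, :, :] = [[0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0]]   # one pivot per pixel

def rotate_vertex(pos, element_id, angle):
    """Fetch the element's pivot from the data texture (the extra UV channel
    plays the role of element_id here) and rotate the vertex around it."""
    pivot = pivot_tex[0, element_id]                 # "texture fetch"
    c, s = np.cos(angle), np.sin(angle)
    rot_y = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]], dtype=np.float32)
    return pivot + rot_y @ (pos - pivot)             # rotate about the pivot

# A vertex belonging to element 2, swaying 10 degrees around its own pivot:
print(rotate_vertex(np.array([2.1, 1.0, 0.0]), 2, np.radians(10.0)))
```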
From the technique I've seen, it works like this:
Using the CPU, you...
Create a mesh and populate it with quads (separate pieces formed by 2 triangles).
Each quad is mapped to the entire UV space.
Each quad has its own pivot (an abstract point, since the quads are not separate objects, only separate geometry), and all quads start with their 4 vertices collapsed at the pivot point.
Using the GPU (vertex shader), you...
Identify which corner of the quad the vertex belongs to. Since the UVs on a quad vertex will be one of (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), this is enough to identify which corner it is, like "top-left", "bottom-right", that kind of thing.
You will offset that vertex using its UV coordinates and a scale factor to control the size of the quad.
Since all vertices are collapsed at the quad center/pivot, moving each vertex away will expand the quad and make it visible.
If you first transform the vertex with a ModelView matrix, you bring it into view space (where the eye/camera is the origin), so any offset you make to the vertex in this space will keep the quad facing the camera plane, which is what gives the billboard effect of it always facing the camera (see the sketch after this list).
Store data in the vertex colors and vertex normal:
The red vertex color can be used to control the rotation of the quad ([0, 1] maps to [0, 2π]).
The green color can be used to control the horizontal scale and the blue color the vertical scale of the quad.
The vertex alpha can control the quad transparency.
The vertex normal (an XYZ vector) can be used to store other things, like atlas texture position, UV offset for warping effects etc. You can do this since the vertex normals can be recalculated as the normalized vector from the quad center to the offset corner position.
If you don't want a quad to be visible at all you can keep its 4 vertices collapsed.
EDIT: If the texture coordinates, normals and vertex colors are not enough to store all the per-particle data you need, you can add more attributes (another texture coordinate set, aka UV channel) as needed.
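Here is the promised sketch: a rough NumPy stand-in for the vertex-shader step above, assuming the quad's UV corner and pivot are the only inputs (the function and variable names are illustrative, not any engine's API). The offset is applied after the ModelView transform, so the quad stays parallel to the camera plane:

```python
# All four vertices of a quad start at the pivot; the UV corner decides how
# each one is pushed apart in view space so the quad always faces the camera.
import numpy as np

def billboard_vertex(pivot, uv, scale, view):
    """pivot: quad center in world space; uv: (0|1, 0|1) corner id;
    scale: (width, height); view: 4x4 world-to-view (ModelView) matrix."""
    p_view = (view @ np.append(pivot, 1.0))[:3]      # pivot into view space
    corner = (np.asarray(uv) - 0.5) * scale          # (-0.5..0.5) * quad size
    p_view[0] += corner[0]                           # offset along view-space X
    p_view[1] += corner[1]                           # offset along view-space Y
    return p_view                                    # stays parallel to the camera plane

view = np.eye(4, dtype=np.float32)                   # camera at origin, looking down -Z
pivot = np.array([0.0, 0.0, -5.0], dtype=np.float32)
for uv in [(0, 0), (1, 0), (1, 1), (0, 1)]:          # the four collapsed vertices
    print(uv, billboard_vertex(pivot, uv, (2.0, 1.0), view))
```

The per-quad rotation and scale pulled from the vertex colors would simply modify `corner` before the offset is applied.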
Thank you very much RN. Somehow I came to a very similar idea too. But my concern is that it might not be that FPS-efficient at all. Initially I just wanted to do something like the grass in Forza. By my count, rotating cards need 3-4 times fewer of them in total than the usual chaos of multi-facing ones. But now it seems that the rotating ones may eat up all that computational advantage with such a geometry-transforming shader.
This is actually a similar case to Nanite, in the sense that the GPU can work in a more parallel way when the data is provided in a texture. Niagara in Unreal holds a lot of data in textures: 1 sprite -> 1 pixel on the atlas. An operation can be applied to all sprites at once (SIMD). A texture is just a 2D array, or multiple 1D arrays if we count each new row in the texture as a new array. The benefit of this, and of such a UV layout, is that you can do operations on all elements of the array at once. Need to update the transform? Call the update once on the data texture -> all sprites will be updated. This is far cheaper than updating them one by one. It doesn't even have huge memory or processing requirements: a 256x256 texture can hold one vector (a position, for example) for 65k sprites.
Check this out (data storage part):
https://www.gamasutra.com/view/feature/130535/building_a_millionparticle_system.php?print=1
Modern GPU-based solutions use a lot of data-texture methods.
We could say that if your input and target array data layouts match (the data set is a texture and the input and output UVs match), you can go SIMD. If this requirement is met, executing the given operation once is equal to doing it on each element; basically a texture sample here is a "for each". You can't apply this in all cases, or at least we haven't found a way, but a lot of algorithms can be done this way with some modifications. Also keep in mind that this is not possible with CPU-based solutions; you can create similar behavior with many cores, but you don't have that many yet.
Multiplying 2 textures at a given UV is SIMD. Or saying "transform the pixel values in this texture by these vectors" is SIMD. This is why it's super fast. Ideally, you want most if not all operations to be like this, if your processing unit is made to do things in this way.
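As a toy illustration of that "one call updates every element" point, here is a NumPy sketch where the data texture is just a 2D array of vectors and a single whole-array expression plays the role of the SIMD texture operation (array names and sizes are made up):

```python
# The data texture as a plain 2D array of vectors; whole-array operations are
# the GPU-style "for each". A 256x256 RGB float texture holds ~65k positions.
import numpy as np

position_tex = np.random.rand(256, 256, 3).astype(np.float32)   # 65,536 sprite positions
velocity_tex = np.random.rand(256, 256, 3).astype(np.float32)   # one velocity per sprite
dt = 1.0 / 60.0

# "Multiply/add two textures" = update every sprite at once, no per-sprite loop.
position_tex += velocity_tex * dt

# Transforming all stored positions by one matrix is likewise a single expression:
rotation = np.eye(3, dtype=np.float32)
position_tex = position_tex @ rotation.T
```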
As a continuation of this story, maybe you can imagine already that, as harsh as it sounds, the way realtime handles skeletal meshes and animation in general is total garbage. There are multiple bottlenecks... There are a few examples of animated "static" meshes using data textures, and you can do it on characters too. I'm not sure why this doesn't get bigger attention, especially these days, because you can clearly place a hundred times more characters if they are animated using the vertex shader. There are also a few crowd implementations using such techniques. The difference in statistics between 1000 traditional skeletal meshes and 1000 meshes animated using the vertex shader and data textures is shocking. Unfortunately I can't go into exact details regarding this, as I experienced it on a certain project. But even just thinking about why this works quickly makes a lot of sense, in my opinion. Please go ahead, split a highly subdivided plane into pieces and move them using the vertex shader to see.
Take this part with a grain of salt, as this is my own experience and maybe I've started to be a little bit biased, but the concept still applies.
One thing I still don't get, though: a texture can store each vertex position in 3D space, like what's in the new UE5 demo stones, I assume.
But how could it store a timeline? Is it an animated texture? A new one for each frame? A new key set? A super-long atlas with each next frame at a certain interval, if the UVs stay the same? Or is it a separate set of UVs squeezed into one pixel column?
If you can find it, there's a GDC talk from a few years ago about the crowds in Planet Coaster where they use similar techniques.
https://www.youtube.com/watch?v=QykklFF3Jps
Gnoop, in this context a texture is simply a structure for holding data that the GPU can access in a very efficient way; there's not really any difference between a 4-channel 512-pixel square texture and an array of 512^2 float4 values. Animation is generally stored as a sequence of matrices or quaternions, which fit quite neatly into this format.
At a guess you'd want to store the data hierarchically, e.g. all character world-space transforms in one set of textures, with walk cycles etc. in other sets of textures that can be played back on each character. It's worth pointing out that textures don't have to be pre-authored; they can be generated at runtime based on what's happening in game etc., so procedural animation is entirely feasible.
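A hedged NumPy sketch of that kind of layout, assuming the simplest case of one pixel per vertex per frame with positions in RGB (real tools may store quaternions, bone matrices or offsets instead; `anim_tex` and `sample_animation` are illustrative names, not any engine's API):

```python
# One row per frame, one pixel per vertex (or per bone), RGB = position.
# The vertex shader would do this lookup per vertex every frame.
import numpy as np

n_frames, n_vertices = 64, 512
anim_tex = np.zeros((n_frames, n_vertices, 3), dtype=np.float32)  # baked offline

def sample_animation(vertex_id, time, fps=30.0):
    """V coordinate = time (frame row), U coordinate = vertex id."""
    frame = time * fps
    f0, f1 = int(frame) % n_frames, (int(frame) + 1) % n_frames
    t = frame - int(frame)
    # Lerp between the two nearest rows of the texture for smooth playback.
    return (1.0 - t) * anim_tex[f0, vertex_id] + t * anim_tex[f1, vertex_id]

offset = sample_animation(vertex_id=42, time=1.25)   # position/offset for that vertex
```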
Thanks poopipe. I understand it as a principle but still have no idea how it could be implemented in our engine. I talked with our tech guys and they don't either. It looks like the texture is using not a regular UV to figure out which pixel a vertex should read from, but some kind of special predefined order. But how would the next LOD do it? How would animation mixing work?
Animation mixing is the harder part of this. If you want to fully blend between 2 animations, and not blend them by parts, it's still fairly straightforward: hold 2 texture samplers for the data texture, put the original animation in the first slot and the new animation in the second one, then interpolate the value. This will give you a smooth transition between the 2 animations.
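A small NumPy sketch of that two-sampler blend, reusing the frame-by-vertex texture layout from the previous sketch (`blend_animations` and the arrays are illustrative):

```python
# Sample both data textures at the same coordinate and lerp the result.
import numpy as np

def blend_animations(anim_tex_a, anim_tex_b, frame, vertex_id, blend):
    """blend = 0 plays A, blend = 1 plays B, in between is a full-body crossfade."""
    a = anim_tex_a[frame % anim_tex_a.shape[0], vertex_id]
    b = anim_tex_b[frame % anim_tex_b.shape[0], vertex_id]
    return (1.0 - blend) * a + blend * b             # lerp, exactly like in the shader

walk = np.random.rand(64, 512, 3).astype(np.float32)  # placeholder baked animations
run  = np.random.rand(48, 512, 3).astype(np.float32)
print(blend_animations(walk, run, frame=10, vertex_id=7, blend=0.3))
```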
The UV is an extra UV map, specifically made for this purpose. Plan the texture resolution first, then make it so one vertex is one pixel; the 3 verts of a triangle would be laid out next to each other. Houdini has a tool for this that automates the process. It also bakes the data texture, so it's fully automatic. There is a material function template for Unreal; check that out and copy the logic. It's very simple though.
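For example, that extra UV channel could be generated roughly like this (a NumPy sketch under the one-vertex-per-pixel assumption; Houdini's tool and Unreal's material function template handle this for you):

```python
# Give each vertex a UV at the center of "its" pixel, in vertex order, so the
# three verts of a triangle occupy adjacent pixels of the data texture.
import numpy as np

def per_vertex_uvs(n_vertices, tex_width, tex_height):
    idx = np.arange(n_vertices)
    u = (idx % tex_width + 0.5) / tex_width           # pixel centers, left to right
    v = (idx // tex_width + 0.5) / tex_height         # move to a new row when one fills up
    return np.stack([u, v], axis=1)                   # one (u, v) per vertex

uvs = per_vertex_uvs(n_vertices=512, tex_width=64, tex_height=8)
print(uvs[:3])   # verts 0, 1, 2 of the first triangle sit in three adjacent pixels
```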
LOD is simple again: make a new data texture for the LOD and use it in its material instance. These data textures are very small even for a moderately high-poly mesh with a lengthy animation, so it's not a big deal.
For runtime processing, you'll need a compute shader.
I highly recommend checking out and using VAT (vertex animation textures) in the Houdini Game Dev Tools! It simplifies the whole process a lot.