DDS in video memory: more files (DXT1) or fewer files (DXT3)?

rollin · Dec 2011

heyho,

I'm wondering whether it is 'better' to store your textures in several rgb-dds files (dxt1c) or to pack special channels like team color, self-illumination, etc. into the alpha channel of a dxt3.

example would be (all 1024^2 maps)

diffuse + team color dxt3 ~1.4 MB
color-spec + self-illumination dxt3 ~1.4 MB

~2.8 MB

OR

diffuse dxt1 ~0.7 MB
color-spec dxt1 ~0.7 MB
special-map (r=team color, g=self-illumination, b=one channel for free!) dxt1 ~0.7 MB

~2.1 MB

This would mean a smaller total file size AND one extra channel, but 3 maps instead of 2.

Going purely by the file size Windows reports, the extra alpha channel costs as much as all 3 channels of a new dxt1 dds.

How does this look in the video mem?

marks · Dec 2011

Well for starters there isn't such a thing as DXT1c ... that is just DXT1.

Regarding your question though ... it's really a toss-up between how fast your texture read is in the engine and how much memory you have to play with.

And yeah, DXT3/5 alpha is expensiveeeee! A DXT5 is double the file size of a DXT1 and all you really get is an 8-bit alpha.

Noors · Dec 2011

mmh i think dxt1 is also more compressed. You lose quality on the rgb channels compared to dxt3/dxt5, it's not only the alpha.
Also, beware dxt3 alpha is explicit, which means pure black and pure white, no gradient, so basically just used for masking.
Then the alpha channel and green channel are less degraded by compression than the other channels, so you have to take that into consideration.

Farfarer · Dec 2011

While DXT3 or DXT5 is larger in memory, it's fewer textures to call up at render time. You can pull 1 texture and sample it once, or pull 2 textures and sample each of them once.

It's probably very dependent on specific hardware set-ups.

Personally, I'd go for your option 2 as it's easier to separate out. Also you can easily LOD the shader down to cut out the specular map but you keep the team colours and self-illum, which would be more visible at long distance.

Froyok · Dec 2011

Noors wrote: »

mmh i think dxt1 is also more compressed. You lose quality on the rgb channels compared to dxt3/dxt5, it's not only the alpha.
Also, beware dxt3 alpha is explicit, which means pure black and pure white, no gradient, so basically just used for masking.
Then the alpha channel and green channel are less degraded by compression than the other channels, so you have to take that into consideration.

DXT1 and DXT3/5 use the same algorithm for compressing the RGB channels, so it's the same quality. The only difference is how the alpha is compressed. And from what I know, the G channel has the same quality as the R and B channels, because the DXT1 algorithm doesn't look at each channel separately when choosing the colors, but works directly on the RGB color. Of course you get less degradation in the alpha channel, because there is only one channel to calculate, rather than 3 for the RGB diffuse.

In my case, I try to keep the number of different textures in memory as low as possible. Using larger textures can be bad, but you also need to think about the shader itself, which can quickly get expensive if you often switch from one shader to another (because of the draw calls, for example). That means a lot of different textures will quickly get heavy.

rollin wrote: »

How does this look in the video mem?

Normally, it's the same size, since the DXT format was made specifically for video memory. So on the GPU the texture has the same size as on disk, and the GPU simply decompresses it on the fly when it needs it.
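The sizes quoted in the first post can be checked with a little arithmetic. This is a minimal sketch, assuming the files contain a full mipmap chain (the ~0.7 MB figure suggests they do) and ignoring the small DDS header and any driver-side padding. The function name and constants are made up for the illustration; the block sizes (8 bytes per 4x4 block for DXT1, 16 for DXT3/DXT5) are part of the format.

```python
def dxt_size_bytes(width, height, bytes_per_block, with_mips=True):
    """Size in bytes of a DXT texture; every 4x4 texel block has a fixed size."""
    total = 0
    while True:
        # Each mip level is stored as whole 4x4 blocks (at least one block).
        blocks_w = max(1, (width + 3) // 4)
        blocks_h = max(1, (height + 3) // 4)
        total += blocks_w * blocks_h * bytes_per_block
        if not with_mips or (width == 1 and height == 1):
            break
        width = max(1, width // 2)
        height = max(1, height // 2)
    return total


DXT1_BLOCK = 8    # bytes per 4x4 block
DXT3_BLOCK = 16   # bytes per 4x4 block (DXT5 is the same size)

two_dxt3 = 2 * dxt_size_bytes(1024, 1024, DXT3_BLOCK)
three_dxt1 = 3 * dxt_size_bytes(1024, 1024, DXT1_BLOCK)

print("2 x DXT3:", two_dxt3, "bytes")
print("3 x DXT1:", three_dxt1, "bytes")
```

Under these assumptions the two DXT3 maps take exactly 4/3 of the space of the three DXT1 maps, which lines up with the ~2.8 MB vs ~2.1 MB totals above.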

rollin · Dec 2011

Froyok: The example is about a setup where all texture channels are used by the same shader. The difference is just how many different textures you send to the shader.

The question would be now.. does this make a difference in performance if you send over 3 dxt1 instead of 2 dxt3 textures?

About the compression this is a good place to start
http://en.wikipedia.org/wiki/S3_Texture_Compression
The green channel is less compressed because our eye is more responsive to green light mixtures
http://en.wikipedia.org/wiki/Cone_cell

Froyok · Dec 2011

Well, I still think that having fewer texture samples is the best option in this case.

rollin wrote: »

The green channel is less compressed because our eye is more responsive to green light mixtures
http://en.wikipedia.org/wiki/Cone_cell

But that is not related to the DXT compression, right ?

almighty_gir · Dec 2011

so is it better to do this:

texture sample 1:
RGB = Diffuse
A = Transparency

texture sample 2:
RGB = specular
A = gloss

texture sample 3:
RGB = Normals
A = Emission (can control colour easily/cheaply in udk)

OR:
texture sample 1:
RGB = Diffuse

texture sample 2:
RGB = Specular

texture sample 3:
RGB = Normal

texture sample 4
R = Emission
G = Gloss
B = Transparency

Rick Stirling · Dec 2011

^
It depends on the engine and things like whether you are memory bound or streaming bound. Your engine might handle the 12 channels differently if they come as 3x4c or 4x3c in how it packs the lookup index. The cool thing is that the programmers can profile that for you and give you an answer.

Programmers are good for stuff like that.

As to the DXT5 taking double the memory of a DXT1, it does. But here's the cool thing with DXT5 normal maps. You leave the green channel where it is, using 6 bits of data, and you store the red channel in the alpha so it gets 8 bits of data rather than 5. The old red channel and blue channels get set to black. At runtime the normal map gets reconstructed from the green and higher quality red (normalising for lack of blue), and the normal map is a much higher quality.
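The runtime reconstruction described above can be sketched roughly like this. It is only an illustration, assuming a tangent-space normal whose X was stored in alpha and Y in green as 0..255 values; the function name is made up, and a real engine would do this in the shader rather than in Python.

```python
import math


def reconstruct_normal(alpha, green):
    """Rebuild a tangent-space normal from a DXT5 texel.

    alpha holds the old red channel (X), green holds Y, both as 0..255.
    Z is derived from the unit-length constraint x^2 + y^2 + z^2 = 1.
    """
    x = alpha / 255.0 * 2.0 - 1.0
    y = green / 255.0 * 2.0 - 1.0
    # Compression error can push x^2 + y^2 slightly above 1, so clamp at 0.
    z = math.sqrt(max(0.0, 1.0 - x * x - y * y))
    return (x, y, z)


print(reconstruct_normal(128, 128))  # a nearly flat, "straight up" normal
```

The blue channel never has to be stored because a tangent-space normal always points away from the surface, so Z is non-negative and fully determined by X and Y.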

You get fewer DXT block artifacts, and this is especially noticeable as your texture resolution decreases. How much better? It's not easy to measure, because it depends on the amount of detail in the source map, plus the resolution, but here is a GENERAL answer:

A 256x256 DXT5 (~85k) normal map will look almost as good as a 512x512 DXT1 (~175K) normal map.

I said almost, because there are caveats, but when you are dealing with hundreds of normal maps that's a good memory saving.

Like everything else in games, it's a balancing act.

rollin · Dec 2011

Froyok wrote: »

Well, I still think that having fewer texture samples is the best option in this case.

But that is not related to the DXT compression, right ?

Well, the wiki page itself isn't, but that is the reason the DXT compression is weighted towards red and blue.

Programmers are good for stuff like that.

I knew they are good for something.. didn't know for what exactly till now

Rick: I'm looking through the gta4 textures and I'm wondering how you generated the normal maps for the character clothes. It seems they are generated from the photo tex, but some of the folds are just too 'volumetric'. Did you overlay a highpoly bake, or did you use a tool to get this type of low-frequency stuff out of the photo into a normal map?

Eric Chadwick · Dec 2011

A little misinformation in this thread about the various DXT formats. I put some stuff on the Polycount wiki, maybe it helps?
http://wiki.polycount.com/DXT

For example, DXT1c does exist, at least as a way to say that there's no alpha at all, because the encoding is slightly different.

Also, dxt3 alpha isn't just black and white and no gradient. There are indeed grays. It's just not great at handling wide gradients. It's better with alpha that is mostly black and white, with a thin area of anti-aliasing. Dxt5 is better most of the time anyhow.

dpadam450 · Dec 2011

As a graphics programmer: It doesn't matter much.

team color, self-illumination, etc. into the alpha channel of a dxt3

Well, you can't put a team color into a grayscale value, it needs rgb values.

special-map (r=team color

Again, how can you store a color with only a red value? All red = 1.0 red, 0.0 green/blue, 3 components.

Maybe you mean team color mask. Where the component would be either 0 or 1, for this pixel to use the team color blended or not (i.e. a stripe).
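To make the mask idea concrete, here is a minimal sketch of how a single-channel team color mask might be used. The function name is made up, and multiplying the diffuse by the team color is an assumption; an engine could just as well replace or overlay the color. Only the mask lives in the texture, while the team color itself comes from the game as an rgb value.

```python
def apply_team_color(diffuse, team_color, mask):
    """Blend a team color into a diffuse texel.

    diffuse, team_color: (r, g, b) tuples of floats in 0..1.
    mask: float in 0..1 read from one texture channel (0 = untouched, 1 = full tint).
    """
    # Assumed tint model: multiply the diffuse by the team color.
    tinted = tuple(d * t for d, t in zip(diffuse, team_color))
    # Linear interpolation between the plain and the tinted color.
    return tuple(d + (t - d) * mask for d, t in zip(diffuse, tinted))


grey = (0.5, 0.5, 0.5)
red_team = (1.0, 0.0, 0.0)
print(apply_team_color(grey, red_team, 0.0))  # outside the stripe
print(apply_team_color(grey, red_team, 1.0))  # inside the stripe
```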

When requesting memory, you send a request and get back 3 or 4 rgb(a) values. You might as well pack everything so that you make the fewest memory requests (texture fetches / texture samples). I believe the fetch time is pretty much the same whether you request 3 or 4 components at once.

The quality with packing data into the alpha channel should not be visually bad and 1:4 compression is still great. So I would be more worried about the extra texture fetch than the extra 50K or so to have that alpha channel.

is weighted towards red and blue

DXT works on blocks of 4x4 pixels. It is based on the idea that pixels local to each other are basically the same color. So it takes two endpoint colors for the block (roughly the min and max) and stores them as 16-bit RGB, then for each of the 16 pixels it writes a 2-bit index (00, 01, 10, 11), i.e. 4 options. So each pixel can only be one of the two endpoints or one of the two blends in between (about 1/3 and 2/3 of the way).

It's just not great at handling wide gradients.

As explained above, the more contrast there is between the pixels in a 4x4 block, the worse it gets, because the block can only reproduce 4 unique colors. The alpha in DXT3 and DXT5 gets extra bits, so there are more than 4 options: DXT3 stores 4 bits of alpha per pixel, which gives 16 alpha levels, while DXT5 interpolates up to 8 alpha values per block between two 8-bit endpoints.
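For reference, the two alpha encodings can be sketched as follows: DXT3 stores an explicit 4-bit alpha per pixel (16 fixed levels, the same for every block), while DXT5 stores two 8-bit endpoints per block plus a 3-bit index per pixel (8 values, chosen per block). The function names are made up, and the integer division is a simplification; the exact rounding can differ between decoders.

```python
def dxt3_alpha_levels():
    """The 16 alpha values a DXT3 pixel can take (4-bit value expanded to 8 bits)."""
    return [v * 17 for v in range(16)]


def dxt5_alpha_palette(a0, a1):
    """The 8 alpha values available inside one DXT5 block.

    a0, a1: the two 8-bit endpoints stored in the block.
    """
    if a0 > a1:
        # Six values interpolated between the endpoints.
        return [a0, a1] + [((7 - i) * a0 + i * a1) // 7 for i in range(1, 7)]
    # Four interpolated values, plus fully transparent and fully opaque.
    return [a0, a1] + [((5 - i) * a0 + i * a1) // 5 for i in range(1, 5)] + [0, 255]


print(dxt3_alpha_levels())
print(dxt5_alpha_palette(255, 0))
print(dxt5_alpha_palette(100, 140))
```

The last call shows why DXT5 usually handles smooth gradients better: the 8 values are spread over the narrow range the block actually uses instead of over the full 0..255 range.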

rollin · Dec 2011

dpadam450 wrote: »

Maybe you mean team color mask.

That! Sorry for not making this more clear.

So you're saying that calling one more texture is worse than putting more stuff into the video mem?

Eric Chadwick · Dec 2011

rollin wrote: »

So you're saying that calling one more texture is worse than putting more stuff into the video mem?

This depends on your hardware. In general, PCs are slow when fetching new bitmaps into memory. Each new bitmap has to travel through a thin pipe that is the "video bus" to get into video memory. This can become a bottleneck, slowing your framerate, because the game has to wait until the bitmaps are loaded before it can render the mesh.

But this really depends on a lot of things, as Rick and others have said. Each game situation is different. The texture needs are different, the rendering code is different, the target audience's hardware is different, etc.

Best thing is to run tests with different textures, and see which runs the best, given your situation.

Learn how to talk with your graphics programmer, they are usually a wealth of info about this stuff, and more importantly how it works in your particular situation.

Rick Stirling · Dec 2011

rollin wrote: »

Rick: I'm looking through the gta4 textures and I'm wondering how you generated the normal maps for the character clothes. It seems they are generated from the photo tex but some of the folds are just too 'volumetric'. Did you overlay a highpoly bake or did you use a tool to get this type of low frequent stuff out of the photo into a normal map?

A mix. Almost everything was sculpted, with overlays from textures to add extra normal map details.

If you've looked through the textures you'll have some idea on the sheer volume of assets that were created for that game, so we had to come up with a pipeline that would let us get the assets built in time to ship.

marks · Dec 2011

Eric Chadwick wrote: »

For example, DXT1c does exist, at least as a way to say that there's no alpha at all, because the encoding is slightly different

That's interesting. I had this exact same conversation with our lead engine programmer at work not too long ago. My understanding was that as far as DirectX is concerned .... DXT1 is DXT1. If you "disable" the alpha channel, it still saves an alpha except all the pixels have a value of 0, so you don't actually get any memory saving from it. I remember looking through the DirectX SDK documentation from June 2010, which seemed to support that.
Am I wrong here?

dpadam450 · Dec 2011

Best thing is to run tests with different textures, and see which runs the best, given your situation.

Basically, whichever way you use, the difference is so negligible that you won't see any change in performance. Just make sure you don't send unused channels, as that is a waste. There are options to store textures as 1 channel instead of rgb/rgba.

In general, PCs are slow when fetching new bitmaps into memory. Each new bitmap has to travel through a thin pipe that is the "video bus" to get into video memory. This can become a bottleneck, slowing your framerate, because the game has to wait until the bitmaps are loaded before it can render the mesh.

Textures are stored on the gfx card itself, which is why cards come with 1 GB up to 2 GB these days, so we can have 2048 and 4096 textures.

If you play a game you own, force your card to use no anisotropic filtering and then bump it up to 2x, 4x, 8x, 16x, you will see the effect of more texture fetches (turn on FRAPS). One extra fetch is not much. So for your original question, the real answer is that the difference is so negligible that you can just pack illuminance, color mask, etc. however you or your engine wants it.

Eric Chadwick · Dec 2011

Well, for one model, it's not a problem. But when you're talking about the way all the models in your game are organized, then it becomes a big deal. At least in my experience. Whether I use atlases or not does usually have a performance impact.

DXT1 does encode the bits differently depending on alpha or not, but there's no change in file size. When you set the bit that "stores" alpha, you have zero in the color for that pixel. But when that's off, you don't have to have dead black pixels. I could be wrong though, I'm no programmer.

CrazyButcher · Dec 2011

If you take something like Unreal there is so much going on, that either using 2 or 3 textures will not make worlds collide

Given you already use a bunch of them and there is tons of stuff going on.

That said, just like dpadam450, I'd favor fewer textures with more channels used. That means less work for streaming jobs, fewer resources used by the shader and fewer fetches.

One could do a hardcore profiling on testing how much faster "in theory" either variant is, but in a real world application, things should be fine.

In the end you have your source files around uncompressed anyway, so it could be a simple batch process that "combines" your channels and spits out the files, assuming you have 100s of models that are rendered with said shader. That is, if it actually becomes a real-world bottleneck worth investigating, which imo is less likely to happen in today's world (we are talking PC, right?)
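The "combine your channels" step of such a batch process could look roughly like this. It is a minimal sketch that works on raw 8-bit buffers; reading the source images and running the actual DXT compressor are left to whatever tools the pipeline already uses, and the function name is made up for the illustration.

```python
def pack_rgb(red, green, blue):
    """Interleave three equally sized 8-bit grayscale buffers into one RGB buffer."""
    if not (len(red) == len(green) == len(blue)):
        raise ValueError("all channels must have the same number of pixels")
    packed = bytearray(len(red) * 3)
    packed[0::3] = red     # e.g. team color mask
    packed[1::3] = green   # e.g. self-illumination
    packed[2::3] = blue    # e.g. the spare channel
    return bytes(packed)


# Two pixels per map, just to show the resulting layout.
team_mask = bytes([255, 0])
self_illum = bytes([10, 20])
spare = bytes([0, 0])
print(list(pack_rgb(team_mask, self_illum, spare)))
```

Because the sources stay uncompressed and separate, switching between the 2-map and 3-map layouts later is only a matter of re-running the packing step.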

Eric Chadwick · Dec 2011

Yeah, guess I'm harping the point because in general we're targeting the older and mobile hardware, not the latest/greatest. If you've got only killer hardware, no worries.

marks · Dec 2011

If you're working on console however ... that performance matters a lot of the time.

couette · Dec 2011

As Eric described, there is an encoding difference between DXT1 with alpha and DXT1 without alpha. Both have the same memory footprint at 8 bytes per 4x4 texel block, and both store two 16-bit color values (565 RGB) in the first 4 bytes of the block.

With no alpha, the first color value is always set to be greater than the second value. Two other colors are lerped between the two min/max values, for a total of 4 possible colors for each pixel.

With alpha, the first color value is always set to be less than the second value. One other color is lerped between the min/max values, and the 4th 'color' is full transparency.

To have that full transparency pixel option, you have to sacrifice one interpolated color.
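The two DXT1 modes described above can be sketched as a small palette decoder. This is an illustration only: the function names are made up, and the 5/6-bit expansion and integer rounding are simplified compared to what a particular GPU or decoder might do. The mode switch on the comparison of the two stored 16-bit values is how the format works.

```python
def rgb565_to_rgb888(value):
    """Expand a 16-bit 5:6:5 color to 8 bits per channel (green gets the extra bit)."""
    r = (value >> 11) & 0x1F
    g = (value >> 5) & 0x3F
    b = value & 0x1F
    return (r * 255 // 31, g * 255 // 63, b * 255 // 31)


def dxt1_palette(color0, color1):
    """Return the 4 RGBA colors a DXT1 block can use, given its two endpoints."""
    c0 = rgb565_to_rgb888(color0)
    c1 = rgb565_to_rgb888(color1)
    if color0 > color1:
        # Opaque mode: two colors interpolated at 1/3 and 2/3.
        c2 = tuple((2 * a + b) // 3 for a, b in zip(c0, c1))
        c3 = tuple((a + 2 * b) // 3 for a, b in zip(c0, c1))
        return [c0 + (255,), c1 + (255,), c2 + (255,), c3 + (255,)]
    # 1-bit alpha mode: one midpoint color, and index 3 is fully transparent.
    c2 = tuple((a + b) // 2 for a, b in zip(c0, c1))
    return [c0 + (255,), c1 + (255,), c2 + (255,), (0, 0, 0, 0)]


print(dxt1_palette(0xFFFF, 0x0000))  # opaque mode: white, black, two greys
print(dxt1_palette(0x0000, 0xFFFF))  # alpha mode: black, white, mid grey, transparent
```

Each of the 16 pixels in the block then stores a 2-bit index into this palette, which accounts for the remaining 4 bytes of the block.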

As for using 3 DXT1 maps (8 bytes per 4x4 block, per map) vs 2 DXT3 maps (16 bytes per 4x4 block, per map)... the smaller the memory footprint, the less space is used in both system and video memory, and the less memory bandwidth is consumed during fetches/transfers. Two DXT3 maps take 1 fewer fetch than 3 DXT1 maps, but consume 33% more bandwidth during sampling (32 vs 24 bytes for the same 4x4 block) and take up 33% more space in video memory. The savings in processing overhead from having one fewer fetch are probably trivial. DXTC decompression on the gpu is basically free.

For one object in a scene, it won't make much of a difference. For a scene with multiple objects, it might.
