What would be a correct ID3D11Query design for occlusion testing?

Question

I've looked for tutorials or samples on this topic, but all I could find were either scientific papers or vague forum posts. As I understand it, I need to dedicate a whole pass to rendering simple AABBs to the depth buffer only, filling it with depth data that a separate pass then uses for occlusion queries. OK, I do that.

So how do I go about that separate querying pass? When I render the same AABB a second time, I correctly get 0 pixels drawn when it is out of view or covered by other objects in the scene. But when the AABB is actually in view, the result is wrong: the box occludes itself with the depth it wrote in the previous pass, so the "pixels drawn" count is all over the place.

But I need a correct count to compute the percentage of pixels drawn, for example to fade a sun lens flare smoothly: as the flare's AABB gets gradually occluded, the flare becomes more transparent.

What am I missing here? What is the correct sequence for Direct3D 11 occlusion querying? Maybe someone has a (pseudo-code) example?

An object shouldn't occlude itself if you're using a <= test rather than a <. — DMGregory
– DMGregory ♦, Commented May 25, 2022 at 19:39
It's hard for folks to determine what you are missing when you haven't shown them what you have done. — Pikalek
– Pikalek, Commented May 26, 2022 at 1:07
@DMGregory ah yes, thanks, that helps. And it works, sort of. Apparently doing GPU occlusion querying is still slow even nowadays (last more or less detailed topic on it that I found through google search was from 2012) because if you try to get the result the same frame - GPU parallelism goes out of the window and if you try to get the result a frame or two down the line - precision goes out of the window, resulting in pop-ins? Should I just resort to using some kind of other classic method instead? — krz
– krz, Commented May 27, 2022 at 7:14
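
The self-occlusion point from the comments can be illustrated without a GPU. The sketch below is a CPU model only, not Direct3D code: DepthFunc and CountPassingSamples are hypothetical names, and it assumes the query pass draws the box with exactly the same transforms as the depth pre-pass, so its fragments land on the depths it already wrote. In real D3D11 the comparison would be selected through the DepthFunc field of D3D11_DEPTH_STENCIL_DESC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mirrors D3D11_COMPARISON_LESS and D3D11_COMPARISON_LESS_EQUAL (model only).
enum class DepthFunc { Less, LessEqual };

// Counts fragments that pass the depth test against an existing depth buffer,
// which is the number an occlusion query reports. Nothing is written back,
// as in a query pass with depth writes disabled.
std::uint64_t CountPassingSamples(const std::vector<float>& depthBuffer,
                                  const std::vector<float>& fragmentDepths,
                                  DepthFunc func)
{
    assert(depthBuffer.size() == fragmentDepths.size());
    std::uint64_t passed = 0;
    for (std::size_t i = 0; i < fragmentDepths.size(); ++i) {
        const bool pass = (func == DepthFunc::Less)
            ? fragmentDepths[i] <  depthBuffer[i]
            : fragmentDepths[i] <= depthBuffer[i];
        if (pass) {
            ++passed;
        }
    }
    return passed;
}
```

With a strict < test, every fragment that meets its own pre-pass depth fails, so a fully visible box can report close to zero samples. With <=, those fragments pass and only the samples covered by nearer geometry are rejected.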

krz · Accepted Answer · 2022-05-27 08:28:57Z

OK, I think I found a good enough balance between performance and pop-ins: in a simple scene it improved my framerate by 20% compared with doing no occlusion culling at all, and it should do even better in a complex one. Maybe it will help someone, or maybe you can help me improve or correct it with further answers in this thread.

Initialization:

On mesh creation/loading, I generate a dedicated AABB mesh for each parent mesh.
I create a dedicated ID3D11Query object for each parent mesh, using a D3D11_QUERY_DESC whose Query field is D3D11_QUERY_OCCLUSION, and give each parent mesh object a UINT64 variable for the query result.
I add a UINT m_framesElapsed counter to each mesh to introduce a delay between issuing the query and reading its result, because reading the result in the same frame obliterates performance.

Mesh objects have the class methods BeginQuery(), EndQuery() and GetQueryResult(). BeginQuery() and EndQuery() execute only when m_framesElapsed is 0, and GetQueryResult() reads the result only when m_framesElapsed is 2 (the lowest delay I could get away with in my case). When the counter is not 2, GetQueryResult() just increments it by one; when it is 2, it reads the result and resets the counter to 0.
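
The counter logic above can be sketched as a small state machine. This is a minimal model, not the author's actual class: DelayedOcclusionQuery, Tick() and the two callbacks are hypothetical names. issueQuery stands in for Begin, the AABB draw call and End on the ID3D11Query, and readResult stands in for ID3D11DeviceContext::GetData, returning false while the GPU has not finished.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical per-mesh query state following the m_framesElapsed scheme.
class DelayedOcclusionQuery {
public:
    // Frame index at which the result is read back (2 in the answer).
    static constexpr std::uint32_t kReadbackFrame = 2;

    DelayedOcclusionQuery(std::function<void()> issueQuery,
                          std::function<bool(std::uint64_t&)> readResult)
        : m_issueQuery(std::move(issueQuery)),
          m_readResult(std::move(readResult))
    {
        assert(m_issueQuery && m_readResult);
    }

    // Call once per frame, in the occlusion pass.
    void Tick()
    {
        if (m_framesElapsed == 0) {
            m_issueQuery();              // Begin + draw AABB + End
        }
        if (m_framesElapsed == kReadbackFrame) {
            std::uint64_t samples = 0;
            if (m_readResult(samples)) { // result available: take it
                m_samplesPassed = samples;
            }                            // otherwise keep the previous value
            m_framesElapsed = 0;
        } else {
            ++m_framesElapsed;
        }
    }

    bool IsVisible() const { return m_samplesPassed > 0; }

private:
    std::function<void()> m_issueQuery;
    std::function<bool(std::uint64_t&)> m_readResult;
    std::uint32_t m_framesElapsed = 0;
    std::uint64_t m_samplesPassed = 1;   // assume visible until the first result
};
```

Starting out as visible and keeping the old value when no result is ready are choices of this sketch. Both err on the side of drawing a mesh rather than hiding one that should be on screen.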

Now, in the main rendering pass, I first render all the highly detailed meshes as usual, straight into both the render target and the depth buffer. Then, in the occlusion pass further down the line, I render the AABB meshes to the depth buffer only and query them against the highly detailed meshes already there: BeginQuery(), DrawCall(), EndQuery(), GetQueryResult(). The result is written into that highly detailed mesh's UINT64 variable, so that in the next frame the mesh is either shown or culled as occluded.

A nice side effect is that the AABBs, being boxes, cover more space than the highly detailed meshes they enclose. That gives a safety margin that helps with pop-ins and pop-outs: by the time the AABB is fully occluded, the mesh inside it is guaranteed to be fully occluded too. However, because of the 2-frame delay, there are still noticeable pop-ins on complex multi-node/multi-mesh models, which I don't yet know how to eliminate (maybe cheat by increasing the scale of the AABBs?). The performance gain is quite noticeable, though.
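
The idea of enlarging the AABBs could look like the sketch below. Vec3, Aabb and InflateAabb are hypothetical names, and the scale factor is a tuning value the source does not specify. A larger proxy box starts passing the query before the mesh itself emerges, which buys some lead time against the readback delay, at the cost of culling fewer meshes.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };
struct Aabb { Vec3 min; Vec3 max; };

// Scales a box about its own center; scale > 1 enlarges it on every axis.
Aabb InflateAabb(const Aabb& box, float scale)
{
    assert(scale > 0.0f);
    const Vec3 center{ (box.min.x + box.max.x) * 0.5f,
                       (box.min.y + box.max.y) * 0.5f,
                       (box.min.z + box.max.z) * 0.5f };
    const Vec3 half{ (box.max.x - box.min.x) * 0.5f * scale,
                     (box.max.y - box.min.y) * 0.5f * scale,
                     (box.max.z - box.min.z) * 0.5f * scale };
    return Aabb{ { center.x - half.x, center.y - half.y, center.z - half.z },
                 { center.x + half.x, center.y + half.y, center.z + half.z } };
}
```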

Another thing I'm considering: before the AABB occlusion pass, copy the full-resolution depth buffer, downscale it into a new, much smaller depth buffer, and then run the AABB rendering/occlusion testing pass against that smaller buffer. Maybe that would save even more performance, since copying/resizing is basically free compared with querying against a buffer that has 4 times the pixels?
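
One detail matters if the depth buffer is downscaled: which depth each small texel keeps. The sketch below is a CPU illustration only (on the GPU this would more likely be a small shader pass), and DownsampleDepthMax is a hypothetical name. It assumes the usual convention where 0 is near and 1 is far, and keeps the farthest depth of each block so that the reduced buffer never reports an occluder as closer than it really is.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Reduces a row-major depth image by an integer factor per axis, storing the
// farthest (largest) depth of each factor x factor block. Assumes 0 = near,
// 1 = far, and dimensions that are exact multiples of the factor.
std::vector<float> DownsampleDepthMax(const std::vector<float>& src,
                                      std::size_t width, std::size_t height,
                                      std::size_t factor)
{
    assert(factor > 0 && width % factor == 0 && height % factor == 0);
    assert(src.size() == width * height);
    const std::size_t outW = width / factor;
    const std::size_t outH = height / factor;
    std::vector<float> dst(outW * outH, 0.0f);
    for (std::size_t oy = 0; oy < outH; ++oy) {
        for (std::size_t ox = 0; ox < outW; ++ox) {
            float farthest = 0.0f;
            for (std::size_t dy = 0; dy < factor; ++dy) {
                for (std::size_t dx = 0; dx < factor; ++dx) {
                    const std::size_t sx = ox * factor + dx;
                    const std::size_t sy = oy * factor + dy;
                    farthest = std::max(farthest, src[sy * width + sx]);
                }
            }
            dst[oy * outW + ox] = farthest;
        }
    }
    return dst;
}
```

A factor of 2 corresponds to the "4 times the pixels" case. Averaging or point-sampling the depths instead could make a thin gap between occluders disappear, so that a mesh visible through it gets culled.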

It's an interesting case study either way.

You may also want to take a look at this legacy Direct3D 10 sample which uses predicated occlusion queries: DrawPredicated. — Chuck Walbourn
– Chuck Walbourn, Commented Jun 15, 2022 at 22:16
@ChuckWalbourn I checked that sample out, it didn't work at first (provided DXUT version couldn't befriend my GTX1080Ti, so had to type in screen resolution manually, as DXUT methods were returning zeroes, resulting in assertion failure) but then it works weird. Like I have ~2200 fps without predication and ~2000 fps with predication, which is 10% framerate drop. Not only that but I'm not even sure predication does anything. At least can't tell if microscopes disappear - they certainly don't if covered by occluder meshes in front of them. — krz
– krz, Commented Jun 19, 2022 at 19:34
Don't forget to mark this answer as Accepted if it solved your problem. Click the checkmark in the top-left of the answer, under the voting buttons. — DMGregory
– DMGregory ♦, Commented Jun 26, 2022 at 10:17
