27.) Occlusion Query

Hello guys! This is the 27th tutorial in my series. This one is about occlusion queries, which speed up rendering by skipping complex geometry that cannot be visible in the scene anyway. The principle is really simple - before rendering the complex geometry of an object with many meshes and triangles, we first render a simple bounding shape of that object (a bounding box is the easiest) and ask OpenGL whether any of its fragments (pixels) would have passed the depth test. If rendering the bounding box wouldn't change the framebuffer (i.e. it's not visible in the scene at all), the object inside it cannot be visible either, and we can skip rendering the whole complex shape! That's the whole point. Coding it won't be very difficult, so let's dig deeper into it.

Occlusion Query

The whole process described above is called an occlusion query. Unfortunately, the words occlusion, occluder and occludee have no direct translation in Slovak, so I hope I won't misuse them (if I do, let me know). In this tutorial the occluder is the bounding box, because it encloses the object we are testing - the occludee (the highly tessellated sphere in our case). Elsewhere, occluder usually means the object that hides another one from view, so keep in mind that here it names the test volume. We are not going to actually render the occluder; we only ask whether any part of it would be visible if we did render it. If at least one pixel of it would be visible, some part or maybe the whole occludee (the highly tessellated sphere) is probably visible too, and thus we will render it. The whole rendering code is here and I will explain it line by line:

bool bShowOccluders = false;
bool bEnableOcclusionQuery = true;

void RenderScene(LPVOID lpParam)
{
	// ...

	int iSpheresPassed = 0;
	bool bRenderSphere[3][3][3];
	glm::mat4 mModelMatrices[3][3][3];

	spOccluders.SetUniform("matrices.projMatrix", oglControl->GetProjectionMatrix());
	spOccluders.SetUniform("matrices.viewMatrix", cCamera.Look());
	spOccluders.SetUniform("vColor", glm::vec4(1, 0, 0, 0));

	// Occlusion query begins here
	// First of all, disable writing to the color buffer and depth buffer. We just wanna check if they would be rendered, not actually render them
	glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
	glDepthMask(GL_FALSE);

	FOR(x, 3)
	{
		FOR(y, 3)
		{
			FOR(z, 3)
			{
				bRenderSphere[x][y][z] = false;
				float fLocalRotAngle = fGlobalAngle + x*60.0f + y*20.0f + z*6.0f;
				glm::vec3 vOcclusionCubePos = glm::vec3(-fCubeHalfSize+fCubeHalfSize*x*2.0f/3.0f + fCubeHalfSize/3.0f, -fCubeHalfSize+fCubeHalfSize*y*2.0f/3.0f + fCubeHalfSize/3.0f, -fCubeHalfSize+fCubeHalfSize*z*2.0f/3.0f + fCubeHalfSize/3.0f);

				mModelMatrices[x][y][z] = glm::translate(glm::mat4(1.0), glm::vec3(0.0f, fCubeHalfSize, 0.0f));
				mModelMatrices[x][y][z] = glm::translate(mModelMatrices[x][y][z], vOcclusionCubePos);
				mModelMatrices[x][y][z] = glm::rotate(mModelMatrices[x][y][z], fLocalRotAngle, glm::vec3(1, 0, 0));
				mModelMatrices[x][y][z] = glm::rotate(mModelMatrices[x][y][z], fLocalRotAngle, glm::vec3(0, 1, 0));
				mModelMatrices[x][y][z] = glm::rotate(mModelMatrices[x][y][z], fLocalRotAngle, glm::vec3(0, 0, 1));

				if(bEnableOcclusionQuery)
				{
					mModel = glm::scale(mModelMatrices[x][y][z], glm::vec3(fCubeHalfSize/3, fCubeHalfSize/3, fCubeHalfSize/3));
					spOccluders.SetUniform("matrices.modelMatrix", mModel);

					// Begin occlusion query
					glBeginQuery(GL_SAMPLES_PASSED, uiOcclusionQuery);
						// Every pixel that passes the depth test now gets added to the result
						glDrawArrays(GL_TRIANGLES, 0, 36);
					// End the query, so that its result can be read
					glEndQuery(GL_SAMPLES_PASSED);
					// Now get the number of pixels passed
					int iSamplesPassed = 0;
					glGetQueryObjectiv(uiOcclusionQuery, GL_QUERY_RESULT, &iSamplesPassed);
					// If some samples passed, this means that we had better render the whole sphere, because we were able
					// to see its bounding box
					if(iSamplesPassed > 0)
					{
						bRenderSphere[x][y][z] = true;
						// Increase the number of spheres that have passed
						iSpheresPassed++;
					}
				}
				else // If we do not use occlusion query, then all of the spheres have passed
				{
					bRenderSphere[x][y][z] = true;
					// Increase the number of spheres that have passed
					iSpheresPassed++;
				}
			}
		}
	}

	// Re-enable writing to color buffer and depth buffer
	glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
	glDepthMask(GL_TRUE);

	// ...
}
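
The long vOcclusionCubePos expression in the listing is easier to read one axis at a time. Here is a minimal sketch of it; the function name gridCellCentre and the half size of 30 used in the note below are made up for illustration, the real fCubeHalfSize lives elsewhere in the project.

```cpp
#include <cassert>

// Centre coordinate of cell `index` (0, 1 or 2) along one axis of the 3x3x3 grid.
// Same expression as each component of vOcclusionCubePos in the listing;
// `halfSize` plays the role of fCubeHalfSize.
float gridCellCentre(float halfSize, int index)
{
    assert(index >= 0 && index < 3);
    return -halfSize + halfSize * index * 2.0f / 3.0f + halfSize / 3.0f;
}
```

With a half size of 30 this gives -20, 0 and 20 for indices 0, 1 and 2, so the cells are spaced two thirds of the half size apart and the grid is centred on the origin before the extra translation upwards by fCubeHalfSize.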

So what we're basically doing here is that we have a 3x3x3 array of booleans, where we store true or false - whether to render a particular sphere or not. First of all, we must disable writing to the color and depth buffers using glColorMask and glDepthMask. That means that anything rendered now (our occluder) will not get written into any of the buffers and thus will not show up on screen. But thanks to the occlusion query, we can ask the important question - how many fragments (pixels) have passed the depth test and made it to the final stage of rendering, i.e. writing to the color buffer? To start the occlusion query, we must first call the function glBeginQuery(GL_SAMPLES_PASSED, uiOcclusionQuery). This function has two parameters - the first is the query type, in our case GL_SAMPLES_PASSED, because we would like to find out how many fragments would make it to the rendering phase. The second parameter is the query object - to run a query, you must have an OpenGL-generated ID associated with it. This ID is created in initScene() using the glGenQueries function, which we have also used in the 23rd tutorial about Particle System, where queries were used to figure out the number of output primitives.

Now that we have begun the GL_SAMPLES_PASSED query and have turned off writing to the color and depth buffers, every pixel that would make it through counts. So now we just render the bounding box of the sphere and then call the glEndQuery function, which, as the name suggests, ends the query. Note that we have a separate shader program for bounding box rendering without any texturing or lighting calculations, just a flat fragment output, because coverage is the only thing we care about at the moment. Now we are able to ask for the result of the query - the number of pixels passed. This is done using the glGetQueryObjectiv(uiOcclusionQuery, GL_QUERY_RESULT, &iSamplesPassed) function, after which iSamplesPassed holds the number of pixels passed. Only if this number is greater than zero do we want to render the sphere. We mark it in our array of booleans and proceed to the next sphere.
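
To make it clearer what the query actually counts, here is a toy CPU model of it. This is only a sketch and not how the GPU does it: DepthBuffer, Rect, drawSolid and countSamplesPassed are names invented for this example, every pixel has exactly one sample, and the "geometry" is just a screen-space rectangle at a constant depth, where a smaller depth means closer to the camera.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy depth buffer: one depth value per pixel, 1.0 = far plane.
struct DepthBuffer {
    int width;
    int height;
    std::vector<float> depth;

    DepthBuffer(int w, int h)
        : width(w), height(h), depth(static_cast<std::size_t>(w) * h, 1.0f) {}

    float& at(int x, int y) { return depth[static_cast<std::size_t>(y) * width + x]; }
    float at(int x, int y) const { return depth[static_cast<std::size_t>(y) * width + x]; }
};

// Screen-space rectangle [x0, x1) x [y0, y1) at constant depth z.
struct Rect {
    int x0, y0, x1, y1;
    float z;
};

// Ordinary rendering: depth test plus depth write.
void drawSolid(DepthBuffer& db, const Rect& r)
{
    for (int y = std::max(r.y0, 0); y < std::min(r.y1, db.height); ++y)
        for (int x = std::max(r.x0, 0); x < std::min(r.x1, db.width); ++x)
            if (r.z < db.at(x, y))
                db.at(x, y) = r.z;
}

// Occluder rendering with writes masked off: the depth test still runs,
// but nothing is stored. The return value mimics the GL_SAMPLES_PASSED result.
int countSamplesPassed(const DepthBuffer& db, const Rect& r)
{
    int passed = 0;
    for (int y = std::max(r.y0, 0); y < std::min(r.y1, db.height); ++y)
        for (int x = std::max(r.x0, 0); x < std::min(r.x1, db.width); ++x)
            if (r.z < db.at(x, y))
                ++passed;
    return passed;
}
```

For example, after drawing a wall Rect{0, 0, 4, 8, 0.3f} into an 8x8 buffer, a box Rect{1, 1, 3, 3, 0.6f} that lies completely behind the wall yields 0, while Rect{2, 2, 6, 4, 0.6f}, which sticks out on the right side, yields 4. In both cases the depth buffer stays untouched by the test.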

When we are done testing which spheres should be rendered, the next step is naturally to render the (possibly) visible spheres. We just need to re-enable writing to the depth and color buffers and then go through every sphere in the scene using the data from the previously filled array of booleans. And that's it.
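
The second pass is not part of the listing, so here is a rough sketch of its shape. The function renderVisibleSpheres and the drawSphere callback are hypothetical stand-ins for the real sphere rendering code (setting the model matrix, binding the sphere shader, issuing the draw call), which is why this compiles without any OpenGL context.

```cpp
#include <functional>

// Second pass: draw only the spheres whose bounding box passed the query.
// `visible` corresponds to bRenderSphere from the first pass; `drawSphere`
// is a hypothetical callback that renders the sphere at grid cell (x, y, z).
// Returns the number of spheres actually drawn.
int renderVisibleSpheres(const bool (&visible)[3][3][3],
                         const std::function<void(int, int, int)>& drawSphere)
{
    int drawn = 0;
    for (int x = 0; x < 3; ++x)
        for (int y = 0; y < 3; ++y)
            for (int z = 0; z < 3; ++z)
                if (visible[x][y][z])
                {
                    drawSphere(x, y, z);
                    ++drawn;
                }
    return drawn;
}
```

The returned count should match iSpheresPassed from the first pass, which makes it a handy sanity check.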

NOTE: There is also a GL_ANY_SAMPLES_PASSED query, whose result is GL_TRUE if any pixels have passed and GL_FALSE if none have. It would work in this case too, but I tested it and didn't find any speed-up, so I stuck with GL_SAMPLES_PASSED and comparing the result against 0. You can try it out yourself.

When To Use Occlusion Query

Occlusion query is a really nice and simple way to get rid of objects that are not visible in the scene. But I found out that it should only be used when we are rendering really complex objects. There is a certain threshold of complexity, starting from which it's worth using. For example, with spheres of only 10 stacks and 10 slices, the occlusion query (at least on my computer) resulted in lower FPS than rendering without it. However, with spheres of 200 stacks and 200 slices, the FPS difference between rendering with and without occlusion query was remarkable. In such cases, the occlusion query really paid off. You can test it in the application by turning occlusion query ON / OFF and by modifying the stacks and slices numbers of a single sphere in the sphere.ini file, which is in the bin directory (where the exe is). So you can modify these values and see the results for yourself.
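
A bit of arithmetic shows where that threshold comes from. The sketch below assumes that every stack/slice cell of the sphere is drawn as two triangles; the sphere class from this series may count them slightly differently, so take the numbers as a rough estimate. The names sphereTriangles and kBoxTriangles are made up for this example.

```cpp
#include <cassert>

// The occluder box is drawn with glDrawArrays(GL_TRIANGLES, 0, 36): 36 / 3 = 12 triangles.
constexpr long kBoxTriangles = 12;

// Triangle count of a sphere, ASSUMING two triangles per stack/slice cell.
long sphereTriangles(int stacks, int slices)
{
    assert(stacks > 0 && slices > 0);
    return 2L * stacks * slices;
}

// Sphere triangles skipped per box triangle spent when the query says "hidden".
double savedPerBoxTriangle(int stacks, int slices)
{
    return static_cast<double>(sphereTriangles(stacks, slices)) / kBoxTriangles;
}
```

Under this assumption a 10x10 sphere has 200 triangles and a 200x200 sphere has 80000, against 12 for the box. On top of the box itself, every query in the listing reads its result back with GL_QUERY_RESULT right away, which makes the CPU wait until the GPU has finished the test. For the small sphere that fixed cost is probably bigger than what skipping 200 triangles saves, while for the big one it is tiny compared to 80000 triangles.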


This is the fruit of today's effort:

I think that this tutorial was one of the easiest I have ever written. The logic behind it is not difficult to understand, nor is it difficult to turn into code. I recommend playing a little with the stacks / slices parameters to see the FPS differences. Next time I will probably create a slightly more complex tutorial, as this one was really chill.
