024.) Uniform Buffer Object

Hello fellow 3D graphics enthusiasts and welcome to the 24th tutorial of my OpenGL4 series! In this one, we will learn what a uniform buffer object is and how it can be used to speed up our rendering by issuing fewer commands. To demonstrate this, we will make a simple example with many point lights that float around the map and bounce off the edges of the world. So let's not waste any more time on opening words and get down to business!

The problem

In this tutorial, two shader programs are used to render the scene. One renders all objects (tori, models), the other renders the heightmap with multiple texture layers. These two shader programs are completely different and cannot really be merged into one universal shader program effectively. Both use several point lights that float around the scene. One way to provide the lights is to have uniform variables for point lights in both shader programs (the usage of uniforms that we are already familiar with).

The problem now is that we have multiple point lights (quite complex and big structures) that have to be set in two different shader programs. If we did it the old way, we would simply have to set the uniforms (all point lights) in both shader programs. Now imagine that complex game engines and serious games don't use just two, but maybe five or a dozen shader programs at once. And of course, the number of uniforms to be set is a lot higher than in our simple example! Can we do better?

Well of course we can! That's where uniform buffer objects (further referred to as UBOs) come to the rescue! The main idea is that instead of setting uniform variables separately in every shader program that uses them, we set them only once in the UBO and then set up the shader programs to take the uniform data from the UBO instead! This setup needs to be done only once (e.g. during scene initialization), and all we need to do from then on is fill the UBOs with data before rendering and everything will still work!

A bit more theory

Now that we understand the problem, we also have to understand a bit of theory and a few OpenGL terms to come up with a solution. First of all, we have to understand what a uniform block is.

Uniform block

A uniform block is a set of uniform variables that we can set together as a whole with UBOs. Usually, we have multiple related variables in one uniform block. In this tutorial, we have two uniform blocks in both shader programs - one is for matrices (but only the projection and view matrices, as they stay the same throughout the frame) and another one is for point lights. In the shader, these blocks look like this:

// Memory layout is std140 and block name is MatricesBlock
layout(std140, binding = 0) uniform MatricesBlock
{
    mat4 projectionMatrix;
    mat4 viewMatrix;
} block_matrices;
// ^This is block instance name (like variable name)

// ...

// Memory layout is std140 and block name is PointLightsBlock
layout(std140, binding = 1) uniform PointLightsBlock
{
    PointLight lights[20];
} block_pointLights;
// ^This is block instance name (like variable name, through it we access the data)

I hope you got the main idea of the blocks now. I haven't explained what std140 and binding are doing there and why they matter, but we'll get to that later in the tutorial.

What matters most now is that these blocks have indices assigned. For example, the block with matrices can have index 0 in one shader program, but index 3 in another one (I suppose that they're assigned in order of appearance in the code). Why are these indices important? Because the uniform blocks are bound to uniform binding points using these indices.

In order to understand the sentence above, we first have to look at how a uniform buffer object works. Of course, my style is to wrap low-level functionality into nice high-level classes and that's what I did in this tutorial as well.

Uniform buffer object class

Let's have a look at our class definition:

class UniformBufferObject
{
public:
    void createUBO(const size_t byteSize, GLenum usageHint = GL_STREAM_DRAW);
    void bindUBO() const;
    void setBufferData(const size_t offset, const void* ptrData, const size_t dataSize);
    void bindBufferBaseToBindingPoint(const GLuint bindingPoint);
    GLuint getBufferID() const;
    void deleteUBO();

private:
    GLuint _bufferID{ 0 };
    size_t _byteSize{ 0 };

    bool _isBufferCreated = false;
};

As you can see, this class resembles another buffer class that we already have - the vertex buffer object. Here, as there, we perform buffer creation, deletion and binding. Let's first go through the functions that you should already understand just from their names:

  • void createUBO(const size_t byteSize, GLenum usageHint = GL_STREAM_DRAW) - creates new uniform buffer object with a given size (in bytes). The buffer has its ID assigned. usageHint is important to tell OpenGL how we plan to use the buffer and GL_STREAM_DRAW is default option for that, because usually we plan to update UBO data every frame (streaming)
  • void bindUBO() - binds the UBO - this means all operations we'll do from now on apply to this buffer
  • void setBufferData(const size_t offset, const void* ptrData, const size_t dataSize) - sets data in the buffer at the given offset (in bytes). The data are located at ptrData and have size dataSize (in bytes)
  • GLuint getBufferID() - retrieves assigned buffer ID
  • void deleteUBO() - deletes the buffer object by calling glDeleteBuffers

Binding points

Great! Now the last function that we haven't covered is void bindBufferBaseToBindingPoint(const GLuint bindingPoint). From the name, you can deduce that we're binding the buffer base to a binding point. Thanks, Captain Obvious! But what does this mean? As mentioned before, if we want to link buffer data with uniforms in shader programs, the first step is to bind the data to a certain binding point. A binding point (or index) is a numbered slot where certain data reside. In my tutorial, I have decided to use two binding points:

  • 0 - MATRICES_BLOCK_BINDING_POINT - at this binding point we will store the projection and view matrices - those two matrices are important for basically all 3D shader programs
  • 1 - POINT_LIGHTS_BLOCK_BINDING_POINT - at this binding point we will store data for all 20 point lights (20 is the hardcoded maximum that I allow for performance reasons, but feel free to raise it to 100 or whatever)

So I have decided that matrices data reside at binding point 0 and point lights data at binding point 1. Finally, what does the function void bindBufferBaseToBindingPoint(const GLuint bindingPoint) do? You have probably guessed right - with it we say which binding point the buffer's data belong to! This has to be done only once, and in our tutorial it's done in the initializeScene function. Together with UBO creation, the code looks like this:

void OpenGLWindow::initializeScene()
{
    // ...

    // Create UBO for matrices and bind it to the MATRICES_BLOCK_BINDING_POINT
    uboMatrices = std::make_unique<UniformBufferObject>();
    uboMatrices->createUBO(sizeof(glm::mat4) * 2);
    uboMatrices->bindBufferBaseToBindingPoint(MATRICES_BLOCK_BINDING_POINT);

    // Create UBO for point lights and bind it to the POINT_LIGHTS_BLOCK_BINDING_POINT
    uboPointLights = std::make_unique<UniformBufferObject>();
    uboPointLights->createUBO(MAX_POINT_LIGHTS * shader_structs::PointLight::getDataSizeStd140());
    uboPointLights->bindBufferBaseToBindingPoint(POINT_LIGHTS_BLOCK_BINDING_POINT);

    // ...
}

This code just tells OpenGL that data for binding point MATRICES_BLOCK_BINDING_POINT (0) are in uboMatrices and data for binding point POINT_LIGHTS_BLOCK_BINDING_POINT (1) are in uboPointLights. I really hope that you get the point now. Maybe the last thing that wasn't exactly explained is what the word "base" stands for. Base just means that we're binding the WHOLE buffer to that binding point. OpenGL also gives you the option to have one UBO, parts of which are bound to several binding points (using glBindBufferRange). But we don't need this in this tutorial at all.

Connecting uniform blocks with binding points

Now that binding points are ready, we just have to connect uniform blocks in shader programs with the binding points. This also has to be set up only once during initialization. As mentioned, our shader programs now contain something called uniform blocks. In every shader program, these blocks have their indices. First of all, we need those indices and that's why I created a function in ShaderProgram class, that gets the index of uniform block by its name:

GLuint ShaderProgram::getUniformBlockIndex(const std::string& uniformBlockName) const
{
    if (!_isLinked)
    {
        std::cerr << "Cannot get index of uniform block " << uniformBlockName << " when program has not been linked!" << std::endl;
        return GL_INVALID_INDEX;
    }

    GLuint result = glGetUniformBlockIndex(_shaderProgramID, uniformBlockName.c_str());
    if (result == GL_INVALID_INDEX) {
        std::cerr << "Could not get index of uniform block " << uniformBlockName << ", check if such uniform block really exists!" << std::endl;
    }

    return result;
}

This function just asks OpenGL for the index of the block with the given name in a linked program (this is important - block indices are only available after linking). So simple! And if we want to connect the block index with a binding point, that's exactly what the function bindUniformBlockToBindingPoint is for:

void ShaderProgram::bindUniformBlockToBindingPoint(const std::string& uniformBlockName, const GLuint bindingPoint) const
{
    const auto blockIndex = getUniformBlockIndex(uniformBlockName);
    if (blockIndex != GL_INVALID_INDEX) {
        glUniformBlockBinding(_shaderProgramID, blockIndex, bindingPoint);
    }
}

It gets the index of the named block, and the glUniformBlockBinding function then makes the connection between the uniform block and a certain binding point. All of this happens only once in the initializeScene function. But take care, this has to be done for every shader program!

void OpenGLWindow::initializeScene()
{
    // ...

    // Bind uniform blocks with binding points for main program
    mainShaderProgram.bindUniformBlockToBindingPoint("MatricesBlock", MATRICES_BLOCK_BINDING_POINT);
    mainShaderProgram.bindUniformBlockToBindingPoint("PointLightsBlock", POINT_LIGHTS_BLOCK_BINDING_POINT);

    // Bind uniform blocks with binding points for custom multilayer heightmap shader program
    heightmapShaderProgram.bindUniformBlockToBindingPoint("MatricesBlock", MATRICES_BLOCK_BINDING_POINT);
    heightmapShaderProgram.bindUniformBlockToBindingPoint("PointLightsBlock", POINT_LIGHTS_BLOCK_BINDING_POINT);

    // ...
}

By the way, remember the keyword binding in the shaders? With that keyword, you can also set the binding point for the uniform block directly in GLSL! This means that the code above actually doesn't have to be executed and everything will still work. But I do it anyway, because the binding layout qualifier is only available since OpenGL 4.2 (GLSL 4.20), and in older versions this had to be done from the application code. This way your logic should work even in older OpenGL contexts.

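To recap, this is what the uniform blocks <-> binding points <-> shader programs connections look like: in each shader program, MatricesBlock and PointLightsBlock are connected to binding points 0 and 1, and those binding points are backed by uboMatrices and uboPointLights respectively.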
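To make the two levels of indirection more tangible, here is a small toy model in plain C++ (no OpenGL involved). It only mimics the bookkeeping: a per-program table from block name to block index, a per-program table from block index to binding point, and one shared table from binding point to buffer. All type and function names (ProgramModel, BindingTable, resolveBuffer) and the buffer IDs are made up for illustration, and the real driver surely stores this differently.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model of the lookup tables involved. All names are illustrative only.
struct ProgramModel
{
    std::map<std::string, unsigned> blockIndexByName;    // assigned when the program is linked
    std::map<unsigned, unsigned>    bindingPointByIndex; // what glUniformBlockBinding sets up
};

// Shared by all programs: binding point -> buffer ID (what binding the buffer base sets up).
using BindingTable = std::map<unsigned, unsigned>;

// Resolves which buffer feeds a named uniform block of a program.
// Returns 0 when any link in the chain is missing.
unsigned resolveBuffer(const ProgramModel& program,
                       const BindingTable& bindings,
                       const std::string& blockName)
{
    assert(!blockName.empty());

    const auto indexIt = program.blockIndexByName.find(blockName);
    if (indexIt == program.blockIndexByName.end()) {
        return 0;
    }

    const auto pointIt = program.bindingPointByIndex.find(indexIt->second);
    if (pointIt == program.bindingPointByIndex.end()) {
        return 0;
    }

    const auto bufferIt = bindings.find(pointIt->second);
    return bufferIt == bindings.end() ? 0 : bufferIt->second;
}
```

The point of the model is that two programs may give MatricesBlock different block indices (say 0 and 3), yet both resolve to the same buffer as long as both indices are connected to binding point 0.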

I hope you are still with me, because even if you should now understand most aspects of uniform buffer objects, there is still one thing I didn't cover yet, but I promised it in the beginning - the std140 memory layout.

The std140 memory layout

std140 is one of the possible memory layouts that define how the data in buffers are stored. Why does something like this matter? Partly because of performance. CPUs and GPUs usually load a whole chunk of data at once, so even if you just need a simple boolean or one byte, a somewhat bigger part is read anyway. For OpenGL, this means that it aligns the buffer data so that they can be read efficiently. And because the std140 rules are fixed, the C++ side knows exactly at which offset every variable lives. Here I will list the rules for the most important GLSL types:

  • scalar type (bool, int, uint, float) - takes the size of the basic scalar type (sizeof(GLfloat) or 4 bytes) and is aligned to 4 bytes
  • 2D vector type (bvec2, ivec2, uvec2, vec2) - twice the size of the basic scalar type, i.e. 8 bytes, aligned to 8 bytes (no space wasted)
  • 3D vector type (bvec3, ivec3, uvec3, vec3) - occupies 12 bytes, but is aligned like a 4D vector, i.e. to 16 bytes (one scalar is wasted, unless a scalar follows directly and fills the gap)
  • 4D vector type (bvec4, ivec4, uvec4, vec4) - four times the size of the basic scalar type, i.e. 16 bytes, aligned to 16 bytes (no space wasted)
  • array - every single element is padded to a multiple of the vec4 size (16 bytes), so for example float[4] takes 64 bytes, not 16
  • mat3 - stored as an array of 3x 3D vectors, that means we have 9 scalars and the whole thing is padded as if we had 3x vec4, i.e. 48 bytes (3 scalars wasted)
  • mat4 - stored as an array of 4x 4D vectors, that means we have 16 scalars, i.e. 64 bytes, and the whole thing fits just perfectly (no space wasted)

More information on this matter can be found in the std140 Uniform Buffer Layout document.
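The rules listed above can be turned into a small offset calculator. The following is only a sketch in plain C++ under a few assumptions: it covers the scalar, vector, matrix and array rules from the list and ignores everything else in the specification, and all names in it (Std140Member, alignUp, std140Array, std140Offsets) are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Base alignment and size (both in bytes) of one std140 block member.
struct Std140Member
{
    std::size_t alignment;
    std::size_t size;
};

constexpr Std140Member kFloat{ 4, 4 };  // also int, uint and bool
constexpr Std140Member kVec2{ 8, 8 };
constexpr Std140Member kVec3{ 16, 12 }; // aligned like vec4, but only 12 bytes long
constexpr Std140Member kVec4{ 16, 16 };
constexpr Std140Member kMat3{ 16, 48 }; // three columns, each padded to a vec4
constexpr Std140Member kMat4{ 16, 64 };

// Rounds value up to the next multiple of alignment.
constexpr std::size_t alignUp(std::size_t value, std::size_t alignment)
{
    return (value + alignment - 1) / alignment * alignment;
}

// Describes an array: every element is padded to a multiple of 16 bytes.
constexpr Std140Member std140Array(Std140Member element, std::size_t count)
{
    return Std140Member{ alignUp(element.alignment, 16), alignUp(element.size, 16) * count };
}

// Returns the byte offset of every member, in declaration order.
std::vector<std::size_t> std140Offsets(const std::vector<Std140Member>& members)
{
    std::vector<std::size_t> offsets;
    std::size_t cursor = 0;
    for (const Std140Member& member : members)
    {
        assert(member.alignment > 0);
        cursor = alignUp(cursor, member.alignment); // skip padding in front of the member
        offsets.push_back(cursor);
        cursor += member.size;
    }
    return offsets;
}
```

For example, std140Offsets({ kVec3, kFloat }) yields offsets 0 and 12, because the float slips into the gap behind the vec3, while std140Offsets({ kFloat, kVec3 }) yields 0 and 16, so 12 bytes are lost to padding.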

The wasted space that I mentioned up there is a pity. But the good news is that it can often be avoided! If you design your data structures in a smart way, you can prevent most of the waste! For example, a vec3 followed directly by a float variable means no wasted space, because the float fills the fourth scalar slot. Be careful with mat3 though - each of its columns is padded separately, so that padding cannot be filled by a subsequent variable. Knowing these rules, let's have a look at my PointLight shader structure with this in mind:

struct PointLight : ShaderStruct
{
    // ...
    glm::vec3 position; // Position of the point light
    float __DUMMY_PADDING0__; // This is just needed because of std140 layout padding rules

    glm::vec3 color; // Color of the point light
    float ambientFactor; // Ambient factor (how much this light contributes to the global lighting in any case)

    float constantAttenuation; // Constant attenuation factor of light with rising distance
    float linearAttenuation; // Linear attenuation factor of light with rising distance
    float exponentialAttenuation; // Exponential attenuation factor of light with rising distance
    GLint isOn; // Flag telling, if the light is on
};

Let's start from the bottom up this time. You can see that I grouped together all attenuation factors and the flag telling if the light is on. This sums up to 4 scalars, so no space is wasted! I used GLint for the bool flag, so that the C++ structure corresponds to the GLSL structure. Then, the color together with ambientFactor forms a nicely packed vec4. The last piece of data - position - unfortunately cannot be aligned that well. One scalar will be wasted, and this is exactly why you see the __DUMMY_PADDING0__ variable. If you remove it, rendering just won't work correctly (feel free to try it). But with this dummy variable, we can set the data of one light to the uniform buffer all at once and everything works!
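If you want the compiler to double-check such a layout, static_assert together with offsetof can do it. The sketch below mirrors the data members of the structure above, with two assumptions of mine: plain float arrays stand in for glm::vec3 so that the example needs no libraries, and the name PointLightStd140 is invented for this example.

```cpp
#include <cstddef>

// Plain mirror of the point light data, laid out to match std140.
// Float arrays stand in for glm::vec3 so the sketch has no dependencies.
struct PointLightStd140
{
    float position[3];
    float padding0;               // fills the fourth scalar slot after position

    float color[3];
    float ambientFactor;          // shares one 16-byte slot with color

    float constantAttenuation;
    float linearAttenuation;
    float exponentialAttenuation;
    int   isOn;                   // a GLSL bool occupies 4 bytes
};

// Compile-time checks: the build fails if the C++ layout drifts away from std140.
static_assert(sizeof(float) == 4 && sizeof(int) == 4, "sketch assumes 4-byte float and int");
static_assert(offsetof(PointLightStd140, color) == 16, "color must start the second 16-byte slot");
static_assert(offsetof(PointLightStd140, ambientFactor) == 28, "ambientFactor fills the slot after color");
static_assert(offsetof(PointLightStd140, constantAttenuation) == 32, "attenuation starts the third slot");
static_assert(sizeof(PointLightStd140) == 48, "one light takes three 16-byte slots");
```

Removing padding0 from this struct moves color to offset 12 and the second assertion fails at compile time, which is a lot friendlier than a scene that silently renders wrong.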

Setting uniform data

The final chapter of this tutorial shows how we can set data to the UBOs. We do this every single frame, because matrices and point lights change all the time. So at the start of the renderScene function, there is this code for setting matrices and point lights data to UBOs:

void OpenGLWindow::renderScene()
{
    // ...

    // Set matrices uniform buffer object data - we just set projection and view matrix here, they are
    // consistent across all shader programs
    uboMatrices->bindUBO();
    uboMatrices->setBufferData(0, glm::value_ptr(getProjectionMatrix()), sizeof(glm::mat4));
    uboMatrices->setBufferData(sizeof(glm::mat4), glm::value_ptr(camera.getViewMatrix()), sizeof(glm::mat4));

    // Set point lights uniform buffer object data - in our case the point lights are same across all shader programs
    uboPointLights->bindUBO();
    GLsizeiptr offset = 0;
    for (const auto& pointLight : pointLights)
    {
        uboPointLights->setBufferData(offset, pointLight.getDataPointer(), pointLight.getDataSizeStd140());
        offset += pointLight.getDataSizeStd140();
    }

    // ...
}

Very simple really. Just bind the UBO and set the data using the setBufferData function! One more thing you might have noticed is that I'm using two new functions for shader structures - getDataPointer and getDataSizeStd140.
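The offset bookkeeping of that loop can be tried out without any OpenGL context. The sketch below is a CPU-side stand-in that I made up for illustration: FakeUniformBuffer keeps its bytes in a std::vector and copies the offset and size semantics of setBufferData, LightPayload is just 48 bytes of floats, and uploadLights is a hypothetical helper that mimics the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// CPU-side stand-in for a uniform buffer: same offset/size semantics, no OpenGL needed.
class FakeUniformBuffer
{
public:
    explicit FakeUniformBuffer(std::size_t byteSize) : _bytes(byteSize, 0) {}

    // Mimics setBufferData: copies dataSize bytes from ptrData to the given byte offset.
    void setBufferData(std::size_t offset, const void* ptrData, std::size_t dataSize)
    {
        assert(offset + dataSize <= _bytes.size()); // writes must stay inside the buffer
        std::memcpy(_bytes.data() + offset, ptrData, dataSize);
    }

    const unsigned char* data() const { return _bytes.data(); }

private:
    std::vector<unsigned char> _bytes;
};

// Twelve 4-byte scalars, i.e. the 48 bytes of one light in std140 layout.
struct LightPayload
{
    float values[12];
};

// Writes all lights back to back and returns the offset just past the last one.
std::size_t uploadLights(FakeUniformBuffer& buffer, const std::vector<LightPayload>& lights)
{
    std::size_t offset = 0;
    for (const LightPayload& light : lights)
    {
        buffer.setBufferData(offset, &light, sizeof(LightPayload));
        offset += sizeof(LightPayload);
    }
    return offset;
}
```

With three lights the third one lands at byte 96 and the function returns 144, so a buffer sized for 20 lights (960 bytes) has plenty of room left.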

The first one is important because I'm using inheritance with shader structures, and with inheritance and virtual functions, things like virtual table pointers exist. This means our data won't start exactly at the address of the whole structure. Therefore I have to explicitly point to where the data start. In our case it's the first struct member.

The second one is important for a similar reason. Because of inheritance, the sizeof operator won't report the size in bytes that you would expect. For example, our point light structure should be 48 bytes, but in fact sizeof(shader_structs::PointLight) will report 52 bytes in a 32-bit application and even 56 bytes in a 64-bit application! But we need 48 bytes. That's why I have a custom function that calculates this size according to the OpenGL rules - this way we don't need to worry.
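Both effects are easy to reproduce in isolation. The sketch below is not the tutorial's actual code: ShaderStructBase, PointLightData and copyPayload are names I made up, and instead of pointing at the first struct member, it groups the data in a nested Payload struct, which is a variation I chose so that the pointer and the size come from one well-defined object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical polymorphic base class, similar in spirit to ShaderStruct.
struct ShaderStructBase
{
    virtual ~ShaderStructBase() = default;
    virtual const void* getDataPointer() const = 0;
};

struct PointLightData : ShaderStructBase
{
    // All std140 data grouped together, so one pointer and one size describe it.
    struct Payload
    {
        float position[3];
        float padding0;
        float color[3];
        float ambientFactor;
        float constantAttenuation;
        float linearAttenuation;
        float exponentialAttenuation;
        int   isOn;
    } data{};

    // The payload starts after the hidden virtual table pointer, not at the object's address.
    const void* getDataPointer() const override { return &data; }

    // Size of the payload only, without the virtual table pointer.
    static constexpr std::size_t getDataSizeStd140() { return sizeof(Payload); }
};

static_assert(PointLightData::getDataSizeStd140() == 48, "payload is twelve 4-byte scalars");
static_assert(sizeof(PointLightData) > 48, "the virtual table pointer makes the object bigger");

// Copies only the payload of a light into dst, skipping the virtual table pointer.
inline void copyPayload(const PointLightData& light, void* dst)
{
    assert(dst != nullptr);
    std::memcpy(dst, light.getDataPointer(), PointLightData::getDataSizeStd140());
}
```

Copying sizeof(PointLightData) bytes starting at the object's address instead would put the virtual table pointer at the front of the uploaded data and shift every member.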

Of course many things can be done to prevent and optimize this, but it would be exhausting to explain everything in such detail. I really try to point out the pitfalls of GLSL and C++ and then it's up to you not to fall into those traps. Sometimes such rules are a bit annoying, because you are convinced that everything is perfectly programmed and still the program does not work. And the reason might be something as subtle as the sizeof operator.


So after this whole journey, this is the result we get:

And I really think it's beautiful! For performance reasons, I have limited the number of lights to 20, but feel free to go beyond (I have tried 150 lights on my GeForce GTX 1070 Ti and still had 60 FPS with V-Sync on).

So that's it for today! I think it was a bit exhausting and a lot to process, so take your time to understand it all. This kind of knowledge cannot settle in your head in one day, so just let it sink in, and if you get lost after a while, come back and consult this tutorial again.
