Status report: GitHub, OpenGL 4.4 and multithreaded renderer

The first important change is the new home of AnKi. From now on AnKi’s source code lives on GitHub. On the technical side of things there is a brand new OpenGL 4.4 backend with a multithreaded renderer. In this new concept there is a dedicated thread exclusively serving OpenGL work. Other small features are the screenspace local reflections effect, the removal of C++ exceptions and some steps towards a complete removal of STL (new string, dynamic array etc).
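
The idea of a thread that exclusively serves OpenGL work can be pictured with a small command queue. What follows is only a minimal sketch under stated assumptions, not AnKi's actual backend: the names RenderThread, submit and runRenderThreadDemo are hypothetical, and plain lambdas stand in for real GL calls.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical sketch: one worker thread executes all queued "GL work".
// None of these names come from AnKi.
class RenderThread
{
public:
    RenderThread() : worker([this] { run(); }) {}

    ~RenderThread()
    {
        {
            std::lock_guard<std::mutex> guard(mtx);
            quit = true;
        }
        cv.notify_one();
        worker.join(); // The worker drains the queue before it exits
    }

    // Called from any thread: enqueue work for the rendering thread
    void submit(std::function<void()> cmd)
    {
        assert(cmd != nullptr);
        {
            std::lock_guard<std::mutex> guard(mtx);
            commands.push(std::move(cmd));
        }
        cv.notify_one();
    }

private:
    void run()
    {
        // A real backend would make the GL context current here
        for(;;)
        {
            std::function<void()> cmd;
            {
                std::unique_lock<std::mutex> guard(mtx);
                cv.wait(guard, [this] { return quit || !commands.empty(); });
                if(commands.empty())
                    return; // Quit was requested and nothing is left to do
                cmd = std::move(commands.front());
                commands.pop();
            }
            cmd(); // Execute outside the lock
        }
    }

    std::mutex mtx;
    std::condition_variable cv;
    std::queue<std::function<void()>> commands;
    bool quit = false;
    std::thread worker; // Declared last so the other members exist first
};

// Usage: every command runs on the worker, in submission order
int runRenderThreadDemo()
{
    int drawCalls = 0;
    {
        RenderThread rt;
        for(int i = 0; i < 100; ++i)
            rt.submit([&drawCalls] { ++drawCalls; });
    } // The destructor joins, so all commands have run by now
    return drawCalls;
}
```

The appeal of this layout is that only one thread ever touches the graphics context, while the other threads just record what they want done.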

For the future we have plans to add Windows support. Steps have been made with AnKi compiling on Windows. Unfortunately there are some bugs that we need to address before declaring full Windows support. Another planned feature is integration with the Newton Dynamics physics engine. Compared to Bullet, Newton seems less buggy and has more features, and most importantly for me its interface is superior (a simple C API). Other small features are an improved tile-based deferred renderer, improved transparency, cascaded shadowmaps, OpenGL ES 3.1 support and others.

C++11 spinlock

Apart from mutexes, spinlocks provide another way of protecting critical sections from multi-threaded access. Spinlocks perform the same job as mutexes but in a different way.

During mutex locking the OS may send the thread to sleep and wake it up when the lock can be obtained by this thread. Sending a thread to sleep and waking it are somewhat expensive operations but under some circumstances the OS may replace the waiting thread with another that has some meaningful work to perform. Spinlocks use another way of waiting called “busy waiting”. Instead of sending the thread to sleep, spinlocks “trap” the thread in a loop by constantly checking if the lock can be obtained. During busy waiting the thread is kept alive making it difficult for the OS to replace it with another one.

Mutexes are better for long locking periods whereas spinlocks are better for shorter ones. C++11 has built-in support for mutexes but not for spinlocks. Fortunately creating a spinlock class is quite easy.

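#include <atomic>

class SpinLock
{
public:
    // Busy-wait until the flag is acquired
    void lock() { while(lck.test_and_set(std::memory_order_acquire)) {} }

    // Release the flag so another thread can take it
    void unlock() { lck.clear(std::memory_order_release); }

private:
    std::atomic_flag lck = ATOMIC_FLAG_INIT;
};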

And an example/benchmark of using a spinlock is the following:

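#include <mutex>
#include <thread>
#include <vector>

SpinLock lock;
//std::mutex lock;
int count[8] = {0, 0, 0, 0, 0, 0, 0, 0};

// Each thread increments its own counter under the shared lock
void foo(int n)
{
    for(unsigned i = 0; i < 10000000; i++)
    {
        lock.lock();
        ++count[n];
        lock.unlock();
    }
}

int main(int, char**)
{
    std::vector<std::thread> v;
    for(int n = 0; n < 8; ++n)
        v.emplace_back(foo, n);

    for(auto& t : v)
        t.join();

    return count[0] + count[1] + count[2] + count[3]
        + count[4] + count[5] + count[6] + count[7];
}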

Running the above benchmark using the time command:

Mutex locking:
real 0m12.167s
user 0m7.760s
sys 1m20.169s

Spinlock locking:
real 0m10.925s
user 1m14.653s
sys 0m0.008s

It seems that for the above simplified scenario, the spinlock performs better and minimizes the time spent in the kernel.
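
Because a class with lock() and unlock() methods can be handed to std::lock_guard, a spinlock like the SpinLock above can also be released automatically at scope exit. The sketch below repeats the class under a different name so that it is self-contained; SpinLockSketch and countWithSpinLock are hypothetical names used only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Same idea as the SpinLock above, repeated so the sketch is self-contained
class SpinLockSketch
{
public:
    void lock()
    {
        while(flag.test_and_set(std::memory_order_acquire))
        {
            // Busy wait
        }
    }

    void unlock()
    {
        flag.clear(std::memory_order_release);
    }

private:
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
};

// Hypothetical helper: several threads bump one shared counter
long countWithSpinLock(int threadCount, long iterations)
{
    assert(threadCount > 0);

    SpinLockSketch lock;
    long counter = 0;

    std::vector<std::thread> threads;
    for(int t = 0; t < threadCount; ++t)
    {
        threads.emplace_back([&lock, &counter, iterations] {
            for(long i = 0; i < iterations; ++i)
            {
                // lock_guard calls lock() now and unlock() at scope exit
                std::lock_guard<SpinLockSketch> guard(lock);
                ++counter;
            }
        });
    }

    for(auto& t : threads)
        t.join();

    return counter;
}
```

Calling countWithSpinLock(4, 10000) should return 40000, since every increment happens while the lock is held.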

That’s all, happy coding!

Porting AnKi to Android

After months of careful planning, reading the GL specs and waiting for Google to add OpenGL ES 3.0 support, AnKi was finally ported to Android. This article presents the result and briefly expands on the challenges of porting a game engine to Android. More precisely, it contains three things: a video showing a demo running on an Android tablet, a pre-built .apk of that demo that you can download and install on your Android device, and a few thoughts on Android development and mobile GPUs.


The video shows a flyby demo running on a Samsung/Google Nexus 10 tablet equipped with a Mali T604 GPU. Please note that the demo is a bit slower because of HDMI. The resolution is 720p, the FPS (without HDMI) is ~17 and the lights are ~10 in the worst case.


Developing games on and for Linux/SteamOS

Linux, or GNU/Linux as some people want to call it, was born 20+ years ago as an open source desktop operating system and despite its massive success on super-computers, servers, embedded and mobile devices it didn’t manage to gain the same traction in the space it was bred for, the desktop. Many people have given valid arguments on why the desktop is a hard market to conquer, but deep down I believe that video games are one of the most commonly desired features of a home computing system. Despite some notable efforts over the years, Linux gaming never shared the same love as Windows PC gaming and console gaming, but that may come to an end mainly due to Valve’s efforts to steer developers and gamers to Linux. As a Linux (game) developer for the past 6 years I think I can shed some light on the pros and cons of that system and, who knows, maybe some people will find this reading useful.

To be able to see and understand the big picture of developing on and for Linux we first need to identify the smaller areas that compose it. Things like tools (compilers, debuggers, libraries), hardware abstractions (graphics and sound APIs) and finally maintaining multiplatform capabilities are some key elements of that process. Keeping the multiplatform aspect of things in the back of our heads is an interesting and sensible thing to do, mainly because with careful planning and not that much effort a Linux application can be ported to other operating systems, whereas the opposite may not be that easy.


Designing a new texture format

In this article we will be discussing the whys and whats of the new AnKi texture format. First we will try to shed some light on the current texture formats and their limitations, and later we will present the ankitex format in detail and how it solves these limitations.

The first thing we need to do is to shed some light on OpenGL hardware, OpenGL drivers and how they need their textures served.


Download the first AnKi demo: Hundreds of lights

After some years of development I am happy to present the first AnKi 3D engine demo. The demo is in alpha state and it’s very rough around the edges, so please bear with me and report any problems on the project’s Google Code page or use the emails in the contact page. Also note that at the moment it only compiles and runs on Linux.

And here is the download link:

To run the demo navigate to the directory that contains the demo64 binary and run that binary. The source is also included in case you want to build everything yourselves. Read the txt files in the directory for more info.

Some points on the tech behind the demo:

  • It uses the new tile-based deferred renderer.
  • You can visualize more than 100 lights at the same time.
  • You can see all the shadows of the spot lights.
  • It features post-processing effects like HDR, SSAO, color correction, a sharpen filter and others.
  • It takes advantage of multi-core CPUs.

What you will need to run the demo:

  • A Linux powered box. Ubuntu 12.04 and later are supposed to work.
  • The Linux box should be 64bit.
  • A modern GPU and driver that supports OpenGL 3.3 core profile. nVidia with proprietary drivers is supposed to work.

Controls:

  • ESC will quit the demo.
  • wasd moves the camera forward/back and left/right.
  • z moves the camera up.
  • space moves it down.
  • q and e roll the camera.
  • F1 will show the debug meshes.

Credits:

  • The building is “sponza” by Crytek.
  • The horse model is from Endre Barath (

Any feedback is appreciated. Enjoy!

Custom C++ allocators suitable for video games

In C++, allocators are special classes that practically do what their name suggests: they allocate and deallocate memory. They are used mainly by STL containers (vector, map etc) and they act as an interface/guide for the container’s memory management.

Allocators are a somewhat hidden feature that most C++ programmers don’t bother messing around with. In most cases the default STL allocator (std::allocator) is enough and it works just fine. For some specific cases though, where performance is essential, developers create their own allocators that work around some design problems and limitations the default allocator has.

This article presents some concepts relevant to game development in C++11. More specifically:

1. Create a custom allocator that resembles std::allocator
2. Why replace the default allocator?
3. Stack allocator
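
As a rough illustration of the first point, an allocator that resembles std::allocator needs surprisingly little code in C++11, because std::allocator_traits supplies the members that are left out. This is only a sketch under stated assumptions and not the allocator from the article: CountingAllocator, g_allocatedBytes and bytesUsedByVector are hypothetical names, and the byte counter exists only to make the container's requests observable.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical global counter, only here to make the calls observable
static std::size_t g_allocatedBytes = 0;

// Minimal C++11 allocator; std::allocator_traits fills in the rest
template<typename T>
struct CountingAllocator
{
    using value_type = T;

    CountingAllocator() = default;

    // Allows rebinding, e.g. from CountingAllocator<int> to a node type
    template<typename U>
    CountingAllocator(const CountingAllocator<U>&) {}

    T* allocate(std::size_t n)
    {
        g_allocatedBytes += n * sizeof(T);
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }

    void deallocate(T* p, std::size_t)
    {
        ::operator delete(p);
    }
};

// All instances are interchangeable because they hold no state
template<typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&)
{
    return true;
}

template<typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&)
{
    return false;
}

// Usage: the vector routes its memory requests through the allocator
std::size_t bytesUsedByVector()
{
    g_allocatedBytes = 0;
    std::vector<int, CountingAllocator<int>> v;
    v.reserve(16);
    assert(v.capacity() >= 16);
    return g_allocatedBytes;
}
```

Swapping the body of allocate() and deallocate() for a pool or a stack-like buffer is where a game-oriented allocator would start to differ from the default one.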
