Custom C++ allocators suitable for video games

In C++, allocators are special classes that do what their name suggests: they allocate and deallocate memory. They are used mainly by STL containers (vector, map etc.) and act as an interface for the container’s memory management.

Allocators are a somewhat hidden feature that most C++ programmers don’t bother with. In most cases the default STL allocator (std::allocator) is enough and works just fine. In some specific cases where performance is essential, though, developers create their own allocators to work around design problems and limitations of the default allocator.

This article presents some concepts relevant to game development in C++11. More specifically:

1. Create a custom allocator that resembles std::allocator
2. Why replace the default allocator?
3. Stack allocator
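As a rough illustration of the first point, here is a minimal sketch of a C++11 allocator that can be plugged into an STL container. It is not the allocator from the article: the names CountingAllocator, g_allocCalls, g_deallocCalls and fillVector are made up for this example, and malloc/free stand in for whatever memory source a real game allocator would use.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical counters, only here to make the allocator's activity visible
static std::size_t g_allocCalls = 0;
static std::size_t g_deallocCalls = 0;

// Minimal C++11 allocator. value_type, allocate and deallocate are the
// members that std::allocator_traits cannot supply by itself.
template<typename T>
struct CountingAllocator
{
    typedef T value_type;

    CountingAllocator() noexcept {}

    // Converting constructor, used when a container rebinds the allocator
    template<typename U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n)
    {
        // Guard against overflow of n * sizeof(T)
        if(n > static_cast<std::size_t>(-1) / sizeof(T))
        {
            throw std::bad_alloc();
        }

        void* mem = std::malloc(n * sizeof(T));
        if(mem == nullptr)
        {
            throw std::bad_alloc();
        }

        ++g_allocCalls;
        return static_cast<T*>(mem);
    }

    void deallocate(T* p, std::size_t) noexcept
    {
        ++g_deallocCalls;
        std::free(p);
    }
};

// The allocator has no state, so any two instances are interchangeable
template<typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept
{
    return true;
}

template<typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept
{
    return false;
}

// Fills a vector that uses the custom allocator and returns its size
std::size_t fillVector(std::size_t count)
{
    std::vector<int, CountingAllocator<int>> values;
    values.reserve(count);
    for(std::size_t i = 0; i < count; ++i)
    {
        values.push_back(static_cast<int>(i));
    }
    return values.size();
}
```

Calling fillVector(16) routes every allocation of the vector through CountingAllocator, and by the time the vector is destroyed each allocate call has been matched by a deallocate call.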


Status report: Tile based deferred shading renderer, Lua and external dependencies

Some of the planned features of AnKi were discussed in the previous status report. Fortunately some of them have been implemented, while others are still under way. The interesting ones are a new optimized renderer architecture that uses tile based deferred shading, the switch of the engine’s scripting language to Lua with a custom binder solution, and, the last mega feature, the rethinking of the external libraries that AnKi uses.

One of the main goals of the past months was to remove or revise some of the external dependencies that AnKi had. Libraries like boost, Python and SDL were making the building of AnKi a bit complicated, especially for some newly targeted platforms (ARM Linux, Android etc.). To solve that problem I first removed the pre-compiled external libraries (.so and .a) and created CMake build files for each of these libraries. Now, for example, the libpng source lies in its own directory (extern/png) and is connected to AnKi’s general build system. The good thing about this is that with only a C++11 compiler you can build AnKi down to the last bit, without the need to install or build anything yourself. The second step was to remove some of the external libraries that made life difficult and/or lacked a few features. Boost was replaced with C++11, Python with the lightweight Lua, and the much loved SDL with custom backends (GLX/X11, EGL/X11 and more will follow). To see the external libraries and the way they are organized, check out the source: svn co http://anki-3d-engine-externals.googlecode.com/svn/trunk

The last feature worth writing about is the new renderer architecture. The tile based deferred shading renderer boosted performance quite a bit; to be precise, it doubled the rendering speed. An article is planned that will explain the implementation AnKi uses, along with some benchmarks.

For the future:

  • Renderer: Move everything to using UBOs
  • Renderer: Re-enable HDR and SSAO
  • Core: Add loading thread that supports GL
  • Scene: Finalize the octree optimizations
  • Math: Make the library template based. Add NEON support
  • Add GLES 3.0 support
  • Add Android support

Status report: New scene graph, OpenGL, C++11 and other

It’s been a long time since we had a status report. This doesn’t mean that AnKi has been left behind; the new features are simply difficult to implement and require a long period of design, thinking and prototyping. One of the most important things being redesigned is the scene graph. It will move from a hierarchy based design to a component based one. This makes the engine very configurable and the process of adding new kinds of objects far easier. The downside of the scene graph reconstruction is that it requires a redesign of the renderer as well. Both of these modules are complex and changing them is very time consuming. At present the scene graph is finished but the renderer is not.

Another module that got a face lift is the OpenGL one. The idea is to make a wrapper that sits on top of the OpenGL API and offers an abstraction for different OpenGL implementations and later, who knows, maybe an API abstraction as well. The interesting things about the OpenGL module, though, are its thread safety and the numerous optimizations that minimize GL API calls. The Texture class in particular offers quite a few optimizations around texture units and how binding works.

Apart from the individual modules there is a broader change that affects almost everything: the move from C++03 to C++11. The idea here is to drop the boost library and make the engine more self-contained and less dependent on external libraries. This choice has introduced, and will keep introducing, quite a few problems though. The first is that the compilers are not that stable yet. Some of the features only live in the latest GCC and LLVM versions and I doubt they have been tested for production quality software. Another drawback is that some of the C++11 features have been implemented only in the latest versions of LLVM and GCC. This practically means that you won’t be able to compile AnKi without GCC > 4.7 and LLVM > 3.1. The last drawback is compiler support in general. At present only GCC and LLVM have good C++11 support. This is not bad because AnKi targets those two compilers, but it wouldn’t hurt to be compilable by Intel’s compiler as well.

Once the scene graph and the renderer are done, lots of changes and features are waiting in line:

  • Optimize the renderer. Removal of multiple render targets in deferred shading.
  • Implement the loose octrees for visibility tests. Also add and test occlusion queries.
  • Replace Python with Lua, preferably without Luabind (because it requires boost).
  • Remove SDL and write code for X11 directly. SDL still does not support shared context creation.
  • Introduce two OpenGL targets. OpenGL 4.x and OpenGL ES 3.x. Especially the ES one (far future)
  • Port the engine to Android with OpenGL ES 3.x hardware (far future)

That’s all.

PS: AnKi always seeks developers that want to help so if you are interested in contributing contact the main developer.

C++11: Variadic templates. Part I

UPDATE: Adding a new concept
UPDATE 24/Apr/2013: Fixing concept 3

The new C++ standard, namely C++11, is here at last, offering many additions to the language’s core as well as to its companion library, the STL. Without doubt it will change the way we think and work, but nobody can predict whether it is for better or worse. The experimentation period is nearly over: only a handful of features are missing from GCC and clang, and C++ engineers will have to learn and master the new tricks on both fronts (core and STL). For those familiar with the boost library the second front should be an easy transition to the new STL; for the first, though, we need tutorials and lots of them. This little article is one such tutorial that extends the already published ones.

One of the cool new features is variadic templates: simply put, templates with a variable number of template parameters. To put it into context, this is a variadic template:

template<typename... Types>
struct Foo{};
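To make the declaration above a little more concrete, the following sketch shows two things a parameter pack can do. It is an illustration and not code from the article: the typeCount member and the sum function are hypothetical additions. The recursion in sum is the usual C++11 way to walk a pack, since fold expressions only arrived in later standards.

```cpp
#include <cstddef>

// The struct from the text, extended with a hypothetical member.
// Types is a template parameter pack holding zero or more types.
template<typename... Types>
struct Foo
{
    // sizeof...(Types) yields the number of types in the pack
    static constexpr std::size_t typeCount()
    {
        return sizeof...(Types);
    }
};

// Base case of the recursion: a single argument is its own sum
template<typename T>
T sum(T last)
{
    return last;
}

// Recursive case: peel off the first argument and expand the rest.
// The result has the type of the first argument.
template<typename T, typename... Rest>
T sum(T first, Rest... rest)
{
    return first + sum(rest...);
}

static_assert(Foo<>::typeCount() == 0, "an empty pack is allowed");
static_assert(Foo<int, float, char>::typeCount() == 3, "three types in the pack");
```

With these definitions sum(1, 2, 3) expands to 1 + sum(2, 3), then to 1 + (2 + sum(3)), and the single-argument overload ends the recursion.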


Multithreading: Threadpool

Multithreading is a concept that has existed for many decades in computer science; nevertheless, only in recent years has it become a trend in game development, with the arrival of multi-core CPUs in our home PCs. In this small article we will discuss how AnKi utilizes the power of multiple processors. The source code of an example can be found at the end of the article.

The concept of multiple threads offers many advantages in applications that execute many uniform jobs at the same time. For example, a webserver needs to serve multiple hosts at the same time, without one host waiting for a previous request to finish. A game application, though, used to have a very standard and linear flow. For example, we first update the AI, then the physics, then we update the world, we do visibility determination and lastly we render; then we repeat the same again. Sometimes it’s difficult to execute some of these distinct steps in parallel because the data of the previous step will be used by the next. I bet high class development studios have found ways to blend these steps, but in AnKi we use a simpler approach: we use multiple threads to run uniform jobs in parallel.

One good example of how AnKi uses the power of multiple threads is the visibility determination algorithm. One step of visibility determination is testing every renderable scene node against the camera’s view frustum. If we have N nodes to test and M threads, we can assign roughly the first N/M nodes to the first thread, the next N/M nodes to the next thread, and so on.
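The N/M split described above can be sketched as follows. This is not AnKi’s code: Node, insideFrustum and testVisibility are hypothetical names, the frustum test is reduced to a single distance check, and for brevity the threads are created and joined on every call instead of being reused from a pool.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a renderable scene node
struct Node
{
    float distance; // Distance from the camera
    bool visible;   // Result of the visibility test
};

// Placeholder test. Real code would test the node's bounding volume
// against the planes of the view frustum.
inline bool insideFrustum(const Node& node, float farPlane)
{
    return node.distance <= farPlane;
}

// Splits the N nodes into contiguous chunks of about N/M nodes and
// tests every chunk on its own thread
void testVisibility(std::vector<Node>& nodes, float farPlane,
    std::size_t threadCount)
{
    if(threadCount == 0)
    {
        threadCount = 1;
    }

    const std::size_t nodeCount = nodes.size();
    // Round up so that the last nodes are not left out
    const std::size_t chunk = (nodeCount + threadCount - 1) / threadCount;

    std::vector<std::thread> threads;
    for(std::size_t t = 0; t < threadCount; ++t)
    {
        const std::size_t begin = std::min(nodeCount, t * chunk);
        const std::size_t end = std::min(nodeCount, begin + chunk);
        if(begin == end)
        {
            break; // No nodes left for the remaining threads
        }

        // Each thread writes only to its own range, so no locking is needed
        threads.emplace_back([&nodes, farPlane, begin, end]()
        {
            for(std::size_t i = begin; i < end; ++i)
            {
                nodes[i].visible = insideFrustum(nodes[i], farPlane);
            }
        });
    }

    for(std::thread& thread : threads)
    {
        thread.join();
    }
}
```

Because the ranges do not overlap, the threads never touch the same node, which is what makes this kind of uniform job easy to run in parallel.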


Screen Space Ambient Occlusion

“Ambient occlusion is a shading method used in 3D computer graphics which helps add realism to local reflection models by taking into account attenuation of light due to occlusion” [Wikipedia.org]

The proper way to do ambient occlusion is very expensive for today’s hardware, especially without the use of a ray-tracing renderer. For that reason a few new techniques were developed that try to produce the same result using simpler and faster algorithms. One of these approaches is Screen Space Ambient Occlusion (aka SSAO), which performs the ambient occlusion calculations in 2D space, just like a normal 2D filter. This article will not go deep into what SSAO is; there are many great readings around the web that cover the theory behind it. Instead we will jump into the implementation that AnKi uses.

There are quite a few techniques to produce SSAO. A top level way to group them is by the resources they need:

  • Some use just the depth buffer (Amnesia, Aliens VS Predator)
  • others use the depth buffer and the normal buffer (Crysis, gamerendering article)
  • and others use the pixel position and the normal buffer (gamedev article)

AnKi used to implement the second technique for quite some time. The results were good but apparently not that good, so for the past couple of weeks I’ve tried to implement the third technique by reading this great article from gamedev. The present article extends the gamedev one by adding a few optimization tips and by presenting the code in GLSL.

Below you can see the old and the new SSAO, rendered at half of the original rendering size with two blurring passes.

Old implementation of SSAO
New implementation of SSAO
The whole scene with the new SSAO


STL auto_ptr: A bad idea

The C++ Standard Template Library (STL) offers a smart pointer called auto_ptr. auto_ptr is a template class that acts as a wrapper for a single dynamically allocated class instance. Its purpose is to deallocate the memory when the smart pointer goes out of scope. It’s a way to do automatic garbage collection in C++ that saves us from writing extra code. One more reason to use auto_ptr is that it makes the code less error prone, simply because it doesn’t rely on the programmer to do the memory deallocation.

In this short article we will discuss a case where auto_ptr fails to do what it’s supposed to do….
