C++11 spinlock

Apart from mutexes, spinlocks provide another way of protecting critical sections from multi-threaded access. Spinlocks perform the same job as mutexes but in a different way.

When a thread tries to lock a mutex that is already held, the OS may put the thread to sleep and wake it up once the lock can be obtained. Putting a thread to sleep and waking it up are somewhat expensive operations, but while the thread sleeps the OS can schedule another thread that has some meaningful work to perform. Spinlocks wait in a different way, called “busy waiting”. Instead of putting the thread to sleep, a spinlock “traps” the thread in a loop that constantly checks whether the lock can be obtained. During busy waiting the thread is kept alive, making it difficult for the OS to replace it with another one.

Mutexes are better suited to long locking periods, whereas spinlocks are better suited to short ones. C++11 has built-in support for mutexes but not for spinlocks. Fortunately, creating a spinlock class is quite easy.

#include <atomic>

class SpinLock
{
public:
	void lock()
	{
		while(lck.test_and_set(std::memory_order_acquire))
		{}
	}

	void unlock()
	{
		lck.clear(std::memory_order_release);
	}

private:
	std::atomic_flag lck = ATOMIC_FLAG_INIT;	
};
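
Because the class exposes lock() and unlock(), it should satisfy the BasicLockable requirements and therefore work with std::lock_guard, which releases the lock automatically at scope exit. The following is a minimal sketch under that assumption. The class is repeated so the snippet is self-contained, and counterLock, counter, increment and runCounterDemo are hypothetical names used only for illustration.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

class SpinLock
{
public:
	void lock()
	{
		while(lck.test_and_set(std::memory_order_acquire))
		{}
	}

	void unlock()
	{
		lck.clear(std::memory_order_release);
	}

private:
	std::atomic_flag lck = ATOMIC_FLAG_INIT;
};

// Hypothetical shared state, for illustration only
SpinLock counterLock;
long counter = 0;

// Increments the shared counter "iterations" times under the spinlock
void increment(int iterations)
{
	for(int i = 0; i < iterations; i++)
	{
		// lock_guard calls lock() here and unlock() when it goes out of scope
		std::lock_guard<SpinLock> guard(counterLock);
		++counter;
	}
}

// Runs "threadCount" threads and returns the final value of the counter
long runCounterDemo(int threadCount, int iterations)
{
	counter = 0;
	std::vector<std::thread> threads;
	for(int n = 0; n < threadCount; ++n)
	{
		threads.emplace_back(increment, iterations);
	}
	for(auto& t : threads)
	{
		t.join();
	}
	return counter;
}
```

Using a guard instead of calling lock() and unlock() by hand means the lock is also released if the critical section exits early or throws.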

And an example/benchmark of using a spinlock is the following:

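#include <mutex>
#include <thread>
#include <vector>

// SpinLock is the class defined above
SpinLock lock;
//std::mutex lock;
int count[8] = {0, 0, 0, 0, 0, 0, 0, 0};

// Each thread increments its own counter, taking the shared lock every time
void foo(int n)
{
	for(unsigned i = 0; i < 10000000; i++)
	{
		lock.lock();
		++count[n];
		lock.unlock();
	}
}

int main(int, char**)
{
	// Start 8 threads that all contend for the same lock
	std::vector<std::thread> v;
	for(int n = 0; n < 8; ++n)
	{
		v.emplace_back(foo, n);
	}

	for(auto& t : v)
	{
		t.join();
	}

	return count[0] + count[1] + count[2] + count[3]
		+ count[4] + count[5] + count[6] + count[7];
}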

Running the above benchmark using the time command:

Mutex locking:
real    0m12.167s
user    0m7.760s
sys     1m20.169s

Spinlock locking:
real    0m10.925s
user    1m14.653s
sys     0m0.008s

It seems that for the above simplified scenario, the spinlock performs better and minimizes the time spent in the kernel.

That’s all, happy coding!

Porting AnKi to Android

After months of careful planning, reading the GL specs and waiting for Google to add OpenGL ES 3.0 support, AnKi was finally ported to Android. This article presents the result and briefly expands on the challenges of porting a game engine to Android. More precisely, it has three parts: the first is a video showing a demo running on an Android tablet, the second is a pre-built .apk of that demo that you can download and install on your Android device, and the third is a few thoughts on Android development and mobile GPUs.

Video

The video shows a flyby demo running on a Samsung/Google Nexus 10 tablet equipped with a Mali T604 GPU. Please note that the demo runs a bit slower in the video because of HDMI. The resolution is 720p, the frame rate (without HDMI) is ~17 FPS and the number of lights is ~10 in the worst case.


Developing games on and for Linux/SteamOS

Linux, or GNU/Linux as some people want to call it, was born 20+ years ago as an open source desktop operating system. Despite its massive success on super-computers, servers, embedded and mobile devices, it didn’t manage to gain the same traction in the space it was bred for, the desktop. Many people have given valid arguments on why the desktop is a hard market to conquer, but deep down I believe that video games are one of the most commonly desired features of a home computing system. Despite some notable efforts over the years, Linux gaming never shared the same love as Windows PC gaming and console gaming, but that may come to an end, mainly due to Valve’s efforts to steer developers and gamers to Linux. As a Linux (game) developer for the past 6 years I think I can shed some light on the pros and cons of that system and, who knows, maybe some people will find this reading useful.

To be able to see and understand the big picture of developing on and for Linux we first need to identify the smaller areas that compose it. Things like tools (compilers, debuggers, libraries), hardware abstractions (graphics and sound APIs) and finally maintaining multiplatform capabilities are some key elements of that process. Keeping the multiplatform aspect of things in the back of our heads is an interesting and sensible thing to do, mainly because with careful planning and not that much effort a Linux application can be ported to other operating systems, whereas the opposite may not be that easy.


Designing a new texture format

In this article we will be discussing the whys and whats of the new AnKi texture format. First we will try to shed some light on the current texture formats and their limitations, and later we will present the ankitex format in detail and how it solves these limitations.

The first thing we need to do is to shed some light on OpenGL hardware, OpenGL drivers and how they need their textures served.


Lens flare video

After some time spent on them, AnKi has new lens flare effects. See the video below:

Download the first AnKi demo: Hundreds of lights

After some years of development I am happy to present the first AnKi 3D engine demo. The demo is in alpha state and it’s very rough around the edges, so please bear with me and report any problems you find on the project’s Google Code page or use the emails on the contact page. Also note that at the moment it only compiles and runs on Linux.

Here is the download link: https://anki-3d-engine.googlecode.com/files/anki_hundreds_lights_demo.tar.gz

To run the demo navigate to the directory that contains the demo64 binary and run that binary. The source is also included in case you want to build everything yourselves. Read the txt files in the directory for more info.

Some points on the tech behind the demo:

  • It uses the new tile based deferred renderer.
  • You can visualize more than 100 lights at the same time.
  • You can see all the shadows of the spot lights.
  • It features post processing effects like HDR, SSAO, color correction, sharpen filter and others.
  • It takes advantage of multi-core CPUs.

What you will need to run the demo:

  • A Linux powered box. Ubuntu 12.04 and later are supposed to work.
  • The Linux box should be 64bit.
  • A modern GPU and driver that supports OpenGL 3.3 core profile. nVidia with proprietary drivers is supposed to work.

Controls:

  • ESC will quit the demo.
  • wasd moves the camera forward/back and left/right.
  • z moves the camera up.
  • space moves it down.
  • q and e roll the camera.
  • F1 will show the debug meshes.

Credits:

  • The building is “sponza” by Crytek.
  • The horse model is from Endre Barath (http://etyekfilm.hu).

Any feedback is appreciated. Enjoy!

Tile based deferred shading video

I’ve compiled a video that illustrates the engine’s capabilities when it comes to handling many lights. Enjoy.

Custom C++ allocators suitable for video games

In C++, allocators are special classes that do practically what their name suggests: they allocate and deallocate memory. They are used mainly by STL containers (vector, map etc.) and they act as an interface/guide for the container’s memory management.

Allocators are a somewhat hidden feature that most C++ programmers don’t bother messing around with. In most cases the default STL allocator (std::allocator) is enough and it works just fine. In some specific cases though, where performance is essential, developers create their own allocators that work around some design problems and limitations of the default allocator.

This article presents some concepts relevant to game development in C++11. More specifically:

1. Create a custom allocator that resembles std::allocator
2. Why replace the default allocator?
3. Stack allocator
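
To give an idea of what the first point involves, here is a minimal sketch of a C++11 allocator that a container can use in place of std::allocator. It is not the allocator from the full article and it is not a stack allocator: it simply forwards to malloc/free and counts the bytes in use. CountingAllocator, g_allocatedBytes and bytesReservedByVector are hypothetical names introduced only for this illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical global counter, only here to make the allocator observable
static std::size_t g_allocatedBytes = 0;

template<typename T>
class CountingAllocator
{
public:
	// The only member type a C++11 allocator is required to provide
	using value_type = T;

	CountingAllocator() noexcept = default;

	// Lets containers convert the allocator to another element type
	template<typename U>
	CountingAllocator(const CountingAllocator<U>&) noexcept
	{}

	// Allocates raw storage for n objects of type T
	T* allocate(std::size_t n)
	{
		void* ptr = std::malloc(n * sizeof(T));
		if(ptr == nullptr)
		{
			throw std::bad_alloc();
		}
		g_allocatedBytes += n * sizeof(T);
		return static_cast<T*>(ptr);
	}

	// Releases storage previously returned by allocate(n)
	void deallocate(T* ptr, std::size_t n) noexcept
	{
		g_allocatedBytes -= n * sizeof(T);
		std::free(ptr);
	}
};

// All instances share the same global state, so they always compare equal
template<typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept
{
	return true;
}

template<typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept
{
	return false;
}

// Returns the bytes in use while a vector holds storage for elementCount ints
std::size_t bytesReservedByVector(std::size_t elementCount)
{
	std::vector<int, CountingAllocator<int>> v;
	v.reserve(elementCount);
	return g_allocatedBytes;
}
```

The remaining parts of the std::allocator interface (rebind, construct, destroy and so on) are filled in by std::allocator_traits, which is why such a small class is enough for a standard container on a C++11 conforming library.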


Status report: Tile based deferred shading renderer, LUA and external dependencies

Some of the planned features of AnKi were discussed in the previous status report. Fortunately some of them have been implemented, while others are still under way. The interesting ones are three: a new optimized renderer architecture that uses tile based deferred shading, the switch of the engine’s scripting language to LUA with a custom binder solution, and the rethinking of the external libraries that AnKi uses.

One of the main goals of the past months was to remove/revise some of the external dependencies that AnKi had. Libraries like boost, Python and SDL were making the building of AnKi a bit complicated, especially for some newly targeted platforms (ARM Linux, Android etc). To solve that problem I first removed the pre-compiled external libraries (.so and .a) and created CMake build files for each of these libraries. Now, for example, the libpng source lies in its own directory (extern/png) and it is connected to the AnKi general build system. The good thing about this is that with only a C++11 compiler you can build AnKi down to the last bit without the need to install or build anything yourselves. The second step was to remove some of the external libraries that made life difficult and/or lacked a few features. Boost was replaced with C++11, Python with the lightweight LUA and the much loved SDL with custom backends (GLX/X11, EGL/X11 and more will follow). To see the external libraries and the way they are organized, check out the source: svn co http://anki-3d-engine-externals.googlecode.com/svn/trunk

The last feature that is worth writing about is the new renderer architecture. The tile based deferred shading renderer boosted the performance quite a bit. To be precise, it doubled the speed of the rendering. An article is planned that will explain the implementation that AnKi uses along with some benchmarks.

For the future:

  • Renderer: Move everything to using UBOs
  • Renderer: Re-enable HDR and SSAO
  • Core: Add loading thread that supports GL
  • Scene: Finalize the octree optimizations
  • Math: Make the library template based. Add NEON support
  • Add GLES 3.0 support
  • Add Android support

Status report: New scene graph, OpenGL, C++11 and other

It’s been a long time since we had a status report. This doesn’t mean that AnKi is left behind, but the new features are difficult to implement and they require a long period of design, thinking and prototyping. One of the most important things that are going to be redesigned is the scene graph. It will move from a hierarchy based design to a component based one. This makes the engine very configurable and the process of adding new kinds of objects far easier. The downside of the scene graph reconstruction is that it requires a re-design of the renderer as well. Both these modules are complex and changing them is very time consuming. At the present moment the scene is finished but the renderer is not yet.

Another module that got a face lift is the OpenGL module. The idea is to make a wrapper that sits on top of the OpenGL API and offers an abstraction for different OpenGL implementations and later, who knows, maybe an API abstraction as well. The interesting things about the OpenGL module though are its thread safety and numerous optimizations that minimize the GL API calls. The Texture class especially offers quite a few optimizations with the texture units and how the binding works.

Apart from the individual modules there is a broader change that affects almost everything: the move from C++03 to C++11. The idea here is to drop the boost library and make the engine more self sustained and less dependent on external libraries. This choice has introduced, and will keep introducing, quite a few problems though. The first is that the compilers are not that stable yet. Some of the features only live in the latest GCC and LLVM versions and I doubt they have been tested for production quality software. Another drawback is that some of the C++11 features have been implemented only in the latest versions of LLVM and GCC. This practically means that you won’t be able to compile AnKi without GCC > 4.7 and LLVM > 3.1. The last drawback is compiler support in general. At the present time only GCC and LLVM have good C++11 support. This is not bad because AnKi is targeted at those two compilers, but it wouldn’t hurt to be compilable by Intel’s compiler as well.
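
One way a wrapper can minimize GL API calls is to remember what is bound to each texture unit and skip binds that would change nothing. The sketch below illustrates that general idea only. It is not AnKi’s Texture class, it ignores thread safety, and driverBindTexture is a hypothetical stand-in for the real GL calls so that the snippet runs without a GL context. The unit count and all other names are assumptions as well.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for the real driver calls; it only counts invocations
static int g_driverCalls = 0;

void driverBindTexture(std::size_t /*unit*/, unsigned /*textureId*/)
{
	++g_driverCalls;
}

// Assumed number of texture units, chosen only for this example
constexpr std::size_t UNIT_COUNT = 8;

// Remembers which texture is bound to every unit and filters redundant binds
class TextureUnitCache
{
public:
	TextureUnitCache()
	{
		// 0 is treated as "nothing bound yet"
		bound.fill(0);
	}

	void bind(std::size_t unit, unsigned textureId)
	{
		if(unit >= UNIT_COUNT || bound[unit] == textureId)
		{
			// Invalid unit or already bound: do not touch the driver
			return;
		}

		driverBindTexture(unit, textureId);
		bound[unit] = textureId;
	}

private:
	std::array<unsigned, UNIT_COUNT> bound;
};

// Issues five binds and returns how many of them reached the driver
int runBindDemo()
{
	g_driverCalls = 0;
	TextureUnitCache cache;
	cache.bind(0, 7); // reaches the driver
	cache.bind(0, 7); // skipped, same texture on the same unit
	cache.bind(0, 7); // skipped
	cache.bind(1, 7); // reaches the driver, different unit
	cache.bind(0, 9); // reaches the driver, different texture
	return g_driverCalls;
}
```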

When the scene graph and renderer are finished there are lots of changes and features that wait in line:

  • Optimize the renderer. Removal of multiple render targets in deferred shading.
  • Implement the loose octrees for visibility tests. Also add and test occlusion queries.
  • Remove Python and replace it with LUA. Preferably without LUA bind (because it requires boost).
  • Remove SDL and write code for X11 directly. SDL still does not support shared context creation.
  • Introduce two OpenGL targets. OpenGL 4.x and OpenGL ES 3.x. Especially the ES one (far future)
  • Port the engine to Android with OpenGL ES 3.x hardware (far future)

That’s all.

PS: AnKi always seeks developers that want to help so if you are interested in contributing contact the main developer.