The speed fallacy of unity builds

Recently, I’ve been asked this question several times: “why do you have such a problem with unity builds”. I figured that the question pops up so much that it’s worth writing up. Now I’m not the only person with this opinion and there is an excellent post from 2009, covering most of the problems with unity builds:

What I do want to touch on is the mistaken perception that unity builds actually improve builds times and thereby increase productivity. At first glance you would see a large improvement to build times when doing a full build or a rebuild. And that is true, they do improve full build times. The problem is that I as a programmer don’t do full builds that often. I tend to do a full build only whenever I get a newer version of a code base and even then, that’s often just an incremental build based on the delta between my version of the code base and the latest version. I very rarely ever need to actually do a “rebuild” or full build of a code base as part of my day to day work.

What I actually do on a regular basis is: edit some code, compile, test it and repeat. That is the core workflow of most programmers independent of their industry. That is the workflow that we need to be optimizing and unfortunately unity builds are a huge de-optimization of that. Now why do I say that? Well let’s look a simple example, we have a project with 50 code files and as part of our task we needed to change 2 of those files. In a non-unity build we would change those files then recompile just those 2 code files and we are done. This is illustrated below where the red blocks are the recompiled code files.

In a unity build scenario, we are required to not only rebuild the code files we changed but also the code files that are unified with the changed files. This is illustrated below where we have taken a very conservative approach to unifying the code files by only grouping 10 files together in a unit.

As you can see for the two changed files we now have to compile the two full units which include all the code from the included code files as well as a lot of headers and templates. This is not efficient especially in the cases where the code you are editing is relatively decoupled from the rest of the project as unity builds will break those de-couplings. And the above scenario is very optimistic since in reality, to see the huge “benefits” of unity builds you would need to unify much larger numbers of files into the units. In previous experience, we had 100+ files lumped together into units. This meant each time I edited a file I recompiled the code for 99+ other files. THIS IS NOT FAST! In fact this is the exact opposite of fast, it is extremely frustrating and a massive waste of a programming time.

An Aside: Unity builds also tend to break a specific workflow of mine that improves my productivity. They prevent the individual compilation of code files. I tend to compile each file individually once I’ve made enough changes to them that I am happy and feel that I’ve reached the end of that iteration. The individual compile helps me to verify that my “includes” are correct as well as any template invocations I might have in the code. I do this via the ctrl-F7 shortcut in Visual Studio. So basically I tend to work on a file and then compile it once I am done, then I move onto editing the next file. I do this so often that it has become a nervous tick for me alongside my almost OCD level of hitting ctrl-s without noticing. The end result of this specific workflow is that I end up manually distributing the compilation of a given change over the course of my edition time. This means when I’m finally done with my changes to the project, I often don’t need to recompile much as I have already compiled everything (unless of course I’ve changed a header). This workflow is really nice since for a lot of day to day changes I don’t change public interfaces that much. Unity builds completely break that for me. Since now the code files are just references that go into a unit once you build the actual project, you can’t compile them individually anymore and in the cases where a build engineer has developed a workaround for that to work, the compiled objects aren’t used for the final build so it’s just extra useless work.

The unnecessary recompilation is a known problem with unity builds and one that even the unity build proponents will admit to. They do have a “solution” that “solves” the issue. Their solution extracts the edited file out of the unit once you edit it into a “working” unit. This means that subsequent edits to those files will only require a rebuild of the work unit. This does improve the situation slightly but it comes with its own set of problems. The first one being that it only helps the cost of the subsequent edits. Imagine that I don’t know in advance all the files that I need to edit, each time I edit a new file, I will have to recompile not only the work unit but also the original unit that contained the file.

So yes, while this does offer an improvement to iteration times over the previous approach, we still have to pay a significant cost each time we edit a new file. And I don’t know about you but I often don’t know in advance exactly which files I will have to edit and so I end up paying these costs. I will say that this approach does have one additional benefit over the previous unity build approach in that in moving the files into a smaller work unit you can validate that you are not missing any includes that might have gone unnoticed in the larger unit (note: this only kinda works since potentially the order of the cpp includes in the work unit might once again hide an inclusion issue).

This brings us to the second problem which is that the code you are compiling locally is not the same as what the build machine is compiling and this can result in frustrating issues since you will be generating a different set of compiled objects than the build machine. Obviously, you can then create a mechanism that once the code is tested, you can locally generate the same units as the build machine and then test that but this means practically a recompile of the project since potentially a lot of units will be affected and it obviously adds a completely unnecessary extra compile and testing phase (and cost) t to your development. And god forbid you find an issue that exists when merging the units, cause then you might as well join the circus with all the experience of jumping through hoops you’ll be getting.

So these are the relatively obvious issues with unity builds, so how about we cover a less obvious issue, one that is a real iteration killer: header files. Imagine the project setup highlighted below, we have a set of header files that are included by several code files across two projects. This is a pretty standard setup.

In a normal setup, if we change a header file in project A, then we will need to recompile the dependent code files in project A and in project B. So the change of the one header only results in the recompilation of 5 files. Now imagine the unity build scenario, we don’t really have control over how we group our code files into the units so we may end up with the following setup:

Now instead of having to recompile only the five files that depend on that header, we have a massive amount of work to do. And this problem only grows the lower down the header is in your project hierarchy. Even if you were extremely careful with your include patterns and limited the inclusion of a header to only the bare minimum number of files needed, it doesn’t matter unity builds will throw away all your hard work. In fact, you might as well not care about your inclusion patterns since that header will get included in a ton of places. In fact, in previous experience, in changing a lower level header, I ended up literally recompiling more than half of the entire codebase, this is simply insane. I almost see unity builds as a tool for lazy programmers that don’t want to have to think about their code architectures and include patterns. As far as I know there is no “solution” to the above problem. I can imagine trying to build up a database of all the code files in the code base, have a list of their include files, then do some sort of topological sort (assuming no cyclic dependencies between headers exist – LOL) to try to group the code files in units that minimize the unnecessary includes and so on. Realistically though, that’s not really feasible because to get the low granularity of code files needed for an effective unity build, you will have to have a massive amount of redundant includes.

I was previously working on a project where I had created two new projects (one dependent on the other) for my systems, I avoided using unity builds for them and enjoyed a good iteration time especially compared to my colleagues that were working in the rest of the codebase. Then an over-zealous build engineer moved my projects over to unity builds and I went from 5~10sec build times for a change to 45s-1m per change since now I pretty much ended up rebuilding the dependent project every time I changed a header in the base project.

When complaining about this, the build engineer just looked at me and said that yes it’s true that my iteration times are much worse now, but look at how much faster the projects compile on the build machine. I had no idea how to (politely) respond to that. From my point of view, the production costs in terms of hamstringing your entire development team’s iteration times greatly outweigh the cost of a slightly slower build on a build server.

And this is one of the biggest problem I’ve experienced in terms of dealing with the proponents of unity builds. Simply that they tend to only look at the rebuild times for the whole solution and completely forget that people actually work in that solution and don’t constantly rebuild it. Yes, unity builds really help the overall build time but they also really fuck up iteration times (not to mention that they actually break language features i.e. translation unit locals and anon namespaces). And I really hope that people will stop and think about what the cost in terms of productivity is, before blindly moving over to some sort of unity build system.

Synchronized Behavior Trees


I had spent the better part of a year and a half thinking about and researching what I would have want a next-gen AI framework to be. This framework was intended to live within the engine layer and provide a generalized architecture for writing and executing any kind of game AI. I think the best description of the intentions was that I was trying to build a “genre and game agnostic” AI system. My primary concerns with the framework’s design were ones of production. I don’t believe AI can continue to exist only in code and we need to embrace more data driven approaches in terms of authoring AI. In any case, this viewpoint is not shared by the vast majority of my peers which often makes discussions about this topic quite difficult but as part of this generalized framework research, I did a fair amount of work on behavior trees (BT). I went through numerous iterations on the core BT architecture. As part of this work, I came to some conclusion for overall decision making and behavior authoring. For the most part, this work was abandoned and I’ve now resigned myself to the fact that I will probably not get a chance to move forward with it. I feel the information is still useful and so I would like to share my ideas and conclusions in hope that it might trigger a discussion or inspire someone to do something awesome and crazy. This information sharing will be split into two articles, the first discussion about my final BT architecture and and the second will delve into my conclusion and why I feel that way.

NOTE: It is also extremely important to make it clear that that what I am about to discuss is not new techniques or ideas but rather an attempt to formalize some concepts and architectures of behavior trees. This information is necessary for the follow up article.


I don’t really want to rehash old information, so I will assume that anyone reading this is familiar with the basic concepts of the behavior trees. If not I would direct you to the following resources:

So now that that’s out of the way then let’s move forward. The first thing you might notice when discussing BTs is that a lot of people will refer to them as reactive planners. What they actually mean by reactive planner is simply a “BIG IF STATEMENT”. The reactive part means that each time you evaluate the tree, the result could change based on the environment i.e. it will react. The planning part simply implies that for some given set of values, the conditions for the tree, it will select some authored plan of actions to execute. I’m sure a lot of programmers (and potentially academics) are probably raging at my gross over simplification of BT so let me give a basic example:


Traditionally, behavior tree evaluation starts at the root, and proceeds, depth first, down the tree until it reaches a successful node. If an environmental change occurs that the affects the tree, it will be detected on the next evaluation and the behavior changed accordingly. This implies that you need to be careful when authoring the tree to ensure higher priority behaviors are placed before lower priority behaviors so that we get the proper “reactivity”.

One thing that initially bothered me when looking at various behavior tree implementations was that whereas I saw then as tool to author actual behaviors, a lot of programmers saw BTs as a way to describe the transitions between behaviors. What I mean by is that the granularity of actions in their trees was quite low. You would see leaf nodes such as “FightFromCover”, “GoToCover” or even worse stuff like “RangedCombat” and “InvestigateSuspiciousNoises”. That’s not to say its always the case, I have seen examples of BT with a much higher granularity in their nodes such as “moveto”, “shoot”, “reload”, “enter cover”, etc… This, at least for me, seemed like a more attractive alternative as it allows designers to actually prototype and author new behaviors without depending on a programmer to build the behavior. I envisioned a workflow when game designers could to a large degree script with BTs. The reality is that a lot of newer designers in the industry are not able to script using code and since I don’t want to artificially lock them out, I wanted to provide a toolset for them to be able to have some degree of autonomy with their prototyping and designs.

There is one downside to increasing the granularity: massive growth of the tree size and this can have a pretty bad effect of the performance profile of a tree evaluation especially when in the lowest priority behaviors. Given the simple tree below, to reach our currently running behavior, we needed to evaluate numerous nodes earlier in the graph. For each subsequent update of the behavior we still need to traverse the entire tree, checking whether anything has changed and whether we need to switch behavior. With massive trees, the cost of this traversal can quickly add up. This cost is especially depending on the cost of your condition checks as they will make up the bulk of the operations performed. I hope the irony that the most expensive BT update occurs when the agent is doing nothing, is not lost on you.

RANT: To be frank, I find a lot of performance discussions regarding BT’s to be misleading. I would love to be in a situation where the most expensive thing in my AI update was the actual tree traversal and not the work being done by the nodes. The condition nodes alone, will almost certainly incur cache misses when querying your knowledge structure (be it a blackboard or whatever) as will any calls into other game systems. Barring some sort of idiotic implementation, I think that your time optimizing a BT is better spent in minimizing the cost of the work being done by the nodes and not the traversal of said nodes.


Naively, my first thought to get around this problem was that I didn’t need to re-evaluate the entire tree each time. I could simply just resume evaluation at my last point. This is similar toward AIGameDev’s “Event driven behavior trees” but there are some problems with this approach, some that are not covered in the AIGameDev ( materials. The main problem is that you lose the reactivity of the tree, what I mean by that is that you will only ever react to new events only once the tree finishes an evaluation i.e. a behavior fully completes. It’s also unclear what happens in the case of parallel nodes, as we could at any given point be running N behaviors in parallel across multiple branches of the tree.

Stored State Behavior Trees

After some thought, I decided that there was no way around repeating the evaluation from the root each frame without losing basic functionality. Instead, what I decided to do to limit the amount of work done, by storing the state of each node across subsequent evaluations. I would store the state of each node as it was calculated, then on subsequent evaluations, I would skip all the nodes that had previously completed (succeeded or failed). The stored state of the BT nodes would only be reset once the tree fully completes the evaluation. For clarity’s sake I’m gonna refer to these technique as a Stored State BT (SSBT).

At this point, I’m sure a lot of you are asking yourselves how is this any different from “Event Driven BTs”? Well, let work through an example and the differences should become clear. Given the following tree allowing our agent to investigate a sound:


In the traditional BT model, we will evaluate the entire tree each AI update which means we will detect the noise as soon as it happens. Great so what happens with my approach? Well everything works great on the first AI update. We correctly select the smoking idle behavior and we store each node’s state. The tree state looks as follows:


On the second update, nothing has changed, so we evaluate the tree skipping all nodes that have completed (i.e. succeeded or failed) and we re-enter the smoking behavior and update it. So far so good. On the third update, a noise is detected, but since our evaluation skips completed nodes, we never enter the first sub-branch until the smoke behavior completes and so we will not react to the sound. This is obviously broken.

Before I discuss a solution I also want to point out a big problem with standard BTs. Let’s take a look at the middle branch (the actual investigation behavior). Each time we evaluate the tree, we will detect that we have an investigation target and run the “move to investigation target” behavior and it is when that behavior completes that things start becoming nasty. Not only are we running the “has investigation target” check each subsequent update but now we also need to verify that we are in fact at the correct required position to be able to progress past the “move to investigation target” node in the tree. Now imagine that the “investigate” behavior actually moves the character. Next time you update the tree, the move to investigation target kicks back in and tried to the adjust the position of the character back to the required point. Congratulations! You are now officially an AI programmer as you have encountered the dreaded behavior oscillation problem. There are a bunch of ways to fix this but all are, in my opinion, workarounds and hacks due to sub-optimal decision making. One of my biggest problems with traditional BT is that most people simple treat them as a glorified decision tree, anyways I’m getting side-tracked but this problem is a very common one, and one that’s not really mentioned in a lot of the BT literature 😉

Back to my broken BT model! A naïve approach to fixing the problem that we don’t enter the first branch (the first selector node) could be to put a “repeat” decorator on it. A “repeat” decorator will cause its decorated node to be reset and re-evaluated each AI update. This decorator is entirely useless in the traditional BT approach but does some value with the SSBT model. Does it fix our problem? Well, sort of, it does fix it but only in that specific case. Consider the following modification to the tree:


The repeat decorator was moved to after the first selector node and a sequence was added before it. Granted this example is bit contrived but it does highlight a very real problem. As a result of this setup, after the first evaluation the first sequence node’s status is set to “failed” and on the subsequent evaluation the repeat node will not be reached so we are back to the original problem. The solution I found this problem was actually quite simple. I created a new decorator called a “Monitor”. This decorator had an interesting behavior in that when encountering a Monitor decorator, we would evaluate its child node and store the result of that evaluation (just as before) but the monitor would also register itself to a “monitored nodes list”. Due to the depth first nature of the tree, we would only register monitor nodes with a higher priority that our current branch in the tree. On the AI update after we registered the monitor nodes, we would evaluate all registered monitor nodes in the tree before the tree evaluation step. This simple mechanism allows us to maintain the reactivity of the tree but also prevents us from doing a lot of redundant work. The monitor concept is show in the figure below.


As with any technique, there is a bit of catch, well not a catch but rather that you need to change your way of thinking about the behavior tree. The monitor node concept is powerful but there are some more details concerning the monitor nodes and the SSBT model that need mentioning. Remember I mentioned that we have the concept of resetting the tree? This is something similar to the “OnTerminate” call that Alex shows in his BT video. We reset nodes primarily when the tree has been fully evaluated i.e. all active leaf nodes complete, in that case we call reset on the root node which traverse the tree and reset of all child node that require resetting.

There is also the issue of the resulting state of the monitor nodes. Since we evaluate the list of monitored nodes before we evaluate the tree (essentially evaluating them as standalone trees) what do we do with their return state? Well, this is where the reset comes in. If a given monitored node succeeds, it will return an interrupt result which will cause us to reset the tree and restart the evaluation. This allows us to maintain the reactivity of the tree but also gives us the added benefit of clearing the entire state of the tree, more on that in a bit. Since monitor nodes can reset the tree and will always be evaluated, I would strongly suggest that no actual behavior be placed within them. I used the nodes to perform knowledge and world state queries to see if anything had occurred that I needed to react to. I also tended to use generalized cases for the monitor nodes I had i.e. did I hear a new sound of some class, did I see something I need to react to, etc… In my opinion, the monitor decorator tends to give the AI programmer/designer a bit more control over the performance characteristics of the tree than before.

It’s also worth getting into a bit more detail with regards to the reset mechanism of the BT. Resetting a node allows us to actually perform some state clean up operations before re-evaluating the tree. I found this concept extremely powerful in that it allowed me to have synchronicity in actions for the nodes. What I mean by that is that, for example, certain action nodes would change the agent’s knowledge i.e. a talk node would set a “isSpeaking” flag, or a movement node would set a “isMoving” flag and so on. This meant that during the reset of the tree, each node could clean up after themselves in a sensible manner i.e. clearing any flags it had set or sending termination commands to the animation layer.

A good example of this would be if we were interrupted during a full body interaction with the environment. We can, during the reset of the tree, send an interrupt command to animation layer and have it sensible blend out of the act. This would normal have been achieved with checks at a higher level in the tree i.e. if (in animation && animation can be interrupted ) then interrupt. With the reset concept we get two benefits for free:

  1. We don’t have to worry about interrupting behaviors since they can clean up after themselves removing the need for additional checking earlier in the tree. This really helps in reducing issues arising from bad state due to tree interruption
  2. All clean-up is localized to the nodes, each node knows exactly how to clean up after itself. Meaning that you can really have modular nodes that are plug and play within the tree, without having to worry about their potential effects in the context of multiple traversals.

Honestly, I think the reset concepts was one of the nicest things about this BT and the resulting implementation code ended up being really simple and elegant.

Lastly, the SSBT model has one additional benefit, it actually solves the oscillation problem we mentioned earlier. Since we store the state for a node, once the “move to investigation target” behavior completes, we never have to run it again. Since we track the state, we know that we’ve completed it and can simply let the “investigate” behavior take over. I found this to be quite an elegant solution over the alternative.

So now we find ourselves back to the same functional behavior as with the original BT model. In my opinion, I feel that the SSBT model actually has some functional and authoring improvements over the traditional model but hey, I’m biased! As I mentioned earlier, I believe that any performance problems with BTs are a result of the work done by individual nodes rather than due to the traversal of the actual tree structure. I also feel that the stored state evaluation approach really goes a long way to limit the amount of work needed to be performed during each tree evaluation.

Note: Maybe some of the keen readers have noticed what the SSBT model has actually resulted in? It wasn’t immediately obvious to me either. When I finally realized it, I was shocked. Don’t get me wrong, it works exactly as intended and I really liked the elegance of the solution and implementation but it led me down a rabbit hole that change my entire perspective of the topic.

Synchronized Behavior Trees

So let’s actually get to the synchronized part of the post title. One thing I didn’t mention was that, I only starting playing with the evaluation model after I had already done some of the work discussed in this part, it just made more sense for the writeup to discuss it in this order.

When I started thinking about the AI framework, my target platforms were both current and next gen. As such memory savings in the order of a few MBs were still really important. Now behavior trees are probably not the most memory intensive things you’ll encounter but I also had the idea of having all of crowd run in the new AI framework. In fact, I wanted to not make any distinction between crowd and NPCs from the AI’s perspective. As such, I was considering having several thousand behavior trees instantiated simultaneously. Let’s say I wanted 5000 characters, and each tree took 3kb, that’s around 15MB of data, there is no way that memory budget would ever be approved on 360/ps3 platform. So decided to share the same tree across all agents. This approach has already been touched upon with Alex’s data oriented trees.

My approach was pretty simple, each BT node would have a method called “GetMemoryRequirements” that would return the memory size needed to store its require instance state data, as well as all the memory needed for the node instance data of its children. I would then, for a given tree,  calculate the required space needed to store all of the node’s instance data for an agent by calling “GetMemoryRequirements” on the root. I also made the choice then to tightly pack the memory needed into one big block, then I would store each node’s index into the memory block and use that to retrieve the relevant instance data for a node during traversal.  During tree evaluation, I would pass in the agent’s node instance data for the specific tree and then each node would index into the memory and pull its state data as required. As it turned out, I didn’t need a lot of data per node, I think my heaviest node was around 16bytes, with the vast majority weighing in at around 4bytes. Now this in itself is nothing special but only having a single instance of the tree which all agent were evaluation gave me an idea.

Now it is also worth mentioning that the approach I took with designing the BT was quite animation inspired and so I made use of child trees quite extensively. A child tree is simply a behavior tree that is injected into another tree at runtime. All of our tree were statically allocated so by injected, I simply mean that we would store a pointer to the child tree in a “child tree connector” BT node. During evaluation, we would retrieve that pointer from each agent’s node instance data and trigger an evaluation of that child tree with the agent’s node instance data for the child tree. We would not actually dynamically compose trees in any way, all BTs had a fixed memory size. The child tree concept also allowed us the ability to have multiple agents running different child trees from within the same parent tree connector node. I relied heavily on this mechanism to inject contextual behaviors originating from the environment.

Coming back to the synchronized trees, I will, once again, use a contrived example to discuss the concept. You will notice the use of heavyweight behaviors like “RangedCombat” in the example tree, these are simply examples and I don’t recommend using that level of granularity in your BTs. Anyways, so given the very basic combat tree below, we immediately realize that we will encounter some problems when multiple agents run this tree. Can anyone see what they are?


When we have multiple agents running the tree, they will all throw a grenade if they can and will taunt the enemy if they can. Imagine a battle with 50 enemies, you probably don’t want 50 grenades in the air and 50 taunts at the same time. Design would immediately want to limit the number of agents that can run these behaviors and maybe add some cool downs. A potential quick fix would be to use the blackboard/knowledge. You would give the node an ID, then keep track of how many agents are using it via a counter in the blackboard. The same approach can be used for the cool-down values. I’ve seen other even more convulated appraoches based on specific knowledge flags or even using external systems as synchronization points. Then comes the matter of thread safety and other concurrency issues which is a whole other discussion. In any case, the main problem I find with a lot of these fixes is one of visibility and accessibility, some of them lock out designers from the equation while others make the code fragile since when reordering the tree, the blackboard flags may not be correctly set/unset. In fact, since I believe that the tree should be used to author behavior with a very high granularity, I also believe the tree should be used to solve exactly those problems. So like, I said having a single tree gave me an idea: “what’s stopping us from having a global tree state and using the tree as a synchronization object?” Taking that idea a bit further, we end up with the following tree:


We can use what I termed “Gate Nodes” to help with these problems of synchronization. I was inspired by the talk given by the Yager guys at the Vienna AI game conference, and I borrowed the name from them. I’m not sure that their nodes operate in the same manner as mine but I really liked the idea and the name. The basic premise is that with these “gate nodes” you can globally lock down portions of the tree. The gate nodes have global state which is stored in the tree as well as potentially having agent specific state. They contain a semaphore which allows us to safely control access to sub-branches in the tree. This means that I can now have “Allow Only X agents” nodes in the tree which means that I can limit behaviors globally to only a certain number of agents. As each agent evaluates a gate node, they will try to take a lock on the node, if they succeed they will enter that branch. For example, if I only want to have a single grenade being thrown at any time or only have 3 guys shooting at me while the rest of the enemies charge me then it is relatively easy to do now. And best of all, this can be done without having to touch any agent knowledge or build any external machinery and its clearly visible in the behavior tree.

Furthermore the global state can be used for things like global cool-downs, giving us the flexibility to easily distinguish between local agent cool-down (i.e. special abilities) and global cool-downs (i.e. combat gameplay control by limiting how often grenades are thrown). Best of all, since we only ever have a single instance of each BT, these global locks will also work for all child trees, no matter when and where they are injected, since all child trees are also shared.

I was also playing around with one last use case of the synchronization ability of these trees which i found to be quite elegant. I was using the global tree state to actually coordinate group behaviors. Let’s take a look at the following example:


By using global tree state for synchronization, we can restrict certain behavior branches to only execute when we have exactly 2 agents. We can then assign roles to the agents (for clarity’s sake I’ve left out the conditions for the role assignments). The agents can then perform the first part of their behaviors and once each agent completes they will enter a waiting loop after which we can synchronize their state again. After which they can perform their follow up behaviors. We also have the option to reset the tree for both agent’s if one agent breaks out of the behavior for any reason (i.e. due to an environmental interruption) but I will leave the how as an exercise to the readers in an effort to wrap up this document.

NOTE: One thing that’s necessary to mention is that the above behavior tree would be extremely difficult to construct without using the SSBT evaluation model. In fact, it would be such a pain that that I wouldn’t even recommend trying it.

I personally feel that having global tree states is an incredible tool to have when building behaviors. Unfortunately, I have met a lot of opposition to the idea. In fact, when the BT was evaluated by someone else, the very first thing they did was remove the child tree feature and the global tree state. I don’t know, maybe I’m just an idiot and am missing something? I still believe it to be a good idea and this is in part why I feel that I should write this post. If you disagree or spot some obvious problems, please let me know.


Now in discussing this article with several people, it seems that there is a bit of misunderstanding about what I’m trying to say. I’ve come to the opinion that fundamentally there are two ways to look at behaviors trees:

  1. As a tree of behaviors, where the tree represents the decision making structure (i.e. a decision tree)
  2. As a tree that describes a specific behavior i.e. the set of steps need to perform said behavior.

Personally I think of BTs in terms of 2, this is where they really shine! I think of a behavior as a visual scripting language and I think its a very elegant productivity solution to use BTs as such and this is context for the material I covered in this article. As for using BTs as a tool for defining a characters overall behavior, I feel that BTs are a sub-optimal solution to the problem and I would never recommend them for that. This is exactly what the topic of my next article will be about. What are the problems associated with trying to build a full character behavior with just a BT, and why using BTs is potentially a bad idea.

THANKS: Big thanks to Mika Vehkala for his follow up on this post and our nice long Skype discussion which highlighted some potentially ambiguous portions of the article.

A Guide to Higher-Education for Aspiring Game Programmers

DISCLAIMER: I wrote this piece for a South African Game Development Magazine prior to leaving South Africa. It has been some time since I send the piece in for copy-editing and have heard nothing back regarding it so I’m simply going to post it here. The topic is a matter of some debate and the below article is simply my personal opinion! It was mentioned during the initial review by the game development magazine that certain sections of this article are rather inflammatory and we had agreed to remove them from the final piece. Seeing as this is my personal blog I figured I might as well post the entire un-edited version.

Once Again this is simply a personal opinion and should be taken as such!


When asked to write an article for a local game dev magazine, I was initially apprehensive as writing is not exactly a strong point of mine. It was only once I realized that the article might actually benefit some prospective game programmers that it was well worth the effort. That being said trying to find a topic for the article proved quite challenging. My initial plan was to write a brief article describing the basics of vehicle steering and waypoint following. That idea got scrapped once I built a simple test bed and I realized that I don’t know nearly enough about physics of car motion to write an in-depth article on the topic.

While I was still deciding on a topic, huge discussions (read arguments) started popping up in various game dev communities I followed, all based around the same theme – education:

There seems to be quite a lot of debate on what prospective game programmers should study as well as what current college curricula should include. The most common questions still being asked by prospective game programmers are: “What do I need to study to become a game programmer?” and “What do I need to do to help me land a game industry job as a programmer?”. It is these questions that I wish to address in this article. Continue reading

A wannabe game developer no more

So after all these years of hard work, I’ve finally managed to get into the game industry and into a well known AAA studio to boot. I’ve accepted an offer from IO Interactive (Hitman, Kane & Lynch) who are based in Copenhagen, Denmark. I will be moving to Denmark and starting at IOI in August!

This whole thing still feels a bit like a dream. I cant believe that I’ve finally managed to achieve what honestly felt like an impossible goal when I set it all those years ago. I guess it goes to show that with hard work anything is possible. As excited as I am, I cant help but feel a bit nervous. I’m a little fish getting thrown in the deep end. Then again, I dont think I’d want it any other way. I’ve been dying to improve and to learn and I’m finally going to be working with people that I can really learn a lot from and I honestly cant wait for that!!! To everyone that’s helped me along the way, you know who you are, thank you so much! I honestly dont think I’d have gotten this far without your encouragement and support.

On the downside, I dont know how active this blog will be once I start working but I will do my best to try and keep the content coming. Game dev blogs have been a huge help over the years and I would love to be able to pay the favor forward!

Pathfinding Thesis Complete

WOOT! My thesis, the bane of my existence for the last 2 years, is finally done! Its basically a review of the video game pathfinding field as well as presenting a novel grid map search technique: the spatial grid A*. The version linked below is the final draft that is being submitted to my faculty.

A Robust Explosion Hit Check Technique

In a recent technical interview I got asked the following question: “if an explosion occurs i.e. from a grenade, how would you determine which characters in the game are affected”. This was a question that I couldnt answer at the time which annoyed the hell out of me and it’s been sitting in the back of my head for the last few weeks. So I decided to discuss it (and my proposed solution) on my blog as an “academic” exercise. I would really appreciate any feedback or comments on this discussion. This is far from being a solved problem for me and while my solution may potentially work quite well, there may be a simpler technique available which I don’t know about.

Coming back to the question. Well the naive answer to that question is to simply do a collision detection check with the explosion area of effect/blast radius (sphere) and any nearby characters’ bounding boxes. Continue reading

A sample scene drawn from the viewpoint of the camera

DirectX10 Tutorial 10: Shadow Mapping Part 1

I’ve had some downtime lately and seeing as I wrote a basic shadow mapping demo, I figured I’d write a short tutorial on the theory and implementation of shadow mapping. Shadow mapping is one of those topics that tends to get explained in a overly complicated manner when in fact the concept is rather simple. It is expected that you understand the basic of lighting before attempting this tutorial, if you want to learn more about some basic lighting models please read my lighting tutorial. The figure below shows a sample scene with a single light illuminating a cube.

How shadows are formed

Continue reading

Optimizing the A* algorithm

So I’ve recently completed my MSc thesis on video game pathfinding and I guess it’s a little weird for someone who spent the last year focusing on game AI and pathfinding to not actually spend much time blogging about it. I figured that I spent the time today and write a short post on optimizing the A* algorithm. The A* algorithm pretty much sums up video game pathfinding as a whole. Even advanced techniques like hierarchical pathfinding algorithm make use of A* in searching the various abstraction levels.  So today I’m going to just discuss optimizing the algorithm, not a low level implementation but rather the some of the high level issues. I’m assuming that readers will have some degree of familiarity with the A* algorithm so I’m not going to waste time explaining it.

A*’s computational cost is primarily divided between two components, the heuristic function and the open list and so these are the components that I’m going to focus on. Continue reading

Debugging HLSL

A lot of guys have asked me for advice on developing and debugging shader programs. Well, it’s a tricky topic to deal with. Tto be able to fully debug shader programs you will need either a shader IDE like FXComposer or a GPU debugging tool like Nvidia Nsight. These are both complex tools and beyond the scope of a quick tutorial but what I can do if provide you a quick guide to aid you in writing shaders directly within Visual Studio. You will not be able to perform any sort of in-depth debugging, but it will help you deal with silly syntax errors. The first thing need is NShader. NShader is a shader syntax highlighting plugin for visual studio and helps with clarity when editing and writing shader programs.

The second thing is to create a custom build step within Visual Studio for your shader programs. This custom build step will use the MS shader compiler to compile your shader programs and notify you of any errors as well as tell you on which lines the errors can be found. To do this, we first select our shader file and right-click then select properties (see figure 1 below).

Figure 1: Shader Code File Properties

Doing so will bring up the properties windows, the first step is to ensure that the configuration drop down is set to “all configurations”. The select Item Type and choose “Custom Build Tool” from the drop down (see figure 2).

Figure 2: Enable Custom Build Tool Step

Click Apply, this will then show the custom build menu tab on the left hand side. Select the tab and you be presented with the following dialog window:

Figure 3: Set Custom Build Tool Parameters

Under the general heading, set the command line value to the following:

"$(DXSDK_DIR)Utilities\bin\x86\"fxc.exe  /T fx_4_0 /Fo "%(Filename).fxo" "%(FullPath)"

This will mean that every time the shader file is modified and the solution is compiled, the shader file will be compiled using the microsoft FX compiler (FXC). The /T flag specifies the type of shader file being compile (i.e. the HLSL version). the /Fo flag refers to the compiled output file and the %(FullPath) macro refers to the full path of the current shader file.

Also set the Custom Build Tool outputs to: $(filename).fxo , this is the same as specified in the FXC commandline. Click OK, and you are done.

The results of the this process is shown below, all HLSL errors will pop up in the Visual Studio Error Dialog, double clicking on the error will take you to the line of code that caused the error.

Figure 4: The results of the Custom Build Step

I’ve wasted a ton of time attempting to find silly syntax errors when developing shader programs, and this little custom build step has been a great help. I hope it helps you in some way as well…

Weapon Modification in Games

I recently played through Fallout: New Vegas and couldn’t help but feel that while the whole weapon modification system was a nice addition, it was a pointless one. I, as much as anyone, love some good gun porn and being able to modify my weapon is a welcome addition to any game but in fallout:new vegas there really wasnt any reason to bother with it.

So what did I find so wrong with the weapon modification system in new vegas? Well two things, the first being that unique weapons exist in the game world. These unique weapons are far superior to their fully upgraded standard counterparts which makes you wonder why should I bother with the upgrades? You’ll spent a ton of cash buying the upgrades just to end up with an inferior weapon. Honestly, if designers want unique weapons, I think they need to either remove the entire modification system (and offer more unique weapons) or make the unique weapons upgradeable as well. Hell, make unique weapons unique by adding smaller perks like offering increased criticals or accuracy due to improved internals or improved reliability or something along those lines. That way the the player still has the option of modifying the unique weapons as well and not lose any of the unique features. Basically what you’d be doing is offering the player an improved base platform for the weapon modifications.

Hunting Rifle Upgrade Options from Fallout:New Vegas

Continue reading