# Optimizing the A* algorithm

May 2, 2011 3 Comments

So I’ve recently completed my MSc thesis on video game pathfinding and I guess it’s a little weird for someone who spent the last year focusing on game AI and pathfinding to not actually spend much time blogging about it. I figured that I spent the time today and write a short post on optimizing the A* algorithm. The A* algorithm pretty much sums up video game pathfinding as a whole. Even advanced techniques like hierarchical pathfinding algorithm make use of A* in searching the various abstraction levels. So today I’m going to just discuss optimizing the algorithm, not a low level implementation but rather the some of the high level issues. I’m assuming that readers will have some degree of familiarity with the A* algorithm so I’m not going to waste time explaining it.

A*’s computational cost is primarily divided between two components, the heuristic function and the open list and so these are the components that I’m going to focus on.

# The Heuristic Function

The heuristic function returns an estimate of the distance between a node and a goal node. It is a well-known property of A* that when an admissible heuristic is used then the resultant paths are optimal. An admissible heuristic is simple a heuristic that underestimates (or is exactly equal to) the distance to the goal. So the cheaper the heuristic function the better ;)

Optimization Complete!! haha. This might not seem obvious but in some cases the vast majority of A*’s computational time is spent in the heuristic function. Making that function as lightweight as possible is critical. The standard euclidian distance calculation can be rather expensive due to the square root operator and can be approximated by various methods (keep in mind that such approximation may not be admissable anymore but thats not such a big deal).

Now moving onto slightly more interesting heuristic tweaks, search algorithms often trade path optimality for improved search times, almost all hierarchical methods do so as does my own SGA* algorithm. A simply way to achieve this trade off in a standard A* algorithm is to make use of an overestimating heuristic, what this means is that the heuristic overestimates the cost to the goal node. Now a common mistake and one I’ve seen in a few academic papers is to simply add a constant to the heuristic value and Boom! It’s an overestimating heuristic. No, it’s not; all you’ve done now is increased all the F values across the board with that constant, the heuristic is unchanged and so is still admissible.

A good example of an overestimating heuristic is the Manhattan distance, which while being an admissible heuristic for tile graphs is not an admissible heuristic for octile graphs. The Manhattan distance simply returns the sum of the vertical and horizontal deltas between two points and so is a really cheap heuristic to employ. So what are the effects of using an overestimating heuristic? Well since node selection is based on the F cost, an overestimate for the H cost will diminish the weight of the G cost and the algorithm will explore nodes that have lower H values over nodes that have lower G values. This pushes node exploration in the direction of the goal node, rather than keeping it roughly centered on the start node. This figure below shows the difference between the two heuristic on the total exploration space for the A* algorithm.

Now, this inadmissible heuristic does help to reduce the search space exploration and so decreases both the search time and memory costs of performing the search BUT there is no guarantee on the optimality of the path returned. In the case above there is no difference in path optimality but in practice differences can (and often do) exist. So having a heuristic that (slightly) overestimates the distance is a valid optimization for A* search. Path optimality is not really all that important especially considering the fact that humans arent very good at visually identifying precise path lengths for similar length paths. In my own experience its pretty hard to pick the shorter path from two similar length complex paths on a map (now try doing that mid game). For example, in the below figure which is the shorter path? (Hint: its not the green one :P ).

# The Open List

Before I carry on, I’m assuming that everyone reading this has realized that a closed list need not be maintained and that a simple onList flag can be set for each node thereby removing the need to search a closed list repeatedly.

So how do we optimize the open list: store it in a priority queue! Done!

Well it’s not actually as simple as that, consider using a simple unordered list (linked list/array/whatever) to store your open list. Each time you need to find the lowest F cost node, you’d need to search at worst the whole list giving the pop operation a cost of O(N). Adding, removing and updating nodes haves a cost of O(1). A skew heap priority queue on the other hand states that all operations costs O(log N). So why is the priority queue presented as a valid optimization, mainly because the size of the open list continues to grow over time and so the cost of finding and removing the smallest entry keeps increasing.

The most efficient priority queue implementation is a skew heap (binary tree) with O(log N) costs for all operations but these costs are amortized costs basically meaning that it usually costs that much. The actual costs of the operations are between O(log N) and O(N). Another annoyance is that most priority queues don’t allow updating of existing nodes by default and so if a node is to be updated it (naively) needs to be removed and re-inserted (requiring two O(log n) operations). Granted that is a naïve method and the cost can be brought down in most cases to a single O(log N) operation.

So this all sounds pretty good in theory, if we can reduce the cost of finding and removing the lowest F cost node then we can reduce to total cost of the algorithm. Well finding and removing that node may not be the most commonly performed open list operation. The reality is that in fact in some cases using a priority queue may end up being slower than using a simple unordered list. Consider a grid map where all nodes (excluding edge nodes) have a branching factor of 8. So for each node explored, in the worst case, 8 successor nodes will be examined of which, again at worst, 7 will be added to the open list. So what’s faster 7 best case O(log N) cost insertions or a single O(N) cost traversal?

Well it depends on your implementation and your navigational graph. If you’re using a navmesh with a low branching factor e.g. 3 then, at worst, you’ll be doing 2 open list addition/updates per node and a priority queue would help but for a grid maps with a high branching factor that may not be the case. A good method to profile your A* algorithm is to record the number of open list operations (additions, updates, insertions, etc) and then choose an open list data structure based on your application’s usage patterns. Even though priority queues are ubiquitous in A* implementations, no one mentions that YMMV with them. Hell in some cases a simple bucket approach to open list storage has been shown to offer significant improvements over a priority queue approach. Don’t just assume a priority queue is the best approach to use.

Another cute trick when it comes to open list operations is the early exploration of successor nodes. Since A* always explores the node with the lowest F cost at each iteration. If during a node’s exploration a successor node is discovered with an F cost that is the same or less than the currently explored node’s F cost (i.e. with a lower F cost than all the nodes on the open list) then that successor node can be directly explored at the next iteration without being added and then subsequently removed from the open list. This means that two costly open list operations are saved with no drawbacks. Now considering that during the exploration of a node several nodes may have F values lower than the current node, one of two methods can be used:

- The first successor node encountered which is lower than or equal to the current F cost is marked for early exploration
- Store each successor node with a lower F cost then mark the lowest F cost node from them for early exploration. The rest of the nodes are added to the open list.

Obviously if such a successor node is already on the open list, it needs to be removed from it before being explored.

Again this optimization is more useful for navigational graphs with high branching factors (i.e. grid based graphs rather than for example navmesh based navigational graphs which have low branching factors). The benefit of the early expansion is since for graphs with high branching factors a large number of nodes are added to the open list when a node is explored for the first time, in being able to reduce the costs of the node addition is beneficial especially in the case of long straight uninterupted paths. In such cases, the early exploration will always occur and the overall search time will be reduced.

Anyways, this was a very brief post (more like a random collection of thoughts) on some possible optimizations to the basic A* algorithm. It wasn’t exactly super technical but I’m a bit bored and this was a good way to kill some time on a lazy public holiday afternoon. Again I need to say that there is no one set of optimizations that will work in all situations, evaluate your application and optimize for it.

Great post, you’ve got some nice ideas here.

I used to blindly use a binary heap for the open list with the thought that O(n) would always be worse than O(log n) and didn’t consider all the insertions and the idea of marking for early exploration without adding them to the open list. which would speed up pathfinding in open areas quite alot i assume.

Will certainly try out these ideas next time I’m writing a pathfinder for a game.

By the way, do you perhaps know a good resource on pathfinding using nav meshes with some implementation details?

Only article I’ve found was http://www.ai-blog.net/archives/000152.html while good at giving a general idea of what it is and what the advantages are, but I always wanted some implementation details.

Anyway, thanks for this post!

Pingback: Help! Computing power or better algorithm required | cartesian product

Pingback: welcome