This post is a quick brain dump on the topic of animation root motion and how to blend it. There doesn’t seem to be much information on the topic online, so I think it’s worth writing a quick post about it.

Before we carry on, let’s quickly discuss what root motion is. Root motion is a description of the overall character displacement in the world for a given animation. This is often authored as a separate bone in the DCC (max/maya/MB) and the animation is exported to the game engine with this bone as the root of the animation hierarchy (at least conceptually). This information can then be used by animation-driven systems to actually move the game character in the world. Weirdly, root motion is not a topic I can find much information on, and I guess this is probably because most games tend not to be root-motion driven, as it is often easier to get the desired reactivity of characters with gameplay-driven approaches (trading the physicality of movement for reactivity to inputs).

To determine the motion of a character each frame, we calculate the range on the animation timeline that this frame update covered and return a delta between the position of the root bone at the start and at the end of that range. This delta transform is then applied to the current world transform of the character we are animating.
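As a minimal sketch of that frame update, here is a hypothetical sampler (2D ground-plane position plus yaw, not any particular engine API) and the delta extraction:

```python
import math

def sample_root(t):
    # Hypothetical sampler: returns (position, yaw) of the root bone at time t.
    # Here: a character walking forward at 2 m/s while turning at 0.5 rad/s.
    return ((2.0 * t, 0.0), 0.5 * t)

def root_motion_delta(sample, t_start, t_end):
    """Delta transform covered by this frame's slice of the animation timeline."""
    (x0, y0), yaw0 = sample(t_start)
    (x1, y1), yaw1 = sample(t_end)
    # Express the translation delta in the character's local frame at t_start,
    # so it can be applied on top of the character's current world transform.
    dx, dy = x1 - x0, y1 - y0
    c, s = math.cos(-yaw0), math.sin(-yaw0)
    local_delta = (c * dx - s * dy, s * dx + c * dy)
    return local_delta, yaw1 - yaw0
```

Applying `root_motion_delta` over one 30 Hz frame and accumulating the result onto the character’s world transform is all the animation-driven movement loop really is.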

Now let’s quickly touch on animation blending, which is pretty much at the core of all current-gen animation systems. No matter what sort of fancy logic we have at the highest level (HFSM, parametric blendspace, etc.), we almost always end up blending some animation X with some animation Y to produce some pose for our characters.

In general, when we blend animation poses we tend to blend the rotation, translation and scale of each bone separately: the rotations are spherically interpolated (SLERP) while the translation and scale are linearly interpolated (LERP). This linear interpolation for translation makes sense since the translations for a bone describe the length of that bone.
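A minimal sketch of that per-bone blend, assuming quaternions stored as (w, x, y, z) and poses as (rotation, translation, scale) tuples (the names here are illustrative, not any engine’s API):

```python
import math

def lerp(a, b, t):
    # Component-wise linear interpolation for translation and scale.
    return tuple((1 - t) * x + t * y for x, y in zip(a, b))

def slerp(q0, q1, t):
    # Spherical interpolation of unit quaternions (w, x, y, z).
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:                      # take the shorter arc
        q1, dot = tuple(-c for c in q1), -dot
    if dot > 0.9995:                   # nearly parallel: lerp + renormalize
        out = lerp(q0, q1, t)
        n = math.sqrt(sum(c * c for c in out))
        return tuple(c / n for c in out)
    theta = math.acos(dot)
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return tuple(s0 * a + s1 * b for a, b in zip(q0, q1))

def blend_bone(pose_a, pose_b, t):
    # pose = (rotation quaternion, translation, scale)
    rot_a, trans_a, scale_a = pose_a
    rot_b, trans_b, scale_b = pose_b
    return (slerp(rot_a, rot_b, t),
            lerp(trans_a, trans_b, t),
            lerp(scale_a, scale_b, t))
```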

Now when dealing with the root motion track, it’s important to note that the delta value we mentioned earlier contains different information from that of a regular bone in the skeleton. The delta value contains three pieces of information describing the character motion for the update:

- The heading direction of the character’s motion
- The distance moved (or displacement) along the heading
- The character facing delta

The first two are stored in the translation portion of the root motion delta while the third is represented by the rotation of the root motion delta. Now when blending root motion, if we decide to simply do a LERP as we would for any regular bone, we end up screwing up the first two pieces of information. Let’s look at a simple example below:

In the above image, we are interpolating between two vectors that have the same length but different directions. The yellow vector represents the linear interpolation between these two vectors. As you can see, when we are 50% in between the two vectors, the resulting vector is significantly shorter than the original vectors. If these vectors represented the movement of a character, e.g. strafing fwd+right, we would end up moving noticeably slower than we would if we only moved in a single direction. This is obviously not what we wanted and in fact will cause some nasty foot sliding in the resulting animation.
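To make the shrinking concrete: for two unit vectors 90° apart (forward and right), the 50% LERP is only ~0.71 units long, and the loss grows with the angle between them (it reaches exactly half at 120°). A tiny sketch:

```python
import math

def lerp_v(a, b, t):
    # Plain component-wise LERP of two 2D vectors.
    return tuple((1 - t) * x + t * y for x, y in zip(a, b))

fwd   = (0.0, 1.0)   # move forward, 1 unit this frame
right = (1.0, 0.0)   # move right, 1 unit this frame
mid = lerp_v(fwd, right, 0.5)
length = math.hypot(*mid)   # ~0.707, not 1.0: the blended step is too short
```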

What we actually need to do is to interpolate the heading direction and the distance traveled separately. The simplest way to achieve this is to use a Normalized LERP (NLERP) instead of the simple LERP. An NLERP functions as follows:

- Calculate the length of both vectors and linearly interpolate between the lengths
- Normalize both vectors, and linearly interpolate between them to get the resulting heading direction
- Scale the interpolated heading direction to the interpolated length calculated in the first step.
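The three NLERP steps above can be sketched as follows (plain Python on 2D ground-plane vectors, assuming both inputs are non-zero; tuples stand in for an engine vector type):

```python
import math

def nlerp_motion(a, b, t):
    # Interpolate distance and heading separately (assumes non-zero inputs).
    len_a, len_b = math.hypot(*a), math.hypot(*b)
    length = (1 - t) * len_a + t * len_b                           # 1. lerp the lengths
    na = tuple(c / len_a for c in a)                               # 2. normalize both...
    nb = tuple(c / len_b for c in b)
    heading = tuple((1 - t) * x + t * y for x, y in zip(na, nb))   # ...and lerp them
    norm = math.hypot(*heading)
    if norm < 1e-8:                                                # opposite headings cancel
        return tuple(0.0 for _ in a)
    return tuple(length * c / norm for c in heading)               # 3. rescale to length
```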

The result of this operation is shown by the cyan vector in the above example. As you can see, the length of the vector is now correct. Unfortunately, in terms of heading we still have an issue. Let’s look at another example below where the vectors have different lengths as well as different directions.

The LERP result is obviously not what we want, but the NLERP results in an inconsistent angular velocity over the course of the interpolation. Basically, the heading changes faster in the middle of the interpolation than it does at either end. This could result in a relatively nasty jerk when transitioning between two animations. There is a third approach to interpolating the vectors, and that is to do it in a spherical manner (SLERP). This functions as follows:

- Calculate the length of both vectors and linearly interpolate between the lengths
- Normalize both vectors, and calculate the angle between them (dot product and ACos)
- Multiply the angle between them with the blend weight (interpolation parameter)
- Calculate the axis of rotation between the two vectors (cross product) and the required rotation (quaternion from axis/angle)
- Apply the rotation to the **from** vector and scale the result to the interpolated length calculated in the first step
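The steps above can be sketched like this in 3D. One substitution: the rotation is applied with Rodrigues’ rotation formula rather than by constructing an explicit axis/angle quaternion, which is mathematically equivalent here and keeps the sketch self-contained:

```python
import math

def slerp_motion(a, b, t):
    # Spherical interpolation of two 3D root-motion translation vectors.
    len_a = math.sqrt(sum(c * c for c in a))
    len_b = math.sqrt(sum(c * c for c in b))
    length = (1 - t) * len_a + t * len_b                  # 1. lerp the lengths
    na = tuple(c / len_a for c in a)                      # 2. normalize and find the
    nb = tuple(c / len_b for c in b)                      #    angle between them
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(na, nb))))
    angle = math.acos(dot) * t                            # 3. scale angle by blend weight
    axis = (na[1] * nb[2] - na[2] * nb[1],                # 4. rotation axis (cross product)
            na[2] * nb[0] - na[0] * nb[2],
            na[0] * nb[1] - na[1] * nb[0])
    axis_len = math.sqrt(sum(c * c for c in axis))
    if axis_len < 1e-8:                                   # parallel headings: nothing to rotate
        return tuple(length * c for c in na)
    k = tuple(c / axis_len for c in axis)
    # 5. rotate the `from` heading about the axis (Rodrigues' formula, equivalent
    # to applying the axis/angle quaternion) and rescale to the lerped length.
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    k_cross = (k[1] * na[2] - k[2] * na[1],
               k[2] * na[0] - k[0] * na[2],
               k[0] * na[1] - k[1] * na[0])
    k_dot = sum(x * y for x, y in zip(k, na))
    rotated = tuple(na[i] * cos_a + k_cross[i] * sin_a + k[i] * k_dot * (1 - cos_a)
                    for i in range(3))
    return tuple(length * c for c in rotated)
```

Blending the forward and right headings at 50% now yields a unit-length vector exactly halfway between the two, with the heading swept at a constant rate.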

Now obviously as you can tell this is a very expensive way to interpolate two vectors, but it does give the best results, as visualized by the purple vector in the examples above.

Now the problems with the motion blending I described get worse the larger the angle between the two vectors is (see the example below). Here you can clearly see the NLERP result initially lagging behind and then overtaking the SLERP result.

So basically the takeaway from this is simple: don’t LERP translations when blending root motion. Based on your needs and performance budget, choose either an NLERP or a SLERP. Now this is only really needed if you are building a game that is animation driven; if not… well… move along, nothing to see here 😛