So recently I’ve hit this problem twice and while I’ve solved it to a satisfactory degree, I can’t help but feel there might be a more elegant solution. In this post I’m going to outline the problem as well as my solution with the intention of either starting a discussion or perhaps helping someone else who is facing the same problem.

So the issue is simple, I have a object with a reference transform X and I also have a set of N target transforms. I need to find the target transform in the set that is the closest in both position and orientation to the reference transform X. Visually this can be represented below:

Now obviously this is an error minimization problem but the trick is to represent the error in orientation and position in a way that we don’t bias the overall error in favor of one or the other (i.e. both orientation and position are equally important). In lay mans terms, we need to reduce the error of the orientation and position to the same range. So let’s start with treating the two components of the transform individually.

Orientation is simple enough, we define a “forward” axis for our transform and using the reference transform as a starting point, we calculate the angle needed for the shortest rotation between the reference transform’s forward axis and the target transform’s forward axis. Since we are looking for the shortest rotation, these deltas are in the range or [-180, 180] which allows us to calculate an error metric for the orientation to be the absolute value of the delta angle divided by 180 degrees which results in a 0-1 error metric where 0.0 is a perfectly matching orientation and 1.0 is the worst possible error.

Position is the problem, it’s easy enough to calculate the distance between the reference and each target but those distances cant be easily compared to the orientation error since they are not in a 0-1 range. And this is where I starting feeling like what I did was slightly in-elegant. So solve the problem, I did the dumbest thing I could think of. I simple calculated the max distance between any of the target transforms and the reference transform. I then used this as my error scale and for each target transform, divided it’s distance to the reference transform by the worst possible error. This reduces the position error to a [0,1] range where 0.0 is a perfectly matching position and 1.0 is as for the orientation case the worst case scenario.

Since I now have two 0-1 error values, it is relatively trivial to find the best target transform relative to the reference transform, it also easily allows me to bias towards orientation or position by applying a 0-1 bias weight, where 0 is only using orientation and 1.0 is only taking position into account.

This seems to solve the problem well enough for me but I’m curious if there isn’t a different/better solution. Something just feels off for me with my approach.