[Engineering] Coordinate Conversion from a Mathematical Perspective
This article gives a mathematical perspective on coordinate conversion. It presents a general method for converting coordinates from one system to another, which is very useful in computer graphics.
Introduction
In computer graphics, it is often necessary to convert coordinates from one system to another. For instance, if we have a point in the world coordinate system and we want to determine its coordinates in the camera coordinate system, we can use a general method to perform this conversion.
In this article, I will illustrate the method in 2D, but it can easily be extended to 3D. Suppose you are reading points from a scene format or other resource file where the positive y direction is upwards. However, your code is based on a format where the positive y direction is downwards, which is the coordinate system most familiar to graphics programmers.
Typically, the top left corner of your window is considered to be the origin (0, 0) and extends to the right and downwards. In the simplest case, you can simply flip the y-coordinate, which is a straightforward operation.
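Here is a minimal sketch of that simplest case, where the y-coordinate of each point is just negated. The function name `flip_y` and the sample triangle are illustrative assumptions, not part of any particular scene format.

```python
def flip_y(points):
    """Return the points with their y-coordinate negated (y-up <-> y-down)."""
    return [(x, -y) for (x, y) in points]


# Hypothetical sample data: a triangle given in y-up coordinates.
triangle_up = [(0, 0), (4, 0), (2, 3)]
triangle_down = flip_y(triangle_up)

# Flipping twice gives back the original points.
assert flip_y(triangle_down) == triangle_up
```

This only works as-is for a flat list of points. Once the points live inside nested local coordinate systems, the flip has to be combined with the transformation matrices.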
However, objects in a scene are usually organized in a tree structure, where each node has its own local coordinate system. Before proceeding, let’s understand why objects are organized in a tree rather than a flat structure. One reason is that we often need to dynamically update or modify objects. Using a local coordinate system makes these modifications intuitive and easy. For example, if you want to rotate an object, you can simply rotate its local coordinate system. If you want to scale an object, you can scale its local coordinate system, and this operation will also affect its child objects, which is typically what we want.
Another important aspect is that, in dynamic scenes, this organization cleanly separates the parts that stay fixed from the parts that change. For instance, if you only want to rotate the innermost object, you simply apply a rotation matrix to its local transformation instead of directly changing its final coordinates.
Object Representation
First, let us clarify the representation of an object. An object is represented by a matrix $M$, which denotes the transformation from its local coordinate system to its parent’s coordinate system.
It is important to remember that when you draw any object from a tree, what actually occurs is that its final points are transformed from its local coordinate system to the outermost coordinate system.
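As a rough sketch of what "transforming from the local coordinate system to the outermost one" means, the snippet below walks up a small tree and multiplies the local-to-parent matrices together. The `Node` class, the helper names, and the use of 3x3 homogeneous matrices are assumptions made for illustration only.

```python
class Node:
    """Hypothetical scene node with a 3x3 local-to-parent matrix."""

    def __init__(self, local, parent=None):
        self.local = local
        self.parent = parent


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, point):
    """Apply a 3x3 homogeneous matrix to a 2D point."""
    v = (point[0], point[1], 1)
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(2))


def translate(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]


def scale(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]


def local_to_root(node):
    """Compose the matrices from the node up to the root."""
    m = node.local
    while node.parent is not None:
        node = node.parent
        m = matmul(node.local, m)  # the parent's matrix goes on the left
    return m


root = Node(translate(10, 20))
child = Node(scale(2, 2), parent=root)
leaf = Node(translate(1, 1), parent=child)

# The leaf's local origin, expressed in the outermost coordinate system.
p = apply(local_to_root(leaf), (0, 0))
assert p == (12, 22)
```

Changing only `child.local` (for example its scale) moves the leaf as well, without touching the leaf's own matrix.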
But your renderer only accepts coordinates whose positive y direction is downward. Therefore, you need to flip at the beginning (the leftmost, outermost position) by multiplying with a scale matrix $S$, where $S = \mathrm{diag}(1, -1)$ is a diagonal matrix.
Then you obtain the final object by using the following equation:
$$P_{\text{final}} = S\, M_1 M_2 \cdots M_n\, P$$
where $P$ is a point in the object’s local coordinate system and $M_1, \dots, M_n$ are the matrices on the path from the outermost node down to the object.
Please note that dealing with $S$ will increase your mental workload when interacting with graphics APIs that can only handle the opposite coordinate system, or vice versa. You might think of it as simply flipping the y-axis of the local point, like
$$M_1 M_2 \cdots M_n\, S\, P$$
This is incorrect because the order of matrix multiplication is important.
When all matrices are diagonal, the order of multiplication does not matter. However, as soon as at least one matrix is non-diagonal (a rotation, for example), the order of multiplication becomes significant.
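A quick numeric check makes the point about ordering concrete. This is only a sketch with hand-picked 2x2 matrices: `S` flips the y-axis, `R` is a 90 degree rotation (non-diagonal), and `D` is a diagonal scale. All names here are illustrative assumptions.

```python
def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply(m, p):
    """Apply a 2x2 matrix to a 2D point."""
    return tuple(m[i][0] * p[0] + m[i][1] * p[1] for i in range(2))


S = [[1, 0], [0, -1]]   # flip the y-axis
R = [[0, -1], [1, 0]]   # rotate by 90 degrees (non-diagonal)
D = [[2, 0], [0, 3]]    # non-uniform scale (diagonal)

p = (1, 2)

# Flip applied last (leftmost) versus flip applied first (rightmost).
flip_outside = apply(matmul(S, R), p)
flip_inside = apply(matmul(R, S), p)
assert flip_outside != flip_inside

# With only diagonal matrices the order makes no difference.
assert matmul(S, D) == matmul(D, S)
```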
Clues from Basics
We can start from the basics. Let’s consider the case where there is only one object, represented by
$$S M P$$
where $P$ is a point in the object’s local y-up coordinate system.
When we want a local-down coordinate system, we require a form
$$M' P'$$
where $P'$ is the same point expressed in local y-down coordinates. Note that $M'$ is in general not equal to $M$, otherwise the equation would not hold, which is why a separate notation is better. Also, remember that $P' = S P$, so we get another more essential form:
$$M' S P$$
So what’s next? What do we need to do to finish the conversion? We need to make the two expressions equal:
$$S M P = M' S P$$
This means that regardless of whether the point is expressed as $P$ or $P'$, it must have the same global coordinates after being transformed by its respective $M$ or $M'$. The factor $S$ helps reconcile the difference between the two coordinate systems, specifically the opposite y direction.
Do you have any clues from this?
The equation holds for every $P$ if and only if $S M = M' S$.
Voila! We obtain the new transformation matrix for the object:
$$M' = S M S^{-1}$$
Since $S$ is a diagonal matrix where $S^{-1} = S$, then $M' = S M S$.
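To convince yourself of the single-object result, here is a small sketch that checks it numerically. It assumes 3x3 homogeneous matrices so that translation can be included, writes the y-flip as `S = diag(1, -1, 1)`, and builds the converted matrix as `S * M * S`. The sample matrix and points are made up for the check.

```python
def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, v):
    """Apply a 3x3 matrix to a homogeneous 3-vector."""
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(3))


S = [[1, 0, 0], [0, -1, 0], [0, 0, 1]]   # y-flip in homogeneous form

# Sample local-to-parent matrix: rotate by 90 degrees, then translate by (3, 4).
M = [[0, -1, 3], [1, 0, 4], [0, 0, 1]]

# Converted matrix for the local-down coordinate system.
M_new = matmul(matmul(S, M), S)

for p in [(1, 2, 1), (0, 0, 1), (-5, 7, 1)]:
    p_down = apply(S, p)                       # same point, y-down coordinates
    global_from_up = apply(matmul(S, M), p)    # flip at the outermost level
    global_from_down = apply(M_new, p_down)    # no outer flip needed
    assert global_from_up == global_from_down
```

In this example the rotation direction and the y-translation change sign in `M_new`, which is what you would expect after mirroring the local axes.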
One More Step
When we have a more general case, we can assume that the original object hierarchy is:
$$S\, M_1 M_2 \cdots M_n\, P$$
We can denote each new transform matrix as $M_i'$, and then apply the conversion equation mentioned above:
$$S\, M_1 M_2 \cdots M_n\, P = M_1' M_2' \cdots M_n'\, S\, P$$
This equation holds if and only if:
$$M_n' = (M_1' M_2' \cdots M_{n-1}')^{-1}\, S\, M_1 M_2 \cdots M_n\, S^{-1}$$
So many $M'$? Don’t forget that we have already obtained $M_1' = S M_1 S^{-1}$, which is the new matrix that allows for the correct transformation of the first object with a local-down coordinate system.
Yes, the conversion of the $n$-th object is based on the fact that the preceding objects have already been converted. This means that we can use $M_1'$ to convert the second object to a local-down coordinate system, and then use $M_1' M_2'$ to convert the third object, and so on.
The equations for the conversion matrices are as follows:
$$M_1' = S M_1 S^{-1}$$
$$M_2' = (M_1')^{-1}\, S M_1 M_2 S^{-1} = S M_2 S^{-1}$$
$$M_3' = (M_1' M_2')^{-1}\, S M_1 M_2 M_3 S^{-1} = S M_3 S^{-1}$$
You can use mathematical induction to show that the general form is
$$M_i' = S M_i S^{-1}$$
For the y-flip, where $S = \mathrm{diag}(1, -1)$ and $S^{-1} = S$, the result can be simplified as follows:
$$M_i' = S M_i S$$
Excellent! There is no need to calculate the inverse matrix of $M_1' M_2' \cdots M_{i-1}'$, which is a very expensive operation. You simply need to multiply $S$ on the left and right of $M_i$. In other words, when only the direction of the object’s local y-axis is flipped, the new matrix depends only on the object’s own matrix.
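The hierarchy result can be checked the same way. The sketch below converts every matrix of a made-up four-level hierarchy independently with `S * M_i * S` and verifies that the whole chain with the flip on the left equals the converted chain with the flip on the right. The helper names and sample matrices are assumptions for illustration.

```python
from functools import reduce


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def chain(matrices):
    """Multiply a list of matrices from left to right."""
    return reduce(matmul, matrices)


S = [[1, 0, 0], [0, -1, 0], [0, 0, 1]]   # y-flip, its own inverse


def convert(m):
    """Per-node conversion; no inverse of the parent chain is needed."""
    return matmul(matmul(S, m), S)


hierarchy = [
    [[1, 0, 10], [0, 1, 0], [0, 0, 1]],   # M_1: translate by (10, 0)
    [[0, -1, 0], [1, 0, 0], [0, 0, 1]],   # M_2: rotate by 90 degrees
    [[2, 0, 0], [0, 3, 0], [0, 0, 1]],    # M_3: scale by (2, 3)
    [[1, 0, 1], [0, 1, 1], [0, 0, 1]],    # M_4: translate by (1, 1)
]
converted = [convert(m) for m in hierarchy]

left = chain([S] + hierarchy)     # S M_1 M_2 M_3 M_4
right = chain(converted + [S])    # M_1' M_2' M_3' M_4' S
assert left == right
```

Each node is converted on its own, so the cost is two small matrix products per node regardless of how deep the tree is.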
Take One Step Further
If you observe the process by which we derive the conversion equation, you will notice that nothing is assumed except for the form of $S$, which is a diagonal matrix used to flip the y-axis. Therefore, the conversion equation is also valid for other types of $S$. $S$ can be generalized to any invertible matrix $C$, which represents the basis vectors of a new coordinate system and can be used to convert from one coordinate system to another. Since $C^{-1} = C$ may no longer hold, the general form $M_i' = C M_i C^{-1}$ applies. For instance, if you wish to convert from a left-hand coordinate system to a right-hand coordinate system in 3D, you can use a matrix that flips a single axis, for example $C = \mathrm{diag}(1, 1, -1)$.
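Here is a sketch of the generalized case with a change-of-basis matrix that is not its own inverse, so the explicit inverse is required. The matrix `C`, the helper `inverse2`, and the sample values are assumptions chosen for the check, and `fractions.Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply(m, p):
    """Apply a 2x2 matrix to a 2D point."""
    return tuple(m[i][0] * p[0] + m[i][1] * p[1] for i in range(2))


def inverse2(m):
    """Exact inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


C = [[2, 1], [1, 1]]    # hypothetical change of basis, not self-inverse
M = [[0, -1], [1, 0]]   # sample local transform: rotate by 90 degrees

M_new = matmul(matmul(C, M), inverse2(C))   # general form: C M C^-1

for p in [(1, 2), (0, 0), (-3, 5)]:
    # C M P must equal M' (C P) for every point P.
    assert apply(matmul(C, M), p) == apply(M_new, apply(C, p))

# Using C on both sides only works when C is its own inverse.
assert matmul(matmul(C, M), C) != M_new
```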
Another General Perspective
From this perspective, the above question can be viewed as a specific scenario: converting a child object with an upward-facing y-axis to the root window with a downward-facing y-axis. This can be described as a more general question:
How can one convert the coordinates of an object from one node to another within the same tree?
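Suppose we want to convert $A$ to $B$ as shown in the image. This is very useful when we want to apply an effect on $A$ that is related to $B$, such as masking $A$ with $B$.
It is obvious if we use the conclusion above. Let $A_1, \dots, A_m$ and $B_1, \dots, B_k$ denote the matrices on the paths from the root down to $A$ and to $B$, respectively.
The object in the local coordinate system of $A$ to the root is $A_1 A_2 \cdots A_m P_A$.
The object in the local coordinate system of $B$ to the root is $B_1 B_2 \cdots B_k P_B$.
Note that the object at $A$, denoted as $P_A$, can be viewed from $B$ as $P_B$, meaning that $P_B = T P_A$, where $P_B$ is the equivalent of $P_A$ in $B$, achieved by adding a $T$ to make them equivalent.
We can determine that both the left-side object and the right-side object can be represented in the local coordinate system of their common ancestor, without having to be in the root coordinate system, because the matrices shared by both paths cancel out.
Go ahead. The equation $A_1 A_2 \cdots A_m P_A = B_1 B_2 \cdots B_k P_B$ holds if and only if $P_B = T P_A$, where
$$T = (B_1 B_2 \cdots B_k)^{-1} A_1 A_2 \cdots A_m$$
The transformation from $A$ to $B$ is $T$,
just like the path flow shown in the image.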
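The node-to-node conversion can be sketched as follows, assuming 3x3 affine matrices and a tiny made-up tree in which two nodes share one parent. The names `G`, `A_local`, `B_local`, and `affine_inverse` are illustrative assumptions. The transform is built as the inverse of the root-to-target chain times the root-to-source chain.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, point):
    """Apply a 3x3 affine matrix to a 2D point."""
    v = (point[0], point[1], 1)
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(2))


def affine_inverse(m):
    """Exact inverse of a 3x3 affine matrix whose last row is (0, 0, 1)."""
    (a, b, tx), (c, d, ty), _ = m
    det = Fraction(a * d - b * c)
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    return [[ia, ib, -(ia * tx + ib * ty)],
            [ic, id_, -(ic * tx + id_ * ty)],
            [0, 0, 1]]


G = [[1, 0, 5], [0, 1, 5], [0, 0, 1]]         # shared parent: translate by (5, 5)
A_local = [[2, 0, 0], [0, 2, 0], [0, 0, 1]]   # source node: scale by 2
B_local = [[0, -1, 1], [1, 0, 0], [0, 0, 1]]  # target node: rotate, then translate

root_from_a = matmul(G, A_local)
root_from_b = matmul(G, B_local)

# T maps coordinates local to the source node into the target node's frame.
T = matmul(affine_inverse(root_from_b), root_from_a)

for p in [(0, 0), (1, 2), (-3, 4)]:
    # Both routes must reach the same point in root coordinates.
    assert apply(root_from_a, p) == apply(root_from_b, apply(T, p))

# The shared parent cancels, so the common ancestor's frame is enough.
assert T == matmul(affine_inverse(B_local), A_local)
```

Unlike the y-flip case, this conversion does need a matrix inverse, but only along the part of the two paths below the common ancestor.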