This article gives a mathematical perspective on coordinate conversion. It presents a general method for converting coordinates from one coordinate system to another, which is very useful in computer graphics.

Introduction

In computer graphics, it is often necessary to convert coordinates from one system to another. For instance, if we have a point $P$ in the world coordinate system and we want to determine its coordinates in the camera coordinate system, we can use a general method to perform this conversion.

In this article, I will illustrate the method in 2D, but it can easily be extended to 3D. Suppose you are reading points from a scene format or other resource file where the positive direction is upwards. However, your code is based on a format where the positive direction is downwards, which is the coordinate system most familiar to graphics programmers.

Typically, the top left corner of your window is considered to be the origin (0, 0) and extends to the right and downwards. In the simplest case, you can simply flip the y-coordinate, which is a straightforward operation.
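
As a minimal sketch of this simplest case, assuming both systems share the same origin and x-axis so that only the sign of y differs (the function name `flip_y` and the sample points are made up for illustration):

```python
def flip_y(points):
    """Negate the y-coordinate of every (x, y) point.

    Assumption: the y-up and y-down systems share the same origin and
    x-axis, so no offset (such as a window height) is involved.
    """
    return [(x, -y) for x, y in points]


# Hypothetical points read from a y-up resource file.
y_up_points = [(0, 0), (2, 3), (5, -1)]
y_down_points = flip_y(y_up_points)
print(y_down_points)  # [(0, 0), (2, -3), (5, 1)]
```

Flipping twice returns the original points, which is the same property the matrix form relies on later.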

However, objects in a scene are usually organized in a tree structure, where each node has its own local coordinate system. Before proceeding, let’s understand why objects are organized in a tree rather than a flat structure. One reason is that we often need to dynamically update or modify objects. Using a local coordinate system makes these modifications intuitive and easy. For example, if you want to rotate an object, you can simply rotate its local coordinate system. If you want to scale an object, you can scale its local coordinate system, and this operation will also affect its child objects, which is typically what we want.

Another important benefit is that, in dynamic scenes, this organization separates the parts that change from the parts that stay the same. For instance, if you only want to rotate the innermost object, you simply update its rotation matrix instead of directly changing its final coordinates.

Object Representation

First, let us clarify the representation of an object. An object is represented by a matrix $M$, which denotes the transformation from its local coordinate system to its parent’s coordinate system.

It is important to remember that when you draw any object from a tree, what actually occurs is that its final points are transformed from its local coordinate system to the outermost coordinate system.

$$P_{WindowUp} = M_{1} \times M_{2} \times M_{3} \times M_{4} \times p_{LocalUp}$$
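
The chain above can be sketched in a few lines of plain Python. The four matrices and the point are hypothetical values chosen only for illustration, and `matmul` and `apply` are small helpers written for this example, not part of any library:

```python
from functools import reduce


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def apply(m, p):
    """Apply matrix m to point p, treating p as a column vector."""
    return tuple(sum(m[i][k] * p[k] for k in range(len(p)))
                 for i in range(len(m)))


# Hypothetical local-to-parent matrices, outermost object first.
M1 = [[2, 0], [0, 2]]   # uniform scale by 2
M2 = [[0, -1], [1, 0]]  # rotation by 90 degrees counter-clockwise
M3 = [[1, 0], [0, 3]]   # stretch y by 3
M4 = [[1, 1], [0, 1]]   # shear along x

# M1 x M2 x M3 x M4: maps the innermost local system to the window.
local_to_window = reduce(matmul, [M1, M2, M3, M4])

p_local_up = (1, 1)
p_window_up = apply(local_to_window, p_local_up)
print(p_window_up)  # (-6, 4)
```

Applying the matrices one at a time, innermost first, gives the same point, which is why the product can be precomputed once per object.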

[Figure: Object Tree]

However, your renderer expects coordinates whose positive y direction points downwards, denoted as $P_{WindowDown}$. Therefore, you need to flip the result by multiplying it on the left with a scale matrix $S$, where $S$ is a diagonal matrix.

$$S = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$$

Then you obtain the final object by using the following equation:

$$P_{WindowDown} = S \times M_{1} \times M_{2} \times M_{3} \times M_{4} \times p_{LocalUp}$$

Please note that working with $p_{LocalUp}$ increases your mental workload when you interact with graphics APIs that can only handle the opposite coordinate system, or vice versa. It is tempting to treat the fix as simply flipping the local y-axis, like this:

$$P_{WindowDown} = M_{1} \times M_{2} \times M_{3} \times M_{4} \times S \times p_{LocalUp}$$

This is incorrect, because matrix multiplication is not commutative in general, so the position of $S$ in the product matters.

$$S \times M_{1} \times M_{2} \times M_{3} \times M_{4} \times p_{LocalUp} \neq M_{1} \times M_{2} \times M_{3} \times M_{4} \times S \times p_{LocalUp}$$

Diagonal matrices commute with each other, so if every matrix in the chain were diagonal, the order of multiplication would not matter. However, as soon as at least one matrix is not diagonal (a rotation, for example), the order of multiplication generally becomes significant.
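
A quick numerical check of this claim, using a hypothetical rotation `R` as the non-diagonal matrix and a hypothetical scale `D` as the diagonal one (`matmul` is a small helper written for this sketch):

```python
def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


S = [[1, 0], [0, -1]]  # the y-flip from the text
R = [[0, -1], [1, 0]]  # hypothetical non-diagonal matrix (90 degree rotation)
D = [[2, 0], [0, 3]]   # hypothetical diagonal matrix (non-uniform scale)

# With a non-diagonal matrix, moving S changes the product.
print(matmul(S, R))  # [[0, -1], [-1, 0]]
print(matmul(R, S))  # [[0, 1], [1, 0]]

# With a diagonal matrix, the position of S makes no difference.
print(matmul(S, D) == matmul(D, S))  # True
```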

Clues from Basics

We can start from the basics. Let’s consider the case where there is only one object, represented by $P_{WindowDown} = S \times M \times p_{LocalUp}$.

Now suppose we want a local-down coordinate system, which means we need the form $P_{WindowDown} = M \times p_{LocalDown}$.

Note that the latter $M$ cannot be equal to the former, or the equation would not hold, so it is better to use a separate notation, $M^\prime$. Also, remember that $p_{LocalDown} = S \times p_{LocalUp}$, so we get another, more essential form:

$$P_{WindowDown} = M^\prime \times S \times p_{LocalUp}$$

So what’s next? What do we need to do to finish the conversion? We need to make the two equations equal:

$$S \times M \times p_{LocalUp} = M^\prime \times S \times p_{LocalUp}$$

This means that a point, whether it is expressed as $p_{LocalUp}$ or as $p_{LocalDown}$, must end up at the same global coordinates after being transformed by its respective matrix, $M$ or $M^\prime$. The factor $S$ reconciles the difference between the two coordinate systems, namely the opposite y direction.

Do you have any clues from this?

The equation holds for every point if and only if $S \times M = M^\prime \times S$.

Voila! We obtain the new transformation matrix for the object:

$$M^\prime = S \times M \times S^{-1} = S \times M \times S$$

Since $S$ is the diagonal matrix $S = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$, we have $S^{-1} = S$.
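
The single-object result can be checked numerically. In this sketch the matrix `M` and the point are arbitrary values chosen for illustration, and `matmul` and `apply` are helpers written for the example:

```python
def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def apply(m, p):
    """Apply matrix m to point p, treating p as a column vector."""
    return tuple(sum(m[i][k] * p[k] for k in range(len(p)))
                 for i in range(len(m)))


S = [[1, 0], [0, -1]]  # y-flip; it is its own inverse
M = [[1, 2], [3, 4]]   # hypothetical local-to-parent matrix (y-up)

# M' = S x M x S^{-1} = S x M x S
M_prime = matmul(matmul(S, M), S)
print(M_prime)  # [[1, -2], [-3, 4]]

p_local_up = (3, 1)
p_local_down = apply(S, p_local_up)  # the same point in the y-down local system

via_up = apply(matmul(S, M), p_local_up)  # S x M x p_LocalUp
via_down = apply(M_prime, p_local_down)   # M' x p_LocalDown
print(via_up, via_down)  # (5, -13) (5, -13)
```

For this particular 2×2 flip, conjugating by $S$ only negates the off-diagonal entries of $M$, so no general matrix product is strictly needed.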

One More Step

When we have a more general case, we can assume that the original object hierarchy is:

$$P_{WindowDown} = S \times M_{1} \times M_{2} \times M_{3} \times M_{4} \times p_{LocalUp}$$

We can denote each new transform matrix as $M^\prime_{n}$, and then apply the conversion equation mentioned above:

$$S \times M_{1} \times M_{2} \times M_{3} \times M_{4} \times p_{LocalUp} = M^\prime_{1} \times M^\prime_{2} \times M^\prime_{3} \times M^\prime_{4} \times S \times p_{LocalUp}$$

This equation holds if and only if:

$$S \times M_{1} \times M_{2} \times M_{3} \times M_{4} = M^\prime_{1} \times M^\prime_{2} \times M^\prime_{3} \times M^\prime_{4} \times S$$

So many $M^\prime$? Don’t forget that we have already obtained $M^\prime_{1}$, which is the new matrix that allows for the correct transformation of the first object with a local-down coordinate system.

Yes, the conversion of the $n$-th object is based on the fact that the preceding $n-1$ objects have already been converted. This means that we can use $M^\prime_{1}$ to convert the second object to a local-down coordinate system, and then use $M^\prime_{2}$ to convert the third object, and so on.

The equations for the conversion matrices are as follows:

$$
\begin{aligned}
M^\prime_{1} &= S M_{1} S^{-1} \\
M^\prime_{2} &= {M^\prime_{1}}^{-1} S M_{1} M_{2} S^{-1} \\
M^\prime_{3} &= {M^\prime_{2}}^{-1} {M^\prime_{1}}^{-1} S M_{1} M_{2} M_{3} S^{-1} \\
M^\prime_{4} &= {M^\prime_{3}}^{-1} {M^\prime_{2}}^{-1} {M^\prime_{1}}^{-1} S M_{1} M_{2} M_{3} M_{4} S^{-1}
\end{aligned}
$$

You can use mathematical induction to show that the general form is $M^\prime_{n} = M_{previous} M_{n} S^{-1}$, where $M_{previous}$ collects everything that stands to the left of $M_{n}$ in the equations above.

In our special case of $S$, where $S^{-1} = S$ and $S S^{-1} = S^{-1} S = S S = I$, the result can be simplified as follows:

$$
\begin{aligned}
M^\prime_{1} &= S M_{1} S^{-1} \\
M^\prime_{2} &= S M_{2} S^{-1} \\
M^\prime_{3} &= S M_{3} S^{-1} \\
M^\prime_{4} &= S M_{4} S^{-1}
\end{aligned}
$$
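
One way to see why the long prefixes collapse is the following induction step, sketched here under the sole assumption that $S$ is invertible. Suppose $M^\prime_{k} = S M_{k} S^{-1}$ holds for every $k < n$. Every inner $S^{-1} S$ pair in the product of the converted matrices cancels, so

$$M^\prime_{1} M^\prime_{2} \cdots M^\prime_{n-1} = S M_{1} M_{2} \cdots M_{n-1} S^{-1}$$

Substituting the inverse of this product into the general form gives

$$
\begin{aligned}
M^\prime_{n} &= \left(M^\prime_{1} \cdots M^\prime_{n-1}\right)^{-1} S M_{1} \cdots M_{n-1} M_{n} S^{-1} \\
&= S \left(M_{1} \cdots M_{n-1}\right)^{-1} S^{-1} S M_{1} \cdots M_{n-1} M_{n} S^{-1} \\
&= S M_{n} S^{-1}
\end{aligned}
$$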

Excellent! There is no need to calculate the inverse matrices that appear in $M_{previous}$, which is a very expensive operation. You simply need to multiply $M_{n}$ by $S$ on the left and on the right. In other words, when only the direction of an object’s local y-axis is flipped, each new matrix depends only on its own original matrix.
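
The hierarchy result can be verified in the same way. The sketch below uses four hypothetical matrices, converts each one independently with $S M_{n} S$, and checks that both sides of the chain equation agree:

```python
from functools import reduce


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


S = [[1, 0], [0, -1]]  # y-flip; it is its own inverse

# Hypothetical local-to-parent matrices, outermost object first.
chain = [
    [[2, 0], [0, 2]],   # M1: uniform scale
    [[0, -1], [1, 0]],  # M2: rotation by 90 degrees
    [[1, 0], [0, 3]],   # M3: stretch y
    [[1, 1], [0, 1]],   # M4: shear along x
]

# Each converted matrix depends only on its own original: M'_n = S x M_n x S.
converted = [matmul(matmul(S, m), S) for m in chain]

lhs = matmul(S, reduce(matmul, chain))      # S x M1 x M2 x M3 x M4
rhs = matmul(reduce(matmul, converted), S)  # M'1 x M'2 x M'3 x M'4 x S
print(lhs == rhs)  # True
```

Because each node is converted on its own, the conversion can run once when the scene is loaded, without walking up the tree.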

Take One Step Further

If you review how we derived the conversion equation, you will notice that the derivation never relies on the specific entries of $S$; the only place where its form matters is the shortcut $S^{-1} = S$. Therefore, the conversion equation $M^\prime_{n} = S M_{n} S^{-1}$ is also valid for other types of $S$. $S$ can be generalized to any invertible matrix $B$, which represents the basis vectors of a new coordinate system and can be used to convert from one coordinate system to another. For instance, if you wish to convert from a left-handed coordinate system to a right-handed coordinate system, you can use $B = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$.
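
To illustrate the generalization, the sketch below repeats the chain check with a hypothetical invertible matrix `B` that is not its own inverse, so the explicit inverse is required. The helper names `inverse_2x2` and `change_basis` are made up for this example, and `Fraction` is used only to keep the arithmetic exact:

```python
from fractions import Fraction
from functools import reduce


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inverse_2x2(m):
    """Return the exact inverse of a 2x2 matrix, or raise if it is singular."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]


def change_basis(m, basis):
    """Return basis x m x basis^{-1}, the general form of S x M x S^{-1}."""
    return matmul(matmul(basis, m), inverse_2x2(basis))


B = [[2, 1], [0, 1]]  # hypothetical invertible matrix with B^{-1} != B

# Hypothetical local-to-parent matrices, outermost object first.
chain = [[[2, 0], [0, 2]], [[0, -1], [1, 0]], [[1, 0], [0, 3]], [[1, 1], [0, 1]]]
converted = [change_basis(m, B) for m in chain]

lhs = matmul(B, reduce(matmul, chain))      # B x M1 x M2 x M3 x M4
rhs = matmul(reduce(matmul, converted), B)  # M'1 x M'2 x M'3 x M'4 x B
print(lhs == rhs)  # True
```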

Another General Perspective

From this perspective, the above question can be viewed as a specific scenario: converting a child object with an upward-facing y-axis to the root window with a downward-facing y-axis. This can be described as a more general question:

How can one convert the coordinates of an object from one node to another within the same tree?

[Figure: Object Tree]

Suppose we want to convert coordinates from $M_{9}$ to $M_{11}$ as shown in the image. This is very useful when we want to apply an effect on $M_{11}$ that is related to $M_{9}$, such as masking $M_{11}$ with $M_{9}$.

It is obvious if we use the conclusion above:

Object $9$, expressed in the local coordinate system of $M_9$ and transformed to the root, is ${M_{5} \times M_{1} \times } M_{6} \times M_{7} \times M_{9} \times p_{Local9}$.

Object $9$, expressed in the local coordinate system of $M_{11}$ and transformed to the root, is ${M_{5} \times M_{1} \times } M_{10} \times M_{11} \times p_{Local11}$.

Note that object $9$ at $M_{9}$, denoted as $p_{Local9}$, can be viewed from $M_{11}$ as $B \times p_{Local11}$, meaning that $p_{Local9} = B \times p_{Local11}$, where $p_{Local11}$ is the equivalent of $p_{Local9}$ in $M_{11}$ and $B$ is the matrix that makes the two equivalent.

$$\sout{M_{5} \times M_{1} \times } M_{6} \times M_{7} \times M_{9} \times B \times p_{Local11} = \sout{M_{5} \times M_{1} \times } M_{10} \times M_{11} \times p_{Local11}$$

We can see that both the left-side object $p_{Local9}$ and the right-side object $p_{Local11}$ only need to be represented in the local coordinate system of $M_{5}$, without having to be taken all the way to the root coordinate system.

Going one step further, the equation holds if and only if $M_{7} \times M_{9} \times B = M_{10} \times M_{11}$, which gives

$$B = {M_{9}}^{-1} \times {M_{7}}^{-1} \times M_{10} \times M_{11}$$

The transformation from $M_9$ to $M_{11}$ is

$$p_{Local11} = B^{-1} \times p_{Local9} = {M_{11}}^{-1} \times {M_{10}}^{-1} \times M_{7} \times M_{9} \times p_{Local9}$$

just like the path flow $P_1 \rightarrow P_2 \rightarrow P_3 \rightarrow P_4 \rightarrow P_5$ shown in the image.
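
The last formula can be sketched as a small function: walk up from the source node to the common ancestor, then walk down to the target node with the inverted matrices. The matrix values for $M_7$, $M_9$, $M_{10}$ and $M_{11}$ below are hypothetical, and the function name `convert_between_nodes` and its helpers are made up for this example:

```python
from fractions import Fraction
from functools import reduce


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def apply(m, p):
    """Apply matrix m to point p, treating p as a column vector."""
    return tuple(sum(m[i][k] * p[k] for k in range(len(p)))
                 for i in range(len(m)))


def inverse_2x2(m):
    """Return the exact inverse of a 2x2 matrix, or raise if it is singular."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]


def convert_between_nodes(source_path, target_path, p_source):
    """Convert a point between two local systems under a common ancestor.

    Both paths list local-to-parent matrices from just below the common
    ancestor down to the node itself, outermost first.
    """
    up = reduce(matmul, source_path)                 # source local -> ancestor
    down = inverse_2x2(reduce(matmul, target_path))  # ancestor -> target local
    return apply(matmul(down, up), p_source)


# Hypothetical values for the matrices named in the text.
M7, M9 = [[1, 1], [0, 1]], [[2, 0], [0, 2]]
M10, M11 = [[0, -1], [1, 0]], [[1, 0], [0, 4]]

p_local9 = (1, 2)
p_local11 = convert_between_nodes([M7, M9], [M10, M11], p_local9)

# Both local points land on the same spot in the common ancestor's system.
print(apply(matmul(M7, M9), p_local9) == apply(matmul(M10, M11), p_local11))  # True
```

Inverting the product of the target path once is equivalent to applying ${M_{11}}^{-1} \times {M_{10}}^{-1}$, because the inverse of a product is the product of the inverses in reverse order.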