OpenGL Projection Matrix

Related Topics: OpenGL Transformation


The mathematical expressions on this page were rewritten by Jiergir Ogoerg using MathML. This page is intended to promote MathML for better presentation of mathematical notation in HTML pages, and to encourage better MathML support in web browsers.

Overview

A computer monitor is a 2D surface, so a 3D scene rendered by OpenGL must be projected onto the screen as a 2D image. The GL_PROJECTION matrix performs this projection transformation. First, it transforms all vertex data from eye coordinates to clip coordinates. Then, these clip coordinates are transformed to normalized device coordinates (NDC) by dividing by the w-component of the clip coordinates.

A triangle clipped by frustum

Therefore, keep in mind that both clipping (frustum culling) and the NDC transformation are integrated into the GL_PROJECTION matrix. The following sections describe how to build the projection matrix from 6 parameters: the left, right, bottom, top, near and far boundary values.

Note that frustum culling (clipping) is performed in clip coordinates, just before dividing by w_c. The clip coordinates x_c, y_c and z_c are tested by comparing them with w_c. If any clip coordinate is less than -w_c or greater than w_c, the vertex is discarded. OpenGL then reconstructs the edges of the polygon where clipping occurs.
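The clip test described above can be sketched in a few lines of Python. This is only an illustration of the comparison against w_c, not how an OpenGL driver implements clipping; the function names `is_inside_clip_volume` and `ndc_from_clip` are hypothetical.

```python
def is_inside_clip_volume(x_c, y_c, z_c, w_c):
    """Return True if -w_c <= x_c, y_c, z_c <= w_c (the vertex is kept)."""
    return all(-w_c <= c <= w_c for c in (x_c, y_c, z_c))


def ndc_from_clip(x_c, y_c, z_c, w_c):
    """Perspective division: clip coordinates -> normalized device coordinates."""
    return (x_c / w_c, y_c / w_c, z_c / w_c)


if __name__ == "__main__":
    inside = (0.5, -0.5, 1.0, 2.0)   # every component lies within [-2, 2]
    outside = (3.0, 0.0, 0.0, 2.0)   # x_c exceeds w_c
    print(is_inside_clip_volume(*inside), ndc_from_clip(*inside))
    print(is_inside_clip_volume(*outside))
```

A vertex that passes the test always lands inside the [-1, 1] cube after the division, which is why the test can be done before dividing.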

Perspective Projection

Perspective Frustum and Normalized Device Coordinates (NDC)

In perspective projection, a 3D point in a truncated pyramid frustum (eye coordinates) is mapped to a cube (NDC): the x-coordinate is mapped from [l, r] to [-1, 1], the y-coordinate from [b, t] to [-1, 1] and the z-coordinate from [-n, -f] to [-1, 1].

Note that eye coordinates are defined in a right-handed coordinate system, but NDC uses a left-handed coordinate system. That is, the camera at the origin looks along the -Z axis in eye space, but along the +Z axis in NDC. Since glFrustum() accepts only positive values for the near and far distances, we need to negate them when constructing the GL_PROJECTION matrix.

In OpenGL, a 3D point in eye space is projected onto the near plane (projection plane). The following diagrams show how a point (x_e, y_e, z_e) in eye space is projected to (x_p, y_p, z_p) on the near plane.

Top View of Frustum

Side View of Frustum

From the top view of the frustum, the eye-space x-coordinate x_e is mapped to x_p, which is calculated using the ratio of similar triangles:
$\frac{x_{p}}{x_{e}} = \frac{-n}{z_{e}}$
$x_{p} = \frac{-n \cdot x_{e}}{z_{e}} = \frac{n \cdot x_{e}}{-z_{e}}$

From the side view of the frustum, y_p is calculated in a similar way:
$\frac{y_{p}}{y_{e}} = \frac{-n}{z_{e}}$
$y_{p} = \frac{-n \cdot y_{e}}{z_{e}} = \frac{n \cdot y_{e}}{-z_{e}}$
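As a quick numerical check of the similar-triangle relations, the sketch below projects an eye-space point onto the near plane. The function name `project_to_near_plane` is hypothetical, and the guard against non-negative z_e is an assumption added for the example (the camera looks along -Z, so visible points have z_e < 0).

```python
def project_to_near_plane(x_e, y_e, z_e, n):
    """Project an eye-space point onto the near plane z = -n."""
    if z_e >= 0:
        raise ValueError("the point must be in front of the camera (z_e < 0)")
    scale = n / -z_e              # both x_p and y_p are divided by -z_e
    return (x_e * scale, y_e * scale, -n)


if __name__ == "__main__":
    # A point four units in front of the camera, near plane at distance 1.
    print(project_to_near_plane(2.0, 4.0, -4.0, 1.0))
```

Every point, whatever its depth, ends up with z_p = -n, which is the reason the depth row of the matrix needs separate treatment later.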

Note that both x_p and y_p depend on z_e; they are inversely proportional to -z_e. In other words, they are both divided by -z_e. This is the first clue for constructing the GL_PROJECTION matrix. After the eye coordinates are transformed by multiplying by the GL_PROJECTION matrix, the clip coordinates are still homogeneous coordinates. They finally become the normalized device coordinates (NDC) after being divided by the w-component of the clip coordinates. (See more details on OpenGL Transformation.)
$\begin{pmatrix} x_{clip} \\ y_{clip} \\ z_{clip} \\ w_{clip} \end{pmatrix} = M_{projection} \cdot \begin{pmatrix} x_{eye} \\ y_{eye} \\ z_{eye} \\ w_{eye} \end{pmatrix}, \quad \begin{pmatrix} x_{ndc} \\ y_{ndc} \\ z_{ndc} \end{pmatrix} = \begin{pmatrix} \frac{x_{clip}}{w_{clip}} \\ \frac{y_{clip}}{w_{clip}} \\ \frac{z_{clip}}{w_{clip}} \end{pmatrix}$

Therefore, we can set the w-component of the clip coordinates to -z_e, and the 4th row of the GL_PROJECTION matrix becomes (0, 0, -1, 0).
$\begin{pmatrix} x_{c} \\ y_{c} \\ z_{c} \\ w_{c} \end{pmatrix} = \begin{pmatrix} \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot \\ 0 & 0 & -1 & 0 \end{pmatrix} \begin{pmatrix} x_{e} \\ y_{e} \\ z_{e} \\ w_{e} \end{pmatrix}, \quad \therefore w_{c} = -z_{e}$

Next, we map x_p and y_p to x_n and y_n of NDC with a linear relationship: [l, r] ⇒ [-1, 1] and [b, t] ⇒ [-1, 1].

Mapping from x_p to x_n

$x_{n} = \frac{1 - (-1)}{r - l} \cdot x_{p} + \beta$

$1 = \frac{2 r}{r - l} + \beta$ (substitute $(r, 1)$ for $(x_{p}, x_{n})$)

$\beta = 1 - \frac{2 r}{r - l} = \frac{r - l}{r - l} - \frac{2 r}{r - l} = \frac{r - l - 2 r}{r - l} = \frac{-r - l}{r - l} = -\frac{r + l}{r - l}$

$\therefore x_{n} = \frac{2 x_{p}}{r - l} - \frac{r + l}{r - l}$

Mapping from y_p to y_n

$y_{n} = \frac{1 - (-1)}{t - b} \cdot y_{p} + \beta$

$1 = \frac{2 t}{t - b} + \beta$ (substitute $(t, 1)$ for $(y_{p}, y_{n})$)

$\beta = 1 - \frac{2 t}{t - b} = \frac{t - b}{t - b} - \frac{2 t}{t - b} = \frac{t - b - 2 t}{t - b} = \frac{-t - b}{t - b} = -\frac{t + b}{t - b}$

$\therefore y_{n} = \frac{2 y_{p}}{t - b} - \frac{t + b}{t - b}$
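The two linear mappings above share one form, so they can be sketched as a single helper. The name `map_to_ndc` is hypothetical; `lo` and `hi` stand for (l, r) or (b, t).

```python
def map_to_ndc(value, lo, hi):
    """Linearly map value from [lo, hi] to [-1, 1]."""
    scale = 2.0 / (hi - lo)            # slope: (1 - (-1)) / (hi - lo)
    beta = -(hi + lo) / (hi - lo)      # offset found by substituting (hi, 1)
    return scale * value + beta


if __name__ == "__main__":
    l, r = -1.0, 3.0
    # The left edge, the centre and the right edge of the range.
    print(map_to_ndc(l, l, r), map_to_ndc(1.0, l, r), map_to_ndc(r, l, r))
```

The offset term vanishes only when the range is centred on zero, which is the symmetric case.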

Then, we substitute x_p and y_p into the above equations.

$x_{n} = \frac{2 x_{p}}{r - l} - \frac{r + l}{r - l} \qquad \left(x_{p} = \frac{n x_{e}}{-z_{e}}\right)$

$= \frac{2 \cdot \frac{n \cdot x_{e}}{-z_{e}}}{r - l} - \frac{r + l}{r - l}$

$= \frac{2 n \cdot x_{e}}{(r - l)(-z_{e})} - \frac{r + l}{r - l}$

$= \frac{\frac{2 n}{r - l} \cdot x_{e}}{-z_{e}} - \frac{r + l}{r - l}$

$= \frac{\frac{2 n}{r - l} \cdot x_{e}}{-z_{e}} + \frac{\frac{r + l}{r - l} \cdot z_{e}}{-z_{e}}$

$= \frac{\frac{2 n}{r - l} \cdot x_{e} + \frac{r + l}{r - l} \cdot z_{e}}{-z_{e}}$

$y_{n} = \frac{2 y_{p}}{t - b} - \frac{t + b}{t - b} \qquad \left(y_{p} = \frac{n y_{e}}{-z_{e}}\right)$

$= \frac{2 \cdot \frac{n \cdot y_{e}}{-z_{e}}}{t - b} - \frac{t + b}{t - b}$

$= \frac{2 n \cdot y_{e}}{(t - b)(-z_{e})} - \frac{t + b}{t - b}$

$= \frac{\frac{2 n}{t - b} \cdot y_{e}}{-z_{e}} - \frac{t + b}{t - b}$

$= \frac{\frac{2 n}{t - b} \cdot y_{e}}{-z_{e}} + \frac{\frac{t + b}{t - b} \cdot z_{e}}{-z_{e}}$

$= \frac{\frac{2 n}{t - b} \cdot y_{e} + \frac{t + b}{t - b} \cdot z_{e}}{-z_{e}}$

Note that we make both terms of each equation divisible by -z_e for the perspective division (x_c/w_c, y_c/w_c). Since we set w_c to -z_e earlier, the terms inside the parentheses become x_c and y_c of the clip coordinates.

From these equations, we can find the 1st and 2nd rows of the GL_PROJECTION matrix.
$\begin{pmatrix} x_{c} \\ y_{c} \\ z_{c} \\ w_{c} \end{pmatrix} = \begin{pmatrix} \frac{2 n}{r - l} & 0 & \frac{r + l}{r - l} & 0 \\ 0 & \frac{2 n}{t - b} & \frac{t + b}{t - b} & 0 \\ \cdot & \cdot & \cdot & \cdot \\ 0 & 0 & -1 & 0 \end{pmatrix} \begin{pmatrix} x_{e} \\ y_{e} \\ z_{e} \\ w_{e} \end{pmatrix}$
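One way to gain confidence in these two rows is to compare them numerically with the two-step route (project onto the near plane, then map linearly to NDC). The sketch below does that for an asymmetric frustum; both function names are hypothetical and the sample values are arbitrary.

```python
def ndc_xy_two_step(x_e, y_e, z_e, l, r, b, t, n):
    """Project onto the near plane, then map [l, r] and [b, t] to [-1, 1]."""
    x_p = n * x_e / -z_e
    y_p = n * y_e / -z_e
    x_n = 2.0 * x_p / (r - l) - (r + l) / (r - l)
    y_n = 2.0 * y_p / (t - b) - (t + b) / (t - b)
    return x_n, y_n


def ndc_xy_from_rows(x_e, y_e, z_e, l, r, b, t, n):
    """Use rows 1, 2 and 4 of the matrix, then divide by w_c = -z_e."""
    x_c = 2.0 * n / (r - l) * x_e + (r + l) / (r - l) * z_e
    y_c = 2.0 * n / (t - b) * y_e + (t + b) / (t - b) * z_e
    w_c = -z_e
    return x_c / w_c, y_c / w_c


if __name__ == "__main__":
    args = (1.5, -0.5, -2.5, -1.0, 3.0, -2.0, 2.0, 1.0)
    print(ndc_xy_two_step(*args))
    print(ndc_xy_from_rows(*args))   # should agree up to rounding
```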

Now, only the 3rd row of the GL_PROJECTION matrix remains to be solved. Finding z_n is a little different from the others because z_e in eye space is always projected to -n on the near plane. But we need a unique z value for clipping and the depth test. Plus, we should be able to unproject (inverse transform) it. Since we know z does not depend on the x or y value, we borrow the w-component to find the relationship between z_n and z_e. Therefore, we can specify the 3rd row of the GL_PROJECTION matrix like this:
$\begin{pmatrix} x_{c} \\ y_{c} \\ z_{c} \\ w_{c} \end{pmatrix} = \begin{pmatrix} \frac{2 n}{r - l} & 0 & \frac{r + l}{r - l} & 0 \\ 0 & \frac{2 n}{t - b} & \frac{t + b}{t - b} & 0 \\ 0 & 0 & A & B \\ 0 & 0 & -1 & 0 \end{pmatrix} \begin{pmatrix} x_{e} \\ y_{e} \\ z_{e} \\ w_{e} \end{pmatrix}, \quad z_{n} = \frac{z_{c}}{w_{c}} = \frac{A z_{e} + B w_{e}}{-z_{e}}$

In eye space, w_e equals 1. Therefore, the equation becomes:
$z_{n} = \frac{A z_{e} + B}{-z_{e}}$

To find the coefficients A and B, we use the two known (z_e, z_n) pairs, (-n, -1) and (-f, 1), and put them into the above equation.
$\begin{cases} \frac{-A n + B}{n} = -1 \\ \frac{-A f + B}{f} = 1 \end{cases} \rightarrow \begin{cases} -A n + B = -n & (1) \\ -A f + B = f & (2) \end{cases}$

To solve the equations for A and B, rewrite eq.(1) for B:
$B = A n - n \qquad (1')$

Substitute eq.(1') for B in eq.(2), then solve for A:
$-A f + (A n - n) = f \qquad (2)$
$-(f - n) A = f + n$
$A = -\frac{f + n}{f - n}$

Put A into eq.(1) to find B:
$\left(\frac{f + n}{f - n}\right) n + B = -n \qquad (1)$
$B = -n - \left(\frac{f + n}{f - n}\right) n = -\left(1 + \frac{f + n}{f - n}\right) n = -\left(\frac{f - n + f + n}{f - n}\right) n = -\frac{2 f n}{f - n}$

We found A and B. Therefore, the relation between z_e and z_n becomes:
$z_{n} = \frac{-\frac{f + n}{f - n} z_{e} - \frac{2 f n}{f - n}}{-z_{e}} \qquad (3)$
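The solved coefficients can be checked against the two boundary conditions used to derive them. This is a minimal sketch with hypothetical function names; n = 1 and f = 100 are arbitrary sample values.

```python
def depth_coefficients(n, f):
    """Return (A, B) of the 3rd row for near distance n and far distance f."""
    a = -(f + n) / (f - n)
    b = -2.0 * f * n / (f - n)
    return a, b


def z_ndc(z_e, n, f):
    """Eq.(3): z_n = (A * z_e + B) / -z_e, assuming w_e = 1."""
    a, b = depth_coefficients(n, f)
    return (a * z_e + b) / -z_e


if __name__ == "__main__":
    n, f = 1.0, 100.0
    print(z_ndc(-n, n, f))   # near plane, expected to be close to -1
    print(z_ndc(-f, n, f))   # far plane, expected to be close to +1
```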

Finally, we found all entries of the GL_PROJECTION matrix. The complete projection matrix is:
$\begin{pmatrix} \frac{2 n}{r - l} & 0 & \frac{r + l}{r - l} & 0 \\ 0 & \frac{2 n}{t - b} & \frac{t + b}{t - b} & 0 \\ 0 & 0 & \frac{-(f + n)}{f - n} & \frac{-2 f n}{f - n} \\ 0 & 0 & -1 & 0 \end{pmatrix}$
OpenGL Perspective Projection Matrix
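The complete matrix can be exercised by pushing frustum corners through it and checking that they land on corners of the NDC cube. The sketch below stores the matrix as a row-major list of rows purely for readability; this layout is an assumption of the example, and OpenGL itself expects column-major storage when a matrix is uploaded. The helper names are hypothetical.

```python
def frustum(l, r, b, t, n, f):
    """Perspective projection matrix written row by row, as in the text."""
    return [
        [2.0 * n / (r - l), 0.0, (r + l) / (r - l), 0.0],
        [0.0, 2.0 * n / (t - b), (t + b) / (t - b), 0.0],
        [0.0, 0.0, -(f + n) / (f - n), -2.0 * f * n / (f - n)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def transform(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


def to_ndc(clip):
    """Perspective division by w_c."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)


if __name__ == "__main__":
    l, r, b, t, n, f = -2.0, 2.0, -1.0, 1.0, 1.0, 10.0
    m = frustum(l, r, b, t, n, f)
    # Near bottom-left corner, expected near (-1, -1, -1).
    print(to_ndc(transform(m, [l, b, -n, 1.0])))
    # Far top-right corner: the frustum widens by f / n at the far plane.
    print(to_ndc(transform(m, [r * f / n, t * f / n, -f, 1.0])))
```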

This projection matrix is for a general frustum. If the viewing volume is symmetric, that is, r = -l and t = -b, then it can be simplified as:
$\begin{cases} r + l = 0 \\ r - l = 2 r \ (\text{width}) \end{cases}, \quad \begin{cases} t + b = 0 \\ t - b = 2 t \ (\text{height}) \end{cases}$

$\begin{pmatrix} \frac{n}{r} & 0 & 0 & 0 \\ 0 & \frac{n}{t} & 0 & 0 \\ 0 & 0 & \frac{-(f + n)}{f - n} & \frac{-2 f n}{f - n} \\ 0 & 0 & -1 & 0 \end{pmatrix}$

Before we move on, please take another look at the relation between z_e and z_n, eq.(3). Notice that it is a rational function, so the relationship between z_e and z_n is non-linear. This means there is very high precision near the near plane, but very little precision near the far plane. If the range [-n, -f] gets larger, it causes a depth precision problem (z-fighting): a small change of z_e around the far plane barely affects the z_n value. The distance between n and f should be as short as possible to minimize the depth buffer precision problem.
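To make the non-linearity concrete, the sketch below measures how much z_n changes for a one-unit step in eye-space depth, once right behind the near plane and once right before the far plane. The function names and the values n = 1, f = 100 are assumptions chosen for illustration.

```python
def z_ndc(z_e, n, f):
    """NDC depth of an eye-space depth z_e (negative in front of the camera)."""
    return (-(f + n) / (f - n) * z_e - 2.0 * f * n / (f - n)) / -z_e


def ndc_depth_step(distance, n, f, step=1.0):
    """Change in z_n when moving from `distance` to `distance + step`."""
    return z_ndc(-(distance + step), n, f) - z_ndc(-distance, n, f)


if __name__ == "__main__":
    n, f = 1.0, 100.0
    near_step = ndc_depth_step(1.0, n, f)    # from distance 1 to 2, roughly 1.01
    far_step = ndc_depth_step(99.0, n, f)    # from distance 99 to 100, roughly 0.0002
    print(near_step, far_step)
```

With these sample values, the first unit of depth uses about half of the [-1, 1] range, while the last unit uses a tiny fraction of it, so distant surfaces that are a small distance apart may resolve to the same depth buffer value.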

Comparison of Depth Buffer Precisions

Orthographic Projection

Orthographic Volume and Normalized Device Coordinates (NDC)

Constructing the GL_PROJECTION matrix for orthographic projection is much simpler than for perspective projection.

All x_e, y_e and z_e components in eye space are linearly mapped to NDC. We just need to scale a rectangular volume to a cube, then move it to the origin. Let's find the elements of GL_PROJECTION using this linear relationship.

Mapping from x_e to x_n

$x_{n} = \frac{1 - (-1)}{r - l} \cdot x_{e} + \beta$

$1 = \frac{2 r}{r - l} + \beta$ (substitute $(r, 1)$ for $(x_{e}, x_{n})$)

$\beta = 1 - \frac{2 r}{r - l} = -\frac{r + l}{r - l}$

$\therefore x_{n} = \frac{2}{r - l} \cdot x_{e} - \frac{r + l}{r - l}$

Mapping from y_e to y_n

$y_{n} = \frac{1 - (-1)}{t - b} \cdot y_{e} + \beta$

$1 = \frac{2 t}{t - b} + \beta$ (substitute $(t, 1)$ for $(y_{e}, y_{n})$)

$\beta = 1 - \frac{2 t}{t - b} = -\frac{t + b}{t - b}$

$\therefore y_{n} = \frac{2}{t - b} \cdot y_{e} - \frac{t + b}{t - b}$

Mapping from z_e to z_n

$z_{n} = \frac{1 - (-1)}{-f - (-n)} \cdot z_{e} + \beta$

$1 = \frac{2 f}{f - n} + \beta$ (substitute $(-f, 1)$ for $(z_{e}, z_{n})$)

$\beta = 1 - \frac{2 f}{f - n} = -\frac{f + n}{f - n}$

$\therefore z_{n} = \frac{-2}{f - n} \cdot z_{e} - \frac{f + n}{f - n}$

Since the w-component is not necessary for orthographic projection, the 4th row of the GL_PROJECTION matrix remains (0, 0, 0, 1). Therefore, the complete GL_PROJECTION matrix for orthographic projection is:
$\begin{pmatrix} \frac{2}{r - l} & 0 & 0 & -\frac{r + l}{r - l} \\ 0 & \frac{2}{t - b} & 0 & -\frac{t + b}{t - b} \\ 0 & 0 & \frac{-2}{f - n} & -\frac{f + n}{f - n} \\ 0 & 0 & 0 & 1 \end{pmatrix}$
OpenGL Orthographic Projection Matrix
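As with the perspective case, the orthographic matrix can be checked by transforming two opposite corners of the viewing box. The sketch uses a row-major list of rows for readability (an assumption of this example, since OpenGL stores matrices column-major), and the helper names and sample bounds are hypothetical.

```python
def ortho(l, r, b, t, n, f):
    """Orthographic projection matrix written row by row, as in the text."""
    return [
        [2.0 / (r - l), 0.0, 0.0, -(r + l) / (r - l)],
        [0.0, 2.0 / (t - b), 0.0, -(t + b) / (t - b)],
        [0.0, 0.0, -2.0 / (f - n), -(f + n) / (f - n)],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


if __name__ == "__main__":
    l, r, b, t, n, f = 0.0, 8.0, 0.0, 6.0, 1.0, 11.0
    m = ortho(l, r, b, t, n, f)
    print(transform(m, [l, b, -n, 1.0]))   # expected near (-1, -1, -1, 1)
    print(transform(m, [r, t, -f, 1.0]))   # expected near ( 1,  1,  1, 1)
```

Because w_c stays 1, no perspective division is needed and the clip coordinates already equal the NDC values.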

It can be further simplified if the viewing volume is symmetric, that is, r = -l and t = -b:
$\begin{cases} r + l = 0 \\ r - l = 2 r \ (\text{width}) \end{cases}, \quad \begin{cases} t + b = 0 \\ t - b = 2 t \ (\text{height}) \end{cases}$

$\begin{pmatrix} \frac{1}{r} & 0 & 0 & 0 \\ 0 & \frac{1}{t} & 0 & 0 \\ 0 & 0 & \frac{-2}{f - n} & -\frac{f + n}{f - n} \\ 0 & 0 & 0 & 1 \end{pmatrix}$

Credits

Jiergir Ogoerg: MathML Conversion
