Documentation – Pilot v25.2

Geometric Camera Models

Definitions

Term | Definition | Notes
Backward Camera Model | A Geometric Camera Model that transforms 2D image points into 3D rays. |
Camera-Relative World Point | A World Point in the reference frame of the camera. |
Extrinsics | A portion of a Geometric Camera Model that describes the position and orientation of an object within a world coordinate system. | See also: Pose.
Forward Camera Model | A Geometric Camera Model that transforms 3D world points into 2D image points. |
Geometric Camera Model | A model that describes the geometric properties of a camera. | This is sometimes referred to as a “geometric calibration model”. A Geometric Camera Model is composed of Extrinsics, Intrinsics, and Distortion.
Global World Point | A World Point in the global reference frame. |
Image Coordinate System | | See (todo)
Image Point | An ordered pair (x, y) describing a 2D location on the focal plane. | An Image Point is defined in an Image Coordinate System.
Intrinsics | A portion of a Geometric Camera Model that describes the internal geometric properties of the camera. |
Pose | The position and orientation of an object relative to some world coordinate system. | See also: Extrinsics.
Principal Point | The point at which the optical axis intersects the focal plane. |
World Point | An ordered triple (X, Y, Z) describing a 3D location in a world coordinate system. |

Notation

All vectors are column vectors. Uppercase \(X\), \(Y\), and \(Z\) refer to 3D world points. Lowercase \(x\) and \(y\) refer to 2D image points.

Homogeneous Coordinates

Homogeneous coordinates are a set of coordinates with useful properties for perspective geometry [1, 2]:

  • Infinity may be represented with a finite value.
  • Rotations and translations may be represented by a single matrix operation.
  • Homogeneous coordinates may be used for a space with an arbitrary dimension, including 2D (image) and 3D (world) coordinates.

A “standard” coordinate is referred to as inhomogeneous. 

Properties 

  • A homogeneous coordinate is at infinity if and only if its last coordinate is 0.
  • A homogeneous coordinate is at a finite location if and only if its last coordinate is not 0.
  • Two homogeneous points are the same if and only if one is a non-zero scalar multiple of the other, i.e., \(\mathbf{x}=k\cdot\mathbf{y}\) for some \(k\neq 0\).
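
These properties can be checked numerically. The following is a minimal Python sketch, not part of any library: the function names `is_at_infinity` and `same_point` and the tolerance `tol` are assumptions made for illustration. Proportionality is tested by checking that every 2×2 minor of the two vectors vanishes, which avoids dividing by a component that may be zero.

```python
def is_at_infinity(p, tol=1e-12):
    """Return True if the last component of the homogeneous coordinate p is (near) zero."""
    return abs(p[-1]) <= tol


def same_point(p, q, tol=1e-12):
    """Return True if p and q differ only by a non-zero scalar multiplier."""
    if len(p) != len(q):
        raise ValueError("points must have the same dimension")
    if all(abs(c) <= tol for c in p) or all(abs(c) <= tol for c in q):
        raise ValueError("the all-zero vector is not a valid homogeneous point")
    n = len(p)
    # p and q are proportional iff p[i]*q[j] == p[j]*q[i] for every pair (i, j).
    return all(
        abs(p[i] * q[j] - p[j] * q[i]) <= tol
        for i in range(n)
        for j in range(i + 1, n)
    )
```

For example, `same_point([1.0, 2.0, 1.0], [2.0, 4.0, 2.0])` holds because the second vector is twice the first, while `[1.0, 2.0, 0.0]` is reported as a point at infinity.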

Transforms

Inhomogeneous to Homogeneous

The simplest way to convert from an inhomogeneous coordinate to a homogeneous one is to append a 1 to the end of the coordinate. 

\(\begin{bmatrix}x\\y\end{bmatrix}\rightarrow\begin{bmatrix}x\\y\\1\end{bmatrix}\)

\(\begin{bmatrix}X\\Y\\Z\end{bmatrix}\rightarrow\begin{bmatrix}X\\Y\\Z\\1\end{bmatrix}\)

The general conversion is to append a 1 and then multiply every component by any non-zero real number \(k\).

\(\begin{bmatrix}x\\y\end{bmatrix}\rightarrow\begin{bmatrix}k\cdot x\\k\cdot y\\k\end{bmatrix}\)

\(\begin{bmatrix}X\\Y\\Z\end{bmatrix}\rightarrow\begin{bmatrix}k\cdot X\\k\cdot Y\\k\cdot Z\\k\end{bmatrix}\)
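
As a minimal sketch of this conversion in Python (the function name `to_homogeneous` and its parameter `k` are assumptions for illustration), the same code handles both the 2D and the 3D case:

```python
def to_homogeneous(point, k=1.0):
    """Append a 1 to an inhomogeneous point, then scale every component by k."""
    if k == 0:
        raise ValueError("k must be non-zero")
    return [k * c for c in point] + [k]
```

With the default `k=1.0` this reduces to the simple form of appending a 1, so `to_homogeneous([3.0, 4.0])` gives `[3.0, 4.0, 1.0]`.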

Homogeneous to Inhomogeneous

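To convert from a homogeneous coordinate to an inhomogeneous one, divide all of the components by the last one, which is then discarded. This requires the last component to be non-zero; a coordinate whose last component is 0 is at infinity and has no inhomogeneous equivalent.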

\(\begin{bmatrix}x\\y\\w\end{bmatrix}\rightarrow\begin{bmatrix}x/w\\y/w\end{bmatrix}\)

\(\begin{bmatrix}X\\Y\\Z\\W\end{bmatrix}\rightarrow\begin{bmatrix}X/W\\Y/W\\Z/W\end{bmatrix}\)
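
A minimal Python sketch of the reverse conversion follows. The function name `from_homogeneous` is an assumption for illustration, and raising an error for a zero last component is one possible design choice rather than a prescribed behavior.

```python
def from_homogeneous(point):
    """Divide by the last component and discard it."""
    *head, w = point
    if w == 0:
        # A zero last component means the point is at infinity.
        raise ValueError("a point at infinity has no inhomogeneous equivalent")
    return [c / w for c in head]
```

For example, `from_homogeneous([6.0, 8.0, 2.0])` gives `[3.0, 4.0]`.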

Intrinsics

The intrinsic matrix, \(\mathbf{K}\), is an upper-triangular matrix that transforms a world coordinate relative to the camera into a homogeneous image coordinate. There are two general and equivalent forms of the intrinsic matrix:

\(\mathbf{K}=\begin{bmatrix} f_x & s & pp_x \\ 0 & f_y & pp_y \\ 0 & 0 & 1\end{bmatrix}=\begin{bmatrix} f & s & pp_x \\ 0 & f\cdot\alpha & pp_y \\ 0 & 0 & 1\end{bmatrix}\)

where

Variable | Description
\(f_x\) (written \(f\) in the second form) | The x-focal length
\(f_y=f_x\cdot\alpha\) | The y-focal length
\(\alpha=f_y/f_x\) | The focal length ratio
\(s\) | The skew
\(\mathbf{pp}=\begin{bmatrix}{pp}_x&{pp}_y\end{bmatrix}^\top\) | The principal point (intersection of the optical axis with the focal plane)

The intrinsic matrix of the \(j\)th camera is applied to the \(i\)th camera-relative 3D point to produce a homogeneous image point.

\(\begin{bmatrix} {x}_{i}^{j} \\ {y}_{i}^{j} \\{w}_{i}^{j} \end{bmatrix}=\mathbf{K}^j\begin{bmatrix} {X}_{i} \\ {Y}_{i}\\{Z}_i \end{bmatrix}\)

Notes:

  • The focal lengths are expressed in units of pixel pitch.
  • The principal point is defined in an image coordinate system.
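
To make the matrix product concrete, here is a minimal Python sketch that builds \(\mathbf{K}\) and applies it to one camera-relative point. The function names and all numeric values (focal lengths, principal point, test point) are hypothetical and chosen only for illustration.

```python
def make_intrinsic_matrix(fx, fy, s, ppx, ppy):
    """Build the upper-triangular intrinsic matrix K as a list of rows."""
    return [
        [fx, s, ppx],
        [0.0, fy, ppy],
        [0.0, 0.0, 1.0],
    ]


def apply_intrinsics(K, point_c):
    """Multiply K by a camera-relative point [X, Y, Z]; returns homogeneous [x, y, w]."""
    return [sum(K[r][c] * point_c[c] for c in range(3)) for r in range(3)]


# Hypothetical values, for illustration only.
K = make_intrinsic_matrix(fx=1000.0, fy=1000.0, s=0.0, ppx=640.0, ppy=360.0)
x, y, w = apply_intrinsics(K, [0.1, -0.2, 2.0])
```

Because the last row of \(\mathbf{K}\) is \(\begin{bmatrix}0&0&1\end{bmatrix}\), the homogeneous component `w` always equals the \(Z\) coordinate of the input point.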

Camera Models

Simple Pinhole

The forward direction of the simple pinhole model is: Extrinsics → Intrinsics.

The forward direction of the simple pinhole model is given by:

  1. Transform a camera-relative world point \(\begin{bmatrix} {X}_{c} & {Y}_{c} &{Z}_{c} \end{bmatrix}^\top\) through the intrinsic matrix: \(\begin{bmatrix} {x_a} \\ {y_a}\\w\end{bmatrix}=\begin{bmatrix} f_x & s & pp_x \\ 0 & f_y & pp_y \\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix} {X}_{c} \\ {Y}_{c} \\{Z}_{c} \end{bmatrix}\)
  2. Convert to an inhomogeneous image point: \(\begin{bmatrix} x \\ y \end{bmatrix}=\begin{bmatrix} x_a / w \\ y_a / w \end{bmatrix}\)
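
The two steps above can be sketched in Python as follows. This is a minimal illustration that assumes the point is already camera-relative (the extrinsics have been applied); the function name `project_pinhole` and the numeric values in the usage note are hypothetical.

```python
def project_pinhole(point_c, fx, fy, s, ppx, ppy):
    """Project a camera-relative point [Xc, Yc, Zc] with the simple pinhole model."""
    Xc, Yc, Zc = point_c
    # Step 1: multiply by the intrinsic matrix to get a homogeneous image point.
    xa = fx * Xc + s * Yc + ppx * Zc
    ya = fy * Yc + ppy * Zc
    w = Zc
    if w == 0:
        raise ValueError("the point projects to infinity (Zc == 0)")
    # Step 2: divide by the last component to get an inhomogeneous image point.
    return (xa / w, ya / w)
```

With hypothetical intrinsics `fx=fy=1000`, `s=0`, and principal point `(640, 360)`, the point `[0.1, -0.2, 2.0]` projects to approximately `(690, 260)`.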

OpenCV

The OpenCV camera model [3] is defined by the OpenCV library. The version used in Imatest is that of OpenCV 4.12.0.

The forward direction of the OpenCV model is: Extrinsics → Distortion → Intrinsics.

The forward direction of the OpenCV model is given by:

  1. Transform a camera-relative world point \(\begin{bmatrix} {X}_{c} & {Y}_{c} &{Z}_{c} \end{bmatrix}^\top\) into an undistorted image point \(\begin{bmatrix} {x_a} & {y_a}\end{bmatrix}^\top\): \(\begin{bmatrix} {x_a} \\ {y_a}\end{bmatrix}=\begin{bmatrix} {X}_{c} / {Z}_{c} \\ {Y}_{c} / {Z}_{c}\end{bmatrix}\)
  2. Compute the radius, \(r\): \(r=\sqrt{x_a^2+y_a^2}\)
  3. Apply the radial distortion: \(\begin{bmatrix} {x_b} \\ {y_b}\end{bmatrix}=\begin{bmatrix} {x_a\cdot \frac{1+k_1\cdot r^2+k_2\cdot r^4+k_3\cdot r^6}{1+k_4\cdot r^2+k_5\cdot r^4+k_6\cdot r^6}} \\ {y_a\cdot \frac{1+k_1\cdot r^2+k_2\cdot r^4+k_3\cdot r^6}{1+k_4\cdot r^2+k_5\cdot r^4+k_6\cdot r^6}}\end{bmatrix}\)
  4. Apply the tangential distortion: \(\begin{bmatrix} {x_c} \\ {y_c}\end{bmatrix}=\begin{bmatrix} x_b + 2\cdot p_1\cdot x_a\cdot y_a+p_2\cdot (r^2+2\cdot x_a^2) \\ y_b+ p_1\cdot (r^2+2\cdot y_a^2)+2\cdot p_2\cdot x_a\cdot y_a\end{bmatrix}\)
  5. Apply the thin-prism distortion: \(\begin{bmatrix} {x_d} \\ {y_d}\end{bmatrix}=\begin{bmatrix} x_c + s_1\cdot r^2 + s_2\cdot r^4 \\ y_c+ s_3\cdot r^2 + s_4\cdot r^4\end{bmatrix}\)
  6. Apply the tilt distortion (and convert back to an inhomogeneous image coordinate): \(w\cdot \begin{bmatrix} {x_e} \\ {y_e} \\ 1\end{bmatrix}=\begin{bmatrix}\cos(\tau_x) \cdot \cos^2(\tau_y) + \cos(\tau_x)\cdot \sin^2(\tau_y) & 0 & 0 \\ -\sin(\tau_x) \cdot \sin(\tau_y) & \cos^2(\tau_x)\cdot \cos(\tau_y) + \cos(\tau_y) \cdot \sin^2(\tau_x) & 0 \\ \sin(\tau_y) & -\cos(\tau_y)\cdot \sin(\tau_x) & \cos(\tau_x) \cdot \cos(\tau_y)\end{bmatrix}\begin{bmatrix} x_d\\y_d\\1\end{bmatrix}\)
  7. Apply the intrinsics: \(\begin{bmatrix} {x_f} \\ {y_f}\end{bmatrix}=\begin{bmatrix} f_x\cdot x_e + s\cdot y_e + {pp}_x \\ f_y\cdot y_e + {pp}_y\end{bmatrix}\)
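
The seven steps above can be sketched in plain Python as shown below. This is an illustrative re-implementation, not the OpenCV API or Imatest's code: the function name `project_opencv`, its parameter names, and the grouping of coefficients into tuples are all assumptions. The skew term `s` comes from the intrinsic matrix defined in this document. The tilt matrix is the product given in the OpenCV documentation [3], with its two non-trivial diagonal entries reduced to \(\cos(\tau_x)\) and \(\cos(\tau_y)\) using \(\cos^2+\sin^2=1\).

```python
import math


def project_opencv(point_c, fx, fy, pp, s=0.0,
                   k=(0.0,) * 6, p=(0.0, 0.0), sp=(0.0,) * 4, tau=(0.0, 0.0)):
    """Forward-project a camera-relative point [Xc, Yc, Zc] to an image point.

    k = (k1..k6) radial, p = (p1, p2) tangential, sp = (s1..s4) thin-prism,
    tau = (tau_x, tau_y) tilt angles in radians, s = skew.
    """
    Xc, Yc, Zc = point_c
    k1, k2, k3, k4, k5, k6 = k
    p1, p2 = p
    s1, s2, s3, s4 = sp
    tx, ty = tau

    # Step 1: perspective division to an undistorted (normalized) image point.
    xa, ya = Xc / Zc, Yc / Zc
    # Step 2: squared radius and its powers (r itself is never needed).
    r2 = xa * xa + ya * ya
    r4 = r2 * r2
    r6 = r4 * r2
    # Step 3: rational radial distortion.
    radial = (1 + k1 * r2 + k2 * r4 + k3 * r6) / (1 + k4 * r2 + k5 * r4 + k6 * r6)
    xb, yb = xa * radial, ya * radial
    # Step 4: tangential distortion.
    xc = xb + 2 * p1 * xa * ya + p2 * (r2 + 2 * xa * xa)
    yc = yb + p1 * (r2 + 2 * ya * ya) + 2 * p2 * xa * ya
    # Step 5: thin-prism distortion.
    xd = xc + s1 * r2 + s2 * r4
    yd = yc + s3 * r2 + s4 * r4
    # Step 6: tilt distortion, followed by division by the homogeneous component.
    cx, sx = math.cos(tx), math.sin(tx)
    cy, sy = math.cos(ty), math.sin(ty)
    xe_h = cx * xd                       # row 1: [cos(tx), 0, 0]
    ye_h = -sx * sy * xd + cy * yd       # row 2: [-sin(tx)sin(ty), cos(ty), 0]
    w = sy * xd - cy * sx * yd + cx * cy  # row 3
    xe, ye = xe_h / w, ye_h / w
    # Step 7: intrinsics.
    return (fx * xe + s * ye + pp[0], fy * ye + pp[1])
```

With every distortion coefficient left at zero, the function reduces to the simple pinhole projection, which makes a convenient sanity check before enabling individual distortion terms.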

References

[1] Hartley, R., & Zisserman, A. (2003). Multiple view geometry in computer vision. Cambridge University Press.

[2] Ma, Y., Soatto, S., Košecká, J., & Sastry, S. (2004). An invitation to 3-d vision: from images to geometric models (Vol. 26). New York: Springer.

[3] OpenCV. Camera Calibration and 3D Reconstruction. https://docs.opencv.org/4.12.0/d9/d0c/group__calib3d.html [Accessed 2025-08-18]