Module 3: Transformations Lesson 3 of 3

Homogeneous Coordinates

In the previous lessons, rotation was a matrix multiplication and translation was an addition. Having two different operations is awkward — especially when you need to chain many transforms together. Homogeneous coordinates unify rotation and translation into a single matrix multiplication.

The Problem

To rotate and then translate a 2D point:

$$\mathbf{p}' = R(\theta) \cdot \mathbf{p} + \mathbf{t}$$

This requires two separate operations. We can’t represent translation as a 2×2 matrix multiply on a 2D vector. But what if we added an extra dimension?
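One way to see why no 2×2 matrix can do the job, sketched here with $M$ standing for an arbitrary 2×2 matrix (a symbol introduced only for this argument): matrix multiplication always sends the zero vector to itself, while a translation by a nonzero $\mathbf{t}$ moves it.

$$M \cdot \mathbf{0} = \mathbf{0} \quad \text{for every } 2 \times 2 \text{ matrix } M, \qquad \text{but} \qquad \mathbf{0} + \mathbf{t} = \mathbf{t} \neq \mathbf{0}$$

So translation is not a linear map on 2D vectors, and some extra structure is needed to express it as a multiplication.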

2D Homogeneous Transformations

The key idea: represent a 2D point $(x, y)$ as a 3D vector $(x, y, 1)$ by appending a 1. Now we can build a 3×3 matrix that handles both rotation and translation:

$$T = \begin{bmatrix} \cos\theta & -\sin\theta & t_x \\ \sin\theta & \cos\theta & t_y \\ 0 & 0 & 1 \end{bmatrix}$$

Applying it:

$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = T \cdot \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} x\cos\theta - y\sin\theta + t_x \\ x\sin\theta + y\cos\theta + t_y \\ 1 \end{bmatrix}$$

One multiplication does both rotation and translation!
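As a minimal sketch of the same computation in code (plain Python, standard library only; the helper names `se2` and `apply` are illustrative and not part of the lesson's math library), the matrix can be built from an angle and a translation and applied to a point by appending the 1:

```python
import math

def se2(theta, tx, ty):
    """3x3 homogeneous matrix: rotate by theta (radians), then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, tx],
            [s,  c, ty],
            [0.0, 0.0, 1.0]]

def apply(T, point):
    """Append 1 to the 2D point, multiply by T, and drop the trailing 1."""
    v = (point[0], point[1], 1.0)
    out = [sum(T[i][j] * v[j] for j in range(3)) for i in range(3)]
    return (out[0], out[1])

# Rotate the point (1, 0) by 90 degrees, then shift it by (2, 3).
T = se2(math.radians(90), 2.0, 3.0)
x_new, y_new = apply(T, (1.0, 0.0))
print(round(x_new, 6), round(y_new, 6))  # 2.0 4.0
```

The rotation sends (1, 0) to (0, 1), and the translation then moves it to (2, 4), all in one matrix-vector product.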

SE(2): Special Euclidean Group

The set of all 2D rigid-body transformations (rotation + translation, no scaling or shearing) forms the group SE(2).

  • Special: determinant = +1 (no reflections)
  • Euclidean: distances are preserved
  • This is the mathematically precise way to describe robot motion in a plane
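The two defining properties can be checked numerically. The following is a small sketch, assuming an arbitrarily chosen angle, translation, and pair of test points; the helper names are illustrative only:

```python
import math

def se2(theta, tx, ty):
    """3x3 homogeneous matrix for a rotation by theta and a translation (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]

def apply(T, p):
    """Transform the 2D point p by T (the appended 1 is handled implicitly)."""
    return (T[0][0] * p[0] + T[0][1] * p[1] + T[0][2],
            T[1][0] * p[0] + T[1][1] * p[1] + T[1][2])

T = se2(math.radians(30), 4.0, -1.0)

# "Special": the determinant of the 2x2 rotation block is +1.
det_R = T[0][0] * T[1][1] - T[0][1] * T[1][0]

# "Euclidean": the distance between two points is unchanged by T.
p, q = (1.0, 2.0), (-3.0, 0.5)
dist_before = math.dist(p, q)
dist_after = math.dist(apply(T, p), apply(T, q))

print(det_R, dist_before, dist_after)
```

Up to floating-point round-off, `det_R` is 1 and the two distances agree.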

Structure of the 3×3 Matrix

$$T = \begin{bmatrix} R_{2 \times 2} & \mathbf{t}_{2 \times 1} \\ \mathbf{0}_{1 \times 2} & 1 \end{bmatrix} = \begin{bmatrix} \text{rotation} & \text{translation} \\ 0 \quad 0 & 1 \end{bmatrix}$$

The top-left 2×2 block is the rotation matrix. The top-right column is the translation vector. The bottom row is always $[0 \quad 0 \quad 1]$.

Transformation Composition in SE(2)

The biggest advantage of homogeneous coordinates: chaining transforms is just matrix multiplication.

$$T_{\text{total}} = T_2 \cdot T_1$$

Apply $T_1$ first, then $T_2$: the same right-to-left rule as before.

Worked Example

Transform 1: Rotate 90° and translate $(1, 0)$:

$$T_1 = \begin{bmatrix} 0 & -1 & 1 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

Transform 2: Rotate 0° and translate $(0, 2)$:

$$T_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{bmatrix}$$

Composite $T_2 \cdot T_1$:

$$T_2 \cdot T_1 = \begin{bmatrix} 0 & -1 & 1 \\ 1 & 0 & 2 \\ 0 & 0 & 1 \end{bmatrix}$$

Applying to the origin $(0, 0, 1)^T$: the point ends up at $(1, 2)$. First $T_1$ rotates and translates to $(1, 0)$, then $T_2$ shifts it up by 2 to $(1, 2)$.
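The worked example can be reproduced with a few lines of code. This sketch uses a hand-written `matmul` helper (an illustrative name) and also multiplies in the opposite order to show that the order of composition matters:

```python
def matmul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# The two transforms from the worked example, written with exact entries.
T1 = [[0, -1, 1], [1, 0, 0], [0, 0, 1]]   # rotate 90 degrees, translate (1, 0)
T2 = [[1, 0, 0], [0, 1, 2], [0, 0, 1]]    # rotate 0 degrees, translate (0, 2)

T_total = matmul(T2, T1)     # T1 first, then T2
T_swapped = matmul(T1, T2)   # T2 first, then T1

# The image of the origin is the translation column of the composite.
origin_total = (T_total[0][2], T_total[1][2])
origin_swapped = (T_swapped[0][2], T_swapped[1][2])

print(T_total)         # [[0, -1, 1], [1, 0, 2], [0, 0, 1]]
print(origin_total)    # (1, 2)
print(origin_swapped)  # (-1, 0)
```

Swapping the order sends the origin to (−1, 0) instead of (1, 2), because the upward shift is applied before the 90° rotation.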

3D Homogeneous Transformations

The same idea extends to 3D using 4×4 matrices:

$$T = \begin{bmatrix} R_{3 \times 3} & \mathbf{t}_{3 \times 1} \\ \mathbf{0}_{1 \times 3} & 1 \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

This is an element of SE(3) — the Special Euclidean group in 3D. SE(3) has 6 degrees of freedom: 3 for rotation (from SO(3)) and 3 for translation.
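A minimal 3D sketch, assuming the simplest non-trivial case of a rotation about the z-axis combined with a translation (the helper names `se3_from_rz` and `apply` are illustrative):

```python
import math

def se3_from_rz(theta, t):
    """4x4 homogeneous transform: rotate by theta about z, then translate by t = (tx, ty, tz)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c,  -s,  0.0, t[0]],
            [s,   c,  0.0, t[1]],
            [0.0, 0.0, 1.0, t[2]],
            [0.0, 0.0, 0.0, 1.0]]

def apply(T, p):
    """Append 1 to the 3D point, multiply by T, and drop the trailing 1."""
    v = (p[0], p[1], p[2], 1.0)
    out = [sum(T[i][j] * v[j] for j in range(4)) for i in range(4)]
    return tuple(out[:3])

T = se3_from_rz(math.radians(90), (1.0, 2.0, 3.0))
moved = apply(T, (1.0, 0.0, 0.0))
print([round(c, 6) for c in moved])  # [1.0, 3.0, 3.0]
```

The point (1, 0, 0) rotates to (0, 1, 0) and is then shifted by (1, 2, 3), exactly as in the 2D case but with one more row and column.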

Efficient Inverse

The inverse of a homogeneous transform has a special structure:

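$$T^{-1} = \begin{bmatrix} R^T & -R^T \mathbf{t} \\ \mathbf{0} & 1 \end{bmatrix}$$

No need for general matrix inversion: because $R$ is a rotation matrix, its inverse is its transpose, so you only transpose $R$ and compute $-R^T \mathbf{t}$. This takes far fewer operations than a general inverse and avoids the round-off error that elimination-based inversion can introduce. The shortcut applies only to rigid transforms, where the top-left block really is a rotation.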
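A sketch of the structured inverse for the 2D case, checked by multiplying it back against the original transform (the function names are illustrative, and the angle and translation are arbitrary test values):

```python
import math

def se2(theta, tx, ty):
    """3x3 homogeneous matrix for a rotation by theta and a translation (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]

def matmul(A, B):
    """Multiply two 3x3 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def se2_inverse(T):
    """Invert a rigid 2D transform using R^T and -R^T t (no general inversion)."""
    (r00, r01, tx), (r10, r11, ty) = T[0], T[1]
    return [[r00, r10, -(r00 * tx + r10 * ty)],   # first row of R^T, first entry of -R^T t
            [r01, r11, -(r01 * tx + r11 * ty)],   # second row of R^T, second entry of -R^T t
            [0.0, 0.0, 1.0]]

T = se2(math.radians(40), 2.0, -1.0)
product = matmul(T, se2_inverse(T))
print([[round(v, 12) for v in row] for row in product])
```

Up to round-off, `product` is the 3×3 identity matrix.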

Robot Coordinate Frames (Robotics Application)

Every link of a robot arm has its own coordinate frame. The transformation $T_{i-1,i}$ describes how frame $i$ relates to frame $i-1$:

  • The rotation part $R$ encodes the relative orientation between the two frames
  • The translation part $\mathbf{t}$ encodes the offset between their origins

These 4×4 homogeneous transformation matrices encode both rotation and translation between consecutive joints in a single, elegant matrix.

Forward Kinematics: Chaining Transforms

For a robot arm with $n$ joints, the end-effector pose relative to the base is:

$$T_{0,n} = T_{0,1} \cdot T_{1,2} \cdot \ldots \cdot T_{n-1,n}$$

Each $T_{i-1,i}$ depends on the joint angle $\theta_i$ and the link geometry. Multiply them all together to get the end-effector’s position and orientation.
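The chain product can be sketched as a fold over the per-joint matrices. The example assumes a planar arm whose joints are all revolute, with each link transform of the "rotate, then extend along the link" form; the three angles and lengths are made-up test values and the function names are illustrative:

```python
import math
from functools import reduce

def link_transform(theta, length):
    """3x3 transform for one planar revolute joint followed by a link of the given length."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, length * c],
            [s,  c, length * s],
            [0.0, 0.0, 1.0]]

def matmul(A, B):
    """Multiply two 3x3 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def forward_kinematics(thetas, lengths):
    """Return T_{0,1} * T_{1,2} * ... * T_{n-1,n}, multiplied left to right."""
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    transforms = [link_transform(t, L) for t, L in zip(thetas, lengths)]
    return reduce(matmul, transforms, identity)

thetas = [math.radians(a) for a in (30, -45, 60)]
lengths = [1.0, 0.8, 0.5]
T_0n = forward_kinematics(thetas, lengths)

x, y = T_0n[0][2], T_0n[1][2]
print(round(x, 4), round(y, 4))
```

Adding a joint only appends one more matrix to the list; the rest of the computation is unchanged.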

2-Link Planar Robot Arm (Robotics Application)

Consider a 2-link planar arm with joint angles $\theta_1$, $\theta_2$ and link lengths $L_1$, $L_2$.

Frame 0 → Frame 1 (rotate by $\theta_1$, extend by $L_1$):

$$T_{0,1} = \begin{bmatrix} \cos\theta_1 & -\sin\theta_1 & L_1\cos\theta_1 \\ \sin\theta_1 & \cos\theta_1 & L_1\sin\theta_1 \\ 0 & 0 & 1 \end{bmatrix}$$

Frame 1 → Frame 2 (rotate by $\theta_2$, extend by $L_2$):

$$T_{1,2} = \begin{bmatrix} \cos\theta_2 & -\sin\theta_2 & L_2\cos\theta_2 \\ \sin\theta_2 & \cos\theta_2 & L_2\sin\theta_2 \\ 0 & 0 & 1 \end{bmatrix}$$

The end-effector position from $T_{0,2} = T_{0,1} \cdot T_{1,2}$:

$$x = L_1\cos\theta_1 + L_2\cos(\theta_1 + \theta_2)$$

$$y = L_1\sin\theta_1 + L_2\sin(\theta_1 + \theta_2)$$

This is forward kinematics — computing where the end-effector is from joint angles. Try it in the interactive demo below!
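For readers who want the intermediate step, here is how the translation column of the product comes out. It is the rotation block of $T_{0,1}$ applied to the translation of $T_{1,2}$, plus the translation of $T_{0,1}$, simplified with the angle-sum identities:

$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{bmatrix} \begin{bmatrix} L_2\cos\theta_2 \\ L_2\sin\theta_2 \end{bmatrix} + \begin{bmatrix} L_1\cos\theta_1 \\ L_1\sin\theta_1 \end{bmatrix} = \begin{bmatrix} L_1\cos\theta_1 + L_2(\cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2) \\ L_1\sin\theta_1 + L_2(\sin\theta_1\cos\theta_2 + \cos\theta_1\sin\theta_2) \end{bmatrix} = \begin{bmatrix} L_1\cos\theta_1 + L_2\cos(\theta_1 + \theta_2) \\ L_1\sin\theta_1 + L_2\sin(\theta_1 + \theta_2) \end{bmatrix}$$

The rotation block of the product is $R(\theta_1) \cdot R(\theta_2) = R(\theta_1 + \theta_2)$, so the end-effector orientation is simply $\theta_1 + \theta_2$.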

Denavit-Hartenberg (DH) Convention

The 2-link example above manually constructed each transform. For a 6-DOF industrial arm, we need a systematic method. The Denavit-Hartenberg convention provides exactly this — a standard way to describe any serial robot with just 4 numbers per joint.

The Four DH Parameters

Each joint i in a serial manipulator is fully described by:

| Parameter | Symbol | Meaning |
| --- | --- | --- |
| Joint angle | $\theta_i$ | Rotation about the z-axis (variable for revolute joints) |
| Link offset | $d_i$ | Translation along the z-axis (variable for prismatic joints) |
| Link length | $a_i$ | Translation along the x-axis |
| Link twist | $\alpha_i$ | Rotation about the x-axis |

The DH Transformation Matrix

These four parameters combine into a single 4×4 homogeneous transform:

$$T_i = \begin{bmatrix} \cos\theta_i & -\sin\theta_i\cos\alpha_i & \sin\theta_i\sin\alpha_i & a_i\cos\theta_i \\ \sin\theta_i & \cos\theta_i\cos\alpha_i & -\cos\theta_i\sin\alpha_i & a_i\sin\theta_i \\ 0 & \sin\alpha_i & \cos\alpha_i & d_i \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

The full forward kinematics is then: $T_{0,n} = T_1 \cdot T_2 \cdot \ldots \cdot T_n$

The same arm from above can be described with a DH parameter table:

| Joint | $\theta$ | $d$ | $a$ | $\alpha$ |
| --- | --- | --- | --- | --- |
| 1 | $\theta_1$ | 0 | $L_1$ | 0 |
| 2 | $\theta_2$ | 0 | $L_2$ | 0 |

Since all joints are revolute and planar, only $\theta$ varies and the twist $\alpha = 0$. Plugging into the DH formula with $\alpha = 0$ and $d = 0$ gives 4×4 matrices whose x-y rotation block and translation column match the 3×3 transforms we derived manually; the z row and column simply pass z through unchanged. The DH convention just provides a systematic way to arrive at them.
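To make that comparison concrete, here is a sketch that implements the standard DH matrix shown above and checks it against the hand-built planar transform. The function name `dh_transform` and its argument order are assumptions made for this example, and the values of $L_1$ and $\theta_1$ are arbitrary:

```python
import math

def dh_transform(a, alpha, d, theta):
    """Standard DH matrix for one joint, written out entry by entry."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [[ct, -st * ca,  st * sa, a * ct],
            [st,  ct * ca, -ct * sa, a * st],
            [0.0, sa,       ca,      d],
            [0.0, 0.0,      0.0,     1.0]]

L1, theta1 = 1.5, math.radians(45)
T_dh = dh_transform(L1, 0.0, 0.0, theta1)

# Hand-built planar T_{0,1} from the 2-link example.
c, s = math.cos(theta1), math.sin(theta1)
T_hand = [[c, -s, L1 * c],
          [s,  c, L1 * s],
          [0.0, 0.0, 1.0]]

# Drop the z row and column of the 4x4 (index 2) to get its planar 3x3 part.
planar = [[T_dh[i][j] for j in (0, 1, 3)] for i in (0, 1, 3)]
max_diff = max(abs(planar[i][j] - T_hand[i][j]) for i in range(3) for j in range(3))
print(max_diff)
```

With $\alpha = 0$ and $d = 0$ the difference is zero up to round-off, which is the sense in which the DH table reproduces the manual derivation.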

DH in Code

The math library provides dhTransform(a, alpha, d, theta), which implements the DH matrix above. For the 2-link arm: dhTransform(L1, 0, 0, theta1) produces the 4×4 counterpart of the $T_{0,1}$ we built by hand.

Note: There are two DH conventions — Standard (used here, from Denavit & Hartenberg) and Modified (from Craig’s textbook), which places coordinate frames differently. Always check which convention your library uses.

Common DH Pitfalls

  • Parameter order varies between textbooks: some use ($\theta$, $d$, $a$, $\alpha$), others use ($a$, $\alpha$, $d$, $\theta$). Always check.
  • Standard vs. Modified DH: These produce different transforms for the same robot. Mixing them is a frequent source of bugs.
  • Frame assignment: Choosing where to place each coordinate frame has rules that can be subtle for complex robots. This is covered in detail in Module 5.
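The Standard vs. Modified pitfall is easy to demonstrate. The sketch below writes out both matrices, with the standard form as Rot_z(θ)·Trans_z(d)·Trans_x(a)·Rot_x(α) and the modified form as Rot_x(α)·Trans_x(a)·Rot_z(θ)·Trans_z(d), and feeds them the same four numbers. The function names are illustrative. In the modified convention the $a$ and $\alpha$ values belong to the previous link, which is why reusing a parameter table across conventions gives a different result:

```python
import math

def dh_standard(a, alpha, d, theta):
    """Standard DH: Rot_z(theta) * Trans_z(d) * Trans_x(a) * Rot_x(alpha)."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [[ct, -st * ca,  st * sa, a * ct],
            [st,  ct * ca, -ct * sa, a * st],
            [0.0, sa,       ca,      d],
            [0.0, 0.0,      0.0,     1.0]]

def dh_modified(a_prev, alpha_prev, d, theta):
    """Modified DH: Rot_x(alpha_prev) * Trans_x(a_prev) * Rot_z(theta) * Trans_z(d)."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha_prev), math.sin(alpha_prev)
    return [[ct,      -st,      0.0,  a_prev],
            [st * ca,  ct * ca, -sa, -sa * d],
            [st * sa,  ct * sa,  ca,  ca * d],
            [0.0,      0.0,      0.0, 1.0]]

# Same four numbers, two conventions.
params = dict(d=0.0, theta=math.radians(90))
pos_standard = [row[3] for row in dh_standard(1.0, 0.0, **params)][:3]
pos_modified = [row[3] for row in dh_modified(1.0, 0.0, **params)][:3]

print([round(v, 6) for v in pos_standard])  # [0.0, 1.0, 0.0]
print([round(v, 6) for v in pos_modified])  # [1.0, 0.0, 0.0]
```

The standard form translates along the rotated x-axis, while the modified form translates along x before rotating, so the origins land in different places.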

Inverse Transforms

The inverse $T^{-1}$ “undoes” a transformation. If $T$ converts from frame A to frame B, then $T^{-1}$ converts from B back to A.

Application: A robot’s camera sees an object in its local frame. To find the object’s position in the world frame, apply the camera’s transformation. To go the other way (world → camera), apply the inverse.
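A planar sketch of the camera example, assuming a made-up camera pose (1 unit right, 1 unit up, rotated 90°) and a made-up object position in the camera frame; the names `T_world_camera`, `se2`, `se2_inverse`, and `apply` are illustrative:

```python
import math

def se2(theta, tx, ty):
    """3x3 homogeneous matrix for a rotation by theta and a translation (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]

def se2_inverse(T):
    """Inverse of a rigid 2D transform via R^T and -R^T t."""
    (r00, r01, tx), (r10, r11, ty) = T[0], T[1]
    return [[r00, r10, -(r00 * tx + r10 * ty)],
            [r01, r11, -(r01 * tx + r11 * ty)],
            [0.0, 0.0, 1.0]]

def apply(T, p):
    """Transform the 2D point p by T."""
    return (T[0][0] * p[0] + T[0][1] * p[1] + T[0][2],
            T[1][0] * p[0] + T[1][1] * p[1] + T[1][2])

# Hypothetical camera pose expressed in the world frame.
T_world_camera = se2(math.radians(90), 1.0, 1.0)

object_in_camera = (2.0, 0.0)                                          # 2 units ahead of the camera
object_in_world = apply(T_world_camera, object_in_camera)              # camera -> world
back_in_camera = apply(se2_inverse(T_world_camera), object_in_world)   # world -> camera

print([round(v, 6) for v in object_in_world])  # [1.0, 3.0]
print([round(v, 6) for v in back_in_camera])   # [2.0, 0.0]
```

The round trip returns the original camera-frame coordinates, which is what "undoing" the transformation means in practice.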

Interactive Visualization

Explore a 2-link planar robot arm. Adjust joint angles and link lengths to see how the homogeneous transformation matrices chain together. Each joint’s coordinate frame is shown, and the matrices update in real-time.

Controls
θ₁: −π to π
θ₂: −π to π
L₁: 1.500 (range 0.5 to 2.5)
L₂: 1.200 (range 0.5 to 2)

T₀₁ (Base → Joint 1)

[ 0.707  -0.707   1.061 ]
[ 0.707   0.707   1.061 ]
[ 0.000   0.000   1.000 ]

T₁₂ (Joint 1 → End)

[ 0.500  -0.866   0.600 ]
[ 0.866   0.500   1.039 ]
[ 0.000   0.000   1.000 ]

T₀₂ = T₀₁ · T₁₂

[ -0.259  -0.966   0.750 ]
[  0.966  -0.259   2.220 ]
[  0.000   0.000   1.000 ]

End-Effector Pose

Position: (0.750, 2.220)
Orientation: 105.0°

Key insight: The end-effector pose T₀₂ is computed by multiplying T₀₁ · T₁₂. This "chaining" of homogeneous transforms is the foundation of forward kinematics — computing where the end-effector is from joint angles.
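The numbers displayed above can be reproduced offline. This sketch assumes joint angles of 45° and 60°, which are not stated on the page but are inferred from the displayed matrix entries, together with the link lengths 1.5 and 1.2; the helper names are illustrative:

```python
import math

def link_transform(theta, length):
    """3x3 transform for one planar revolute joint followed by a link of the given length."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, length * c],
            [s,  c, length * s],
            [0.0, 0.0, 1.0]]

def matmul(A, B):
    """Multiply two 3x3 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

theta1, theta2 = math.radians(45), math.radians(60)  # assumed, inferred from the matrices
L1, L2 = 1.5, 1.2

T01 = link_transform(theta1, L1)
T12 = link_transform(theta2, L2)
T02 = matmul(T01, T12)

# Position is the translation column; orientation comes from the first column of R.
x, y = T02[0][2], T02[1][2]
orientation_deg = math.degrees(math.atan2(T02[1][0], T02[0][0]))

print(f"Position: ({x:.3f}, {y:.3f})")         # Position: (0.750, 2.220)
print(f"Orientation: {orientation_deg:.1f}°")  # Orientation: 105.0°
```

Reading the orientation with `atan2` on the first column of the rotation block recovers $\theta_1 + \theta_2$ without tracking the individual joint angles.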

Practice Problems

  1. Compute the 3×3 homogeneous transform for a rotation by 90° and translation $(2, 3)$.

  2. Given $T_1$ = rotate 45° with translation $(1, 0)$ and $T_2$ = rotate −45° with translation $(0, 1)$, where does the origin $(0, 0)$ end up after applying $T_2 \cdot T_1$?

  3. For a 2-link robot with $L_1 = L_2 = 1$, what joint angles $(\theta_1, \theta_2)$ reach the point $(1, 1)$?

Answers
  1. $T = \begin{bmatrix} 0 & -1 & 2 \\ 1 & 0 & 3 \\ 0 & 0 & 1 \end{bmatrix}$

  2. First compute $T_1 \cdot (0, 0, 1)^T$: the origin maps to $(1, 0)$ (the translation part of $T_1$). Then compute $T_2 \cdot (1, 0, 1)^T$: $T_2$ rotates by −45° and translates by $(0, 1)$. The result is $(1 \cdot \cos(-45°) + 0,\ 1 \cdot \sin(-45°) + 1) = (\frac{\sqrt{2}}{2}, 1 - \frac{\sqrt{2}}{2}) \approx (0.707, 0.293)$.

  3. The end-effector must satisfy $x = \cos\theta_1 + \cos(\theta_1 + \theta_2) = 1$ and $y = \sin\theta_1 + \sin(\theta_1 + \theta_2) = 1$. One solution: $\theta_1 = 90°$, $\theta_2 = -90°$ gives $(0 + 1, 1 + 0) = (1, 1)$. ✓ Another solution: $\theta_1 = 0°$, $\theta_2 = 90°$ gives $(1 + 0, 0 + 1) = (1, 1)$. ✓
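Answers 2 and 3 can be double-checked numerically. This is a sketch with illustrative helper names, using the closed-form 2-link position formula from the lesson for problem 3:

```python
import math

def se2(theta, tx, ty):
    """3x3 homogeneous matrix for a rotation by theta and a translation (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]

def matmul(A, B):
    """Multiply two 3x3 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def two_link_position(theta1_deg, theta2_deg, L1=1.0, L2=1.0):
    """End-effector position of the 2-link planar arm, angles given in degrees."""
    t1 = math.radians(theta1_deg)
    t12 = math.radians(theta1_deg + theta2_deg)
    return (L1 * math.cos(t1) + L2 * math.cos(t12),
            L1 * math.sin(t1) + L2 * math.sin(t12))

# Problem 2: image of the origin under T2 * T1 is the translation column.
T1 = se2(math.radians(45), 1.0, 0.0)
T2 = se2(math.radians(-45), 0.0, 1.0)
T_total = matmul(T2, T1)
origin_image = (T_total[0][2], T_total[1][2])

# Problem 3: both proposed joint-angle pairs should reach (1, 1).
sol_a = two_link_position(90, -90)
sol_b = two_link_position(0, 90)

print([round(v, 3) for v in origin_image])  # [0.707, 0.293]
print([round(v, 6) for v in sol_a], [round(v, 6) for v in sol_b])
```

Both joint-angle pairs land on (1, 1); they correspond to the two mirror-image ways of bending the elbow.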

Key Takeaways

  1. Homogeneous coordinates unify rotation and translation into a single matrix multiplication
  2. In 2D: 3×3 matrices in SE(2); in 3D: 4×4 matrices in SE(3)
  3. Chaining transforms = matrix multiplication: $T_{0,n} = T_{0,1} \cdot T_{1,2} \cdot \ldots \cdot T_{n-1,n}$
  4. Efficient inverse: $T^{-1}$ uses $R^T$ and $-R^T\mathbf{t}$ (no general matrix inversion needed)
  5. This is the mathematical foundation of forward kinematics: computing the end-effector pose from the joint angles

Next Steps

You now have the mathematical tools for robot kinematics! In Module 4, we’ll explore eigenvalues and eigenvectors, coordinate frames in more depth, and alternative rotation representations like quaternions that avoid gimbal lock.