AI Logbook

Understanding intelligent systems from first principles.

Transposition & Shapes: Aligning the Math


🧠 The Theory

AI/ML Concept: Aligning the Math

Why is transposition so critical in machine learning? It is the ultimate "adapter cable" for matrix multiplication.

Imagine you are calculating the error for a batch of predictions. You have:

  • A predictions vector ŷ shaped (100, 1)
  • An actual truth vector y shaped (100, 1)

If you want to calculate the dot product to find the total error, the math will crash! The inner dimensions, 1 and 100, do not match.

By transposing the first vector to ŷ^T (making it 1 × 100), the math perfectly aligns: (1 × 100) · (100 × 1). The inner dimensions match (100 = 100), and the result is a 1 × 1 matrix (a single scalar number), representing your total error! You will use .T constantly in libraries like PyTorch to massage your data shapes so the neural network layers connect perfectly.
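The shape alignment above can be sanity-checked in a few lines of NumPy (used here as a stand-in for the PyTorch .T mentioned above; the specific values are made up for illustration):

```python
import numpy as np

y_hat = np.full((100, 1), 0.5)  # predictions, shape (100, 1)
y = np.ones((100, 1))           # ground truth, shape (100, 1)

# y_hat @ y would raise an error: inner dimensions 1 and 100 do not match.
# Transposing the first operand aligns them: (1, 100) @ (100, 1) -> (1, 1)
total = y_hat.T @ y
print(total.shape)  # (1, 1)
```

The result is a 1 × 1 matrix wrapping a single scalar, exactly as described above.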

๐Ÿ“The Math

Math: The Matrix Transpose

Sometimes, the matrices we want to multiply don't have matching inner dimensions. To fix this, we use an operation called Transposition.

Transposing a matrix simply means flipping it over its diagonal. The rows become columns, and the columns become rows.

We denote a transposed matrix with a capital "T" superscript (A^T).
If matrix A has a shape of (3, 2):

A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}

Then A^T will have a shape of (2, 3):

A^T = \begin{bmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{bmatrix}
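In element form, transposing is just swapping the two indices:

```latex
(A^T)_{ij} = A_{ji}
```

For example, (A^T)_{12} = A_{21} = 3, which is exactly the index swap the code will implement.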

โš™๏ธThe Code

class Matrix:
    def __init__(self, data: list[list[float]]):
        if data:
            self.__validate(data)
            self.data = data
            self.number_of_rows = len(data)
            self.number_of_cols = len(data[0])            
        else:
            self.data = []
            self.number_of_rows = 0
            self.number_of_cols = 0

    def __validate(self, data: list[list[float]]) -> None:
        """Private method to ensure matrix is a perfect rectangle."""
        number_of_cols = len(data[0])
        for row in data:
            if len(row) != number_of_cols:
                raise ValueError("All rows must have the same number of columns to form a valid matrix.")

    @property
    def shape(self) -> tuple[int, int]:
        """Returns the shape of the matrix as (rows, columns)."""
        return (self.number_of_rows, self.number_of_cols)
    
    def __mul__(self, scalar: float) -> "Matrix":
        """Scalar multiplication: scales every element by the scalar."""
        return Matrix([[element * scalar for element in row] for row in self.data])

    def __add__(self, other: "Matrix") -> "Matrix":
        """Matrix addition: adds elements of identically shaped matrices."""
        if isinstance(other, Matrix):
            if self.shape != other.shape:
                raise ValueError("Matrices must have the same shape for addition")
            return Matrix([
                [a + b for a, b in zip(row1, row2)]
                for row1, row2 in zip(self.data, other.data)
            ])
        else:
            raise TypeError(f"Unsupported operand type for +: 'Matrix' and '{type(other).__name__}'")
        
    def dot_vector(self, vector: list[float]) -> list[float]:
        """Multiplies the matrix by a 1D vector (Batch Dot Product)."""
        if self.number_of_cols != len(vector):
            raise ValueError("The number of columns in the matrix must exactly equal the number of elements in the vector")
        return [sum(a * b for a, b in zip(row, vector)) for row in self.data]
    
    def dot_matrix(self, other: "Matrix") -> "Matrix":
        """Multiplies the matrix by another matrix (Batch Matrix Multiplication)."""
        if self.number_of_cols != other.number_of_rows:
            raise ValueError("The number of columns in the first matrix must equal the number of rows in the second matrix for multiplication")
        
        result = [
            [
                sum(self.data[i][k] * other.data[k][j] for k in range(other.number_of_rows))
                for j in range(other.number_of_cols)
            ]
            for i in range(self.number_of_rows)
        ]
        
        return Matrix(result)

    @property
    def T(self) -> "Matrix":
        """Returns the transpose of the matrix."""
        return Matrix([[self.data[i][j] for i in range(self.number_of_rows)] for j in range(self.number_of_cols)])

    def __repr__(self) -> str:
        """Helper to print the matrix cleanly in the terminal."""
        rows_str = "\n  ".join(str(row) for row in self.data)
        return f"Matrix(\n  {rows_str}\n)"

# --- Example Usage: Transposition and Shapes ---

# Create a 3x2 Design Matrix
X = Matrix([
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0]
])

# Transpose the Matrix
X_T = X.T

print(f"Original Matrix X {X.shape}:")
print(X)

print(f"\nTransposed Matrix X.T {X_T.shape}:")
print(X_T)

# Example: Aligning math for deep learning
W = Matrix([
    [0.5, 0.5, 0.5],
    [0.1, 0.1, 0.1]
]) # Shape (2, 3)

# W * X works: (2, 3) * (3, 2), inner dimensions (3 and 3) match, giving a (2, 2) result.
# X * W also works: (3, 2) * (2, 3), inner dimensions (2 and 2) match, giving a (3, 3) result.
print(f"\nMultiplying X (3, 2) by W (2, 3) works because the inner dimensions match:")
print(X.dot_matrix(W))
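To see the failure mode the comments describe, the same shape rule can be checked in NumPy (an illustrative sketch outside the Matrix class above):

```python
import numpy as np

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # shape (3, 2)

try:
    X @ X  # (3, 2) @ (3, 2): inner dimensions 2 and 3 do not match
except ValueError as e:
    print("Shape mismatch:", e)

# Transposing the second operand fixes it: (3, 2) @ (2, 3) -> (3, 3)
print((X @ X.T).shape)  # (3, 3)
```

This mirrors the ValueError raised by dot_matrix when the inner dimensions disagree.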

Code Breakdown

  • @property def T(self) -> "Matrix": We use the @property decorator to mimic the standard API of NumPy and PyTorch, allowing us to call X.T instead of X.T().
  • for j in range(self.number_of_cols): The outer loop iterates through the original columns. These will become the new rows.
  • for i in range(self.number_of_rows): The inner loop pulls the elements from the original rows.
  • self.data[i][j]: By swapping the index order relative to how we usually read a matrix, we effectively flip the data across its diagonal, completing the transposition.
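For comparison, idiomatic Python can express the same index swap with zip(*rows). This is a standalone sketch, not part of the Matrix class above:

```python
data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # shape (3, 2)

# zip(*data) pairs up the j-th element of every row,
# yielding the columns of the original matrix as tuples.
transposed = [list(col) for col in zip(*data)]

print(transposed)  # [[1.0, 3.0, 5.0], [2.0, 4.0, 6.0]]
```

The explicit double loop in the T property makes the index swap visible; zip(*rows) is the compact version you will often see in production Python.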