What is a Vector? Translating the Real World into Code
The Theory
AI/ML Concept: Feature Representation
Computers do not natively understand real-world concepts like a "house," an "image," or a "song." To feed data into a Machine Learning model, we must translate these entities into vectors. This critical first step is called feature representation.
- Tabular Data: If we are building a model to predict house prices, we might define a house using three features: number of bedrooms, number of bathrooms, and age in years. A 3-bedroom, 2-bathroom house built 15 years ago becomes a data point in 3D space: $[3, 2, 15]$.
- Image Data: A grayscale image is flattened into a vector where each element is the brightness of a single pixel, so a 28×28 image becomes a vector of 784 numbers.
- Text Data: Words are mapped to high-dimensional vectors (often 300 or more dimensions) whose values encode semantic meaning, so words with similar meanings end up close together.
Every object an AI ever interacts with is converted into this numerical format so the underlying mathematical engine can process it.
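As a rough sketch of the tabular and image cases above, the snippet below turns a house and a tiny grayscale image into plain Python lists. The helper names `house_to_vector` and `flatten_image` and the 2×3 pixel values are illustrative assumptions, not part of any library.

```python
def house_to_vector(bedrooms: int, bathrooms: int, age_years: int) -> list[float]:
    """Encode a house as [bedrooms, bathrooms, age] (hypothetical helper)."""
    return [float(bedrooms), float(bathrooms), float(age_years)]


def flatten_image(pixels: list[list[int]]) -> list[int]:
    """Flatten a 2D grid of brightness values, row by row, into one vector."""
    return [value for row in pixels for value in row]


# The 3-bedroom, 2-bathroom, 15-year-old house from the text.
house = house_to_vector(3, 2, 15)

# A made-up 2x3 grayscale image: 0 is black, 255 is white.
image = [
    [0, 128, 255],
    [64, 192, 32],
]
image_vector = flatten_image(image)

print(house)              # [3.0, 2.0, 15.0]
print(image_vector)       # [0, 128, 255, 64, 192, 32]
print(len(image_vector))  # 6, i.e. 2 rows * 3 columns
```

The order of features is a design choice, but it has to stay the same for every house so that position 0 always means bedrooms, position 1 bathrooms, and so on.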
The Math
What is a Vector?
In mathematics, a vector is an ordered list of numbers that represents a point or a direction in a coordinate space. The number of elements in the vector determines its dimension ($n$).
- A 2-dimensional vector represents a point on a flat plane (x, y axes): $\mathbf{v} = [x, y]$
- A 3-dimensional vector represents a point in physical space (x, y, z axes): $\mathbf{v} = [x, y, z]$
While humans struggle to visualize anything beyond 3 dimensions, the mathematical rules remain exactly the same for an $n$-dimensional space, allowing us to represent high-dimensional vectors algebraically: $\mathbf{v} = [v_1, v_2, \dots, v_n]$
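To make the notation concrete, a vector's dimension is simply the number of components it holds. The sketch below uses plain Python lists; the `dimension` helper and the sample values are assumptions chosen for illustration.

```python
def dimension(vector: list[float]) -> int:
    """Return n, the number of components in the vector."""
    return len(vector)


point_2d = [1.0, 2.0]        # a point on a plane: [x, y]
point_3d = [1.0, 2.0, 3.0]   # a point in space: [x, y, z]
high_dim = [0.0] * 300       # placeholder values standing in for a 300-dimensional vector

for name, vec in [("2D", point_2d), ("3D", point_3d), ("300D", high_dim)]:
    print(name, "has dimension", dimension(vec))
```

Nothing about the code changes between 2 and 300 dimensions, which mirrors the point that the algebra is the same for any $n$.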
The Code
class Vector:
    """An n-dimensional vector stored as a list of numbers."""

    def __init__(self, attributes: list[float]):
        self.attributes = attributes

    def __sub__(self, other: "Vector") -> "Vector":
        if isinstance(other, Vector):
            # Subtraction is only defined for vectors of the same dimension.
            if len(self.attributes) != len(other.attributes):
                raise ValueError("Vectors must have the same dimension for subtraction.")
            # Subtract element by element and wrap the result in a new Vector.
            return Vector([s - o for s, o in zip(self.attributes, other.attributes)])
        else:
            raise TypeError("Unsupported operand type for -: 'Vector' and '{}'".format(type(other).__name__))

    def __repr__(self) -> str:
        attributes_str = ", ".join("{:.2f}".format(a) for a in self.attributes)
        return "Vector({})".format(attributes_str)


# Example Usage: Simulating two houses with 4 features (bedrooms, bathrooms, age, sqft)
house_A = Vector([3, 2, 15, 2000])
house_B = Vector([4, 3, 10, 2500])

# Computes the difference in features: Vector(-1.00, -1.00, 5.00, -500.00)
difference = house_A - house_B
print(f"Feature Difference: {difference}")

Code Breakdown
- class Vector: We define a custom class to represent our mathematical vector, bypassing NumPy initially to understand the raw mechanics.
- def __init__(self, attributes): Initializes the vector with an arbitrary sequence of numbers. Unlike a hardcoded 3D vector (x, y, z), this allows for $n$-dimensional feature representation.
- if isinstance(other, Vector): Guards against invalid operations. We can only subtract a Vector from another Vector.
- if len(self.attributes) != len(other.attributes): Dimensionality check. Mathematically, vector addition and subtraction are only defined for vectors in the same dimensional space.
- zip(self.attributes, other.attributes): Pairs corresponding elements from both vectors together.
- [s - o for s, o in ...]: List comprehension. Iterates through the zipped pairs, performs the element-wise subtraction $s_i - o_i$, and builds the resulting list in a single pass.
- return Vector(...): Wraps the resulting list back into a new Vector object, so the result can itself be used in further vector operations.
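The two guards and the chaining behavior described in the breakdown can be exercised with a short sketch. It repeats a compact copy of the Vector class so that it runs on its own; `house_c` and its values are made up for illustration.

```python
class Vector:
    """Compact copy of the Vector class above so this snippet is self-contained."""

    def __init__(self, attributes: list[float]):
        self.attributes = attributes

    def __sub__(self, other: "Vector") -> "Vector":
        if not isinstance(other, Vector):
            raise TypeError(
                "Unsupported operand type for -: 'Vector' and '{}'".format(type(other).__name__)
            )
        if len(self.attributes) != len(other.attributes):
            raise ValueError("Vectors must have the same dimension for subtraction.")
        return Vector([s - o for s, o in zip(self.attributes, other.attributes)])

    def __repr__(self) -> str:
        return "Vector({})".format(", ".join("{:.2f}".format(a) for a in self.attributes))


house_a = Vector([3, 2, 15, 2000])
house_b = Vector([4, 3, 10, 2500])
house_c = Vector([1, 1, 5, 500])  # hypothetical third house

# Chaining works because each subtraction returns a new Vector.
chained = house_a - house_b - house_c
print(chained)  # Vector(-2.00, -2.00, 0.00, -1000.00)

# Dimensionality check: a 4D vector minus a 3D vector is rejected.
try:
    house_a - Vector([3, 2, 15])
except ValueError as err:
    dimension_error = str(err)
    print("ValueError:", dimension_error)

# Type guard: a plain list is not a Vector, so subtraction is rejected.
try:
    house_a - [4, 3, 10, 2500]
except TypeError as err:
    type_error = str(err)
    print("TypeError:", type_error)
```

Without the length check, `zip` would silently stop at the shorter vector and return a result with too few elements, which is why the explicit ValueError is worth having.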