The Geometry of Probability: The Sigmoid Function
🧠 The Theory
AI/ML Concept: Numerical Stability in Sigmoid and Robust Logistic Computation
🧪 Experimentation: Triggering the Overflow
When the math is translated into software, the finite range of floating-point hardware constrains how the function can be implemented.
The Vulnerability:
Passing a large negative input (e.g., $z = -1000$) into the sigmoid function requires the CPU to calculate $e^{-z} = e^{1000}$. This number is astronomically large and exceeds the largest value a 64-bit float can represent (about $1.8 \times 10^{308}$), so NumPy returns inf for the exponential and emits a RuntimeWarning: overflow encountered in exp.
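The failure is easy to reproduce. The sketch below uses a hypothetical `naive_sigmoid` helper (not part of this lesson's script) with no clipping, and records any warnings NumPy raises under its default error settings.

```python
import warnings

import numpy as np


def naive_sigmoid(z):
    """Unprotected sigmoid, 1 / (1 + e^(-z)), with no clipping (illustrative)."""
    return 1.0 / (1.0 + np.exp(-z))


z = np.array([-1000.0, 0.0, 1000.0])

# Record warnings instead of printing them, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    out = naive_sigmoid(z)

overflowed = any("overflow" in str(w.message) for w in caught)

print("largest float64:", np.finfo(np.float64).max)
print("output:", out)
print("overflow warning raised:", overflowed)
```

Under NumPy's default settings the final value for the large negative input still comes out as 0.0, because 1 / (1 + inf) evaluates to 0. The warning flags that an intermediate value became inf, which can turn into NaN if it feeds into further arithmetic.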
The Engineering Fix:
Before the array reaches the exponential function, it must pass through a filter. Using np.clip(z, -250, 250) limits the largest and smallest values the exponent will ever process. Because $e^{-250}$ is already infinitesimally close to $0$, capping the input prevents floating-point overflow without meaningfully changing the probability output.
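A quick check of this claim, as a sketch: `clipped_sigmoid` is an illustrative name that mirrors the np.clip approach, and the call runs with overflow promoted to an error so that any overflow would stop the script.

```python
import numpy as np


def clipped_sigmoid(z, limit=250.0):
    """Sigmoid with inputs clipped to [-limit, limit] before exponentiation."""
    z_clipped = np.clip(z, -limit, limit)
    return 1.0 / (1.0 + np.exp(-z_clipped))


z = np.array([-1000.0, -250.0, 0.0, 250.0, 1000.0])

# Promote overflow from a warning to an exception: if clipping failed to
# protect np.exp, this block would raise FloatingPointError.
with np.errstate(over="raise"):
    p = clipped_sigmoid(z)

print(p)

# In float64, adding e^(-250) to 1 does not change the 1 at all.
print(1.0 + np.exp(-250.0) == 1.0)
```

Inputs beyond the clip limits give the same output as the limits themselves, and the output at z = 250 is already exactly 1.0 in 64-bit floating point, so nothing representable is lost at the upper end.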
Connection: The First Neuron
Where is this used?
The sigmoid function is the core operating mechanism of Logistic Regression. It is used in production systems to predict binary outcomes: e.g., Fraud/Not Fraud, Malignant/Benign, or System Failure/System Healthy.
Why does this matter?
A standard linear equation ($z = \mathbf{w} \cdot \mathbf{x} + b$) wrapped inside a squashing function ($\sigma(z)$) is the mathematical definition of a sigmoid-activated artificial neuron. Deep learning networks are constructed by stacking thousands of these computationally simple, logistic-regression-like units into interconnected layers. Mastering this function is mastering a basic building block of neural network architecture.
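As a minimal sketch of that idea, assuming the linear step has the form z = w · x + b: the `neuron` helper, the weights, and the bias below are hypothetical values chosen only for illustration.

```python
import numpy as np


def sigmoid(z):
    """Squash a raw score into (0, 1); inputs are clipped before np.exp."""
    return 1.0 / (1.0 + np.exp(-np.clip(z, -250, 250)))


def neuron(x, w, b):
    """One artificial neuron: a linear step followed by a sigmoid activation."""
    z = np.dot(w, x) + b
    return sigmoid(z)


# Hypothetical weights and bias, not taken from any trained model.
w = np.array([0.5, -0.25])
b = 0.0

x_boundary = np.array([2.0, 4.0])  # z = 0.5*2 - 0.25*4 + 0 = 0
x_positive = np.array([4.0, 0.0])  # z = 0.5*4 - 0.25*0 + 0 = 2

p_boundary = neuron(x_boundary, w, b)
p_positive = neuron(x_positive, w, b)
print(p_boundary, p_positive)
```

The first input lands exactly on the decision boundary (probability 0.5), while the second produces a positive score and therefore a probability above 0.5.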
The Math
Math: The Sigmoid Function
Linear regression relies on the equation $z = \mathbf{w} \cdot \mathbf{x} + b$. When applied to probability, this linear dot product fails because its output ranges from negative to positive infinity, violating the foundational rule that probabilities must lie between $0$ and $1$.
The sigmoid function ($\sigma$) squashes any real number into the open interval from $0$ to $1$:
$$\sigma(z) = \frac{1}{1 + e^{-z}}$$
Mathematical Limits:
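- As $z \to +\infty$, $e^{-z} \to 0$. The equation resolves to $\frac{1}{1 + 0} = 1$.
- As $z \to -\infty$, $e^{-z} \to \infty$. The equation resolves to $\frac{1}{1 + \infty} \to 0$.
- When $z = 0$, $e^{-z} = 1$. The equation resolves to $\frac{1}{1 + 1} = 0.5$ (the decision boundary).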
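These three limits can be checked numerically. The sketch below uses only the standard library and a scalar `sigmoid` written for this check, with moderate inputs so that no overflow protection is needed.

```python
import math


def sigmoid(z: float) -> float:
    """Scalar sigmoid, 1 / (1 + e^(-z)), for moderate values of z."""
    return 1.0 / (1.0 + math.exp(-z))


# Walk from a strongly negative score to a strongly positive one.
for z in (-30.0, -5.0, 0.0, 5.0, 30.0):
    print(f"z = {z:6.1f}  ->  sigma(z) = {sigmoid(z):.6f}")
```

The printed values climb from near 0, pass through 0.5 at z = 0, and approach 1, matching the limits listed above.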
The Code
import numpy as np


def sigmoid(z: np.ndarray) -> np.ndarray:
    # Clip first so np.exp never receives a value large enough to overflow float64.
    z_clipped = np.clip(z, -250, 250)
    # Results are rounded to two decimal places.
    return np.round(1 / (1 + np.exp(-z_clipped)), 2)


# Test 1: The Bounds
z_normal = np.array([-10, 0, 10])
print("Normal Bounds:", sigmoid(z_normal))

# Test 2: Break Things (The Overflow)
z_extreme = np.array([-1000, 1000])
print("Extreme Bounds:", sigmoid(z_extreme))

Code Breakdown
This script defines the sigmoid function used to map raw linear outputs to bounded probabilities. It includes an explicit range guard (np.clip) that prevents NumPy float overflow in the exponential for extreme inputs, and it rounds the result to two decimal places.
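Clipping is one way to guard the exponential. A commonly used alternative, sketched here as an assumption rather than as part of this lesson's script, branches on the sign of z so that np.exp only ever receives a non-positive argument. The name `stable_sigmoid` is illustrative.

```python
import numpy as np


def stable_sigmoid(z: np.ndarray) -> np.ndarray:
    """Sign-split sigmoid: np.exp is only called on non-positive values."""
    z = np.asarray(z, dtype=np.float64)
    out = np.empty_like(z)
    pos = z >= 0

    # For z >= 0, exp(-z) is at most 1, so 1 / (1 + exp(-z)) is safe.
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))

    # For z < 0, use the equivalent form exp(z) / (1 + exp(z)); exp(z) < 1.
    exp_z = np.exp(z[~pos])
    out[~pos] = exp_z / (1.0 + exp_z)
    return out


z = np.array([-1000.0, -10.0, 0.0, 10.0, 1000.0])

# Overflow is promoted to an exception to show that none occurs.
with np.errstate(over="raise"):
    probs = stable_sigmoid(z)

print(probs)
```

This version needs no clip threshold and returns unrounded probabilities. The trade-off is slightly more code than a single np.clip call.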