Machine Learning · 2024

NFL Win Probability Neural Network

A neural network built from scratch using NumPy to predict the probability that the home team wins an NFL game, given a snapshot of the game state. Implemented without PyTorch, TensorFlow, or autograd.


Overview

This project demonstrates a deep understanding of neural network fundamentals by implementing forward propagation, backpropagation, and gradient descent entirely from scratch. The model predicts NFL game outcomes based on real-time game state features.

What I Learned

  • Implementing a neural network from scratch without high-level frameworks
  • Understanding forward and backward propagation in detail
  • Manual gradient computation using the chain rule
  • Implementing gradient descent optimization
  • Preventing data leakage in machine learning datasets
  • Building a complete ML pipeline from data preprocessing to model evaluation
  • Validating backpropagation with numerical gradient checking (see the sketch right after this list)
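
To make that last point concrete, here is a minimal, self-contained sketch of numerical gradient checking with central differences. The toy logistic-regression loss, data shapes, and tolerance below are illustrative assumptions rather than the project's actual model; the real check compares every parameter of the full network against its hand-derived gradient in the same way.

import numpy as np

def numerical_grad(f, w, eps=1e-5):
    # Central-difference estimate of df/dw, one coordinate at a time.
    grad = np.zeros_like(w)
    for i in range(w.size):
        w_plus, w_minus = w.copy(), w.copy()
        w_plus.flat[i] += eps
        w_minus.flat[i] -= eps
        grad.flat[i] = (f(w_plus) - f(w_minus)) / (2 * eps)
    return grad

# Toy logistic-regression problem standing in for the full network.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))
y = rng.integers(0, 2, size=(32, 1)).astype(float)

def loss(w):
    yhat = 1.0 / (1.0 + np.exp(-(X @ w)))
    return -np.mean(y * np.log(yhat) + (1 - y) * np.log(1 - yhat))

w = rng.normal(size=(4, 1))
yhat = 1.0 / (1.0 + np.exp(-(X @ w)))
analytic = X.T @ (yhat - y) / X.shape[0]   # hand-derived chain-rule gradient
numeric = numerical_grad(loss, w)

# A relative error around 1e-7 or smaller indicates the analytic gradient is correct.
rel_err = np.linalg.norm(analytic - numeric) / (np.linalg.norm(analytic) + np.linalg.norm(numeric))
print(rel_err)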

Problem Definition

Task:

Predict the probability that the home team wins the game

Input:

A snapshot of the game at a single moment in time

Output:

A probability in the range [0, 1]

Label:

home_win = 1 if the home team eventually wins the game, otherwise 0

Dataset

The dataset is derived from NFL play-by-play data sourced from nflfastR via nflreadr. Each row is a single game-state snapshot from which the model makes a prediction; a sketch of the feature-building step follows the table.

Feature               Description
score_diff            home_score minus away_score
seconds_remaining     Seconds remaining in the game
quarter               Game quarter (1 to 4)
down                  Down (1 to 4)
yards_to_go           Yards needed for a first down
yardline_100          Yards to the opponent's end zone
possession_is_home    1 if the home team has possession, else 0
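
As a hedged sketch of the preprocessing, the snippet below assembles that feature table with pandas, assuming the play-by-play data has already been loaded into a DataFrame with the standard nflfastR column names (total_home_score, total_away_score, game_seconds_remaining, qtr, down, ydstogo, yardline_100, posteam, home_team, result). Those names, and the filtering choices, are assumptions to verify against the actual export, not the project's exact code.

import pandas as pd

def build_snapshots(pbp: pd.DataFrame) -> pd.DataFrame:
    # Drop rows without a defined down, clock, or field position (kickoffs, timeouts, etc.).
    pbp = pbp.dropna(subset=["down", "game_seconds_remaining", "yardline_100"])
    return pd.DataFrame({
        "score_diff": pbp["total_home_score"] - pbp["total_away_score"],
        "seconds_remaining": pbp["game_seconds_remaining"],
        "quarter": pbp["qtr"],
        "down": pbp["down"],
        "yards_to_go": pbp["ydstogo"],
        "yardline_100": pbp["yardline_100"],
        "possession_is_home": (pbp["posteam"] == pbp["home_team"]).astype(int),
        # Label: in nflfastR, result is the final home score minus the final away score.
        "home_win": (pbp["result"] > 0).astype(int),
    })

Because every play from a game shares the same label, one way to avoid leakage is to split train and test sets by game rather than by row.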

Neural Network Architecture

This project uses a neural network with a single hidden layer and the following architecture:

X (N, D)
  → z1 = X @ W1 + b1
  → h = tanh(z1)
  → z2 = h @ W2 + b2
  → yhat = sigmoid(z2)
  → loss = binary_cross_entropy(yhat, y)
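
A minimal NumPy sketch of that forward pass; the hidden width, initialization scale, and helper names are illustrative assumptions rather than the project's exact values.

import numpy as np

def init_params(D, H=16, seed=0):
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, size=(D, H)),
        "b1": np.zeros(H),
        "W2": rng.normal(0.0, 0.1, size=(H, 1)),
        "b2": np.zeros(1),
    }

def forward(X, p):
    z1 = X @ p["W1"] + p["b1"]           # (N, H) pre-activations
    h = np.tanh(z1)                      # (N, H) hidden activations
    z2 = h @ p["W2"] + p["b2"]           # (N, 1) logits
    yhat = 1.0 / (1.0 + np.exp(-z2))     # (N, 1) home win probabilities
    return z1, h, z2, yhat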

Why tanh and sigmoid?

  • tanh introduces nonlinearity and has a simple derivative
  • sigmoid maps logits to probabilities
  • binary cross entropy pairs naturally with sigmoid
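
The loss itself is a few lines; here is a sketch that clips the probabilities away from 0 and 1 so the logarithms stay finite (the clipping constant is an illustrative choice):

import numpy as np

def binary_cross_entropy(yhat, y, eps=1e-12):
    # Clip so log never sees an exact 0 or 1 when the sigmoid saturates.
    yhat = np.clip(yhat, eps, 1.0 - eps)
    return -np.mean(y * np.log(yhat) + (1.0 - y) * np.log(1.0 - yhat))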

Backpropagation

Backpropagation is implemented manually using the chain rule. Every gradient is computed explicitly without relying on automatic differentiation libraries.

Key identity used:

dL/dz2 = (yhat - y) / N
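
This follows from pairing the sigmoid with the mean binary cross-entropy; a short derivation in the same notation (yhat = sigmoid(z2), loss averaged over N examples):

L = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i\log\hat{y}_i + (1-y_i)\log(1-\hat{y}_i)\right]

\frac{\partial L}{\partial \hat{y}_i} = -\frac{1}{N}\left(\frac{y_i}{\hat{y}_i} - \frac{1-y_i}{1-\hat{y}_i}\right),
\qquad
\frac{\partial \hat{y}_i}{\partial z_{2,i}} = \hat{y}_i(1-\hat{y}_i)

\frac{\partial L}{\partial z_{2,i}}
= -\frac{1}{N}\Big(y_i(1-\hat{y}_i) - (1-y_i)\hat{y}_i\Big)
= \frac{\hat{y}_i - y_i}{N}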

From there, gradients flow backward to W2 and b2, then through the tanh layer via its derivative (1 - h²), and finally to W1 and b1.
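
A minimal NumPy sketch of those backward steps, reusing the names from the forward-pass sketch above; the batch average is folded into dz2 exactly as in the identity, and the plain gradient-descent update with its learning rate is an illustrative choice.

def backward(X, y, h, yhat, p):
    N = X.shape[0]
    dz2 = (yhat - y) / N                 # (N, 1) sigmoid + BCE identity
    dW2 = h.T @ dz2                      # (H, 1)
    db2 = dz2.sum(axis=0)                # (1,)
    dh = dz2 @ p["W2"].T                 # (N, H)
    dz1 = dh * (1.0 - h ** 2)            # (N, H) tanh derivative is 1 - h^2
    dW1 = X.T @ dz1                      # (D, H)
    db1 = dz1.sum(axis=0)                # (H,)
    return {"W1": dW1, "b1": db1, "W2": dW2, "b2": db2}

def gradient_descent_step(params, grads, lr=0.1):
    # In-place update: theta <- theta - lr * dL/dtheta.
    for name in params:
        params[name] -= lr * grads[name]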

Technical Implementation

Python · NumPy · Neural Networks · Backpropagation · Gradient Descent · Machine Learning