
Markov Chain: How Randomness Shapes Our World

Published by Brav

TL;DR

  • I’ll show you how a simple memoryless process explains shuffling cards, neutron reactions, word prediction, and Google rankings.
  • Learn the law of large numbers and why it matters for dependent events.
  • Build your own Markov model with easy steps and a quick Python example.
  • Discover common pitfalls—like how spam can game PageRank or how language models can get stuck in loops.
  • Take action: start experimenting with Markov chains on your favorite data set.

Why this matters

I’ve spent years trying to shuffle a deck of cards so that it feels truly random. One night in a bar, a friend and I kept riffle-shuffling the same deck as neatly as we could, and the cards kept drifting back toward a familiar order. It turns out a perfect riffle shuffle isn’t random at all: eight perfect “out” shuffles restore a 52-card deck exactly. That frustration turned into curiosity: how many ordinary, imperfect shuffles does it take to actually randomize a deck? The answer, seven, was a lesson that rippled into everything I’d later learn about probability.

But shuffling isn’t the only puzzle. When I first saw a list of search results, I wondered why some pages always appeared first. Then I learned that Google’s search engine uses a Markov chain model to rank pages—something that can explain a lot more than a simple keyword count. And when I read about the Manhattan Project, I was struck by the way a single neutron could start a chain reaction that exploded like a bomb. All of these mysteries boil down to the same idea: the next step depends only on where you are now, not on how you got here. That’s the heart of a Markov chain.

Core concepts

A Markov chain is a sequence of random states in which the future depends only on the present. It’s like standing at a crossroads and choosing a path based on where you are right now, without looking back at where you were yesterday or the day before. I learned that this “memoryless” property goes back to Andrey Markov, who proved that even when events depend on each other, the law of large numbers still holds (MathWorld, Markov Chain, n.d.).

The law of large numbers says that if you repeat an experiment many times, the average outcome will settle near the expected value. For example, if I flip a fair coin ten times I might get six heads and four tails. If I flip it 100 times, I expect about 50 heads and 50 tails, and the proportion of heads will drift ever closer to one half as I keep flipping (Wikipedia, Law of large numbers, n.d.). This convergence holds even for dependent events, thanks to Markov’s work (Wikipedia, Markov chain, n.d.).
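Coin flips make this concrete. The sketch below is plain Python; the helper name heads_fraction is mine, and the fixed seed just makes the run repeatable:

```python
import random

def heads_fraction(n_flips, seed=42):
    """Flip a fair coin n_flips times and return the fraction of heads."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# The fraction of heads settles near the expected value 0.5 as n grows.
for n in (10, 100, 10_000):
    print(n, heads_fraction(n))
```

Run it with larger and larger n and watch the fraction hug 0.5 more tightly.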

Markov chains are also the engine behind many models: the way letters follow each other in a book, the way a random surfer jumps from page to page on the web, or the way a neutron moves through a reactor core. Each of these is a chain of states with transition probabilities that we can estimate from data. Markov himself used letter sequences in text to test dependence (Wikipedia, Markov chain, n.d.).
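Markov’s letter-counting experiment is easy to reproduce. This is a minimal sketch; the helper name vowel_transition_probs and the sample sentence are my own, not from any particular corpus:

```python
from collections import Counter

def vowel_transition_probs(text):
    """Estimate P(next letter is a vowel | current letter is a vowel/consonant)
    from adjacent letter pairs in the text."""
    vowels = set("aeiou")
    letters = [c for c in text.lower() if c.isalpha()]
    counts = Counter()
    for a, b in zip(letters, letters[1:]):
        counts[(a in vowels, b in vowels)] += 1
    probs = {}
    for state in (True, False):  # True = currently on a vowel
        total = counts[(state, True)] + counts[(state, False)]
        if total:
            probs[state] = counts[(state, True)] / total
    return probs

print(vowel_transition_probs("the quick brown fox jumps over the lazy dog"))
```

On real prose the two probabilities differ noticeably, which is exactly the dependence Markov was after.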

Google was founded in 1998 and still uses PageRank in its ranking, although many other signals are now involved (Wikipedia, Google, n.d.).

Key tools at a glance (parameter, use case, limitation):

  • Memoryless property (MathWorld, Markov Chain, n.d.) – Use case: simple state prediction, such as the next word or the next shuffle. Limitation: cannot capture long-term dependencies.

  • Random sampling (Wikipedia, Monte Carlo method, n.d.) – Use case: Monte Carlo simulation of complex systems, such as a neutron chain reaction. Limitation: needs many iterations, so the computational cost can be high.

  • Link analysis (Wikipedia, PageRank, n.d.) – Use case: PageRank ranking of web pages. Limitation: sensitive to link spam and requires a damping factor.

How to apply it

  1. Collect data – For a deck of cards, note the order after each shuffle. For text, count how often a vowel follows another vowel. For web pages, count how often a link goes from one site to another.

  2. Build a transition matrix – Rows and columns are indexed by the states; the entry in row i, column j is the probability of moving from state i to state j. For a fair coin, that’s a 2 × 2 matrix with every entry equal to 0.5. For a full deck the states are orderings, and there are 52! of them, far too many to tabulate by hand, but you can simulate the chain with code.

  3. Simulate a random walk – Pick a starting state, then choose the next state according to the transition probabilities, and repeat. That’s a Monte Carlo simulation in a nutshell (Wikipedia, Monte Carlo method, n.d.). Use it to estimate the probability of a neutron chain reaction, or to generate random text.

For instance, the Fat Man bomb used a plutonium core of just over 6 kg and produced an explosion equivalent to roughly 21,000 tons of TNT, a classic example of a runaway neutron chain reaction (Wikipedia, Manhattan Project, n.d.).
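The chain-reaction idea can be explored as a toy branching process, a didactic sketch rather than real reactor physics: here each neutron either triggers a fission that releases two new neutrons or is absorbed, and the function name chain_survives is my own invention.

```python
import random

def chain_survives(mean_offspring, max_generations=50, cap=1000, rng=None):
    """Each neutron causes a fission releasing 2 neutrons with probability
    mean_offspring / 2, otherwise it is absorbed.  Returns True if the chain
    is still alive after max_generations (or has clearly exploded)."""
    rng = rng or random.Random()
    p_fission = mean_offspring / 2  # expected offspring per neutron = 2 * p_fission
    population = 1
    for _ in range(max_generations):
        if population == 0:
            return False
        if population > cap:  # clearly supercritical; stop early
            return True
        population = 2 * sum(rng.random() < p_fission for _ in range(population))
    return population > 0

rng = random.Random(0)
survival = {m: sum(chain_survives(m, rng=rng) for _ in range(200)) / 200
            for m in (0.6, 1.6)}
print(survival)
```

With a mean offspring number below 1 the chain almost always fizzles out; above 1 it has a real chance of exploding, which is the critical-mass intuition in miniature.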

  4. Apply PageRank – Imagine a surfer who, at each step, follows a link with probability 0.85 or jumps to a random page with probability 0.15. The steady-state distribution of this random walk is the PageRank score of each page, and the 0.15 teleport step guarantees that every page stays reachable (Wikipedia, PageRank, n.d.). That’s why a few high-quality links can catapult a page to the top of search results.

  5. Test the model – After you’ve built the transition matrix, run many simulations and compare the average outcome to the expected value. If they line up, you’re using the law of large numbers to your advantage.

Here’s a tiny Python snippet that builds a Markov model for a coin flip and simulates 1000 trials:

import random
# Transition matrix for a fair coin
T = [[0.5, 0.5],  # from heads
     [0.5, 0.5]]  # from tails
state = 0  # start at heads
counts = [0, 0]
for _ in range(1000):
    state = random.choices([0, 1], weights=T[state])[0]
    counts[state] += 1
print('heads:', counts[0], 'tails:', counts[1])
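In the same spirit, the random-surfer idea behind PageRank can be sketched with power iteration over a made-up three-page link graph. The pagerank function below is a toy, not Google’s implementation:

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power iteration on a dict {page: [pages it links to]}."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / n for p in pages}  # teleport share
        for p, outgoing in links.items():
            if outgoing:
                share = damping * rank[p] / len(outgoing)
                for q in outgoing:
                    new[q] += share
            else:  # dangling page: spread its rank evenly
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank

# A links to B and C; B links to C; C links back to A.
ranks = pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"]})
print(ranks)
```

Page C collects links from both A and B, so it ends up with more rank than B even though the graph has only three nodes.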

Pitfalls & edge cases

  • Dependence assumption – The law of large numbers still works for dependent events, but you must estimate the right transition probabilities. If you assume independence when it’s wrong, you’ll get misleading results (Wikipedia, Markov chain, n.d.).

  • Spam in PageRank – If a site builds a web of self-links, the random surfer can get trapped, inflating its PageRank. Google counters this with teleportation: with probability 15% the surfer jumps to a uniformly random page, so no link cluster can hoard all the rank (Wikipedia, PageRank, n.d.).

  • Feedback loops in language models – When training data contains repetitive patterns, a Markov-style model can learn to repeat them forever. Large language models use attention mechanisms to break out of such loops, but the risk remains (Wikipedia, Markov chain, n.d.).

  • Monte Carlo cost – The more accurate you want the simulation, the more samples you need. For complex systems like a nuclear reactor, you might need millions of iterations, which can be expensive.

  • Randomness in shuffling – A single riffle shuffle is far from random. Under the Gilbert–Shannon–Reeds model, about seven riffle shuffles bring a 52-card deck close to full randomness (Wikipedia, Riffle shuffle, n.d.).
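The Gilbert–Shannon–Reeds shuffle itself is simple to simulate: cut the deck at a binomially distributed point, then drop cards from either packet with probability proportional to the packet’s remaining size. A minimal sketch, assuming a deck labeled 0 to 51:

```python
import random

def gsr_riffle(deck, rng=None):
    """One Gilbert-Shannon-Reeds riffle shuffle of a list of cards."""
    rng = rng or random.Random()
    cut = sum(rng.random() < 0.5 for _ in range(len(deck)))  # Binomial(n, 1/2) cut
    left, right = list(deck[:cut]), list(deck[cut:])
    shuffled = []
    while left or right:
        # Drop the next card from a packet with probability
        # proportional to that packet's remaining size.
        if rng.random() < len(left) / (len(left) + len(right)):
            shuffled.append(left.pop(0))
        else:
            shuffled.append(right.pop(0))
    return shuffled

deck = list(range(52))
for _ in range(7):  # seven riffles: close to uniform for practical purposes
    deck = gsr_riffle(deck)
print(deck)
```

Repeat the seven-shuffle loop many times and tally card positions if you want to check empirically how uniform the result looks.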

Quick FAQ

  • What is a Markov process? A stochastic process in which the next state depends only on the current state, not on the past.

  • How does PageRank use Markov chains? It models a random surfer who jumps between pages; the steady-state probabilities are the PageRank scores.

  • Can Markov chains model long-term dependencies? Not directly; they are memoryless. For longer context you need higher-order Markov models or other techniques.

  • What is the law of large numbers? As you repeat an experiment many times, the average outcome approaches the expected value.

  • How many shuffles randomize a deck? Seven riffle shuffles are enough for practical purposes (Wikipedia, Riffle shuffle, n.d.).

  • Is the PageRank algorithm still used by Google? Yes, it remains part of Google’s ranking, combined with many other signals.

Conclusion

Markov chains are a simple idea that explains a huge range of real-world phenomena—from a deck of cards to the first nuclear bomb. By learning how to build a transition matrix and simulate a random walk, you can start experimenting with your own data today. Try predicting the next word in a sentence, ranking a tiny web of pages, or simulating a simple Monte Carlo experiment. Once you grasp the memoryless property, you’ll see it lurking in many places, and you’ll be ready to tackle the next probability puzzle.

Last updated: December 21, 2025
