Explaining Bitcoin Mining as a Poisson Distribution

If you ever watched a block chain explorer, such as mempool.space, and waited for the next block to be mined, you would’ve realized that it varies in the time it takes to mine each bitcoin block. However, if you waited long enough and observed many blocks you would realize that the average time for each block to be mined is around 10 minutes.

If we assume that the hash rate is constant the whole time, a new block will be mined once every 10 minutes on average.

The times at which a new bitcoin block is mined could be modeled by a poisson distribution. An important characteristic of a poisson distribution is being memoryless; meaning that whether an event has just occurred or that an event hasn’t occurred in a long time will give us no clue about the likelihood that another event will occur soon.

Because of that, we can always expect the next bitcoin block to be mined in 10 minutes at any given moment. Therefore, if you have already been waiting for 4 minutes now, we still expect the next block to be mined in the next 10 minutes, not 6 minutes, as if you haven’t been waiting at all. This memoryless characteristic also works backwards in time just as well as forwards in time. Therefore it would work the same way I just explained — if you choose a random time, the preceding block will have been mined on average 10 minutes earlier.

Poisson Math

Let’s take a bitcoin miner, Alice, performing a fraction, 0 < p ≤ 1, of the total network hash rate. If we know that the average amount of time it takes for a bitcoin block to be mined is N minutes, then we know that Alice will take an average t = N/p minutes to mine a block.

Now let Tₙ represent a random variable of the time between each of the blocks mined by Alice (waiting time). Due to the pseudo-random properties of the hash function, mining is a Markov process, being memoryless. Which means that one event in the process does not effect any other event and are both statistically identical.

Markdown Monster icon

In other words, if Alice has waited a time s without mining a block (Tₙ > s), the probability that Alice won’t mine the block in the next t minutes, P(Tₙ>t+s | Tₙ>s), is identical to the probability as if she had not waited the time s, P(Tₙ>t).

Due to this property, the time between each block mined, Tₙ, follows an exponential distribution,

Markdown Monster icon

In order to break it down better, let us rewrite the memoryless property as such,

Markdown Monster icon Eq. 1

Now let’s convert P(Tₙ>t) into a function, ƒ(t), representing the probability that Alice won’t mine a bitcoin block in the next t minutes. Eq. 1 could be represented as such,

Markdown Monster icon Eq. 2

In order to understand this relation better, lets take s and t both to be ½ and apply it to the above equation:

Markdown Monster icon

We could see a pattern in the way this works. Let’s try split the 1 into ¼ instead,

Markdown Monster icon

We could now express the pattern as such,

Markdown Monster icon Eq. 3

Now let’s introduce the variable m,

Markdown Monster icon

If we substitute Eq. 3 into the above formula we will get the following,

Markdown Monster icon

By picking the values of m and n, we are able to calculate the value of t > 0 using m/n,

Markdown Monster icon

Since we know the general rule, e^( ln(x) ) = x, we will define α = -ln( f(1) ) — where f(1) is between 0 and 1, and α > 0,

Markdown Monster icon

What we just got is what you call the exponential distribution (as mentioned above) which is the mathematical function that gives the probabilities of occurrence, in our case representing the time between each block.

The variable α represents how fast the exponential will decline with time. A larger value of α means that the probability of still waiting (Tₙ > t) is low at large times and therefore events occur more often.

We could also confirm that our function, f(t), satisfies the memoryless property described in Eq. 2 as such,

Markdown Monster icon

Let’s take the probability, P(Tₙ > t), as we calculated above to be equal to e^(-αt). P(Tₙ > t) can also be interpreted as the probability of Tₙ being at least t.

The ‘At Least One’ Rule. To calculate the probability of an event occurring at least once, it will be the complement of the event never occurring. This means that the probability of the event never occurring and the probability of the event occurring at least once will equal one, or a 100% chance.

Using the ‘At Least One’ rule, we can rewrite this as 1 minus the probability of Tₙ between 0 and t. The reason for 0 is because time can’t be negative. Therefore we could write it as such,

Markdown Monster icon

If we solve this for the value of F(t), we get the probability density function:

Markdown Monster icon

A poisson process is a random point process such that the waiting times between events are exponentially distributed. We could simulate the distribution of waiting times between each bitcoin block mined using Python.

import numpy as np
import matplotlib.pyplot as plt

N = 10000  # number of blocks
t = 10  # average waiting time between mined blocks

rand = np.random.RandomState(42)  # universal random seed
mined_blocks_time = N * t * np.sort(rand.rand(N))

waiting_times = np.diff(mined_blocks_time)

plt.style.use('seaborn')
plt.hist(waiting_times, bins=np.arange(100), density=True, color='black')
plt.axvline(waiting_times.mean(), color='orange', linestyle='dotted')
plt.xlabel('Block waiting times (min)')
plt.ylabel('Probability density')

plt.show()

The orange line in the plot represents the mean waiting time (approx. 10 minutes). As we expected, the plot is shaped similar to an exponential distribution. The simulation of waiting times between bitcoin blocks approximates a poisson process.

The difference between a poisson distribution and an exponential distribution is that the Poisson distribution is a discrete distribution (taking only certain values like integers, ex. 0 to 4) and deals with the number of occurrences in a fixed period of time, while the exponential distribution is a continuous distribution (taking any value within a specified range like infinity) and deals with the time between occurrences of successive events as time flows by continuously.

Now let’s say we want to find the number of blocks mined during an interval of τ minutes. We would be able to do that since the poisson process is memoryless and doesn’t require us to know about anything that happened before the time interval.

Set N to be the number of blocks mined in a poisson process with parameter α and a time interval τ. Let’s find the probability that a k amount of blocks are mined in this time interval, P(N=k).

Let’s start with P(N=0), assuming that no blocks get mined. If no block is mined within interval τ, then it must happen some time after: (τ, ∞).

Now what do we use for the value of τ? We could use 0 or any other value since it is memoryless and whatever happened before time τ won’t matter.

If we take P(N=1), we need to consider all the possible times, x, that a bitcoin block could be mined between 0 and τ. We could write the probability density function for it happening at x as αe^(-αx). Then, since only one block could be mined in the interval [0, τ], there cannot be a second block mined in [x, τ]. Therefore its waiting time it must fall in [τ−x,∞).

Now let’s do P(N=2), considering the first mined block happens at x and second at x+y - where the waiting time for the first block is x and second is y, both occurring between [0,τ]. In this case, the first waiting time must fall in [0,τ−y]. Since only two blocks must be mined in [0,τ], the third mined block, must occur outside. Therefore the third waiting time must fall in [τ−x−y,∞).

If we repeat this again, we would see the pattern clearer,

Markdown Monster icon

What we just got is the probability distribution of mining a k number of bitcoin blocks within a time interval of τ, given bitcoin’s poisson distribution.

Finally, we could also find the expected number of bitcoin blocks that are mined in a time interval of τ with parameter α,

Therefore, N follows a poisson distribution with a mean value of ατ. This result is what we expected since it is knows that the mathematics of bitcoin mining follows a poisson distribution.

We could also see that the mean value of N and α are directly proportional to each other such that a larger value of α suggests that a miner will mine a block more often (more blocks mined within a certain time interval).

https://www.sciencedirect.com/topics/mathematics/poisson-distribution

Thank you for reading

Follow me on Twitter @suhailsaqan!