Geometric Distribution in SystemVerilog
Posted on Fri 09 October 2020 in Logic
Introduction
Despite the fact that SystemVerilog is a "kitchen sink of everything" programming language (the language reference manual is over 1300 pages long), for some reason it does not define an implementation of the geometric probability distribution. In this article, we will correct that oversight. We will explain what the geometric distribution is and why it is useful for verification and validation. Finally, we will develop and test a high-performance implementation that uses only a bit of floating-point math and some predefined system tasks.
Testbench Transaction Scheduling
Figure 1 shows a rudimentary testbench setup for a logic design. For simplicity, we are not concerned with verification methodology (ie VMM, OVM, UVM, etc), or any other testbench implementation details. The testbench drives stimulus to the design under test (DUT) inputs, the DUT performs some useful operation, and the DUT drives outputs to the testbench. Assume this is a synchronous design with a single clock. Furthermore, assume the "control" signals require a handshake from transmitter to receiver, similar to a ready/valid protocol. That is, the testbench response side can backpressure the DUT, and the DUT can backpressure the testbench stimulus side.
The conventional name for a single transfer of data between the testbench and DUT is a "transaction". There are two important aspects to the transaction:
- what data is being transferred
- when does the transfer happen (what clock cycle for a synchronous design)
SystemVerilog provides a large set of tools and techniques for generating constrained-random transaction data. For example, if the DUT in Figure 1 is an adder, you can represent the stimulus using constrained random variables:
class Data;
  rand bit [31:0] op_a;
  rand bit [31:0] op_b;
  rand bit        carry_in;
endclass : Data
But how do we model transaction timing in SystemVerilog? It is easy to write logic that just sends new data whenever the interface is ready, but what is the correct way to think about inserting idle (or "delay") cycles? There are always corner cases hidden in tricky backpressure scenarios. What tools are available for modeling random stimulus and response schedules?
The naive approach is to write code like the following:
for (...) begin
  int delay;
  delay = $urandom_range(9, 1);
  repeat (delay) wait_one_clock_cycle();
  transmit(data);
end
Ignoring any additional delay from backpressure, this will produce a uniformly distributed delay schedule. That is, delays will range from 1 to 9 with equal probability, which will produce an average delay of \(\frac{1+9}{2} = 5\).
There are many clever ways to generate a random amount of time between transactions. Rather than looking at techniques for implementing a smorgasbord of special cases, we will instead turn to queueing theory. The most common stochastic queueing models assume that interarrival and service times obey the exponential distribution. This is a continuous distribution, which means that the time between events is modeled as a real (ie floating point) number. Since our DUT and testbench are synchronous, we can simplify the representation of time between events to a positive integer number of clock cycles. Therefore, we can use the discrete analogue of the exponential: the geometric distribution.
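To see that connection concretely, here is a small Python sketch (purely illustrative, not testbench code): rounding an exponentially distributed interarrival time up to the next whole clock cycle yields a geometrically distributed integer.

```python
import math
import random

random.seed(1)

# Illustrative sketch: discretizing exponential interarrival times by
# rounding up to the next whole cycle yields a geometric distribution
# with the matching per-cycle success probability.
p = 0.25                        # desired per-cycle "success" probability
rate = -math.log(1.0 - p)       # exponential rate chosen so ceil() matches p

samples = [math.ceil(random.expovariate(rate)) for _ in range(100_000)]
mean = sum(samples) / len(samples)
print(round(mean, 2))           # should land close to 1/p = 4
```

The rate parameter is chosen so that \(P(\lceil X \rceil = n) = (1-p)^{n-1} - (1-p)^n = p \cdot (1-p)^{n-1}\), which is exactly the geometric distribution developed below.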
The Geometric Distribution
Suppose you are doing an experiment that consists of a number of independent trials, each of which can be classified as either a "success" or a "failure". This type of trial is called a Bernoulli Trial. We will use the variable \(n\) to represent the number of independent trials, and \(p\) to represent the probability of a "success". The geometric distribution is a mathematical formula that tells you the probability \(P\) that the first "success" occurs on trial \(n\), given probability of success \(p\) on each trial. We apologize in advance for using both \(P\) and \(p\) in the same formula to mean different things, but it is standard terminology.
Formulas
The formula for the geometric distribution is:

\[P_n = p \cdot (1-p)^{n-1}\]

The mean (or "expected value") \(\mu\) is:

\[\mu = \frac{1}{p}\]

The variance \(\sigma^2\) is:

\[\sigma^2 = \frac{1-p}{p^2}\]

And the sum of all possible outcomes must equal \(1\) (ie 100% probability):

\[\sum_{n=1}^{\infty} p \cdot (1-p)^{n-1} = 1\]
A full explanation of these formulas can be found in any good textbook on probability and statistics, or Wikipedia. For our purposes, we will take them at face value.
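Taking them at face value does not prevent a quick numerical sanity check. The following Python snippet (illustrative only) sums the first couple thousand terms of the distribution for \(p = 0.25\), where the truncation error is negligible:

```python
# Numerical sanity check of the geometric distribution formulas:
# partial sums of the pmf, the mean, and the variance should converge
# to 1, 1/p, and (1-p)/p^2 respectively.
p = 0.25
terms = [p * (1 - p) ** (n - 1) for n in range(1, 2000)]

total = sum(terms)                                  # should approach 1
mean = sum(n * t for n, t in enumerate(terms, 1))   # should approach 1/p = 4
var = sum(n * n * t for n, t in enumerate(terms, 1)) - mean ** 2
print(total, mean, var)                             # var should approach 12
```

For \(p = 0.25\) the closed forms give \(\mu = 4\) and \(\sigma^2 = \frac{0.75}{0.0625} = 12\), matching the partial sums.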
Benefits
Taking a step back to connect the dots, why is the geometric distribution good for modeling the interarrival delay of transactions?
Figure 2 shows a chart of the geometric distributions of three values of \(p\): 0.25, 0.5, and 0.75.
For convenience, the following table lists more data points.
\(n\) | \(p = 0.10\) | \(p = 0.25\) | \(p = 0.5\) | \(p = 0.75\) |
---|---|---|---|---|
1 | 0.1000 | 0.2500 | 0.500000 | 0.75000000 |
2 | 0.0900 | 0.1875 | 0.250000 | 0.18750000 |
3 | 0.0810 | 0.1406 | 0.125000 | 0.04687500 |
4 | 0.0729 | 0.1055 | 0.062500 | 0.01171875 |
5 | 0.0656 | 0.0791 | 0.031250 | 0.00292969 |
6 | 0.0590 | 0.0593 | 0.015625 | 0.00073242 |
7 | 0.0531 | 0.0445 | 0.007813 | 0.00018311 |
8 | 0.0478 | 0.0334 | 0.003906 | 0.00004578 |
9 | 0.0430 | 0.0250 | 0.001953 | 0.00001144 |
10 | 0.0387 | 0.0188 | 0.000977 | 0.00000286 |
11 | 0.0349 | 0.0141 | 0.000488 | 0.00000072 |
12 | 0.0314 | 0.0106 | 0.000244 | 0.00000018 |
13 | 0.0282 | 0.0079 | 0.000122 | 0.00000004 |
14 | 0.0254 | 0.0059 | 0.000061 | 0.00000001 |
15 | 0.0229 | 0.0045 | 0.000031 | 0.00000000 |
Choosing \(p = 0.25\) yields an average interarrival time of \(\mu = \frac{1}{0.25} = 4\). It will create a schedule that is randomly distributed using an exponentially decaying amount of delay that averages 4 clock cycles (posedge to posedge) between transactions. This corresponds to an average duty cycle of 25% (one active cycle, three idle). Most (about 94%) of all transactions will have a delay between 1 and 10 cycles. However, you will still observe a delay of 15 with probability 0.45%. Due to the memoryless (Markovian) property of the geometric distribution, samples are independent of one another.
One other subtle benefit of using the geometric distribution is the ability to schedule non-integer average delay values. If you take the \(p = 0.75\) example from the table, this corresponds to an average duty cycle of 75%. This means on average you will see three transactions every four cycles. The average interarrival time is \(\mu = \frac{4}{3} \approx 1.33\) cycles. This type of delay pattern is difficult to achieve using uniformly distributed delay values.
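For the skeptical reader, the duty-cycle interpretation is easy to check with a short simulation. This Python sketch (hypothetical, outside any simulator) marks each cycle "active" with probability \(p\) and measures the gap between consecutive active cycles:

```python
import random

random.seed(7)

# Illustrative check of the duty-cycle interpretation: with p = 0.75,
# each cycle is independently "active" with probability p, so the gap
# between consecutive active cycles is geometric with mean 1/p = 4/3.
p = 0.75
gaps, gap = [], 0
for _ in range(1_000_000):
    gap += 1
    if random.random() < p:     # this cycle carries a transaction
        gaps.append(gap)
        gap = 0

mean_gap = sum(gaps) / len(gaps)
print(round(mean_gap, 3))       # should land close to 4/3 ~ 1.333
```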
Implementation
Enough storytelling. Show me the code!
One approach to implementing a geometric distribution in SystemVerilog would be to write a DPI layer to an existing C library, such as the GNU Scientific Library. See the GSL documentation for implementations of several random number distributions.
A significant drawback of this approach is that using DPI can make your implementation difficult to port between environments. Not all simulators support DPI equally, and not all EDA-friendly operating systems (eg RedHat, SuSE) will have GSL installed by default. GSL is a dependency, dependencies add complexity, and complexity leads to additional costs.
Fortunately for us, there is a straightforward technique for simulating the geometric distribution, \(P_n = p \cdot (1-p)^{n-1}\). First, generate a number \(U\) uniformly distributed in (0, 1). Then calculate:

\[n = 1 + \left\lfloor \frac{\ln(U)}{\ln(1-p)} \right\rfloor\]
The proof is easy enough for a non-mathematician to follow, but for the sake of brevity we will not reproduce it here. If you are curious about the details, consult Section 11.4 of Introduction to Probability Models.
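Before porting the technique to SystemVerilog, it is easy to prototype. The Python sketch below (illustrative; `geometric` is our own helper, not a library function) applies the inverse-transform calculation described above:

```python
import math
import random

random.seed(42)

# Illustrative prototype of the inverse-transform technique; 'geometric'
# is our own helper function, not part of any library.
def geometric(p: float) -> int:
    if p == 1.0:
        return 1                # every trial succeeds immediately
    u = random.random()
    while u == 0.0:             # guard against ln(0)
        u = random.random()
    # ln(u) and ln(1-p) are both negative, so the ratio is positive
    # and int() truncation behaves as a floor.
    return 1 + int(math.log(u) / math.log(1.0 - p))

samples = [geometric(0.2) for _ in range(200_000)]
mean = sum(samples) / len(samples)
print(round(mean, 2))           # should land close to 1/0.2 = 5
```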
Uniform Random Floating Point Number
Unfortunately, SystemVerilog does not provide a builtin function or system task that returns a random floating point number in (0, 1). So we will need to write one.
While researching how to implement this function, I discovered that since floating point numbers are weird, the topic of uniformly distributed floating point numbers is very deep, and full of magic and mystery. A particularly interesting read on that subject is Daniel Lemire's blog post titled How many floating-point numbers are in the interval [0,1]?. Rather than diving down a rabbit hole, we will use the "good enough" approach of taking a uniformly distributed unsigned integer with \(N\) bits in \((0, 2^N)\) and dividing it by \(2^N\). As you will see later, this technique produces better than adequate results.
// Uniform returns a random number uniformly distributed in (0, 1).
// It uses 48 bits of entropy, which should be more than enough.
function real Uniform();
  bit [47:0] rbits;
  // Get some random bits; throw away the rare '0' output.
  do begin
`ifdef XILINX_SIMULATOR
    // The Xilinx Vivado Simulator v2020.1.1 $urandom system function
    // produces skewed results in bytes 0 and 3 of the output.
    // As a workaround, only use bytes 1 and 2.
    // See the Xilinx Simulation and Verification Forum for more info:
    // https://forums.xilinx.com/t5/Simulation-and-Verification/Vivado-v2020-1-1-simulator-urandom-function-is-not-uniformly/m-p/1155801
    rbits[47:32] = ($urandom() >> 8);
    rbits[31:16] = ($urandom() >> 8);
    rbits[15: 0] = ($urandom() >> 8);
`else
    // Assume other simulators have balanced $urandom implementations.
    // Your mileage may vary.
    rbits[47:32] = $urandom();
    rbits[31: 0] = $urandom();
`endif
  end while (rbits == 0);
  // (0, 2^48) / 2^48
  return $itor(rbits) / 281474976710656.0;
endfunction : Uniform
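As a cross-check of the integer-division approach, the same construction can be sketched outside the simulator. This Python snippet (purely illustrative) draws a nonzero 48-bit integer and divides by \(2^{48}\):

```python
import random

random.seed(3)

# Illustrative cross-check of the "good enough" uniform construction:
# draw a nonzero 48-bit integer and divide by 2**48, producing a
# float strictly inside (0, 1).
def uniform48() -> float:
    r = 0
    while r == 0:               # discard the rare all-zeros draw
        r = random.getrandbits(48)
    return r / 2 ** 48

xs = [uniform48() for _ in range(100_000)]
mean = sum(xs) / len(xs)
print(round(mean, 3))           # should land close to 0.5
```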
Warning
It is important to test whether your simulator's implementation of $urandom is uniformly distributed. A non-uniform implementation can lead to highly skewed results. If $urandom causes problems for you, there are several other ways of producing uniformly distributed random integers on an interval.
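What might such a test look like? One rough approach (a smoke test, not a rigorous statistical test) is to bucket many draws and confirm that no bucket deviates wildly from its expected count. The sketch below runs against Python's own generator for illustration; the same idea applies to $urandom output dumped from a simulation:

```python
import random

random.seed(11)

# Rough uniformity smoke test: bucket the low byte of many draws and
# report the worst relative deviation from the expected per-bucket count.
N, BUCKETS = 1_000_000, 256
counts = [0] * BUCKETS
for _ in range(N):
    counts[random.getrandbits(32) & 0xFF] += 1

expected = N / BUCKETS
worst = max(abs(c - expected) / expected for c in counts)
print(round(worst, 4))          # a few percent is normal; 20%+ is suspect
```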
Geometric Function
Now that we have an implementation of \(U\), a floating point number uniformly distributed in (0, 1), we can implement the rest:
// Geometric returns a random sample from a geometric distribution with
// probability of success 'p' in (0, 1].
function int Geometric(real p);
  real u;
  assert (p > 0 && p <= 1);
  if (p == 1.0) begin
    return 1;
  end
  u = Uniform();
  return 1 + $rtoi($ln(u) / $ln(1.0 - p));
endfunction : Geometric
Testing
Putting all the pieces together, let's test out the implementation by taking one million samples from the Geometric function, using \(p = 0.20\).
module Top();

  function real Uniform();
    // Replace with implementation
  endfunction : Uniform

  function int Geometric(real p);
    // Replace with implementation
  endfunction : Geometric

  // Note: tune these or replace with $value$plusargs command-line inputs
  parameter real p = 0.20;
  parameter int N = 1_000_000;

  initial begin : main
    int histogram [int];
    longint sum_x, sum_xx;
    real expected_mean, expected_variance;
    real mean, variance;

    for (int i = 0; i < N; i++) begin
      automatic int x = Geometric(p);
      histogram[x] += 1;
      sum_x += x;
      sum_xx += x * x;
    end

    $display("[==== Histogram ====]");
    for (int x = 1; ; x++) begin
      automatic real expected = p * (1-p)**(x-1) * N;
      $display("x = %4d: Exp %10.2f, Got %8d", x, expected, histogram[x]);
      if (expected < 1.0) break;
    end
    $display();

    $display("[==== Statistics ====]");
    expected_mean = (1.0 / p);
    mean = $itor(sum_x) / N;
    expected_variance = (1.0 - p) / p**2;
    variance = ($itor(sum_xx) - $itor(sum_x**2) / N) / (N-1);
    $display("Mean    : Exp %8.2f, Got %8.2f", expected_mean, mean);
    $display("Variance: Exp %8.2f, Got %8.2f", expected_variance, variance);
  end : main

endmodule : Top
Using the simulator of your choice, you should be able to compile and simulate this code to produce an output similar to:
[==== Histogram ====]
x = 1: Exp 200000.00, Got 200261
x = 2: Exp 160000.00, Got 159731
x = 3: Exp 128000.00, Got 127969
x = 4: Exp 102400.00, Got 102460
x = 5: Exp 81920.00, Got 81786
x = 6: Exp 65536.00, Got 65772
x = 7: Exp 52428.80, Got 52204
x = 8: Exp 41943.04, Got 41806
x = 9: Exp 33554.43, Got 33595
x = 10: Exp 26843.55, Got 26788
x = 11: Exp 21474.84, Got 21884
x = 12: Exp 17179.87, Got 17024
x = 13: Exp 13743.90, Got 13489
x = 14: Exp 10995.12, Got 11108
x = 15: Exp 8796.09, Got 8907
x = 16: Exp 7036.87, Got 7131
x = 17: Exp 5629.50, Got 5709
x = 18: Exp 4503.60, Got 4518
x = 19: Exp 3602.88, Got 3732
x = 20: Exp 2882.30, Got 2766
x = 21: Exp 2305.84, Got 2233
x = 22: Exp 1844.67, Got 1756
x = 23: Exp 1475.74, Got 1508
x = 24: Exp 1180.59, Got 1203
x = 25: Exp 944.47, Got 913
x = 26: Exp 755.58, Got 759
x = 27: Exp 604.46, Got 611
x = 28: Exp 483.57, Got 485
x = 29: Exp 386.86, Got 378
x = 30: Exp 309.49, Got 306
x = 31: Exp 247.59, Got 235
x = 32: Exp 198.07, Got 197
x = 33: Exp 158.46, Got 133
x = 34: Exp 126.77, Got 130
x = 35: Exp 101.41, Got 92
x = 36: Exp 81.13, Got 69
x = 37: Exp 64.90, Got 68
x = 38: Exp 51.92, Got 57
x = 39: Exp 41.54, Got 48
x = 40: Exp 33.23, Got 35
x = 41: Exp 26.58, Got 32
x = 42: Exp 21.27, Got 21
x = 43: Exp 17.01, Got 23
x = 44: Exp 13.61, Got 19
x = 45: Exp 10.89, Got 11
x = 46: Exp 8.71, Got 6
x = 47: Exp 6.97, Got 5
x = 48: Exp 5.58, Got 2
x = 49: Exp 4.46, Got 7
x = 50: Exp 3.57, Got 5
x = 51: Exp 2.85, Got 4
x = 52: Exp 2.28, Got 3
x = 53: Exp 1.83, Got 1
x = 54: Exp 1.46, Got 1
x = 55: Exp 1.17, Got 0
x = 56: Exp 0.94, Got 1

[==== Statistics ====]
Mean : Exp 5.00, Got 5.00
Variance: Exp 20.00, Got 19.98
That's not too bad 😃. Different seeds will produce marginally different results, but we were unable to observe any wildly unlikely outputs during testing. As long as you take the time to validate the behavior of $urandom, this should be good enough for production usage.
Conclusion
Modeling transaction interarrival time is an underdeveloped corner of the SystemVerilog language and its associated verification methodologies. However, this facility is critical for exercising design flow control. In this article, we have shown how to build a high-quality, dependency-free implementation of the geometric probability distribution. This distribution can be used to model any testbench component that requires random wait or delay cycles, such as interarrival or service times.
We recommend you use this implementation as part of a library of plug-and-play random distributions. For example, you might need to swap in a vanilla deterministic model, or you might have a special need for something more exotic such as the Erlang distribution, or perhaps the well-known Normal distribution (ie the Bell Curve). There is no "correct" model for all needs, but we hope this contribution helps engineers improve their methodology and inspires them to take a more systematic approach to testbench transaction scheduling.