<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>code-spot &#187; Mathematics</title>
	<atom:link href="http://code-spot.co.za/category/mathematics/feed/" rel="self" type="application/rss+xml" />
	<link>http://code-spot.co.za</link>
	<description>a programming blog</description>
	<lastBuildDate>Sun, 27 Feb 2011 07:18:03 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.4</generator>
		<item>
		<title>Update to Functional Equations Reference (version 1.3)</title>
		<link>http://code-spot.co.za/2010/09/30/update-to-functional-equations-reference-version-1-3/</link>
		<comments>http://code-spot.co.za/2010/09/30/update-to-functional-equations-reference-version-1-3/#comments</comments>
		<pubDate>Thu, 30 Sep 2010 11:27:07 +0000</pubDate>
		<dc:creator>herman.tulleken</dc:creator>
				<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[Snippet]]></category>
		<category><![CDATA[difference equation]]></category>
		<category><![CDATA[discrete calculus]]></category>
		<category><![CDATA[functional equation]]></category>
		<category><![CDATA[product]]></category>
		<category><![CDATA[quotient.]]></category>
		<category><![CDATA[sum]]></category>
		<category><![CDATA[z-transform]]></category>

		<guid isPermaLink="false">http://code-spot.co.za/?p=1073</guid>
		<description><![CDATA[This is a substantial update of this reference document. The most important addition is the chain and substitution rules for arithmetic difference calculus (ADC). Other additions include: more properties of the discrete power function, more properties of ADC operators, definitions of analog functions, and ranges of convergence of (some) z-transforms. I also corrected some errors that [...]


Related posts:<ol><li><a href='http://code-spot.co.za/2010/08/25/update-to-functional-equations-reference/' rel='bookmark' title='Permanent Link: Update to Functional Equations Reference'>Update to Functional Equations Reference</a></li>
<li><a href='http://code-spot.co.za/2009/05/27/update-reference-for-functional-equations/' rel='bookmark' title='Permanent Link: Update: Reference for Functional Equations'>Update: Reference for Functional Equations</a></li>
<li><a href='http://code-spot.co.za/difference-and-functional-equations-reference/' rel='bookmark' title='Permanent Link: Difference and Functional Equations Reference'>Difference and Functional Equations Reference</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p><img class="size-full wp-image-421 alignleft" title="1052727062_0ec2c67ea4_small" src="http://code-spot.co.za/blog/wp-content/uploads/2009/02/1052727062_0ec2c67ea4_small.jpg" alt="" width="142" height="142" />This is a substantial update of this reference document. The most important addition is the chain and substitution rules for arithmetic difference calculus (ADC). Other additions include: more properties of the discrete power function, more properties of ADC operators, definitions of analog functions, and ranges of convergence of (some) z-transforms. I also corrected some errors that were discovered since the last version.</p>
<p>Grab it <a href="http://code-spot.co.za/difference-and-functional-equations-reference/">here</a>.</p>


<p>Related posts:<ol><li><a href='http://code-spot.co.za/2010/08/25/update-to-functional-equations-reference/' rel='bookmark' title='Permanent Link: Update to Functional Equations Reference'>Update to Functional Equations Reference</a></li>
<li><a href='http://code-spot.co.za/2009/05/27/update-reference-for-functional-equations/' rel='bookmark' title='Permanent Link: Update: Reference for Functional Equations'>Update: Reference for Functional Equations</a></li>
<li><a href='http://code-spot.co.za/difference-and-functional-equations-reference/' rel='bookmark' title='Permanent Link: Difference and Functional Equations Reference'>Difference and Functional Equations Reference</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://code-spot.co.za/2010/09/30/update-to-functional-equations-reference-version-1-3/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Update: Reference for Functional Equations</title>
		<link>http://code-spot.co.za/2009/05/27/update-reference-for-functional-equations/</link>
		<comments>http://code-spot.co.za/2009/05/27/update-reference-for-functional-equations/#comments</comments>
		<pubDate>Wed, 27 May 2009 08:11:32 +0000</pubDate>
		<dc:creator>herman.tulleken</dc:creator>
				<category><![CDATA[Downloads]]></category>
		<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[Snippet]]></category>
		<category><![CDATA[binomial transform]]></category>
		<category><![CDATA[functional equation]]></category>
		<category><![CDATA[functional equations]]></category>
		<category><![CDATA[z-transform]]></category>

		<guid isPermaLink="false">http://code-spot.co.za/?p=609</guid>
		<description><![CDATA[In this new version of Reference for Functional Equations I added several more z-transform pairs. I also started to add binomial transform pairs. The definition of the binomial transform is not consistent among different authors. I arbitrarily chose one, and later I changed it. I will probably change it again. Several typos were fixed. I am working on [...]


Related posts:<ol><li><a href='http://code-spot.co.za/2010/08/25/update-to-functional-equations-reference/' rel='bookmark' title='Permanent Link: Update to Functional Equations Reference'>Update to Functional Equations Reference</a></li>
<li><a href='http://code-spot.co.za/difference-and-functional-equations-reference/' rel='bookmark' title='Permanent Link: Difference and Functional Equations Reference'>Difference and Functional Equations Reference</a></li>
<li><a href='http://code-spot.co.za/2010/09/30/update-to-functional-equations-reference-version-1-3/' rel='bookmark' title='Permanent Link: Update to Functional Equations Reference (version 1.3)'>Update to Functional Equations Reference (version 1.3)</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p><img class="size-full wp-image-421 alignleft" title="1052727062_0ec2c67ea4_small" src="http://code-spot.co.za/blog/wp-content/uploads/2009/02/1052727062_0ec2c67ea4_small.jpg" alt="1052727062_0ec2c67ea4_small" width="142" height="142" />In this new version of <a href="http://code-spot.co.za/difference-and-functional-equations-reference/">Reference for Functional Equations</a> I added several more z-transform pairs. I also started to add binomial transform pairs. The definition of the binomial transform is not consistent among different authors. I arbitrarily chose one, and later I changed it. I will probably change it again. Several typos were fixed. I am working on a system to include proofs so that the tables can be checked more easily.</p>


<p>Related posts:<ol><li><a href='http://code-spot.co.za/2010/08/25/update-to-functional-equations-reference/' rel='bookmark' title='Permanent Link: Update to Functional Equations Reference'>Update to Functional Equations Reference</a></li>
<li><a href='http://code-spot.co.za/difference-and-functional-equations-reference/' rel='bookmark' title='Permanent Link: Difference and Functional Equations Reference'>Difference and Functional Equations Reference</a></li>
<li><a href='http://code-spot.co.za/2010/09/30/update-to-functional-equations-reference-version-1-3/' rel='bookmark' title='Permanent Link: Update to Functional Equations Reference (version 1.3)'>Update to Functional Equations Reference (version 1.3)</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://code-spot.co.za/2009/05/27/update-reference-for-functional-equations/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Generating Random Integers With Arbitrary Probabilities</title>
		<link>http://code-spot.co.za/2009/04/28/generating-random-integers-with-arbitrary-probabilities/</link>
		<comments>http://code-spot.co.za/2009/04/28/generating-random-integers-with-arbitrary-probabilities/#comments</comments>
		<pubDate>Tue, 28 Apr 2009 07:12:28 +0000</pubDate>
		<dc:creator>herman.tulleken</dc:creator>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[2D]]></category>
		<category><![CDATA[optimisation]]></category>
		<category><![CDATA[probability]]></category>
		<category><![CDATA[random]]></category>
		<category><![CDATA[random distribution]]></category>
		<category><![CDATA[random integer]]></category>
		<category><![CDATA[random number generation]]></category>

		<guid isPermaLink="false">http://code-spot.co.za/?p=582</guid>
		<description><![CDATA[I finally laid my hands on Donald Knuth’s The Art of Computer Programming (what a wonderful set of books!), and found a neat algorithm for generating random integers 0, 1, 2, … , n – 1, with probabilities p_0, p_1, … , p_(n-1). I have written about generating random numbers (floats) with arbitrary distributions for [...]


Related posts:<ol><li><a href='http://code-spot.co.za/2009/04/15/generating-random-points-from-arbitrary-distributions-for-2d-and-up/' rel='bookmark' title='Permanent Link: Generating Random Points from Arbitrary Distributions for 2D and Up'>Generating Random Points from Arbitrary Distributions for 2D and Up</a></li>
<li><a href='http://code-spot.co.za/2008/09/21/generating-random-numbers-with-arbitrary-distributions/' rel='bookmark' title='Permanent Link: Generating Random Numbers with Arbitrary Distributions'>Generating Random Numbers with Arbitrary Distributions</a></li>
<li><a href='http://code-spot.co.za/2009/04/15/estimating-a-continuous-distribution-from-a-sample-set/' rel='bookmark' title='Permanent Link: Estimating a Continuous Distribution from a Sample Set'>Estimating a Continuous Distribution from a Sample Set</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p><img style="display: inline" title="header" src="http://code-spot.co.za/blog/wp-content/uploads/2009/04/header2.png" alt="header" width="500" height="332" /></p>
<p>I finally laid my hands on Donald Knuth’s <em><a href="http://www.amazon.com/Art-Computer-Programming-Volumes-Boxed/dp/0201485419">The Art of Computer Programming</a></em> (what a wonderful set of books!), and found a neat algorithm for generating random integers 0, 1, 2, … , n – 1, with probabilities p_0, p_1, … , p_(n-1).</p>
<p>I have written about generating random numbers (floats) with arbitrary distributions for <a href="http://code-spot.co.za/2008/09/21/generating-random-numbers-with-arbitrary-distributions/">one dimension</a> and <a href="http://code-spot.co.za/2009/04/15/generating-random-points-from-arbitrary-distributions-for-2d-and-up/">higher dimensions</a>, and indeed that method can be adapted for generating integers with specific probabilities. However, the method described below is much more concise, and efficient (I would guess) for this special case. Moreover, it is also easy to adapt it to generate floats for continuous distributions.</p>
<p><span id="more-582"></span></p>
<h2>Description of the Algorithm</h2>
<p>The basic idea of the algorithm is simple. We have two tables of length n that contain integers (K, L), and a third table that contains probabilities (P). The first table, K, merely contains the integers 0 to n-1 (thus, we need not actually store it explicitly). The other two tables, L and P, are computed before generation (more about that below).</p>
<p>To generate a random number, we generate a random integer (uniformly distributed between 0 and n-1 inclusive), and a random float (uniformly distributed between 0 and 1). The integer tells us which cells in the tables to use. First we look up the probability in P at that index. If the float is smaller than that value, we return the integer in K, otherwise we return the integer in L. (Note, the integer in K is exactly the random integer itself. Therefore, we do not actually have a table K – we simply use the random integer.)</p>
<p>Now this will only give the desired result if the tables have been constructed for this to work. Before looking at how these tables are generated, let us look at a very simple example. Suppose we want to generate 0, 1, 2, with probabilities 3/18, 7/18, 8/18. Can you see why the following tables will work?</p>
<table border="0" cellspacing="0" cellpadding="0" width="500">
<tbody>
<tr>
<td width="125" valign="top"></td>
<td width="125" valign="top">0</td>
<td width="125" valign="top">1</td>
<td width="125" valign="top">2</td>
</tr>
<tr>
<td width="125" valign="top">P</td>
<td width="125" valign="top">1/2</td>
<td width="125" valign="top">1</td>
<td width="125" valign="top">5/6</td>
</tr>
<tr>
<td width="125" valign="top">L</td>
<td width="125" valign="top">2</td>
<td width="125" valign="top">*</td>
<td width="125" valign="top">1</td>
</tr>
</tbody>
</table>
<p>There is only one way to generate 0: if the random integer is 0, and the random float is below 1/2. This will happen with probability 1/3 x 1/2 = 1/6 = 3/18.</p>
<p>There are two ways of generating 1:</p>
<ul>
<li>if the random integer is 1 (probability 1/3 = 6/18); or</li>
<li>if the random integer is 2, and the float is above 5/6 (probability 1/3 x 1/6 = 1/18).</li>
</ul>
<p>Adding these probabilities, we get 6/18 + 1/18 = 7/18.</p>
<p>There are also two ways to generate 2:</p>
<ul>
<li>if the random integer is 0 and the random float is above 1/2 (1/3 x 1/2 = 1/6 = 3/18); or</li>
<li>if the random integer is 2 and the random float is below 5/6 (1/3 x 5/6 = 5/18).</li>
</ul>
<p>Adding these probabilities, we get 3/18 + 5/18 = 8/18.</p>
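<p>The lookup step and the check above can be sketched in Python. This is only an illustration, not the code from the download: the function names <code>sample</code> and <code>exact_distribution</code> are made up here, the tables are copied from the example, and the unused * entry in L is filled with the cell's own index.</p>

```python
import random
from fractions import Fraction

# Tables from the example above (K is implicit: K[k] == k).
P = [Fraction(1, 2), Fraction(1), Fraction(5, 6)]
L = [2, 1, 1]  # L[1] is the "*" entry; it is never used because P[1] == 1


def sample(P, L, rng=random):
    """Return one random integer using the tables P and L."""
    k = rng.randrange(len(P))  # uniform integer in 0 .. n-1
    x = rng.random()           # uniform float in [0, 1)
    return k if x < P[k] else L[k]


def exact_distribution(P, L):
    """Add up, for every cell, the probability of each possible result."""
    n = len(P)
    out = [Fraction(0)] * n
    for k in range(n):
        out[k] += P[k] / n           # cell k chosen, float below P[k]
        out[L[k]] += (1 - P[k]) / n  # cell k chosen, float above P[k]
    return out


print(exact_distribution(P, L))  # the fractions print in lowest terms
print([sample(P, L) for _ in range(10)])
```

<p>The exact check reproduces the sums worked out by hand: 3/18, 7/18 and 8/18.</p>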
<h2>Generating the tables</h2>
<p>Consider this problem:</p>
<ul>
<li>We have n<em> </em>squares that we want to paint.</li>
<li>We have n colours of paint, possibly different amounts of paint for each colour.</li>
<li>Each square has a border in one of the colours; no two borders are the same colour.</li>
<li>In total, there is just enough paint to cover the n squares <em>exactly</em>.</li>
<li>We want to paint each square with at most two colours, with one colour matching the border.</li>
</ul>
<p>To do this, we sort the paint buckets in ascending order. We paint the square that matches the first bucket with the first bucket, and whatever remains with the last bucket. The first bucket is empty (why?), and there might be some paint remaining in the last bucket. We now put this bucket back so that the buckets are sorted according to the new quantities of paint. The first square is completely covered (why?). Note that the painted square’s colour corresponds with the depleted colour.</p>
<p>The situation is now: we have n-1 unpainted squares, and n-1 colours of paint. This is the same problem as the initial problem, with one less square and one less colour. Therefore, we proceed as before, and repeat this process until all the squares have been painted and all the paint has been used.</p>
<p>To answer the two <em>why</em>’s above:</p>
<p>The first bucket is always empty, because the smallest bucket cannot cover more than one square. Suppose it could. Since it is the smallest bucket, all other buckets must contain at least as much paint. Thus, we would have n colours, and enough paint of each colour to paint more than one square. In total, we would have more paint than is required to paint n squares. But we said that we have <em>exactly</em> the right amount of paint needed, not more. Therefore, the smallest amount of paint cannot cover more than one square.</p>
<p>The first square is always covered completely, since the last bucket always contains enough paint for at least one square. If it did not, then, since it is the largest bucket, every other bucket would also cover less than one square. In total, we would have n colours, each of which can cover less than one square. Thus, all our paint would cover less than n squares: we would not have enough paint. But we said that we <em>do</em>, so this cannot be. Therefore, we must have enough paint in the last bucket to cover at least one square.</p>
<p>Below is an illustration of this process for three colours. Here we assume that 1 litre of paint covers 1 square. We have 1/2 = 3/6 litre of red, 7/6 litre of green, and 8/6 litre of cyan.</p>
<h3>Sort Paint</h3>
<p><img style="display: inline" title="step0" src="http://code-spot.co.za/blog/wp-content/uploads/2009/04/step0.png" alt="step0" width="500" height="92" /></p>
<h3>Paint from first bucket</h3>
<p><img style="display: inline" title="step1" src="http://code-spot.co.za/blog/wp-content/uploads/2009/04/step1.png" alt="step1" width="500" height="92" /></p>
<h3>Paint from last bucket</h3>
<p><img style="display: inline" title="step2" src="http://code-spot.co.za/blog/wp-content/uploads/2009/04/step2.png" alt="step2" width="500" height="92" /></p>
<h3>Sort Paint</h3>
<p><img style="display: inline" title="step3" src="http://code-spot.co.za/blog/wp-content/uploads/2009/04/step3.png" alt="step3" width="399" height="92" /></p>
<h3>Paint from first bucket</h3>
<p><img style="display: inline" title="step4" src="http://code-spot.co.za/blog/wp-content/uploads/2009/04/step4.png" alt="step4" width="396" height="92" /></p>
<h3>Paint from last bucket</h3>
<p><img style="display: inline" title="step5" src="http://code-spot.co.za/blog/wp-content/uploads/2009/04/step5.png" alt="step5" width="396" height="110" /></p>
<h3>Sort paint</h3>
<p><img style="display: inline" title="step6" src="http://code-spot.co.za/blog/wp-content/uploads/2009/04/step6.png" alt="step6" width="299" height="110" /></p>
<h3>Paint from first bucket</h3>
<p><img style="display: inline" title="step7" src="http://code-spot.co.za/blog/wp-content/uploads/2009/04/step7.png" alt="step7" width="291" height="110" /></p>
<p>As you can see, the solution above corresponds with the tables given in the example above. Indeed, the algorithm for calculating the tables is exactly the same as the paint algorithm:</p>
<ul>
<li>The n colours correspond to the n integers (from 0 to n-1) that we want to generate.</li>
<li>The initial amounts of paint correspond with the (relative) probabilities with which we want to generate each integer.</li>
<li>The amount of paint used to paint a square of the same border is the entry in table P – the probability of using the number associated with that cell (i.e., “the border”).</li>
<li>The other colour used to paint a square (if any) corresponds to the entry in table L.</li>
</ul>
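<p>The paint procedure translates into a short table builder. The sketch below is not the downloadable implementation; <code>build_tables</code> is a name chosen for this illustration. Instead of re-sorting the buckets after every step it simply picks the smallest and the largest remaining bucket, which makes the same choices as the sorted version but is quadratic in n. Fractions are used so that the results are exact.</p>

```python
from fractions import Fraction


def build_tables(weights):
    """Build the tables P and L from relative probabilities (weights)."""
    n = len(weights)
    weights = [Fraction(w) for w in weights]
    total = sum(weights)
    # Scale so that 1 unit of paint covers 1 square and the total is n.
    paint = {i: w * n / total for i, w in enumerate(weights)}
    P = [Fraction(1)] * n
    L = list(range(n))  # default: a cell points to itself (the "*" case)
    while paint:
        smallest = min(paint, key=paint.get)
        largest = max(paint, key=paint.get)
        amount = paint.pop(smallest)  # the smallest bucket is used up
        P[smallest] = amount
        if largest != smallest:
            L[smallest] = largest
            paint[largest] -= 1 - amount  # the rest of the square
    return P, L


P, L = build_tables([3, 7, 8])
print(P, L)
```

<p>For the weights 3, 7, 8 this gives P = 1/2, 1, 5/6 and L = 2, 1, 1, which matches the tables in the example (the second entry of L is the unused * cell).</p>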
<h2>A Small Optimisation in the Implementation</h2>
<h3>Generating two uniform random numbers for the price of one</h3>
<p>There is a trick to generate a random integer (0 &lt;= n &lt; k) and a random float (0 &lt;= x &lt; 1) from a single random float (0 &lt;= u &lt; 1) that is used in the implementation of this algorithm (see download below).</p>
<p>The trick is:</p>
<ul>
<li>n = floor (uk)</li>
<li>x = uk – n</li>
</ul>
<p>This assumes things about the random generator and the accuracy required. (I do not want to get into the details here).</p>
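<p>As a small sketch of the trick (the name <code>split_uniform</code> is invented for this example, and it is not taken from the download):</p>

```python
import math
import random


def split_uniform(u, k):
    """Split one uniform float u in [0, 1) into an integer n in 0 .. k-1
    and a float x in [0, 1)."""
    scaled = u * k
    n = math.floor(scaled)
    x = scaled - n
    return n, x


print(split_uniform(0.3125, 4))
print(split_uniform(random.random(), 10))
```

<p>For u = 0.3125 and k = 4 we get uk = 1.25, so n = 1 and x = 0.25. Note that x is built from the bits of u that are left over after n has been taken out, so it is coarser than a freshly generated float.</p>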
<h2>Download</h2>
<p>A Python implementation of the above algorithm.</p>
<p><a href="http://www.code-spot.co.za/downloads/python/non_uniform_random_int.py">non_uniform_random_int.py</a></p>


<p>Related posts:<ol><li><a href='http://code-spot.co.za/2009/04/15/generating-random-points-from-arbitrary-distributions-for-2d-and-up/' rel='bookmark' title='Permanent Link: Generating Random Points from Arbitrary Distributions for 2D and Up'>Generating Random Points from Arbitrary Distributions for 2D and Up</a></li>
<li><a href='http://code-spot.co.za/2008/09/21/generating-random-numbers-with-arbitrary-distributions/' rel='bookmark' title='Permanent Link: Generating Random Numbers with Arbitrary Distributions'>Generating Random Numbers with Arbitrary Distributions</a></li>
<li><a href='http://code-spot.co.za/2009/04/15/estimating-a-continuous-distribution-from-a-sample-set/' rel='bookmark' title='Permanent Link: Estimating a Continuous Distribution from a Sample Set'>Estimating a Continuous Distribution from a Sample Set</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://code-spot.co.za/2009/04/28/generating-random-integers-with-arbitrary-probabilities/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Estimating a Continuous Distribution from a Sample Set</title>
		<link>http://code-spot.co.za/2009/04/15/estimating-a-continuous-distribution-from-a-sample-set/</link>
		<comments>http://code-spot.co.za/2009/04/15/estimating-a-continuous-distribution-from-a-sample-set/#comments</comments>
		<pubDate>Wed, 15 Apr 2009 14:27:06 +0000</pubDate>
		<dc:creator>herman.tulleken</dc:creator>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[Tutorial]]></category>
		<category><![CDATA[2D]]></category>
		<category><![CDATA[convolution]]></category>
		<category><![CDATA[distribution estimation]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[random]]></category>
		<category><![CDATA[random distribution]]></category>

		<guid isPermaLink="false">http://code-spot.co.za/?p=445</guid>
		<description><![CDATA[It is sometimes necessary to find the distribution given a sample set from that distribution. If we do not know anything about the distribution, we cannot recover it exactly, so here we look at ways of finding a (discrete) approximation. I will cover the case for 2D sets here, but the ideas are easily extended [...]


Related posts:<ol><li><a href='http://code-spot.co.za/2009/04/15/generating-random-points-from-arbitrary-distributions-for-2d-and-up/' rel='bookmark' title='Permanent Link: Generating Random Points from Arbitrary Distributions for 2D and Up'>Generating Random Points from Arbitrary Distributions for 2D and Up</a></li>
<li><a href='http://code-spot.co.za/2009/04/28/generating-random-integers-with-arbitrary-probabilities/' rel='bookmark' title='Permanent Link: Generating Random Integers With Arbitrary Probabilities'>Generating Random Integers With Arbitrary Probabilities</a></li>
<li><a href='http://code-spot.co.za/2008/09/21/generating-random-numbers-with-arbitrary-distributions/' rel='bookmark' title='Permanent Link: Generating Random Numbers with Arbitrary Distributions'>Generating Random Numbers with Arbitrary Distributions</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p><img style="display: inline" title="header_rand_dist2" src="http://code-spot.co.za/blog/wp-content/uploads/2009/04/header-rand-dist2.png" alt="header_rand_dist2" width="500" height="374" /> It is sometimes necessary to find the distribution given a sample set from that distribution. If we do not know anything about the distribution, we cannot recover it exactly, so here we look at ways of finding a (discrete) approximation.</p>
<p><span id="more-445"></span></p>
<p>I will cover the case for 2D sets here, but the ideas are easily extended to any dimension.</p>
<h2>Visual Inspection From a Scatter Plot</h2>
<p>The easy way to estimate the distribution is to simply look at a scatter plot of the samples.</p>
<table border="0" cellspacing="0" cellpadding="2" width="500">
<tbody>
<tr>
<td width="181" valign="top"><img src="http://code-spot.co.za/blog/wp-content/uploads/2009/03/scatter-pixel.png" alt="scatter_pixel" width="300" height="300" /></td>
<td width="316" valign="top">A scatter plot of a 2D sample set. (A pixel was simply drawn at each point of the sample. The image was slightly blurred to give the pixels more substance.) Here we can already estimate the distribution, as we can intuitively &#8220;see&#8221; the shape.</td>
</tr>
<tr>
<td width="181" valign="top"></td>
<td width="316" valign="top"></td>
</tr>
</tbody>
</table>
<p>If the distribution is simple, we can often “guess” suitable parameters for a mathematical function. But this method is not suitable when more accuracy is required. And obviously this method does not work for higher dimensions.</p>
<p>I always use a visual inspection as a first step when using the other methods described here. First, it allows you to decide which approach to use. Second, it serves as a rough benchmark to test the results of a better method against.</p>
<h2>Averaging Over a Grid</h2>
<p>For this method, we divide the domain into cells, and count the number of points in each cell. To normalise, we divide the counts by the total number of points to get a distribution.</p>
<p>Typically, we choose a cell size large enough so that all cells contain a few points.</p>
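<p>A minimal sketch of the grid method in plain Python follows. The function name <code>grid_estimate</code> and its parameters are assumptions made for this illustration, and all points are assumed to lie inside the given bounds.</p>

```python
def grid_estimate(points, bounds, cells):
    """Count the points in each cell of a regular grid and normalise.

    bounds is ((x0, y0), (x1, y1)); cells is (nx, ny).
    """
    (x0, y0), (x1, y1) = bounds
    nx, ny = cells
    grid = [[0.0] * ny for _ in range(nx)]
    for x, y in points:
        # Map the point to a cell; points on the far border go in the last cell.
        i = min(int((x - x0) / (x1 - x0) * nx), nx - 1)
        j = min(int((y - y0) / (y1 - y0) * ny), ny - 1)
        grid[i][j] += 1
    total = len(points)
    return [[count / total for count in row] for row in grid]


sample_points = [(0.1, 0.1), (0.1, 0.2), (0.9, 0.9), (0.6, 0.2)]
print(grid_estimate(sample_points, ((0, 0), (1, 1)), (2, 2)))
```

<p>With these four points and a 2&#215;2 grid, half of the points fall in the first cell, so its value is 0.5.</p>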
<p>It is possible to use different cell sizes, so that regions of higher density have more cells, for instance, by using a quad tree. However, this approach makes it difficult to interpret the values (they should be scaled down by a factor of the area they represent), and to generate random numbers from this distribution. I can’t really imagine a situation where this approach will be preferred above using a regular grid or convolution.</p>
<h2>Convolution</h2>
<p>There are two ways to implement convolution. The first should be used when the sample set is small relative to the size of the domain; otherwise the second method will be more efficient.</p>
<h3>Sparse Point Convolution</h3>
<p>Generate a very granular empty grid over the domain.</p>
<p>For each point p in the sample set,</p>
<ul>
<li>map p to the grid (calculate the coordinates x,y of p in the grid),</li>
<li>add a number to all the cells in the neighbourhood of p in the grid.</li>
</ul>
<p>Now normalise the grid.</p>
<p>The neighbourhood is typically a circle or a square. The number you add can be the same for all neighbours (in which case any positive number will do), or it can be scaled depending on the distance from p.</p>
<p>The size of the neighbourhood depends on the density of your sample. In general, the denser it is, the smaller the neighbourhood can be. Smaller neighbourhoods lead to faster execution. To prevent “holes” in the approximation, use a radius that corresponds to the maximum of the distances between a point and its closest neighbour.</p>
<table border="0" cellspacing="0" cellpadding="2" width="500">
<tbody>
<tr>
<td width="179" valign="top"><img src="http://code-spot.co.za/blog/wp-content/uploads/2009/03/scatter-square16.png" alt="scatter_square16" width="300" height="301" /></td>
<td width="319" valign="top"><img src="http://code-spot.co.za/blog/wp-content/uploads/2009/03/square.png" alt="square" width="100" height="100" />Estimation with a constant square neighbourhood.</td>
</tr>
<tr>
<td width="179" valign="top"><img src="http://code-spot.co.za/blog/wp-content/uploads/2009/03/scatter-circle16.png" alt="scatter_circle16" width="300" height="300" /></td>
<td width="319" valign="top"><img src="http://code-spot.co.za/blog/wp-content/uploads/2009/03/circle.png" alt="circle" width="100" height="100" /><br />
Estimation with a constant circular neighbourhood.</td>
</tr>
<tr>
<td width="179" valign="top"><img src="http://code-spot.co.za/blog/wp-content/uploads/2009/03/scatter-cone16.png" alt="scatter_cone16" width="300" height="300" /></td>
<td width="319" valign="top"><img src="http://code-spot.co.za/blog/wp-content/uploads/2009/03/cone.png" alt="cone" width="100" height="100" /><br />
Estimation with a circular neighbourhood with a falloff.</td>
</tr>
</tbody>
</table>
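<p>The sparse method might look as follows in Python. This is a sketch under a few assumptions: the points are already given in grid coordinates, the neighbourhood is circular with a linear falloff, the sample set is not empty, and the name <code>sparse_point_estimate</code> is made up for this example (it is not the function from the Python Image Code).</p>

```python
import math


def sparse_point_estimate(points, size, radius):
    """Add a circular neighbourhood with falloff around each sample point."""
    width, height = size
    grid = [[0.0] * height for _ in range(width)]
    for px, py in points:
        cx, cy = int(px), int(py)  # cell that contains the point
        # Only visit cells near the point, clipped to the grid borders.
        for x in range(max(cx - radius, 0), min(cx + radius + 1, width)):
            for y in range(max(cy - radius, 0), min(cy + radius + 1, height)):
                d = math.hypot(x - cx, y - cy)
                if d <= radius:
                    # Weight is 1 at the centre and shrinks with distance.
                    grid[x][y] += 1.0 - d / (radius + 1)
    total = sum(sum(row) for row in grid)
    return [[value / total for value in row] for row in grid]


estimate = sparse_point_estimate([(2.5, 2.5), (3.2, 1.7)], (6, 6), 2)
print(sum(sum(row) for row in estimate))
```

<p>Dropping the falloff term (adding a constant instead) gives the constant circular neighbourhood; dropping the distance test as well gives the square one.</p>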
<h3>Traditional Convolution</h3>
<p>Generate a very granular empty grid over the domain.</p>
<p>For each point p in the sample set, map p to the grid (calculate the coordinates x,y of p in the grid), and add one to that location in the grid.</p>
<p>Now choose a square, symmetrical convolution matrix, and perform a discrete convolution on the grid:</p>
<p>new_grid[x, y] = sum_{i,j} grid[x + i][y + j] * c[i][j]</p>
<p>Here x, y are the coordinates of a cell, and i, j go over the indices of the convolution matrix; cells outside the grid count as zero. (Normally, the convolution is defined as new_grid[x, y] = sum_{i,j} grid[x - i][y - j] * c[i][j]. However, since we are using a symmetrical matrix, the two definitions differ only by a shift of the grid, and we need not perform the extra calculation.)</p>
<p>Now normalise the grid.</p>
<p>The convolution matrix can be a square or circle of 1s, or be filled with numbers that grow smaller outwards. These correspond to the three neighbourhoods described for the sparse convolution.</p>
<p>Note that the new_grid is larger than the original by one less than the size of the convolution matrix in each dimension. The centre of the new grid corresponds with the original grid.</p>
<p>For example, if the original grid was 100&#215;100, and the convolution matrix was 5&#215;5, the new grid will be 104&#215;104. The point (2, 2) in the new grid corresponds to point (0, 0) in the original grid.</p>
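<p>A plain Python sketch of such a full convolution is given below. It computes out[x][y] as the sum of grid[x - i][y - j] * kernel[i][j], written so that every grid cell spreads its value over the output. The name <code>convolve_full</code> is an assumption for this illustration, and the kernel is assumed to be square.</p>

```python
def convolve_full(grid, kernel):
    """Full discrete convolution of a 2D grid with a square kernel.

    The result is larger than the grid by len(kernel) - 1 in each dimension.
    """
    width, height = len(grid), len(grid[0])
    m = len(kernel)
    out = [[0.0] * (height + m - 1) for _ in range(width + m - 1)]
    for x in range(width):
        for y in range(height):
            value = grid[x][y]
            for i in range(m):
                for j in range(m):
                    out[x + i][y + j] += value * kernel[i][j]
    return out


# A 5x5 kernel whose entries grow smaller outwards (3 in the centre).
kernel = [[3 - max(abs(i - 2), abs(j - 2)) for j in range(5)] for i in range(5)]
grid = [[0] * 4 for _ in range(4)]
grid[0][0] = 1  # a single sample in the corner
result = convolve_full(grid, kernel)
print(len(result), len(result[0]), result[2][2])
```

<p>The 4&#215;4 grid becomes 8&#215;8, and the sample at (0, 0) ends up centred on (2, 2) in the new grid, as described above. The result still has to be normalised.</p>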
<h2>About Normalisation</h2>
<p>It is customary to normalise the distribution so that all the probabilities add to 1. But for many purposes we need only relative probabilities, and this step can be skipped. For example, the method of generating random numbers described in a previous post uses only relative probabilities.</p>
<h2>A Few Tips</h2>
<ul>
<li>Always test your distribution by <a href="http://code-spot.co.za/2009/04/15/generating-random-points-from-arbitrary-distributions-for-2d-and-up/">generating a random set from it</a>, and comparing it with the original sample. They should match qualitatively.</li>
<li>The smaller your sample set, the cruder the approximation should be. That is, cells, neighbourhoods or convolution matrices should be big. There is a limit to the accuracy you can obtain from any sample – if you try to exceed it, your results will be poor.</li>
<li>The most common implementation errors are made at the borders (of any grid or matrix) – watch out for them!</li>
</ul>
<h2>Download</h2>
<p>There is an example implementation in 2D in the <a href="http://code-spot.co.za/python-image-code/">Python Image Code</a>. See the file random_distributions_demo.py.</p>


<p>Related posts:<ol><li><a href='http://code-spot.co.za/2009/04/15/generating-random-points-from-arbitrary-distributions-for-2d-and-up/' rel='bookmark' title='Permanent Link: Generating Random Points from Arbitrary Distributions for 2D and Up'>Generating Random Points from Arbitrary Distributions for 2D and Up</a></li>
<li><a href='http://code-spot.co.za/2009/04/28/generating-random-integers-with-arbitrary-probabilities/' rel='bookmark' title='Permanent Link: Generating Random Integers With Arbitrary Probabilities'>Generating Random Integers With Arbitrary Probabilities</a></li>
<li><a href='http://code-spot.co.za/2008/09/21/generating-random-numbers-with-arbitrary-distributions/' rel='bookmark' title='Permanent Link: Generating Random Numbers with Arbitrary Distributions'>Generating Random Numbers with Arbitrary Distributions</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://code-spot.co.za/2009/04/15/estimating-a-continuous-distribution-from-a-sample-set/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Generating Random Points from Arbitrary Distributions for 2D and Up</title>
		<link>http://code-spot.co.za/2009/04/15/generating-random-points-from-arbitrary-distributions-for-2d-and-up/</link>
		<comments>http://code-spot.co.za/2009/04/15/generating-random-points-from-arbitrary-distributions-for-2d-and-up/#comments</comments>
		<pubDate>Wed, 15 Apr 2009 13:53:38 +0000</pubDate>
		<dc:creator>herman.tulleken</dc:creator>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[2D]]></category>
		<category><![CDATA[distribution function]]></category>
		<category><![CDATA[grids]]></category>
		<category><![CDATA[n-dimensional distributions]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[random]]></category>
		<category><![CDATA[random distribution]]></category>
		<category><![CDATA[random number generation]]></category>
		<category><![CDATA[response curve]]></category>
		<category><![CDATA[Special Numbers Library]]></category>

		<guid isPermaLink="false">http://code-spot.co.za/?p=550</guid>
		<description><![CDATA[I have already covered how to generate random numbers from arbitrary distributions in the one-dimensional case. Here we look at a generalisation of that method that works for higher dimensions. The basic trick, while easy to understand, is hard to put into words (without resorting to mathematical equations). For two dimensions, we divide the plane [...]


Related posts:<ol><li><a href='http://code-spot.co.za/2008/09/21/generating-random-numbers-with-arbitrary-distributions/' rel='bookmark' title='Permanent Link: Generating Random Numbers with Arbitrary Distributions'>Generating Random Numbers with Arbitrary Distributions</a></li>
<li><a href='http://code-spot.co.za/2009/04/28/generating-random-integers-with-arbitrary-probabilities/' rel='bookmark' title='Permanent Link: Generating Random Integers With Arbitrary Probabilities'>Generating Random Integers With Arbitrary Probabilities</a></li>
<li><a href='http://code-spot.co.za/2009/04/15/estimating-a-continuous-distribution-from-a-sample-set/' rel='bookmark' title='Permanent Link: Estimating a Continuous Distribution from a Sample Set'>Estimating a Continuous Distribution from a Sample Set</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p><img style="display: inline" title="header_rand_dist" src="http://code-spot.co.za/blog/wp-content/uploads/2009/04/header-rand-dist.png" alt="header_rand_dist" width="500" height="375" /></p>
<p>I have already covered <a href="http://code-spot.co.za/2008/09/21/generating-random-numbers-with-arbitrary-distributions/">how to generate random numbers from arbitrary distributions</a> in the one-dimensional case. Here we look at a generalisation of that method that works for higher dimensions.</p>
<p>The basic trick, while easy to understand, is hard to put into words (without resorting to mathematical equations). For two dimensions, we divide the plane into slices. Each slice is a 1D distribution. We also calculate a distribution by summing the frequencies in each slice. The latter distribution gives us one coordinate, and the appropriate slice to use. The distribution of that slice then gives the second coordinate. All distributions are put into inverse accumulative response curves, as was done to generate <a href="http://code-spot.co.za/2008/09/21/generating-random-numbers-with-arbitrary-distributions/">one-dimensional random numbers</a>. (You should review that before implementing the 2D case.)</p>
<p>In more dimensions, we also slice the space up into 1D distributions. Sums of these give us more distributions, which we can sum again, and again, until we reach a single distribution. This is used for the first coordinate, and to determine which distribution to use for the next coordinate. This goes on, until a 1D slice gives us the final coordinate. Again, all distributions are converted to inverse accumulative response curves.</p>
<p>If the above is unclear, I hope the detailed description below clears things up.</p>
<p><span id="more-550"></span></p>
<h2>Two Dimensions</h2>
<h3>Describe the Distribution</h3>
<p>Divide the domain into a regular grid. Let us use the domain 10..40 x 10..40. This means that all the points we generate will fall in the square between the lines x = 10, x = 40, y = 10 and y = 40.</p>
<p>For later use, we will need the minimum x-coordinate of the domain (x0 = 10) and the width of the domain (w = x1 – x0 = 40 – 10 = 30).</p>
<p>Assign to each cell a relative frequency: the number of points that should fall in that cell for some convenient total number of points. You need not normalise these values.</p>
<p>For our example we will use the 4&#215;4 grid shown below:</p>
<table border="0" cellspacing="0" cellpadding="0" width="500">
<tbody>
<tr>
<td width="125" valign="top">1</td>
<td width="125" valign="top">2</td>
<td width="125" valign="top">4</td>
<td width="125" valign="top">8</td>
</tr>
<tr>
<td width="125" valign="top">2</td>
<td width="125" valign="top">3</td>
<td width="125" valign="top">5</td>
<td width="125" valign="top">11</td>
</tr>
<tr>
<td width="125" valign="top">4</td>
<td width="125" valign="top">5</td>
<td width="125" valign="top">7</td>
<td width="125" valign="top">11</td>
</tr>
<tr>
<td width="125" valign="top">8</td>
<td width="125" valign="top">11</td>
<td width="125" valign="top">11</td>
<td width="125" valign="top">11</td>
</tr>
</tbody>
</table>
<p>For later use, we need the width of the grid (gw = 4).</p>
<p>The method described here won’t work if any of these values are 0.</p>
<h3>Calculate Cumulative Grids</h3>
<p>Calculate the cumulative sums for the columns of the array.</p>
<table border="0" cellspacing="0" cellpadding="0" width="500">
<tbody>
<tr>
<td width="125" valign="top">1</td>
<td width="125" valign="top">2</td>
<td width="125" valign="top">4</td>
<td width="125" valign="top">8</td>
</tr>
<tr>
<td width="125" valign="top">3</td>
<td width="125" valign="top">5</td>
<td width="125" valign="top">9</td>
<td width="125" valign="top">19</td>
</tr>
<tr>
<td width="125" valign="top">7</td>
<td width="125" valign="top">10</td>
<td width="125" valign="top">16</td>
<td width="125" valign="top">30</td>
</tr>
<tr>
<td width="125" valign="top">15</td>
<td width="125" valign="top">21</td>
<td width="125" valign="top">27</td>
<td width="125" valign="top">41</td>
</tr>
</tbody>
</table>
<p>Calculate the cumulative sums of the last row (the column totals).</p>
<table border="0" cellspacing="0" cellpadding="0" width="500">
<tbody>
<tr>
<td width="125" valign="top">15</td>
<td width="125" valign="top">36</td>
<td width="125" valign="top">63</td>
<td width="125" valign="top">104</td>
</tr>
</tbody>
</table>
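<p>The two tables above can be reproduced with a few lines of Python. This is a minimal sketch, not the code from the download; the names grid, column_sums and row_sums are chosen here for illustration, and the grid is assumed to be stored as a list of rows.</p>

```python
from itertools import accumulate

# Frequencies from the 4x4 example grid, stored as grid[row][column].
grid = [
    [1, 2, 4, 8],
    [2, 3, 5, 11],
    [4, 5, 7, 11],
    [8, 11, 11, 11],
]

# Cumulative sums down each column (one list per column).
column_sums = [list(accumulate(column)) for column in zip(*grid)]

# The last entry of each column is its total; accumulate those totals.
row_sums = list(accumulate(column[-1] for column in column_sums))

print(column_sums[0])  # [1, 3, 7, 15]
print(row_sums)        # [15, 36, 63, 104]
```

<p>Note that column_sums holds one list per column, so it is the transpose of the first table above.</p>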
<h3>Construct Inverse Response Curves</h3>
<p>For each column of cumulative sums, prepend a zero, and create an inverse response curve as described for the one-dimensional case. Thus the first column’s inverse response curve will be created from 0, 1, 3, 7, 15. Call these curves cy[0] … cy[3].</p>
<p>For the row of cumulative sums, prepend a zero, and create an inverse response curve. Call this curve cx.</p>
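<p>One way to realise such a curve is sketched below. It performs a direct piecewise linear lookup instead of using the response curve classes from the one-dimensional post, so treat it as an illustration under stated assumptions: the function name make_inverse_curve and its parameters are hypothetical, the input u is a uniform number in [0, 1] that is scaled to the cumulative range inside the curve, and the cells are assumed to divide the interval from lo to hi evenly.</p>

```python
from bisect import bisect_right


def make_inverse_curve(cumulative, lo, hi):
    """Return a function that maps u in [0, 1] to a coordinate in [lo, hi]."""
    knots = [0] + list(cumulative)      # the zero placed in front of the sums
    cells = len(cumulative)
    total = knots[-1]

    def curve(u):
        target = u * total              # scale u to the cumulative range
        # Find the cell whose cumulative interval contains the target.
        i = min(bisect_right(knots, target) - 1, cells - 1)
        # Interpolate linearly inside that cell.
        t = (target - knots[i]) / (knots[i + 1] - knots[i])
        return lo + (i + t) * (hi - lo) / cells

    return curve


# Curve for the first column of the example, over the domain 10..40.
cy0 = make_inverse_curve([1, 3, 7, 15], 10, 40)
print(cy0(0.0), cy0(1.0))  # 10.0 40.0
```

<p>A cell with frequency 0 would make two neighbouring knots equal and cause a division by zero in this sketch, which is one way to see why zero values are not allowed.</p>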
<h3>Determine Random Point</h3>
<p>You are now ready to generate random points from the specified distribution.</p>
<p>First, generate two uniform random numbers, urx and ury.</p>
<p>Use urx (scaled to the input range of the curve, as in the one-dimensional case) to do a lookup in curve cx. The result rx gives the x-coordinate of your non-uniform random point. Use this number to decide which column curve to use: subtract the minimum domain x-coordinate, divide by the domain’s width, multiply by the number of columns, and floor the result to an integer.</p>
<p>ix = floor((rx – x0)/w * gw)</p>
<p>Now use ury to do a lookup in cy[ix]. The result ry is the y-coordinate of the number.</p>
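<p>The index calculation can be sketched as a small helper. The name cell_index is hypothetical, and the clamp is an added assumption: it keeps a coordinate that lands exactly on the upper boundary of the domain (x0 + w) inside the last column instead of producing an index one past the end.</p>

```python
import math


def cell_index(r, x0, w, gw):
    """Index of the grid cell that contains coordinate r."""
    i = math.floor((r - x0) / w * gw)
    return min(max(i, 0), gw - 1)   # clamp to the valid range 0..gw-1


print(cell_index(10, 10, 30, 4))    # 0
print(cell_index(25, 10, 30, 4))    # 2
print(cell_index(40, 10, 30, 4))    # 3
```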
<p><img style="display: inline; margin: 5px 10px 0px 0px" title="rand_dist_red" src="http://code-spot.co.za/blog/wp-content/uploads/2009/04/rand-dist-red.png" alt="rand_dist_red" width="200" height="200" align="left" />This image shows a plot of samples generated from the distribution specified above (normalised to fit in the image boundaries). Brighter areas indicate that more points occur in that region.</p>
<p>Visually, it looks like the sample mimics the original distribution well.</p>
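<p>A rough numerical check can complement the visual one. The sketch below puts the whole 2D procedure together and counts how many generated points land in each cell; the counts should be roughly proportional to the original frequencies. It is not the code from the download: the helper inverse_curve, the seed, and the sample size are assumptions made for this illustration.</p>

```python
import random
from bisect import bisect_right
from itertools import accumulate


def inverse_curve(cumulative, lo, hi):
    """Piecewise linear inverse of the cumulative sums; maps [0, 1] to [lo, hi]."""
    knots = [0] + list(cumulative)
    cells = len(cumulative)

    def curve(u):
        target = u * knots[-1]
        i = min(bisect_right(knots, target) - 1, cells - 1)
        t = (target - knots[i]) / (knots[i + 1] - knots[i])
        return lo + (i + t) * (hi - lo) / cells

    return curve


grid = [[1, 2, 4, 8], [2, 3, 5, 11], [4, 5, 7, 11], [8, 11, 11, 11]]
x0, y0, w, h, gw, gh = 10, 10, 30, 30, 4, 4

columns = [list(accumulate(column)) for column in zip(*grid)]
cx = inverse_curve(list(accumulate(c[-1] for c in columns)), x0, x0 + w)
cy = [inverse_curve(c, y0, y0 + h) for c in columns]


def sample(rng):
    """Generate one point (rx, ry) from the gridded distribution."""
    rx = cx(rng.random())
    ix = min(int((rx - x0) / w * gw), gw - 1)   # which column curve to use
    ry = cy[ix](rng.random())
    return rx, ry


rng = random.Random(1)          # fixed seed so the run is repeatable
n = 104000                      # 1000 points per unit of frequency
counts = [[0] * gw for _ in range(gh)]
for _ in range(n):
    rx, ry = sample(rng)
    ix = min(int((rx - x0) / w * gw), gw - 1)
    iy = min(int((ry - y0) / h * gh), gh - 1)
    counts[iy][ix] += 1

for row in counts:
    print(row)                  # each entry should be near 1000 * frequency
```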
<h2>N Dimensions</h2>
<h3>Describe the Distribution</h3>
<p>Divide the domain into a regular N-dimensional grid, and assign frequencies to the cells in the grid. For future calculations, we will need the lowest coordinate of the domain along each dimension (x0_k), the width of the domain for each dimension (w_k), and the number of cells in the grid for each dimension (gw_k).</p>
<h3>Calculate Cumulative Grids</h3>
<p>Accumulate sums along one dimension of the N-dimensional grid to obtain G_N. The last subgrid of G_N contains the totals of all “columns” along that dimension, and is an (N-1)-dimensional structure.</p>
<p>Accumulate sums of this subgrid along another dimension to obtain the (N-1)-dimensional grid G_(N-1). Again, the last subgrid contains the totals of the columns along that dimension.</p>
<p>Repeat this until a single row, G_1, is produced. In general, the last subgrid of G_k contains the totals of the columns of G_k along dimension k; these totals are accumulated into the (k-1)-dimensional grid G_(k-1).</p>
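<p>The repeated accumulation can be sketched with nested Python lists. The function names below are hypothetical, and the sketch assumes that the grid is indexed as grid[i_1][i_2]...[i_N], so that dimension N is the innermost list and is accumulated first.</p>

```python
from itertools import accumulate


def accumulate_innermost(grid):
    """Cumulative sums along the innermost dimension of a nested list."""
    if isinstance(grid[0], list):
        return [accumulate_innermost(sub) for sub in grid]
    return list(accumulate(grid))


def innermost_totals(cumulative):
    """Drop the innermost dimension, keeping the total of each column."""
    if isinstance(cumulative[0], list):
        return [innermost_totals(sub) for sub in cumulative]
    return cumulative[-1]


def cumulative_grids(grid, n):
    """Return the list [G_1, ..., G_n] for an n-dimensional nested list."""
    grids = []
    for _ in range(n):
        g = accumulate_innermost(grid)
        grids.append(g)
        grid = innermost_totals(g)      # one dimension fewer each time
    return grids[::-1]


# Hypothetical 2x2x2 frequencies.
g1, g2, g3 = cumulative_grids([[[1, 2], [3, 4]], [[5, 6], [7, 8]]], 3)
print(g3)  # [[[1, 3], [3, 7]], [[5, 11], [7, 15]]]
print(g2)  # [[3, 10], [11, 26]]
print(g1)  # [10, 36]
```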
<h3>Construct Inverse Response Curves</h3>
<p>Now, for each grid G_k, we need to construct inverse response curves from the columns along dimension k. Remember to prepend 0 to each column. The curves must be put into a (k-1)-dimensional structure, C_k.</p>
<h3>Determine Random Point</h3>
<p>Generate N uniformly distributed random numbers ur_1…ur_N.</p>
<p>Use ur_1 to do a lookup in C_1. The result r_1 is the first coordinate of your point. Use it to calculate the index i_1:</p>
<p><span style="font-family: 'Courier New'; line-height: 18px; white-space: pre;">i</span>_1 = floor((r_1 – x0_1)/w_1 * gw_1)</p>
<p>i_1 is the index of the curve to use from C_2: look up ur_2 in C_2[i_1] to obtain r_2. This is the second coordinate of your point. Determine i_2:</p>
<p><span style="font-family: 'Courier New'; line-height: 18px; white-space: pre;">i</span>_2 = floor((r_2 – x0_2)/w_2 * gw_2)</p>
<p>i_1 and i_2 determine the curve to use from C_3, i.e. the curve C_3[i_1, i_2].</p>
<p>Repeat this process until all coordinates are determined. In general, use ur_k to do a lookup in C_k[i_1, i_2, …, i_(k-1)] to obtain r_k. Use this to calculate i_k:</p>
<p><span style="font-family: 'Courier New'; line-height: 18px; white-space: pre;">i</span>_k = floor((r_k – x0_k)/w_k * gw_k)</p>
<p>That’s it!</p>
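<p>The N-dimensional procedure can be sketched compactly if we accept one simplification: instead of precomputing the structures C_k, the sketch below builds the curve it needs at each step from the totals of the current subgrid. It should produce the same distribution, but it is slower per point and is only meant to illustrate the order of the lookups. All function names are hypothetical, and the grid is assumed to be a nested list indexed as grid[i_1][i_2]...[i_N].</p>

```python
import random
from bisect import bisect_right
from itertools import accumulate


def total(block):
    """Sum of all frequencies in a nested list (or a single number)."""
    return sum(total(b) for b in block) if isinstance(block, list) else block


def lookup(cumulative, u, lo, hi):
    """Piecewise linear inverse lookup of u in [0, 1]; returns a coordinate."""
    knots = [0] + cumulative
    cells = len(cumulative)
    target = u * knots[-1]
    i = min(bisect_right(knots, target) - 1, cells - 1)
    t = (target - knots[i]) / (knots[i + 1] - knots[i])
    return lo + (i + t) * (hi - lo) / cells


def sample_point(grid, bounds, rng):
    """Generate one point; bounds holds a (lowest, highest) pair per dimension."""
    point = []
    block = grid
    for lo, hi in bounds:
        gw = len(block)
        # Totals of the subgrids along this dimension, accumulated.
        cumulative = list(accumulate(total(sub) for sub in block))
        r = lookup(cumulative, rng.random(), lo, hi)
        i = min(int((r - lo) / (hi - lo) * gw), gw - 1)   # index i_k
        point.append(r)
        block = block[i]        # descend into the chosen subgrid
    return point


grid = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]   # hypothetical 2x2x2 frequencies
bounds = [(0.0, 1.0)] * 3
rng = random.Random(0)
print(sample_point(grid, bounds, rng))
```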
<h2>Download</h2>
<p>An example implementation in 2D is included with the <a href="http://code-spot.co.za/python-image-code/">Python Image Code</a>. See the file random_distributions_demo.py. (I know it is annoying that it is coupled with all the other image code… I am working on a better solution!)</p>


<p>Related posts:<ol><li><a href='http://code-spot.co.za/2008/09/21/generating-random-numbers-with-arbitrary-distributions/' rel='bookmark' title='Permanent Link: Generating Random Numbers with Arbitrary Distributions'>Generating Random Numbers with Arbitrary Distributions</a></li>
<li><a href='http://code-spot.co.za/2009/04/28/generating-random-integers-with-arbitrary-probabilities/' rel='bookmark' title='Permanent Link: Generating Random Integers With Arbitrary Probabilities'>Generating Random Integers With Arbitrary Probabilities</a></li>
<li><a href='http://code-spot.co.za/2009/04/15/estimating-a-continuous-distribution-from-a-sample-set/' rel='bookmark' title='Permanent Link: Estimating a Continuous Distribution from a Sample Set'>Estimating a Continuous Distribution from a Sample Set</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://code-spot.co.za/2009/04/15/generating-random-points-from-arbitrary-distributions-for-2d-and-up/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>A Reference for Functional Equations</title>
		<link>http://code-spot.co.za/2009/02/26/a-reference-for-functional-equations/</link>
		<comments>http://code-spot.co.za/2009/02/26/a-reference-for-functional-equations/#comments</comments>
		<pubDate>Thu, 26 Feb 2009 11:06:00 +0000</pubDate>
		<dc:creator>herman.tulleken</dc:creator>
				<category><![CDATA[Downloads]]></category>
		<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[Snippet]]></category>
		<category><![CDATA[difference equation]]></category>
		<category><![CDATA[discrete calculus]]></category>
		<category><![CDATA[functional equation]]></category>
		<category><![CDATA[functional equations]]></category>
		<category><![CDATA[z-transform]]></category>

		<guid isPermaLink="false">http://code-spot.co.za/?p=406</guid>
		<description><![CDATA[I have not posted in a while; one reason is that I got sucked into some interesting mathematics; the work-in-progress Reference for Functional Equations is the result. If you are interested in such things &#8211; have a look.


Related posts:<ol><li><a href='http://code-spot.co.za/2009/05/27/update-reference-for-functional-equations/' rel='bookmark' title='Permanent Link: Update: Reference for Functional Equations'>Update: Reference for Functional Equations</a></li>
<li><a href='http://code-spot.co.za/2010/08/25/update-to-functional-equations-reference/' rel='bookmark' title='Permanent Link: Update to Functional Equations Reference'>Update to Functional Equations Reference</a></li>
<li><a href='http://code-spot.co.za/2010/09/30/update-to-functional-equations-reference-version-1-3/' rel='bookmark' title='Permanent Link: Update to Functional Equations Reference (version 1.3)'>Update to Functional Equations Reference (version 1.3)</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-421" title="1052727062_0ec2c67ea4_small" src="http://code-spot.co.za/blog/wp-content/uploads/2009/02/1052727062_0ec2c67ea4_small.jpg" alt="1052727062_0ec2c67ea4_small" width="142" height="142" />I have not posted in a while; one reason is that I got sucked into some interesting mathematics; the work-in-progress <a href="http://code-spot.co.za/difference-and-functional-equations-reference/">Reference for Functional Equations</a> is the result. If you are interested in such things &#8211; have a look.</p>


<p>Related posts:<ol><li><a href='http://code-spot.co.za/2009/05/27/update-reference-for-functional-equations/' rel='bookmark' title='Permanent Link: Update: Reference for Functional Equations'>Update: Reference for Functional Equations</a></li>
<li><a href='http://code-spot.co.za/2010/08/25/update-to-functional-equations-reference/' rel='bookmark' title='Permanent Link: Update to Functional Equations Reference'>Update to Functional Equations Reference</a></li>
<li><a href='http://code-spot.co.za/2010/09/30/update-to-functional-equations-reference-version-1-3/' rel='bookmark' title='Permanent Link: Update to Functional Equations Reference (version 1.3)'>Update to Functional Equations Reference (version 1.3)</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://code-spot.co.za/2009/02/26/a-reference-for-functional-equations/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Generating Random Numbers with Arbitrary Distributions</title>
		<link>http://code-spot.co.za/2008/09/21/generating-random-numbers-with-arbitrary-distributions/</link>
		<comments>http://code-spot.co.za/2008/09/21/generating-random-numbers-with-arbitrary-distributions/#comments</comments>
		<pubDate>Sun, 21 Sep 2008 18:16:59 +0000</pubDate>
		<dc:creator>herman.tulleken</dc:creator>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[Simulation]]></category>
		<category><![CDATA[Tutorial]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[distribution function]]></category>
		<category><![CDATA[probability]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[random]]></category>
		<category><![CDATA[random number generation]]></category>
		<category><![CDATA[response curve]]></category>
		<category><![CDATA[sampling]]></category>
		<category><![CDATA[Special Numbers Library]]></category>

		<guid isPermaLink="false">http://code-spot.co.za/?p=123</guid>
		<description><![CDATA[For many applications, detailed statistical models are overkill. Instead, we can get away with a rough description of the distribution &#8211; not in mathematical formula form, but just as a graph with a few sample points. For example, when trying to model the traffic around a school, you might know that the graph looks something [...]


Related posts:<ol><li><a href='http://code-spot.co.za/2009/04/15/generating-random-points-from-arbitrary-distributions-for-2d-and-up/' rel='bookmark' title='Permanent Link: Generating Random Points from Arbitrary Distributions for 2D and Up'>Generating Random Points from Arbitrary Distributions for 2D and Up</a></li>
<li><a href='http://code-spot.co.za/2009/04/28/generating-random-integers-with-arbitrary-probabilities/' rel='bookmark' title='Permanent Link: Generating Random Integers With Arbitrary Probabilities'>Generating Random Integers With Arbitrary Probabilities</a></li>
<li><a href='http://code-spot.co.za/2008/12/07/random-steering-7-components-for-a-toolkit/' rel='bookmark' title='Permanent Link: Random Steering &#8211; 7 Components for a Toolkit'>Random Steering &#8211; 7 Components for a Toolkit</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<div class="zemanta-img zemanta-action-click">
<p class="zemanta-img-attribution">For many applications, detailed statistical models are overkill. Instead, we can get away with a rough description of the distribution &#8211; not in mathematical formula form, but just as a graph with a few sample points.</p>
<p class="zemanta-img-attribution">For example, when trying to model the traffic around a school, you might know that the graph looks something like this:</p>
<p class="zemanta-img-attribution"><img src="http://code-spot.co.za/blog/wp-content/uploads/2008/09/school.png" alt="school" width="481" height="289" /></p>
<p class="zemanta-img-attribution">The input is the number of minutes before the first bell rings, and the output the number of children dropped off at that time. You know that most kids are brought before the bell rings, and that the closer to the bell, the more kids are being brought every minute. Only a few kids are late.</p>
<p class="zemanta-img-attribution">This tutorial describes how to generate random numbers that follow a distribution described by an arbitrary (piecewise linear) curve, such as the one above.</p>
<p class="zemanta-img-attribution"><span id="more-123"></span></p>
<h2>Implementation</h2>
</div>
<h3>1. Calculate an accumulative probability</h3>
<p>The first step is to run an accumulative sum of our original sample points. The following code snippet shows how it can be done in C++; the array &#8220;samples&#8221; contains the original samples.</p>
<div>
<pre style="font-size: 8pt; margin: 0em; overflow: visible; width: 100%; color: black; line-height: 12pt; font-family: consolas, 'Courier New', courier, monospace; background-color: #f4f4f4; border-style: none; padding: 0px;"><span style="color: #0000ff">for</span>(<span style="color: #0000ff">int</span> i = 1; i &lt; n; i++)
{
    samples[i] += samples[i - 1];
}</pre>
</div>
<p>   The snippet above does the calculation in place, but this is not necessary, as the snippet below shows:</p>
<div>
<pre style="font-size: 8pt; margin: 0em; overflow: visible; width: 100%; color: black; line-height: 12pt; font-family: consolas, 'Courier New', courier, monospace; background-color: #f4f4f4; border-style: none; padding: 0px;">newSamples[0] = samples[0];

<span style="color: #0000ff">for</span>(<span style="color: #0000ff">int</span> i = 1; i &lt; n; i++)
{
    newSamples[i] = samples[i] + newSamples[i - 1];
}</pre>
</div>
<p>   <img src="http://code-spot.co.za/blog/wp-content/uploads/2008/09/accumulative.png" alt="accumulative" width="481" height="289" /> The graph above shows the accumulative probability density.</p>
<h3>2. Calculate inverse sample points</h3>
<p>This step is easy &#8211; we simply swap the input and output samples. It is not necessary to do this explicitly, just swap them in the argument list of the function call in the next step.  <img src="http://code-spot.co.za/blog/wp-content/uploads/2008/09/inverse.png" alt="inverse" width="481" height="289" /></p>
<h3>3. Interpolate between samples</h3>
<p>I use a special data structure that makes it very easy to compute the interpolations from discrete sample points: the response curve. There are two varieties: the ordinary response curve, with uniformly spaced samples, and the xy-response curve, where samples can be arbitrarily spaced.  <img src="http://code-spot.co.za/blog/wp-content/uploads/2008/09/u-response.png" alt="u-response" width="481" height="289" /> <strong>Ordinary Response Curve.</strong> <img src="http://code-spot.co.za/blog/wp-content/uploads/2008/09/xy-response.png" alt="xy-response" width="481" height="289" /> <strong>XY-Response Curve.</strong> The implementation is very simple, so I won&#8217;t describe it here. A C++ implementation is available in the <a href="http://code.google.com/p/specialnumbers/">Special Numbers Library</a>; a Python implementation is available with the example below. Here I simply explain the usage as it relates to this tutorial.  We proceed as follows:</p>
<ul>
<li>First, we construct an xy-response curve from our samples of the inverse accumulative probability.</li>
<li>Then we sample this curve at regular intervals, and load the result into an ordinary response curve (we do this simply because the ordinary response curve is much faster than the xy-version).</li>
</ul>
<div>
<pre style="font-size: 8pt; margin: 0em; overflow: visible; width: 100%; color: black; line-height: 12pt; font-family: consolas, 'Courier New', courier, monospace; background-color: #f4f4f4; border-style: none; padding: 0px;"><span style="color: #008000">//...</span></pre>
<pre style="font-size: 8pt; margin: 0em; overflow: visible; width: 100%; color: black; line-height: 12pt; font-family: consolas, 'Courier New', courier, monospace; background-color: #f4f4f4; border-style: none; padding: 0px;"><span style="color: #0000ff">const</span> <span style="color: #0000ff">int</span> oldSampleCount = 7; <span style="color: #008000">//const, so it can be used as a template argument</span></pre>
<pre style="font-size: 8pt; margin: 0em; overflow: visible; width: 100%; color: black; line-height: 12pt; font-family: consolas, 'Courier New', courier, monospace; background-color: #f4f4f4; border-style: none; padding: 0px;"><span style="color: #0000ff">const</span> <span style="color: #0000ff">int</span> newSampleCount = 20;

<span style="color: #0000ff">float</span> inputMin = 0.0f;
<span style="color: #0000ff">float</span> inputMax = 3.2f;

<span style="color: #008000">//note: input and output are swapped around, because we want the inverse!</span></pre>
<pre style="font-size: 8pt; margin: 0em; overflow: visible; width: 100%; color: black; line-height: 12pt; font-family: consolas, 'Courier New', courier, monospace; background-color: #f4f4f4; border-style: none; padding: 0px;">XYResponseCurve&lt;<span style="color: #0000ff">float</span>, oldSampleCount&gt; xyCurve(outputSamples, inputSamples);</pre>
<pre style="font-size: 8pt; margin: 0em; overflow: visible; width: 100%; color: black; line-height: 12pt; font-family: consolas, 'Courier New', courier, monospace; background-color: #f4f4f4; border-style: none; padding: 0px;"><span style="color: #0000ff">for</span>(<span style="color: #0000ff">int</span> i = 0; i &lt; newSampleCount; i++)</pre>
<pre style="font-size: 8pt; margin: 0em; overflow: visible; width: 100%; color: black; line-height: 12pt; font-family: consolas, 'Courier New', courier, monospace; background-color: #f4f4f4; border-style: none; padding: 0px;">{</pre>
<pre style="font-size: 8pt; margin: 0em; overflow: visible; width: 100%; color: black; line-height: 12pt; font-family: consolas, 'Courier New', courier, monospace; background-color: #f4f4f4; border-style: none; padding: 0px;">    <span style="color: #0000ff">float</span> input = ((<span style="color: #0000ff">float</span>) i / (newSampleCount - 1)) * (inputMax - inputMin) + inputMin;</pre>
<pre style="font-size: 8pt; margin: 0em; overflow: visible; width: 100%; color: black; line-height: 12pt; font-family: consolas, 'Courier New', courier, monospace; background-color: #f4f4f4; border-style: none; padding: 0px;">    uniformOutput[i] = xyCurve(input);</pre>
<pre style="font-size: 8pt; margin: 0em; overflow: visible; width: 100%; color: black; line-height: 12pt; font-family: consolas, 'Courier New', courier, monospace; background-color: #f4f4f4; border-style: none; padding: 0px;">}</pre>
<pre style="font-size: 8pt; margin: 0em; overflow: visible; width: 100%; color: black; line-height: 12pt; font-family: consolas, 'Courier New', courier, monospace; background-color: #f4f4f4; border-style: none; padding: 0px;">ResponseCurve&lt;<span style="color: #0000ff">float</span>, newSampleCount&gt; curve(inputMin, inputMax, uniformOutput);</pre>
</div>
<p>   <img src="http://code-spot.co.za/blog/wp-content/uploads/2008/09/uniform.png" alt="uniform" width="481" height="289" /></p>
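<p>For readers who want to see what the two curve types do, here is a minimal Python sketch. It is not the Special Numbers Library API and not the Python code from the download: the class names merely mirror the C++ snippet above, the sample values are hypothetical, and both classes are assumed to interpolate linearly and to clamp inputs to their domain.</p>

```python
from bisect import bisect_right


class XYResponseCurve:
    """Piecewise linear curve through arbitrarily spaced (x, y) samples."""

    def __init__(self, xs, ys):
        self.xs, self.ys = list(xs), list(ys)   # xs must be strictly increasing

    def __call__(self, x):
        x = min(max(x, self.xs[0]), self.xs[-1])          # clamp to the domain
        i = min(bisect_right(self.xs, x) - 1, len(self.xs) - 2)
        t = (x - self.xs[i]) / (self.xs[i + 1] - self.xs[i])
        return self.ys[i] + t * (self.ys[i + 1] - self.ys[i])


class ResponseCurve:
    """Piecewise linear curve through uniformly spaced samples on [lo, hi]."""

    def __init__(self, lo, hi, ys):
        self.lo, self.hi, self.ys = lo, hi, list(ys)

    def __call__(self, x):
        x = min(max(x, self.lo), self.hi)
        pos = (x - self.lo) / (self.hi - self.lo) * (len(self.ys) - 1)
        i = min(int(pos), len(self.ys) - 2)     # direct index, no search needed
        return self.ys[i] + (pos - i) * (self.ys[i + 1] - self.ys[i])


# Hypothetical inverse samples: cumulative values in, original inputs out.
xy_curve = XYResponseCurve([0.0, 1.0, 3.0, 7.0, 15.0],
                           [10.0, 17.5, 25.0, 32.5, 40.0])

# Resample at regular intervals and load into the faster uniform curve.
n = 20
resampled = [xy_curve(15.0 * i / (n - 1)) for i in range(n)]
curve = ResponseCurve(0.0, 15.0, resampled)

print(xy_curve(5.0))  # 28.75
```

<p>The uniform curve finds its segment with a single multiplication, whereas the xy-version has to search for it, which is where the speed difference comes from. The price is that the resampled curve only approximates the original near its corner points, so the number of samples matters.</p>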
<h3>4. Map uniform random numbers to the input range of the inverse accumulative probability curve, and calculate the output</h3>
<p>Now that we have the response curve, we can map uniform random numbers to the appropriate input range. For the example above, we need to map to the range [0, 3.2]. The snippet below shows how to generate random numbers with the distribution shown above:</p>
<div>
<pre style="font-size: 8pt; margin: 0em; overflow: visible; width: 100%; color: black; line-height: 12pt; font-family: consolas, 'Courier New', courier, monospace; background-color: #f4f4f4; border-style: none; padding: 0px;"><span style="color: #008000">//...create the curve</span></pre>
<pre style="font-size: 8pt; margin: 0em; overflow: visible; width: 100%; color: black; line-height: 12pt; font-family: consolas, 'Courier New', courier, monospace; background-color: #f4f4f4; border-style: none; padding: 0px;"><span style="color: #0000ff">float</span> input = random(); <span style="color: #008000; ">//Uniformly distributed random number between 0 and 1.</span></pre>
<pre style="font-size: 8pt; margin: 0em; overflow: visible; width: 100%; color: black; line-height: 12pt; font-family: consolas, 'Courier New', courier, monospace; background-color: #f4f4f4; border-style: none; padding: 0px;"><span style="color: #0000ff">float</span> scaledInput = input * (inputMax - inputMin) + inputMin;</pre>
<pre style="font-size: 8pt; margin: 0em; overflow: visible; width: 100%; color: black; line-height: 12pt; font-family: consolas, 'Courier New', courier, monospace; background-color: #f4f4f4; border-style: none; padding: 0px;"><span style="color: #0000ff">float</span> output = curve(scaledInput);</pre>
</div>
<p>   The graph below shows how 10 000 random numbers are distributed. It follows the original graph closely; the discrepancy at -10 is caused by the way samples are counted: all samples between -10 (inclusive) and 0 (exclusive) are counted and plotted at -10.   <img src="http://code-spot.co.za/blog/wp-content/uploads/2008/09/result.png" alt="result" width="481" height="289" /></p>
<h2>Tips and Pitfalls to Avoid</h2>
<ol>
<li>Generate sequences for all intermediary steps.</li>
<li>Use Excel, Calc, or some other spreadsheet program to debug these sequences visually when things go wrong.</li>
<li>It is very easy to confuse input and output, especially after the swap. Watch out for this!</li>
<li>It is easy to get confused with the number of samples for the various sequences.</li>
<li>If you implement your own Response Curve data structure, unit tests will save you huge amounts of time.</li>
<li>Always make sure that you sample enough points, especially if your original distribution graph has rapid changes in it.</li>
<li>Always confirm that your random output follows the distribution you wanted.</li>
<li>It might be faster to use this method even when mathematical formulas are available.</li>
</ol>
<h2>Downloads</h2>
<h3>Example C++ Source Code</h3>
<p><a href="http://code-spot.co.za/downloads/cpp_examples/arbitrary_distribution.cpp">http://code-spot.co.za/downloads/cpp_examples/arbitrary_distribution.cpp</a> Requires <span style="font-weight: normal;"><a href="http://code.google.com/p/specialnumbers/">Special Numbers Library</a></span></p>
<h3>Example Python Source Code</h3>
<p><a href="http://code-spot.co.za/downloads/python_examples/random_distributions.py">http://code-spot.co.za/downloads/python_examples/random_distributions.py</a></p>


<p>Related posts:<ol><li><a href='http://code-spot.co.za/2009/04/15/generating-random-points-from-arbitrary-distributions-for-2d-and-up/' rel='bookmark' title='Permanent Link: Generating Random Points from Arbitrary Distributions for 2D and Up'>Generating Random Points from Arbitrary Distributions for 2D and Up</a></li>
<li><a href='http://code-spot.co.za/2009/04/28/generating-random-integers-with-arbitrary-probabilities/' rel='bookmark' title='Permanent Link: Generating Random Integers With Arbitrary Probabilities'>Generating Random Integers With Arbitrary Probabilities</a></li>
<li><a href='http://code-spot.co.za/2008/12/07/random-steering-7-components-for-a-toolkit/' rel='bookmark' title='Permanent Link: Random Steering &#8211; 7 Components for a Toolkit'>Random Steering &#8211; 7 Components for a Toolkit</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://code-spot.co.za/2008/09/21/generating-random-numbers-with-arbitrary-distributions/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

