I found π using bash and Montecarlo method and it was shocking!

Clickbait!!! Just joking 😀

A few years ago I read a post (in italian, here, see the linked articles) from a fellow blogger, discussing the Montecarlo method to evaluate (or rather, approximate!) $\pi$. In layman’s terms, if you have a square with side length L, its area will be $L^2$. The diameter of the circle inscribed in it will also be L resulting in area $A = \pi L^2/4$ that is $\pi/4$ ≅ 0.785 times the area of the square.

Now, if you pick N random points within the whole square, some of them will be within the circle, their count approaching $\frac{\pi}{4} N$ as N gets larger. In other words, the probability p for a point to be inside the circle is $\pi/4$.

What’s next? Pull out some random numbers (spoiler: integers!), count, compute, plot.

Yeah, plot, with bash ( 😎 *wears sunglasses*), just for fun.

Setting the stage

Numbers to shoot to target (the circle inscribed in the square) are kindly provided by $RANDOM, ranging from 0 to 32767 ($2^{15}-1$). Two of them make a couple of coordinates so, since they are both ≥ 0, the points are located only in the 1st quadrant of the Cartesian plane. Basically we are dealing with a quarter of a circle, but also a quarter of a square, therefore the areas ratio still stands. A bit of maths to check whether we’re in or out of the circular sector, some more to normalize number for the “plotting”, and finally a carefully crafted printf to place the mark. Moving on to code: #!/bin/bash # Estimate pi with Montecarlo method: # Draw random points within a quarter of circle # Display help function about() { echo "Estimate the value of pi with Montecarlo method: draw a bunch of random numbers {0..32767}, do some maths mumbo-jumbo and BAM! here's pi (sort of...). Usage:$(basename $0) [<draws>|h] <draws> defaults to 1000 if not passed as argument on the command line h shows help" exit 0 } # Show help if argument is "h" if [[ "x$*" == "xh" ]]; then
fi

# Window dimensions
cols=$(tput cols) # horiz. axis (x) lines=$(tput lines) # vert. axis (y) - save 2 lines for result and prompt

# Exit if window size has less than 15 cols/lines
if [[ $(($cols+$lines)) -lt 30 ]]; then echo "Window must have at least 15 columns or lines" exit 1 fi tput clear draws=${1:-1000}
# Hit counters
on_tgt=0
off_tgt=0
r_sqd=$(($radius**2))

# Check if points are inside or outside
for (( i=0; i<draws; i++ )); do
x=$RANDOM y=$RANDOM
p_sqd=$((($x**2)+($y**2))) if [[$p_sqd -le $r_sqd ]] # within quarter of circle then mark="*" color=32 #green on_tgt=$(($on_tgt+1)) else mark="-" color=31 # red off_tgt=$(($off_tgt+1)) fi # Normalize to terminal width/height norm_x=$(($x*$cols/$radius)) norm_y=$(($y*$lines/$radius)) printf "\033[${norm_y};${norm_x}H\033[${color};1m${mark}" done # move cursor to actual last available line printf "\033[$((${lines}+1));1H" echo -e "\033[34;1mEstimated value of pi ($on_tgt hits / $draws draws):$(echo "scale=6; 4*$on_tgt/$draws" | bc)"
exit 0


That’s it, pretty easy. I am basically checking the couple of numbers against the circle equation $x^2+y^2=r^2$, that defines a circle with radius $r$ centered in $O(0,0)$.

“Plotting” results is quite nice, at least for the visuals, but of course not very… ehmcircular. Instead of single pixels I am using characters, “boxes” that are tall and narrow: depending on the proportion of the terminal font of choice, for example 1 drawing unit on $x$ axis (columns) might span across 8 pixels, 1 drawing unit on $y$ (rows) across 14 pixels. For this reason, also, the typical window has more columns than lines: therefore the values 0 – 32767 on $y$ axis (that, by the way, points downwards – row 1 is on top of window, last row is at the bottom) are “normalized” on a smaller distance, i.e. “squeezed” on a shorter axis.

But… is it cool to see it running, scattering coloured marks across an humble terminal window? Definitely yes. Someone should make a screensaver out of it!

The trick is just the “printf \033[<line>;<row>H” instruction, the moves the cursor to the desired position: it might not be useful in the usual text pretty print, but I can imagine a few more “recreational” uses of it. Think about a battleship game, crosswords or, more simply, a scoreboard printed on screen that has fields to populate with players’ scores. One of these ideas should sooner or later become a script… but only time will tell which one and when!

Let’s move on to maths: how good are the results and how much does the approximation improve with the number of draws N? Ehm… any other question? 😉

Actually, results are slightly variable and inconsistent with N: you can get “good” approximations (up to 3rd or even 4th decimal digit) with fewer draws and, vice-versa, “bad” approximations with lots or draws. The main culprit is the $RANDOM generator, that is not random enough: I am going to investigate this issue soon and also try different sources of random numbers. That said, using $RANDOM seemed anyway enough for the “educational” purpose of the script. The right decimals, zillions of them, of $\pi$ have already been calculated in a more rigorous (and hardware-intensive) way. Moreover, for usage in real life calculations, we don’t need more than 6 or 7 digits, according to what even NASA people say!

In the meantime, I checked what happens with a wide range of N, adding repeated runs to be on the safe side. For this purpose I slightly modified the script, removing all the code related to screen output and including the redirection to a text file for further analysis.

...
echo $draws$(echo "scale=6; 4*$on_tgt/$draws" | bc ) >> results_file
exit 0


The one-liner loop makes 5 runs for each N, ranging from 501 to 20001 (round numbers reduce significant digits):

for n in {501..20001..500}; do \
for rep in {1..5}; do \
bash monte_pi_results.sh $n; \ done; \ done  Sample file content: 501 3.096000 501 3.168000 501 3.248000 ... 20001 3.136000 20001 3.136200 Once you get the data, a few more one-liners provide insights of the results: • Minimum / Maximum estimated value: sort -n -k2 results_file | sed -n -e '1p;$p'


Sample output:

9001 3.112098
1001 3.212787
• Results matching at least 2 or 3 digits (3.14 or 3.141) of actual $\pi$ (and 3 right digits are all most people need, remember that!):
grep "3\.141" results_file


Sample output:

11001 3.141532
19001 3.141729
• Average estimated value by N:
for d in {501..20001..500}; do \
awk -v draws=$d 'BEGIN {avg=0; n=0}; \$1==draws { avg = avg+$2; n+=1 }; \ END {printf("%s\t%f\n", draws, avg/n)}' results_file; \ done  Sample output: 501 3.134414 1001 3.212787 ... 19501 3.141887 20001 3.150600 Finally, after so many wacky results, I started wondering what could be the best estimation my semi-flawed script could hand me: I wanted to simulate the perfect run! That’d mean hitting all the possible “points” in the 32767 × 32767 (quarter of) square just once. To accomplish the task, I simply inverted the circle equation, solving it by $y$ and calculating the floor of the square root (because I only need integers and points inside the circular sector): $y\leq\lfloor\sqrt{r^2-x^2}\rfloor$. Setting no scale value for bc does the trick for the floor evaluation, since all decimals are cut off. Of course, that’s another (almost) one-liner to get a file containing the value of $x$, its square, the difference $r^2-x^2$ and finally the floor of the square root of the latter value: max=$((32767**2))
for i in {0..32767}; do \
isqd=$((i**2)); \ diff=$(($max-(i**2))); \ froot=$(echo "sqrt($diff)" | bc); \ echo -e "$i\\t$isqd\\t$diff\\t$froot" >> perfect_run; \ done  Sample content of file: 0 0 1073676289 32767 1 1 1073676288 32766 2 4 1073676285 32766 3 9 1073676280 32766 ... 32766 1073610756 65533 255 32767 1073676289 0 0 Now, summing up values in the last field results in the count of all points inside the approximate (i.e. drawn at discrete steps) quarter of circle. Dividing this value by the total number of points provides the “best” estimation of $\pi$ you can get with the given number of available points: on_tgt=$(awk 'BEGIN {tot=0} {tot=tot+$4} END {print tot}' perfect_run) echo "scale=50; 4*$on_tgt/\$((32767**2))" | bc -l
3.14159229793701814718942722223047993565218799388052


The result is correct up to the 6th digit (not bad…), compared to the actual value (as found here: $3.14159265358979323846264338327950288419716939937510$) that is just ~0.0000003 or 0.00001% off. Can you live with that? Most probably you can! 🙂

Well, that’s all, at last! Stay tuned and $\pi$ attention to next posts: more random (literally) stuff is coming on this topic!

Annunci