# A starter in Python: finding “curious numbers”

After so much procrastination (an art I seem to master quite well…), at last I started messing with Python. The final kick-start came from a short post about “curious numbers”  on gorgeous John D. Cook’s blog. In short, a number is curious when it has the last digits of its square equal to the number itself, e.g. 252 = 625. That’s all I needed to know.

Finding all curious numbers up to a given number of digits seemed a good task to start writing a few lines of Python. Bash shell was an option, too, although I should have limited the search to numbers up ~3.037×109 (the square root of the largest integer the shell can handle, 263-1 ≅ 9.22×1018).

One last note: you’re about to read a quite long rambling about coding, numbers and learning Python. I chose to describe at length the process that brought me from the basics to functional code, gradually improved and extended, rather than providing the final result of my efforts. It is a way to describe the method I’m trying to adopt to learn Python and, hopefully, a useful example of approach.

## Keyboard & Screen

First attempt is rough and inefficient: brute force processing all numbers n between 102 and 109 (exponent defined by variable order) and check if they meet the requirements (last order digits are the same as n). It just worked, no matter how well, as I was mainly getting used to the peculiar Python syntax at the time: indentation to define blocks of code and no closing words for clauses (no if ... fi or do ... done like *nix shell, for example).

First loop of improvement (no optimizations, yet!): interactively set the upper limit for search, accepting only integers ≥ 3, and measure execution time. Typical Python stuff involved here: import what you need, try something / except when input is garbage, and boast the 3.x style of print():

import time

while True:
try:
limit = int(input('Enter limit: '))
if limit >= 3:
break
except ValueError:
pass

texec = time.clock()

for order in range(2, limit+1):
for n in range(int(10**(order-1)), 10**order):
n_sqr = n**2
tail = (n**2 % 10**order)
if n == tail:
print(order,n_sqr,tail)

texec = time.clock() - texec

print('Execution time: ' + str(texec) + ' seconds')


Sample output:

2 625 25 2 5776 76 3 141376 376 3 390625 625 4 87909376 9376 5 8212890625 90625 6 11963109376 109376 6 793212890625 890625 7 8355712890625 2890625 7 50543227109376 7109376 Execution time: 14.183878 seconds

Data show how bad this implementation is (spoiler: very bad!). When processing all numbers with 2 up to limit digits, things escalate quickly: at order = 2 the search is in interval [10, 99] (90 numbers), at order = 3 in [100, 999] (900 additional numbers to search, 990 in total) and so on. The table below summarizes the magnificent inefficiency of this iteration.

Order
(N)
Processed
numbers (10N-10)
Execution
time (s)
timeN / timeN-1
3 990 0.002199
4 9990 0.015459 ~7
5 99990 0.139038 ~9
6 999990 1.402630 ~10
7 9999990 14.183878 ~10
8 99999990 134.339030 ~9.5

The duration of each whole run increases by a factor of ~10 when the search limit is increased by 1, of course because there are ~10 times more numbers to process (hint: geometric progression). You can figure out what it means to reach order = 9 or higher. I did it once, for science sake, and never again: the longest ~27 minutes of my life.

To go up, up, and away I have to narrow the search.

## Pen & Paper

The sample output above shows a posteriori the path to optimize the search. Among single digit numbers, only 0, 1, 5 and 6 can be classified as curious. $0^2=\textbf{0} \quad 1^2=\textbf{1} \quad 5^2=2\textbf{5} \quad 6^2=3\textbf{6}$

That narrows down the possible choices quite a lot; moreover, 0 and 1 are “special cases” valid only for single digit numbers (i.e. themselves!) therefore only 5 and 6 can lead to a curious number.

That is easily demonstrated: for a generic 2-digit number ending with 1, for example ( $d\;1$ means first digit is $d$, second digit is $1$)

    d  1 *
d  1 =
__________
d  1 +
d2  d    =
__________
d2 2d  1 $2d$ (2nd digit of squared number) = $d$ (corresponding digit in number) is true only for $d=0$ (falls back to 1-digit number case). Things are a little more interesting for $d\;5$ and $d\;6$: the first easily leads to 25 as a solution, the latter allows for a few additional considerations.

The square of $d\;6$ results in $(d^2+ d)\;(2d + 3)\;6$ that has the solution $2d + 3 = d$ for $d = -3$, which makes no sense because $d$ is a single digit. Since $2d + 3$ has to be a single digit, too, carrying 1 or 2 to the left digit if $\geq 10$ (for $d \geq 4$), we need the modulo 10 of it (the remainder of division by 10). So: $a/b = \textbf{q}uotient+\textbf{r}emainder\quad\\ r=a-bq|_{a=2d+3, b=10}=2d+3-10q$

and finally $2d + 3 - 10q = d \to d=10q-3$; any (integer) value of q provides a viable solution, but only $q=1$ gives a single digit one, $d=7$, leading to 76 as curious number. Great! Any number ending in 25 or 76 is good: the script will be processing 2×10limit-2 compared to 10limit-10 numbers (2×106 compared to ~108 at limit = 8 for example, ~50 times less) with a massive efficiency improvement. Measured execution time is consistent with the reduced amount of processed numbers: 2.64s are roughly 1/50th of the previous 134.34s.

And I used a tuple! And I had no idea at all two numbers enclosed in parentheses were a tuple! Yeah, how cool is this thing? 😉

# blah blah import and input stuff unchanged
for order in range(2, limit+1):
for n in range(10**(order-2), 10**(order-1)):
for last in (25, 76):
num = (n * 100) + last
num_sqr = num**2
tail = (num_sqr % (10**order))
if num == tail:
print(order, num_sqr, tail)
# blah blah time and printing stuff unchanged


The long execution times at higher limits are only postponed, though.

The final leap towards the best implementation (at least, the best I could devise!) is made when the last digits to append to numbers are updated at each loop with the results. That is: first time (order = 2) the script finds 25 and 76, stores them and uses them for the next loop (order = 3). Now 25 and 76 will be appended to form a 3-digit number, meaning we only need to loop over 1 remaining digit, (9 values, [1,9]) to find the next couple of numbers, 376 and 625. Overall, a total of a mere 9×limit numbers will be processed! That means you probably won’t notice the execution time until you set limit=106 or so (that’ll be 106 digits!). Again, this statement, already shown by program output, can be demonstrated quite easily comparing side by side the squares of two generic numbers ending with the same digit. I’ll consider last digit 6, 2- and 3-digit numbers for the example:

      d   6 *                  c     d   6 *
d   6 =                  c     d   6 =
_____________       ________________________
6d+3 6 +                 6c    6d+3 6 +
d2   6d     =           cd    d2     6d    +
_____________       c2  cd    6c           =
d2+d 2d+3 6         ________________________
c2 cd+c 2c+d2+d 2d+3 6

Once a solution $S$ is found for a $k$ digit long number, the solution for $k+1$ digits must originate from $S$ because the squares of both numbers share the same last $k$ digits. Finally, it is also demonstrated why there are only 2 curious number for any given number of digits (including the special case of 1st digit = 0): moving to the left digits, the expressions to calculate their value is always a 1st degree equation. In the example above (3rd and 4th digit of $(c\;d\;6)^2$ with $d=7$ as calculated before): $2c+d^2+d\quad2d+3|_{d=7}=2c+49+7\quad14+3=2c+56+1\quad7$ and finally $2c+57=c \to c=-57 \to c=3$ (due to all the modulo stuff above).

A few sheets of paper were harmed in the process, but for a good reason.

## Back to Keyboard & Screen

The modifications to previous version are quite little, and yet they have a dramatic effect on speed. As I realized from all the scribbling shown above, I have to feed the calculation loops with the result of one iteration as last digits of the next one. So, I started with a list of the first two (last_digits) that is updated after a solution is found (replacing the current value last, just used for the calculations, with the new one num: last_digits[last_digits.index(last)] = num). The inner loop only cycles on a single digit, that is made large enough to accommodate the remaining digits. Clever trick. It works. Me happy 😀 Another nice refinement I found, that shaves off a few hundredths or tenths of a second, is the break instruction after the if clause: once the only new solution for a given group of last_digits is found, the inner loop should stop and restart the search with the next group of last_digits.

import time

while True:
try:
limit = int(input('Enter limit: '))
if limit >= 2:
break
except ValueError:
pass

texec = time.clock()

last_digits = [5, 6]

for order in range(2, limit+1):
for last in last_digits:
for n in range(int(10**(order-1)), 10**order, 10**(order-1)):
num = n + last
num_sqr = num**2
tail = num_sqr % (10**order)
if num == tail:
last_digits[last_digits.index(last)] = num
print(order, tail, num_sqr)
break

texec = time.clock() - texec

print('Execution time: ' + str(texec) + ' seconds')


How fast is this? Well, this fast (order = 20):

Or even this fast (order = 500, the squared number is 1000-ish digits long, merely 0.21s to do all the magic):
Quite impressive, isn’t it? Just out of curiosity, I calculated how long it would take to reach this same order at the pace of the brute force version of the script: the result is not a curious number, it’s a ludicrous number. I’ll just leave it here: 5.13×10486 years. Well, that would be the hell of an uptime!