Four Algorithms for Computing Square Roots

One fine summer day in 1961, when I was just ten years old, I was helping Marvin J. Hurt -- one of the workers at Lazy Mountain Children's Home -- lay vinyl tile in the recently rebuilt main building. I remember that day vividly. The main building had been destroyed in a fire the preceding December. My mother, who was only 44 years old, had died just a few months later, on June 7, 1961. My sister Betty and I spent that summer in the care of our neighbors, the Hurts, while our father was attending summer school at Whitworth College in Spokane, Washington.

The main buidling at LMCH circa 1957.

"Uncle" John, a.k.a. hollow-leg Johnny because of his voracious appetite, was a polymath with a particular interest in arithmetic. He was spreading adhesive on the bare concrete, about 10 or 20 square feet at a time, then fitting white, pink, and black vinyl tiles into a random assortment in the room that was eventually to become the chapel in the rebuilt main building. I was his assistant, fetching more adhesive, rags, solvent, and tiles as he called for them. We labored for hours. Eventually John called a halt.

"How would you like to learn something new?" he asked. Sounds good, I replied. "This is a procedure I learned in school many years ago," he continued. "I was in high school when I learned this trick, but you're pretty good with numbers, so I bet you can understand it." And with a carpenter's pencil instead of chalk, and with the bare concrete floor for a blackboard, he proceeded to sketch out a technique for computing the square root of two to an arbitrary number of decimal places. "We start like this."

    1. ?            
  √ 2.00 00 00 00 00
  

"We write in a 1 as the first digit for √2 because we know that 12 = 1 < 2 < 22 = 4. And we group the digits to the right of the decimal point in pairs because the square of a number has twice as many digits as the number itself. Is all that clear enough?" I nodded my head. "Good. Now we square the first digit on top, and carry it down below, then subtract it from our starting point, sort of like in long division. Then we double the number on top, and write that result down off to the right of our main work area, like this."

    1. ?                             
  √ 2.00 00 00 00 00       1 x 2 = 2?1 00                             
    1 00                             
  

"Now we have to think a little harder. See the question mark I wrote after the two?" Yes. "Now we have to ask ourselves 'What's the largest digit we can put in there so that when we multiply 2? by ? we get a result that is less than 100?' " [Marked in red above.] "Four," I said, after a moment's thought. So he wrote that in, and proceeded to make another calculation, like this.

    1. 4  ?                                         
  √ 2.00 00 00 00 00       1 x 2 = 24   14 x 2 = 28?1 00                          x 4               
    1 00                           96               
     −96                                            
       4 00                                         
  

"So now," I said, thinking aloud, "we should write in a '1', because 281 is less than 400, but 564 is more than 400. Is that right?" "Absolutely," John replied. "You catch on quickly." And we pressed ahead.

    1. 4  1  ?                                             
  √ 2.00 00 00 00 00          14 x 2 = 281   141 x 2 = 282?1 00                              x  1                 
    1 00                               281                 
     −96                                                   
       4 00                                                
      −2 81                                                
       1 19 00                                             
  
    1. 4  1  4  ?                                          
  √ 2.00 00 00 00 00          14 x 2 = 281   141 x 2 = 2824    1414 x 2 = 2828?1 00                              x  1            x   4
    1 00                               281            11296
     −96                                                   
       4 00                                                
      −2 81                                                
       1 19 00                                             
      −1 12 96                                             
          6 04 00                                          
  

I really don't remember how far we continued to grind out the digits of √2 that afternoon; we probably stopped after six or seven places. And we eventually went back to laying tile, so that John's beautiful illustration of the root extraction algorithm was soon erased from view. But the procedure remains indelibly etched into my memory.

Years later, after I had learned more mathematics, I came to understand that this algorithm is based on the algebraic identity (a + b)2 = a2 + 2ab + b2, and that by applying the Binomial Theorem the method can be generalized to extract cube roots and fourth roots and so on, ad infinitum. I have even used it a few times when an electronic calculator was not close at hand. But I don't suppose this algorithm is taught to school children any longer. Why bother doing calculations with a pencil and a piece of paper when a computer is readily available?

⇚ ⇛ ⇚ ⇛ ⇚ ⇛ ⇚ ⇛


Some ten years later I learned about the Newton-Raphson method for locating the zeroes of a differentiable function which, when applied to the equation x2 = n, leads to a simple procedure for extracting √n. Given n > 0, choose an initial divisor d (d = ½n will work OK), then divide n by d. Form the average of the resulting quotient q and the trial divisor d, and use that average as a new trial divisor d1. Then iterate the algorithm until the desired degree of precision has been atttained. This procedure is simple and direct, but it is not as well-suited for manual calculation as the algorithm John Hurt taught me when I was still a youngster; the many long divisions required are in general more laborious than a few trial multiplications. We can minimize the number of long divisions we perform by expressing the successive trial divisors as rational fractions, and not as decimals. But eventually we must perform a couple of long divisions, to be certain the desired level of precision has been attained.

Here's a simple example, to make the Newton-Raphson algorithm clearer. We will solve for √2 using the Newton-Raphson technique.

              d          q          Average 
                                           
  Step 1:     1          2             3    
                                       2    
                                           
  Step 2:     3          4             17   
              2          3             12   
                                           
  Step 3:     17         24           577   
              12         17           408   
    
  Step 4:    577        816          665857 
             408        577          470832 
  

In case you're a little rusty with your arithmetic, the easiest way to form the average of two rational fractions pq and rs is to form a new fraction (ps+qr)(2qs). In other words, cross-multiply the numerators and denominators. Then add the two products to get a new numerator and multiply the two denominators together (to put the two fractions over a common denominator). The desired average is one-half of this result. We'll run into a similar procedure as part of the third algorithm, spelled out in the next section.

Newton-Raphson converges very quickly: 577 divided by 408 is 1.4142156..., 665,857 divided by 470,832 is 1.4142135623746899..., and √2 = 1.414213562373095048801688724210... So after just three iterations we've already got four decimal places right, and after the fourth iteration we're up to eleven-place accuracy. Not bad at all. But can we do any better?

⇚ ⇛ ⇚ ⇛ ⇚ ⇛ ⇚ ⇛


About twenty years after graduating from college, I got enthusiastic about learning some more math. For many years I was too busy with actuarial work and with computer programming to think about pure math very much. But along about 1990 my professional life began to get less hectic, and I was able to spend a little time just playing around with numbers. I noticed some interesting relationships, as noted below, and started asking myself a few questions.

   1² − 2 ×  1² =     1 −     2 = −1                                           2² − 5 ×  1² =      4 −      5 = −1
   3² − 2 ×  2² =     9 −     8 = +1    7² − 3 ×  4² =     49 −     48 = +1    9² − 5 ×  4² =     81 −     80 = +1
   7² − 2 ×  5² =    49 −    50 = −1                                          38² − 5 × 17² =  1,444 −  1,445 = −1
  17² − 2 × 12² =   289 −   288 = +1   97² − 3 × 56² =  9,409 −  9,408 = +1  161² − 5 × 72² = 25,921 − 25,920 = +1
  
   5² − 6 ×  2² =    25 −    24 = +1    8² − 7 ×  3² =     64 −     63 = +1    3² − 8 ×  1² =      9 −      8 = +1
  49² − 6 × 20² = 2,401 − 2,400 = +1  127² − 7 × 48² = 16,129 − 16,128 = +1   17² − 8 ×  6² =    289 −    288 = +1
 

Given an arbitrary integer N that is not a perfect square, is it always possible to solve the Diophantine Equation x2Ny2 = 1? (Because of an historical accident, this particular problem is known as "Pell's Equation", after John Pell, an English mathematician of the seventeenth century.) For which values of N can the closely related Diophantine Equation x2Ny2 = −1 be solved? Is there an efficient way to find such solutions?

An idea quickly popped into my head. Since for any natural number N that is not a perfect square there exists a positive integer k such that k2 < N < (k + 1)2, we can be certain that |√Nk| = ω < 1. Therefore the powers of ω, such as ω2, ω3, etc. must converge more or less rapidly towards 0. This suggests a third algorithm for approximating √N: compute the successive powers ω2, ω4, ω8 by squaring the result again and again. Each iteration will yield an expression of the form ±aN ± b where a and b are integers. At each iteration, the fraction ba will approach √N more closely. If a and b have a common factor, we should eliminate it before continuing. If N is closer to (k + 1)2 than it is to k2, we should start with ω = |k + 1 − √N|; by minimizing the initial ω, we force the algorithm to converge more quickly. Here are some examples.

    N      ω           ω²              ω⁴                    ω⁸               √N based on ω⁸    √N to twelve places
    2   √2 −  1      3 −  2√2       17 −   12√2          577 −      408√2     1.414215686274     1.414213562373... 
    3    2 − √3      7 −  4√3       97 −   56√3       18,817 −   10,864√3     1.732050810014     1.732050807568... 
    5   √5 −  2      9 −  4√5      161 −   72√5       51,841 −   23,184√5     2.236067977915     2.236067977499... 
    6   √6 −  2      5 −  2√6 *     49 −   20√6        4,801 −    1,960√6     2.449489795918     2.449489742783... 
    7    3 − √7      8 −  3√7 *    127 −   48√7       32,257 −   12,192√7     2.645751312335     2.645751311064... 
    8    3 − √8     17 −  6√8      577 −  204√8      665,857 −  235,416√8     2.828427124749     2.828427124746... 
   10  √10 −  3     19 −  6√10     721 −  228√10   1,039,681 −  328,776√10    3.162277660169     3.162277660168... 
  
  * A common factor has been eliminated here.
  

For the small integers shown above, this algorithm always yields solutions of Pell's Equation. For larger integers this isn't usually the case. Here are a few illustrative examples. Some solve Pell's Equation. Some don't.

    N      ω          ω²                b² − Na² (for ω²)                  ω⁴                          b² − Na² (for ω⁴)                
   13    4 − √13   29 − 8√13    29² − 13 × 8² =   841 −   832 = 9    1,673 − 464√13    1,673² − 13 × 464² =  2,798,929 −  2,798,848 = 81
   17  √17 −   4   33 − 8√17    33² − 17 × 8² = 1,089 − 1,088 = 1    2,177 − 528√17    2,177² − 17 × 528² =  4,739,329 −  4,739,328 =  1
   27  √27 −   5   26 − 5√27*   26² − 27 × 8² =   676 −   675 = 1    1,351 − 260√27    1,351² − 27 × 260² =  1,825,201 −  1,825,200 =  1
   43    7 − √43   46 − 7√43*   46² − 43 × 7² = 2,116 − 2,107 = 9    4,633 − 644√43    4,223² − 43 × 644² = 17,833,729 − 17,833,648 = 81
   53  √53 −   7   51 − 7√51*   51² − 53 × 7² = 2,601 − 2,597 = 4    5,198 − 714√51    5,198² − 53 × 714² = 27,019,204 − 27,019,188 = 16
  
  * A common factor has been eliminated here.
  

Even though the procedure illustrated above involves raising a quadratic integer to higher and higher powers -- an operation conceptually much different from division followed by the computation of an average -- the basic underlying arithmetic comes out exactly the same as the procedure dictated by the Newton-Raphson algorithm outlined above. Here are some examples, presented in the Newton-Raphson format.

            --  Step 1  --           --  Step 2  --             --  Step 3  --      
   N     d       q    Average      d     q     Average      d       q      Average  
  
  13     4      13      29        29    104     1,673     1,673   6,032   5,597,777 
                 4       8         8     29      464       464    1,673   1,552,544 
                 
  17     4      17      33        33    136     2,177     2,177   8,976   9,478,657 
                 4       8         8     33      528       528    2,177   2,298,912 
                 
  27     5      27      26        26    135     1,351     1,351   7,020   3,650,401 
                 5       5         5     26      260       260    1,351    702,520  
  
  

Here's another way to think about Pell's Equation. If N is not a perfect square, we know that the square root of N cannot be expressed as a rational number ba. But if we can find two integers a and b such that b2Na2 = ±1, we can be certain that ba is the rational number closest to √N (among all the rational numbers with a denominator ≤ a). And if we can find one such number pair { b, a }, we can find an infinite sequence of such pairs, because

  b² − Na² = ±1   ==>   (b² − Na²)² = 1   <==>   b⁴ − 2Nb²a² + a⁴ = 1   <==>   b⁴ + 2Nb²a² + a⁴ − 4Nb²a² =  1   <==>
  
                                                    (b² + Na²)² − N(2ba)² = 1                                                    
  

In other words, if the number pair { b, a } solves Pell's Equation, then { b² + Na², 2ba } is another, larger solution. Moreover, each successive solution will yield the best possible rational approximation to √N (given the size of the denominator) and will give us roughly twice as many decimal digits of precision as the preceding solution did. But how can we find the smallest, or fundamental, solution to Pell's Equation for a given integer N? Does such a solution always exist?

⇚ ⇛ ⇚ ⇛ ⇚ ⇛ ⇚ ⇛


Fast forward to the year 2000. By that time I had read just about all the math books available on the shelves of the Denver Public Library. One day, while I was contemplating the odd way regular continued fractions slice up the unit interval [0, 1], I realized there is another way to attack Pell's Equation. I saw how this procedure can be applied to develop an algorithm for computing the square root of any integer. The algorithm works like this.

Suppose that N is a natural number (not a perfect square) and set ω = √N. As before, let k be the greatest integer such that k2 < N. We begin with a simple equation:

  
  ω = k + (ω − k) = k  +     1     = k  +    1     
                             1            (ω + k)  
                          (ω - k)        (ω² − k²) 
  

where, at the last step, we have multiplied the fraction 1(ω − k) by unity (written in the form (ω + k)(ω + k)) and then replaced (ω − k)(ω + k) with (ω² − k²). Why did we do that? Well, (ω² − k²) = Nk² (because ω = √N), so (ω² − k²) will always be a positive integer. Now let's make the substitution (ω² − k²) = Nk² = d, just to simplify the algebra. Since we know that k < ω < k + 1, we can find the largest integer m such that dm ≤ 2k, and then proceed as follows.

  
  ω =  k +      1     =  k +      1      =  k +          1           =  k  +       1               
             (ω + k)           (ω + k)           (md + ω + k − md)             m   +       1       
             (ω² − k²)             d                      d                                 d        
                                                                                       (ω + k − md) 
  

After one stage of the algorithm we have extracted m, the first partial denominator in the fractional part of √N. Now we just set k1 = k − md, and repeat the same trick. By iterating this procedure, we can grind out the successive partial denominators one by one. At some point -- a point we cannot always determine except by plowing through the algorithm -- we will reach a "remainder" value (ω + k)1, which always produces a partial denominator equal to 2k and a new "remainder" (ω − k), and we're right back at our starting point. The process reminds me of the way one finds the decimal expansion of the reciprocal of some integer m that is relatively prime to ten via long division: after running through a longer or shorter cycle of possible remainders (modulo m), one eventually hits a remainder of unity, and the process then repeats itself from the beginning.

I'm afraid this description is a little confusing. So let's work through a few concrete examples. And let's start with the square root of 5.

  
  √5 = 2 + (√5 − 2) = 2  +      1     = 2  +      1     = 2  +       1      = 2  +      1           = 2  +        1               
                                1             (√5 + 2)         (4 + √5 − 2)         4   +    1              4   +     1           
                            (√5 − 2)           (5 − 4)                                         1                   4  +    1      
                                                                                           (√5 − 2)                          1    
                                                                                                                         (√5 − 2) 
  

A moment's reflection should convince you that the square root of any natural number N of the form k2 + 1 will have a continued fraction expansion of the form k; 2k, 2k, .... And it is also obvious that the smallest solution of Pell's equation for such an integer N is { 2k2 + 1, 2k }. So let's tackle a tougher number. Let's try to find the square root of 20.

  
  √20 = 4  +      1      = 4  +      1      = 4  +        1        = 4  +            1     = 4  +        1         
                  1              (√20 + 4)          (8 + √20 − 4)           2  +     1             2 +       1     
              (√20 − 4)          (20 − 16)                4                          4                  4(√20 + 4) 
                                                                                 (√20 − 4)                   4     
                                                                                                                   
                                                                                                                   
      =  4 +         1         = 4  +        1              = 4  +        1                                        
              2  +      1               2 +       1                  2  +       1                                  
                    (√20 + 4)                (8 + √20 − 4)                 8 +      1                              
                                                                                    1                              
                                                                                (√20 − 4)                          
  

√20 works out with a cycle length, or period, of 2: 4; 2, 8, 2, 8, ... the same two integers { 2, 8 } repeat over and over and over again. If you think about it, you will realize that the square root of every number of the form N = k2 + k will also have a canonical continued fraction representation with a period of length two, and that fraction will look like this: k; 2, 2k, 2, 2k, ... Similarly, if N = k2 + 2k, the continued fraction for √N will be k; 1, 2k, 1, 2k, ... Here's an example using 35 (35 = 25 + 10 = 52 + 2 * 5).

  
  √35 = 5  +      1      = 5  +      1      = 5  +        1        = 5  +            1     = 5  +        1         
                  1              (√35 + 5)         (10 + √35 − 5)           1  +     1             1 +       1     
              (√35 − 5)          (35 − 25)               10                          10                10(√35 + 5) 
                                                                                 (√35 − 5)                  10     
                                                                                                                   
                                                                                                                   
      =  5 +         1         = 5  +        1              = 5  +        1                                        
              1  +      1               1 +       1                  1  +       1                                  
                    (√35 + 5)               (10 + √35 − 5)                10 +      1                              
                                                                                    1                              
                                                                                (√35 − 5)                          
  

OK, one more example and we'll be done. Let's take a bigger number -- √697 (which has a period of 3). I have an idea why it comes out the way it does, but I don't know how to characterize all the numbers with a period of 3 simply or concisely. Anyway, a period of 3 is fairly short, and easy to illustrate. Note first that 262 = 676 < 697 < 729 = 272. Note also that 697 - 676 = 21, and that 212 + 162 = 441 + 256 = 697.

  
  √697 = 26  +      1      = 26  +        1      = 26  +          1        = 26  +         1         = 26  +              1      
                    1                (√697 + 26)          (42 + √697 − 16)          2  +     1                2  +        1      
               (√697 − 26)               21                      21                         21                     21(√697 + 16) 
                                                                                        (√697 − 16)                      441     
    
       = 26  +           1            = 26  +           1             = 26  +           1                                        
                2  +         1                 2  +         1                  2  +         1                                    
                     (42 + √697 − 26)                2  +       1                    2  +         1                              
                            21                                 21                           21(√697 + 26)                        
                                                          (√697 − 26)                            21                              

       = 26  +           1                    = 26  +           1                                                                
                2  +         1                         2  +         1                                                            
                      2  +         1                         2  +         1                                                      
                            (52 + √697 − 26)                        52  +      1                                                 
                                                                               1                                                 
                                                                          (√697 − 26)                                            
  

In case you're interested in investigating some canonical continued fraction expansions of integral square roots other than the 380 instances I have cataloged in tables here and also over here, I'm sharing a copy of the Visual Basic program I used to generate those tables. I ran this program (many times) under Microsoft's Excel spreadsheet software. I have also used some similar VB programs to compute the periods of the square roots of all 1,955 non-square integers less than 2,000. The longest period in that range belongs to √1726, which clocks in at 88 partial denominators. Interestingly, the square roots of integers can also be expressed using −1, instead of +1, as the constant partial numerator. I have also used a VB program to investigate some of those, up to √999. Those "negatve" expansions are also periodic and palindromic, and tend to be much longer than the cannical expansions with positive partial numerators. For the first 968 non-square integers (1 - 999, omitting 31 perfect squares), the average period length is 10 95121. For the same set of 968 integers, the "negative" periods average 26 139484, or roughly 144% longer than the canonical periods. The longest canonical period among the first 999 integers is 60 (occurs for √919 and √991). The longest "negative" period is 188, at √661. The "negative" fractions tend to be very long because they converge very slowly. The reason this is so can be understood more clearly by thinking about the recurrence formulas one uses to evaluate continued fractions in general.

OK, enough chit-chat. Here's that program.

Sub Sqrtcf()
Dim n, p, d, w, m, k, k1, b(200) As Currency
Rem p is the number whose square root is to be extracted
Rem n is greatest integer < √p
Rem w is "previous" integer in continued fraction
Rem m is "remainder" at each step
Rem d, k are divisors of m
Rem k1 is next denominator in continued fraction
Rem b (a one-dimensional matrix) stores values of k1 for printing

Dim i, j As Long
Rem i and j are loop counters

On Error GoTo 300                   Rem Just in case something makes us blow up
p = 415                             Rem Select integer to be processed
i = 1                               Rem Initialize index for b()
n = Int(Sqr(p))                     Rem Extract first c. f. digit
If n ^ 2 = p Then GoTo 300          Rem Abort if p is a perfect square
b(i) = n                            Rem Store first partial denominator
d = 1                               Rem Initialize "factor" value
w = n                               Rem Initialize "previous" partial denominator

100 i = i + 1                       Rem Increment loop counter
    If i > 200 Then GoTo 300        Rem Avoid matrix overflow
    m = p - w ^ 2                   Rem Compute next remainder
    k = Int(m / d)                  Rem Calculate second factor of m
    If k * d <> m Then GoTo 300     Rem This should never ever happen!
    k1 = Int((n + w) / k)           Rem Extract next partial denominator
    b(i) = k1                       Rem and save it in output vector
    If (k1 = 2 * n) Then GoTo 200   Rem If we're done, go print results
    w = k * k1 - w                  Rem Else reset "previous" value
    d = k                           Rem and reset factor, then iterate
    GoTo 100

200 For j = 1 To i                  Rem Display accumulated data
    Debug.Print p; j; b(j)          Rem Processed integer, index, partial denominator
    Next j
    Exit Sub
300 Debug.Print p; i; "Error"; Erl(); Err(); Error()
End Sub