Coding interview Essentials

2. Square root of an integer

Square root of an integer

Introduction

The square root is not only one of the central operations in mathematics (used almost as often as addition, multiplication, or division), but it is also at the core of countless everyday gadgets and cool technology we use daily, like the radio and GPS systems.

The square root of a number \(x\), denoted with the \(\sqrt{x}\) symbol, is formally defined to be a number \(y\) such that \(y^2 = y\times y=x\). For example: \(\sqrt{4} = 2\) and \(\sqrt{1253} \approx 35.3977\). The square root is defined for every positive real number but in this lesson, we will derive an algorithm for calculating the square root for integers.

As for almost every coding interview problem, there are several possible solutions and approaches we can take to tackle this problem. In this lesson, we will learn how to write a simple and yet sub-optimal solution that runs in \(O(\sqrt{n})\) time, as well as a much faster and elegant logarithmic time solution.

Problem statement

Write a function that calculates the integral part of the square root of an integer \(n\) i.e. \(\lfloor \sqrt{n}\rfloor\). You cannot use any library functions.


Given \(n=9\) the function returns \(3\): \(\lceil\sqrt{9 \rceil}=3\)


Given \(n=11\) the function returns \(3\): \(\lceil\sqrt{11 \rceil}\approx\lceil3.316624 \rceil=3\)


Given \(n=18\) the function returns \(4\): \(\lceil\sqrt{11 \rceil}\approx\lceil4.242640 \rceil=4\)

Clarification Questions

What is the maximum value the parameter \(n\) can take?

The greatest input is guaranteed to be smaller than \(2^{32}\).

Is \(n\) guaranteed to be always positive?

Yes, there is no need to check for invalid input.

Discussion

A brute-force solution is quickly derivable from the definition of square root given above (\(\sqrt{x} = y\) where \(y^2 = x\).) and the interviewer is very likely expecting to see it mentioned or appearing on the whiteboard within the first few minutes of the interview.

Brute-Force

We know that if \(y = \sqrt{x}\) then \(y^2 = x\). Moreover, \(y\) is an integer only when \(x\) is a perfect square[^1]. If \(x\) is not a perfect square, then \(y\) is a real number and the following holds true: \(\lfloor{y}^2 \rfloor \leq x\) and \(\lceil{y} \rceil^2 > x\). For instance, \(\sqrt{5} \approx 2.2360\) and \(2^2=4 \leq 5\) and \(3^2=9 > 5\).

We can use this last property to blindly loop through all the integers \(k=0,1,2,\ldots\) until the following is true: \(k^2\leq n\) and \((k+1)^2 > n\). A solution is guaranteed to be found because eventually, \(k\) will be equal to \(\lfloor y \rfloor\). Moreover, it is clear that no more than \(\sqrt{n}\) numbers will be tested, which proves that the time complexity of this approach is \(O(\sqrt{n})\).

Listing [list:square_root_brute_force] shows a C++ implementation of this idea.

Listing 1: $O(\sqrt{n})$ solution to the problem of finding the square root of an integer.

int square_root_brute_force(const int n)
{
  long i = 0;
  while ((i * i) <= n)
    i++;
  // i at this point is the smallest element s.t. i*i > n
  return i - 1;
}

It is worth noticing that the variable \(i\) has a type that is larger in size than an . This is necessary in order to prevent overflows during the calculation of \(i^2\) (see the highlighted line). One of the constraints of the problem is that the largest input can be \(n=2^{32}-1\); The square of that number does not fit in a \(4\) bytes .

Logarithmic Solution

Binary search can be effectively used to solve this problem: in order to show that, we are going to look at the problem from a slightly different angle. Let \[F(k)=\begin{cases} 0 & \text{: } k^2 \leq n \\ 1 & \text{: } k^2 > n \end{cases} \label{eq:square_root_piecewice}\] be a piece-wise function that partition the search space \([0\ldots n]\) into two parts, as shown in Table 2.1:

  1. the numbers less or equal than \(\sqrt{n}\)

  2. the numbers strictly greater or equal than \(\sqrt{n}\)

Partition of the search space according to the function in Eq. [eq:square_root_piecewice]
\(0\) \(1\) \(2\) \(\lfloor \sqrt{n \rfloor}\) \(\lfloor \sqrt{n \rfloor}+1\) \(n\)
\(0\) \(0\) \(1\) \(1\) \(1\)

Clearly, the answer we are looking for is the greatest value \(k\) s.t. \(F(k) = 0\). Notice that every number in the left part of the search space, \(0 \leq l \leq \lfloor n \rfloor\) has \(F(l) = 0\), while the elements in the right side,\(\lfloor n \rfloor +1 \leq r \leq n\), have \(F(r) = 1\).

Because the function \(F(k)\) splits the search space into two parts, we can use binary search to find the end of the first partition (this is true in general, and if we ever recognize a problem that presents these characteristics, we can apply binary search to it). We can do that because if we pick an integer from in \([0,n]\), say \(k\), and \(F(k) = 1\) we know that \(k\) is not the solution and crucially, also that all the values greater than \(k\) are not good candidates because they all belong to the right partition. On the other hand, if \(F(k) = 0\), we know that \(k\) might be the solution but also that, all the elements smaller than \(k\) are not good candidates as \(k\) is already a better answer than any of those numbers would be. The idea above is implemented in Listing [list:square_root_binary_search].

The algorithm works by maintaining an interval (defined by the variables and ): inside of it lies the solution, which is initially set to be the entire search space \([0,n]\). It iteratively shrinks this range by testing the middle element of \([l,r]\) (value hold by ), and this can lead to one of the following three scenarios:

  1. \(^2 = n\): is the solution and also that \(n\) is a perfect square.

  2. \(^2 > n\): is not the solution and we can also exclude all numbers \(k \geq\) from the search (by setting ).

  3. \(^2 < n\): is the best guess we have found so far (it might be the solution). We can, however, exclude every number \(k <\) (by assigning ) as when squared, they would also be smaller than \(^2\).

Pay attention to the way the midpoint between \(l\) and \(r\) is calculated. It is common to see it calculated by using the following formula: \((l+r)/2\). However, this can lead to overflow problems when \(l+r\) does not fit in an .

Finally, the time and space complexity of this algorithm is \(O(log(n))\) and \(O(1)\), respectively. A good improvement w.r.t. to the complexity of the brute-force solution.