Coding interview Essentials

36. Distance between nodes in BST

Distance between nodes in BST

Introduction

In the problem described in this chapter we will investigate how to find the distance between two nodes in a binary search tree. As we will see this problem can be approached and solved easily if we are able identify the key underlying concepts. This will become apparent after we look at a few examples and our advice for approaching this problem (and to be honest all problems on graphs and trees) is to draw and discuss quite a few examples with your interviewer. This help you get a much better intuitive understanding of what the problem is really about, which will eventually lead to the eureka moment.

This is a relatively short chapter because the solution is built on top of the solution for the problem discussed in chapter [REFERENCE]. In Section 40.1 we will have a look at the formal problem statement and in Section 40.3 we discuss the solution approach and we will look into two possible different implementations: recursive and iterative.

Problem statement

Write a function that takes as input a binary search tree \(T\) and two nodes \(p\) and \(q\) and returns the distance between \(p\) and \(q\). The distance between two nodes \(D(p,q)\) is defined as the number of edges you need to traverse to get from \(p\) to \(q\).


Given the tree shown in Figure 40.2, \(p = 1\) and \(q=3\), the function returns \(D(1,3)=2\)

If \(p=3\) and \(q=2\) the function returns \(1\). [ex:distance_between_nodes_in_tree:example1]


Given the tree shown in Figure 40.1, \(p = 5\) and \(q=2\), the function returns \(D(5,2)=6\) [ex:distance_between_nodes_in_tree:example2]

Clarification Questions

Can \(p\) be equal to \(q\)?

Yes, this is a valid case.

Is it guaranteed for \(p\) and \(q\) to be present in \(T\)?

Yes, you can assume \(T\) always contains both \(p\) and \(q\).

Discussion

As already mentioned in the introduction the key to solving this problem lies in clearly identifying the correct approach from the problem statement by using several examples, solving them by hand and then identifying similarities in their solutions.

Let’s have a look at some instances of this problem and their solution. If we consider \(T\) to the tree depicted in Figure 40.3 then the distance between nodes \(p=3\) and \(q=9\) is \(4\) and can be found by walking up the tree from node \(3\) to a node \(5\) (follow the red arcs) from which we can descend and reach node \(9\) (green arcs). Note that node \(5\) is the first node from which we can travel down the tree and reach both nodes \(3\) and \(9\). You can also see that the same reasoning applies to the trees shown in Figures 40.4 and 40.5 where you get the minimum distance by traveling up to the first node that allows you to reach the destination node. Notice that in Figure 40.5 the path upwards has length zero as from \(p\) you can already reach \(q\) by traveling down the tree.

At this point it should be clear that the minimum distance between two nodes can be calculated as the sum of distances from \(p\) and \(q\) their lowest common ancestor (LCA). The LCA is the lowest node from which it is possible to walk in a downward direction and reach both \(p\) and \(q\). In order to go from \(p\) to \(q\) one must pass through their LCA. If you need to refresh your memory on the topic of finding the LCA on binary search trees you can read Chapter 39. Listing [list:distance_between_nodes_in_tree] shows a possible implementation of the idea described above.

Figure 1: Binary Search tree of the Example [ex:distance_between_nodes_in_tree:example2]

. [fig:distance_between_nodes_in_tree:example2]

Figure 2: Binary Search tree of the Example [ex:distance_between_nodes_in_tree:example1]

. [fig:distance_between_nodes_in_tree:example1]

Listing 1: Solution to the problem of finding the distance between two nodes in a binary search tree.

/**
 * Calculates the distance between T and a node with payload `val`
 * Perform a classic BST visit/search (downward) from T for val.
 *
 * @param T is valid binary search tree
 * @param val is the value to be searched in T
 * @return the distance between T and val
 */
template <typename U>
int find_distance_down(const Node<U>* const T, const U val)
{
  assert(T && "node val exists and is reachable from T");
  const auto& payload = T->val;
  if (payload == val)
    return 0;
  if (val <= payload)
    return 1 + find_distance_down(T->left, val);
  return 1 + find_distance_down(T->right, val);
}

/**
 * Find the distance between two nodes a tree
 *
 * @param T is valid binary search tree
 * @param p is the payload of a node in T
 * @param q is the payload of a node in T
 * @return the minimum number of edges to traverse to get from p to q
 */
template <typename U>
int min_distance_nodes_BST(Node<U>* T, const U p, const U q)
{
  const Node<U>* const lca = find_LCA(T, p, q);
  return find_distance_down(lca, p) + find_distance_down(lca, q);
}

Conclusion

In this chapter we have seen how we can efficiently solve the problem of finding the distance between two nodes in a binary search tree by using the concept of LCA (discussed more in details in Chapter 39). The general strategy is that we can calculate the distance between the LCA and both \(p\) and \(q\). The sum of these two distances is the final answer. The distance between two nodes in a binary search tree can be found by slightly modifying the standard search algorithm for BSTs so that we return the number of recursive calls made instead of a boolean value signaling whether the element to be searched was found or not.