Coding interview Essentials

16. Verify BST property

Verify BST property

Introduction

This problem is binary search trees, which are one of the most discussed data structures during coding interviews. This problem has been asked at Microsoft, Google, Amazon, Ebay and it is among the group of questions that should be well understood and digested because its solution’s structure can be applied to many others on binary trees.

Problem statement

Given a binary tree [@cit:wiki:BST], determine if it is a valid binary search tree.

Assume a BST is defined as follows:

  • The left subtree of a node contains only nodes with keys less than the node’s key.

  • The right subtree of a node contains only nodes with keys greater than the node’s key.

You can assume that the tree structure is defined as shown in Listing [list:verify_BST:tree_structure]:

Listing 1: Binary tree definition used in this exercice.

 struct TreeNode {
     int val;
     TreeNode *left;
     TreeNode *right;
     TreeNode(int x) : val(x), left(nullptr), right(nullptr) {}
 };


For the following tree the function should return false.

        5
       / \
      1   4
         / \
        3   6
    


For the following tree the function should return true.

        2
       / \
      1   3
    

[example:verify_BST_:one]
For the following tree the function should return true.

        10
       / \
      1   14
     / \    \
0  9     16
        /  \
       15   19
    

Clarification Questions

Are all elements in the tree distinct?

Yes, you can assume all elemets are distinct.

How many nodes does the tree contain?

Up to \(10^6\) nodes.

Discussion

The question is asking for a function that verifies whether a given tree is a binary search tree or not. But what does it mean exactly? A tree \(T\) is a binary search tree if:

  1. Every node has two subtree (named left and right, respectively) i.e. \(T\) is a binary tree

  2. given a node \(n\) in the tree all the nodes in its left subtree are smaller than the value in \(n\).

  3. additionally, all nodes in the right subtree are larger.

For instance the tree in the example [ex:verify_BST:no_BST] is not a valid BST because node \(15\) is a right descendent of the root but is it not greater than it.

[language=c++,frame=single,caption={Binary tree, but not a binary search tree.}",label=ex:verify_BST:no_BST, numbers=none]
        20
        / \
       10 30
         / \
        15  40

A common mistake

When solving this question a very common mistake is to use the following greedy algorithm to verify the BST property: for each node \(n\) of the tree verify that n.val > n->left.val && n.val < n->right.val i.e. that the value of the node curreclty analyzed is greater than the value of its left child but smaller than its right one. This algorithm might work even for some trees but fails on others, and it is thus not correct. For instance the algorithm described above fails on the example [ex:verify_BST:no_BST] as it will say that it is a valid BST.

It is clear at this point that all nodes in the tree need to be visited in order to verify whether the BST property holds and that somehow the information about the values from nodes higher in the tree need to be passed down to the children and descendants. Let’s see how this can be done.

Top Down approach

When talking about tree, one should immediately think about a top down approach and recursion. This problem is no different and in-fact becomes almost trivial if approached using recursione and the following key considerations are made:

  1. every node can be thought as the root of a tree for which the BST property needs to hold (and thus verified).

  2. empty trees satisfy the BST property

  3. every nodes must be within a certain range that is determined by its parent. For instance, given the node \(15\) in the example [example:verify_BST_:one], in order for the tree to be a valid it must be within the range \((14,16)\). Why is that? Because its parent, the node \(16\) must be within the range \((14,+\infty)\) and additionally node \(15\), being the left subtree of node \(16\) must be lower than its parent. The same reasoning can be applied recusively up to the root of the tree where the range of the value is simply \((-\infty, +\infty)\) (no constraints).

    The node \(9\) in the example [example:verify_BST_:one] must be within the range \((1,10)\) for similar reasons.

To summarize, we can visit the tree in a top-down fashion and maintain a range that the current node must satisfy starting with a range equal to \((-\infty, +\infty)\) for the root (meaning that de-facto there is no restriction on the value the root can take). Once the value is checked against the range, then the same function can be applied to the right and left children but making sure that the range is modified accordingly when recurring on the children.

But how does such a range change when visiting down the tree? The idea is simple: Given a node \(n\) with parent \(p\) and range \((l_p, u_p)\) then:

  • if \(n\) is the right child of \(p\), then the range for \(n\) is: \((p, u_p)\): all nodes in the right subtree of \(p\) must be higher than \(p\). Note that \(p > l_p\) (otherwise the BST property would be violeted when checking \(p\)) and thus the range for \(n\) becomes smaller meaning that all the constrains coming from the ancestors of \(p\) will also be satisfied with the new range \((p, u_p)\).

  • A similar reasoning applies if \(n\) is the left child of \(p\). The range for \(n\) is then : \((l_u,p)\).

The idea above is shown in Listing [list:verify_BST]. Note how short and concise the code can be implemented with recursion. The complexity of this approach is \(O(n)\) time and \(O(1)\) space (if you do not count the space on the stack taken by the recursion.

Listing 2: Linear time recursive solution to the problem of verifying the BST property.

inline bool isValidBST_helper(const TreeNode* const root,
                              const long lower,
                              const long upper)
{
  if (!root)
    return true;

  return (root->val > lower) && (root->val < upper)
         && isValidBST_helper(root->left, lower, root->val)
         && isValidBST_helper(root->right, root->val, upper);
}

bool isValidBST_top_down(TreeNode* root)
{
  static constexpr long INF = std::numeric_limits<long>::max();
  return isValidBST_helper(root, -INF, INF);
}

Brute force

Another way to solve this problem is to read carefully the definition of BST and realize that, for each node \(n\) we need to check if the left subtree contains any element greater than \(n\) and whether its right subtree contains any element smaller than \(n\). This problem becomes almost trivial provided two functions are available, min_tree(TreeNode* root) and max_tree(TreeNode* root) for retrieving the min and max value respectively of a tree. The idea above can be implemented as shown in Listing [list:verify_BST_bruteforce]

Listing 3: Quadratic solution to the problem of verifying the BST property.

#include <iostream>

int tree_min(TreeNode *root)
{
  if (!root)
    return std::numeric_limits<int>::max();
  return std::min({root->val, tree_min(root->left), tree_min(root->right)});
}

int tree_max(TreeNode *root)
{
  if (!root)
    return std::numeric_limits<int>::min();
  return std::max({root->val, tree_max(root->left), tree_max(root->right)});
}
bool isValidBST_min_max(TreeNode *root)
{
  if (!root)
    return true;

  bool left_ok = true;
  if (root->left)
    left_ok = tree_max(root->left) <= root->val;
  bool right_ok = true;
  if (root->right)
    right_ok = tree_min(root->right) > root->val;

  return (left_ok && right_ok)
         && (isValidBST_min_max(root->left) && isValidBST_min_max(root->right));
}