Coding interview Essentials

33. Count the number of islands

Count the number of islands

In this chapter we will discuss a classic interview problem question which requires that we count the number of islands on a map provided as a 2D boolean matrix. There are many versions of the statements for this problem, some with a wordy and playful background story and others where the map is just given to you as graph of some sort. Thankfully all the versions can be solved with the approaches below which are based on standard graph visiting algorithms.

Problem statement

Write a function that given a \(2\)D boolean matrix counts the number of islands in the grid. Cells in the input matrix containing a \(1\) represent land while \(0\) water. An island is a group of adjacent land cells. Every cell in the input matrix can be adjacent to the \(4\) cells that are next to it on the same row or column.

The input grid is a 2D of size \(n\times m\).


Given the following input grid (depicted in Figure 37.1) the function returns \(4\). The text representation of this example is given below where \(1\) represents land, and a \(0\) water:

    1110000
    0100000
    0110110
    0010110
    0100000
    0110010
    0011000
    
Figure 1: Visual representation of the example 1 for the problem of counting the number of islands in a map. Cells belonging to the same island share the same color.
Figure 2: Visual representation of the example 1 after visiting the cells of the first island.

Clarification Questions

Can \(n\) or \(m\) be \(0\)?

Yes, the map can be empty

Can the input grid be modified?

Yes, the input matrix is not read-only.

Discussion

Essentially, what this problem is asking us to do is identify the number of clusters of \(1\)s in the input matrix. One way to do this is by looping through the map one cell at a time until we find a \(1\), let’s say at cell \((i,j)\). Because this particular \(1\) must be part of an island, what we can then do is start exploring the island one cell at a time, moving from a \(1\) to an adjacent one, until there are no \(1\)s we have not already visited. When we visit a land cell we mark it as . This is useful because when we resume our normal linear scanning of the map we want to make sure we do not count the visited cells as being the starting point of an uncounted island. For instance in the example in Figure 37.1 we can start our visit at cell \((0,0)\) which is a \(1\) and is not yet visited. This means that this particular cell is part of an island that we did not count yet. At this point we can start visiting the cells adjacent to \((0,0)\) i.e. cells: \((0,0), (0,1), (0,2), (1,1), (2,1), (2,2), (3,2)\). When a cell is visited it is marked as shown in Figure [fig:number_islands:visited] by the red cross \(\times\) and after that all of its neighboring land cell are visited similarly in a recursive manner. When we have exhausted all the cells of the island \((0,0)\) is part of we can resume our linear search remembering we have explored an additional island.

The visit can be performed using a BFS or DFS approach. In the following section we will look at a recursive and iterative implementation of the DFS approach. We prefer the DFS approach over BSF mostly because it is easier to code recursively. The iterative version (in Listing 37.3.0.2) can, however, be turned into a BFS quite easily just by changing the policy of the order in which cells are to be visited.

DFS iterative

Listing [list:number_islands:iterative] shows a possible iterative implementation of the DFS approach as described above. Note that the core of the algorithm is the function that uses a stack to keep track of the cells that are still to be visited. For each cell that is actually visited we will also try to visit all pieces of yet unvisited land in all four directions (up, down, left and right). We do this by adding them to the pile of cells to be visited. When a cell is actually visited, it is marked as such in its corresponding cell of the array . When there is no more land left in the stack it means that the island has been completely visited and we can return. Once it is complete its execution function returns the control back in the double loop of the function which will skip all the visited cells and will trigger another invocation of as soon as another unvisited \(1\) cell is found. That \(1\) has to be part of a not yet unaccounted island together with all its adjacents land cells.

Also please note how the if at line \(22\) takes care of not visiting cells that are outside the boundaries of the map, cells that are not land or already visited because this would lead to out-of-bound errors, incorrect results and infinite loops, respectively.

The complexity of this implementation in Listing [list:number_islands:iterative] is \(O(n\times m)\) for both time and space because we visit the whole map at least once and we use space proportional to \(n\times m\) for the array .

Listing 1: Iterative DFS solution to the problem of counting the number of islands in a map.

using cell = std::pair<int, int>;
void visit(const cell& c,
           const std::vector<std::vector<bool>>& grid,
           std::vector<std::vector<bool>>& visited)
{
  const int n = grid.size();
  const int m = grid[0].size();

  std::stack<cell> S;
  S.push(c);
  while (!S.empty())
  {
    auto p = S.top();
    S.pop();

    const auto [x, y] = p;
    visited[x][y]     = true;

    constexpr std::array<cell, 4> cross = {
        cell{-1, 0}, cell{1, 0}, cell{0, -1}, cell{0, 1}};
    for (const auto& inc : cross)
    {
      const auto nx = x + inc.first;
      const auto ny = y + inc.second;
      if (nx >= 0 && nx < n && ny >= 0 && ny < m && grid[nx][ny]
          && !visited[nx][ny])
      {
        S.push({nx, ny});
      }
    }
  }
}

int count_island_iterative_DFS(const std::vector<std::vector<bool>>& grid)
{
  if (grid.size() == 0 || grid[0].size() == 0)
    return 0;

  const int n = grid.size();
  const int m = grid[0].size();
  int ans     = 0;

  std::vector<std::vector<bool>> visited(n, std::vector<bool>(m, false));
  // search for a piece of unvisited land
  for (int i = 0; i < n; i++)
  {
    for (int j = 0; j < m; j++)
    {
      if (!grid[i][j] || visited[i][j])
        continue;
      // found one, mark as visited all the piece of
      // land you can reach from here
      ans++;
      visit({i, j}, grid, visited);
    }
  }
  return ans;
}

Do we, however, really need to have a dedicated matrix just to store the information about which cell is visited? In fact no, as we can store that information in-place in the input matrix. All we have to do when marking a cell visited is turn the value in the input grid (which is modifiable) for that cell from land (\(1\)) to water (\(0\)) meaning that cell can never be considered as part of an island in the future. If we do that, the space complexity does not change because we still use space to store the cells to be visited in the stack, but the amount of space used will be lower in practice and the overall solution will look cleaner and simpler which is always a plus during an interview.

Listing 2: Alternative iterative DFS solution, without dedicated space for marking visited cells, to the problem of counting the number of islands in a map.

using cell = std::pair<int, int>;

void visit(const cell& c, std::vector<std::vector<bool>>& grid)
{
  const int n = grid.size();
  const int m = grid[0].size();

  std::stack<cell> S;
  S.push(c);
  while (!S.empty())
  {
    auto p = S.top();
    S.pop();

    const auto [x, y] = p;
    grid[x][y]        = false;  // mark the original map

    constexpr std::array<cell, 4> cross = {
        cell{-1, 0}, cell{1, 0}, cell{0, -1}, cell{0, 1}};
    for (const auto& inc : cross)
    {
      const auto nx = x + inc.first;
      const auto ny = y + inc.second;
      if (nx >= 0 && nx < n && ny >= 0 && ny < m && grid[nx][ny])
      {
        S.push({nx, ny});
      }
    }  // for
  }    // while
}

int count_island_iterative_DFS_improved(std::vector<std::vector<bool>>& grid)
{
  if (grid.size() == 0 || grid[0].size() == 0)
    return 0;

  const int n = grid.size();
  const int m = grid[0].size();
  int ans     = 0;

  for (int i = 0; i < n; i++)
  {
    for (int j = 0; j < m; j++)
    {
      // visited cells are turned into water during the visit
      if (!grid[i][j])
        continue;
      ans++;
      visit({i, j}, grid);
    }
  }
  return ans;
}

DFS recursive

The same idea can be implemented recursively which, in our opinion, makes the overall implementation more expressive, shorter and easier to reason about as well as to explain. As such, our first choice is always to use this approach if possible when dealing with problems similar to the one presented here. Listing [list:number_islands:recursive] shows a possible implementation of a recursive DFS solution for this problem. Note that the recursion only happens during the DFS itself and that the driver function is basically the same as the ones shown in the previous two solutions [list:number_islands:iterative] and [list:number_islands:iterative_1].

Listing 3: Recursive DFS solution, to the problem of counting the number of islands in a map.

using cell = std::pair<int, int>;
void visit_recursive(const cell& c, std::vector<std::vector<bool>>& grid)
{
  const int n = grid.size();
  const int m = grid[0].size();

  const auto [x, y] = c;
  // base case: a cell is out of the map or already visited or water
  if (x < 0 || y < 0 || x >= n || y >= m || !grid[x][y])
    return;

  // mark as visited
  grid[x][y] = false;

  // visit all cells that can potentially extend this island
  visit_recursive({x + 1, y}, grid);
  visit_recursive({x - 1, y}, grid);
  visit_recursive({x, y + 1}, grid);
  visit_recursive({x, y - 1}, grid);
}
int count_island_recursive_DFS(std::vector<std::vector<bool>>& grid)
{
  if (grid.size() == 0 || grid[0].size() == 0)
    return 0;

  const int n = grid.size();
  const int m = grid[0].size();
  int ans     = 0;

  for (int i = 0; i < n; i++)
    for (int j = 0; j < m; j++)
      if (grid[i][j])
      {
        ans++;
        visit_recursive({i, j}, grid);
      }

  return ans;
}