Coding interview Essentials

41. Number of Dice Rolls With Target Sum

Number of Dice Rolls With Target Sum

Introduction

Dices have been around for centuries. The oldest known dice were discovered in Iran as part of a staggering five-thousand-year-old Backgammon[^36] set.

If we think about what a die is, the first thing that comes to mind is the fair \(6\) faced-die (see Figure 45.2), but potentially any polyhedron can be a die: consider for instance a \(20\)-faced icosahedron (see Figure 45.3), or a \(12\)-faced dodecahedron (see Figure 45.1). In this chapter’s problem, we will use up to \(30\) dice at the same time, each with up to \(30\) faces, and we are going to calculate the number of ways we can obtain a certain target value when we throw them all at the same time.

Problem statement

You have \(d\) dice, and each die has \(f\) faces numbered \(1, 2, \ldots..., f\). Write a function that returns the number of possible ways to roll the dice so the sum of the upper faces equals a given target number \(t\). Because this number can be very large, the answer should be returned modulo \(10^9 + 7\).


Given \(d=1\), \(f=6\) and \(t=6\) the function should return \(1\). Clearly, there is only one way to obtain \(6\) when rolling a common \(6\)-face cubic die.

[ex:dice_rolls:example2]
Given \(d=2\), \(f=6\) and \(t=7\) the function should return \(6\). Table 45.1 lists all the possible ways of obtaining \(7\) from two common \(6\)-face dice.


Given \(d=2\), \(f=3\) and \(t=7\) the function should return \(0\) because the highest number obtainable by rolling two dices with three faces is \(6\).

Possible valide arrangements of two dice for the Example [ex:dice_rolls:example2]
First die Second die
1 6
2 5
3 4
4 3
5 2
6 1

Clarification Questions

What is the maximum number of dices, faces, and the highest target value possible?

\(30\),\(30\) and \(1000\), respectively.

Figure 1: Example of dice with 12 faces.
Figure 2: Example of common 6 faces dice.
Figure 3: Example of dice with 20 faces.

Discussion

Let’s start by noting that the answer can be astronomically high, but more importantly, that the number of possible value combinations resulting from rolling \(d\) dices is even larger. If each die has \(f\) faces, then we are looking at \(f^d\) possible distinct roll outcomes! During an interview, a brute-force approach, where we go over each and every possible roll outcome of the \(d\) dice, is completely out of question (considering the constraints on the maximum number of dices and faces we might get for input) unless we are willing to wait around \(6 \times 10^{18}\) years. Even if we implement this algorithm so that it can run on the fastest supercomputer available today (which is capable of a staggering \(\approx450\) [^37] operations per second), it would still require \(\frac{30^{30}}{10e15}\)s to run to completion. By that time, humanity will be long gone, the universe will be a dark and cold place, but most importantly, the position you are dreaming of will have gone to somebody else.

This type of "counting" questions is usually (and crucially more efficiently) solved by using a dynamic programming approach. In fact, this question shares a lot of similarities with the classical dynamic programming Coin change problem, to the point that we could solve this one using the solution to the other. In fact we can stretch this reasoning so as to consider the problem addressed in this chapter to be a proper specialization of the *Coin change* problem, where the number of available coins is equal to \(d\) and the denomination of the coins are \(1,2,\ldots,f\): we have coins of the same denomination as the dice faces.

Brute-force

For science’s sake, let’s see how a brute-force approach would be applied here. As we all know, a brute-force approach evaluates all possible outcomes of rolling \(d\) dice and keeps track of how many yield a total sum of \(t\). When dealing with this kind of task, where you have to enumerate/generate the elements of a given set, recursion is usually the way to go, especially if the elements (in this case a combination of face values) have a recursive definition. In this specific case, we can generate all possible combinations of \(d\) faces we can obtain from rolling \(d\) dices by doing the following:

  • generate the combinations from rolling \(d-1\) dice;

  • create \(f\) copies of them;

  • prepend \(1\) to the items of the first copy;

  • prepend \(2\) to the items of the second copy;

  • \(\ldots\)

  • prepend \(f\) to the items of the last copy.

For instance, we can generate the all rolls outcome of \(3\) six-faced dice by:

  1. Generate all the outcomes for only \(2\) of them \(C_2\):
    \(\! \begin{aligned}[t] C_2 = \{ & (1,1),(1,2),(1,3), (1,4), (2,1), (2,2), (2,3), (2,4),\\ &(3,1), (3,2), (3,3), (3,4), (4,1), (4,2), (4,3), (4,4)\} \end{aligned}\)

  2. Append \(1\) to each of the elements of \(C_2\):
    \(\! \begin{aligned}[t] C_3^1 = \{ &(1,1,1),(1,1,2),(1,1,3),(1,1,4),(1,2,1),(1,2,2),(1,2,3),(1,2,4),\\ &(1,3,1),(1,3,2),(1,3,3),(1,3,4),(1,4,1),(1,4,2),(1,4,3),(1,4,4)\} \end{aligned}\)

  3. Append \(2\) to each of the elements of \(C_2\):
    \(\! \begin{aligned}[t] C_3^2 = \{ &(2,1,1),(2,1,2),(2,1,3),(2,1,4),(2,2,1),(2,2,2),(2,2,3),(2,2,4),\\ &(2,3,1),(2,3,2),(2,3,3),(2,3,4),(2,4,1),(2,4,2),(2,4,3),(2,4,4)\} \end{aligned}\)

  4. Append \(3\) to each of the elements of \(C_2\):
    \(\! \begin{aligned}[t] C_3^3 = \{ &(3,1,1),(3,1,2),(3,1,3),(3,1,4),(3,2,1),(3,2,2),(3,2,3),(3,2,4),\\ &(3,3,1),(3,3,2),(3,3,3),(3,3,4),(3,4,1),(3,4,2),(3,4,3),(3,4,4)\} \end{aligned}\)

  5. Prepend \(4\) to each of the elements of \(C_2\):
    \(\! \begin{aligned}[t] C_3^4 = \{ & (4,1,1),(4,1,2),(4,1,3),(4,1,4),(4,2,1),(4,2,2),(4,2,3),(4,2,4),\\ &(4,3,1),(4,3,2),(4,3,3),(4,3,4),(4,4,1),(4,4,2),(4,4,3),(4,4,4)\} \end{aligned}\)

  6. Finally, return \(C_3 = \{C_2^1 \cup C_2^2 \cup C_2^3 \cup C_2^4\}\)

The definition above is correct but not very useful in practice. It requires making many copies of a potentially very large (indeed exponential) set of items. We will therefore use a different approach that will still run in exponential time (this section is named brute-force after all) but that can be used as a basis for developing a more efficient DP solution.

We start by rolling the first die. Clearly we have \(f\) possible values we can get, but once the value for this specific die is set (say we got the value \(x\)) we are left with \(d-1\) dices to roll and we still have to make up for \(t-x\) with the remaining \(d-1\) dice in order to reach our target value \(t\). Once a die is rolled, we are left with exactly the original problem on a smaller number of dice and target value. This is why recursion is handy as we can describe the solution to the entire problem in terms of solutions to sub-problems. We can continue this recursive process - rolling one die at a time - until we reach one of the following cases:

  1. \(d<0\) or \(t<0\) the answer is \(0\). There is no solution to the problem when the number of dice to use is negative or the target number is negative.

  2. \(t=0\). We have reached the target value \(t\). If we have used all dice then we have a solution, otherwise we do not. In other words:

    • if \(d=0\), we have used all \(d\) dice and the sum is exactly equal to \(t\). This is a valid combination. We have rolled \(d\) dice and the sum of their faces is exactly equal to \(t\).

    • if \(d>0\), we have not rolled all the dice, yet we have already reached our target value. If we continue to roll, we will generate a combination with a total sum higher than \(t\). This is not a good combination.

The idea above can be better expressed using the recurrence relation shown in Equation [eq:dice_rolls:dpformula] where \(S(d,t,f)\) is the number of ways one can obtain a target value \(t\) by throwing \(d\) dice. Note that the third parameter never changes and thus it does not play a dynamic role in the recurrence.

\[S(d,t,f)=\begin{cases} 1 \: \: \text{ if } \: d=t=0 \\ 0 \: \: \text{ if } \: d=0, \: t>0 \\ \\ \sum_{1}^{\min(f,t)} S(d-1,t-j,f) \:\: \text{ otherwise} \end{cases} \label{eq:dice_rolls:dpformula}\]

Listing [list:dice_rolls:bruteforce] shows a possible implementation of such idea. Please note that this code is remarkably similar to the brute-force solution for the Coin Change problem in Chapter [ch:coin_change]. Section [sec:min_difficulty_job_scheduler:generate_all_combination] also discusses the problem of generating combinations and the material discussed there can be adapted and applied here.

Listing 1: Brute-force (enumerating all possible combinations) solution for the problem of counting the number of dice rolls summing up to a target number $t$.

int num_rolls_to_target_bruteforce(const int dices,
                                   const int f,
                                   const int target)
{
  constexpr unsigned long MOD = 1e9 + 7;
  // ops. overreached
  if (target < 0 || dices < 0)
    return 0;

  // no more die to roll. Have we reached the target value?
  if (dices == 0)
    return target == 0 ? 1 : 0;

  // for each possible face
  int ans = 0;
  for (int i = 1; i <= f; i++, ans %= MOD)
  {
    // we assume we rolled the face with value i and solve the associated
    // subproblem
    ans += num_rolls_to_target_bruteforce(dices - 1, f, target - i);
  }
  return ans;
}

The time and space complexity of this approach are exponential and constant, respectively.

The proof of this can be derived from the solution of the recurrence relation shown in Equation [eq:dice_rolls:dpformula_complexity]: \[S(d,t) = S(d-1,t-1) + S(d-1,t-2) \label{eq:dice_rolls:dpformula_complexity}\]

where \(S(d,t)\)expresses the number of invocations of the function for a given number of dice \(d\) and target value \(t\) when \(f=2\). The resulting invocation tree is complete and has height \(h=t\) (assuming \(d \leq t\), but the same reasoning can be applied in the other cases). The cost of each node of the tree is \(O(1)\). The number of nodes in such a tree is exponentially proportional to its height.

Dynamic Programming - Recursive top-down

The brute-force solution we laid down in Section 45.3.1 can be turned into a nice and efficient one with the help of DP, and in particular of memoization. As with many other problems solvable with DP, the brute-force solution above can be turned into a much more efficient one by simply realizing that the same sub-problems are solved over and over again. For instance, imagine the case where \(f=5\). We might end up solving the sub-problem where \(d=1\) and \(t=3\) from the sub-problem where \(d=2\), \(t=4\) (by rolling the face with \(1\)), or from the sub-problem \(d=2\), \(t=5\) (by rolling the face with \(2\)).

However, the maximum number of distinct invocations for the function is not larger than \(30\times 1000 = 30000\): the maximum number of dice multiplied by the largest target value as these are the only function parameters varying during the execution. If we can somehow guarantee that no duplicate work is done for a given \(d\) and \(t\), then we can get away with only \(O(d\times t)\) function calls.

Consider, for instance, what happens during the execution of the code in Listing [list:dice_rolls:bruteforce] for the following input:

  • \(d = 3\)

  • \(f = 6\)

  • \(t = 12\)

The function is called \(6\) times and the sub-problem is solved \(5\) times. You can verify this yourself if you draw the recursion tree for or simply add a print statement at the beginning of the function. All these duplicate calls are superfluous and they represent work that can be saved if the result of each sub-problem is stored in a cache as shown in the following Listing.

The function subproblem is solved \(6\) times and the subproblem is solved \(5\) times. All these superfluous execution can be avoided if the result of each of the subproblem is stored into a cache as shown in Listing [list:dice_rolls:dp].

Listing 2: Dynamic programming with memoization top-down recursive solution for the problem of counting the number of dice rolls summing up to a target number $t$.

using Cache = std::vector<std::vector<unsigned long>>;

int num_rolls_to_target_memoization_helper(const int target,
                                           const int dices,
                                           const int f,
                                           Cache& cache)
{
  constexpr unsigned long MOD = 1e9 + 7;

  if (target < 0 || dices < 0)
    return 0;

  if (dices == 0)
    return target == 0;

  if (cache[dices][target] != -1)
    return cache[dices][target];

  unsigned long ans = 0;
  for (int i = 1; i <= f; i++, ans %= MOD)
    ans +=
        num_rolls_to_target_memoization_helper(target - i, dices - 1, f, cache);

  cache[dices][target] = ans;
  return ans;
}

int num_rolls_to_target_memoization(int d, int f, int target)
{
  Cache DP(d + 1, std::vector<unsigned long>(target + 1, -1));
  return num_rolls_to_target_memoization_helper(target, d, f, DP);
}

This implementation is an almost identical copy of the brute-force solution, except for the addition of a cache. Note how before actually trying to compute the answer we first look into the cache (see highlighted lines) to see if it is already present in the cache. If not, we solve the problem and before returning the answer we save the result in the cache.

Dynamic programming - Iterative bottom-up

Turning the top-down-up solution shown in Section 45.3.2 into a bottom-up one is relatively easy. The recursive definition of the solution (see Equation [eq:dice_rolls:dpformula]) clearly shows that we can calculate the answer for a given number of dice \(d\) and a target value \(t\) if we have already calculated the values for targets \(t-1, t-2, \ldots, t-f\) and \(d-1\). We also know how to easily calculate the answer for all possible target values when \(d=0\) and \(d=1\) and from there we can apply the reasoning above. This is exactly the strategy that Listing [list:dice_rolls:dpbottomup] implements.

Listing 3: Dynamic programming bottom-up solution for the problem of counting the number of dice rolls summing up to a target number $t$.

using Cache = std::vector<std::vector<unsigned long>>;

int num_rolls_to_target_bottom_up(const int d, const int f, const int target)
{
  Cache DP(d + 1, std::vector<unsigned long>(target + 1, 0));

  for (int j = 1; j <= f && j <= target; j++)
  {
    // only one way to make a given value with 1 dice
    DP[1][j] = 1;
  }

  // 1 way to make 0 with 0 dice
  DP[0][0] = 1;

  // num dices
  for (int i = 2; i <= d; i++)
  {
    // target value
    for (int t = 1; t <= target; t++)
    {
      // face value for the ith die
      for (int j = 1; j <= f && j <= t; j++)
      {
        DP[i][t] += DP[i - 1][t - j];
      }
    }
  }
  return DP[d][target];
}

We use a 2D table (initialized with zeros) with \(d+1\) rows and \(t+1\) columns where each cell corresponds to the solution of a sub-problem where \(d=i\) and \(t=j\). The first loop takes care of filling the table with "known" values for all sub-problems where \(d=1\). If we only have a die with \(f\) faces, there is only one way we can achieve the target values \(1,2,\ldots, f\) and no way to obtain any higher values. The rest of the code fills the table one row at a time (one die at a time) by using the values of the previous row.

The time and space complexity of this algorithm are \(\Theta(dtf)\) and \(\Theta(dt)\), respectively. However, the space complexity can be easily lowered to \(\Theta(t)\) because, as already mentioned, we only need space for two rows of the DP table: one for the values of the current \(d\) and one for the values at \(d-1\).