Recursion optimization

I’m trying to find a certain path through my 2d array

My array could look like this:

temp = [
    ["placeholder", 2, 0],
    ["placeholder", 1, 7, 3],
    ["placeholder", 4, 5, 8],
    ["placeholder", 6, 3, 5, 2],
    ["placeholder", 7],
    ["placeholder", 3, 0],
]

The inner arrays contain a placeholder followed by a variable amount of integers. These integers range in value between 0-19 (0 and 19 both included)

I wanna find a path from top to bottom through these inner arrays where no number is used more than once.

At temp[0] i can choose my 1st value either 2 or 0
At temp[1] i can choose my 2nd value either 1, 7 or 3
At temp[2] i can choose my 3rd value either 4, 5 or 8
At temp[3] i can choose my 4th value either 6, 3, 5 or 2
Notice at this step I cannot pick 2 if I had already chosen 2 in the 1st step
I cannot pick 3 if I had already chosen 3 in the 2nd step
I cannot pick 5 if I had already chosen 5 in the 3rd step
etc.

These are some legal paths:

  • 2-1-4-6-7-3
  • 0-1-4-5-7-3

These are some illegal paths:

  • 2-1-4-2-7-3 (2 is touched twice)
  • 0-1-4-5-7-0 (0 is touched twice)

I want my function to output a completely legal path.
I just want my function to return false if no path is found.

I’ve tried to write my own recursive solution to this problem, which I think is working, but suffers from heavy inefficiency and therefore takes literal ages to complete.
The size of my intended array is more like 20 x 1..20

So far my code looks like this (written in Python 3) (and with a temp array):

temp= [
    ["placeholder", 7, 3],
    ["placeholder", 3, 3],
    ["placeholder", 4, 3],
    ["placeholder", 3, 8],
    ["placeholder", 1, 3],
]

def findpath(array, path = []):
    
    if path and path.count(path[-1]) > 1:
        return False

    if len(path) == len(array):
        print(path)
        return True
    
    routes = array[len(path)][1:]

    for route in routes:
        path.append(route)
        if findpath(array, path):
            return True
        path.pop(-1)

findpath(temp)

This code prints: [7, 3, 4, 8, 1]

This approach is brute-forcing through every possibility until a solution is reached. In a worst-case scenario, my temp array would contain 20 arrays all containing 20 values each, that would make for 20! possible solutions, or in other words 2.432.902.008.176.640.000 possible paths. Let’s exaggerate and say my pc takes 1 ms to process a single iteration of this function. My math could be wrong but that would mean this process could take 77,146,816.6 years to complete (not accounting for leap years), and imma be honest, I don’t got this kinda time.

There must be a smarter way to approach this problem.
One thing I’ve noticed is that if the function encounters an inner array at step e.g. 15, containing only a single int e.g. 2, then the function will try adjusting its latest picks (first step 14 then step 13) one at a time, without realizing it has already picked 2 at step 2. In this situation by going back to step 2 and change the value from 2 to something else will save a f*** ton of iteration. The only problem is I have no idea how to implement this, nor any idea how to optimize this further.

Here is a real example array (not sure if it contains a solution)

temp = [
['placeholder', 2, 6, 11, 14, 15, 16, 18], 
['placeholder', 2, 6, 7, 10, 11, 14, 15, 16, 17, 18, 19], 
['placeholder', 2, 6, 10, 11, 14, 15, 16, 18, 19], 
['placeholder', 2, 6, 11, 14, 15, 16, 18, 19], 
['placeholder', 2, 6, 11, 14, 15, 16], 
['placeholder', 2, 6, 7, 10, 11, 14, 15, 16, 18, 19], 
['placeholder', 2, 6, 7, 8, 9, 10, 11, 14, 15, 16, 17, 18, 19], 
['placeholder', 11, 15, 16], 
['placeholder', 11, 14, 15, 16], 
['placeholder', 0, 1, 2, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19], 
['placeholder', 11, 15], 
['placeholder', 2, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19], 
['placeholder', 1, 2, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19], 
['placeholder', 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19], 
['placeholder', 2, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19], 
['placeholder', 2, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19], 
['placeholder', 15], 
['placeholder', 0, 1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19], 
['placeholder', 2, 11, 14, 15, 16], 
['placeholder', 2, 6, 7, 8, 10, 11, 14, 15, 16, 17, 18, 19]
]

Hope you can help me learn <3

Answer

functional heritage

Recursion is a functional heritage and so using it with functional style yields the best results. This means avoiding mutations like .append and .pop and other side effects.

Below we use a set to efficiently skip invalid combinations and a tuple to store the path. A new set and tuple are constructed for each recursion –

def traverse(t, s = set(), p = ()):
  if not t:
    yield p
  else:
    [ [_, *values], *more ] = t
    for v in values:
      if v not in s:
        yield from traverse(more, {*s, v}, (*p, v))

Using the data in your sample program –

temp = 
  [ ["placeholder", 7, 3]
  , ["placeholder", 3, 3]
  , ["placeholder", 4, 3]
  , ["placeholder", 3, 8]
  , ["placeholder", 1, 3]
  ]

for p in traverse(temp):
  print(p)
(7, 3, 4, 8, 1)
(7, 3, 4, 8, 1)

And using the data in your original question –

temp = 
  [ ["placeholder", 2, 0]
  , ["placeholder", 1, 7, 3]
  , ["placeholder", 4, 5, 8]
  , ["placeholder", 6, 3, 5, 2]
  , ["placeholder", 7]
  , ["placeholder", 3, 0]
  ]

for p in traverse(temp):
  print(p)
(2, 1, 4, 6, 7, 3)
(2, 1, 4, 6, 7, 0)
(2, 1, 4, 3, 7, 0)
(2, 1, 4, 5, 7, 3)
(2, 1, 4, 5, 7, 0)
(2, 1, 5, 6, 7, 3)
(2, 1, 5, 6, 7, 0)
(2, 1, 5, 3, 7, 0)
(2, 1, 8, 6, 7, 3)
(2, 1, 8, 6, 7, 0)
(2, 1, 8, 3, 7, 0)
(2, 1, 8, 5, 7, 3)
(2, 1, 8, 5, 7, 0)
(2, 3, 4, 6, 7, 0)
(2, 3, 4, 5, 7, 0)
(2, 3, 5, 6, 7, 0)
(2, 3, 8, 6, 7, 0)
(2, 3, 8, 5, 7, 0)
(0, 1, 4, 6, 7, 3)
(0, 1, 4, 5, 7, 3)
(0, 1, 4, 2, 7, 3)
(0, 1, 5, 6, 7, 3)
(0, 1, 5, 2, 7, 3)
(0, 1, 8, 6, 7, 3)
(0, 1, 8, 5, 7, 3)
(0, 1, 8, 2, 7, 3)

generators are lazy

Because we’re using a generator, we can stop generating the paths at any time. Perhaps you only want to find the first valid path –

def first_valid_path(t):
  for path in traverse(t):
    return path

temp = 
  [ ["placeholder", 2, 0]
  , ["placeholder", 1, 7, 3]
  , ["placeholder", 4, 5, 8]
  , ["placeholder", 6, 3, 5, 2]
  , ["placeholder", 7]
  , ["placeholder", 3, 0]
  ]

print(first_valid_path(temp))
(2, 1, 4, 6, 7, 3)

optimisation

One such optimisation we can make is to sort the input array by the number of possibilities, least posibilities first, as these are the most difficult to fulfill –

temp.sort(key=len)
print(temp)
[ ['placeholder', 15]
, ['placeholder', 11, 15] 
, ['placeholder', 11, 15, 16] 
, ['placeholder', 11, 14, 15, 16] 
, ['placeholder', 2, 11, 14, 15, 16] 
, ['placeholder', 2, 6, 11, 14, 15, 16] 
, ['placeholder', 2, 6, 11, 14, 15, 16, 18] 
, ['placeholder', 2, 6, 11, 14, 15, 16, 18, 19] 
, ['placeholder', 2, 6, 10, 11, 14, 15, 16, 18, 19] 
, ['placeholder', 2, 6, 7, 10, 11, 14, 15, 16, 18, 19] 
, ['placeholder', 2, 6, 7, 10, 11, 14, 15, 16, 17, 18, 19] 
, ['placeholder', 2, 6, 7, 8, 10, 11, 14, 15, 16, 17, 18, 19]
, ['placeholder', 2, 6, 7, 8, 9, 10, 11, 14, 15, 16, 17, 18, 19] 
, ['placeholder', 2, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19] 
, ['placeholder', 2, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19] 
, ['placeholder', 2, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]  
, ['placeholder', 1, 2, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19] 
, ['placeholder', 0, 1, 2, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19] 
, ['placeholder', 0, 1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19] 
, ['placeholder', 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
]

I didn’t measure it, but a solution is found instantly (less than one second) –

print(first_valid_path(temp))
(15, 11, 16, 14, 2, 6, 18, 19, 10, 7, 17, 8, 9, 13, 12, 4, 1, 0, 5, 3)

To put the path in correct order, it would need to be un-sorted. There are a number of ways this can be done and I will consider updating the post later with a proposed solution. For now, that is left as an exercise for the reader 😀


unsort

If you made it this far, great! Above I proposed a sorting and unsorting process and below I’ll show one such way to handle the unsorting –

def traverse(init):
  def sort(t):
    z = list(enumerate(t))
    z.sort(key=lambda _: len(_[1]))
    return z

  def unsort(p):
    p.sort(key=lambda _: _[0])
    return tuple(v for (_, v) in p)

  def loop(t, s, p):
    if not t:
      yield unsort(p)
    else:
      [ (k, [_, *values]), *more ] = t
      for v in values:
        if v not in s:
          yield from loop(more, {*s, v}, [*p, (k, v)])
  
  yield from loop(sort(init), set(), [])

This works by creating an intermediate representation with sort

[ (16, ['placeholder', 15])
, (10, ['placeholder', 11, 15])
, (7, ['placeholder', 11, 15, 16])
, (8, ['placeholder', 11, 14, 15, 16])
, (18, ['placeholder', 2, 11, 14, 15, 16])
, (4, ['placeholder', 2, 6, 11, 14, 15, 16])
, (0, ['placeholder', 2, 6, 11, 14, 15, 16, 18])
, (3, ['placeholder', 2, 6, 11, 14, 15, 16, 18, 19])
, (2, ['placeholder', 2, 6, 10, 11, 14, 15, 16, 18, 19])
, (5, ['placeholder', 2, 6, 7, 10, 11, 14, 15, 16, 18, 19])
, (1, ['placeholder', 2, 6, 7, 10, 11, 14, 15, 16, 17, 18, 19])
, (19, ['placeholder', 2, 6, 7, 8, 10, 11, 14, 15, 16, 17, 18, 19])
, (6, ['placeholder', 2, 6, 7, 8, 9, 10, 11, 14, 15, 16, 17, 18, 19])
, (14, ['placeholder', 2, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19])
, (11, ['placeholder', 2, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
, (15, ['placeholder', 2, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
, (12, ['placeholder', 1, 2, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
, (9, ['placeholder', 0, 1, 2, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
, (17, ['placeholder', 0, 1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
, (13, ['placeholder', 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
]

The paths are made up of tuples (sort_key, value) and unsorted using the sort_key

[(16, 15), (10, 11), (7, 16), (8, 14), (18, 2), (4, 6), (0, 18), (3, 19), (2, 10), (5, 7), (1, 17), (19, 8), (6, 9), (14, 13), (11, 12), (15, 4), (12, 1), (9, 0), (17, 5), (13, 3)]

Finally the sort_key is removed and only the value path we care about is yielded. Now we can see the path in the correct order. The result is instantly computed (less than one second) –

print(first_valid_path(temp))
(18, 17, 10, 19, 6, 7, 9, 16, 14, 0, 11, 12, 1, 3, 13, 4, 15, 5, 2, 8)

all valid paths

Above we use find_first_path but our sorting/unsorting technique is also efficient at finding all solutions –

for p in traverse(temp):
  print(p)

For this particular input, there is only one valid path. However this program terminates quickly (less than one second), signalling that all possibilities have been exhausted.