Hydrothermal Venture

Some actual algorithms.

But I’m getting annoyed with the way AoC doesn’t give any info as to what part 2 contains. This just seems to encourage doing a dirty ad-hoc solution for part 1 just to reveal part 2.

Here part 1 deals with orthogonal lines on a discrete 2D grid.

And part 2 could go in several diffferent directions. It might expand to talk about (orthogonal) rectangles (in which case you definitely want to go with something like BSPs), it might just loosen the restriction and allow non-orthogonal lines, or it could even expand it to talking about orthogonal lines in more dimensions (e.g. lines apparing at different discrete points in time). The lines in part 2 might be connected, and the calculation different. And so on. I suppose the only hint is that the data input format cannot change.

But all would entail different trade-offs and what kind of data structures you want to go for.


Will be using Numpy as usual.

aoc = __import__('aoc-common').aoc()
import numpy as np
import re

text = aoc.input_text(5)
expr = re.compile(r'^(\d+),(\d+) -> (\d+),(\d+)$', re.M)

lines = np.array(expr.findall(text), np.int64).reshape(-1, 2, 2)

# Now we have an array like (#, 2, 2) corresponding to (num_of_lines, start/end, x/y).
# I.e. lines[2][1][0] is the x-coordinate (0) of the end-point (1) of the 3rd line (2).

# Reposition the lines so we're working in an area starting with (0,0).
lines -= lines.min((0,1))

# Figure out the size of the area in question.
x_max, y_max = lines.max((0,1))

# Assert that area is "small."
assert x_max * y_max < 1_000_000

Part 1: Querying ranges.

So we get a list of points that represent orthogonal lines, as said. They’re not really “lines,” per se, but ranges in some grid. We are to count the number of cells in the grid included in at least two such ranges.

There’s many different approaches and trade-offs to be made for such a scenario. There’s also a huge “problem space” that is interesting to think about, if you just ask different questions. It entirely changes the approach you want to take. You’re encouraged to think about these things, even though it’s probably not interesting to most people.

  • The problem would be very different if we instead expected queries on the form “is there a line at (x,y)?”
  • The problem would again be very different if the question was “what are the lines not overlapping or crossing with any other lines?”
  • Or indeed, “what is the longest line within the bounding box that can be drawn that does not intersect with any of these lines?”

But to the problem at hand, the absolute simplest (in my mind) is to just have some 2D array and mark off all the ranges:

# The area we'll "draw" in.
arena = np.zeros((x_max+1, y_max+1), 'i4')

# Identify orthogonal lines.
orthogonal = lines[np.any(lines[:,0] == lines[:,1], 1)]

# Sort all the points so that the first point has the minimum of the x and y
# coordinates, and the second the maximum. Note that if the lines were
# non-orthogonal, this might mirror it vertically or horizontally.
orthogonal = np.stack([orthogonal.min(1), orthogonal.max(1)], 1)

# Moving this loop to numpy is non-trivial. Doing slicing with a numpy array on
# another array on the numpy side can be done but it is always pretty
# complicated.
for l,u in orthogonal:
  arena[l[0]:u[0]+1, l[1]:u[1]+1] += 1

print(f'part 1 solution: {np.count_nonzero(arena >= 2)}')

If is the number of lines, is the maximum span of the coordinates, then this is , i.e. it’s quadratic with respect to the lengths we’re talking about.

It’s a good solution if the grid size is very small, and it’s linear with respect to the number of lines. (Also if we are to query the grid area and do a lot of calculations on it, this would also be a good solution because querying is constant-time.)

But it’s a bad solution if the lines are long or if the bounding grid is very big.

The second general approach is to fixate only on the lines themselves and calculate a set of overlaps as we go. The complexity here will depend on how many lines and how many sets of overlaps there are. This would be the only possible choice if the lines were unbounded. However, for this kind of problem, it would be a ugly monstrosity, since we’d want to specialize on the four different line orientations (vertical, horizontal, same-sign diagonal, opposite-sign diagonal) to do efficient line-lookup, and so it would lead to a lot of code that looks copy-pasted, which I hate. So I will not be doing that.

Part 2

And so part 2 is just a simple continuation of part 1.

# Under the assumption that all non-orthogonal lines are exactly diagonal.
diagonal = lines[np.all(lines[:,0] != lines[:,1], 1)]

def from_to(x,y):
  if x <= y:
    return np.arange(x, y+1)
    return np.arange(x, y-1, -1)

# Same caveat as in part 1.
for l,u in diagonal:
  arena[from_to(l[0],u[0]), from_to(l[1],u[1])] += 1

print(f'part 2 solution: {np.count_nonzero(arena >= 2)}')