Family

This will be more of a programming write-up than a task write-up. I don’t even know what the task involved, since I only got to look at the very last step that involved a broken PNG file with some byte switched at seemingly random (but known) positions.

That in itself is not really a hard problem but it was a nice programming exercise.

poiko did the actual task but he asked me to look at the last step which was more up my alley. This was right before the CTF ended, just in case I already had some code written for this kind of thing. I do, or could have sworn I did, but didn’t find it at the time, so all I could try when he asked was to naively decode the IDAT with zlib and then do some simple backtracking based on its error, which was basically what poiko had done semi-manually already.

But this is an incredibly painful process, and would probably take well over an hour even with mindless concentration. But why do something slowly in an hour, when you can take even more time than that to write code that will eventually do it quickly?

Next day I went back to find my old code, so I could incorporate it into flagmining in a less ad-hoc form.

So what I had was a stateless bit reader for DEFLATE streams, and on top of that I build a backtracking mechanism:

# Basic idea:

#              v--------------v------------v--------------- uncertain bytes
#        v---v-----v-------v--------------------v---------- block/symbol frames in lz
# Stream |---|-XX--|-------|--XX-----------XX-F-|------ ...
#                          ^------------------------------- we started reading here
#                                             ^------------ we found a fault here
#      ----- 2 ----------- 1 ------------------------------ we backtrack to 1, then 2, etc.

# We want to rewind to discrete frames in the LZ stream. Whenever we step over
# some uncertain bytes we try to change one of them (starting at highest
# position going down) to its next value. If we've exhausted the possibilities
# for that byte, reset it and forget about it, continuing down to the previous
# uncertain byte and/or LZ frame.

class zlib_fixer(deflate_decoder):
  def __init__(self, cp, cd, *args, **kwargs):
    """This class specifically is for the Family task of ASIS 2021 where the zlib
    stream contains 2-byte uncertainties.

    `cp` is a list of byte-positions where stream bytes are uncertain. `cd` is a
    corresponding list of choices for those bytes. They're assumed to be sorted
    in order of position.

    """
    super().__init__(*args, **kwargs)

    self.stack = []
    self.change_pos = cp
    self.change_data = cd

    self.zlib_header()

  def recurse_fix(self):
    self.stack.append(framestate(self.tell(), len(self.output),
                                 (self.lit_table, self.dist_table, self.final)))

    try:
      return self.read_and_decode()
    except ValueError:
      pass

    self.backtrack()
    return self.recurse_fix()

  def backtrack(self):
    prev_pos = self.tell()
    log.info(f"fault found at {prev_pos//8}:{prev_pos%8}")

    b = bisect.bisect_left(self.change_pos, prev_pos//8)

    while self.stack and b > 0:
      cur_pos = prev_pos
      prev_pos, out_len, tables = self.stack.pop()

      if prev_pos > 8 * self.change_pos[b-1]:
        log.debug(f"backtracking past {prev_pos//8}:{prev_pos%8} (next change at {self.change_pos[b-1]}+2)")
        continue

      b -= 1
      if not self.bump_change(b):
        log.debug(f"backtracking past {prev_pos//8}:{prev_pos%8} (change at {self.change_pos[b]}+2 was exhausted)")
        continue

      self.seek(prev_pos)
      self.output = self.output[:out_len]
      self.lit_table, self.dist_table, self.final = tables
      log.info(f"backtracked to position {prev_pos//8}:{prev_pos%8}")
      return

    self.error("backtrack beyond start! can't recover stream!")

  def bump_change(self, i):
    alts = self.change_data[i]

    self._fobj.seek(self.change_pos[i])
    current = self._fobj.read(2)
    assert current in alts

    self._fobj.seek(self.change_pos[i])
    j = alts.index(current) + 1
    if j >= len(alts):
      self._fobj.write(alts[0])
      return False
    else:
      self._fobj.write(alts[j])
      return True

  def read_and_decode(self):
    ret = super().read_and_decode()

    if not self.output:
      return ret

    line_start = (len(self.output) - 1) // 2251 * 2251
    if self.output[line_start] not in [0,1,2,3,4]:
      self.error('invalid PNG filter')

    if self.output[line_start] == 0 and set(self.output[line_start+1:]) & {3,5,7}:
      self.error('a few random disallowed colors')

    return ret

This turned out to work extremely well. It instantly finds many valid zlib stream (valid except for the CRC of course).

There’s about 150 positions in the file where the 2-byte 16-bit value at that position was one of two given options. (Apparently from some non-bijective encryption or encoding, I don’t know the details.) Which is a lot of potential streams overall.

So I added extra safeguards and caused it to backtrack also if the filter byte starting each line of a PNG image was invalid. This produces perfectly valid PNGs (again save the CRC) but they’re still garbled. Noting that only the first and final colors in the palette were being used, I added even more assumptions (these a bit more random and ad hoc, like banning 3 colors entirely). Finally this adds enough checks for it to reconstruct the image perfectly.

flag