# perfectblue CTF 2021

## General Comments {#pbctf2021}

I didn't play it, but I solved all the crypto tasks. Two of them I solved for `poiko` on Sunday while the competition was running.

They were pretty cool overall. Didn't look at any of the misc tasks; the zip `poiko` gave me just had these problems.

## Alkaloid Stream

`poiko` had already solved this one during the actual CTF, I just re-solved it out of my own curiosity. Even "easy" problems can be quite enjoyable when made by good problem authors (which I know `rbtree` to be).

Anyway, the task: I don't recall the name used for this kind of system, but it's a discrimination problem, where we're trying to distinguish values that are part of some set from fake ones. Here, the true values are linearly independent vectors over $F_2$ and the fakes are linear combinations from this basis. In a random order you get pairs $(X,Y)$ where one is true and one is fake.

In this specific problem none of this is important though, because it is trivially bugged. The fake values are generated from contiguous sums, but the loops are off by one so the last fake value generated is always 0, allowing us to distinguish some true value $t_i$. The penultimate fake value will be $t_i$ itself, allowing us to find another $t_j$, and so forth.
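To illustrate the unwinding, here's a toy reconstruction. Note the generation scheme below is my assumption of what the bugged loops amount to (each fake being the XOR of the trues *after* its position), not the task's actual code:

```
import random

def make_instance(n=32, bits=64):
    # Toy model of the bugged stream (an assumption, not the task's exact
    # code): fake_k is the XOR of the trues after position k, so the last
    # fake is the empty sum 0 and the penultimate fake is t_n itself.
    trues = [random.getrandbits(bits) | 1 for _ in range(n)]
    fakes = [0] * n
    for k in range(n):
        for j in range(k + 1, n):
            fakes[k] ^= trues[j]
    pairs = [random.sample([t, f], 2) for t, f in zip(trues, fakes)]
    return trues, pairs

def unwind(pairs):
    # Peel from the back: at each step the fake must equal the XOR of all
    # trues recovered so far, so the other member of the pair is the true one.
    acc, recovered = 0, []
    for a, b in reversed(pairs):
        t = b if a == acc else a
        recovered.append(t)
        acc ^= t
    return recovered[::-1]
```

With 64-bit values, accidental collisions between a true value and the running XOR are negligible, so the peeling is unambiguous in practice.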

The solve was about half an hour(?) from reading to flag.

## Steroid Stream

As above, but now fake value $i$ is generated from a random subset (of size $\lfloor n/3 \rfloor$) of $t_i, t_{i+1}, t_{i+2}, \ldots$. However it is still bugged, because the last $\lfloor n/3 \rfloor$ fake values are not generated at all, so they just default to 0. So immediately we have the tail end of the original true values. From there it's just a matter of iterating over the other pairs $(x,y)$ and checking whether the rank of the matrix increases when we add one of the numbers but not the other (pretending the numbers represent rows of a matrix over $F_2$).

```
# Code extract for solve. Much, much slower than it needs to be
# (at least O(n^4) in Python-ops, but I believe O(n^3) should be possible?)
# but fast enough for the task (couple of minutes).
mat = []
KEY = [0] * len(pks)
for i, xy in enumerate(pks):
    x, y = sorted(xy)
    if x != 0:
        continue
    mat.append(y)
    KEY[i] = xy[0] == 0
r = bit_rank(mat)
while r < len(pks):
    for i, xy in enumerate(pks):
        if 0 in xy:
            continue
        n0 = bit_rank(mat + [xy[0]])
        n1 = bit_rank(mat + [xy[1]])
        if n0 == n1:
            continue
        which = n0 < n1
        mat.append(xy[which])
        xy[1 - which] = 0
        KEY[i] = which
        r += 1
```

`bit_rank()` above is just naive elimination from my `flagmining` library:

```
lowbit = lambda x: x & -x
# ...

def bit_rank(ns):
    """Pretend the numbers represent a matrix over GF2 and calculate its rank."""
    mat = list(ns)
    r = 0
    while mat:
        if (pivot := mat.pop()) == 0:
            continue
        r += 1
        lb = lowbit(pivot)
        for i, row in enumerate(mat):
            if row & lb:
                mat[i] ^= pivot
    return r
```

Task was much simpler than I expected, and took just under an hour from reading code to flag, according to my timestamps.

## GoodHash

A server accepts input strings that must strictly conform to being printable ASCII, valid JSON, and the JSON must contain a property like `.admin==True`. This string is hashed by using the output from `AES_GCM(key=<fixed>, data=bytes(32), nonce=<input>, auth_data=None)` (both tag and ciphertext), and the hash is compared against a known target value. The goal is to generate a collision while adhering to the constrained input format.

The path to the solution was easy enough to "see" after refreshing my memory on GCM, but I put off actually implementing it for a long time (until I felt guilty enough). I dislike GCM because there's always some non-trivial amount of fiddly frustration and off-by-1 bugs and the like, due to the unintuitive LSB/MSB schizophrenia it suffers from in actual implementation. So this one took several (too many) hours just for that reason alone.

Basically when the nonce is not 96 bits exactly, and it's the only variable factor, the "point of collision" reduces to this part of the GCM:

```
h = AES_ECB(key, bytes(16))
cnt0 = GHASH(h, null_pad(IV) + (8*len(IV)).to_bytes(16, 'big'))
# cnt0 is then used for the CTR keystream, to mask the auth data, and so forth.
```

And `GHASH(h, B)` works by evaluating the polynomial $h^n B_1 + h^{n-1} B_2 + \cdots + h B_n$ where $B_i$ is 16-byte block $i$ of the byte stream $B$, and all byte-blocks are interpreted as values in $F_{2^{128}}$ under whatever random bit order who the hell knows.
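Since that bit-order fiddliness is exactly where the pain lives, here is a minimal reference sketch of the routine (my own implementation of the algorithm from NIST SP 800-38D, not code from the solve script):

```
def gf_mul(x, y):
    # Multiplication in GCM's GF(2^128), in the spec's reflected-bit
    # convention (reduction polynomial x^128 + x^7 + x^2 + x + 1,
    # encoded MSB-first as the constant R).
    R = 0xE1 << 120
    z, v = 0, x
    for i in range(127, -1, -1):       # bits of y, most significant first
        if (y >> i) & 1:
            z ^= v
        v = (v >> 1) ^ R if v & 1 else v >> 1
    return z

def ghash(h, data):
    # Evaluate h^n*B_1 + h^(n-1)*B_2 + ... + h*B_n via Horner's rule.
    assert len(data) % 16 == 0
    z = 0
    for i in range(0, len(data), 16):
        z = gf_mul(z ^ int.from_bytes(data[i:i+16], 'big'), h)
    return z
```

Note that in this reflected encoding the multiplicative identity is `1 << 127`, which is exactly the sort of thing that causes the off-by-1 misery mentioned above.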

I took the most naive way forward (there might be better ways): a nonce like `{"admin":1,"x":"////BLOCK[A]////....BLOCK[B]...."}` gives two completely free blocks I can customize. If $a,b,c,d,e$ are the blocks generated in the input to `GHASH()` from this nonce (including the length block), then the task is to find $b,c$ of a suitable form such that $bh + c = t = (ah^5 + dh^2 + eh)h^{-3}$ in GCM's finite field.
To find these suitable $b,c$ I again took the dumb approach, since I figured the complexity was small enough and I just wanted it over with: generate random $b$ looking for values where either $bh$ or $bh+t$ has no bits in common with $128\lfloor 2^{128}/255 \rfloor$ (i.e. no high bits set in any byte). This finds candidates where the output is valid ASCII with a ~16-bit brute force. Then do a small local search combining these sets to produce a valid-form $b$ and $c=bh+t$ where that ASCII also conforms to the allowable alphabet. Note that this first null-set is quote-unquote constant-time, because it can be generated as a table offline.
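The mask constant is easy to sanity-check; a small illustration (mine, not from the solve script):

```
# 255 divides 2^128 - 1, so 2**128 // 255 is the repeating byte pattern
# 0x0101...01; scaling it by 128 sets exactly the high bit of every byte.
HIGH = 128 * (2**128 // 255)
assert HIGH == int('80' * 16, 16)

def high_bits_clear(v):
    # True iff every byte of the 128-bit block has its high bit clear,
    # i.e. the block decodes to 7-bit (ASCII-range) bytes.
    return v & HIGH == 0

assert high_bits_clear(int.from_bytes(b'{"admin": true}!', 'big'))
assert not high_bits_clear(HIGH)
```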

## Seed Me

A server running Java asks for a seed and requires that the `(2048*k)`-th `float` output value for $k \in [1,16]$ is larger than `~0.980155`. It smelled like a lattice problem.

Upon inspection (`poiko` looked it up for me while I finished up `GoodHash`), Java uses a simple 48-bit LCG and floats are made from the high 24 bits of the state. LCGs were my first love in computer science topics and so have a special place in my heart.
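For reference, Java's `java.util.Random` is easy to model; a minimal sketch from memory (the standard JDK multiplier and increment, worth double-checking against the Java docs):

```
MASK = (1 << 48) - 1
MULT = 0x5DEECE66D
INC = 0xB

class JavaRandom:
    def __init__(self, seed):
        # Java scrambles the seed on construction.
        self.state = (seed ^ MULT) & MASK

    def next(self, bits):
        # One LCG step; output is the high `bits` bits of the 48-bit state.
        self.state = (self.state * MULT + INC) & MASK
        return self.state >> (48 - bits)

    def next_float(self):
        # nextFloat() uses the top 24 bits of the state.
        return self.next(24) / float(1 << 24)
```

The `(2048*k)`-th output is then just a known affine function of the seed mod $2^{48}$, which is what makes the lattice formulation below possible.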

I set up:

```
# [ -T0 -T1 -T2 ... -T15 B 0 ]
# [ a0 a1 a2 ... a15 0 1 ]
# [ M 0 0 .... 0 0 0 ]
# [ 0 M 0 .... 0 0 0 ]
# [ 0 0 M .... 0 0 0 ]
# [ . . . ]
```

… where $B \gg 2^{48}$, $M = 2^{48}$, and the first 16 columns are also scaled by some similarly large constant.

The target values $T$ are constructed by seeking a sum close to the midpoint between the lowest and highest acceptable value in the LCG range, i.e. `(2**48 + int(2**48*7.331*0.1337))//2`, offset by the added constant at that point in the LCG stream.

This finds several good candidates, seeds that pass 14 or 15 of the given checks, but it doesn't find one that passes all 16. At this point I quietly complimented the problem author for constructing a task that wasn't defeated by the most naive cookie-cutter LLL approach. But I noted also that the information content of the constraints is extremely high, far exceeding the seed space, so:

a) we're likely looking for a very specific seed that the problem author has discovered or found somewhere, b) indicating also that there are alternative solutions involving pure integer programming, or analyzing how Java's specific multiplier behaves in 2048-dimensional space, and c) in particular, it's likely that the hyperplanes of the LCG are near-orthogonal to a specific axis in this space, and so lots of fun could be had there.

But still, it's usually not that hard to "convince" LLL to find the answer you want if you know it's there. I saw two obvious ways forward: adjust the target to be slightly off-center, and/or use a non-uniform scale for the columns, applying uneven weight. I applied these techniques more or less randomly and will admit to not really knowing much theory here (the "I have no idea what I'm doing" dog meme picture comes to mind), I just followed some basic instincts. But randomizing the weights instantly nudged LLL in the right direction and found a (the only?) seed that works.

Problem description to flag took around 1.5 hours, so a pretty fast problem.

## Yet Another PRNG

This one was the most fun and enjoyable problem (in my view). It seemed simple enough, it smelled of nuts (as in brain teasers), and I love Euler hacking[^1]. It was decently challenging and rewarding. I am far from certain I found the intended or most optimal solution.

The problem, rephrased in conciser numpy code (numpy built into core CPython when?), is as follows:

```
# These are justified as nothing-up-my-sleeve numbers.
A = np.array([[4256, 307568, 162667],
              [593111, 526598, 630723],
              [383732, 73391, 955684]], object)

# OBS 1: these are NOT justified...
moduli = np.array([2**32 - 107, 2**32 - 5, 2**32 - 209], dtype=object)
M = 2**64 - 59

def gen(s):
    # We maintain three separate streams (as rows in `s`) and for each iteration
    # the next stream value is some linear combination of the three previous
    # values with coefficients from rows in A.
    #
    # The streams are reduced with different moduli, and finally summed under a
    # fourth modulus.
    while True:
        out = s[:,0] * [2*moduli[0], -moduli[2], -moduli[1]]  # OBS 2: the order.
        yield sum(out) % M  # summed under a fourth modulus
        n = (s*A).sum(1) % moduli  # each stream gets its own modulus
        s = np.c_[s[:,1:], n]  # cycle state: shift out the oldest numbers, add new ones

# 9x 32-bit unknowns.
U = np.array([[random.getrandbits(32) for _ in range(3)]
              for _ in range(3)], dtype=object)
g = gen(U)
for _ in range(12):
    print(next(g))  # output "hints"
# use next 12 outputs to encode flag.
```

I spent several wasteful hours going down misguided rabbit holes chasing clever Euler hacks, which was unfruitful, but I had a lot of fun doing it. I don't have a proper time estimate because I started preliminary analysis before bed and then continued in the morning, but I would say I used *at least* 5 hours on this problem, which I don't regret.

The final modulus does not seem like the difficult part, as it will only be relevant half the time anyway, thus I decided early on to ignore it, figuring that *if all else fails* it's a 12-bit brute force. The problem lies in reasoning about these three values that have already been reduced by different moduli by the time they're added using further arithmetic…

My initial idea was to think of the original values in the unknown state as faux-$\mathbb{Q}$ values in order to unify the moduli calculation to something mod $m_0 m_1 m_2$ or similar, but I was probably just being delusional. However, during this I made two important observations: the moduli were indeed very suspicious. For example $2m_0 = m_1 + m_2$, which I assume is relevant, but I didn't find a *direct* way to exploit it. I got the feeling there's some clever insight I missed here, like "oh! but that means we can just think of the streams as blah blah in linear algebra," but my brain didn't deliver on the blah blah part.

Anyway, the second observation is how the multipliers switch order when forming the final sum, mimicking CRT. The output is (ignoring the $M$ modulus) $2m_0(x \bmod m_0) - m_2(y \bmod m_1) - m_1(z \bmod m_2)$ (to be fair: this is much clearer in the actual task than I made it seem in the numpy golf above), which is equivalent to $2m_0(x \bmod m_0) - ((m_2 y + m_1 z) \bmod m_1 m_2)$, so that at least reduces it to two moduli.
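Both observations are easy to confirm numerically; a quick sanity check (my own, not part of the task):

```
import random

m0, m1, m2 = 2**32 - 107, 2**32 - 5, 2**32 - 209
assert 2*m0 == m1 + m2  # the suspicious relation between the moduli

for _ in range(100):
    y = random.getrandbits(96)
    z = random.getrandbits(96)
    # m2*(y mod m1) + m1*(z mod m2) == m2*y + m1*z   (mod m1*m2),
    # which is the CRT-like merge of the last two terms.
    lhs = m2 * (y % m1) + m1 * (z % m2)
    rhs = m2 * y + m1 * z
    assert (lhs - rhs) % (m1 * m2) == 0
```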

And, but, aha! Now the moduli are twice the bit length, meaning that they're "applied" less. Expanding, we have something like $2m_0((c_0 x_0 + c_1 x_1 + c_2 x_2) \bmod m_0)$ for the first independent stream, where all the inner values are 32-bit, so the $k$ in $(\cdots) - r = m_0 k$ will also be around 32 bits.

$m_0^2$ and $m_1 m_2$ are very close, their difference being only $102^2$, so then the idea is as follows: pretend the modulus is some $m'$ between these two values and then we have ourselves a good old lattice problem. The inaccuracy introduced in each constraint will be $\approx 32 + \log_2 102^2 - 1$ bits, but the modulus is ~64 bits, so we're still getting a handful of good bits of state per equation.

And a final note is that the first 3 outputs are just combinations of the original random values, meaning they are known to be accurate up to one overflow, so they give a ton of extra information. Likewise the fourth output has higher accuracy than the rest due to the values in `A` being only 20 bits.

Now, factoring in the modulus $M$ I ignored earlier: I didn't find a very elegant way to do this, but brute-forced it for the first four values, and simply ignored it for the remaining constraints since the error it introduces there is less than the error introduced by the pseudo-modulus $m'$.

In the end I had some big lattice that is too ugly to reproduce here, but it succeeded in finding all the original 9 seed values given the correct multiple of $M$ for each of the first four output values (so 16 lattices).

```
b'pbctf{Wow_how_did_you_solve_this?_I_thought_this_is_super_secure._Thank_you_for_solving_this!!!}'
```

## Yet Another RSA

When I first glanced at this problem I thought it was some weird elliptic curve thing and that the title was a troll. I immediately became very suspicious that it would be a Google-the-paper for one of those weird cryptosystems that academics and grad students pump out. "Diffie-Hellman over non-uniform smooth Titmann-Buttworth groups of semi-regular order" and then the incestuous follow-ups with "Analysis of …" and "Attack against …". (If I sound bitter it's only because I'm jealous.)

So, OK, the problem… is to find the logarithm of a number under some scarily complicated group operation. All arithmetic is performed modulo some RSA number $n=pq$. My initial glance quickly proved wrong: it definitely wasn't an elliptic curve. The group has apparent order $(p^2+p+1)(q^2+q+1)$ and elements are represented as two numbers in $\mathbb{Z}_n \cup \{\text{None}\}$ (or something like that, anyway). The primes used have the special form $a^2+3b^2$, the private exponent is suspiciously low, and so on. Tons and tons of red flags screaming "Google me."

But the first thing I did was to simplify. I looked at the case where the modulus was a single prime and tried (in vain) to reason about what the hell the group operation "did" to points geometrically or visually by looking at easy stuff like $(1,2)*(2,1)$, $(-1,2)*(1,2)$, $(-2,1)*(1,-2)$, etc., and expressing the results in human-readable $p$-fractions (e.g. showing each coordinate as $\frac{n}{d} \pmod p$ when such $n,d$ of small absolute value can be found easily). It wasn't particularly enlightening.

I tried to Google-the-paper at this point but didn't find anything promising, so I just started down a Wikipedia-hole instead. I came across the projective linear group and duh, the rather obvious $p^2+p+1 = \frac{p^3-1}{p-1}$ finally hit me. Thus I figured it was modelling some operation over the projective plane (thus all the "fractions"), and from the clue of $p^3-1$ I carefully re-examined the group operation while thinking about $F_{p^3}$ and yes indeed, it was modelling multiplication over $\mathbb{Z}_{pq}[X]/(X^3-2)$! (Where $\text{projective} \approx \text{monic}$, to really abuse math terms.)

I also wrote a quick utility class for playing with this group properly. (Here modified with stuff I discovered below.)

```
from dataclasses import dataclass

# generic_pow (square-and-multiply over a group op) and mod_inv (modular
# inverse) are helpers from my library.
def mk_point(N, r):
    display = lambda x: min(x - N, x, key=abs)

    @dataclass(eq=False, unsafe_hash=True)
    class _pt:
        x: int
        y: int
        z: int = 1

        def __iter__(self):
            return iter((self.x, self.y, self.z))

        def __eq__(self, other):
            return tuple(self @ other.z) == tuple(other @ self.z)

        def __add__(self, other):
            px, py, pz = self
            qx, qy, qz = other
            return _pt((px*qx + (py*qz + pz*qy)*r) % N,
                       (px*qy + py*qx + r*pz*qz) % N,
                       (py*qy + px*qz + pz*qx) % N)

        def __rmul__(self, n):
            return self * n

        def __mul__(self, n):
            return generic_pow(_pt.__add__, _pt(1,0,0), self, n)

        def pell(self):
            x, y, z = self
            return (x**3 + r*y**3 + r**2*z**3 - 3*r*x*y*z) % N  # == 1

        def __neg__(self):
            return NotImplemented

        def __matmul__(self, k):
            return _pt(self.x*k % N, self.y*k % N, self.z*k % N)

        def __repr__(self):
            if self.z == 0:
                if self.y == 0:
                    return '<INFPOINT>'
                return f'<INFLINE {display(mod_inv(self.y, N)*self.x%N)}>'
            iz = mod_inv(self.z, N)
            return f'<{display(self.x*iz%N)} : {display(self.y*iz%N)}>'

    return _pt
```

With this information I was able to Google-the-paper much better. I spent a lot of distracted time on an interesting paper called *A Group Law on the Projective Plane with Applications in Public Key Cryptography* (2020), but it didn't go anywhere toward a solution to this problem. But thinking about the special form of the primes, and Pell's equation, I found *A novel RSA-like cryptosystem based on a generalization of Redei rational functions* (2017) using cubic Pell. Yup, there it was: everything.

Oh yeah, back there I was also trying to look for invariants to find the curve it was following, as I figured there would be one(?). I checked all sorts of quadratic forms, some cubics, but never found it. No wonder, because as per the paper above, the curve (cubic Pell) for this particular instance turns out to be $x^3 + 2y^3 - 6xy + 3 = 0$. Jesus. (To be fair, that does mean it's easy to find a point, namely `(1,1)`!)

```
Pt = mk_point(900397, 2)
P = Pt(1,1)
Q = P * 1337
assert P.pell() == Q.pell()
```

I mean it's cool… but for nothing?

This paper also makes bold claims about how resistant it is to various classical attacks et cetera, but then the citations immediately lead to another paper (*Classical Attacks on a Variant of the RSA Cryptosystem*) with a rather dry counter:

> They claimed that the classical small private attacks on RSA such as Wiener's continued fraction attack do not apply to their scheme. In this paper, we show that, on the contrary, Wiener's method as well as the small inverse problem technique of Boneh and Durfee can be applied to attack their scheme.

In the end it was super anticlimactic because the whole thing was a bit ugh. The small inverse attack of course turns out to just be simple algebra. Duh. I might have figured it out on my own, but due to all the rabbit holes above, all the wishful thinking about how there was something cool about the group, the mental fatigue set in and I didn't even bother looking at the plain small-$d$ algebra.

I'm still convinced the special form of the primes leads to a backdoor in this cubic Pell group though. I mean, it has to, right? Like, why else? Why?

This task took the longest, like God knows how many hours, a day's amount of "work." But in the end it didn't feel worth it.

```
b'pbctf{I_love_to_read_crypto_papers_\x04+\x81\xf4-Th)Gj2m\x95\xc7\xd5\xe9\x8cZ\xaa\xcei\xc8u\xb3\xc3\x95\x9f\xdep\xae4\xcb\x10\xbdo\xd5\x83\x9c\xca\x1b3\xdee\xef\x89y\x07w"^\x1ez\x96\xb1\x1a\xd2\x9d\xc6\xfd\x1b\x8e\x1fz\x97\xba \x00\xf7l\xd4Yv\xb0\xd8\xb8\x0e\xf4\x93\xa4\x9fB\x97\xab\xd3eD\xa8\xc9\xa7x\x90r'
b"and_implement_the_attacks_from_them}\xfb\x03\\\xdd\x9ch\x14\x89\x1d]\xfdf\xa8R\x81s\xf0\xbf\xfb\xa0\xe1\x90\xcfd\x82\xb4\xa5\x0b\x02\xc4r\x00wb|^\xd3\xf4\xb0N\xec\xf52\xe1\xb7\x9bF\x8dzW\xcbQ\xf3\xb7\xe7\x81N\x1e\\\xfb\x1c:\xbb'\x11\xadQ.\x8e [,\xdee\xd7\x86\x95\x1ff\x18\x16u\xe4\x95jPcn{\x9f"
```

Ehh.

(Edit/addendum: OK, after spoiling myself and reading other writeups etc., it's possible the small-$d$ stuff was the intention. I have a theory: the problem author probably/might have just come across the attack-paper above and thought it would be cool in and by itself, some classics against something novel, but didn't consider that the solvers would be so taken in by the cool group stuff, i.e. the novelty, that then coming back down to *blah blah… Coppersmith… blah blah* in the end would be a disappointment?)

[^1]: What I call exploratory mathematics, or experimental mathematics, especially when involving classical number theory.