the cr0wn

The UK’s Top Competitive Cyber Security Team

HSCTF 2020: Emojis (misc)

For this challenge we had to reverse a script written in an odd esolang based entirely on emojis:

emojis.txt

TLDR

Step 1: What Do the Emojis Mean Mason?

This emoji language is described at esolangs, called β€œEmoji-gramming” or simply β€œπŸ’»β€, it’s a language with 24 registers, addition, subtraction and conditional flow control. Helpfully, the author of this language has kindly provided an interpreter we can use to get a feel of things. Lemme just open it up… Oh god. It’s like magic number soup crossed with an obfuscation tier challenge. Well, we’ll just treat that as a reference implementation black box and do our own thing.

First things first, lets convert this emoji script hellscape into something a bit more human readable. Writing a disassembler (code at end) we produce the following output:

ADDR    OPCODE LHS    RHS   ORIGINAL
========================================
000:    MOV    r0     IN    (πŸ˜Šβ™ˆπŸŽ€)
001:    MOV    r1     IN    (πŸ˜Šβ™‰πŸŽ€)
002:    MOV    r2     IN    (πŸ˜Šβ™ŠπŸŽ€)
003:    MOV    r3     IN    (πŸ˜Šβ™‹πŸŽ€)
004:    MOV    r4     IN    (πŸ˜Šβ™ŒπŸŽ€)
005:    MOV    r5     IN    (πŸ˜Šβ™πŸŽ€)
006:    MOV    r6     IN    (πŸ˜Šβ™ŽπŸŽ€)
007:    MOV    r7     IN    (πŸ˜Šβ™πŸŽ€)
008:    MOV    r8     IN    (πŸ˜Šβ™πŸŽ€)
009:    MOV    r9     IN    (πŸ˜Šβ™‘πŸŽ€)
010:    MOV    r10    IN    (πŸ˜Šβ™’πŸŽ€)
011:    MOV    r11    IN    (πŸ˜Šβ™“πŸŽ€)
012:    MOV    r21    IN    (πŸ˜ŠπŸ•™πŸŽ€)

013:    IFEQ   r1     r21   (πŸ˜΅β™‰πŸ•™)
014:    SUB    r11    r0    (πŸ˜ˆβ™“β™ˆ)
015:    IFEQ   r1     r5    (πŸ˜΅β™‰β™)
016:    MOV    r2     r7    (πŸ˜Šβ™Šβ™)
017:    SUB    r1     r4    (πŸ˜ˆβ™‰β™Œ)
018:    ADD    r9     r6    (πŸ˜‡β™‘β™Ž)
019:    IFEQ   r4     r9    (πŸ˜΅β™Œβ™‘)
020:    MOV    r12    r3    (πŸ˜ŠπŸ•β™‹)
021:    SUB    r2     8     (πŸ˜ˆβ™ŠπŸ’–)
022:    MOV    r3     r21   (πŸ˜Šβ™‹πŸ•™)
023:    MOV    r21    r12   (πŸ˜ŠπŸ•™πŸ•)
024:    ADD    r1     r7    (πŸ˜‡β™‰β™)
025:    SUB    r1     r3    (πŸ˜ˆβ™‰β™‹)
026:    ADD    r0     r11   (πŸ˜‡β™ˆβ™“)
027:    ADD    r2     4     (πŸ˜‡β™ŠπŸ’ž)
028:    ADD    r3     2     (πŸ˜‡β™‹πŸ’•)
029:    IFEQ   r3     r5    (πŸ˜΅β™‹β™)
030:    MOV    r17    IN    (πŸ˜ŠπŸ••πŸŽ€)
031:    IFEQ   r4     r9    (πŸ˜΅β™Œβ™‘)
032:    ADD    r4     r9    (πŸ˜‡β™Œβ™‘)
033:    ADD    r11    1     (πŸ˜‡β™“πŸ’œ)
034:    SUB    r10    8     (πŸ˜ˆβ™’πŸ’–)
035:    ADD    r7     r8    (πŸ˜‡β™β™)
036:    SUB    r5     r6    (πŸ˜ˆβ™β™Ž)
037:    IFEQ   r10    4     (πŸ˜΅β™’πŸ’ž)
038:    ADD    r6     r8    (πŸ˜‡β™Žβ™)
039:    ADD    r8     8     (πŸ˜‡β™πŸ’–)
040:    SUB    r0     r2    (πŸ˜ˆβ™ˆβ™Š)
041:    SUB    r4     r11   (πŸ˜ˆβ™Œβ™“)
042:    ADD    r2     r2    (πŸ˜‡β™Šβ™Š)
043:    SUB    r7     r11   (πŸ˜ˆβ™β™“)
044:    IFEQ   r10    0     (πŸ˜΅β™’πŸ’”)
045:    SUB    r9     r1    (πŸ˜ˆβ™‘β™‰)

046:    MOV    OUT    r0    (πŸ˜ŠπŸ“’β™ˆ)
047:    MOV    OUT    r1    (πŸ˜ŠπŸ“’β™‰)
048:    MOV    OUT    r2    (πŸ˜ŠπŸ“’β™Š)
049:    MOV    OUT    r3    (πŸ˜ŠπŸ“’β™‹)
050:    MOV    OUT    r4    (πŸ˜ŠπŸ“’β™Œ)
051:    MOV    OUT    r5    (πŸ˜ŠπŸ“’β™)
052:    MOV    OUT    r6    (πŸ˜ŠπŸ“’β™Ž)
053:    MOV    OUT    r7    (πŸ˜ŠπŸ“’β™)
054:    MOV    OUT    r8    (πŸ˜ŠπŸ“’β™)
055:    MOV    OUT    r9    (πŸ˜ŠπŸ“’β™‘)
056:    MOV    OUT    r10   (πŸ˜ŠπŸ“’β™’)
057:    MOV    OUT    r11   (πŸ˜ŠπŸ“’β™“)
058:    MOV    OUT    r21   (πŸ˜ŠπŸ“’πŸ•™)

Opcodes:

MOV: LHS = RHS
or
MOV: LHS = user_input_character
or
MOV: output RHS

ADD: LHS = LHS + RHS
SUB: LHS = LHS - RHS
IFEQ: if LHS == RHS then skip next line else NOP

That makes things a bit clearer. There are 3 phases:

  1. Input Flag (13chars)
  2. Mess around the flag bytes
  3. Output the flag (13chars) (Given output: xB^r_En}INc4v)

At this point in a regular rev challenge I would reach for angr and let it do it’s thing, sadly, that’s not possible here… Or is it? My solution: write a shitty angr clone!. We can use z3 to symbolically pipe input into the challenge, and track the operations done on them before being output. Then at the end we constrain our symbolic outputs to the given outputs and we can then solve for the flag input! Just like real Angr!

An example end state showing how input flag characters are tracked and their constraints:

out[0]  = flag_in[0] + flag_in[11] - (flag_in[2] - 8 + 4)
out[1]  = flag_in[1] - flag_in[4] + flag_in[7] - flag_in[12]
out[2]  = flag_in[2] - 8 + 4 + flag_in[2] - 8 + 4
out[3]  = flag_in[12] + 2
out[4]  = flag_in[4] + flag_in[9] + flag_in[6] - (flag_in[11] + 1)
out[5]  = flag_in[5] - flag_in[6]
out[6]  = flag_in[6] + flag_in[8]
out[7]  = flag_in[7] + flag_in[8] - (flag_in[11] + 1)
out[8]  = flag_in[8] + 8
out[9]  = flag_in[9] + flag_in[6] - (flag_in[1] - flag_in[4] + flag_in[7] - flag_in[12])
out[10] = flag_in[10] - 8
out[11] = flag_in[11] + 1
out[12] = flag_in[3]

Constraints, notice how they match the IFEQ jumps in the disassembly above:

flag_in[1] == flag_in[12]
flag_in[1] == flag_in[5]
flag_in[4] != flag_in[9] + flag_in[6]
flag_in[4] != flag_in[9] + flag_in[6]
flag_in[10] - 8 != 4
flag_in[10] - 8 != 0
flag_in[12] + 2 != flag_in[5]

Excitedly running this new tool we get:

➜  emoji git:(master) βœ— python soln.py ./emoji.txt sim
[*] number of output states = 128
[-] Expected output isn't possible :(

HUH?! WHAT?? z3 thinks there isn’t a possible input that could produce this output. Well that can’t be right?

Broken Challenge

Convinced we’d made a mistake in the fancy solve script we went back to basics and worked backwards from the known state to the previous state, eventually finding the flag by luck, force, and guessing:

FLAG: flag{tr3v0r_pAck3r}

I wonder why we couldn’t solve this using z3? Well if we plug this real flag into the black box script from the esolang’s wiki WE GET A DIFFERENT OUTPUT

Actual flag output = x@^tΒΎ\x13\xa0}I\x82c4v ascii:(120, 64, 94, 116, 190, 19, 160, 125, 73, 130, 99, 52, 118)
GIVEN flag output = xB^r_En}INc4v ascii:(120, 66, 94, 114, 95, 69, 110, 125, 73, 78, 99, 52, 118)

That explains why z3 thought our output was unsatisfiable, it’s because it was. Re-running with the proper output z3 happily gives us a satisfied model:

➜  emoji git:(master) βœ— python soln2.py ./emoji.txt sim
[*] number of states = 128
[*] Got one!
flag_in[0] = t
flag_in[1] = r
flag_in[2] = 3
flag_in[3] = v
flag_in[4] = 0
flag_in[5] = r
flag_in[6] = _
flag_in[7] = p
flag_in[8] = A
flag_in[9] = c
flag_in[10] = k
flag_in[11] = 3
flag_in[12] = r
flag_in[13] = !

We could have been done so much sooner. 1/10.

TLDR

Broken challenge, the given output is wrong. Use manual spreadsheet techniques to get partway to the flag then guess until the scoreboard says you’re right. sigh.

Solution script

Features:
Emoji-gramming Disasembler
Symbolic Solver
Bonus: Includes a Emoji-gramming interpreter! Pray tell it may be useful in future CTFs.