SIGSEGV when using QemuForkExecutor in "arm" feature, and Unknown error: Unix error: ECHILD #2632

RongxiYe · 2024-10-25T08:26:19Z

The issue is present in the current main branch

$ git log | head -n 1
commit dfd5609c10da85f32e0dec74a72a432acd85310a

Describe the issue
I am practicing fuzzing on the httpd of a Tenda VC 15 router, which is a 32-bit ARM binary. I use a QemuForkExecutor, but I get an error when loading the initial inputs:

Failed to load initial corpus at ["./seed/"]

I printed the error:

if state.must_load_initial_inputs() {
    state
        .load_initial_inputs(&mut fuzzer, &mut executor, &mut mgr, &intial_dirs)
        .unwrap_or_else(|a| {
            println!("{}", a);
            println!("Failed to load initial corpus at {:?}", &intial_dirs);
            process::exit(0);
        });
    println!("We imported {} inputs from disk.", state.corpus().count());
}

and it says:

Unknown error: Unix error: ECHILD

I debugged the fuzzer and found that it receives a SIGSEGV in trace_edge_hitcount_ptr:

   715 pub unsafe extern "C" fn trace_edge_hitcount_ptr(_: *const (), id: u64) {
   716     unsafe {
   717         let ptr = LIBAFL_QEMU_EDGES_MAP_PTR.add(id as usize);
 ► 718         *ptr = (*ptr).wrapping_add(1);
   719     }
   720 }
   
pwndbg> p ptr
$1 = (*mut u8) 0x4d55bbdb022cd456
pwndbg> p *ptr
Cannot access memory at address 0x4d55bbdb022cd456

It seems that ptr holds an address that cannot be dereferenced. I know this function records coverage, but I don't know what "id" or "ptr" mean, so I read the related instrumentation code in qemu-libafl-bridge.

//$ git log | head -n 1
//commit 805b14ffc44999952562e8f219d81c21a4fa50b9

// in accel/tcg/cpu_exec.c, cpu_exec_loop
//// --- Begin LibAFL code ---

            bool libafl_edge_generated = false;
            TranslationBlock *edge;

            /* See if we can patch the calling TB. */
            if (last_tb) {
                // tb_add_jump(last_tb, tb_exit, tb);

                if (last_tb->jmp_reset_offset[1] != TB_JMP_OFFSET_INVALID) {
                    mmap_lock();
                    edge = libafl_gen_edge(cpu, last_tb->pc, pc, tb_exit, cs_base, flags, cflags);
                    mmap_unlock();

                    if (edge) {
                        tb_add_jump(last_tb, tb_exit, edge);
                        tb_add_jump(edge, 0, tb);
                        libafl_edge_generated = true;
                    } else {
                        tb_add_jump(last_tb, tb_exit, tb);
                    }
                } else {
                    tb_add_jump(last_tb, tb_exit, tb);
                }
            }

            if (libafl_edge_generated) {
                // execute the edge to make sure to log it the first execution
                // the edge will then jump to the translated block
                cpu_loop_exec_tb(cpu, edge, pc, &last_tb, &tb_exit);
            } else {
                cpu_loop_exec_tb(cpu, tb, pc, &last_tb, &tb_exit);
            }

            //// --- End LibAFL code ---

My understanding is: if libafl_gen_edge generates a new edge translation block, that block is executed first, and its hook records the edge in the coverage map by calling trace_edge_hitcount_ptr. (I use StdEdgeCoverageChildModule, and as far as I remember it uses the edge type hook.)
I also debugged this part of the code. Based on the layout of the TranslationBlock structure, these are the contents of the edge variable:

// edge->tc.ptr
pwndbg> p/x *itb
$7 = {
  pc = 0x40a23030,
  cs_base = 0x480,
  flags = 0x0,
  cflags = 0x800010,
  size = 0x1,
  icount = 0x1,
  tc = {
    ptr = 0x710ee4e00740,
    size = 0x38
  },
  itree = {
    rb = {
      rb_parent_color = 0xfec7058d4840804b,
      rb_right = 0x48fffff959e9ffff,
      rb_left = 0x4de9fffffebd058d
    },
    start = 0x40a23030,
    last = 0xffffffffffffffff,
    subtree_last = 0x0
  },
  jmp_lock = {
    value = 0x0
  },
  jmp_reset_offset = {0x20, 0xffff},
  jmp_insn_offset = {0x1c, 0xffff},
  jmp_target_addr = {0x710ee4e00500, 0x0},
  jmp_list_head = 0x710ee4e002c0,
  jmp_list_next = {0x0, 0x0},
  jmp_dest = {0x710ee4e00440, 0x0}
}

pwndbg> x/16x 0x710ee4e00740
0x710ee4e00740 <code_gen_buffer+1811>:  0x3456be48      0x43f7dbc5      0xbf484d55      0x7f076fa0

Note the value of tc.ptr here: it is <code_gen_buffer+1811>. The first eight bytes of machine code it points to are 0x43f7dbc53456be48, which gdb disassembles as movabs rsi, 0x4d5543f7dbc53456.
While tracing the code flow afterwards, I found that the fuzzer jumps to a small stub of generated code that prepares the hook's arguments (loading rdi and rsi) and then calls trace_edge_hitcount_ptr.

   0x5a457a1901df <cpu_exec_loop.isra+783>    mov    r12, qword ptr [r8 + 0x20]
   0x5a457a1901e3 <cpu_exec_loop.isra+787>    test   eax, 0x120
   0x5a457a1901e8 <cpu_exec_loop.isra+792>    jne    cpu_exec_loop.isra+1720     <cpu_exec_loop.isra+1720>
   0x5a457a1901ee <cpu_exec_loop.isra+798>    lea    rax, [rip + 0x3d2c0cb]         RAX => 0x5a457debc2c0 (tcg_qemu_tb_exec) —▸ 0x710ee4e00000 ◂— push rbp /* 0x5641554154415355 */

// R12 is 0x710ee4e00740 (code_gen_buffer+1811) ◂— movabs rsi, 0x4d5543f7dbc53456 /* 0x43f7dbc53456be48 */

   0x710ee4e00000                           push   rbp
   0x710ee4e00001                           push   rbx
   0x710ee4e00002                           push   r12
   0x710ee4e00004                           push   r13
   0x710ee4e00006                           push   r14
   0x710ee4e00008                           push   r15
   0x710ee4e0000a                           mov    rbp, rdi        RBP => 0x5a457f038920 ◂— 0x123fb400000000
   0x710ee4e0000d                           add    rsp, -0x488     RSP => 0x7ffffcc93560 (0x7ffffcc939e8 + -0x488)
   0x710ee4e00014                           jmp    rsi                         <code_gen_buffer+1811>
    ↓   
   0x710ee4e00740 <code_gen_buffer+1811>    movabs rsi, 0x4d5543f7dbc53456     RSI => 0x4d5543f7dbc53456
   0x710ee4e0074a <code_gen_buffer+1821>    movabs rdi, 0x5a457f076fa0         RDI => 0x5a457f076fa0 ◂— 0
►  0x710ee4e00754 <code_gen_buffer+1831>    call   qword ptr [rip + 0x16]      <libafl_qemu::modules::edges::trace_edge_hitcount_ptr>
        rdi: 0x5a457f076fa0 ◂— 0
        rsi: 0x4d5543f7dbc53456

This seems to indicate that the immediate following movabs rsi becomes the id. But the value I see here doesn't look right.
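As a cross-check, the same immediate can be read straight out of the tc.ptr memory dump shown earlier. The following is a minimal Rust sketch, assuming the dumped words are little-endian and that the byte pairs 48 be and 48 bf are the x86-64 encodings of movabs rsi, imm64 and movabs rdi, imm64. The helper names dump_bytes and movabs_rsi_imm are made up for illustration and are not part of LibAFL or QEMU.

```rust
/// The four 32-bit words printed by `x/16x 0x710ee4e00740` in the dump above.
const DUMP_WORDS: [u32; 4] = [0x3456be48, 0x43f7dbc5, 0xbf484d55, 0x7f076fa0];

/// Flatten the dump back into memory order (each word is little-endian).
/// Hypothetical helper, for illustration only.
fn dump_bytes(words: &[u32]) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(words.len() * 4);
    for w in words {
        bytes.extend_from_slice(&w.to_le_bytes());
    }
    bytes
}

/// If `code` starts with `movabs rsi, imm64` (bytes 48 be), return the immediate.
/// Hypothetical helper, for illustration only.
fn movabs_rsi_imm(code: &[u8]) -> Option<u64> {
    if code.len() < 10 || code[0] != 0x48 || code[1] != 0xbe {
        return None;
    }
    let mut imm = [0u8; 8];
    imm.copy_from_slice(&code[2..10]);
    Some(u64::from_le_bytes(imm))
}

fn main() {
    let bytes = dump_bytes(&DUMP_WORDS);
    match movabs_rsi_imm(&bytes) {
        Some(imm) => println!("movabs rsi, {:#x}", imm),
        None => println!("the dump does not start with movabs rsi"),
    }
    // The following instruction begins at offset 10 (48 bf = movabs rdi, imm64).
    println!("next opcode bytes: {:02x} {:02x}", bytes[10], bytes[11]);
}
```

If this reading is right, the id is a constant baked into the generated stub when the edge is translated, so the value seen in rsi was already chosen at generation time rather than computed at the moment of the crash.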

My questions are:

What does id actually represent?
How is it calculated?
How can I solve this problem?
Do I need to provide any additional information?

Thank you very much!


RongxiYe · 2024-10-28T02:47:49Z

Hi, I now know that the id is generated through libafl_qemu_hook_edge_gen -> create_gen_wrapper -> gen_hashed_edge_ids (in StdEdgeCoverageChildModule). I am debugging this part of the code now...

RongxiYe · 2024-10-28T03:37:54Z

I traced how the id is calculated, including the intermediate values. The calculated id is indeed 0x4d5543f7dbc53456. Do you think this is a problem?

// src is 0x40a23030, dest is 0x40a23058
*RAX  0x4a265dc83567e8c8 hash_me(src)
*RAX  0x7731e3feea2dc9e  hash_me(dest)

 ► 0x5af412b8763e <libafl_qemu::modules::edges::gen_hashed_edge_ids+174>    xor    rax, rcx                        RAX => 0x4d5543f7dbc53456 (0x4a265dc83567e8c8 ^ 0x7731e3feea2dc9e)
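The XOR step in the trace above can be reproduced by hand. This is a small Rust sketch that only re-does the final combination using the two hashed values copied from the debugger. It does not reimplement hash_me, and the function name combine_hashes is a made-up helper, not a LibAFL API.

```rust
/// XOR two already-hashed block addresses, mirroring the `xor rax, rcx`
/// instruction seen in gen_hashed_edge_ids. Hypothetical helper.
fn combine_hashes(hashed_src: u64, hashed_dest: u64) -> u64 {
    hashed_src ^ hashed_dest
}

/// Number of bits needed to represent `value`.
fn bit_width(value: u64) -> u32 {
    64 - value.leading_zeros()
}

fn main() {
    // Values copied from the debugger session (src = 0x40a23030, dest = 0x40a23058).
    let hashed_src: u64 = 0x4a265dc83567e8c8;
    let hashed_dest: u64 = 0x07731e3feea2dc9e;

    let id = combine_hashes(hashed_src, hashed_dest);
    println!("id = {:#x} ({} bits wide)", id, bit_width(id));
}
```

The result matches the id seen in rsi and is a full-width 64-bit value. Nothing in this step reduces it to the range of a coverage map.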

Given that LIBAFL_QEMU_EDGES_MAP_PTR is 0x761cc2856000, maybe the SIGSEGV happens because adding such a large id pushes the pointer far outside the map?

   715 pub unsafe extern "C" fn trace_edge_hitcount_ptr(_: *const (), id: u64) {
   716     unsafe {
   717         let ptr = LIBAFL_QEMU_EDGES_MAP_PTR.add(id as usize);
 ► 718         *ptr = (*ptr).wrapping_add(1);
   719     }
   720 }
$25 = 0x4d5543f7dbc53456
pwndbg> p/x LIBAFL_QEMU_EDGES_MAP_PTR
$26 = 0x761cc2856000
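To check this hypothesis, the pointer arithmetic can be worked through with the numbers from both debugging sessions. This is a sketch in Rust under stated assumptions: ASSUMED_MAP_SIZE is a hypothetical power-of-two map size chosen only for illustration (the real size was not inspected here), and masking the id is shown as one possible way to keep an index in range, not as what LibAFL actually does.

```rust
/// Hypothetical map size, for illustration only; the real value was not checked.
const ASSUMED_MAP_SIZE: u64 = 1 << 16;

/// Address produced by `LIBAFL_QEMU_EDGES_MAP_PTR.add(id)` for a u8 map,
/// computed on plain integers so nothing is dereferenced.
fn unmasked_address(map_base: u64, id: u64) -> u64 {
    map_base.wrapping_add(id)
}

/// Index obtained if the id were first reduced to the (power-of-two) map size.
fn masked_index(id: u64, map_size: u64) -> u64 {
    id & (map_size - 1)
}

fn main() {
    let id: u64 = 0x4d5543f7dbc53456;

    // Second session: map base printed by `p/x LIBAFL_QEMU_EDGES_MAP_PTR`.
    let map_base: u64 = 0x761cc2856000;
    let addr = unmasked_address(map_base, id);
    println!("unmasked address: {:#x}", addr);
    println!("inside assumed map: {}", addr - map_base < ASSUMED_MAP_SIZE);

    // First session: subtracting the same id from the faulting ptr gives a
    // page-aligned value that looks like a plausible map base.
    let faulting_ptr: u64 = 0x4d55bbdb022cd456;
    println!("implied base in first session: {:#x}", faulting_ptr.wrapping_sub(id));

    // With masking, the same id would land inside the assumed map.
    println!("masked index: {:#x}", masked_index(id, ASSUMED_MAP_SIZE));
}
```

The faulting ptr from the first session minus the id is a page-aligned address, which is consistent with ptr being map base plus an unreduced id. That suggests the id reaches trace_edge_hitcount_ptr without being reduced to the map size, although this has not been confirmed against the module's source.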

RongxiYe changed the title ~~SIGSEGV when using QemuForkExecutor in "arm" feature~~ SIGSEGV when using QemuForkExecutor in "arm" feature, and Unknown error: Unix error: ECHILD Oct 28, 2024
