feat(r2): backtrace and control flow #1213
base: dev
examples/extensions/r2/hello_r2.py
Outdated
@@ -35,6 +35,7 @@ def my_sandbox(path, rootfs):
  ql.hook_address(func, r2.functions['main'].offset)
  # enable trace powered by r2 symsmap
  # r2.enable_trace()
+ r2.bt(0x401906)
Why do we need an extra argument?
The extra argument is the target address for setting the bt_hook, so we can print a backtrace without emu_error(). I will test _backtrace_fuzzy without an argument in emu_error().
insts = [Instruction(**dic) for dic in self._cmdj(f"pdj {n} @ {addr}")]
return insts

def _backtrace_fuzzy(self, at: int = None, depth: int = 128) -> Optional[CallStack]:
More arch support
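For readers unfamiliar with the idea, a fuzzy backtrace is usually built by scanning the stack for words that look like return addresses. The sketch below is a generic illustration under that assumption, not the code of _backtrace_fuzzy; the function name, the `return_sites` set (addresses that directly follow a call instruction) and the sample values are all hypothetical. The `word_size` parameter hints at what supporting more architectures involves.

```python
import struct
from typing import List, Set


def fuzzy_backtrace(stack: bytes, return_sites: Set[int],
                    word_size: int = 8, depth: int = 128) -> List[int]:
    """Collect stack words that match a known return site (little-endian only)."""
    fmt = {4: "<I", 8: "<Q"}[word_size]
    frames: List[int] = []
    # walk the snapshot one aligned word at a time, innermost frame first
    for off in range(0, len(stack) - word_size + 1, word_size):
        (value,) = struct.unpack_from(fmt, stack, off)
        if value in return_sites:
            frames.append(value)
            if len(frames) >= depth:
                break
    return frames


# hypothetical data: two return sites hidden among ordinary stack values
sites = {0x401906, 0x401a20}
snapshot = struct.pack("<4Q", 0x7ffe0010, 0x401906, 0x1234, 0x401a20)
print([hex(a) for a in fuzzy_backtrace(snapshot, sites)])
```

A big-endian or 32-bit target would need a different struct format, which is why the word layout is kept as a parameter rather than hard-coded.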
I am failing to understand why you had to change the memory module.
By changing the memory module, Qiling and r2 will share the same memory, which solves the previously mentioned memory sync problem.
Pfff.. things are getting more and more complicated. Having the memory content managed in Python code is not a great idea. I guess that is going to come with a big performance hit and tons of bugs. As a first example, the memory range splitting is handled wrong (ranges are split, but their data isn't).
For performance, I don't think there will be any overhead, as we are passing the raw pointer to Unicorn. The only difference is that the memory is now allocated by Python. @chinggg maybe you could do a small benchmark on
For bugs, the memory splitting is indeed buggy (and is probably what crashes the CI, @chinggg) but it can be fixed easily. I don't think it will introduce too many bugs, as we have enough tests. Note that even Unicorn wrote its own memory splitting instead of using any QEMU code.
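To make the splitting bug concrete, here is a minimal sketch of carving a hole out of a mapped range while slicing the backing data to match. The entry layout `(lbound, ubound, perms, label, data)` and the name `split_out` are simplifications made up for illustration and are not Qiling's actual structures.

```python
from typing import List, Tuple

# (lbound, ubound, perms, label, data): a simplified, hypothetical entry layout
Entry = Tuple[int, int, int, str, bytearray]


def split_out(entries: List[Entry], lo: int, hi: int) -> List[Entry]:
    """Remove [lo, hi) from every overlapping entry, slicing the data to match."""
    result: List[Entry] = []
    for lbound, ubound, perms, label, data in entries:
        if hi <= lbound or ubound <= lo:
            result.append((lbound, ubound, perms, label, data))  # no overlap
            continue
        if lbound < lo:
            # keep the head of the range together with the head of its data
            result.append((lbound, lo, perms, label, data[:lo - lbound]))
        if hi < ubound:
            # keep the tail of the range together with the tail of its data
            result.append((hi, ubound, perms, label, data[hi - lbound:]))
    return result


page = 0x1000
entries = [(0x1000, 0x4000, 7, "[demo]",
            bytearray(b"A" * page + b"B" * page + b"C" * page))]
left, right = split_out(entries, 0x2000, 0x3000)
# every surviving entry still satisfies len(data) == ubound - lbound
print(hex(left[0]), hex(left[1]), len(left[4]))
print(hex(right[0]), hex(right[1]), len(right[4]))
```

Slicing a bytearray copies it, so if the original buffer is shared with the emulator by pointer, the new pieces would presumably have to be remapped as well.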
I know little about Unicorn internals and cannot find a way to pass uc mem to r2, so if we want to sync memory between them we can allocate it either in r2 or in Python code. I admit it is weird to deal with memory management in Python, since that should be the strength of Unicorn. But after making a PoC of both approaches, it seems more acceptable than calling malloc in r2.
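As a rough illustration of what "memory allocated by Python, shared by raw pointer" can look like, the sketch below uses only ctypes. It does not call Unicorn or r2; the `memmove` merely stands in for a native library writing through the pointer, and the assumption is that the address in `ptr` is what would be handed to a call such as Unicorn's `mem_map_ptr`.

```python
import ctypes

size = 0x1000
backing = bytearray(size)                  # memory owned by Python

# a ctypes view over the same bytes; c_ubyte keeps values in 0..255
view = (ctypes.c_ubyte * size).from_buffer(backing)
ptr = ctypes.addressof(view)               # raw address a native library could be given

# a write through the raw pointer (standing in for the emulator) ...
ctypes.memmove(ptr, b"\x90\x90\xc3", 3)
# ... is visible from the Python side without any copy
print(backing[:3].hex())

# the same byte read through a signed view comes back negative
signed = (ctypes.c_byte * size).from_buffer(backing)
print(view[0], signed[0])
```

The last two lines show why an unsigned element type is the safer choice when the buffer is later turned back into bytes.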
d003c2e to 25f6ad7
in addition to 'invalid' instruction
3cd7ad8 to 99d21af
  return

  if mem_info is not None:
- self.map_info[info_idx] = (tmp_map_info[0], tmp_map_info[1], tmp_map_info[2], mem_info, tmp_map_info[4])
+ self.map_info[info_idx] = (tmp_map_info[0], tmp_map_info[1], tmp_map_info[2], mem_info, tmp_map_info[4], tmp_map_info[5])

  def get_mapinfo(self) -> Sequence[Tuple[int, int, str, str, str]]:
Data is missing.
Thanks for pointing this out, though there is currently no usage of change_mapinfo that passes data as an argument. When execution reaches this branch, it is likely to be a label change without any modification of perm or content, since the last if body will return.
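One way to keep a field from being dropped when the entry tuple grows is to update entries by name instead of rebuilding them index by index. This is only a sketch: `MapEntry`, its field names and `relabel` are hypothetical and are not the PR's MapInfoEntry.

```python
from typing import List, NamedTuple, Optional


class MapEntry(NamedTuple):
    # field names are illustrative, not Qiling's
    lbound: int
    ubound: int
    perms: int
    label: str
    is_mmio: bool
    data: Optional[bytearray]


def relabel(map_info: List[MapEntry], idx: int, label: str) -> None:
    # _replace carries every other field over, so nothing is lost
    # when a new element is added to the tuple later
    map_info[idx] = map_info[idx]._replace(label=label)


buf = bytearray(0x1000)
maps = [MapEntry(0x1000, 0x2000, 5, "[old]", False, buf)]
relabel(maps, 0, "[new]")
print(maps[0].label, len(maps[0].data))
```

Because a NamedTuple is still a tuple, code that unpacks entries positionally keeps working.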
b6f23fc to 1ce1ad0
It seems only two errors are left now in
BREAKING CHANGE: mem is managed in Python instead of uc
BREAKING CHANGE: MapInfoEntry now has 6 elements instead of 5
BREAKING CHANGE: r2 map io from ql.mem, no full binary, now missing symbols
BREAKING CHANGE: del_mapinfo and change_mapinfo recreate and remap mem
Add unit tests for ql mem operations
Also fix potential bug in syscall_munmap
data = self.read(lbound, ubound - lbound) # read instead of using data from map_info to avoid error
mem_dict['ram'].append((lbound, ubound, perm, label, data))
Actually, data in map_info will be different from the return value of uc.read here, which causes an error in ql.mem.save/restore, so as a temporary hack I read the memory again to avoid the error. But this seems to cover up some unseen inconsistency between map_info and uc.mem, which could potentially be the source of other errors.
BUG: mips32 uc map 0x9000000 become 0x1000000
I add
I find that execution may still succeed if these assertions are removed, since the dump shows only byte-level differences between uc mem and ql map_info.
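A small helper along the following lines could help locate such byte-level differences between two dumps. It is a generic sketch; `diff_ranges` and the sample buffers are invented for illustration and do not come from the PR.

```python
from typing import List, Optional, Tuple


def diff_ranges(a: bytes, b: bytes) -> List[Tuple[int, int]]:
    """Return [start, end) offset ranges where two equally sized buffers differ."""
    if len(a) != len(b):
        raise ValueError("buffers must have the same length")
    ranges: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y and start is None:
            start = i                      # a differing run begins
        elif x == y and start is not None:
            ranges.append((start, i))      # the run ended just before i
            start = None
    if start is not None:
        ranges.append((start, len(a)))     # run reaches the end of the buffer
    return ranges


cached = bytes.fromhex("00112233445566")   # e.g. data kept in map_info
actual = bytes.fromhex("0011ff33445500")   # e.g. data read back from the emulator
print(diff_ranges(cached, actual))         # → [(2, 3), (6, 7)]
```

Adding the region's lower bound to each offset gives the guest addresses to inspect.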
c_byte to c_ubyte to avoid error when r_buf_new
Checklist
Which kind of PR do you create?
Coding convention?
Extra tests?
Changelog?
Target branch?
One last thing
I have now implemented callstack and backtrace, though the features have not been fully tested yet. I am currently working on deflat as a showcase of the control flow analysis ability.