Ghidra #3
I already tried a nightly Ghidra build (there are some repos with prebuilt Windows binaries), sadly,
I was able to get a nightly build partially working. The batch import didn't work properly, and it also doesn't support a few kinds of relocations needed, but I was able to load a few object files manually. There's a fork of Ghidra that has a few more updates to the RISCV module, but I haven't had the chance to check it out yet: https://github.com/mumbel/ghidra/tree/riscv |
I've used the out-of-tree version here: https://github.com/mumbel/ghidra_riscv Seems to work okay, even if it doesn't have the exact ISA. In general, I'd recommend just giving Ghidra an ELF from the build instead of trying to get it to digest the raw object files. This is the built processor, extract to $GHIDRA_DIR -> Ghidra -> Processors: |
Ah okay, I didn't realize that version existed. That's much cleaner. I guess the reason I'd prefer reversing straight from the objects is so that we can more easily isolate the behavior of each API function. Maybe the goal is not so much to duplicate their API in C as it is to identify how to interact with the radios? I guess I'd like to see a clear goal established when it comes to the RE work. |
@micahswitzer This is a very good question. That said, I think the easiest way to test whether our RE'ed implementation works is simply to reimplement their API.
I have 3 BL602 boards on their way to me. I will make them remotely available if there are things we want to execute on real hardware. I also have an SDR that I can hook up to see what happens in the spectrum when we poke at registers.
@maidenone Well, how do you want to deal with flashing? AFAIK, the flash tools are closed source.
I have not thought about that. But given that it is a SiFive E25 core that uses JTAG, making OpenOCD talk to it should not be that hard? I have poked around with OpenOCD and BMP code before to add new targets.
My understanding of ghidra's decompiler is that it is written in C++ and doesn't depend on Java at all; but the linked repo appears to use Java..? I don't (yet) have Ghidra, nor Java, could someone present some decompilation samples over the objects in this repository? Yesterday, I experimented with adding RISC-V support to r2dec. The output is a naive translation of assembly to a pseudo-c; arguably not much better than the assembly itself. The result could be greatly improved with post-processing but r2dec isn't really designed for data-flow transformations, so this would be limited to trivial cases. This might be enough but I'd rather invest time in a new decompiler, which can use deep data-flow analysis to simplify the result. I'd start with RVSDG, an optimization-friendly data-flow intermediate representation in SSA-form without the total order of control-flow graphs. Such a decompiler could naturally be repurposed as an optimizing recompiler. Would anyone be interested in either working on improving an existing decompiler or working towards a new one? |
Yes, the decompiler itself is written in C++. However, the processor specifications are written in a DSL called SLEIGH (which describes each instruction's semantics as p-code), and the code that tells Ghidra how to load platform-specific relocations and DWARF information is written in Java. So yes, you can run the decompiler without Java (I believe radare can do that), but it's much more useful within Ghidra, with all of the tools that Ghidra has to offer. That being said, the RISC-V module for Ghidra is not quite production ready. In my incredibly brief testing, I noted that most of the non-trivial relocation types were not implemented. I also read that there were some other issues with the p-code that caused Ghidra to misinterpret the meaning of the assembly (not the disassembly itself). I'm relatively new to the RISC-V ISA, but I'd be willing to see if I could at least implement the missing relocations, which would greatly improve the usability of Ghidra for this project. I'm also willing to lend another set of eyes to such an effort should someone else with more experience want to tackle this issue with another RE platform.
I might be misunderstanding, but since the ELF here is built purely so the code can be pulled out of it for a raw flash image, it won't have any relocations - there isn't any code in the ROM that could load them anyway. So while the Ghidra RISCV processor doesn't support a lot of them, that doesn't matter if you load the ELF into it? |
You are correct that there will be no relocations in the final binary loaded into ROM. However, since the library code references other internal functions and data structures, relocations are necessary to allow for flexibility during the final linking step. I think you may be suggesting that we could simply compile and link a sample application which we could then RE since it would no longer have any relocations. If so, issue #6 suggests the same thing. |
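To make the relocation point concrete: in an unlinked object, instructions that carry an address are left as placeholders, and the linker (or a loader such as Ghidra's) patches them once the symbol address is known. Below is a minimal Python sketch of two common RV32 relocations, R_RISCV_HI20 and R_RISCV_LO12_I, applied to a `lui`/`addi` pair. The function names and the target address are illustrative assumptions, not code from Ghidra or the SDK.

```python
MASK32 = 0xFFFFFFFF

def apply_hi20(insn, value):
    """R_RISCV_HI20: patch bits 31:12 of a U-type instruction (e.g. lui).

    0x800 is added first because the paired low 12 bits are sign-extended
    by the instruction that consumes them.
    """
    hi20 = ((value + 0x800) >> 12) & 0xFFFFF
    return (insn & 0x00000FFF) | (hi20 << 12)

def apply_lo12_i(insn, value):
    """R_RISCV_LO12_I: patch bits 31:20 of an I-type instruction (e.g. addi)."""
    return (insn & 0x000FFFFF) | ((value & 0xFFF) << 20)

def resolve_pair(lui_insn, addi_insn):
    """Compute the address that a patched lui/addi pair materializes."""
    upper = lui_insn & 0xFFFFF000
    imm = addi_insn >> 20
    if imm & 0x800:            # sign-extend the 12-bit immediate
        imm -= 0x1000
    return (upper + imm) & MASK32

LUI_A0 = 0x00000537            # lui  a0, 0      (placeholder immediate)
ADDI_A0 = 0x00050513           # addi a0, a0, 0  (placeholder immediate)
target = 0x44C00FFC            # hypothetical symbol address

lui = apply_hi20(LUI_A0, target)
addi = apply_lo12_i(ADDI_A0, target)
print(hex(lui), hex(addi), hex(resolve_pair(lui, addi)))
```

Without these patches a tool only sees `lui a0, 0; addi a0, a0, 0`, which is why raw object files are hard to read when the relocation types are not implemented.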
Yes, please use the However, it seems that the decompiled result has some problems. As far as I can tell, some memory reads/writes are missing. I can see them in the assembly, but they disappear in the decompilation.
Reko, a capstone-based decompiler, might be a candidate. Unfortunately, it doesn't seem to understand RISC-V ELF relocations, at least. I don't know how much work is needed.
if you come across any disassembly/instruction issues (I just fixed a bug in edit: this was done in my side time, and I haven't used this module extensively so there are likely bugs, if anything looks off, comments are welcome, and hopefully 9.2 will be out soon so you don't have to build ghidra, but would for sure at least use my |
The first cond should be
as decompiled by Ghidra. This is actually an inlined function, and should have a name like
All other |
@Yangff Could you upload Ghidra's decompilation results for the 3 ELFs for everyone who doesn't have Ghidra set up?
yes, let me try. |
@Yangff I think the problem with disappearing writes is that Ghidra doesn't know about the memory-mapped PHY stuff. I fixed that by adding it to the Memory Map with Start 0x44c00000, Size 0xd000 and marking it Read/Write+Volatile. Check mdm_reset and you should then see two distinct writes instead of the previous coalesced one (which wouldn't have worked to reset the thing). I've also attached my notes on the various PHY registers there: |
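A toy Python model of why the Volatile flag matters. It is not Ghidra's actual analysis, only an illustration of dead-store elimination: a store is dropped when the next store overwrites the same address, unless the address is volatile. The PHY range comes from the comment above. The register offset and the written values are hypothetical.

```python
PHY_START = 0x44C00000     # from the comment above
PHY_SIZE = 0xD000

def is_volatile(addr):
    """True if addr lies inside the memory-mapped PHY block."""
    return PHY_START <= addr < PHY_START + PHY_SIZE

def simplify_stores(stores, volatile=is_volatile):
    """Drop a store if the very next store hits the same address,
    unless that address is volatile. `stores` is a list of (addr, value)."""
    out = []
    for i, (addr, value) in enumerate(stores):
        overwritten = i + 1 < len(stores) and stores[i + 1][0] == addr
        if overwritten and not volatile(addr):
            continue
        out.append((addr, value))
    return out

reset_reg = PHY_START + 0x888                      # hypothetical register offset
stores = [(reset_reg, 0x1111), (reset_reg, 0x0)]   # hypothetical reset sequence

# Treated as plain RAM: the first write looks dead and is removed.
print(simplify_stores(stores, volatile=lambda a: False))
# Treated as volatile MMIO: both writes survive.
print(simplify_stores(stores))
```

With the region treated as ordinary memory only the final write remains, which matches the coalesced write described above.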
Yes, @stschake could you create a PR with that text file? I think it would be incredibly useful to keep a running list of registers and their functions as we continue to RE the blobs. |
I've sent pine64/bl602-docs#18. There is another MMIO peripheral at 0x44b00000 (up to ~0x44b09000) that handles what the firmware calls mm, or MAC management.
The binaries @WildCryptoFox provided have exposed some bugs in Reko's RISC-V disassembler, specifically in the decoding of RISC-V compressed instructions. I'm working on fixes and will have something by the end of today.
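For annotating addresses while reading a listing, the two regions mentioned in this thread can be kept in a small lookup table. This is a sketch only: the region names are informal labels, and the end of the MAC block is the approximate value quoted above.

```python
# (label, start, end_exclusive)
MMIO_REGIONS = [
    ("MAC/mm", 0x44B00000, 0x44B09000),            # end is approximate
    ("PHY",    0x44C00000, 0x44C00000 + 0xD000),   # Start + Size from the thread
]

def region_of(addr):
    """Return 'label+0xoffset' for a known MMIO address, else None."""
    for label, start, end in MMIO_REGIONS:
        if start <= addr < end:
            return "%s+0x%x" % (label, addr - start)
    return None

for addr in (0x44C00004, 0x44B08FFC, 0x20000000):
    print(hex(addr), region_of(addr))
```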
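For context on the compressed-instruction problem: a RISC-V decoder first has to determine each instruction's length from the low two bits of its first 16-bit parcel (`11` means 32-bit, anything else means a 16-bit compressed instruction). Here is a minimal Python sketch of that step. It is not Reko's code, and it ignores the reserved encodings for instructions longer than 32 bits.

```python
import struct

def insn_length(halfword):
    """Return the instruction length in bytes from its first 16-bit parcel."""
    return 4 if (halfword & 0b11) == 0b11 else 2

def split_instructions(code):
    """Yield (offset, raw_int) for each instruction in little-endian bytes."""
    offset = 0
    while offset + 2 <= len(code):
        (first,) = struct.unpack_from("<H", code, offset)
        size = insn_length(first)
        if offset + size > len(code):
            break                      # truncated trailing instruction
        raw = int.from_bytes(code[offset:offset + size], "little")
        yield offset, raw
        offset += size

# c.nop (0x0001) followed by addi a0, a0, 0 (0x00050513)
code = bytes.fromhex("0100" "13050500")
for offset, raw in split_instructions(code):
    print(offset, hex(raw))
```

A decoder that gets this step wrong will misalign every following instruction, so mixed 16/32-bit code is a good stress test.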
I've implemented the relocations necessary to load the raw libraries/objects into Ghidra. I'm not 100% sure it works correctly, but everything was looking nice in my testing. I've attached a build of my version of the extension here. If you find any issues with relocations specifically, you can open an issue on my fork here. Now that I have them working, I can finally start doing some actual RE! |
@micahswitzer nice, that's a decent amount of relocations (not sure why I left all those TODO comments for the ones I implemented, maybe they were untested). Hadn't come across the need to handle unlinked ELFs until now, guessing most of those go away after linking, or did you see some unimplemented in the linked demos as well? I'd fork NSA's ghidra repo and submit a PR for the new additions, probably wouldn't make it into 9.2 (which should be released soon reading their comments about it), but at least 9.2.1 hopefully. Not sure when you forked mine, but what is currently in my ghidra_riscv is what is in ghidra repo (just in tree). |
Ghidra 9.2 was released today, and it includes RISC-V support.
@mumbel I just saw that. Great work on that feature, it will be very useful for this project! I will probably spend some time this weekend cleaning up my code so that I can submit a PR as you suggested. |
FYI: https://github.com/pine64/bl602-docs/tree/main/hardware_notes#rf-ip I found most of the code as source deep in various SDKs that people posted. As far as I can see most functions are in there and for all functions where I checked the behavior in the blob is the same as in the source. |
So given the source discovery, I'm not sure if any decompiling is still needed (I want to contribute but I'm having a hard time figuring out what exactly the intermediate goals are), but FWIW, you can add the 3 ROM sections mentioned in the link above to the memory map in Ghidra, and you can load a slightly modified version of the SVD with a slightly modified version of https://leveldown.de/blog/svd-loader/ (just comment out the sys.exit() call for non-cortex-m cpus), and it appears to work OK. |
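For anyone who wants to see what an SVD loader does with the file, here is a minimal Python sketch that turns peripherals and registers from a CMSIS-SVD document into an address-to-name map. The embedded sample is made up for illustration and is not taken from the BL602 SVD. Real files also use features this sketch ignores, such as `derivedFrom`, clusters, and register arrays.

```python
import xml.etree.ElementTree as ET

# Made-up sample in CMSIS-SVD layout (not from the BL602 SVD).
SAMPLE_SVD = """<device>
  <name>SAMPLE</name>
  <peripherals>
    <peripheral>
      <name>PERIPH_A</name>
      <baseAddress>0x40000000</baseAddress>
      <registers>
        <register><name>CTRL</name><addressOffset>0x0</addressOffset></register>
        <register><name>STATUS</name><addressOffset>0x4</addressOffset></register>
      </registers>
    </peripheral>
  </peripherals>
</device>"""

def svd_symbols(svd_text):
    """Map absolute register addresses to 'PERIPHERAL_REGISTER' names."""
    root = ET.fromstring(svd_text)
    symbols = {}
    for periph in root.iter("peripheral"):
        base = int(periph.findtext("baseAddress"), 0)
        pname = periph.findtext("name")
        for reg in periph.iter("register"):
            offset = int(reg.findtext("addressOffset"), 0)
            symbols[base + offset] = "%s_%s" % (pname, reg.findtext("name"))
    return symbols

for addr, name in sorted(svd_symbols(SAMPLE_SVD).items()):
    print(hex(addr), name)
```

Inside Ghidra, a loader script would create a label (and usually a volatile memory block) for each entry instead of printing it.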
Hi @rpavlik, at the moment there isn't any target, because we are waiting until Bouffalo officially, it should be at the end of this month according to this post. After that, we will decide and focus on spare blobs 😊
I cracked this binary open in Ghidra as soon as I found out about it. Eager to contribute if I have time. |
Previous work on RV32 and RV64:
https://delaat.net/rp/2019-2020/p49/report.pdf
https://reverseengineering.stackexchange.com/questions/22558/reversing-a-key-gen-firmware-for-risc-v
The Ghidra release does not support RISC-V, but a build from source does.