Improved Struct Recovery Algorithms #8135
ReversingWithMe
started this conversation in
Ideas
Replies: 1 comment
-
@ReversingWithMe I just came across this post. Did you find anything else? I am looking into array bounds and type detection, so it would be helpful to know what you found.
-
I've looked at a couple of the Ghidra extensions/scripts for struct recovery. Some of them do inter-procedural analysis, and some derive type hints from instruction mnemonics.
I'm curious whether anyone has looked at DLA, the Data Layout Analysis from the folks doing revng (https://github.com/revng/revng/tree/develop/lib/DataLayoutAnalysis).
From limited testing and parsing of the results, it is able to recover some structs, but the hard struct recovery problems still defeat the algorithm: a struct containing a struct, structs vs. arrays, and aliasing around the first index in a struct.
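To make those hard cases concrete, here is a minimal sketch using Python's `ctypes` (not DLA or Ghidra code). The type names `Inner`, `Nested`, `Flat`, `WithArray` and the helper `leaf_offsets` are made up for illustration. It shows that a nested struct, a flat struct, and an array can produce exactly the same set of (offset, size) accesses, which is roughly all a layout-recovery pass sees in a stripped binary.

```python
import ctypes

class Inner(ctypes.Structure):
    _fields_ = [("a", ctypes.c_int32), ("b", ctypes.c_int32)]

class Nested(ctypes.Structure):
    # Case 1: a struct containing a struct.
    _fields_ = [("inner", Inner), ("c", ctypes.c_int32)]

class Flat(ctypes.Structure):
    # Same bytes as Nested, but with no inner struct boundary.
    _fields_ = [("a", ctypes.c_int32), ("b", ctypes.c_int32), ("c", ctypes.c_int32)]

class WithArray(ctypes.Structure):
    # Case 2: struct vs. array, three ints declared as one array.
    _fields_ = [("items", ctypes.c_int32 * 3)]

def leaf_offsets(struct_type, base=0):
    """Return sorted (offset, size) pairs for every scalar leaf (hypothetical helper)."""
    out = []
    for name, ftype in struct_type._fields_:
        offset = base + getattr(struct_type, name).offset
        if issubclass(ftype, ctypes.Structure):
            out.extend(leaf_offsets(ftype, offset))
        elif issubclass(ftype, ctypes.Array):
            elem = ctypes.sizeof(ftype._type_)
            out.extend((offset + i * elem, elem) for i in range(ftype._length_))
        else:
            out.append((offset, ctypes.sizeof(ftype)))
    return sorted(out)

# All three declarations yield the same access pattern.
same = leaf_offsets(Nested) == leaf_offsets(Flat) == leaf_offsets(WithArray)

# Case 3: aliasing at the first index. A pointer to Nested, to Nested.inner,
# and to Nested.inner.a all refer to offset 0.
first_field_aliases = Nested.inner.offset == 0 and Inner.a.offset == 0
```

Since the access patterns are identical, any recovery algorithm needs extra evidence (loop-indexed accesses, pointers passed to other functions, and so on) to pick one declaration over the others.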
In those screenshots, I was just seeing how I would expect it to perform. I did update the function signature and try the auto-struct identification from the context menu, which also failed for me. I am not certain that anything could be adapted from that code, as there is a whole other analysis stack in that tooling, and adapting it could devolve into gearshift or btighidra techniques. There is also no paper reference for that algorithm.
I also have not done extensive studies of running that inference code on binaries with symbols vs. stripped binaries to see how often it is accurate compared with Ghidra, or of how easily struct stubs could just be added to the type manager to see whether that helps clean up the decompilation.
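As a rough idea of what "stubs of structs" could look like, here is a small Python sketch that turns a list of observed (offset, size) accesses into a C struct declaration with explicit padding. The function name `struct_stub`, the `field_`/`pad_` naming, and the input format are all assumptions for illustration, not the output format of DLA or any Ghidra API. The resulting text would still have to be imported into the type manager by whatever route is convenient, and a C parser may apply its own alignment rules to it.

```python
def struct_stub(name, accesses):
    """Build a C struct stub from (offset, size) byte accesses (hypothetical helper)."""
    size_to_type = {1: "uint8_t", 2: "uint16_t", 4: "uint32_t", 8: "uint64_t"}
    lines = [f"struct {name} {{"]
    cursor = 0  # first byte not yet covered by a field
    for offset, size in sorted(set(accesses)):
        if offset < cursor:
            # Overlapping access (possible union or aliasing); skipped in this sketch.
            continue
        if offset > cursor:
            # Fill the unobserved gap with an explicit byte array.
            lines.append(f"    uint8_t pad_{cursor:x}[{offset - cursor}];")
        ctype = size_to_type.get(size)
        if ctype is None:
            lines.append(f"    uint8_t field_{offset:x}[{size}];")
        else:
            lines.append(f"    {ctype} field_{offset:x};")
        cursor = offset + size
    lines.append("};")
    return "\n".join(lines)

stub = struct_stub("recovered", [(0, 4), (8, 8)])
print(stub)
```

Comparing decompilation before and after applying such a stub would be one cheap way to measure whether the inferred layouts actually help.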
I haven't seen any of these tools natively handle local variables, outside of type propagation.
Outside of toy examples, most of my testing is against gcc-built coreutils binaries from the function detection test cases on GitHub.