Source code recovery is useful when the source code of a target application, driver, or firmware is not available. A typical approach consists of three steps: disassembly of the target, decompilation, and manual editing.
Disassembly is the first step: it transforms machine code into a human-readable equivalent in assembly language. 1stplugins uses first-class tools such as IDA Pro Advanced. Once disassembly is done, analysis can start either at the assembly language level or, if a decompiler is available, proceed to the next step and continue at the level of a higher-level language.
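To make the idea of disassembly concrete, here is a minimal sketch of a table-driven disassembler in Python. It is not how IDA Pro works internally; it only decodes a handful of single-byte x86-64 opcodes (push/pop of a 64-bit register, nop, ret, int3) and prints everything else as a raw data byte. The function name `disassemble` and the output layout are assumptions made for illustration.

```python
# Toy linear-sweep disassembler for a tiny subset of single-byte x86-64 opcodes.
# Illustrative only: real tools decode the full, variable-length instruction set.

# Register order used by the "opcode + register number" encodings.
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]

# Opcodes that take no operands.
SIMPLE = {0x90: "nop", 0xC3: "ret", 0xCC: "int3"}


def disassemble(code, base=0):
    """Return one text line per byte: address, raw byte, mnemonic."""
    lines = []
    for offset, byte in enumerate(code):
        if 0x50 <= byte <= 0x57:
            text = "push " + REG64[byte - 0x50]
        elif 0x58 <= byte <= 0x5F:
            text = "pop " + REG64[byte - 0x58]
        elif byte in SIMPLE:
            text = SIMPLE[byte]
        else:
            # Unknown to this toy table: emit the byte as data.
            text = f"db 0x{byte:02x}"
        lines.append(f"{base + offset:08x}  {byte:02x}  {text}")
    return lines


if __name__ == "__main__":
    # push rbp; nop; pop rbp; ret
    for line in disassemble(bytes.fromhex("55905dc3")):
        print(line)
```

A production disassembler additionally has to handle prefixes, multi-byte opcodes, operands, and the distinction between code and data, which is why dedicated tools are used for this step.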
Decompilation is the reverse of compilation and linking. The input is binary code or assembly language; the output is source code in a higher-level language such as C or Python. Although decompilation may appear to work effortlessly, it is tricky, and its output is rarely runnable as-is. This is due to information lost at compile and link time (function and variable names, comments, data types, structure and class layouts, the division into logical parts, …) and/or to later processing such as obfuscation or source code protection. Decompiler output needs to be edited further before it can be understood and/or run again.
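The effect of information loss can be illustrated by analogy with a small Python sketch. It is not a machine-code decompiler: it parses source into a syntax tree with the standard `ast` module and regenerates source from that tree (`ast.unparse`, Python 3.9 or later). The comment and the redundant parentheses do not survive the round trip, even though the recovered code behaves the same. The sample function `area` and the helper `round_trip` are made up for this example. Note that Python keeps identifiers here, whereas a native compiler producing a stripped binary typically discards those as well.

```python
import ast

# Hypothetical original source, including a comment and extra parentheses.
ORIGINAL = '''
def area(width, height):
    # Area of a rectangle, in square metres.
    return (width * height)
'''


def round_trip(source):
    """Parse source into a syntax tree and regenerate source text from it."""
    return ast.unparse(ast.parse(source))


recovered = round_trip(ORIGINAL)
print(recovered)

# The recovered text still runs and computes the same result,
# but the comment explaining the intent is gone for good.
original_ns, recovered_ns = {}, {}
exec(ORIGINAL, original_ns)
exec(recovered, recovered_ns)
print(original_ns["area"](2, 3) == recovered_ns["area"](2, 3))
```

With real binaries the loss is far greater, which is why decompiler output usually needs the manual work described next.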
Manual edits are necessary. The decompiler output needs to be validated because i) a lot of information is lost at compile and link time, and ii) the decompiler is not guaranteed to decompile correctly. Edits include reconstructing structure and class layouts and naming functions, methods, and local and global variables, among others. This step is the most complex of the three, as it has to be done manually and requires deep knowledge of platform internals.
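As a rough illustration of what reconstructing a structure layout means, the Python sketch below contrasts an anonymous offset-based access, similar in spirit to what raw decompiler output tends to contain, with the same data read through a named layout. The record format, the field names (`magic`, `version`, `flags`, `name`), and the helper names are all hypothetical; in practice the analyst infers them from how the code uses the data.

```python
import struct
from collections import namedtuple

# Hypothetical 16-byte little-endian record found in a firmware image:
# 4-byte magic, 2-byte version, 2-byte flags, 8-byte name.
HEADER_FORMAT = "<IHH8s"
raw = struct.pack(HEADER_FORMAT, 0xCAFE, 2, 1, b"sensor01")


# Before manual editing: an unnamed 16-bit read at offset 4.
def field_at_4(buf):
    return struct.unpack_from("<H", buf, 4)[0]


# After manual editing: the layout has been reconstructed and named.
DeviceHeader = namedtuple("DeviceHeader", "magic version flags name")


def parse_header(buf):
    """Decode the whole record into named fields."""
    return DeviceHeader._make(struct.unpack_from(HEADER_FORMAT, buf))


header = parse_header(raw)
print(field_at_4(raw))   # same value as header.version, but without a name
print(header)
```

Both accesses read the same bytes; the difference is that the named layout documents what the value means, which is the knowledge the manual step adds back.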