Guide to P-code Injection: Changing the intermediate representation of code on the fly in Ghidra

When we were developing the ghidra_nodejs module for Ghidra, we realized that it was not always possible to correctly implement V8 opcodes in SLEIGH (V8 is the JavaScript engine used by Node.js). In runtime environments such as V8 and the JVM, a single opcode might perform multiple complicated actions. To resolve this problem, Ghidra provides a mechanism for the dynamic injection of p-code constructs, p-code being Ghidra’s intermediate language. Using this mechanism, we were able to transform the decompiler output from this:

to this:


Creating a Ghidra processor module in SLEIGH using V8 bytecode as an example

Last year our team had to analyze V8 bytecode. Back then, there were no tools available to decompile such code or navigate it conveniently. We decided to try writing a processor module for the Ghidra framework. Thanks to the features of the language used to describe the output instructions, we obtained not only a readable set of instructions, but also a C-like decompiler. This article is a continuation of the series (1, 2) on our Ghidra plugin.

Several months went by between writing the processor module and writing this article. During this time, the SLEIGH specification remained unchanged, and the described module works on versions 9.1.2 – 9.2.2, which were released during the last six months.

On ghidra.re and in the documentation distributed with Ghidra there is a fairly good description of the capabilities of the language. These materials are worth reading before writing your own modules. The existing processor modules written by the framework’s developers can serve as excellent examples, especially if you already know the architecture they describe.

You can see in the documentation that the processor modules for Ghidra are written in SLEIGH, a language developed specifically for Ghidra and derived from the Specification Language for Encoding and Decoding (SLED). A SLEIGH specification describes how machine code is translated into p-code (the intermediate language that Ghidra uses to build decompiled code). As a language for describing processor instructions, it has a lot of limitations, although many of them can be worked around with the p-code injection mechanism, which is implemented as Java code.
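To make the split between a static specification and injected p-code more concrete, here is a minimal toy sketch in Python. It is not SLEIGH and it does not use Ghidra's API: the `PcodeOp` class, the `lift` function, the template tables, and the `CallRange` instruction are all hypothetical names invented for illustration. Only the operation names `COPY`, `INT_ADD`, and `CALL` are borrowed from p-code. The idea it tries to show is that simple instructions map to a fixed template, while an instruction whose expansion depends on its operand values needs a callback that builds the operations on the fly.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class PcodeOp:
    """One toy intermediate operation: output = mnemonic(inputs)."""
    mnemonic: str
    output: str
    inputs: Tuple[str, ...]


Lifter = Callable[[List[str]], List[PcodeOp]]

# Fixed templates: the kind of one-to-one mapping a declarative
# specification can express. "acc" stands for an accumulator register.
STATIC_TEMPLATES: Dict[str, Lifter] = {
    "Ldar": lambda ops: [PcodeOp("COPY", "acc", (ops[0],))],
    "Star": lambda ops: [PcodeOp("COPY", ops[0], ("acc",))],
    "Add": lambda ops: [PcodeOp("INT_ADD", "acc", (ops[0], "acc"))],
}


def inject_call_range(operands: List[str]) -> List[PcodeOp]:
    """Hypothetical injection callback.

    The number of emitted operations depends on the operand values
    (callee, first argument register, argument count), so a fixed
    template cannot describe it.
    """
    callee = operands[0]
    first_reg = int(operands[1][1:])  # "r2" -> 2
    count = int(operands[2])
    ops = [
        PcodeOp("COPY", f"arg{i}", (f"r{first_reg + i}",))
        for i in range(count)
    ]
    ops.append(PcodeOp("CALL", "acc", (callee,)))
    return ops


# Instructions that are expanded by callbacks instead of templates.
INJECTIONS: Dict[str, Lifter] = {"CallRange": inject_call_range}


def lift(mnemonic: str, operands: List[str]) -> List[PcodeOp]:
    """Translate one toy instruction into a list of toy operations."""
    if mnemonic in INJECTIONS:
        return INJECTIONS[mnemonic](operands)
    return STATIC_TEMPLATES[mnemonic](operands)


if __name__ == "__main__":
    for op in lift("CallRange", ["r0", "r2", "3"]):
        print(op)
```

In this sketch, `lift("Ldar", ["r1"])` always yields a single `COPY`, whereas `lift("CallRange", ...)` yields one `COPY` per argument followed by a `CALL`. In Ghidra itself the callbacks are written in Java and hooked into the decompiler rather than called from a lookup table like this one.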

The source code of the new processor module is available on GitHub. This article reviews the key concepts used in developing a processor module in pure SLEIGH, with certain instructions as examples. Working with the constant pool, p-code injections, the analyzer, and the loader will be or have already been covered in other articles. You can also read more about analyzers and loaders in The Ghidra Book: The Definitive Guide.


Decompiling Node.js in Ghidra

Have you ever wanted to find out how a program you often use, a game you play a lot, or the firmware of some real-time device actually works? If so, what you need is a disassembler. Better still, a decompiler. Things are pretty clear with x86–x64, Java, and Python, as there are plenty of disassemblers and decompilers to go around. With other languages, the situation is a little more complicated, and search engines will simply tell you ‘it can’t be done.’


How we bypassed bytenode and decompiled Node.js bytecode in Ghidra

I build robots for fun.

Rick Sanchez

It’s common knowledge that in 2019 the NSA decided to open source its reverse engineering framework known as Ghidra. Due to its versatility, it quickly became popular among security researchers. This article is one of many to come covering the technical details of the ghidra_nodejs plugin for Ghidra, developed by our team. The plugin’s main job is to parse, disassemble, and decompile Node.js Bytenode (.jsc) binaries. The focus of this article is the V8 bytecode and the relevant source code entities. A brief description of the plugin is also provided, which will be expanded upon in subsequent articles.


IDA Pro Tips to Add to Your Bag of Tricks

IDA Pro is the most widely used reverse engineering software in the industry. It can decompile the five most common architectures (x86/x64/ARM/PowerPC/MIPS), disassemble more than a hundred rare architectures, and debug most of them.

This article is a selection of my favorite tips for IDA Pro. Let’s get to it!
