Reverse engineering the easy way
Imagine you have some third-party data storage device that you need to work with, and the only thing you have is a detailed description of the protocol the device uses. There is no source code available to make the job easier, so what is left is to implement the protocol by hand, with lots of trial and error. The next time a similar situation comes up, you repeat the whole difficult process again. But no worries: there is a tool that comes in handy when you have a file or a stream that you want to parse, and you want to do it fast.
Meet Kaitai Struct
First, here is the official description of Kaitai Struct:
Kaitai Struct is a domain-specific language (DSL) that is designed with one particular task in mind: dealing with arbitrary binary formats.
Parsing binary formats is hard, and there’s a reason for that: such formats were designed to be machine-readable, not human-readable. Even when one’s working with a clean, well-documented format, there are multiple pitfalls that await the developer: endianness issues, in-memory structure alignment, variable size structures, conditional fields, repetitions, fields that depend on other fields previously read, etc, etc, to name a few.
Kaitai Struct tries to isolate the developer from all these details and allow them to focus on the things that matter: the data structure itself, not particular ways to read or write it.
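To make those pitfalls concrete, here is a minimal sketch of parsing a small record by hand in Python, using only the standard `struct` module. The record layout (the `REC1` signature, the field names, and the `parse_record` helper) is invented for illustration and does not come from any real format or from Kaitai itself.

```python
import struct

MAGIC = b"REC1"  # hypothetical 4-byte signature, made up for this example


def parse_record(buf: bytes) -> dict:
    """Hand-parse one record of a made-up format.

    Layout (all integers little-endian):
      magic      4 bytes, must equal b"REC1"
      name_len   u2
      name       name_len bytes, UTF-8     (variable size)
      has_extra  u1
      extra      u4, only if has_extra != 0 (conditional field)
    """
    if buf[:4] != MAGIC:
        raise ValueError("bad magic")
    pos = 4
    (name_len,) = struct.unpack_from("<H", buf, pos)
    pos += 2
    # The size of this field depends on a field read earlier.
    name = buf[pos:pos + name_len].decode("utf-8")
    pos += name_len
    (has_extra,) = struct.unpack_from("<B", buf, pos)
    pos += 1
    extra = None
    if has_extra:
        (extra,) = struct.unpack_from("<I", buf, pos)
    return {"name": name, "extra": extra}


sample = MAGIC + struct.pack("<H", 3) + b"abc" + b"\x01" + struct.pack("<I", 42)
print(parse_record(sample))  # {'name': 'abc', 'extra': 42}

# Endianness pitfall: the same two bytes give two different lengths.
le = struct.unpack("<H", b"\x03\x00")[0]
be = struct.unpack(">H", b"\x03\x00")[0]
print(le, be)  # 3 768
```

Every offset, byte order, and condition above is tracked manually, which is exactly the bookkeeping a declarative description is meant to take over.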
Features
- The Kaitai Struct compiler runs on Linux and Windows. It is a JVM application, and macOS is supported as well.
- So far, Kaitai supports generating parsers in the following languages:
  - C++/STL
  - C#
  - Java
  - JavaScript
  - Perl
  - PHP
  - Python
  - Ruby
- If you want, you are welcome to add one more language to the list.
How to use Kaitai?
In short, to use Kaitai:
- You describe the data source you want to parse, such as a file system or an image format, in a .ksy file using declarative syntax.
- Then, using the Kaitai Web IDE or the Kaitai Struct compiler, you generate code in one of the supported languages, such as Java, C#, or C++.
- That’s it. Now use the generated code to get full access to your data source.
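The three steps above can be sketched as follows. The format `simple_record` and its fields are hypothetical, and the helpers `write_ksy` and `compile_command` are invented for this example. The script only writes the .ksy description and builds the compiler command line; it does not run the compiler, so it works even where Kaitai is not installed.

```python
from pathlib import Path

# Step 1: a declarative description of a made-up format.
# 4-byte signature, a length-prefixed name, and an optional trailing value.
KSY_TEXT = """\
meta:
  id: simple_record
  endian: le
seq:
  - id: magic
    contents: "REC1"
  - id: name_len
    type: u2
  - id: name
    type: str
    size: name_len
    encoding: UTF-8
  - id: has_extra
    type: u1
  - id: extra
    type: u4
    if: has_extra != 0
"""


def write_ksy(directory) -> Path:
    """Save the description as simple_record.ksy inside `directory`."""
    path = Path(directory) / "simple_record.ksy"
    path.write_text(KSY_TEXT, encoding="utf-8")
    return path


def compile_command(ksy_path, target="python", outdir="."):
    """Step 2: build the command line for the standalone compiler.

    Assumes the `kaitai-struct-compiler` executable is on PATH.
    """
    return ["kaitai-struct-compiler", "-t", target,
            "--outdir", str(outdir), str(ksy_path)]


print(" ".join(compile_command("simple_record.ksy")))

# Step 3 (not executed here): with the Python target, the compiler is expected
# to produce simple_record.py containing a SimpleRecord class, which needs the
# `kaitaistruct` runtime package and would be used roughly like this:
#
#   from simple_record import SimpleRecord
#   rec = SimpleRecord.from_file("data.bin")  # data.bin is a placeholder name
#   print(rec.name, rec.extra)
```

Note that the description contains no offsets or read loops: byte order is stated once in `meta`, and the variable-size and conditional fields are expressed with `size` and `if`.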
Kaitai REPL (Read–Eval–Print Loop)
To get a feel for what Kaitai is capable of, you can start by playing with the Kaitai REPL, which has a number of examples showcasing what can be achieved with it, such as parsing the doom.wad package file format.
Kaitai Web IDE
If you think you are ready to start applying Kaitai to real problems, jump into the Kaitai Web IDE, which is very nice and easy to use. You can upload your data source there and start writing a description of how it is organized.
This official wiki page will show you the main features of the Web IDE.
Kaitai Compiler standalone
It is possible to use the Kaitai compiler in standalone mode from the command line interface of your choice, be it on Linux, Windows, etc. How to do it is described here.
Resources
- Kaitai User Guide
- Kaitai Struct: KSY reference
- See which file formats have already been described with Kaitai; you may spare yourself the effort of re-implementing them.
- Kaitai presentation Dissecting media file formats with Kaitai Struct by Mikhail Yakshin
- Reverse Engineering of Binary Structures Using Kaitai Struct by Mikhail Yakshin
- Reverse engineering visual novels 101 by Masumi Nakamura
- Reverse engineering visual novels 101, part 2 by Masumi Nakamura