Reverse engineering the Linux OS, a first approach
(disassembling Linux)

by SiuL+Hacky

(15 October 1997)

Courtesy of fravia's page of reverse engineering

Well, another VERY remarkable essay, that I am proud to present. SiuL+Hacky tackles here NEW UNCOVERED ground, and teaches all of you the first elements of Linux reverse engineering... you would have thought, as I did, that such reversing would have been useless, since the main characteristic of Linux (and of the whole GNU initiative) was to give freely the source code of any program. Yet the deficiencies of Windoze are today so evident that more and more "commercial" programmers are turning to Linux despite all efforts by Gates's lackeys. And if you say "commercial" you say of course limited egotistical pusillanimous minds, that introduce their banal protection schemes even into the Linux world, until yesterday uncontaminated.
Enjoy this GREAT essay/tutorial by SiuL+Hacky, let's hope that he will send us more essays on this subject!

BTW, you'll find inside here dasm: a disassembler for Linux *WRITTEN* by SiuL+Hacky himself!

I. Linux Introduction.

Probably all of you know about Linux, but I don't know how many of 
you have Linux installed on your computers. I have (as many people do) 
both operating systems in different partitions of my hard disk. 
Sometimes people think of operating systems as religions (it tends to 
happen with editors too), so I'm not gonna tell you: INSTALL IT if you 
want your soul to be saved ! If you are not sure, after reading this 
document I think you will know for sure what to do.

A friend of mine told me a joke some time ago comparing operating 
systems with airlines. When you travel with Microsoft Airlines, you 
may find beautiful women at the check-in desk, you may enjoy amazing 
shows before departure, and when you climb into the aeroplane it is 
really comfortable and full of charming stewardesses. Ok, after 
take-off the aeroplane explodes and nobody knows why. When 
travelling with Unix Airlines you may travel safely, but the 
passengers must carry the pieces of the aeroplane themselves.

Unix is for you if you feel at home working in DOS boxes under 
Windows, if you are used to working in network environments, if you 
want speed and safety back (your brand-new Pentium acts like a 
Pentium, not like a 386) and if you find no excitement in configuring 
W95 programs. You may recover that bittersweet feeling of being in the 
middle of a desert island when things go wrong. But if you hate 
command line programs with thousands of switches, Unix is not for you.

One of the main characteristics of Linux is that it's a "free 
environment". The applications (and the kernel itself) are developed 
by people who offer them to "the world" completely free. Most 
applications are developed (more or less) under the GNU License. 
Moreover, a lot of the programs are provided with the source code 
(and you compile it). Though Linux has been ported to several 
platforms, it is especially popular on x86 computers, and many users 
come from DOS.

II. A Cracker inside Linux world.

Linux is cool for hacking, but I had never heard anything about 
cracking in linux. As I told you, software is free and there's no 
"bunch of shareware programmers". Imagine ... protecting a program and 
giving you the source code, really nonsense.

But wait, Linux is not perfect, and its programs are not beautiful 
and user-friendly. One of the problems I found from the start with 
Linux is multimedia. Multimedia is new in the DOS/Windows world, so 
the old Unix dinosaur, which hasn't changed in the last twenty years 
(though if you look inside "new" operating systems they are not that 
different), was not supposed to have a lot of multimedia support. I 
have a cheap Soundblaster clone, and I cannot make it "cry" through my 
speakers. I am not waiting for Dennis Ritchie to say "bye bye" when I 
log out, but I like to "play" with sound algorithms and other stuff. 

Surprisingly, in just one day I downloaded two sound programs with the 
same nasty protections as their DOS brothers. It is really strange, 
and I don't know if it is going to be usual in the future; probably it 
will depend on Microsoft (once more), and on whether it finally gets 
into the Linux world (for now it is just a rumour). Anyway, I decided 
to crack them.

In Linux, people usually program in C (the Linux kernel is written in 
C), and I found practically no assembler references. I had no idea 
whether cracking in Linux was gonna be easy or not, but the fact was 
that I had to start practically from scratch. Most of the utilities I 
found are binary utilities that come with GCC (the GNU C compiler), 
and that every Linux user may find in the different distributions or 
elsewhere on the Web. I didn't know of their existence, but I had 
them on my computer. Well, this is for you.

III. Tools of the trade.

Here you'll find some tools that I have found or made myself, and 
that will make cracking easier. Most of them are brothers of "Windoze" 
tools. First of all, a slight difference: mnemonics are named in a 
different way. I would say it's even better (Sacrilegious !), but 
anyway you'll have no problem getting used to these changes. You just 
have to be careful with 
operands, especially in mov instructions, because they are reversed, I 

mov source, destination
	instead of the usual DOS:
mov destination, source
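
To get used to the reversed operands, here is a minimal Python sketch that only swaps the operand ORDER of a listing line (source-first to destination-first). It does not translate the rest of the syntax: registers keep their % and immediates their $. The function names are made up for this illustration.

```python
def split_operands(operand_text):
    """Split an operand string on commas that are not inside parentheses."""
    operands, current, depth = [], "", 0
    for ch in operand_text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            operands.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        operands.append(current.strip())
    return operands


def reverse_operand_order(instruction):
    """Return the instruction with its operands in destination-first order."""
    mnemonic, _, rest = instruction.strip().partition(" ")
    operands = split_operands(rest)
    return (mnemonic + " " + ", ".join(reversed(operands))).strip()


print(reverse_operand_order("movl   %ecx,0xfffffc0c(%ebp)"))
print(reverse_operand_order("movb   $0x0,0xfffffc16(%ebp,%ecx,1)"))
```

The parenthesis counting matters because a memory operand such as 0xfffffc16(%ebp,%ecx,1) contains commas of its own.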

1) GDB. GNU Debugger.

The GNU compiler has its own debugger; it's called gdb and it even 
has a front-end for X Windows. It is neither Softice nor DOS Debug: 
it is designed to work with source code and executables that carry 
debug information. You can debug a program at the level of assembler 
instructions, but it is not comfortable. For example, you do not see 
the current assembler instruction, nor the registers. This is not 
meant to replace the gdb man page. There is a lot of useful information 
in books or INFO documents, but here you'll get some useful clues for 

It has some features that you cannot find in Softice; for instance, 
you can debug a program that is already running ! You may use the 
"attach" command for it. Gdb runs in a virtual console, so you may run 
your favorite programs while debugging.

Assembler instructions are executed with the "stepi" and "nexti" 
commands, but you cannot start the program with them. A running 
program is interrupted with Control-C, but you will not "surf" through 
every instruction of kernel code. Usually you'll stop the program (for 
instance while it waits for a key) inside a system call. Programs do 
not usually make system calls directly, because a kernel update could 
make them crash. They call C functions, and the C libraries (more or 
less like DLLs) make the system calls. If you want to see a 
disassembled listing, use the "disassemble" command ("disas" will do 
too) plus an address (0xaddress), though that address is only used to 
find a function (the one that owns the instruction at the given 
address), and gdb shows you the whole listing of that function from 
its start. That's not cool, you know, life is tough. At least you can 
see the current instruction with "display/i $eip". After breaking into 
the program use "continue" to resume execution.

The "display" command is also good for showing the value of a 
particular register (don't forget $ sign), but if you want to show all 
registers use "info registers". Finally if you want to change their 
value use "set $eax=3" for instance.

There's a wide range of breakpoints. You can set usual breakpoints "br 
*address", clear them, disable them, use conditional breakpoints 
(YES!), hardware breakpoints ... 

And finally the "backtrace" command is more or less like Softice's 
"stack", and "finish" should behave like 'p ret', but do not trust it 
very much. Well, there are lots of commands, study them, but after 
realizing 
the power of the dead approach, I'm sure you will not want gdb 


Strace is really a nice tool, especially for spying on a program and 
its behaviour. It logs every system call made by a program, WITH 
PARAMETERS, and in a way you'll love, as I'll show you afterwards. I 
like to use it this way:


where OUTPUT_FILE is the file where you want the log to be dumped.

-i: adds the value of eip at the time the call was made. It seems 
like bliss, but be careful: LIBRARIES USUALLY MAKE THE SYSTEM CALLS, not 


The strings utility should be a great tool, because it shows you the 
strings inside a binary file, and then you can identify the evil 
program that is punishing you, but there's a simpler and easier way to 
do it using the amazing "grep" command. For example, if you are 
looking for strings such as "Register", run this:

grep Register *

and it'll show you all the files in the current directory that 
contain the string "Register". But the first argument of this command 
is a general PATTERN, so it may be an exact match or a match as 
complicated as you want (learn REGULAR EXPRESSIONS for that).


What is a crack without a hex editor ? ("mental" cracking is hard, 
for now). There are very few of them in Unix (that I know of). Get 
one of them at:

It uses "VI"-style. You know, vi is the "official" editor in Unix. It 
seems that every "cool-unix-guy" must love it, or he'll be an 
"aficionado". I do prefer JOE, which "looks-like" old WordStar and old 
WordPerfect and you'll know how to quit the first time you run it :-).

Anyway, you may use, as I do, good DOS hex editors like Norton 
Diskedit (version 4 or 5). I'm not kidding, a DOS emulator (DOSEMU) is 
available in Linux, and it works fine with real mode and DOS4GW 
programs. There's a Windows emulator too, but it has long been in "an 
early alpha stage". Don't try it.


Well, at last a candle in the middle of the darkness. If it is 
difficult to find assembler references, finding disassembling 
references is like looking for Money 3.0 (perhaps FidoNet has the 
answer again :-). I found only one switch in objdump that gives a 
"dump disassembly".

This program gives you the information and data of the different 
sections (more about sections later) of a Linux object (executable) 
file. It is possible to get the assembler listing of a program you 
have written yourself (there's a switch in the compiler), but objdump 
is the only program I found that disassembles an arbitrary executable. 
The problem is that there's no analysis information in the 
disassembled file. Some switches of objdump:

-d: Displays the assembler mnemonics contained in the code Sections. 
Note that mnemonics are displayed in the "linux-way". Something like 

0804a37a repnz scasb %es:(%edi),%al
0804a37c notl   %ecx
0804a37e movl   %ecx,0xfffffc0c(%ebp)
0804a384 movb   $0x0,0xfffffc16(%ebp,%ecx,1)  
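
A note on those huge displacements: in this kind of listing a value like 0xfffffc0c(%ebp) is a negative offset from %ebp printed as an unsigned 32-bit number, which usually means a local variable. A small Python sketch of the conversion (the helper names are made up for this illustration):

```python
def signed32(value):
    """Interpret a 32-bit unsigned value as a two's-complement signed one."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def describe_displacement(value, base="%ebp"):
    """Render an unsigned displacement as 'base - offset' or 'base + offset'."""
    offset = signed32(value)
    sign = "-" if offset < 0 else "+"
    return f"{base} {sign} {abs(offset):#x}"


print(describe_displacement(0xfffffc0c))   # %ebp - 0x3f4
```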
I programmed dasm in PERL. Why ? Well, from my very first steps in 
Perl I realized it was perfect for processing text files (I knew 
nothing about sed, awk ...). The syntax is not very beautiful or 
high-level-looking, and it's an interpreted language, so it is not the 
fastest. Anyway, it always has the tools you are looking for (or 
always dreamt of) and lets you do a lot of things at the same time. 
It's very popular for CGI scripts. I learnt Perl and CGI from a very 
good book by Eric Herrmann. Sorry, I tried not to make it very 
cryptic, but PERL is PERL, and if you don't know Perl you probably 
won't understand it. For this reason I'll explain how it works.

BTW a perl interpreter (perl 5.0) may be found in any LINUX 
distribution, though interpreters for DOS are available too. Well 
let's start with jmp/call processing:

- The (DYNAMIC) SYMBOL TABLE is read and the elements are put into an
associative array indexed by the addresses. For instance:

- Then all call / jmp instructions are processed into another 
associative array, in this way:

- After this, the addresses of the instructions (from the .text 
section) are checked against the $jumping elements, and if an entry 
exists, the reference is written.

- In the same pass, call instructions are processed, and if they call 
a function from the symbol table, that is written too.
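
The jmp/call processing just described can be sketched in a few lines. This is NOT the Perl code of dasm, only a minimal Python illustration of the same two-pass idea; the function name, the regular expressions and the input format (address, mnemonic, operands, no comments) are assumptions, and the symbol table is passed in as a plain dictionary.

```python
import re
from collections import defaultdict

# One listing line: 8 hex digits of address, a mnemonic, then the operands.
LINE_RE = re.compile(r"^([0-9a-f]{8})\s+(\S+)\s*(.*)$")
TARGET_RE = re.compile(r"^[0-9a-f]{8}$")


def annotate(listing, symbols):
    """Add cross-reference comments to a plain disassembly listing.

    listing: text with lines such as '08052104 jl     08052110'
    symbols: dict mapping address strings to function names
             (a stand-in for the dynamic symbol table)
    """
    parsed = []
    for raw in listing.splitlines():
        match = LINE_RE.match(raw.strip())
        if match:
            parsed.append(match.groups())

    # First pass: remember, for every target address, who jumps or calls there.
    jumping = defaultdict(list)
    for addr, mnemonic, operands in parsed:
        if mnemonic.startswith("j") or mnemonic.startswith("call"):
            if TARGET_RE.match(operands.strip()):
                jumping[operands.strip()].append(addr)

    # Second pass: write the references just before the instructions.
    out = []
    for addr, mnemonic, operands in parsed:
        if addr in jumping:
            sources = " ; ".join(jumping[addr])
            out.append(f"Referenced from jump/call at {sources} ;")
        if mnemonic.startswith("call") and operands.strip() in symbols:
            out.append(f"Reference to function : {symbols[operands.strip()]}")
        out.append(f"{addr} {mnemonic:<6} {operands}".rstrip())
    return out


sample = """\
08052104 jl     08052110
08052106 movl   $0x0,0xfffffd84(%ebp)
08052110 cmpl   $0x0,0xfffffd84(%ebp)
08052120 call   08049138
"""
for line in annotate(sample, {"08049138": "printf"}):
    print(line)
```

Two passes are needed because a jump may point backwards or forwards, so every target has to be known before the first line is written.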

For string processing, we need further knowledge of how executables 
are built in Linux. The most common format is 32-bit ELF (Executable 
and Linkable Format). The structure of the object is :

* ...

These sections will be "segments" when the program is executed. Some 
important sections are .init (initialization code), .fini 
(termination code), .data (pretty obvious), .text (code), .rodata 
(read-only data), and so on. Do you remember lesson 8.1 and Win32 
exe files ? Don't you think it's pretty much the same ?

These are ELF-TYPES:

Elf32_Addr	4 bytes unsigned
Elf32_Half	2 bytes unsigned
Elf32_Off	4 bytes unsigned
Elf32_Sword	4 bytes signed
Elf32_Word	4 bytes unsigned

And the ELF header looks something like this:

typedef struct {
	unsigned char	e_ident[16];
	Elf32_Half	e_type;
	Elf32_Half	e_machine;
	Elf32_Word	e_version;
	Elf32_Addr	e_entry;
	Elf32_Off	e_phoff;
	Elf32_Off	e_shoff;
	Elf32_Word	e_flags;
	Elf32_Half	e_ehsize;
	Elf32_Half	e_phentsize;
	Elf32_Half	e_phnum;
	Elf32_Half	e_shentsize;
	Elf32_Half	e_shnum;
	Elf32_Half	e_shstrndx;
} Elf32_Ehdr;

For us, the important member is e_shoff, which holds the file offset 
of the Section Header Table. The SHT is an array of Elf32_Shdr 
structures. The member e_shnum gives the number of entries in the 
SHT, and e_shentsize gives the size in bytes of each entry. This is 
the Elf32_Shdr:

typedef struct {
	Elf32_Word	sh_name;
	Elf32_Word	sh_type;
	Elf32_Word	sh_flags;
	Elf32_Addr	sh_addr;
	Elf32_Off	sh_offset;
	Elf32_Word	sh_size;
	Elf32_Word	sh_link;
	Elf32_Word	sh_info;
	Elf32_Word	sh_addralign;
	Elf32_Word	sh_entsize;
} Elf32_Shdr ;
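
If you want to walk these two structures yourself, here is a minimal Python sketch that unpacks the Elf32_Ehdr and then every Elf32_Shdr entry, using e_shoff, e_shnum and e_shentsize as explained. It assumes a little-endian file (true for x86); the function names are made up for this illustration.

```python
import struct

# Little-endian layouts of the two structures (assumption: x86 ELF files
# are little-endian; files for other machines may not be).
EHDR_FORMAT = "<16sHHIIIIIHHHHHH"   # Elf32_Ehdr, 52 bytes
SHDR_FORMAT = "<10I"                # Elf32_Shdr, 40 bytes
EHDR_FIELDS = ("e_ident", "e_type", "e_machine", "e_version", "e_entry",
               "e_phoff", "e_shoff", "e_flags", "e_ehsize", "e_phentsize",
               "e_phnum", "e_shentsize", "e_shnum", "e_shstrndx")
SHDR_FIELDS = ("sh_name", "sh_type", "sh_flags", "sh_addr", "sh_offset",
               "sh_size", "sh_link", "sh_info", "sh_addralign", "sh_entsize")


def read_ehdr(data):
    """Unpack the ELF header found at the start of `data` (bytes)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    values = struct.unpack_from(EHDR_FORMAT, data, 0)
    return dict(zip(EHDR_FIELDS, values))


def read_shdrs(data, ehdr):
    """Unpack every entry of the Section Header Table."""
    headers = []
    for index in range(ehdr["e_shnum"]):
        offset = ehdr["e_shoff"] + index * ehdr["e_shentsize"]
        values = struct.unpack_from(SHDR_FORMAT, data, offset)
        headers.append(dict(zip(SHDR_FIELDS, values)))
    return headers


# Possible use (the file name is only an example):
#   with open("sndconf", "rb") as handle:
#       data = handle.read()
#   for shdr in read_shdrs(data, read_ehdr(data)):
#       print(hex(shdr["sh_addr"]), hex(shdr["sh_offset"]), hex(shdr["sh_size"]))
```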

The offset of each section is taken from its sh_offset member. The 
name of each section is a little bit more complicated, because sh_name 
is an index into the section header string table section. Well, stop, 
I don't want you to get confused. Fortunately, objdump gives us that 
information. Strings are located in the .rodata section (for obvious 
reasons), and objdump gives the file offset of the section. If you 
want complete information on the ELF format, there's a PostScript 
document for you:

There (or in any other mirror), you'll find a lot of interesting things.
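
For the curious, the section-name lookup (sh_name as an index into the section header string table) is not that bad either. Dasm simply takes this information from objdump; the Python sketch below only illustrates the direct route, assuming a 32-bit little-endian ELF image. The function name is made up for this illustration.

```python
import struct


def section_table(data):
    """Map section names to (sh_addr, sh_offset, sh_size) for a 32-bit
    little-endian ELF image held in `data` (bytes)."""
    # Only the Elf32_Ehdr members needed here, read at their fixed offsets.
    e_shoff, = struct.unpack_from("<I", data, 32)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", data, 46)

    headers = [struct.unpack_from("<10I", data, e_shoff + i * e_shentsize)
               for i in range(e_shnum)]

    # The string table with the names is itself a section;
    # e_shstrndx says which entry of the SHT describes it.
    strtab_offset = headers[e_shstrndx][4]      # its sh_offset

    table = {}
    for sh_name, _type, _flags, sh_addr, sh_offset, sh_size, *_rest in headers:
        start = strtab_offset + sh_name
        end = data.index(b"\x00", start)         # names are null-terminated
        table[data[start:end].decode("ascii")] = (sh_addr, sh_offset, sh_size)
    return table


# Possible use (the file name is only an example):
#   with open("sndconf", "rb") as handle:
#       addr, offset, size = section_table(handle.read())[".rodata"]
```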

Ok, then for string processing, dasm reads the offset of the .rodata 
section and gets its content from the binary file. We have the 
starting address and size of the .rodata section, so string 
processing goes like this:

- The whole .rodata section is read into a variable.
- Dasm looks for immediate operands (with the $ prefix) and checks 
whether they fall inside the .rodata section. 
- If so, the (null-terminated) string is extracted from the .rodata 
section, and the reference is written.
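
The three steps can be sketched like this. Again, this is not the Perl of dasm but a minimal Python illustration; the function names are made up, and the .rodata content and start address are passed in by hand instead of being read from objdump.

```python
import re

# An immediate operand in the listing: '$' followed by a hex number.
IMMEDIATE_RE = re.compile(r"\$0x([0-9a-f]+)")


def string_at(rodata, rodata_addr, address):
    """Return the null-terminated string stored at `address`, or None if
    the address lies outside the .rodata section."""
    if not rodata_addr <= address < rodata_addr + len(rodata):
        return None
    start = address - rodata_addr
    end = rodata.find(b"\x00", start)
    if end == -1:
        end = len(rodata)
    return rodata[start:end].decode("latin-1")


def string_references(listing, rodata, rodata_addr):
    """Yield (line, string) for every immediate that points into .rodata."""
    for line in listing.splitlines():
        for match in IMMEDIATE_RE.finditer(line):
            text = string_at(rodata, rodata_addr, int(match.group(1), 16))
            if text:
                yield line, text


# Toy data: 8 bytes of padding, then one string, loaded at 0x806fc00.
rodata = bytes(8) + b"License expired: %02d/%04d\x00"
listing = "0805211b pushl  $0x806fc08\n08052140 addl   $0x20,%esp\n"
for line, text in string_references(listing, rodata, 0x806fc00):
    print("Possible reference to string:")
    print('"' + text + '"')
    print(line)
```

Any immediate that happens to fall inside the section is reported, whether or not it is really a pointer. That is why the references are only "possible", and why small constants can produce bogus hits in the middle of a string.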

The rest is dirty details about format processing. The program calls
objdump itself, and you just have to use it this way:
	dasm  exec_file  processed_output_file
I've tested it with several programs, but if you find any bug, problem 
or you have any question, suggestion or whatever, report them to me 

NOTE: In dasm, I don't use the hex values of the instructions (switch
--show-raw-insn), because the output is not tabbed and it wastes disk
space. When we need this data, I'll show you how to get it easily.


To apply all this theory, we're gonna crack the couple of programs I 
told you about. I chose them because they are very different and 
appropriate for beginners, you'll see. The first one is a disabled 
program with password registration; the second one is a trial with two 
levels of time protection and the same nasty behaviour of its windows 


What the hell is this ? Well, it's an encoder/decoder for MPEG layer 
III. If you don't know about it, it's a standard for audio compression 
(a really exciting subject). Every time you run the decoder you're 
asked to enter a registration code, because sample rates and 
other features are restricted to "registered users".

Let's have some fun with the new tools: "strace -oSalida l3dec" will 
dump system calls in a file called Salida. Do it, answer that you 
don't want to enter Reg.Cod., and get something like this (filtered by 

write(2, "\n***    l3dec V2.70 ISO/MPEG Au"..., 71) = 71
write(2, "|                               "..., 71) = 71
write(2, "|           copyright Fraunhofer"..., 71) = 71
write(2, "|                               "..., 71) = 71

<<<< Look! It is writing the file header

open("./l3dec", O_RDONLY)       = 4  <<<< get current directory
close(4)                        = 0
open("./register.inf", O_RDONLY)=-1 ENOENT (No such file or directory)

<<< FILE sndconf
seconds of evaluation time left -> FILE modules/soundbase

The second file is not an executable; it is a "relocatable ELF file" 
(a module). No problem. It is logical: for a countdown, the protection 
must dwell in a resident program. This protection is a little bit more 
complicated than the first one, but it is not a tough protection at 
all. Dasm sndconf, and look for "License expired" (be indulgent with 
this long listing, trust me, it's easy):

08052101 cmpl   %esi,0x10(%eax);      <<<< some comparing
08052104 jl     08052110;             <<<< if not less flag=0
08052106 movl   $0x0,0xfffffd84(%ebp)

Referenced from jump/call at 080520f3 ; 08052104 ;
08052110 cmpl   $0x0,0xfffffd84(%ebp); <<< flag=1 seems to be good
08052117 jne    08052150;              <<< jump somewhere
08052119 pushl  %ebx;                  <<< the game is over outlaw!
0805211a pushl  %edi

Possible reference to string:
"License expired: %02d/%04d"
0805211b pushl  $0x806fc08

Reference to function : printf
08052120 call   08049138

Possible reference to string:
"Please download a fresh version from"
08052125 pushl  $0x806fb97

Reference to function : printf
0805212a call   08049138
0805212f pushl  %ebx
08052130 pushl  %edi

Possible reference to string:
"License expired: %02d/%04d"
08052131 pushl  $0x806fc08;   <<<< I love this formatted strings 
08052136 pushl  $0x807e6d0

Reference to function : fprintf
0805213b call   08049368
08052140 addl   $0x20,%esp
08052143 pushl  $0xffffffff

Reference to function : exit
08052145 call   08049598;    <<<<  beggar off
0805214a leal   0x0(%esi),%esi

Referenced from jump/call at 08052117 ;
                                      <<< Do you remember the flag ?
08052150 movl   $0x1,0xfffffd84(%ebp); <<< jump here if above flag=1
0805215a movl   0xfffffd94(%ebp),%eax
08052160 movl   %eax,0xfffffd80(%ebp)
08052166 decl   %eax
08052167 movl   %eax,0xfffffd94(%ebp)
0805216d movl   0xfffffd80(%ebp),%esi
08052173 decl   %esi
08052174 jns    08052186
08052176 decl   0xfffffd90(%ebp)
0805217c movl   $0xb,0xfffffd94(%ebp)

Referenced from jump/call at 08052174 ;
08052186 movl   0xfffffd7c(%ebp),%eax
0805218c movl   0x14(%eax),%edx
0805218f movl   0xfffffd90(%ebp),%ecx
08052195 cmpl   %ecx,%edx
08052197 jle    080521a3;             <<< jumping flag=0
08052199 movl   $0x0,0xfffffd84(%ebp);<<< flag=0 BAD GUY ! 

Referenced from jump/call at 08052197 ;
080521a3 cmpl   %edx,%ecx
080521a5 jne    080521c2
080521a7 movl   0xfffffd94(%ebp),%eax
080521ad movl   0xfffffd7c(%ebp),%esi
080521b3 cmpl   %eax,0x10(%esi)
080521b6 jl     080521c2;             <<< again
080521b8 movl   $0x0,0xfffffd84(%ebp)

Referenced from jump/call at 080521a5 ; 080521b6 ;
080521c2 pushl  %ebx
080521c3 pushl  %edi

Possible reference to string:
"License will expire after: %02d/%04d"
080521c4 pushl  $0x806fc24

Ejem, if flag=1 your license doesn't expire, and then there are lots 
of ways to end up with flag=0. Pretty obvious. Use your favorite 
dos/unix hex editor (or copy the file to your dos partition, reboot 
and run the damned Windoze hex editor) and do a general 
Search/Replace: 
(... objdump -d --show-raw-insn sndconf | grep 080521b)

c7 85 84 fd ff ff 00 00 00 00  movl $0x0,0xfffffd84(%ebp)
changes to:
c7 85 84 fd ff ff 01 00 00 00  movl $0x1,0xfffffd84(%ebp);ALWAYS GOOD!
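
If you have no hex editor at hand, the same general Search/Replace can be done with a few lines of Python. This is only a sketch: the function names and the backup suffix are made up for this illustration, and a blind replace like this one is only safe when the byte pattern is long enough not to hit unrelated code.

```python
OLD = bytes.fromhex("c7 85 84 fd ff ff 00 00 00 00")  # movl $0x0,0xfffffd84(%ebp)
NEW = bytes.fromhex("c7 85 84 fd ff ff 01 00 00 00")  # movl $0x1,0xfffffd84(%ebp)


def patch_all(image, old=OLD, new=NEW):
    """Replace every occurrence of `old` with `new`; return (patched, count)."""
    if len(old) != len(new):
        raise ValueError("a patch must not change the file size")
    return image.replace(old, new), image.count(old)


def patch_file(path, backup_suffix=".orig"):
    """Patch a file in place, keeping an untouched copy next to it."""
    with open(path, "rb") as handle:
        image = handle.read()
    patched, count = patch_all(image)
    with open(path + backup_suffix, "wb") as handle:
        handle.write(image)
    with open(path, "wb") as handle:
        handle.write(patched)
    return count


# Possible use:  print(patch_file("sndconf"), "occurrences patched")
```

Old and new patterns have the same length, so no offset in the file moves; only the immediate operand changes from 0 to 1.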

You'll notice that the message even disappears. But we must get rid of 
the countdown too. Dasm soundbase and look for "seconds" (you may see 
that this file has line information):

Possible reference to string:
"OSS: The evaluation time has elapsed. Please reload the driver."
<<<< if you're executing this part
<<<< you are a really bad guy

00005901 <sound_open_sw+71> pushl  $0x944
        RELOC: 00005902 R_386_32 .rodata; << look! objdump sometimes helps
00005906 <sound_open_sw+76> call   00005907 <sound_open_sw+77>

<<< movl   $0xffffffed,%eax

Possible reference to string:
"d: Driver partially removed. Can't open device" <<<< String references sometimes fail

00005910 <sound_open_sw+80> addl   $0x4,%esp
00005913 <sound_open_sw+83> popl   %ebx
00005914 <sound_open_sw+84> popl   %esi
00005915 <sound_open_sw+85> ret    
00005916 <sound_open_sw+86> leal   0x0(%esi),%esi
00005919 <sound_open_sw+89> leal   0x0(%esi,1),%esi

Referenced from jump/call at 000058ff ; 
00005920 <sound_open_sw+90> movl   0x0,%eax
		RELOC: 00005921 R_386_32 jiffies_R2f7c7437
00005925 <sound_open_sw+95> subl   %eax,%edx
00005927 <sound_open_sw+97> movl   %edx,%eax

Possible reference to string:
"en configured"
00005929 <sound_open_sw+99> movl   $0x64,%ecx
0000592e <sound_open_sw+9e> xorl   %edx,%edx
00005930 <sound_open_sw+a0> divl   %ecx,%eax
00005932 <sound_open_sw+a2> pushl  %eax

Possible reference to string:
"OSS: %d seconds of evaluation time left" <<< Here you are a not so good guy

00005933 <sound_open_sw+a3> pushl  $0x99e
		RELOC: 00005934 R_386_32 .rodata
00005938 <sound_open_sw+a8> call   00005939 <sound_open_sw+a9>
	RELOC: 00005939 R_386_PC32 printk_Rad1148ba; << printing what? 

Possible reference to string:
"river partially removed. Can't open device"

0000593d <sound_open_sw+ad> addl   $0x8,%esp

Referenced from jump/call at 000058e8 ; 000058ec ; 000058f6 ; 

00005940 <sound_open_sw+b0> movl   %ebx,%eax; <<<I want to jump here !

Look at this before seeing the rest of the code:
- If you are a not so good guy you come from 58ff
- You bypass the countdown message if you come from 58e8, 58ec or 58f6
- If you don't take any of these jumps you are a really bad guy.
It seems to be a REAL HOT AREA. Ok, you cannot wait any longer, I'll show you:

000058e0 <sound_open_sw+50> movl   0x1148,%edx
		RELOC: 000058e2 R_386_32 .data
000058e6 <sound_open_sw+56> testl  %edx,%edx
000058e8 <sound_open_sw+58> je     00005940; <<< FIRST OPPORTUNITY
000058ea <sound_open_sw+5a> testl  %ebx,%ebx
000058ec <sound_open_sw+5c> je     00005940; <<< movl   %ebx,%eax

Possible reference to string:
"artially removed. Can't open device"

000058f0 <sound_open_sw+60> andl   $0xf,%eax

Possible reference to string:
" Driver partially removed. Can't open device"

000058f3 <sound_open_sw+63> cmpl   $0x6,%eax
000058f6 <sound_open_sw+66> je     00005940; <<< THIRD ONE
000058f8 <sound_open_sw+68> movl   0x0,%eax
		RELOC: 000058f9 R_386_32 jiffies_R2f7c7437
000058fd <sound_open_sw+6d> cmpl   %edx,%eax
000058ff <sound_open_sw+6f> jbe    00005920; <<< LAST ONE EVEN BEING
                                             <<< A NOT S.G. GUY

If I'm honest, I don't like this variety. If you look for hits on the 
FIRST key variable, 0x1148 (apparently 0x1148=0 is a good thing), it 
is never (directly) assigned 0. I don't like that one; perhaps it 
works, but I prefer the other two options (which deal with the same 
thing). 

000058f0 <sound_open_sw+60> 83 e0 0f  andl   $0xf,%eax
000058f3 <sound_open_sw+63> 83 f8 06  cmpl   $0x6,%eax
000058f6 <sound_open_sw+66> 74 48     je     00005940
changes to:
000058f0 <sound_open_sw+60> 83 e0 0f  andl   $0xf,%eax
000058f3 <sound_open_sw+63> 83 f8 06  cmpl   $0x6,%eax
000058f6 <sound_open_sw+66> eb 48     jmp    00005940
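
Why does changing a single byte do the trick ? 0x74 is the opcode of a short je and 0xeb that of a short jmp; both are followed by one signed displacement byte, counted from the end of the 2-byte instruction. A small Python sketch to check the arithmetic (the function names are made up for this illustration):

```python
JE_SHORT = 0x74    # je rel8: jump only if the zero flag is set
JMP_SHORT = 0xEB   # jmp rel8: always jump


def short_jump_target(address, rel8):
    """Target of a 2-byte short jump located at `address`."""
    displacement = rel8 - 0x100 if rel8 & 0x80 else rel8   # sign-extend
    return address + 2 + displacement     # relative to the next instruction


def force_jump(code):
    """Turn 'je rel8' into 'jmp rel8', keeping the displacement byte."""
    if code[0] != JE_SHORT:
        raise ValueError("not a short je")
    return bytes([JMP_SHORT]) + code[1:]


print(hex(short_jump_target(0x58f6, 0x48)))       # 0x5940
print(force_jump(bytes.fromhex("74 48")).hex())   # eb48
```

The displacement 0x48 stays the same, so the patched instruction still lands on 00005940, only now unconditionally.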

It apparently works, and I say apparently 'cause I told you before 
that this buggy module doesn't work anyhow :-) 
Well, easy cracks for a new area. Good linuxing !

(c) SiuL+Hacky 1997. All rights reversed