Cybersecurity Notes
MathematicsCryptography
  • Cybersecurity Notes
  • Binary Exploitation
    • Stack
      • Introduction
      • ret2win
      • De Bruijn Sequences
      • Shellcode
      • NOPs
      • 32- vs 64-bit
      • No eXecute
      • Return-Oriented Programming
        • Calling Conventions
        • Gadgets
        • Exploiting Calling Conventions
        • ret2libc
        • Stack Alignment
      • Format String Bug
      • Stack Canaries
      • PIE
        • Pwntools, PIE and ROP
        • PIE Bypass with Given Leak
        • PIE Bypass
      • ASLR
        • ASLR Bypass with Given Leak
        • PLT and GOT
        • ret2plt ASLR bypass
      • GOT Overwrite
        • Exploiting a GOT overwrite
      • RELRO
      • Reliable Shellcode
        • ROP and Shellcode
        • Using RSP
        • ret2reg
          • Using ret2reg
      • One Gadgets and Malloc Hook
      • Syscalls
        • Exploitation with Syscalls
        • Sigreturn-Oriented Programming (SROP)
          • Using SROP
      • ret2dlresolve
        • Exploitation
      • ret2csu
        • Exploitation
        • CSU Hardening
      • Exploiting over Sockets
        • Exploit
        • Socat
      • Forking Processes
      • Stack Pivoting
        • Exploitation
          • pop rsp
          • leave
    • Heap
      • Introduction to the Heap
      • Chunks
      • Freeing Chunks and the Bins
        • Operations of the Fastbin
        • Operations of the Other Bins
      • Malloc State
      • malloc_consolidate()
      • Heap Overflow
        • heap0
        • heap1
      • Use-After-Free
      • Double-Free
        • Double-Free Protections
        • Double-Free Exploit
      • Unlink Exploit
      • The Tcache
        • Tcache: calloc()
        • Tcache Poisoning
      • Tcache Keys
      • Safe Linking
    • Kernel
      • Introduction
      • Writing a Char Module
        • An Interactive Char Driver
        • Interactivity with IOCTL
      • A Basic Kernel Interaction Challenge
      • Compiling, Customising and booting the Kernel
      • Double-Fetch
        • Double-Fetch without Sleep
      • The Ultimate Aim of Kernel Exploitation - Process Credentials
      • Kernel ROP - ret2usr
      • Debugging a Kernel Module
      • SMEP
        • Kernel ROP - Disabling SMEP
        • Kernel ROP - Privilege Escalation in Kernel Space
      • SMAP
      • modprobe_path
      • KASLR
      • KPTI
    • Browser Exploitation
      • *CTF 2019 - oob-v8
        • The Challenge
      • picoCTF 2021 - Kit Engine
      • picoCTF 2021 - Download Horsepower
  • Reverse Engineering
    • Strings in C++
    • C++ Decompilation Tricks
    • Reverse Engineering ARM
  • Blockchain
    • An Introduction to Blockchain
  • Smart Contracts and Solidity
  • Hosting a Testnet and Deploying a Contract
  • Interacting with Python
  • Writeups
    • Hack The Box
      • Linux Machines
        • Easy
          • Traceback
        • Medium
          • Magic
          • UpDown
        • Hard
          • Intense
      • Challenges
        • Web
          • Looking Glass
          • Sanitize
          • Baby Auth
          • Baby Website Rick
        • Pwn
          • Dream Diary: Chapter 1
            • Unlink Exploit
            • Chunk Overlap
          • Ropme
    • picoGym
      • Cryptography
        • Mod 26
        • Mind Your Ps and Qs
        • Easy Peasy
        • The Numbers
        • New Caesar
        • Mini RSA
        • Dachshund Attacks
        • No Padding, No Problem
        • Easy1
        • 13
        • Caesar
        • Pixelated
        • Basic-Mod1
        • Basic-Mod2
        • Credstuff
        • morse-code
        • rail-fence
        • Substitution0
        • Substitution1
        • Substitution2
        • Transposition-Trial
        • Vigenere
        • HideToSee
    • CTFs
      • Fword CTF 2020
        • Binary Exploitation
          • Molotov
        • Reversing
          • XO
      • X-MAS CTF 2020
        • Pwn
          • Do I Know You?
          • Naughty
        • Web
          • PHP Master
      • HTB CyberSanta 2021
        • Crypto
          • Common Mistake
          • Missing Reindeer
          • Xmas Spirit
          • Meet Me Halfway
  • Miscellaneous
    • pwntools
      • Introduction
      • Processes and Communication
      • Logging and Context
      • Packing
      • ELF
      • ROP
    • scanf Bypasses
    • Challenges in Containers
    • Using Z3
    • Cross-Compiling for arm32
Powered by GitBook
On this page
  • Basic Strings
  • Small Object Optimization

Was this helpful?

Export as PDF
  1. Reverse Engineering

Strings in C++

Last updated 8 months ago

Was this helpful?

Basic Strings

Reversing C++ can be a pain, and part of the reason for that is that in C++ a std::string can be dynamically-sized. This means its appearance in memory is more complex than a char[] that you would find in C, because std::string :

  • Pointer to the allocated memory (the actual string itself)

  • Logical size of string

  • Size of allocated memory (which must be bigger than or equal to logical size)

The actual string content is dynamically allocated on the heap. As a result, std::string looks something like this in memory:

class std::string
{
    char* buf;
    size_t len;
    size_t allocated_len;
};

This is not necessarily a consistent implementation, which is why many decompilers don't recognise strings immediately - they can vary between compilers and different versions.

Small Object Optimization

Decompilers can confuse us even more depending on how they optimise small objects. Simply put, we would prefer to avoid allocating space on the heap unless absolutely necessary, so if the string is short enough, we try to fit it within the std::string struct itself. For example:

class std::string
{
    char* buf;
    size_t len;
    
    // union is used to store different data types in the same memory location
    // this saves space in case only one of them is necessary 
    union
    {
        size_t allocated_len;
        char local_buf[8];
    };
};

In this example, if the string is 8 bytes or less, local_buf is used and the string is stored there instead. buf will then point at local_buf, and no heap allocation is used.

An analysis of different compilers' approaches to Small Object Optimization can be found .

actually contains 3 fields
here