# Strings in C++

## Basic Strings

Reversing C++ can be a pain, and part of the reason for that is that in C++ a `std::string` can be dynamically-sized. This means its appearance in memory is more complex than a `char[]` that you would find in C, because `std::string` [actually contains 3 fields](https://shaharmike.com/cpp/std-string/):

* Pointer to the allocated memory (the actual string itself)
* Logical size of string
* Size of allocated memory (which must be bigger than or equal to logical size)

The actual string content is dynamically allocated on the **heap**. As a result, `std::string` looks something like this in memory:

```cpp
class std::string
{
    char* buf;
    size_t len;
    size_t allocated_len;
};
```

This is not necessarily  a *consistent* implementation, which is why many decompilers don't recognise strings immediately - they can vary between compilers and different versions.

## Small Object Optimization

Decompilers can confuse us even more depending on how they optimise small objects. Simply put, we would prefer to avoid allocating space on the heap unless absolutely necessary, so if the string is short enough, we try to fit it within the `std::string` struct itself. For example:

```cpp
class std::string
{
    char* buf;
    size_t len;
    
    // union is used to store different data types in the same memory location
    // this saves space in case only one of them is necessary 
    union
    {
        size_t allocated_len;
        char local_buf[8];
    };
};
```

In this example, if the string is 8 bytes or less, `local_buf` is used and the string is stored there instead. `buf` will then point at `local_buf`, and no heap allocation is used.

An analysis of different compilers' approaches to Small Object Optimization can be found [here](https://shaharmike.com/cpp/std-string/).


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://ir0nstone.gitbook.io/notes/rev/strings-in-c++.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
