arrow-left

All pages
gitbookPowered by GitBook
1 of 3

Loading...

Loading...

Loading...

Operations of the Other Bins

When a non-fast chunk is freed, it gets put into the Unsorted Bin. When new chunks are requested, glibc looks at all of the bins

  • If the requested size is fastbin size, check the corresponding fastbinarrow-up-right

    • If there is a chunk in it, return it

  • If the requested chunk is of smallbin size,

    • If there is a chunk in it, return it

  • If the requested chunk is large (of largebin size), with . We will get into the mechanisms of this at a later point, but essentially I lied earlier - fastbins do consolidate, but not on freeing!

  • Finally, we iterate through the chunks in the unsorted bin

    • If it is empty, we service the request through making the heap larger by , pushing the now-smaller top chunk to start at a higher memory address

  • If the requested size is equal to the size of the chunk in the bin, return the chunk

  • If it's smaller, split the chunk in the bin in two and return a portion of the correct size

  • If it's larger, we , and after this there may be free chunks big enough to service the request

    • If not, we again service the request by and pushing the now-smaller top chunk to start at a higher memory address

One thing that is very easy to forget is what happens on allocation and what happens on freeing, as it can be a bit counter-intuitive. For example, the fastbin consolidation is triggered from an allocation!

Freeing Chunks and the Bins

hashtag
An Overview of Freeing

When we are done with a chunk's data, the data is freed using a function such as free(). This tells glibc that we are done with this portion of memory.

In the interest of being as efficient as possible, glibc makes a lot of effort to recycle previously-used chunks for future requests in the program. As an example, let's say we need 100 bytes to store a string input by the user. Once we are finished with it, we tell glibc we are no longer going to use it. Later in the program, we have to input another 100-byte string from the user. Why not reuse that same part of memory? There's no reason not to, right?

Operations of the Fastbin

Fastbins are a singly-linked list of chunks. The point of these is that very small chunks are reused quickly and efficiently. To aid this, chunks of fastbin size do not consolidate (they are not absorbed into surrounding free chunks once freed).

A fastbin is a LIFO (Last-In-First-Out) structure, which means the last chunk to be added to the bin is the first chunk to come out of it. Glibc only keeps track of the HEAD, which points to the first chunk in the list (and is set to 0 if the fastbin is empty). Every chunk in the fastbin has an fd pointer, which points to the next chunk in the bin (or is 0 if it is the last chunk).

When a new chunk is freed, it's added at the front of the list (making it the head):

check the corresponding smallbinarrow-up-right
we first consolidate the largebinsarrow-up-right
malloc_consolidate()
splitting the top chunk
consolidate the fastbinsarrow-up-right
splitting the top chunk

It is the bins that are responsible for the bulk of this memory recycling. A bin is a (doubly- or singly-linked) list of free chunks. For efficiency, different bins are used for different sizes, and the operations will vary depending on the bins as well to keep high performance.

When a chunk is freed, it is "moved" to the bin. This movement is not physical, but rather a pointer - a reference to the chunk - is stored somewhere in the list.

hashtag
Bin Operations

There are four bins: fastbins, the unsorted bin, smallbins and largebins.

When a chunk is freed, the function that does the bulk of the work in glibc is _int_free()arrow-up-right. I won't delve into the source code right now, but will provide hyperlinks to glibc 2.3, a very old one without security checks. You should have a go at familiarising yourself with what the code says, but bear in mind things have been moved about a bit to get to there they are in the present day! You can change the version on the left in bootlin to see how it's changed.

  • First, the size of the chunk is checkedarrow-up-right. If it is less than the largest fastbin size, add it to the correct fastbinarrow-up-right

  • Otherwise, if it's mmapped, munmap the chunkarrow-up-right

  • Finally, consolidate themarrow-up-right and put them into the unsorted binarrow-up-right

What is consolidation? We'll be looking into this more concretely later, but it's essentially the process of finding other free chunks around the chunk being freed and combining them into one large chunk. This makes the reuse process more efficient.

hashtag
Fastbins

Fastbins store small-sized chunks. There are 10 of these for chunks of size 16, 24, 32, 40, 48, 56, 64, 72, 80 or 88 bytes including metadata.

hashtag
Unsorted Bin

There is only one of these. When small and large chunks are freed, they end of in this bin to speed up allocation and deallocation requests.

Essentially, this bin gives the chunks one last shot at being used. Future malloc requests, if smaller than a chunk currently in the bin, split up that chunk into two pieces and return one of them, speeding up the process - this is the Last Remainder Chunkarrow-up-right. If the chunk requested is larger, then the chunks in this bin get moved to the respective Small/Large bins.

hashtag
Small Bins

There are 62 small bins of sizes 16, 24, ... , 504 bytes and, like fast bins, chunks of the same size are stored in the same bins. Small bins are doubly-linked and allocation and deallocation is FIFO.

The purpose of the FD and BK pointers as we saw before are to points to the chunks ahead and behind in the bin.

Before ending up in the unsorted bin, contiguous small chunks (small chunks next to each other in memory) can coalesce (consolidate), meaning their sizes combine and become a bigger chunk.

hashtag
Large Bins

63 large bins, can store chunks of different sizes. The free chunks are ordered in decreasing order of size, meaning insertions and deletions can occur at any point in the list.

The first 32 bins have a range of 64 bytes:

Like small chunks, large chunks can coalesce together before ending up in the unsorted bin.

hashtag
Head and Tail

Each bin is represented by two values, the HEAD and TAIL. As it sounds, HEAD is at the top and TAIL at the bottom. Most insertions happen at the HEAD, so in LIFO structures (such as the fastbins) reallocation occurs there too, whereas in FIFO structures (such as small bins) reallocation occurs at the TAIL. For fastbins, the TAIL is null.

1st bin: 512 - 568 bytes
2nd bin: 576 - 632 bytes
[...]

The fd of the newly-freed chunk is overwritten to point at the old head of the list

  • HEAD is updated to point to this new chunk, setting the new chunk as the head of the list

  • Let's have a visual demonstration (it will help)! Try out the following C program:

    We get:

    As you can see, the chunk a gets reassigned to chunk f, b to e and c to d. So, if we free() a chunk, there's a good chance our next malloc() - if it's of the same size - will use the same chunk.

    It can be really confusing as to why we add and remove chunks from the start of the list (why not the end?), but it's really just the most efficient way to add an element. Let's say we have this fastbin setup:

    In this case HEAD points to a, and a points onwards to b as the next chunk in the bin (because the fd field of a points to b). Now let's say we free another chunk c. If we want to add it to the end of the list like so:

    We would have to update the fd pointer of b to point at c. But remember that glibc only keeps track of the first chunk in the list - it only has the HEAD stored. It has no information about the end of this list, which could be many chunks long. This means that to add c in at the end, it would first have to start at the head and traverse through the entire list until it got to the last chunk, then overwrite the fd field of the last chunk to point at c and make c the last chunk.

    Meanwhile, if it adds at the HEAD:

    All we need to do is:

    • Set the fd of c to point at a

      • This is easy, as a was the old head, so glibc had a pointer to it stored already

    • HEAD is then updated to c, making it the head of the list

      • This is also easy, as the pointer to c is freely available

    This has much less overhead!

    For reallocating the chunk, the same principle applies - it's much easier to update HEAD to point to a by reading the fd of c than it is to traverse the entire list until it gets to the end.

    #include <stdio.h>
    #include <stdlib.h>
    
    int main() {
        char *a = malloc(20);
        char *b = malloc(20);
        char *c = malloc(20);
        
        printf("a: %p\nb: %p\nc: %p\n", a, b, c);
    
        puts("Freeing...");
    
        free(a);
        free(b);
        free(c);
    
        puts("Allocating...");
    
        char *d = malloc(20);
        char *e = malloc(20);
        char *f = malloc(20);
    
        printf("d: %p\ne: %p\nf: %p\n", d, e, f);
    }
    a: 0x2292010
    b: 0x2292030
    c: 0x2292050
    Freeing...
    Allocating...
    d: 0x2292050
    e: 0x2292030
    f: 0x2292010
    HEAD --> a -> b
    HEAD --> a -> b -> c
    HEAD --> c -> a -> b