When i was working with a garbage collector, I got the need for a memory buffer which needs to be allocated in a contiguous way. Most of them knows how the malloc works and how you will get the pointers from it. Since its dynamically allocated, most of the time you get different addresses which are non-contiguous.
void *allocate_memory(size_t size) { void *p = malloc(size); return p; }
In the above code, p will hold random address in each request which cannot be accessed in a contiguous way. For example, Lets say you have given an interview question to construct a faster linked list which can also be accessed based on index, with allowed rise in memory cost. Then using malloc, its impossible to get addresses so that, you can do *(++p).
For doing the above kind of operation over pointers returned by malloc, we need to align the address returned by malloc. This concept of getting contiguous memory address can be achieved efficiently using TLB/Virtual address method. But simpler and hacky way is to align the addresses from malloc.
void *allocate_memory(size_t size, size_t alignment) { void *p; p=malloc(((size+alignment-1)&~(alignment-1))+sizeof(void *)+alignment-1); return p; }
The above code looks complex. :) But it is actually very simple and it gives aligned addresses always.
For example, Consider our linked list case. When we create a new linked list node, we will try aligning it with 64k.
struct Node* createNewNode() { struct Node *p; p=allocate_memory(sizeof(Node), 64 * 1024); return p; } void constructList() { struct Node *n1, n2; // create 2 nodes n1 = createNewNode(); n2 = createNewNode(); // assert if n2 is not equal to n1+1 ASSERT( n2 == n1+1) }The above code, makes it clear that, n1+1 will give you n2. Let me explain how this happens.
Update: thanks for the comment. The code works with only userspace allocators which by default gives logical memory which is contiguous only in few platforms. So, which means it works only for Windows. :) For getting contigious allocation in a platform and allocator independent way, we need a pool allocator above the system allocator. Instead of using malloc, use the a pool allocater to allocate chunks. I will add a simple algorithm for it soon.
- malloc (size) is the normal allocation code
- With this, allocate some more space of the required alignment, say, malloc(size + alignment)
- NOTE: alignment should be always even.
- It will be converted to odd. i.e, malloc(size + (alignment-1))
- The above means, instead of allocated space for size, you have allocated some padding size of (alignment-1)
Nice article.. To complement this article, I wish to add few cases, where it doesn't work!
ReplyDelete1. This works nicely for user-apps, which has its own (private) user-mode virtual memory. For kernel components, there is no guarantee. Bcoz, all kernel components share same virtual memory; between two successive allocations, some other kernel component can alloc memory, leading to non-contiguous!
2. Specifically, in Windows, kernel components, will always allocate memory in chunks of PAGE_SIZE. So, you need to tweak this program, to always align to PAGE_SIZE!
My best wishes to everyone.
Bye!