Relatively General .NET

Does code rot over time?

by Oren Eini

posted on: July 10, 2024

“This is Old Code” is a programmer’s idiom meaning “There Be Dragons”. The term “Legacy Code” is a nice way to say “Don’t make me go there”. Those are very strange statements when you think about it. Code is code, just ones & zeros stored on a disk somewhere. It doesn’t go bad over time.

When you write a line of code, it doesn’t have an expiration date, after all. For food, it makes sense: there are bacteria and such that would make it go bad. But what is it about old code that is so problematic?

I want to take a look at a couple of examples of old code and examine how they stood the test of time. I chose those two projects because there has been no activity on either project since about 2014 or so. No meaningful activity or changes for the past decade is a great place to start looking at how code rots. Note that I’m not looking at the age of a codebase, but whether it was left to pasture long enough to exhibit code rot issues.

Rhino.Mocks is a mocking framework for .NET that I spent close to a decade working on. It was highly popular for several years and was quite capable. The vast majority of the code, written about 15 years ago, is now frozen, and I haven’t touched it since around 2016.

I was able to clone the Rhino Mocks repository, run the build script and the tests in a single command. However… trying to actually use this in a modern system would result in an error similar to this one:

    Method not found: 'System.Reflection.Emit.AssemblyBuilder System.AppDomain.DefineDynamicAssembly(System.Reflection.AssemblyName, System.Reflection.Emit.AssemblyBuilderAccess)'.

Internally, Rhino Mocks does dynamic code generation, which relies on very low level APIs. Apparently, these APIs are not consistent between .NET Framework and .NET Core / the new .NET. To get Rhino Mocks working on the current version of .NET, we would need to actually fix those issues. That would require someone who understands how dynamic code generation and IL emitting work.
I remember facing a lot of InvalidProgramException in the past, so that isn’t a generally applicable skill.

ALICE is a tool for checking the crash correctness of databases and similar applications. It has a really interesting paper associated with it and was used to find several consistency issues with many systems (from databases to Git and Mercurial). The code was last touched in 2015, but the actual work appears to have happened just over ten years ago. ALICE made quite a splash when it came out, and many projects tried to test it against themselves to see if there were any issues with their usage of the file system APIs.

Trying to run ALICE today, you’ll run into several problems. It uses Python 2.x, which is no longer supported. Moving it to Python 3.x was not a big deal, but a much bigger problem is that ALICE is very closely tied to the syscalls of the kernel (it monitors them to see how the application uses the file system API). Since ALICE was released, new syscalls were introduced, and the actual I/O landscape has changed quite a bit (for example, with io_uring). Making it work, even for a relatively small test case, was not a trivial task.

The most interesting aspect of this investigation was not the particular problems that I found, but figuring out what the process of addressing them is. Just updating the code to the latest version is a mechanical process that is pretty easy.

Updating the actual behavior, however, would require a high degree of expertise. Furthermore, it would also need a good understanding of and insight into the codebase and its intended effects. A codebase that hasn’t been touched in a long while is unlikely to have anyone around with such insight. When we talk about a codebase rotting, we aren’t referring to the source files picking up viruses or the like; we are discussing the loss of information about what the code is actually doing.
Even worse, even if we can follow what the code is doing, understanding how to modify it is a high-complexity task.

What about ongoing projects? Projects that have continuous updates and dedicated team members associated with them. It turns out that they can rot as well. Here is an example taken from the RavenDB codebase. This is a pretty important method as it adds an item to a B+Tree, which is quite a common (and important) operation in a database:

You can see that this isn’t much of a function; most of the behavior happens elsewhere. However, you can see that this code has been around for a while. It was modified by four different people over the span of a decade. It is also pretty stable code, in terms of the number of changes that happened there.

This is a small function, but you can see the same pattern pretty clearly when you are looking at the code at large. There are whole sections that are just… there. They are functional and work well, and no one needs to touch them for a very long period of time. Occasionally, we make minor changes, but for the most part, they are not touched much at all.

How does that play into the notion of code rot? The code wouldn’t suffer as badly as the examples above, of course, since it is still being run and executed on an ongoing basis. However, the understanding of the code is definitely diminished. The question is, do we care? Those are the stable parts, the ones we don’t need to touch. Until we do, that is… and what happens then?

Just making changes in our codebase for the sake of making changes is a bad idea. But going into the codebase and leaving it in a better state than before is a good practice. This helps ensure it doesn’t become a daunting ‘there be dragons’ scenario.

From Microservices to Modular Monoliths

by Ardalis

posted on: July 10, 2024

What do you do when you find yourself in microservice hell? How do you keep the gains you (hopefully) made in breaking up your legacy ball…

C# 13: Explore the latest preview features

by Kathleen Dollard

posted on: July 09, 2024

C# 13 focuses on flexibility and performance, with top features like params collections for added flexibility, lock object for improved performance, and partial properties to support generators.

Failing to map: a tale of false hopes in mmap land

by Oren Eini

posted on: July 08, 2024

I usually talk about the things that I do that were successful. Today I want to discuss something that I tried but failed at. Documenting failed approaches is just as important, though less enjoyable, as documenting what we excel at.

In order to explain the failure, I need to get a bit deeper into how computers handle memory. There is physical memory, the RAM sticks that you have in your machine, and then there is how the OS and CPU present that memory to your code. Usually, the abstraction is quite seamless, and we don’t need to pay attention to it. Occasionally, we can take advantage of this model. Consider the following memory setup, showing a single physical memory page that was mapped in two different locations:

In this case, it means that you can do things like this:

    *page1 = '*';
    printf("Same: %d - Val: %c\n", (page1 == page2), *page2);
    // output is:
    // Same: 0 - Val: *

In other words, because the two virtual pages point to the same physical page in memory, we can modify memory in one location and see the changes in another. This isn’t spooky action at a distance; it is simply the fact that the memory addresses we use are virtual and they point to the same place. Note that in the image above, I modified the data using the pointer to Page 1 and then read it from Page 2.

The Memory Management Unit (MMU) in the CPU can do a bunch of really interesting things because of this. You’ll note that each virtual page is annotated with an access permission. In this case, the second page is marked as Copy on Write. That means that when we read from this page, the MMU will happily read the data from the physical page it is pointed to. But when we write, the situation is different. The MMU will raise an exception to the operating system, telling it that a write was attempted on this page, which is forbidden. At this point, the OS will allocate a new physical page, copy the data to it, and then update the virtual address to point to the new page.
Here is what this looks like:

Now we have two distinct mappings. A write to either one of them will not be reflected on the other. Here is what this looks like in code:

    *page1 = '1'; // write through page1, both views still share one physical page
    printf("Page1: %c, Page2: %c\n", *page1, *page2);
    // output: Page1: 1, Page2: 1
    *page2 = '2'; // force the copy on write to occur
    printf("Page1: %c, Page2: %c\n", *page1, *page2);
    // output: Page1: 1, Page2: 2

As long as the modifications happened through the first page address (the orange one in the image), there was no issue and any change would be reflected in both pages. When we make a modification to the second page (the green one in the image), the OS will create a new physical page and effectively split them forever. Changes made to either page will only be reflected in that page, not both, since they aren’t sharing the same page.

Note that this behavior applies at a page boundary. What happens if I have a buffer, 1GB in size, and I create a copy-on-write mapping on top of it? The amount of physical memory that I would consume is still just 1GB. In fact, I have effectively done a memcpy() very quickly, since I’m not actually copying anything. And for all intents and purposes, it works. I can change the data through the second buffer, and it would not show up in the first buffer. Of particular note is that when I modify the data on the second buffer, only a single page is changed. Here is what this looks like:

So instead of having to copy 1GB all at once, we map the buffer again as copy on write, and we get a new page whenever we actually modify our “copy” of the data.

So far, this is great, and it is heavily used for many optimizations. It is also something that I want to use to implement cheap snapshots of a potentially large data structure.
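The page-granular behavior described above can be sketched in C. This is a minimal illustration, not the code from the post: the function name `cow_page_demo`, the four-page buffer, and the temporary file under `/tmp` are all assumptions made for the example. It also leans on Linux behavior, since POSIX leaves it unspecified whether later writes to the underlying file are visible through a `MAP_PRIVATE` mapping.

```c
#define _DEFAULT_SOURCE
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define PAGE_COUNT 4 /* hypothetical buffer size, in pages */

/* Maps a temporary file twice (shared + private), touches one page of the
 * private view, then checks which private pages still follow writes made
 * through the shared view.
 * follows[i] == 1: page i is still backed by the shared physical page.
 * follows[i] == 0: page i was split off by copy-on-write.
 * Returns 0 on success, -1 if any setup step fails. */
int cow_page_demo(int follows[PAGE_COUNT])
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = page * PAGE_COUNT;

    char path[] = "/tmp/cow-demo-XXXXXX"; /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the file lives on only through fd */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    char *shared = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (shared == MAP_FAILED) {
        close(fd);
        return -1;
    }
    char *copy = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (copy == MAP_FAILED) {
        munmap(shared, len);
        close(fd);
        return -1;
    }

    for (int i = 0; i < PAGE_COUNT; i++)
        shared[i * page] = 'A';

    copy[1 * page] = 'X'; /* only page 1 of the "copy" gets its own physical page */

    for (int i = 0; i < PAGE_COUNT; i++)
        shared[i * page] = 'B'; /* write to every page through the shared view */

    for (int i = 0; i < PAGE_COUNT; i++)
        follows[i] = (copy[i * page] == 'B');

    munmap(copy, len);
    munmap(shared, len);
    close(fd);
    return 0;
}
```

Calling `cow_page_demo` from a `main` and printing the `follows` array should show, on Linux, that only the page written through the private view stopped tracking the shared one. The rest of the "copy" costs no extra physical memory until it is modified.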
Here is the kind of code that I want to write:

    var list = new CopyOnWriteList();
    list.Put(1);
    list.Put(2);
    var snapshot1 = list.CreateSnapshot();
    list.Put(3);
    var snapshot2 = list.CreateSnapshot();
    list.Put(4);

And the idea is that I’ll have (at the same time) the following:

    list       snapshot1   snapshot2
    1,2,3,4    1,2         1,2,3

I want to have effectively unlimited snapshots, and the list may contain a large amount of data. In graphical form, you can see it here:

We started with Page 1, created a Copy on Write for Page 2, modified Page 2 (breaking the Copy on Write), and then attempted to create a Copy on Write on top of Page 2. That turns out to be a problem.

Let’s see the code that we need in order to create a copy using copy-on-write mapping on Linux:

    int shm_fd = shm_open("/third", O_CREAT | O_RDWR, 0666);
    ftruncate(shm_fd, 4096);
    char *page1 = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, shm_fd, 0);
    page1[0] = 'A';
    page1[1] = 'B';
    // page1 = 'AB'
    char *page2 = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE, shm_fd, 0);
    // page2 = 'AB'
    page1[0] = 'a';
    // page1 = 'aB'
    // page2 = 'aB' (same page)
    page2[2] = 'C'; // force a private copy creation
    // page1 = 'aB'
    // page2 = 'aBC'
    page1[1] = 'b';
    // page1 = 'ab'
    // page2 = 'aBC' (no change here)

The code in Windows is pretty similar and behaves in the same manner:

    HANDLE hMapFile = CreateFileMapping(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE, 0, 4096, TEXT("Local\\MySharedMemory"));
    char* page1 = MapViewOfFile(hMapFile, FILE_MAP_READ | FILE_MAP_WRITE, 0, 0, 4096);
    page1[0] = 'A';
    page1[1] = 'B';
    // page1 = 'AB'
    char* page2 = MapViewOfFile(hMapFile, FILE_MAP_COPY, 0, 0, 4096);
    // page2 = 'AB'
    page1[0] = 'a';
    // page1 = 'aB'
    // page2 = 'aB' (same page)
    page2[2] = 'C'; // force a copy on write
    // page1 = 'aB'
    // page2 = 'aBC'
    page1[1] = 'b';
    // page1 = 'ab'
    // page2 = 'aBC' (no change here)

Take a look at the API we have for creating a copy-on-write mapping:

    MapViewOfFile(hMapFile, FILE_MAP_COPY, 0, 0, 4096); // windows
    mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE, shm_fd, 0); // linux

A key aspect of the API is that we need to provide a source for the Copy-on-Write operation. That means that we can only create a Copy-on-Write from a single source. We cannot perform a Copy-on-Write on top of a page that was marked as copy-on-write. This is because we cannot refer to it. Basically, I don’t have a source that I can use for this sort of mapping.

I tried being clever and wrote the following code on Linux:

    int selfmem = open("/proc/self/mem", O_RDWR);
    char *page2 = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE, selfmem, (off_t)page1);

On Linux, you can use the special file /proc/self/mem to refer to your memory using file I/O. That means that I can get a file descriptor for my own memory, which provides a source for my copy-on-write operation. I was really excited when I realized that this was a possibility. I spent a lot of time trying to figure out how I could do the same on Windows. However, when I actually ran the code on Linux, I realized that this doesn’t work. The mmap() call will return ENODEV when I try that. It looks like this isn’t a supported action.

Linux has another call that looks almost right, which is mremap(), but that either zeros out or sets up a userfaultfd handler for the region. So it can’t serve my needs.

Looking around, I’m not the first person to try this, but it doesn’t seem like there is an actual solution.

This is quite annoying since we are almost there. All the relevant pieces are available; if we had a way to tell the kernel to create the mapping, everything else should just work from there.

Anyway, this is my tale of woe, trying (and failing) to create a snapshot-based system using the Memory Management Unit. Hopefully, you’ll either learn something from my failure or let me know that there is a way to do this…
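For anyone who wants to reproduce the failed /proc/self/mem attempt, here is a slightly fuller sketch with error reporting. It is an illustration under stated assumptions rather than the exact code from the post: the function name `try_cow_from_self_mem` is hypothetical, and the source page is an anonymous mapping instead of the shared-memory mapping used earlier. It returns the errno of whichever step failed, so the rejection can be observed directly.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Tries to create a private (copy-on-write) mapping whose source is an
 * existing page of this process, addressed through /proc/self/mem.
 * Returns 0 if the mapping was created, otherwise the errno value of the
 * step that failed. */
int try_cow_from_self_mem(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    /* The page we would like to snapshot (anonymous memory, an assumption). */
    char *page1 = mmap(NULL, page, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page1 == MAP_FAILED)
        return errno;
    page1[0] = 'A';

    int selfmem = open("/proc/self/mem", O_RDWR);
    if (selfmem < 0) {
        int err = errno;
        munmap(page1, page);
        return err;
    }

    /* Use the page's own virtual address as the file offset. */
    char *page2 = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_PRIVATE,
                       selfmem, (off_t)(uintptr_t)page1);
    int err = (page2 == MAP_FAILED) ? errno : 0;

    if (page2 != MAP_FAILED)
        munmap(page2, page);
    close(selfmem);
    munmap(page1, page);
    return err;
}
```

Passing the returned value to `strerror()` shows why the kernel rejected the request. The post reports ENODEV for the mmap() call; the sketch does not hard-code that value, so it would also surface a different error if a given environment produces one.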

Activator.CreateInstance(Type) may return null

by Gérald Barré

posted on: July 08, 2024

Activator.CreateInstance(Type) allows you to create an instance of a type without knowing the type at compile time. For example, you might want to create an instance of a type based on a configuration file or a user input. Anytime you cannot use the new keyword you can use Activator.CreateInstance(

Reading unfamiliar codebases quickly: LMDB

by Oren Eini

posted on: July 05, 2024

Reading code is a Skill (with a capital letter, yes) that is really important for developers. You cannot be a good developer without it.

Today I want to talk about one aspect of this. The ability to go into an unfamiliar codebase and extract one piece of information out. The idea is that we don’t need to understand the entire system, grok the architecture, etc. I want to understand one thing about it and get away as soon as I can.

For example, you know that project Xyz is doing some operation, and you want to figure out how this is done. So you need to look at the code and figure that out, then you can go your merry way.

Today, I’m interested in understanding how the LMDB project writes data to the disk on Windows. This is because LMDB is based around a memory-mapped model, and Windows doesn’t keep the data between file I/O and mmap I/O coherent.

LMDB is an embedded database engine (similar to Voron, and in fact, Voron is based on some ideas from LMDB) written in C. If you are interested in it, I wrote 11 posts going through every line of code in the project. So I’m familiar with the project, but the last time I read the code was over a decade ago. From what I recall, the code is dense. There are about 11.5K lines of code in a single file, implementing the entire thing.

I’m using the code from here.

The first thing to do is find the relevant section in the code. I started by searching for the WriteFile() function, the Win32 API to write. The first occurrence of a call to this method is in the mdb_page_flush function.

I look at this code, and… there isn’t really anything there. It is fairly obvious and straightforward code (to be clear, that is a compliment). I was expecting to see a trick there. I couldn’t find it.

That meant either the code had a gaping hole and potential data corruption (highly unlikely) or I was missing something. That led me to a long trip of trying to distinguish between documented guarantees and actual behavior.
The documentation for MapViewOfFile is pretty clear:

    A mapped view of a file is not guaranteed to be coherent with a file that is being accessed by the ReadFile or WriteFile function.

I have had my own run-ins with this behavior, which were super confusing. This means that I had experimental evidence to say that this is broken. But it didn’t make sense: there was no code in LMDB to handle it, and this is pretty easy to trigger.

It turns out that while the documentation is pretty broad about not guaranteeing the behavior, the actual issue only occurs if you are working with remote files or using unbuffered I/O.

If you are working with local files and buffered I/O (which is 99.99% of the cases), then you can rely on this behavior. I found some vague references to this, but that wasn’t enough. There is this post that is really interesting, though.

I pinged Howard Chu, the author of LMDB, for clarification, and he was quick to assure me that yes, my understanding was (now) correct. On Windows, you can mix memory map operations with file I/O and get the right results.

The documentation appears to be a holdover from Windows 9x, with the NT line always being able to ensure coherency for local files. This is a guess about the history of the documentation, to be honest. Not something that I can verify.

I had the wrong information in my head for over a decade. I did not expect this result when I started this post; I was sure I would be discussing navigating complex codebases. I’m going to stand in the corner and feel upset about this for a while now.

Cloned Dictionary vs. Immutable Dictionary vs. Frozen Dictionary in high traffic systems

by Oren Eini

posted on: July 03, 2024

In my previous post, I explained what we are trying to do. Create a way to carry a dictionary between transactions in RavenDB, allowing one write transaction to modify it while all other read transactions only observe the state of the dictionary as it was at the publication time.

I want to show a couple of ways I tried solving this problem using the built-in tools in the Base Class Library. Here is roughly what I’m trying to do:

    IEnumerable<object> SingleDictionary()
    {
        var dic = new Dictionary<long, object>();
        var random = new Random(932);
        var v = new object();
        // number of transactions
        for (var txCount = 0; txCount < 1000; txCount++)
        {
            // operations in transaction
            for (int opCount = 0; opCount < 10_000; opCount++)
            {
                dic[random.NextInt64(0, 1024 * 1024 * 1024)] = v;
            }
            yield return dic; // publish the dictionary
        }
    }

As you can see, we are running a thousand transactions, each of which performs 10,000 operations. We “publish” the state of the transaction after each one.

This is just to set up a baseline for what I’m trying to do. I’m focusing solely on this one aspect of the table that is published. Note that I cannot actually use this particular code. The issue is that the dictionary is both mutable and shared (across threads); I cannot do that.

The easiest way to go about this is to just clone the dictionary. Here is what this would look like:

    IEnumerable<object> ClonedDictionary()
    {
        var dic = new Dictionary<long, object>();
        var random = new Random(932);
        var v = new object();
        // number of transactions
        for (var txCount = 0; txCount < 1000; txCount++)
        {
            // operations in transaction
            for (int opCount = 0; opCount < 10_000; opCount++)
            {
                dic[random.NextInt64(0, 1024 * 1024 * 1024)] = v;
            }
            // publish the dictionary
            yield return new Dictionary<long, object>(dic);
        }
    }

This is basically the same code, but when I publish the dictionary, I’m going to create a new instance (which will be read-only).
This is exactly what I want: to have a cloned, read-only copy that the read transactions can use while I get to keep on modifying the write copy.

The downside of this approach is twofold. First, there are a lot of allocations because of this. Second, the more items in the table, the more expensive it is to copy.

I can try using the ImmutableDictionary in the Base Class Library, however. Here is what this would look like:

    IEnumerable<object> ClonedImmutableDictionary()
    {
        var dic = ImmutableDictionary.Create<long, object>();
        var random = new Random(932);
        var v = new object();
        // number of transactions
        for (var txCount = 0; txCount < 1000; txCount++)
        {
            // operations in transaction
            for (int opCount = 0; opCount < 10_000; opCount++)
            {
                dic = dic.Add(random.NextInt64(0, 1024 * 1024 * 1024), v);
            }
            // publish the dictionary
            yield return dic;
        }
    }

The benefit here is that the act of publishing is effectively a no-op. Just send the immutable value out to the world. The downside of using immutable dictionaries is that each operation involves an allocation, and the actual underlying implementation is far less efficient as a hash table than the regular dictionary.

I can try to optimize this a bit by using the builder pattern, as shown here:

    IEnumerable<object> BuilderImmutableDictionary()
    {
        var builder = ImmutableDictionary.CreateBuilder<long, object>();
        var random = new Random(932);
        var v = new object();
        // number of transactions
        for (var txCount = 0; txCount < 1000; txCount++)
        {
            // operations in transaction
            for (int opCount = 0; opCount < 10_000; opCount++)
            {
                builder[random.NextInt64(0, 1024 * 1024 * 1024)] = v;
            }
            // publish the dictionary
            yield return builder.ToImmutable();
        }
    }

Now we only pay the immutable cost once per transaction, right? However, the underlying implementation is still an AVL tree, not a proper hash table. This means that not only is it more expensive for publishing the state, but we are now slower for reads as well.
That is not something that we want.

The BCL recently introduced a FrozenDictionary, which is meant to be super efficient for a really common case of dictionaries that are accessed a lot but rarely written to. I delved into its implementation and was impressed by the amount of work invested into ensuring that this will be really fast.

Let’s see what that would look like for our scenario, shall we?

    IEnumerable<object> FrozenDictionary()
    {
        var dic = new Dictionary<long, object>();
        var random = new Random(932);
        var v = new object();
        // number of transactions
        for (var txCount = 0; txCount < 1000; txCount++)
        {
            // operations in transaction
            for (int opCount = 0; opCount < 10_000; opCount++)
            {
                dic[random.NextInt64(0, 1024 * 1024 * 1024)] = v;
            }
            // publish the dictionary
            yield return dic.ToFrozenDictionary();
        }
    }

The good thing is that we are using a standard dictionary on the write side and publishing it once per transaction. The downside is that we need to pay a cost to create the frozen dictionary that is proportional to the number of items in the dictionary. That can get expensive fast.

After seeing all of those options, let’s check the numbers. The full code is in this gist.

I executed all of those using BenchmarkDotNet; let’s see the results.

    Method                             Mean             Ratio
    SingleDictionaryBench                   7.768 ms       1.00
    BuilderImmutableDictionaryBench       122.508 ms      15.82
    ClonedImmutableDictionaryBench        176.041 ms      21.95
    ClonedDictionaryBench               1,489.614 ms     195.04
    FrozenDictionaryBench               6,279.542 ms     807.36
    ImmutableDictionaryFromDicBench    46,906.047 ms   6,029.69

Note that the difference in speed is absolutely staggering. The SingleDictionaryBench is a bad example. It is just filling a dictionary directly, with no additional cost. The cost for the BuilderImmutableDictionaryBench is more reasonable, given what it has to do.

Just looking at the benchmark result isn’t sufficient. I implemented every one of those options in RavenDB and ran them under a profiler.
The results are quite interesting.

Here is the version I started with, using a frozen dictionary. That is the right data structure for what I want. I have one thread that is mutating data, then publishes the frozen results for others to use.

However, take a look at the profiler results! Don’t focus on the duration values; look at the percentage of time spent creating the frozen dictionary. That is 60%(!) of the total transaction time. That is… an absolutely insane number.

It is clear that the frozen dictionary isn’t suitable for our needs here. The ratio between reading and writing isn’t sufficient to justify the cost. One of the trade-offs of FrozenDictionary is that it is more expensive to create than a normal dictionary, since it is trying hard to optimize for reading performance.

What about the ImmutableDictionary? Well, that is a complete non-starter. It is taking close to 90%(!!) of the total transaction runtime. I know that I called the frozen numbers insane; I should have chosen something else, because now I have no words to describe this.

Remember that one problem here is that we cannot just use the regular dictionary or a concurrent dictionary. We need to have a fixed state of the dictionary when we publish it. What if we use a normal dictionary, cloned?

This is far better, at about 40%, instead of 60% or 90%.

You have to understand, better doesn’t mean good. Spending that share of the time on just publishing the state of the transaction is beyond ridiculous. We need to find another way to do this.

Remember where we started? The PageTable in RavenDB that currently handles this is really complex. I looked into my records and found this blog post from over a decade ago, discussing this exact problem. It certainly looks like this complexity is at least semi-justified. I still want to be able to fix this… but it won’t be as easy as reaching out to a built-in type in the BCL, it seems.