Relatively General .NET

What bug through yonder weirdness my build breaks?

by Oren Eini

posted on: September 13, 2022

We have a lot of tests for RavenDB, and we run them on plenty of environments. We semi-frequently get a build failure when running on the “macOS latest” runner on GitHub. The problem is that the information I have is self-contradictory. Here is the most relevant piece:

    byte* address = allocation.Address;
    byte* current = _ptrCurrent;
    Debug.Assert(address != current, $"address != current ({new IntPtr(address)} != {new IntPtr(current)} [{nameof(_ptrCurrent)} = {new IntPtr(_ptrCurrent)}])");

    // On MacOS, this sometimes fails with:
    // Method Debug.Fail failed with 'address != current (140409565380608 != 140409565380608 [_ptrCurrent = 140409565380608])

Here you can see the failure itself and what is causing it. Note that the debug message shows that all three variables have the same numeric value. The address and current variables are also held on the stack, so there is no option for race conditions or anything like that. I can’t figure out any reason why this would be triggered in this case. About the only thing that comes to mind is whether there is some weirdness going on with pointer comparisons on MacOS, but I don’t have a lead to follow. We haven’t investigated it properly yet; I thought I would throw this to the blog and see if you have any idea what may be going on here.

Reducing complexity with a shift in thinking

by Oren Eini

posted on: September 12, 2022

I love B+Trees, but they can be gnarly beasts, given the number of edge cases that you can run into. Today’s story is about a known difficult place: page splitting in the tree. Consider the following B+Tree, showing a three-level tree with 3 elements on each page. Consider what will happen when we want to insert a new value into the tree, the value 27. Given the current state of the tree, that should go on the page marked in red: But there is no room for the new value on this page, so we have to split it. The tree will then look like so: we split the page, and now we need to add the new page to the parent, but that one also doesn’t have room for it: So we are now in a multi-level split process. Let’s see what this looks like when we go up the tree. This is the final state of the tree when we are done doing all the splits: The reason for all of this is that we need to add 27 to the tree, and we haven’t done that yet. At this stage, we got the tree back in order and we can safely add the new value to the tree, since we made sure we have enough space. However, note that the exact same process would apply if we were adding 27 or 29. The page that we’ll add them to, however, is different. This can be quite complex to keep track of, because of the recursive nature of the process. In code, this looks something like this:

    def add(self, key, value):
        page = self.findPageFor(key)
        self.addToPage(key, value)

    def addToPage(self, key, value):
        if self.cursor[self.pos].add(key, value):
            return
        # the current page retains the bottom half of entries
        # the newPage gets the top half of entries
        if self.pos == 0: # root page
            self.insertNewRootPage() # afterward, pos == 1
        newPage = self.cursor[self.pos].split()
        self.popPage()
        # now adding reference to child page
        self.addToPage(newPage.entries[0], newPage.pageNum)
        pos = self.searchPage(key) # find the relevant page for the entry
        self.pushPage(self.cursor[self.pos].entries[pos]) # go into the child page
        self.addToPage(key, value) # add in the right location

I am skipping some details, but that is the gist of it. So we do the split (recursively if needed) and then, after we have wired the parent page properly, we find the right location for the new value. An important aspect here is the cursor. We use that to mark our current location in the tree, so the cursor will always contain all the parent pages that we are currently searching upon. A lot of the work that we are doing in the tree is related to the cursor.

Now, look at the code and consider its behavior when we insert the value 29. It will correctly generate this page: However, what happens if we insert 27? Well, when we split the page, we went up the tree. And then we had another split, and then we went down another branch. So as written, the result would be adding the 27 to the same page as we would the 29. This would look like this: Look at the red markers. We put entry 27 on the wrong page. Fixing this issue is actually pretty hard, because we need to keep track of the values as we go up and down the tree. For fun, imagine what happens in this exact scenario, but when you have 6 levels in the tree and you end up in a completely different location in the tree.
I spent a lot of time struggling with this issue, including getting help from some pretty talented people, and the final conclusion we got was “it’s complicated”. I don’t want complications here. I need it to be as simple as possible; otherwise, we can’t make any sort of sense of it. I kept spinning more and more complex systems to resolve this, until I realized that I had been looking at the problem in the wrong manner all along. The issue was that I was trying to add the new value to the tree after I sorted out the structure of the tree, but there was actually nothing that forced me to do that. Given that I already split the page at this stage, I know that I have sufficient space to add the key without doing anything else. I can first add the key to the right page, then write the split page back to the tree. In this case, I don’t need to do any sort of backtracking or state management. Here is what this looks like:

    def add(self, key, value):
        page = self.findPageFor(key)
        self.addToPage(key, value)

    def addToPage(self, key, value):
        if self.cursor[self.pos].add(key, value):
            return
        # the current page retains the bottom half of entries
        # the newPage gets the top half of entries
        if self.pos == 0: # root page
            self.insertNewRootPage() # afterward, pos == 1
        page = self.cursor[self.pos]
        newPage = page.split()
        if key >= newPage.entries[0]: # figure out what page this goes into
            newPage.add(key, value) # top half
        else:
            page.add(key, value) # bottom half
        self.popPage()
        # now adding reference to child page
        self.addToPage(newPage.entries[0], newPage.pageNum)

And with this change, the entire class of problems related to the tree structure just went away. I’m very happy with this result, even if it is a bit ironic.
Like the problem at hand, a lot of the complexity was there because I had to backtrack the implementation decisions and go on a new path to solve this. Also, I just checked, the portion that controls page splits inside Voron has had roughly 1 change a year for the past 5 years. Given our scope and usage, that means that it has been incredibly stable in the face of everything that we could throw at it.
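To make the "add first, then wire up the parent" idea concrete, here is a minimal, self-contained sketch of just the leaf-level step. It is not Voron's code: the `Page` class, its `capacity` field, and the `split_and_add` helper are hypothetical names made up for this illustration, and the page holds bare keys rather than key/value entries.

```python
import bisect


class Page:
    """A hypothetical leaf page holding sorted keys, with a fixed capacity."""

    def __init__(self, keys=None, capacity=3):
        self.keys = list(keys or [])
        self.capacity = capacity

    def add(self, key):
        """Insert key in sorted order; return False if the page is full."""
        if len(self.keys) >= self.capacity:
            return False
        bisect.insort(self.keys, key)
        return True

    def split(self):
        """Keep the bottom half here and return a new page with the top half."""
        mid = len(self.keys) // 2
        new_page = Page(self.keys[mid:], self.capacity)
        self.keys = self.keys[:mid]
        return new_page


def split_and_add(page, key):
    """Split a full page, add key to the correct half, return (separator, new_page)."""
    new_page = page.split()
    # Right after the split both halves have free space, so this add cannot fail.
    target = new_page if key >= new_page.keys[0] else page
    assert target.add(key)
    # The caller would now add (separator, new_page) to the parent page,
    # splitting the parent in the same way if it is full.
    return new_page.keys[0], new_page


page = Page([25, 26, 28])
separator, new_page = split_and_add(page, 27)
print(page.keys, new_page.keys, separator)
```

Because the key is placed before the parent is touched, nothing about the key's final location has to be remembered while the split propagates up the tree.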

Detecting Double-Writes in MSBuild using the binlog

by Gérald Barré

posted on: September 12, 2022

Double-writes are a common problem when building an application. This occurs when the same file is written at least twice during a build. This could be because 2 different projects are outputting the same files, so you are doing unneeded work. Thus, the build is slower than it needs to be. Also, it

Monitoring I/O inside of RavenDB

by Oren Eini

posted on: September 07, 2022

Earlier this year I talked about the reasoning behind the RavenDB team spending so much time building self-monitoring diagnostics. Today I’m happy to announce that RavenDB has another such feature, allowing you to see, in real time, exactly what is going on in your database in terms of I/O. You can see what this looks like in this image, where RavenDB is showing you key metrics in terms of I/O utilization. You can get the same metrics from your disk directly, and you have similar dashboards on all cloud infrastructure. For that matter, RavenDB Cloud also gives you access to this information. So why build that directly into RavenDB? The answer is simple: putting that directly into RavenDB means that it is accessible. We don’t assume that a user has the access or ability to see such data. For example, you may have access to the RavenDB instance, but not to the cloud account that is running it. Or you may not have sufficient permissions to view metrics data. In many cases, even users who have access don’t know that these metrics exist (or it wouldn’t occur to them to look at them). By reducing the amount of hassle you have to go through, we can make those metrics more accessible. Remember, you probably aren’t looking at those numbers for fun. You need to troubleshoot some issue, and being able to directly see what is going on is key to quickly resolving a problem. For example, if you are seeing the disk queue length spiking a lot, you know that you are spending all of your I/O budget. Just knowing that will let you direct your investigation in the right direction a lot sooner.

Comparing SQLite WAL mode to Voron’s

by Oren Eini

posted on: September 06, 2022

There is a great article at fly.io discussing how SQLite handles transactions, which led me to the great documentation on the WAL mode for SQLite. That, in turn, led me to think about the differences between how SQLite does it and how Voron does it. SQLite and Voron share the same basic behavior: they use Copy on Write and make the modifications for the pages in the database on a copy of the data. That means that readers can continue to operate with zero overhead, reading the stable snapshot of their data. However, SQLite works by copying the data to the WAL file directly and modifying it there. Voron doesn’t use this approach. Instead, we have the notion of scratch space where this is done. Look at the figure below, which showcases the difference between the databases: In SQLite, any modifications are written to the WAL file and operated on there. When you want to commit a transaction in SQLite, you’ll compute the checksum of all the pages modified in the transaction and write a commit record to the disk, at which point you’ll need to issue an fsync() call. Voron, on the other hand, will copy the data that is modified in the transaction into scratch space (essentially, just some memory we allocated). On commit, it will not write the data to the WAL. Instead, it will take the following actions:

- Compute a diff of the current state of the page compared to its initial state, writing only the modifications.
- Compress the resulting output.
- Compute a checksum of all the pages that were modified.
- Write the compressed output and the checksum as a single write call.

Voron opens the file with O_DIRECT | O_DSYNC; the idea is that we don’t need to call fsync() at any stage, and we significantly reduce the number of system calls we have to make to commit a transaction. Other transactions, at the same time, will use the data in the scratch space, not the WAL, to read the current state of pages.

Voron also supports MVCC, so you may have multiple copies of the data in memory at once (for different transactions). Voron is able to significantly reduce the total amount of I/O we have to use for writes, because we only write the changes in the data between page versions, and even that is compressed. We can typically trade additional CPU work for lower I/O costs and still come out ahead. Another reason we went this route is that we use memory mapped files, and on Windows, those aren’t coherent with file I/O calls. That means that mixing reading via mmap() and writing via file I/O (which is what we want to do to avoid fsync() calls) wouldn’t really work. Voron also benefits from not having to deal with multiple processes running at the same time, since it is typically deployed from within a single process. Finally, the fact that we use scratch spaces separately from the WAL means that we can put them somewhere else. You can have a fast ephemeral disk (common on the cloud) for scratch files, a very fast (but small) disk for the WAL journal, and a standard disk for the actual data. This sort of configuration gives you more choices on how to deal with the physical layout of your data.
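The four commit steps described above (diff, compress, checksum, single write) can be sketched roughly as follows. This is an illustration only and not Voron's on-disk format: the record layout, the choice of zlib and CRC32, the `PAGE_SIZE` value, and the function names `diff_page`, `build_commit_record`, and `apply_commit_record` are all assumptions made for the example.

```python
import struct
import zlib

PAGE_SIZE = 4096  # assumed page size for this sketch


def diff_page(original: bytes, modified: bytes):
    """Return (offset, changed_bytes) runs where modified differs from original."""
    runs = []
    i, n = 0, len(modified)
    while i < n:
        if original[i] != modified[i]:
            start = i
            while i < n and original[i] != modified[i]:
                i += 1
            runs.append((start, modified[start:i]))
        else:
            i += 1
    return runs


def build_commit_record(pages):
    """pages maps page_number -> (original_bytes, modified_bytes)."""
    body = bytearray()
    checksum = 0
    for page_num, (original, modified) in sorted(pages.items()):
        # Step 1: keep only the bytes that changed in this page.
        for offset, data in diff_page(original, modified):
            body += struct.pack("<QII", page_num, offset, len(data))
            body += data
        # Step 3: running checksum over the full modified pages.
        checksum = zlib.crc32(modified, checksum)
    # Step 2: compress the diff output.
    compressed = zlib.compress(bytes(body))
    # Step 4: one buffer, so it can be handed to a single write call.
    return struct.pack("<II", len(compressed), checksum) + compressed


def apply_commit_record(record, pages):
    """Replay a record onto page_number -> bytearray; return the stored checksum."""
    length, checksum = struct.unpack_from("<II", record, 0)
    body = zlib.decompress(record[8:8 + length])
    pos = 0
    while pos < len(body):
        page_num, offset, size = struct.unpack_from("<QII", body, pos)
        pos += 16  # size of the "<QII" run header
        pages[page_num][offset:offset + size] = body[pos:pos + size]
        pos += size
    return checksum
```

In this sketch a page with a handful of changed bytes produces a record far smaller than the page itself, which is the I/O saving the post describes. Writing the record to a journal file opened with O_DIRECT | O_DSYNC is left out, since those flags are platform specific.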

RavenDB PHP Client beta is out

by Oren Eini

posted on: September 05, 2022

The official RavenDB Client for PHP is now out in beta. You can now make use of a rich client to consume RavenDB with all the usual features you would expect. To start using RavenDB, run:

    $ composer require ravendb/ravendb-php-client

And then you can start using RavenDB in your project. Here are some interesting code samples.

Setting up a document store:

    use RavenDB\Documents\DocumentStore;

    $store = new DocumentStore(["http://live-test.ravendb.net"], "Northwind");
    $store->initialize();

Loading a document:

    $session = $store->openSession();
    $company = $session->load(Company::class, $companyId);
    echo $company->getName() . "\n";
    echo "Located at: " . $company->getAddress()->getCountry() . "\n";

Querying:

    $session = $store->openSession();
    $orders = $session->query(Order::class)
        ->whereEquals("customer", $customerId)
        ->toList();
    echo "Orders for " . $customerId . "\n";
    foreach ($orders as $order) {
        echo $order->getId() . " - " . $order->getTotal() . "$\n";
    }

Pretty much all other capabilities are also available (unit of work, change tracking, automatic failover, and more). Please give it a whirl; we’d love to hear about your experience with RavenDB & PHP.

Getting user consent before executing sensitive code

by Gérald Barré

posted on: September 05, 2022

Before executing sensitive actions, such as purchasing something or showing sensitive data, you need to get user consent. Also, you need to ensure the actual user is the one who is executing the action. Windows provides an API to get user consent before executing sensitive actions. First, you need t

Using RavenDB for Department of Defense projects

by Oren Eini

posted on: September 02, 2022

If you are using RavenDB for defense projects, we have got good news for you. RavenDB is now available on Iron Bank, making it that much easier to make use of RavenDB in defense or high security projects. Iron Bank is the DoD repository of digitally signed, binary container images including both Free and Open-Source software (FOSS) and Commercial off-the-shelf (COTS). All artifacts are hardened according to the Container Hardening Guide. Containers accredited in Iron Bank have DoD-wide reciprocity across classifications. RavenDB has a history of focusing on (usable) security and making sure that your systems are secured by default and in practice. Now it is even easier to make use of RavenDB in projects that are required to meet the DoD standards. Note that you get the exact same codebase and configuration that you’ll get from the usual RavenDB distribution, it has simply been audited and approved by Iron Bank.

Rainbow Colorized Brackets in Visual Studio

by Ardalis

posted on: August 31, 2022

A popular extension and later core feature of VS Code, rainbow bracket colorization is now available as a free extension for Visual Studio…