Relatively General .NET

Always profile! The case of the mysterious performance regression

by Oren Eini

posted on: November 13, 2020

This is part of the same issue as the previous post. I was tracking a performance regression between RavenDB 4.1 and RavenDB 4.2. There was a roughly 8% performance difference between the two (favoring the older version), which was reported to us. The scenario was very large and complex documents (hundreds of objects in a document, > 16KB of JSON each).

The two code bases have had a lot of (small) changes between them, so it was hard to figure out exactly what was the root cause of the regression. Eventually I found something utterly bizarre. One of the things that we have to do when you save a document is check if the document has been modified. If so, we need to save it; otherwise, we can skip it. Here is the relevant piece of code in 4.1:

So this costs 0.5 ms (for very large documents), which seems perfectly reasonable. But when looking at this on 4.2, we have:

This costs six times as much, what the hell?! To clarify, Blittable is the name of the document format that RavenDB uses. It is a binary JSON format that is highly efficient. You can think about this as comparing two JSON documents, because that is what it is doing.

I mentioned that there are differences between these versions? There have been quite a few (thousands of commits’ worth), but this particular bit of code hadn’t changed in years. I just couldn’t figure out what was going on. Then I looked deeper. Here are the costs of these calls. Here is the 4.1 version:

And here is the 4.2 version:

There are a few interesting things here. First, we can see that we are using Enumerable.Contains, and that is where most of the time goes. But much more interesting: in 4.1, we are calling this method a total of 30,000 times. In 4.2, we are calling it 150,000 times!!! Note that CompareBlittable is recursive, so even though we call it on 10,000 documents, we get more calls. But why the difference between these versions?

I compared the code for these two functions, and they were virtually identical. In 4.2, we mostly changed some error message texts, nothing major, but somehow the performance was so different. It took a while to figure out that there was another difference. In 4.1, we checked the changes in the documents in the order of the properties on the document, but in 4.2, we optimized things slightly and just used the index of the property. A feature of the blittable format is that properties are lexically sorted. Here is the document in question; in our test, we are modifying Property6, as you can see here:

There are a total of 40 properties in this document, and much nesting. In this case, in 4.2, we are scanning for changes in the document using the lexical sorting. The CompareBlittable() function will exit on the first change it detects, so in 4.1 it will get to the changed Property6 very quickly, while in 4.2 it will need to scan most of the (big) document before it finds a change. That is a large part of the cost difference between these versions.

Now that I know what the issue is, we have to consider which behavior is better for us. I decided to use the order of inserted properties, instead of the lexical order. The reasoning is simple: if a user cares about that, they can much more easily change the order of properties in the document than the names of the properties. In C#, you can just change the order the properties show up in the class definition.

I have to say, this was much harder to figure out than I expected, because the change happened in a completely different location and was very non-obvious in how it worked.
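To make the cost difference concrete, here is a minimal sketch of an early-exit comparison over a 40-property document, mirroring the scenario above. This is not RavenDB’s actual CompareBlittable code; the HasChange helper and the flat document layout are illustrative assumptions:

// A simplified sketch, not RavenDB's actual CompareBlittable implementation.
// It shows how an early-exit comparison pays very different costs depending
// on the order in which it visits the properties.
using System;
using System.Collections.Generic;
using System.Linq;

class OrderSensitiveDiff
{
    // Scans properties in the given order and stops at the first difference.
    static bool HasChange(IReadOnlyList<string> order,
        IReadOnlyDictionary<string, string> oldDoc,
        IReadOnlyDictionary<string, string> newDoc,
        out int scanned)
    {
        scanned = 0;
        foreach (var prop in order)
        {
            scanned++;
            if (oldDoc[prop] != newDoc[prop])
                return true; // early exit on the first detected change
        }
        return false;
    }

    static void Main()
    {
        // 40 properties, with Property6 modified, as in the test document.
        var insertionOrder = Enumerable.Range(1, 40).Select(i => "Property" + i).ToList();
        var oldDoc = insertionOrder.ToDictionary(p => p, p => "value");
        var newDoc = insertionOrder.ToDictionary(p => p, p => p == "Property6" ? "changed" : "value");

        // 4.1-style: visit properties in insertion order.
        HasChange(insertionOrder, oldDoc, newDoc, out var scannedByInsertion);

        // 4.2-style: visit properties in lexical order, where "Property6" sorts
        // after Property1, Property10..19, Property2, Property20..29, and so on.
        var lexicalOrder = insertionOrder.OrderBy(p => p, StringComparer.Ordinal).ToList();
        HasChange(lexicalOrder, oldDoc, newDoc, out var scannedByLexical);

        // Prints: insertion order scanned 6 properties, lexical order scanned 37
        Console.WriteLine($"insertion order scanned {scannedByInsertion} properties, lexical order scanned {scannedByLexical}");
    }
}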

Always profile! The hidden cost of serializing large object graphs to JSON

by Oren Eini

posted on: November 12, 2020

We were looking at the cost of writing 10,000 documents to RavenDB and found out something truly interesting. The documents in question are complex; each is an object graph that includes over 120 objects belonging to 20 different classes. The serialized JSON is over 16KB in size. It is expected that something like that would take a lot of time. Here are the results under the profiler:

Given the size involved, taking just under 4.2 ms per document (under the profiler) isn’t bad, I thought. In general, we consider the cost of JSON serialization as pretty fixed and ignore it. This time, I looked a little bit deeper and found this:

Note that this is deep in the call stack. But it is a pretty expensive call, so I decided to look at all the calls that we had for this method, which gave me:

A few things that I would like to note here. The time this single method takes is roughly 50% of the total time it takes to serialize the documents. In fact, I checked what would happen if I could remove this cost:

You can insert your favorite profanity here; I usually go with Pasta! Looking at the stack trace, it is clear what is going on. RavenDB has a bunch of customizations for the default way documents are serialized, and JSON.Net will call the CanConvert() method on the converters for each new object that it is about to convert. Given the level of complexity in these documents (over 450 values), that means that we would call this a lot. And the cost of finding out whether we need to convert the value or not dominated this kind of operation completely. I wrote the following converter instead:

public class CachingJsonConverter : JsonConverter
{
    private readonly JsonConverter[] _converters;
    private Dictionary<Type, JsonConverter> _cache = new Dictionary<Type, JsonConverter>();

    public override bool CanRead { get; }
    public override bool CanWrite { get; }

    public CachingJsonConverter(JsonConverter[] converters, bool canRead, bool canWrite)
    {
        CanRead = canRead;
        CanWrite = canWrite;
        _converters = converters;
    }

    public override void WriteJson(JsonWriter writer, object? value, JsonSerializer serializer)
    {
        if (value == null)
        {
            writer.WriteNull();
            return;
        }

        Type objectType = value.GetType();
        if (_cache.TryGetValue(objectType, out var converter))
        {
            converter.WriteJson(writer, value, serializer);
            return;
        }

        // Can happen if the cache was replaced by another copy without the right
        // converter, i.e. we lost the race. UpdateCache returns the right converter,
        // so there is no issue here.
        UpdateCache(objectType).WriteJson(writer, value, serializer);
    }

    public override object? ReadJson(JsonReader reader, Type objectType, object? existingValue, JsonSerializer serializer)
    {
        if (_cache.TryGetValue(objectType, out var converter))
        {
            return converter.ReadJson(reader, objectType, existingValue, serializer);
        }

        // Can happen if the cache was replaced by another copy without the right
        // converter, i.e. we lost the race. UpdateCache returns the right converter,
        // so there is no issue here.
        return UpdateCache(objectType).ReadJson(reader, objectType, existingValue, serializer);
    }

    public override bool CanConvert(Type objectType)
    {
        if (_cache.TryGetValue(objectType, out var value))
            return value != null;

        return UpdateCache(objectType) != null;
    }

    private JsonConverter UpdateCache(Type objectType)
    {
        foreach (var converter in _converters)
        {
            if (!converter.CanConvert(objectType))
                continue;

            _cache = new Dictionary<Type, JsonConverter>(_cache) { [objectType] = converter };
            return converter;
        }

        _cache = new Dictionary<Type, JsonConverter>(_cache) { [objectType] = null };
        return null;
    }
}

As you can see, the core of the fix is in the CanConvert() and UpdateCache() methods, where I’m checking the internal converters and caching the results. Note that I’m intentionally not using any thread-safe operations here, even though this is meant to be a shared instance. The idea is that on each new type, we’ll create a new copy of the cache, and we can afford to waste the memory / work to compute the cache if we can reduce the common cost of verifying the types on each call. We are actually expecting to lose the race conditions here, which is fine. After a very short while, things will stabilize and we won’t have to pay any additional costs.

Here is the result; you can see that we are taking about half of the previous run time. And if we look at just the GetMatchingConverter() call:

I actually ended up with two such converters, because I have some converters that are only for writes. Given the huge performance improvement, we just created two instances to handle the caching. Note that without a profiler, I would have never been able to figure something like this out, and even with the profiler, I had to dig deep, and it took a while to narrow down where the costs actually were. I was very surprised.
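For completeness, here is how such a converter might be wired up. This is a hypothetical sketch of the registration, not RavenDB’s actual setup; StringEnumConverter is just a stand-in for the real converters being wrapped:

using Newtonsoft.Json;
using Newtonsoft.Json.Converters;

// Stand-in for the converters the serializer already uses.
JsonConverter[] existingConverters = { new StringEnumConverter() };

var settings = new JsonSerializerSettings
{
    Converters =
    {
        // One wrapper for reads and one for writes, since some of the
        // underlying converters are write-only.
        new CachingJsonConverter(existingConverters, canRead: true, canWrite: false),
        new CachingJsonConverter(existingConverters, canRead: false, canWrite: true),
    }
};

// JSON.Net now probes only the wrappers' CanConvert(), which answers from
// the per-type cache instead of scanning every converter each time.
var serializer = JsonSerializer.Create(settings);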

Dealing with large documents (100+ MB)

by Oren Eini

posted on: November 11, 2020

RavenDB can handle large documents. There actually is a limit to the size of a document in RavenDB, but given that it is 2 GB, it isn’t a practical one. I actually had to go and check whether the limit was 2 or 4 GB, because it doesn’t actually matter.

That said, having large documents is something that you should be wary of. They work, but they have very high costs. I ran some benchmarks on the topic a while ago, and the results are interesting. Let’s consider a 100MB document. Parsing time for that should be around 4 – 5 seconds, and that ignores the memory costs. For example, a JSON document can be parsed to 50(!) times the size of the raw text. That is 5GB of memory to handle a single 100MB document.

That is just the parsing cost, but there are others. Reading 100MB from most disks will take about a second, assuming that the data is sequential. Assuming you have a 1Gbit/s network, all of which is dedicated to this single document, you can push that to the network in 800 ms or so. Dealing with such documents is hard and awkward. If you accidentally issue a query on a bunch of those documents and get a page of 25 of them, you just got a query result that is 2.5 GB in size. With documents of this size, you are also likely to want to modify multiple pieces at the same time, so you’ll need to be very careful about concurrency control as well. In general, at those sizes, you stop treating this as a simple document and move to a streaming approach, because anything else doesn’t make much sense; it is too costly.

A better alternative is to split this document up into its component parts, as sketched below. You can then interact with each one of them on an independent basis. It is the difference between driving an 18 wheeler and driving a family car. You can pack a whole lot more on the 18 wheeler, but it gets pretty poor mileage and it is very awkward to park. You aren’t going to want to use it for going to the grocery store.
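For illustration, here is a hypothetical sketch of such a split. The Order / OrderLinesChunk classes are invented for this example, not a RavenDB API; the point is that each part becomes a document you can load and modify independently:

using System.Collections.Generic;

// Before: one huge document that must be parsed, transferred, and written as a whole.
public class HugeOrder
{
    public string Id { get; set; }
    public string Customer { get; set; }
    public List<OrderLine> Lines { get; set; } // potentially tens of thousands of entries
}

// After: the root document holds references to its component parts.
public class Order
{
    public string Id { get; set; }
    public string Customer { get; set; }
    public List<string> LineChunkIds { get; set; } // e.g. "orders/1/lines/0001"
}

// Each chunk is a bounded-size document that can be read, streamed,
// and updated on its own, which also simplifies concurrency control.
public class OrderLinesChunk
{
    public string Id { get; set; }
    public string OrderId { get; set; }
    public List<OrderLine> Lines { get; set; }
}

public class OrderLine
{
    public string Product { get; set; }
    public int Quantity { get; set; }
}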

GitHub Actions from CLI

by Ardalis

posted on: November 11, 2020

Tim Heuer recently published an article showing how to create your own dotnet CLI templates for generating GitHub Actions YAML files.

I’ll be speaking at the Build Stuff conference this week

by Oren Eini

posted on: November 09, 2020

I’ll be speaking at the Build Stuff conference about the design of RavenDB Cloud: how we built the backend and control plane for a multi-cloud, highly available database-as-a-service platform. The key concept that I’m going to be talking about is that we need simplicity: in design, in development, and in operations.

Rendering raw/unescaped HTML in Blazor

by Gérald Barré

posted on: November 09, 2020

By default, everything you write in a Razor file is escaped to prevent any injection. This safe-by-default behavior is very nice. But there are some use cases where you want to generate an HTML string in code and render it as-is. For instance, you can convert a markdown string to HTML and render it.
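A minimal sketch of the built-in way to do this in Blazor, using the MarkupString type (the htmlContent field below is illustrative):

@* Casting a string to MarkupString tells Blazor to render it as raw HTML
   instead of escaping it, so only use it with trusted or sanitized content. *@
<div>@((MarkupString)htmlContent)</div>

@code {
    private string htmlContent = "Hello <strong>world</strong>";
}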

Spending political capital

by Oren Eini

posted on: November 06, 2020

As I’m writing this, there seems to be a huge amount of furor over the US elections. I don’t want to get into that particular mess, but I had a few idle thoughts about how to fix one of the root issues here: the fact that the system of voting we use is based on pen & paper and was designed when a computer was a job title for a practical mathematician. A large part of the problem is that we have to wait for a particular date, and there are polls, which seem to be used for information and misinformation at the same time. This means that people are basically trying to vote with insufficient information. It would be much more straightforward if the whole system was transparent and iterative.

Note: This is a blog post I’m writing to get this idea out of my head. I did very little research, and I’m aware that there is probably a wide body of proposals in this area which I didn’t look at.

The typical manner I have seen suggested for solving the election problem is to ensure the following properties:

- Verifying that my vote was counted properly.
- Verifying that the total votes were counted fairly.
- (maybe) Making it hard / impossible to show a 3rd party whom I voted for.

The last one is obviously pretty hard to do, but it is important to prevent issues with pressuring people to vote for particular candidates. I don’t have the cryptographic chops to come up with such a system, nor do I think that it would be particularly advisable. I would rather come up with a different approach entirely. In my eyes, we can have a system with the following:

The government issues each voter a token (coin) that they can spend. That will probably get you to think about blockchains, but I don’t think this is a good idea. If we are talking about technical details, let’s say that the government issues a certificate to each voter (who generates their own private key, obviously). The voter can then give that voting coin to a 3rd party, for example by signing a vote for a particular candidate using their certificate. These coins can then be “spent” during the election. The winner of the election is the candidate that got more than 50% of the total coins spent.

As it currently stands, this is a pretty horrible system. To start with, it is painfully obvious who voted for whom. I think that a transparent voting record might be a good idea in general, but there are a multitude of problems with that. So, like many ideas that are great in theory, we shouldn’t allow it. This is usually where [complex cryptography] comes into play, but I think that a much better option would be to introduce the notion of brokers into the mix. What do I mean?

While you could spend your coin directly on a particular candidate, you could also spend it at a broker. That broker is then delegated to use your vote. You may use a party-affiliated broker or a neutral one. Imagine a two-party system where you have the Olive party and the Opal party. I’m using obscure colors here to try to reduce any meaning people will read into the color choices. For what it’s worth, red & blue as party colors have opposite meanings in the States and in Israel, which is confusing.

Let’s take two scenarios into consideration:

- A voter spends their coin on the Olive Political Action Committee, which is known to support Olive candidates. In this case, you can clearly track who they want to vote for. Note that they aren’t voting directly for a candidate, because they want to delegate their vote to a trusted party to manage it.
- A voter spends their coin on Private Broker #435, and while they do that, they instruct the broker to pass their vote to the Olive PAC broker, or a particular candidate, etc. The idea is that the private broker is taking votes from enough people that, while it is possible to know that you went through a particular broker, you can’t know whom you voted for.

The broker itself obviously knows, but that is similar to tracking the behavior of a voting booth, which also allows you to discover who voted for whom. I think that it is possible to set things up so the broker itself won’t be able to make this association, however; a secure sum protocol would probably work. A key point for me is that this is an issue that is internal to a particular broker, not relevant to the system as a whole.

So far, I don’t think that I’m doing too much, but the idea here is that I want to take things further. Instead of stopping the system there, we allow voters to change their vote. In other words, instead of having to vote blindly, we can look at the results and adjust them. In the April 2019 Israeli election, over 330 thousand votes were discarded because they didn’t reach the minimum threshold. That was mostly because the polls were wrong; I don’t think that people would have voted for those parties if they knew that they were on the verge. Being able to look at that and then adjust the votes would allow all those votes to be counted.

Taking this further, once we have the system of brokers and electronic votes in place, there is no reason to hold an election once every N years. Instead, we can say that the coins are literal political capital. In order to remain in office, the elected officials must keep holding over 50% of the spent coins. It would probably be advisable to actually count these on a weekly / bi-weekly basis, and doing this at short time intervals means that there is going to be a lot more accountability. Speaking from Israel’s point of view, there have sadly been numerous cases where candidates campaigned on A, then did the exact opposite once elected. There are even a couple of famous sayings about it:

- We promised, but didn’t promise to fulfil.
- What you see from here you can’t see from there.

Note that this is likely to result in a more populist system, since the elected officials are going to pay attention to the electorate on an ongoing basis, rather than just around election time. I can think of a few ways to handle that. For example, once seated, for a period of time, you’ll need > 50% of the coins to get an elected official out of office.

A few more details:

- For places like the States, where you vote for multiple things at the same time (local, state, federal house & senate, president), you’ll get multiple coins to spend, and you can freely spend them in different locations. Or, via a broker, designate that they are to be spent in particular directions.
- A large part of the idea is that a voter can withdraw their coin from a broker or candidate at any time.
- Losing the key isn’t a big deal. You go to the government, revoke your previous certificate, and get a new one. The final computation will simply ignore any revoked coins.
- The previous point is also important for dealing with people who died or moved. It is trivial to ensure that the dead don’t vote in this scheme, and you can verify that a single person doesn’t hold coins from multiple different locations.

The implications of such a system are interesting, in my eyes. The idea is that you delegate your vote to someone you trust, directly or indirectly. That can be a candidate, but most likely it will be a group; the usual term in the States is PAC, I believe. The point is that you then have active oversight by the electorate over the elected officials.

Over seven years ago I wrote a post about what I most dislike in the current election system: you are forced to vote on your top issue, but you usually have a far more complex set of values that you have to balance. For example, let’s say that my priorities are:

- National security
- Fiscal policy
- Healthcare
- Access to violet flowers

I may have to balance between a candidate that wants to ban violet flowers but proposes to lower taxes and a candidate that wants to raise taxes and legalize violet flowers. Which one do I choose? If I can shift my vote as needed, I can make that determination at the right political time. During the budget month, my vote goes to the $PreferredFiscalPolicy candidate, and then if they don’t follow my preferences on violet flowers, I can shift it. This would have the benefit of killing the polls industry, since you can just look at where the political capital is, and it would allow the electorate to have far greater control over the government. I assume that elected officials would then be paying very careful attention to how much political capital they have to spend and act accordingly.

I wonder if I should file this post under science fiction, because it might make a good background for world building. What do you think of this system? And what do you think the effects of it would be?