skip to content
Relatively General .NET

Tracking down RavenDB I/O usage in Linux

by Oren Eini

posted on: August 08, 2022

Today I had to look into the a customer whose RavenDB instance was burning through a lot of I/O. The process is somewhat ingrained in me by this point, but I thought that it would make for a good blog post so I’ll recall that next time.Here is what this looks like from the point of view of the disk:We are seeing a lot of reads in terms of MB/sec and a lot of write operations (but far less in terms of bandwidth). That is the external details, can we figure out more? Of course.We start our investigation by running:sudo iotop -aoThis command gives us the accumulative time for threads that do I/O. One of the important things that RavenDB is to tag its threads with the tasks that they are assigned. Here is a sample output: TID PRIO USER DISK READ DISK WRITE> SWAPIN IO COMMAND 2012 be/4 ravendb 1748.00 K 143.81 M 0.00 % 0.96 % Raven.Server -c /ravendb/config/settings.json [Follower thread] 9533 be/4 ravendb 189.92 M 86.07 M 0.00 % 0.60 % Raven.Server -c /ravendb/config/settings.json [Indexing of Use] 1905 be/4 ravendb 162.73 M 72.23 M 0.00 % 0.39 % Raven.Server -c /ravendb/config/settings.json [Indexing of Use] 1986 be/4 ravendb 154.52 M 71.71 M 0.00 % 0.38 % Raven.Server -c /ravendb/config/settings.json [Indexing of Use] 9687 be/4 ravendb 185.57 M 70.34 M 0.00 % 0.59 % Raven.Server -c /ravendb/config/settings.json [Indexing of Car] 1827 be/4 ravendb 172.60 M 65.25 M 0.00 % 0.69 % Raven.Server -c /ravendb/config/settings.json ['Southsand']In this case, we see the top 6 threads in terms of I/O (for writes). We can see that we have a lot of of indexing and documents writes. That said, thread names in Linux are limited to 14 characters, so we probably need to give better names to them.That is part of the task, let’s look at the cost in terms of reads: TID PRIO USER DISK READ> DISK WRITE SWAPIN IO COMMAND 11191 be/4 ravendb 2.09 G 31.75 M 0.00 % 7.58 % Raven.Server -c /ravendb/config/settings.json [.NET ThreadPool] 11494 be/4 ravendb 1353.39 M 14.54 M 0.00 % 19.62 % Raven.Server -c /ravendb/config/settings.json [.NET ThreadPool] 11496 be/4 ravendb 1265.96 M 4.97 M 0.00 % 16.56 % Raven.Server -c /ravendb/config/settings.json [.NET ThreadPool] 11211 be/4 ravendb 1120.19 M 42.66 M 0.00 % 2.83 % Raven.Server -c /ravendb/config/settings.json [.NET ThreadPool] 11371 be/4 ravendb 1114.50 M 35.25 M 0.00 % 5.00 % Raven.Server -c /ravendb/config/settings.json [.NET ThreadPool] 11001 be/4 ravendb 1102.55 M 43.35 M 0.00 % 3.12 % Raven.Server -c /ravendb/config/settings.json [.NET ThreadPool] 11340 be/4 ravendb 825.43 M 26.77 M 0.00 % 4.85 % Raven.Server -c /ravendb/config/settings.json [.NET ThreadPool]That is a lot more complicated, however. Now we don’t know what task this is running, only that something is reading a lot of data.We have the thread id, so we can make use of that to see what it is doing:sudo strace -p 11191 -cThis command will track the statistics on the systems calls that are issued by the specified thread. I’ll typically let it run for 10 – 30 seconds and then hit Ctrl+C, giving me:% time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 90.90 3.868694 681 5681 82 futex 8.28 0.352247 9 41035 sched_yield 0.79 0.033589 1292 26 pwrite64 0.03 0.001246 52 24 1 recvfrom 0.01 0.000285 285 1 restart_syscall 0.00 0.000000 0 2 madvise 0.00 0.000000 0 2 sendmsg ------ ----------- ----------- --------- --------- ---------------- 100.00 4.256061 46771 83 totalI’m mostly interested in the pwrite64 system call. RavenDB uses mmap() for most of its data access, so that is harder to read, but we can get a lot of information from the output. Now I’m going to run the following command:sudo strace -p 11191 -e trace=pwrite64This will give us a trace of all the pwrite64() system calls from that thread, looking like this:pwrite64(315, "\365\275"..., 4113, 51080761896) = 4113 pwrite64(315, "\344\371"..., 4113, 51080893512) = 4113There is an strace option (-y) that can be used to show the file paths for system calls, but I forgot to use it, no worries, I can do:sudo ls -lh /proc/11191/fd/315Which will give me the details on this file:lrwx------ 1 root root 64 Aug 7 09:21 /proc/11783/fd/315 -> /ravendb/data/Databases/Southsand/PeriodicBackupTemp/2022-08-07-03-30-00.ravendb-encrypted-full-backup.in-progressAnd that tells me everything that I need to know. The reason we have high I/O is that we are generating a backup file. That explains why we are seeing a lot of reads (since we need to read in order to generate the backup).The entire process is mostly about figuring out exactly what is going on, and RavenDB is very careful about leaving as many breadcrumbs as possible to make it easy to follow.

Using a custom build pipeline with Azure Static Web Apps

by Gérald Barré

posted on: August 08, 2022

Azure Static Web Apps allows to host static web sites and their API. They also provide a GitHub action to build and deploy your application. The GitHub action automatically detects the kind of project from the sources and builds it. It currently supports many JavaScript frameworks and static site g

Production postmortem

by Oren Eini

posted on: August 05, 2022

A customer was experiencing large memory spikes in some cases, and we were looking into the allocation patterns of some of the queries that were involved. One of the things that popped up was a query that allocated just under 30GB of managed memory during its processing. Let me repeat that, because it bears repeating. That query allocated 30(!) GB(!) during its execution. Now, that doesn’t mean that it was consuming 30 GB, it was just the allocations involved. Most of that memory was immediately discarded during the operation. But 30 GB of garbage to cleanup puts a lot of pressure on the system. We took a closer look at the offensive query. It looked something like this: from index “Notifications/RoutingAndPriority” where startsWith(Route, $routeKeyPrefix) order by Priority desc That does not seem like a query that should be all that expensive. But details matter, so we dove into this. For this particular query, the routes are hierarchical structures that are unique for each message. Something like: notifications/traffic/new-york-city/67a81019-941b-4d04-a0db-0559ed45343c notifications/emergency/las-vegas/0a8e18fb-563b-4b6a-8e93-e10e08239656 And the queries that were generated were using the city & topic to filter the information that they were interested in. The customer in question had a lot of notifications going on at all times. And each one of their Routes was unique. Internally, RavenDB uses Lucene (currently ) to handle searches, and Lucene is using an inverse index to execute queries. The usual way to think about is like this: We have a list of terms (Brown, Green & Purple) and each of them has a list of the matching documents that contain the particular term. The process of issuing a prefix query then is easy, scan all entries that match the prefix and return their results. This is indeed what Lucene is doing. However… while it is doing that, it will do something like this: This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters Show hidden characters var termsEnum = GetTermsEnum(); foreach(string term in termsEnum.Prefix("notifications/traffic/new-york-city/")) { foreach(int match in docsEnum.ForTerm(term)) { yield return match; } } view raw prefix.cs hosted with ❤ by GitHub Pay close attention to what is actually happening here. There are two enumerators that we work with. One for the terms for the field and one for the documents for a specific term. All of this is perfectly reasonable, but there is an issue. What happens when you have a lot of unique values? Well, then Lucene will have a lot of iterations of the loop. In this case, each term has just a single match, and Lucene is pretty good at optimizing search by specific term. The actual problem is that Lucene allocates a string instance for each term. If we have 30 million notifications for New York’s traffic, that means that we’ll allocate 30 million strings during the processing of the query. We aren’t retaining these strings, mind. They’ll be cleaned up by the GC quickly enough, but that is an additional cost that we don’t actually want. Luckily, in this case, there is a much simple solution. Given that the pattern of the route is known, we can skip the unique portion of the route. That means that in our index, we’ll do something similar to: Route = doc.Route.Substring(0, doc.Route.LastIndexOf('/') + 1) Once that is done, the number of unique matches there would be negligible. There would be no more allocations galore to observe and overall system performance is much improved. We looked into whether there is something that we can do with Lucene to avoid this allocations issue, but it is endemic to the way the API works. The longer term plan is to fix that completely, of course. We are making great strides there already . In short, if you are doing startsWith() queries or similar, pay attention to the number of unique terms that you have to go through. A simple optimization on the index like the one above can bring quite a bit of dividends.

Faster and Safer NuGet restore using Source Mapping and Lock files

by Gérald Barré

posted on: August 01, 2022

Using NuGet packages is very common. Some options may help get better performance and security when restoring packages.#Lock FilesWhen restoring packages, NuGet creates a dependency graph. This graph includes all declared packages and the transitive packages. This dependency graph is then used to d

Webinar Recording

by Oren Eini

posted on: July 26, 2022

In RavenDB 5.4, we’re introducing new ELT features for Kafka and RabbitMQ. Now, instead of your documents just sitting there in your database, you can involve them in your messaging transactions. In this webinar, RavenDB CEO Oren Eini explains how these ETL tasks open up a whole new world of architectural patterns, and how they spare you from a lot of complexity when you want to involve your data in pub/sub or other messaging patterns.

Considering replacing Disqus with Giscus

by Andrew Lock

posted on: July 26, 2022

In this post I'm soliciting opinions-should I pay to keep using Disqus for comments, or should I switch to an open-source alternative that uses GitHub…

Recording

by Oren Eini

posted on: July 25, 2022

I spoke at Cloud Lunch & Learn about the basics of building a database from scratch. We took a storage engine and created a simple database within the span of an hour. Covered in the talk are the details of how you can build the database, using indexes to speed up queries and the manner in which a database interacts with its storage engine. I think it was a great talk, but let me know your feedback:

Running Lighthouse CI when deploying Azure SWA

by Gérald Barré

posted on: July 25, 2022

When you add new features to a web application, it's important to ensure there is no regression in terms of performance. LightHouse CI is a tool that helps you to run a Lighthouse analysis on your web application. If you don't know lighthouse, it's a tool that analyzes web apps and web pages and ge