Microsoft.Extensions.Logging allows providers to implement semantic or structured logging. This means that the logging system can store the parameters of the log message as fields instead of just storing the formatted message. This enables logging providers to index the log parameters and perform a
I had a great time talking with Scott Hanselman about how we achieve great performance for RavenDB with .NET. You can listen to the podcast here. As usual, I would love your feedback. In this episode, we talk to Oren Eini from RavenDB. RavenDB is a NoSQL document database that offers high performance, scalability, and security. Oren shares his insights on why performance is not just a feature, but a service that developers and customers expect and demand. He also explains how RavenDB achieves fast and reliable data access, how it handles complex queries and distributed transactions, and how it leverages the cloud to optimize resource utilization and cost efficiency!
Warning: This post is about micro-optimization. Don't make your code unreadable for the sake of performance unless there is an absolute need and you have a benchmark proving this is worth it. In most cases, you should not worry about this kind of optimization. .NET introduced some new methods that all
winget is a package manager for Windows. You can use winget install <package> to install new software. You can also use winget upgrade <package> or winget upgrade --all --silent to upgrade one or all installed software. But what if you want to upgrade only a subset of your installed sof
RavenDB is a .NET application, written in C#. It also has a non-trivial amount of unmanaged memory usage. We absolutely need that to get the level of performance that we require.
When you manage memory manually, there is also the possibility that you'll mess it up. We ran into one such case: when running our full test suite (over 10,000 tests), we would get random crashes due to heap corruption. Those issues are nasty, because there is a big separation between the root cause and the point where the problem actually manifests.
I recently learned that you can use the gflags tool on .NET executables. We were able to narrow the problem to a single scenario, but we still had no idea where the problem really occurred. So I installed the Debugging Tools for Windows and then executed:
&"C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\gflags.exe" /p /enable C:\Work\ravendb-6.0\test\Tryouts\bin\release\net7.0\Tryouts.exe
What this does is enable a special debug heap at the executable level, which applies to all operations (managed and native memory alike).
With that enabled, I ran the scenario in question:
PS C:\Work\ravendb-6.0\test\Tryouts> C:\Work\ravendb-6.0\test\Tryouts\bin\release\net7.0\Tryouts.exe
42896
Starting to run 0
Max number of concurrent tests is: 16
Ignore request for setting processor affinity. Requested cores: 3. Number of cores on the machine: 32.
To attach debugger to test process (x64), use proc-id: 42896. Url http://127.0.0.1:51595
Ignore request for setting processor affinity. Requested cores: 3. Number of cores on the machine: 32.
License limits: A: 3/32. Total utilized cores: 3. Max licensed cores: 1024
http://127.0.0.1:51595/studio/index.html#databases/documents?&database=Should_correctly_reduce_after_updating_all_documents_1&withStop=true&disableAnalytics=true
Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at Sparrow.Server.Compression.Encoder3Gram`1[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].Encode(System.ReadOnlySpan`1<Byte>, System.Span`1<Byte>)
   at Sparrow.Server.Compression.HopeEncoder`1[[Sparrow.Server.Compression.Encoder3Gram`1[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], Sparrow.Server, Version=6.0.0.0, Culture=neutral, PublicKeyToken=37f41c7f99471593]].Encode(System.ReadOnlySpan`1<Byte> ByRef, System.Span`1<Byte> ByRef)
   at Voron.Data.CompactTrees.PersistentDictionary.ReplaceIfBetter[[Raven.Server.Documents.Indexes.Persistence.Corax.CoraxDocumentTrainEnumerator, Raven.Server, Version=6.0.0.0, Culture=neutral, PublicKeyToken=37f41c7f99471593],[Raven.Server.Documents.Indexes.Persistence.Corax.CoraxDocumentTrainEnumerator, Raven.Server, Version=6.0.0.0, Culture=neutral, PublicKeyToken=37f41c7f99471593]](Voron.Impl.LowLevelTransaction, Raven.Server.Documents.Indexes.Persistence.Corax.CoraxDocumentTrainEnumerator, Raven.Server.Documents.Indexes.Persistence.Corax.CoraxDocumentTrainEnumerator, Voron.Data.CompactTrees.PersistentDictionary)
   at Raven.Server.Documents.Indexes.Persistence.Corax.CoraxIndexPersistence.Initialize(Voron.StorageEnvironment)
That pinpointed the failure, so I knew exactly where we were messing up.
I was also able to reproduce the behavior in the debugger.
This saved me hours or days of trying to figure out where the problem actually is.
We got a support call from a client in the early hours of the morning. They were getting out-of-memory errors from their database and were understandably perturbed by that. They are running on a cloud system, so the admin's first inclination on seeing the problem was to deploy the server on a bigger instance, to at least get things running while they investigated. Doubling and then quadrupling the amount of memory in the system had no impact. A few minutes after the system booted, it would raise an error about running out of memory.
Except that it wasn't actually running out of memory. A scenario like that, where we give more memory to the system and still get out-of-memory errors, can indicate a leak or an unbounded process of some kind. That wasn't the case here. In all system configurations (including the original one), there was plenty of free memory in the system. Something else was going on.
When our support engineer looked at the actual details of the problem, it was quite puzzling. It looked something like this:
System.OutOfMemoryException: ENOMEM on Failed to munmap at Sparrow.Server.Platform.Posix.Syscall.munmap(IntPtr start, UIntPtr length);
That error made absolutely no sense, as you can imagine. We are trying to release memory, not allocate it. Common sense says that you can’t really fail when you are freeing memory. After all, how can you run out of memory? I’m trying to give you some, damn it!
It turns out that this model is too simplistic. You can actually run out of memory when trying to release it. The issue is that it isn’t you that is running out of memory, but the kernel. Here we are talking specifically about the Linux kernel, and how it works.
Obviously, a very important part of the kernel's job is managing the system memory, and to do that, the kernel itself needs memory. To track a process's memory, the kernel uses structures called VMAs (virtual memory areas). Each VMA has its own permissions and attributes. In general, you never need to be aware of this detail.
However, there are certain pathological cases, where you need to set up different permissions and behaviors on a lot of memory areas. In the case we ran into, RavenDB was using an encrypted database. When running on an encrypted database, RavenDB ensures that all plain text data is written to memory that is locked (cannot be stored on disk / swapped out).
A side effect is that for every piece of memory that we lock, the kernel needs to create a separate VMA, since each of them is operated on independently of the others. The kernel uses VMAs to manage its own map of the memory, and eventually the number of items in the map exceeds the configured limit.
In this case, the munmap call released a portion of the memory back, which means that the kernel needed to split the VMA into separate pieces. But the number of items is limited; the limit is controlled by the vm.max_map_count value.
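To see the splitting in action, here is a minimal Linux-only sketch in Python. It is not RavenDB's code. It uses mprotect rather than mlock to change the attributes of individual pages, because mlock is subject to the locked-memory limit of the process, but the effect on the VMA count is the same in kind: pages with different attributes cannot share a VMA. The names count_vmas and fragment_mapping are made up for this illustration.

```python
import ctypes
import mmap

PAGE = mmap.PAGESIZE


def count_vmas(maps_path="/proc/self/maps"):
    """Each line of /proc/<pid>/maps describes one VMA of that process."""
    with open(maps_path) as maps:
        return sum(1 for _ in maps)


def fragment_mapping(pages=64):
    """Map `pages` pages, then make every other page read-only.

    Returns (vmas_before, vmas_after, region). Keep `region` referenced
    for as long as the mapping should stay alive.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
    libc.mprotect.restype = ctypes.c_int

    # One anonymous read/write mapping: a single VMA to begin with.
    region = mmap.mmap(-1, pages * PAGE,
                       flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    base = ctypes.addressof(ctypes.c_char.from_buffer(region))

    before = count_vmas()
    # Alternating permissions force the kernel to split the mapping,
    # since neighbouring pages no longer have identical attributes.
    for i in range(0, pages, 2):
        if libc.mprotect(base + i * PAGE, PAGE, mmap.PROT_READ) != 0:
            raise OSError(ctypes.get_errno(), "mprotect failed")
    after = count_vmas()
    return before, after, region
```

Calling fragment_mapping(64) and comparing the two counts should show the single mapping turning into dozens of entries. A database that locks many small, separate pieces of memory ends up in the same position, only at a much larger scale.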
The default is typically 65530, which is a conservative value. Database systems often require a lot more than that.
Adjusting the configuration would alleviate this problem, since that will give us sufficient space to operate normally.
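Before changing the setting, it can be useful to check how close a process actually is to the limit. The following is a small sketch under the assumption of a Linux system with /proc mounted; vma_headroom is a hypothetical helper name, not part of RavenDB or any library. It counts the entries in a maps file and compares them with the value in /proc/sys/vm/max_map_count.

```python
def vma_headroom(maps_path="/proc/self/maps",
                 limit_path="/proc/sys/vm/max_map_count"):
    """Return (used, limit, remaining) for the process behind maps_path.

    Each line of the maps file is one VMA; the limit file holds the
    current vm.max_map_count value as a single integer.
    """
    with open(maps_path) as maps:
        used = sum(1 for _ in maps)
    with open(limit_path) as limit_file:
        limit = int(limit_file.read().strip())
    return used, limit, limit - used
```

To inspect another process, pass its maps file, for example vma_headroom("/proc/<pid>/maps"), which requires permission to read that file. A remaining count that keeps shrinking toward zero is a sign that the limit needs to be raised.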