Relatively General .NET

Introducing the new IHostedLifecycleService Interface in .NET 8

by Steve Gordon

posted on: August 09, 2023

As regular readers will be aware, an area of .NET which I follow closely is Microsoft.Extensions.Hosting. I’ve already blogged about a change in .NET 8, where new concurrency options have been introduced to support parallel running of the StartAsync and StopAsync across multiple IHostedServices. In this post, we’ll look at some new lifecycle events introduced […]

A performance profile mystery: The cost of Stopwatch

by Oren Eini

posted on: August 09, 2023

Measuring the length of time that a particular piece of code takes is a surprisingly challenging task. There are two aspects to this. The first is how to ensure that the cost of getting the start and end times won’t interfere with the work you are doing. The second is how to actually get the time (potentially many times a second) in as efficient a way as possible.

To give some context, Andrey Akinshin does a great overview of how the Stopwatch class works in C#. On Linux, that is basically a call to clock_gettime, except that this is not really a system call. It is actually a piece of code that the Kernel sticks inside your process, which then integrates with other aspects of the Kernel to optimize this. The idea is that this call is so frequent that you cannot pay the cost of the Kernel mode transition. There is good coverage of this here. In short, that is a very well-known problem and quite a lot of brainpower has been dedicated to solving it.

And then we reached this situation: [image: profiler output]

What you are seeing here is us testing the indexing process of RavenDB under the profiler. This is indexing roughly 100M documents, and according to the profiler, we are spending 15% of our time gathering metrics? The StatsScope.Start() method simply calls Stopwatch.Start(), so we are basically looking at a profiler output that says that Stopwatch is accounting for 15% of our runtime? Sorry, I don’t believe that. I mean, it is possible, but it seems far-fetched.

In order to test this, I wrote a very simple program, which generates 100K integers and tests whether they are prime or not. I’m doing that to test compute-bound work, basically, calling Start() and Stop() either once across the whole loop or in each iteration.
I ran that a few times, and I’m getting:

Windows: 311 ms with Stopwatch per iteration and 312 ms without
Linux: 450 ms with Stopwatch per iteration and 455 ms without

On Linux, there is a difference of about 5 ms between the two; on Windows, it is either the same cost or slightly cheaper with a per-iteration stopwatch.

Here is the profiler output on Windows: [image: profiler output]

And on Linux: [image: profiler output]

Now, that is what happens when we are doing a significant amount of work. What happens if the amount of work is negligible? I made the IsPrime() method very cheap, and I got: [image: results]

So that is a good indication that this isn’t free, but still… Comparing the costs, it is utterly ridiculous that the profiler says that so much time is spent in those methods.

Another aspect here may be the impact of the profiler itself. There are differences between using Tracing and Sampling methods, for example. I don’t have an answer, just a lot of very curious questions.
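The experiment described above can be sketched roughly as follows. The original program is not shown in the post, so everything here is an assumption made for illustration: the `StopwatchOverhead` class, the trial-division `IsPrime`, the seeded input generator, and the method names. It only shows the two measurement styles being compared: one Start()/Stop() pair around the whole loop versus one pair per iteration.

```csharp
using System;
using System.Diagnostics;

public static class StopwatchOverhead
{
    // Naive trial-division primality test; stands in for compute-bound work (assumed, not the post's code).
    public static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int i = 2; (long)i * i <= n; i++)
            if (n % i == 0) return false;
        return true;
    }

    // Hypothetical input generator: 'count' pseudo-random positive integers from a fixed seed.
    public static int[] Generate(int count, int seed = 42)
    {
        var rnd = new Random(seed);
        var numbers = new int[count];
        for (int i = 0; i < count; i++)
            numbers[i] = rnd.Next(1, int.MaxValue);
        return numbers;
    }

    // A single Start()/Stop() pair around the whole loop.
    public static (int Primes, TimeSpan Elapsed) WholeLoop(int[] numbers)
    {
        int primes = 0;
        var sw = Stopwatch.StartNew();
        foreach (int n in numbers)
            if (IsPrime(n)) primes++;
        sw.Stop();
        return (primes, sw.Elapsed);
    }

    // Start()/Stop() on every iteration; Stopwatch accumulates elapsed time across calls.
    public static (int Primes, TimeSpan Elapsed) PerIteration(int[] numbers)
    {
        int primes = 0;
        var sw = new Stopwatch();
        foreach (int n in numbers)
        {
            sw.Start();
            if (IsPrime(n)) primes++;
            sw.Stop();
        }
        return (primes, sw.Elapsed);
    }
}
```

Calling `WholeLoop` and `PerIteration` on the same array from `Generate(100_000)` and comparing the two `Elapsed` values gives a rough, unprofiled view of the per-call overhead. A proper comparison would need warm-up runs and repeated measurements.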

QCon San Francisco Workshop: Building a database from the ground up

by Oren Eini

posted on: August 08, 2023

I’m going to QCon San Francisco and will be teaching a full-day workshop where we’ll start from a C compiler and an empty file and end up with a functional storage engine, indexing and more. Included in the minimum requirements are implementing transactions, MVCC, persistent data structures, and indexes. The workshop is going to be loosely based on the book, but I’m going to condense things so we can cover this topic in a single day. Looking forward to seeing you there.

Struct memory layout optimizations, practical considerations

by Oren Eini

posted on: August 07, 2023

In my previous post I discussed how we could store the exact same information in several ways, leading to space savings of 66%! That leads to interesting questions with regard to actually making use of this technique in the real world.

The reason I posted about this topic is that we just gained a very significant reduction in memory (and we obviously care about reducing resource usage). The question is whether this is something that you want to do in general.

Let’s look at that in detail. For this technique to be useful, you should be using structs in the first place. That is… not quite true, actually. Let’s take a look at the following declarations:

    using System;

    public class PersonClass
    {
        public int Id;
        public DateTime Birthday;
        public ushort Kids;
    }

    public struct PersonStruct
    {
        public int Id;
        public DateTime Birthday;
        public ushort Kids;
    }

We define the same shape twice, once as a class and once as a structure. How does this look in memory?

    Type layout for 'PersonClass'
    Size: 32 bytes. Paddings: 2 bytes (%12 of empty space)
           Object Header         8 bytes
           Method Table Ptr      8 bytes
    16-19: Int32 Id              4 bytes
    20-21: UInt16 Kids           2 bytes
    22-23: padding               2 bytes
    24-31: DateTime Birthday     8 bytes

    Type layout for 'PersonStruct'
    Size: 24 bytes. Paddings: 10 bytes (%41 of empty space)
      0-3: Int32 Id              4 bytes
      4-7: padding               4 bytes
     8-15: DateTime Birthday     8 bytes
    16-17: UInt16 Kids           2 bytes
    18-23: padding               6 bytes

Here you can find some really interesting differences. The struct is smaller than the class, but the amount of wasted space is much higher in the struct. What is the reason for that?

The class needs to carry 16 bytes of metadata. That is the object header and the pointer to the method table. You can read more about the topic here.
So the memory overhead for a class is 16 bytes at a minimum. But look at the rest of it.

You can see that the layout in memory of the fields is different in the class versus the structure. C# is free to re-order the fields to reduce the padding and get better memory utilization for classes, but I would need [StructLayout(LayoutKind.Auto)] to do the same for structures. The difference between the two options can be quite high, as you can imagine.

Note that automatically laying out the fields in this manner means that you’re effectively declaring that the memory layout is an implementation detail. This means that you cannot persist it, send it to native code, etc. Basically, the internal layout may change at any time. Classes in C# are obviously not meant for you to poke into their internals, and LayoutKind.Auto comes with an explicit warning about its behavior.

Interestingly enough, [StructLayout] will work on classes; you can use it to force LayoutKind.Sequential on a class. That is by design, because you may need to pass a part of your class to unmanaged code, so you have the ability to control memory explicitly. (Did I mention that I love C#?)

Going back to the original question, why would you want to go to this trouble? As we just saw, if you are using classes (which you are likely to default to), you already benefit from the automatic layout of fields in memory. If you are using structs, you can enable LayoutKind.Auto to get the same behavior.

This technique is for the 1% of the cases where that is not sufficient, when you can see that your memory usage is high and you can benefit greatly from manually doing something about it.

That leads to the follow-up question: if we go about implementing this, what is the overhead over time? If I want to add a new field to an optimized struct, I need to be able to understand how it is laid out in memory, etc. Like any optimization, you need to maintain that.
Here is a recent example from RavenDB.

In this case, we used to have an optimization that had a meaningful impact. The .NET code changed, and the optimization no longer makes sense, so we reverted it to get even better perf.

At those levels, you don’t get to rest on your laurels. You have to keep checking your assumptions.

If you got to the point where you are manually optimizing memory layouts for better performance, there are two options: either you are doing that for fun, and there is no meaningful impact on your system if this degrades over time, or there is an actual need for this, so you’ll need to invest the effort in regular maintenance.

You can make that easier by adding tests to verify those assumptions, for example, verifying that the amount of padding in structs matches expectations. A simple test that verifies the size of a struct means that any changes to it are explicit. You’ll need to modify the test as well, and presumably that is easier to catch / review / figure out than just adding a field and not noticing the impact.

In short, this isn’t a generally applicable pattern. This is a technique that is meant to be applied in case of true need, where you’ll happily accept the additional maintenance overhead for better performance and reduced resource usage.
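The size-verification test mentioned above could look something like the sketch below. This is one possible shape, not RavenDB's actual test. The `LayoutGuard` helper and its name are hypothetical, and it relies on `Unsafe.SizeOf<T>()` from System.Runtime.CompilerServices to read the managed size of a struct. The expected number would come from a known-good run, such as the 24 bytes reported for PersonStruct in the layout output earlier.

```csharp
using System;
using System.Runtime.CompilerServices;

// Same shape as the PersonStruct declaration discussed in the post.
public struct PersonStruct
{
    public int Id;
    public DateTime Birthday;
    public ushort Kids;
}

// Hypothetical helper: fails loudly when a struct's managed size drifts from the recorded value.
public static class LayoutGuard
{
    public static void AssertSize<T>(int expectedBytes) where T : struct
    {
        int actual = Unsafe.SizeOf<T>();
        if (actual != expectedBytes)
        {
            throw new InvalidOperationException(
                $"{typeof(T).Name} is {actual} bytes, expected {expectedBytes}. " +
                "If this change is intentional, update the expected size.");
        }
    }
}
```

In a unit test you would call something like `LayoutGuard.AssertSize<PersonStruct>(24)`, with the number taken from your own measurement on the runtime you target. Adding a field then forces a visible change to the test, which is exactly the review point the post is asking for.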

Sharing object between .NET host and WebView2

by Gérald Barré

posted on: August 07, 2023

WebView2 is a new web browser control for Windows desktop applications. It is based on the Chromium open-source project and is powered by the Microsoft Edge browser. It is available in the Microsoft.Web.WebView2 package. Using this control, you can have a web browser in your application and interac

Struct memory layout and memory optimizations

by Oren Eini

posted on: August 03, 2023

Consider a warehouse that needs to keep track of items. For the purpose of discussion, we have quite a few fields that we need to keep track of. Here is how this looks in code:

    using System;

    public struct WarehouseItem
    {
        public Dimensions? ProductDimensions;
        public long? ExternalSku;
        public TimeSpan? ShelfLife;
        public float? AlcoholContent;
        public DateTime? ProductionDate;
        public int? RgbColor;
        public bool? IsHazardous;
        public float? Weight;
        public int? Quantity;
        public DateTime? ArrivalDate;
        public bool? Fragile;
        public DateTime? LastStockCheckDate;

        public struct Dimensions
        {
            public float Length;
            public float Width;
            public float Height;
        }
    }

And the actual Warehouse class looks like this:

    using System.Collections.Generic;

    public class Warehouse
    {
        private List<WarehouseItem> _items = new();

        // The method bodies were elided in the original; these are minimal assumed implementations.
        public int Add(WarehouseItem item)
        {
            _items.Add(item);
            return _items.Count - 1; // the list index serves as the item id
        }

        public WarehouseItem Get(int itemId) => _items[itemId];
    }

The idea is that this is simply a wrapper around the list of items. We use a struct to make sure that we have good locality, etc. The question is, what is the cost of this? Let’s say that we have a million items in the warehouse. That would be over 137MB of memory. In fact, a single struct instance is going to consume a total of 144 bytes.

That is… a big struct, I have to admit. Using ObjectLayoutInspector I was able to get the details on what exactly is going on:

    Type layout for 'WarehouseItem'
    Size: 144 bytes. Paddings: 62 bytes (%43 of empty space)
        0-15: Nullable`1 ProductDimensions     16 bytes
                 0: Boolean hasValue            1 byte
               1-3: padding                     3 bytes
              4-15: Dimensions value           12 bytes
                       0-3: Single Length       4 bytes
                       4-7: Single Width        4 bytes
                      8-11: Single Height       4 bytes
       16-31: Nullable`1 ExternalSku           16 bytes
                 0: Boolean hasValue            1 byte
               1-7: padding                     7 bytes
              8-15: Int64 value                 8 bytes
       32-47: Nullable`1 ShelfLife             16 bytes
                 0: Boolean hasValue            1 byte
               1-7: padding                     7 bytes
              8-15: TimeSpan value              8 bytes
                       0-7: Int64 _ticks        8 bytes
       48-55: Nullable`1 AlcoholContent         8 bytes
                 0: Boolean hasValue            1 byte
               1-3: padding                     3 bytes
               4-7: Single value                4 bytes
       56-71: Nullable`1 ProductionDate        16 bytes
                 0: Boolean hasValue            1 byte
               1-7: padding                     7 bytes
              8-15: DateTime value              8 bytes
                       0-7: UInt64 _dateData    8 bytes
       72-79: Nullable`1 RgbColor               8 bytes
                 0: Boolean hasValue            1 byte
               1-3: padding                     3 bytes
               4-7: Int32 value                 4 bytes
       80-81: Nullable`1 IsHazardous            2 bytes
                 0: Boolean hasValue            1 byte
                 1: Boolean value               1 byte
       82-83: padding                           2 bytes
       84-91: Nullable`1 Weight                 8 bytes
                 0: Boolean hasValue            1 byte
               1-3: padding                     3 bytes
               4-7: Single value                4 bytes
       92-99: Nullable`1 Quantity               8 bytes
                 0: Boolean hasValue            1 byte
               1-3: padding                     3 bytes
               4-7: Int32 value                 4 bytes
     100-103: padding                           4 bytes
     104-119: Nullable`1 ArrivalDate           16 bytes
                 0: Boolean hasValue            1 byte
               1-7: padding                     7 bytes
              8-15: DateTime value              8 bytes
                       0-7: UInt64 _dateData    8 bytes
     120-121: Nullable`1 Fragile                2 bytes
                 0: Boolean hasValue            1 byte
                 1: Boolean value               1 byte
     122-127: padding                           6 bytes
     128-143: Nullable`1 LastStockCheckDate    16 bytes
                 0: Boolean hasValue            1 byte
               1-7: padding                     7 bytes
              8-15: DateTime value              8 bytes
                       0-7: UInt64 _dateData    8 bytes

As you can see, there is a huge amount of wasted space here, most of which is because of the nullability. Each nullable field injects an additional byte, and padding and layout issues really explode the size of the struct.

Here is an alternative layout, which conveys the same information much more compactly. The idea is that instead of having a full byte for each nullable field (with the impact on padding, etc.), we’ll have a single bitmap for all nullable fields. Here is how this looks:

    using System;

    public struct WarehouseItem
    {
        public Dimensions ProductDimensions;
        public bool HasProductDimensions => (_nullability & (1 << 0)) != 0;
        public long ExternalSku;
        public bool HasExternalSku => (_nullability & (1 << 1)) != 0;
        public TimeSpan ShelfLife;
        public bool HasShelfLife => (_nullability & (1 << 2)) != 0;
        public float AlcoholContent;
        public bool HasAlcoholContent => (_nullability & (1 << 3)) != 0;
        public DateTime ProductionDate;
        public bool HasProductionDate => (_nullability & (1 << 4)) != 0;
        public int RgbColor;
        public bool HasRgbColor => (_nullability & (1 << 5)) != 0;
        public bool IsHazardous;
        public bool HasIsHazardous => (_nullability & (1 << 6)) != 0;
        public float Weight;
        public bool HasWeight => (_nullability & (1 << 7)) != 0;
        public int Quantity;
        public bool HasQuantity => (_nullability & (1 << 8)) != 0;
        public DateTime ArrivalDate;
        public bool HasArrivalDate => (_nullability & (1 << 9)) != 0;
        public bool Fragile;
        public bool HasFragile => (_nullability & (1 << 10)) != 0;
        public DateTime LastStockCheckDate;
        public bool HasLastStockCheckDate => (_nullability & (1 << 11)) != 0;

        // One bit per nullable field; a set bit means the field has a value.
        private ushort _nullability;

        public struct Dimensions
        {
            public float Length;
            public float Width;
            public float Height;
        }
    }

If we look deeper into this, we’ll see that this saved a lot: the struct is now 96 bytes in size. It’s a massive space saving, but…

    Type layout for 'WarehouseItem'
    Size: 96 bytes. Paddings: 24 bytes (%25 of empty space)

We still have a lot of wasted space. This is because we haven’t organized the struct to eliminate padding. Let’s reorganize the struct’s fields to see what we can achieve. The only change I made was to re-arrange the fields, and we have:

    using System;

    public struct WarehouseItem
    {
        // Fields are ordered so that each one lands on its natural alignment with no gaps.
        public Dimensions ProductDimensions;
        public float AlcoholContent;
        public long ExternalSku;
        public TimeSpan ShelfLife;
        public DateTime ProductionDate;
        public DateTime ArrivalDate;
        public DateTime LastStockCheckDate;
        public float Weight;
        public int Quantity;
        public int RgbColor;
        public bool Fragile;
        public bool IsHazardous;
        private ushort _nullability;

        public bool HasProductDimensions => (_nullability & (1 << 0)) != 0;
        public bool HasExternalSku => (_nullability & (1 << 1)) != 0;
        public bool HasShelfLife => (_nullability & (1 << 2)) != 0;
        public bool HasAlcoholContent => (_nullability & (1 << 3)) != 0;
        public bool HasProductionDate => (_nullability & (1 << 4)) != 0;
        public bool HasRgbColor => (_nullability & (1 << 5)) != 0;
        public bool HasIsHazardous => (_nullability & (1 << 6)) != 0;
        public bool HasWeight => (_nullability & (1 << 7)) != 0;
        public bool HasQuantity => (_nullability & (1 << 8)) != 0;
        public bool HasArrivalDate => (_nullability & (1 << 9)) != 0;
        public bool HasFragile => (_nullability & (1 << 10)) != 0;
        public bool HasLastStockCheckDate => (_nullability & (1 << 11)) != 0;

        public struct Dimensions
        {
            public float Length;
            public float Width;
            public float Height;
        }
    }

And the struct layout is now:

    Type layout for 'WarehouseItem'
    Size: 72 bytes. Paddings: 0 bytes (%0 of empty space)
      0-11: Dimensions ProductDimensions     12 bytes
               0-3: Single Length             4 bytes
               4-7: Single Width              4 bytes
              8-11: Single Height             4 bytes
     12-15: Single AlcoholContent             4 bytes
     16-23: Int64 ExternalSku                 8 bytes
     24-31: TimeSpan ShelfLife                8 bytes
     32-39: DateTime ProductionDate           8 bytes
     40-47: DateTime ArrivalDate              8 bytes
     48-55: DateTime LastStockCheckDate       8 bytes
     56-59: Single Weight                     4 bytes
     60-63: Int32 Quantity                    4 bytes
     64-67: Int32 RgbColor                    4 bytes
        68: Boolean Fragile                   1 byte
        69: Boolean IsHazardous               1 byte
     70-71: UInt16 _nullability               2 bytes

We have no wasted space, and we are at half of the original 144 bytes.

We can actually do better. Note that Fragile and IsHazardous are Booleans, and we have some free bits on _nullability that we can repurpose. For that matter, RgbColor only needs 24 bits, not 32. Do we need alcohol content to be a float, or can we use a byte? If that is the case, can we shove both of them together into the same 4 bytes?

For dates, can we use DateOnly instead of DateTime? What about ShelfLife, can we measure that in hours and use a ushort for that (giving us a maximum of 7 years)? After all of that, we end up with the following structure:

    using System;

    public struct WarehouseItem
    {
        public Dimensions ProductDimensions;
        public float Weight;
        public long ExternalSku;
        public DateOnly ProductionDate;
        public DateOnly ArrivalDate;
        public DateOnly LastStockCheckDate;
        public int Quantity;
        // Low 8 bits: alcohol content; high 24 bits: RGB color.
        private int _rgbColorAndAlcoholContentBacking;
        private ushort _nullability;
        public ushort ShelfLifeInHours;

        public float AlcoholContent => (float)(byte)_rgbColorAndAlcoholContentBacking;
        // Shift as unsigned so a color with the top bit set does not come back negative.
        public int RgbColor => (int)((uint)_rgbColorAndAlcoholContentBacking >> 8);

        // The two Boolean values reuse spare bits of the nullability bitmap.
        public bool Fragile => (_nullability & (1 << 12)) != 0;
        public bool IsHazardous => (_nullability & (1 << 13)) != 0;

        public bool HasProductDimensions => (_nullability & (1 << 0)) != 0;
        public bool HasExternalSku => (_nullability & (1 << 1)) != 0;
        public bool HasShelfLife => (_nullability & (1 << 2)) != 0;
        public bool HasAlcoholContent => (_nullability & (1 << 3)) != 0;
        public bool HasProductionDate => (_nullability & (1 << 4)) != 0;
        public bool HasRgbColor => (_nullability & (1 << 5)) != 0;
        public bool HasIsHazardous => (_nullability & (1 << 6)) != 0;
        public bool HasWeight => (_nullability & (1 << 7)) != 0;
        public bool HasQuantity => (_nullability & (1 << 8)) != 0;
        public bool HasArrivalDate => (_nullability & (1 << 9)) != 0;
        public bool HasFragile => (_nullability & (1 << 10)) != 0;
        public bool HasLastStockCheckDate => (_nullability & (1 << 11)) != 0;

        public struct Dimensions
        {
            public float Length;
            public float Width;
            public float Height;
        }
    }

And with the following layout:

    Type layout for 'WarehouseItem'
    Size: 48 bytes. Paddings: 0 bytes (%0 of empty space)
      0-11: Dimensions ProductDimensions              12 bytes
               0-3: Single Length                      4 bytes
               4-7: Single Width                       4 bytes
              8-11: Single Height                      4 bytes
     12-15: Single Weight                              4 bytes
     16-23: Int64 ExternalSku                          8 bytes
     24-27: DateOnly ProductionDate                    4 bytes
               0-3: Int32 _dayNumber                   4 bytes
     28-31: DateOnly ArrivalDate                       4 bytes
               0-3: Int32 _dayNumber                   4 bytes
     32-35: DateOnly LastStockCheckDate                4 bytes
               0-3: Int32 _dayNumber                   4 bytes
     36-39: Int32 Quantity                             4 bytes
     40-43: Int32 _rgbColorAndAlcoholContentBacking    4 bytes
     44-45: UInt16 _nullability                        2 bytes
     46-47: UInt16 ShelfLifeInHours                    2 bytes

In other words, we are now packing everything into 48 bytes, which means that we are at one-third of the initial cost while still representing the same data. Our previous Warehouse class? It used to take 137MB for a million items; it would now take only 45.7 MB.

In RavenDB’s case, we had the following: [image]

That is the backing store of the dictionary, and as you can see, it isn’t a nice one. Using similar techniques, we are able to massively reduce the amount of storage that is required to process indexing.

Here is what this same scenario looks like now: [image]

But we aren’t done yet; there is still more that we can do.
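The declarations above only show the read side of the bitmap and of the packed color/alcohol field. As a minimal sketch of what the write side could look like, here is a cut-down struct with setters. The `PackedItem` type, the `Set…` methods and the `SetFlag` helper are all hypothetical names that do not appear in the post, and storing alcohol content as a whole-number byte is an assumption that follows the post's suggestion.

```csharp
using System;

public struct PackedItem
{
    // Low 8 bits: alcohol content (assumed whole percent); high 24 bits: RGB color.
    private int _rgbColorAndAlcoholContentBacking;
    // One bit per optional field; bit positions follow the post's Has* properties.
    private ushort _nullability;

    private const int AlcoholBit = 3;
    private const int RgbBit = 5;

    public bool HasAlcoholContent => (_nullability & (1 << AlcoholBit)) != 0;
    public bool HasRgbColor => (_nullability & (1 << RgbBit)) != 0;

    public byte AlcoholContent => (byte)_rgbColorAndAlcoholContentBacking;
    // Unsigned shift so colors with the top bit set are not sign-extended.
    public int RgbColor => (int)((uint)_rgbColorAndAlcoholContentBacking >> 8);

    // Hypothetical helper: set or clear one bit of the nullability bitmap.
    private void SetFlag(int bit, bool hasValue)
    {
        _nullability = hasValue
            ? (ushort)(_nullability | (1 << bit))
            : (ushort)(_nullability & ~(1 << bit));
    }

    // Passing null clears the value and marks the field as absent.
    public void SetAlcoholContent(byte? percent)
    {
        int low = percent ?? 0;
        _rgbColorAndAlcoholContentBacking = (_rgbColorAndAlcoholContentBacking & ~0xFF) | low;
        SetFlag(AlcoholBit, percent.HasValue);
    }

    // Only the low 24 bits of the color are kept.
    public void SetRgbColor(int? rgb)
    {
        int high = ((rgb ?? 0) & 0xFFFFFF) << 8;
        _rgbColorAndAlcoholContentBacking = (_rgbColorAndAlcoholContentBacking & 0xFF) | high;
        SetFlag(RgbBit, rgb.HasValue);
    }
}
```

The trade-off is that every write now goes through masking code instead of a plain field assignment, which is part of the maintenance cost of a layout like this.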

Polyfills in .NET to ease multi-targeting

by Gérald Barré

posted on: July 31, 2023

When you write a .NET library, you may want to target multiple target framework monikers (TFM). For instance, you may want to target .NET 6 and .NET Standard 2.0. This allows your library to be used by more applications. By targeting the latest supported framework, you can also use Nullable Referen