Page 12 • Relatively General .NET

On the role of design documents

by Oren Eini

posted on: March 11, 2025

When we build a new feature in RavenDB, we either have at least some idea about what we want to build or we are doing something that is pure speculation. In either case, we will usually spend only a short amount of time trying to plan ahead.

A good example of that can be found in my RavenDB 7.1 I/O posts, which cover 6+ months of work for a major overhaul of the system. That was done mostly as a series of discussions between team members, guidance from the profiler, and our experience, seeing where the path would lead us. In that case, it led us to a five-fold performance improvement (and we’ll do better still by the time we are done there).

That particular set of changes is one of the more complex and hard-to-execute changes we have made in RavenDB over the past 5 years or so. It touched a lot of code, it changed a lot of stuff, and it was done without any real upfront design. There wasn’t much point in designing: we knew what we wanted to do (get things faster), and the way forward was to remove obstacles until we were fast enough or ran out of time.

I re-read the last couple of paragraphs, and it may look like cowboy coding, but that is very much not the case. There is a process there; it is just not something we would find valuable to put down as a formal design document. The key here is that we have a good understanding of both what we are doing and what needs to be done.

RavenDB 4.0 design document

The design document we created for RavenDB 4.0 is probably the most important one in the project’s history. I just went through it again; it is over 20 pages of notes and details that discuss the current state of RavenDB at the time (written in 2015) and ideas about how to move forward.

It is interesting because I remember writing this document. And then we set out to actually make it happen. That wasn’t a minor update.
It took close to three years to complete the process, to give you some context about the complexity and scale of the task.

To give some further context, here is an image from that document:

And here is the sharding feature in RavenDB right now:

This feature is called prefixed sharding in our documentation. It is the direct descendant of the image from the original 4.0 design document. We shipped that feature sometime last year. So we are talking about 10 years from “design” to implementation.

I’m using “design” in quotes here because when I go through this v4.0 design document, I can tell you that pretty much nothing that ended up in that document was implemented as envisioned. In fact, most of the things there were abandoned because we found much better ways to do the same thing, or we narrowed the scope so we could actually ship on time.

Comparing the design document to what RavenDB 4.0 ended up being is really interesting, but it is very notable that there isn’t much similarity between the two. And yet that design document was a fundamental part of the process of moving to v4.0.

What Are Design Documents?

A classic design document details the architecture, workflows, and technical approach for a software project before any code is written. It is the roadmap that guides the development process.

For RavenDB, we use them as both a sounding board and a way to lay the foundation for our understanding of the actual task we are trying to accomplish. The idea is not so much to build the design for a particular feature, but to have a good understanding of the problem space and map out various things that could work.

Recent design documents in RavenDB

I’m writing this post because I found myself writing multiple design documents in the past 6 months. More than I have written in years. Now that RavenDB 7.0 is out, most of those are already implemented and available to you.
That gives me the chance to compare the design process and the implementation with recent work.

Vector Search & AI Integration for RavenDB

This was written in November 2024. It outlines what we want to achieve at a very high level. Most importantly, it starts by discussing what we won’t be trying to do, rather than what we will. Limiting the scope of the problem can be a huge force multiplier in such cases, especially when dealing with new concepts.

Reading through that document, it lays out the external-facing aspect of vector search in RavenDB. You have the vector.search() method in RQL, a discussion on how it works in other systems, and some ideas about vector generation and usage.

It doesn’t cover implementation details or how it will look from the perspective of RavenDB. This is at the level of the API consumer: what we want to achieve, not how we’ll achieve it.

AI Integration with RavenDB

Given that we have vector search, the next step is how to actually get and use it. This design document was a collaborative process, mostly written during and shortly after a big design discussion we had (which lasted for hours).

The idea there was to iron out everyone’s overall understanding of what we want to achieve. We considered things like caching and how it plays into the overall system; there are even notes there at the level of what the field names should be.

That work has already been implemented. You can access it through the new AI button in the Studio. Check out this icon on the sidebar:

That was a much smaller task in scope, but you can see how even something that seemed pretty clear changed as we sat down and actually built it. Concepts we didn’t even think to consider were raised, handled, and implemented (without needing another design).

Voron HNSW Design Notes

This design document details our initial approach to building the HNSW implementation inside Voron, the basis for RavenDB’s new vector search capabilities.

That one is really interesting because it is a pure algorithmic implementation, completely internal to our usage (so no external API is needed), and I wrote it after extensive research. The end result is similar to what I planned, but there are still significant changes. In fact, pretty much all the actual implementation details are different from the design document. That is both expected and a good thing because it means that once we dove in, we were able to do things in a better way.

Interestingly, this is often the result of other constraints forcing you to do things differently. And then everything rolls down from there. “If you have a problem, you have a problem. If you have two problems, you have a path for a solution.”

In the case of HNSW, a really complex part of the algorithm is handling deletions. In our implementation, there is a vector, and it has an associated posting list attached to it with all the index entries. That means we can implement deletion simply by emptying the associated posting list. An entire section in the design document (and hours spent pondering) is gone, just like that.

If design documents don’t reflect the end result of the system, are they useful?

I would unequivocally state that they are tremendously useful. In fact, they are crucial for us to be able to tackle complex problems. The most important aspect of design documents is that they capture our view of what the problem space is.

Beyond their role in planning, design documents serve another critical purpose: they act as a historical record. They capture the team’s thought process, documenting why certain decisions were made and how challenges were addressed.
This is especially valuable for a long-lived project like RavenDB, where future developers may need context to understand the system’s evolution.

Imagine a design document that explores a feature in detail, outlining options, discussing trade-offs, and addressing edge cases like caching or system integrations. The end result may be different, but the design document, the feature documentation (both public and internal), and the issue & commit logs serve to capture the entire process very well. Sometimes, looking at the road not taken can give you a lot more information than looking at what you did.

I consider design documents to be a very important part of the way we design our software. At the same time, I don’t find them binding; we’ll write the software and see where it leads us in the end.

What are your expectations and experience with writing design documents? I would love to hear additional feedback.

.NET and .NET Framework March 2025 servicing releases updates

by Tara, Rahul

posted on: March 11, 2025

A recap of the latest servicing updates for .NET and .NET Framework for March 2025.

Running an ASP.NET Core app inside IIS in a Windows container

by Andrew Lock

posted on: March 11, 2025

In this post I describe how to run an ASP.NET Core app inside IIS in a Windows Docker container…

How to iterate on a ConcurrentDictionary: foreach vs Keys/Values

by Gérald Barré

posted on: March 10, 2025

There are multiple ways to iterate on a ConcurrentDictionary<TKey, TValue> in .NET. You can use the foreach loop, or the Keys and Values properties. The two approaches give different results, and you should choose the one that best fits your needs.

Using the foreach loop

The foreach loop lazily i…

RavenDB 7.0 Released

by Oren Eini

posted on: March 07, 2025

One of the “minor” changes in RavenDB 7.0 is that we moved from our own in-house logging system to using NLog. I looked through my previous blog posts and found the one from 2016 outlining the rationale for the decision to use our own logging infrastructure. At the time, no other logging framework was able to sustain the kind of performance that we required.

The .NET community has come a long way since then, and it has become clear that we need to revisit this decision. Performance now has a much higher priority in .NET, and the API at all levels supports that (spans, avoiding allocations, etc).

The move to NLog gives users a much simpler way to integrate RavenDB logs into their monitoring & observability pipeline. You can read about the new NLog feature in our blog.

We also spent time making the logs view inside the RavenDB Studio nicer, taking advantage of the new capabilities we now expose:

Hopefully, you won’t need to dig too deeply into the logs, but it is now easier than ever to use them.

.NET AI Template Now Available in Preview

by Jordan Matthiesen

posted on: March 06, 2025

Announcing the first preview of the .NET AI Template, for Visual Studio, Visual Studio Code, and the .NET CLI. Get started building amazing AI apps with .NET.

RavenDB 7.0 Released

by Oren Eini

posted on: March 05, 2025

RavenDB 7.0 adds Snowflake integration to the set of ETL targets it supports. Snowflake is a data warehouse solution, designed for analytics and data at scale. RavenDB is aimed at transactional scenarios and has a really good story around data distribution and wide geographical deployments.

You can check out the documentation to read the details about how you can use this integration to push data from RavenDB to your Snowflake database. In this post, I want to introduce one usage scenario for such integration.

RavenDB is commonly deployed on the edge, running on site in grocery stores, restaurants’ self-serve kiosks, supermarket checkout counters, etc. Such environments have to be tough and resilient to errors, network problems, mishandling, and much more. We had to field support calls in the style of “there is ketchup all over the database”, for example.

In such environments, you must operate completely independently of the cloud, both because of latency and performance issues and because you must keep yourself up & running even if the Internet is down. RavenDB does very well in such a scenario because of its internal architecture, the ability to run in a multi-master configuration, and its replication capabilities.

From a business perspective, that is a critical capability: to push data to the edge and be able to operate independently of any other resource. At the same time, this represents a significant challenge since you lose the ability to have an overall view of what is going on.

RavenDB’s Snowflake integration is there to bridge this gap. The idea is that you can define Snowflake ETL processes that would push the data from all the branches you have to a single shared Snowflake account. Your headquarters can then run queries, analyse the data, and in general have near real-time analytics without hobbling the branches with having to manage the data remotely.

The Grocery Store Scenario

In our grocery store, we manage the store using RavenDB as the backend.
We use documents such as this to record sales:

    {
        "Items": [
            { "ProductId": "P123", "ProductName": "Milk", "QuantitySold": 5 },
            { "ProductId": "P456", "ProductName": "Bread", "QuantitySold": 2 },
            { "ProductId": "P789", "ProductName": "Eggs", "QuantitySold": 10 }
        ],
        "Timestamp": "2025-02-28T12:00:00Z",
        "@metadata": { "@collection": "Orders" }
    }

And this document to record other inventory updates in the store:

    {
        "ProductId": "P123",
        "ProductName": "Milk",
        "QuantityDiscarded": 3,
        "Reason": "Spoilage",
        "Timestamp": "2025-02-28T14:00:00Z",
        "@metadata": { "@collection": "DiscardedInventory" }
    }

These documents are repeated many times for each store, recording the movement of inventory, tracking sales, etc. Now we want to share those details with headquarters. There are two ways to do that.

One is to use the Snowflake ETL to push the data itself to the HQ’s Snowflake account. You can see an example of that when we push the raw data to Snowflake in this article.

The other way is to make use of RavenDB’s map/reduce capabilities to do some of the work in each store and only push the summary data to Snowflake.
This can be done in order to reduce the load on Snowflake if you don’t need a granular level of data for analytics.

Here is an example of such a map/reduce index:

    // Index Name: ProductConsumptionSummary
    // Output Collection: ProductsConsumption
    map('Orders', function (order) {
        // The product details live in order.Items, so emit one entry per item
        return order.Items.map(function (item) {
            return {
                ProductId: item.ProductId,
                TotalSold: item.QuantitySold,
                TotalDiscarded: 0,
                Date: dateOnly(order.Timestamp)
            };
        });
    });

    map('DiscardedInventory', function (discard) {
        return {
            ProductId: discard.ProductId,
            TotalSold: 0,
            TotalDiscarded: discard.QuantityDiscarded,
            Date: dateOnly(discard.Timestamp)
        };
    });

    groupBy(x => ({ ProductId: x.ProductId, Date: x.Date }))
        .aggregate(g => {
            return {
                ProductId: g.key.ProductId,
                Date: g.key.Date,
                // Sum the per-entry counters within each (ProductId, Date) group
                TotalSold: g.values.reduce((sum, val) => sum + val.TotalSold, 0),
                TotalDiscarded: g.values.reduce((sum, val) => sum + val.TotalDiscarded, 0)
            };
        });

This index will output its documents to the artificial collection: ProductsConsumption.

We can then define a Snowflake ETL task that would push that to Snowflake, like so:

    loadToProductsConsumption({
        PRODUCT_ID: doc.ProductId,
        STORE_ID: load('config/store').StoreId,
        TOTAL_SOLD: doc.TotalSold,
        TOTAL_DISCARDED: doc.TotalDiscarded,
        DATE: doc.Date
    });

With that in place, each branch would push details about its sales, inventory discarded, etc., to the Snowflake account. And headquarters can run their queries and get a near real-time view and understanding of what is going on globally.

You can read more about Snowflake ETL here. The full docs, including all the details on how to set up properly, are here.
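
To make the map and reduce steps of the index more concrete, here is a rough sketch in plain JavaScript that mimics them over the two sample documents from this post. Nothing here is RavenDB API: mapEntries, reduceEntries, and the dateOnly stand-in are hypothetical helpers written only for illustration, and the sketch assumes the sold quantities are read from each order's Items array, as the sample Orders document suggests.

```javascript
// Illustration only: plain JavaScript, not RavenDB index code.
// dateOnly is a hypothetical stand-in that keeps the date part of an ISO timestamp.
function dateOnly(timestamp) {
    return timestamp.slice(0, 10);
}

// The two sample documents from the post (metadata omitted).
const orders = [{
    Items: [
        { ProductId: "P123", ProductName: "Milk", QuantitySold: 5 },
        { ProductId: "P456", ProductName: "Bread", QuantitySold: 2 },
        { ProductId: "P789", ProductName: "Eggs", QuantitySold: 10 }
    ],
    Timestamp: "2025-02-28T12:00:00Z"
}];

const discarded = [{
    ProductId: "P123", ProductName: "Milk", QuantityDiscarded: 3,
    Reason: "Spoilage", Timestamp: "2025-02-28T14:00:00Z"
}];

// Map step: one entry per sold item and one entry per discard record.
function mapEntries(orderDocs, discardDocs) {
    const entries = [];
    for (const order of orderDocs) {
        for (const item of order.Items) {
            entries.push({
                ProductId: item.ProductId,
                TotalSold: item.QuantitySold,
                TotalDiscarded: 0,
                Date: dateOnly(order.Timestamp)
            });
        }
    }
    for (const discard of discardDocs) {
        entries.push({
            ProductId: discard.ProductId,
            TotalSold: 0,
            TotalDiscarded: discard.QuantityDiscarded,
            Date: dateOnly(discard.Timestamp)
        });
    }
    return entries;
}

// Reduce step: group by (ProductId, Date) and sum both counters.
function reduceEntries(entries) {
    const groups = new Map();
    for (const entry of entries) {
        const key = `${entry.ProductId}|${entry.Date}`;
        const acc = groups.get(key) ||
            { ProductId: entry.ProductId, Date: entry.Date, TotalSold: 0, TotalDiscarded: 0 };
        acc.TotalSold += entry.TotalSold;
        acc.TotalDiscarded += entry.TotalDiscarded;
        groups.set(key, acc);
    }
    return [...groups.values()];
}

const summary = reduceEntries(mapEntries(orders, discarded));
console.log(summary);
```

For the sample data, the milk sale and the milk spoilage record fall into the same (ProductId, Date) group, so the summary holds three entries, with P123 carrying both a sold and a discarded total. Documents of that shape are what each store would push to Snowflake instead of the raw records.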

Creating an analyzer to detect infinite loops caused by ThreadAbortExceptions

by Andrew Lock

posted on: March 04, 2025

In this post I describe a Roslyn Analyzer I created to detect code that can result in infinite loops if a ThreadAbortException is raised…

Unlock new possibilities for AI Evaluations for .NET

by Wendy Breiding (SHE/HER)

posted on: March 03, 2025

Microsoft.Extensions.AI.Evaluations library is now open source, and a new Azure DevOps plug-in is available to make reporting in your CI pipelines easier than ever.

Listen to clipboard changes in a WPF application

by Gérald Barré

posted on: March 03, 2025

To improve the UX, it can be useful to listen to clipboard changes. For instance, you can fill a 2FA code in a text box when the user copies it to the clipboard.

This can be done using the AddClipboardFormatListener function. Here is a simple example of how to listen to clipboard changes in a WPF ap…