Microsoft Copilot is powered by several open-source tools, such as SignalR, Adaptive Cards, Markdown, and object-basin to solve the unique challenges in building AI-enabled applications at scale. In this article, we share the design considerations and how we integrated various tools with a focus on how we stream messages and responses to the front-end UI while giving some overview of what happens on the server-side.
Join Oren Eini, CEO of RavenDB, as he explores the design and implementation of RavenDB’s indexing engine Corax, its impact on indexing and query performance, and how the engine addresses common challenges such as slow data retrieval, high hosting expenses, and sluggish development processes. You’ll also gain valuable insights into the architecture's performance costs and its ability to unlock efficiency in data handling. You can watch it now.
This happened a few minutes ago: I got a call from an unknown number. That was my wife’s work number, and she called to ask me an urgent question, it seems:
“Can you tell me how to compress a PDF file?” she asked.
For the next part, it might be better if I paint you the whole picture. Imagine bullet time, where everything slows down, and I start to analyze the question and my possible answer. The following thoughts run through my mind during that time.
PDF files are already compressed by default.
Pretty sure that the file format is already using compression.
You could strip unneeded elements from the file, removing fonts is one example, I think.
If there are images, can probably downscale or re-sample them to reduce their size.
What about just running this through Zip?
Where did this question come from?
That took about two seconds in real time. The decision tree for any possible answer here grew exponentially. I had to make a call.
“No, that isn’t easily possible,” I answered.
I got some more details as well.
“This is for uploading a document to the XYZ system, it only accepts up to 4MB files, but this PDF is 5.5MB. I guess I can just scan this document as two separate pages instead of one, right?”
A workaround found, and a detailed dive into lossless vs. lossy compression compared to the file format choice avoided, I agreed that this was probably the best option and finished my coffee, pondering the ethical dilemma of answering the actual question or the intended question.
In C#, there are different ways to check if a collection is empty. Depending on the type of collection, you can check the Length, Count or IsEmpty property. Or you can use the Enumerable.Any() extension method.
// array: Length
int[] array = ...;
var isEmpty = array.Length == 0;
// List: Coun
I’m currently playing with a Secret Project (code-named Hugin right now) and for that purpose, I literally ordered all the available Raspberry Pi in Israel. That last statement sounds like a joke, but we checked six to eight places, and our order quantity exceeds the inventory in the country. They are flying the units to us as you read this.
I would love to hear what you think I’m doing, by the way. Please share your thoughts on the matter in the comments.
For Hugin, I’m playing with Pi Zero 2 W, which is about the size of a lighter. They are small, and somewhat underpowered, but really cool. They also run RavenDB surprisingly well, but I’ll touch on that in a later post.
The drawback of the Zero is that basically it has two ports: a micro USB and a mini-HDMI. There is also a micro USB for power, but for doing stuff with it, just those two ports. If you are like me, you have more micro USB power cables than you know what to do with. However, micro USB on-the-go connectors or mini-HDMI are far rarer these days.
I want this to be useful and easy, so I started thinking about how I could make it simpler to work with. Then I realized that the Zero model I’m using (2 W) has built-in wifi, and that meant that I could start getting smart. The idea is that we can turn the Zero into an access point, so all you’ll need is to plug it into power (using a micro USB cable you likely already have), wait half a minute, and connect to the machine.
Once I had the idea, I delved deep into figuring out how to make it work. I managed, and the entire process is pretty simple from a user perspective, but it was anything but to make it work.
For the rest of this post, I will be working with the Raspberry Pi Zero 2 W, using Raspberry Pi OS Lite (Legacy, 32 bits) (Debian Bullseye).
I tested this on a range of Pis (I apparently got lots, from Raspberry Pi 3 B to the Raspberry Pi 400), and it worked on everything I tried.
I actually tried quite hard to get it working on the Raspberry Pi OS (the non-legacy, which is Debian Bookworm). However, I couldn’t get it to behave the way I wanted it to. Setting up a wifi hotspot on Bookworm is easy, but getting it to bind DNS and DHCP to a particular device was beyond my capabilities. From my reading, it doesn’t look like I’m the only one running into issues here.
The basic idea is that on connecting to a WiFi network, most devices will check connectivity and display the captive portal page if needed. In this case, we simply provide the captive portal page to our application. Hence, the only thing you need to do is to connect to the hotspot, and everything else is handled for you.
This blog post was really helpful figuring things out.
How this works, however, is a whole other matter. I’m assuming that you are running on a clean slate, booting for the first time on the clean image of Raspberry Pi Lite (Bullseye). The first thing to do is to set up the wifi, DNS, and DHCP, like so:
sudo raspi-config nonint do_wifi_country IL
sudo rfkill unblock wifi
sudo apt-get install -y nginx dnsmasq dhcpcd
We first set up the country for wifi, unblock it, and install nginx, dnsmasq, and dhcpcd. Our next step is to update /etc/wpa_supplicant/wpa_supplicant.conf to create the actual hotspot:
ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev
}
We define the MyHotSpot network as an open (key_mgmt=NONE) access point (mode=2). We need to plug this into the DHCP configuration in /etc/dhcpcd.conf:
hostname
option rapid_commit
option domain_name_servers, domain_name, domain_search, host_name
option classless_static_routes
option interface_mtu
require dhcp_server_identifier
slaac private
env wpa_supplicant_conf=/etc/wpa_supplicant/wpa_supplicant.conf
interface wlan0
static ip_address=
The last part is the most important bit. We pull the wpa_supplicant configuration that we previously defined, apply it to the WiFi device (wlan0), and register a static IP for that interface. Basically, the WiFi interface will use that IP address as the gateway for clients connecting to it. Those clients need to get their own IP addresses, and that is the role of dnsmasq (no idea why it isn’t a dhcpcd that does it, it’s literally in the name). Here is the relevant configuration file, /etc/dnsmasq.conf:
listen-address=
# Resolve everything to the portal's IP address.
# Android Internet Connectivity Test Domains
address=/
There is a lot going on here. We define the DHCP range from which clients will get their IPs and set the router for this connection. We also define option 114 (and 160, which is a legacy one) to instruct the client that it needs to first visit that URL before connecting to the wider internet. Finally, we set up DNS in such a way that all DNS entries go to the server, except for a certain set of known domains used by some Android phones to check for an internet connection. We’ll touch on that in a bit.
In short, all of this configuration basically tells the Zero to create a WiFi hotspot with IP, assign connected devices IP addresses in the range .., set the DNS server for those devices to, and resolve any DNS query to IP. Also, if they care to, there is a specific URL users need to visit to get things started. Put differently, we are trying to guide the user to take us to the right place.
One problem we have, however, is that we didn’t set up anything to respond to HTTP requests. That is why we installed nginx earlier. We configure it using /etc/nginx/sites-available/default:
server {
    listen 80 default_server;
    listen [::]:80 default_server;
    server_name _;
    location / {
        return 302 http://awesome.appliance;
    }
}
server {
    listen *:80;
    server_name awesome.appliance;
    root /var/appliance/web;
    autoindex on;
}
The idea here is simple. Everything before basically directs the client to the server, all domains go to it, etc. So when a connection comes, we tell nginx that it should return a 302 response (redirect) to the portal endpoint we have. If the client is requesting the http://awesome.appliance address, however, we serve an actual website.
All of this together ends up with an open access point that, upon connection, will direct you to a web page. This is a walled garden, of course, since we assume that the Zero is connected only to the power. Now that this is solved, you need to figure out what function you want the appliance to actually have.
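To make the redirect rule a bit more concrete, here is a minimal Python sketch of the decision that the two nginx server blocks express: any host other than the portal gets a 302 to the portal, and the portal host itself gets real content. This is only an illustration of the logic, not how nginx implements it. The function name route and the returned description strings are hypothetical; the host name and web root are the ones used in the configuration above.

```python
from typing import Tuple

# Portal host name and web root, taken from the nginx configuration in the post.
PORTAL_HOST = "awesome.appliance"
WEB_ROOT = "/var/appliance/web"


def route(host_header: str) -> Tuple[int, str]:
    """Hypothetical helper: decide what to do with a request based on its Host header.

    Returns (status_code, detail), where detail is either the redirect
    target or a description of what would be served.
    """
    # Drop an optional ":port" suffix and normalize case before comparing.
    host = host_header.split(":")[0].strip().lower()
    if host == PORTAL_HOST:
        # Second server block: serve the actual website.
        return 200, "serve files from " + WEB_ROOT
    # Default server block: every other name is redirected to the portal.
    return 302, "http://" + PORTAL_HOST


if __name__ == "__main__":
    for name in ("example.com", "awesome.appliance", "awesome.appliance:80"):
        print(name, "->", route(name))
```

Because the DNS setup resolves (almost) every name to the appliance, the first branch is what a freshly connected device hits when it probes for connectivity, and the redirect is what nudges it toward the portal page.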
Real-Time Channel is Microsoft Office Online's service that powers real time collaboration and coauthoring. This blog post describes the journey to migrate the service from .NET Framework to modern .NET.
When we started working on Corax (10 years ago!), we had a pretty simple mission statement for that: “Lucene, but 10 times faster for our use case”. When we actually started implementing this in code (early 2020), we had a few more rules about the direction we wanted to take.
Corax had to be faster than Lucene in all scenarios, and 10 times faster for common indexing and querying scenarios.
Corax design is meant for online indexing, not batch-oriented like Lucene.
We favor moving work to indexing time and ensuring that our data structures on disk can work with no additional processing time.
Lucene was created at a time when data size was much smaller and disks were far more expensive. It shows in the overall design in many ways, but one of the critical aspects is that the file design for Lucene is compressed, meaning that you need to read the data, decode that into the in-memory data structure, and then process it.
For RavenDB’s use case, that turned out to be a serious problem. In particular, the issue of cold queries, where you query the database for the first time and have to pay the initialization cost, was particularly difficult.
Now, cold queries aren’t really that interesting, from a benchmark perspective, you have to warm things up in every software (caches are everywhere, from your disk to your CPU). I like to say that even memory has caches (yes, plural) because it is so slow (L1, L2, L3 caches).
With Lucene’s design, however, whenever it runs an indexing batch, it creates a new file, and to start querying after that means that you have a “cold start” for that file. Usually, those files are small, but every now and then Lucene needs to merge several files together and then we have to pay the cold start price for a large amount of data.
The issue is that this sometimes introduces a high latency spike (hitting us in the P999 targets), which is really hard to smooth over.
We spent a lot of time and engineering resources ensuring that this doesn’t have a big impact on our users. One of the design goals for Corax was to ensure that this doesn’t happen. That we are able to get consistent performance from the system without periodic maintenance tasks.
That led us to a very different internal design. The persistent data structures that we use are meant to be used as is, without initial processing. Everything has a cost, and in this case, it means that the size of Corax on disk is typically somewhat larger than Lucene. The big advantage is that the amount of memory being used by Corax tends to be significantly lower. And in today’s world, disks are far cheaper than memory.
Corax’s cold start time is orders of magnitude faster than Lucene’s cold start time. It turns out that there is a huge impact in another scenario as well, completely unexpected. We continuously run performance tests on our system, and we got some ridiculous results when testing query performance using encrypted databases.
When you use encryption at rest, RavenDB ensures that the only time that your data is decrypted is when there is an active transaction using the data. In other words, even in-memory buffers are encrypted. That applies to documents as well as indexes. It does not apply to the in-memory data that Lucene holds in its cache, though. For Corax, however, all of its state is encrypted.
When we run our benchmark on encrypted database queries, we expect to see either roughly the same performance between Corax and Lucene or see Lucene edging out Corax in this scenario, since it can use its cache without paying decryption costs.
Instead, we got really puzzling results. I tried showing them in bar chart format, but I literally couldn’t make the data fit in a reasonable size. The scenario is testing queries on an encrypted database, using an m5.xlarge instance on AWS.
We are hitting the server with 500 queries/second, and testing for the 99.99 percentile performance.
Indexing Engine    99.99% percentile (ms)    99.99% percentile (seconds)
Lucene             40,210                    40.21
Corax              186                       0.18
Take a look at those numbers! Somehow Corax is absolutely smoking Lucene’s lunch. And I was quite surprised about that. I mean, I’m happy, I guess, that the indexing engine we spent so much time on is doing this well, but any time that we see a performance number that we cannot explain we need to figure out what is going on.
Here is the profiler output for this benchmark, using Lucene.
As you can see, the vast majority of the time is spent decrypting pages. And we are decrypting pages belonging to a stream. Those are the Lucene files, stored (encrypted in this case) inside of Voron. The issue is that the access pattern that Lucene is using forces us to touch large parts of the file. It usually reads a very small portion each time, but in various locations. Given that the data is encrypted, we have to decrypt each of those locations.
Corax, on the other hand, keeps the persistent data structure in such a way that we need to access only specific pages. That means that in terms of the number of pages touched by Corax or Lucene for this particular scenario, Lucene is using a lot more. You’ll usually not notice that since Voron (our storage engine) is memory mapped and those accesses are cheap. When using encrypted storage, however, we need to decrypt the data first, so that was very noticeable.
It’s interesting to note that this also applies to instances where there is memory pressure involved. Corax would tend to touch a lot less memory and have a smaller working set, while Lucene will generate more page faults.
Really interesting results, and I’m both happy and amused that totally different design decisions have led to such a big impact in this scenario. In short, Corax is fast, really fast, and in many more scenarios than we initially thought.
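The point about pages touched can be illustrated with a small Python sketch. It is a toy model, not Voron, Lucene, or Corax code: it only counts how many distinct pages a set of reads lands on, on the assumption that with encrypted storage each distinct page has to be decrypted before it can be used. The page size, the read sizes, and the offsets are all made-up values for illustration, and pages_touched is a hypothetical helper.

```python
# Assumed page size in bytes, chosen only for this illustration.
PAGE_SIZE = 8 * 1024


def pages_touched(reads, page_size=PAGE_SIZE):
    """Return the set of page numbers covered by a list of (offset, length) reads."""
    pages = set()
    for offset, length in reads:
        if length <= 0:
            continue  # an empty read touches nothing
        first = offset // page_size
        last = (offset + length - 1) // page_size
        pages.update(range(first, last + 1))
    return pages


# 100 tiny reads scattered across a large file (hypothetical offsets).
scattered = [(i * 1_000_000, 64) for i in range(100)]

# The same 100 tiny reads, laid out next to each other.
clustered = [(i * 64, 64) for i in range(100)]

if __name__ == "__main__":
    print("scattered reads touch", len(pages_touched(scattered)), "pages")
    print("clustered reads touch", len(pages_touched(clustered)), "pages")
```

Both patterns read the same 6,400 bytes, but in this model the scattered pattern lands on 100 distinct pages while the clustered one stays within a single page. With memory-mapped, unencrypted data that difference is mostly invisible; once every touched page has to be decrypted, it shows up directly in the latency.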