I was writing a fairly low level piece of code recently, this is deep in the guts of how RavenDB handles query. I’m adding a cache for some processing inside of RavenDB and the performance benefits are amazing (three orders of magnitude better). As you can imagine, this is the sort of things that I would really like to get into the main codebase. So I did all the usual taxes, created a pull request and moved to other things. Part of our process is to review all pull requests by another pair of eyes. And while there was some minor comments about the code, there was one comment asking about a particular line that had be break out in cold sweat.
I created a simpler reproduction for discussion purposes, here is the code:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Show hidden characters
public class FileHasher
{
private MemoryCache _cache;
public FileHasher()
{
_cache = new MemoryCache(new MemoryCacheOptions
{
SizeLimit = 1024
});
}
public byte[] ComputeHash(string file)
{
return (byte[])_cache.Get(file) ?? ComputeHashAndPutInCache(file);
}
private byte[] ComputeHashAndPutInCache(string file)
{
byte[] hash = ArrayPool<byte>.Shared.Rent(32);
HashTheFile(file, hash); // actual work that we are caching
_cache.Set(file, hash, new MemoryCacheEntryOptions
{
Size = 32,
PostEvictionCallbacks =
{
new PostEvictionCallbackRegistration
{
EvictionCallback = EvictionCallback
}
}
});
return hash;
}
private void EvictionCallback(object key, object value, EvictionReason reason, object state)
{
ArrayPool<byte>.Shared.Return((byte[])value);
}
private static void HashTheFile(string file, byte[] hash)
{
// let's pretend to hash the file here
// this may take some time, but will be cached 2nd time around
// concurrent requests may cause duplicated work, don't care bout that
var _ = file;
Random.Shared.NextBytes(hash);
}
}
view raw
FileHasher.cs
hosted with ❤ by GitHub
Take a look, and let’s me know if you see the bug and its consequences.
Starting with Visual Studio 2022, you can navigate between subwords using the Subword Navigation feature. This is very convenient when you want to move the caret to a part of an identifier, for example. So PascalCaseIdentifier has 3 sub-words, Pascal, Case, and Identifier. Using subword navigation,
For most developers, Behavior Driven Development is a synonym for Given-When-Then syntax and Cucumber-like frameworks that are supporting it. In this session, we will step back from DSLs like Gherkin and show that BDD in its essence is about the mental approach to writing software based on a specification that describes business scenarios. Additionally, we will use RavenDB to provide an easy and effortless way to implement integrations tests based on the BDD approach. Finally, we will show how the BDD approach can be introduced in your projects without the need to learn any new frameworks.
Another code review comment, this time this is an error message being raised:The comment I made here is that this error message lacks a very important detail. How do I fix this? In this case, the maximum number of connections is controlled by a configuration setting, so telling that to the user as well as what is the actual configuration entry to use is paramount.If we give them all the information, we save the support call that would follow. If the user don’t read the error message, the support engineer likely would, and be able to close the ticket within moment.
I just went over the following pull request, where I found this nugget:I have an admittedly very firm views on the subject of error handling. The difference between good error handling and the merely mediocre can be ten times more lines of code, but also hundred times less troubleshooting in production.And a flat out rule that I have is that whenever you have a usage of the Exception.Message property, that means that you are discarding valuable information aside. This is a pretty horrible thing to do. You should never use the Message property, always use the ToString(), so you’ll get the full details. Yes, that is relevant even if you are showing to the user. Because if you are showing an exception message to the user, you have let it bubbled up all the way, so you might as well give full details.
The dotnet format tool is now a part of the dotnet CLI with .NET 6, and you can use it to easily adopt the new file scoped namespace feature…Keep Reading →
I’ll start with saying that this is not something that is planned in any capacity, I run into this topic recently and decided to dig a little deeper. This post is mostly about results of my research.If you run a file sharing system, you are going to run into a problem very early on. Quite a lot of the files that people are storing are shared, that is a good thing. Instead of storing the same file multiple times, you can store it once, and just keep a reference counter. That is how RavenDB internally deals with attachments, for example. Two documents that have the same attachment (content, not the file name, obviously) will only have a single entry inside of the RavenDB database.When you run a file sharing system, one of the features you really want to offer is the ability to save space in this manner. Another one is not being able to read the users’ files. For many reasons, that is a highly desirable property. For the users, because they can be assured that you aren’t reading their private files. For the file sharing system, because it simplify operations significantly (if you can’t look at the files’ contents, there is a lot that you don’t need to worry about).In other words, we want to store the users’ file, but we want to do that in an encrypted manner. It may sound surprising that the file sharing system doesn’t want to know what it is storing, but it actually does simplify things. For example, if you can look into the data, you may be asked (or compelled) to do so. I’m not talking about something like a government agency doing that, but even feature requests such as “do virus scan on my files”, for example. If you literally cannot do that, and it is something that you present as an advantage to the user, that is much nicer to have.The problem is that if you encrypt the files, you cannot know if they are duplicated. And then you cannot use the very important storage optimization technique of de-duplication. What can you do then?This is where convergent encryption comes into play. The idea is that we’ll use an encryption system that will give us the same output for the same input even when using different keys. To start with, that sounds like a tall order, no? But it turns out that it is quite easy to do. Consider the following symmetric key operations:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Show hidden characters
public static byte[] Encrypt(byte[] message, byte[] nonce, byte[] key);
public static byte[] Decrypt(byte[] cipherText, byte[] nonce, byte[] key);
view raw
encryption.cs
hosted with ❤ by GitHub
One of the key aspects of modern cryptography is the use of nonce, that ensures that for the same message and key, we’ll always get a different ciphertext. In this case, we need to go the other way around. To ensure that for different keys, the same content will give us the same output. Here is how we can utilize the above primitives for this purpose. Here is what this looks like:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Show hidden characters
public (byte[] CipherText, byte[] UserKey) ConvergentEncryption(byte[] message, byte[] key)
{
var hash = Hash(message);
var emptyNonce = new byte[32]; // initialized to zero
var ciphertext = Encrypt(message, emptyNonce, key: hash);
var nonce = GenerateNonce();
var encryptedKey = Encrypt(message: hash, nonce, key);
return (ciphertext, nonce.Concat(enecryptedKey));
}
public byte[] ConvergentDecryption(byte[] ciphertext, byte[] userKey, byte[] key)
{
var nonce = userKey[0..32];
var encryptedKey = userKey[32..64];
var hash = Decrypt(encryptedKey, nonce, key);
var emptyNonce = new byte[32]; // zeroed
return Decrypt(ciphertext, emptyNonce, hash);
}
view raw
encryption.cs
hosted with ❤ by GitHub
In other words, we have a two step process. First, we compute the cryptographic hash of the message, and use that as the key to encrypt the data. We use a static nonce for this part, more on that later. We then take the hash of the file and encrypt that normally, with the secret key of the user and a random nonce. We can then push both the ciphertext and the nonce + encrypted key to the file sharing service. Two users, using different keys, will always generate the same output for the same input in this model. How is that? Both users will compute the same cryptographic hash for the same file, of course. Using a static nonce completes the cycle and ensures that we’re basically running the same operation. We can then push both the encrypted file and our encrypted key for that to the file sharing system. The file sharing system can then de-duplicate the encrypted blob directly. Since this is the identical operation, we can safely do that. We do need to keep, for each user, the key to open that file, encrypted with the user’s key. But that is a very small value, so likely not an issue.Now, what about this static nonce? The whole point of a nonce is that it is a value that you use once. How can we use a static value here safely? The problem that nonce is meant to solve is that with most ciphers, if you XOR the output of two cipher texts, you’ll get the difference between them. If they were encrypted using the same key and nonce, you’ll get the result of XOR between their plain texts. That can have catastrophic impact on the security of the system. To get anywhere here, you need to encrypt two different messages with the same key and nonce. In this case, however, that cannot happen. Since we use cryptographic hash of the content as the key, we know that any change in the message will ensure that we have a different key. That means that we never reuse the key, so there is no real point in using a nonce at all. Given that the cryptographic operation requires it, we can just pass a zeroed nonce and not think about it further.This is a very attractive proposition, since it can amounts to massive space savings. File sharing isn’t the only scenario where this is attractive. Consider the case of a messaging application, where you want to forward messages from one user to another. Memes and other viral details are a common scenario here. You can avoid having to re-upload the file many times, because even in an end to end encryption model, we can still avoid sharing the file contents with the storage host.However, this lead to one of the serious issues that we have to cover for convergent encryption. For the same input, you’ll get the same output. That means that if an adversary know the source file, it can tell if a user has that file. In the context of a messaging application, it can spell trouble. Consider the following image, which is banned by the Big Meat industry:Even with end to end encryption, if you use convergent encryption for media files, you can tell that a particular piece of content is accessed by a user. If this is a unique file, the server can’t really tell what is inside it. However, if this is a file that has a known source, we can detect that the user is looking at a picture of salads and immediately snitch to Big Meat. This is called configuration of file attack, and it is the most obvious problem for this scenario. The other issue is that you using convergent encryption, you may allow an adversary to guess about values in the face on known structure. Let’s consider the following scenario, I have a system service where users upload their data using convergent encryption, given that many users may share the same file, we allow any user to download a file using:GET /file?id=71e12496e9efe0c7e31332533cb53abf1a3677f2a802ef0c555b08ba6b8a0f9fNow, let’s assume that I know what typical files are stored in the service. For example, something like this:Looking at this form, there are quite a few variables that you can plug here, right? However, we can generate options for all of those quite easily, and encryption is cheap. So we can speculate on the possible values. We can then run the convergent encryption on each possibility, then fetch that from the server. If we have a match, we figured out the remaining information.Consider this another aspect of trying to do password cracking, and using a set of algorithms that are quite explicitly meant to be highly efficient, so they lend themselves to offline work. That is enough on the topic, I believe. I don’t have any plans of doing something with that, but it was interesting to figure out. I actually started this looking at this feature in WhatsApp:It seams that this is a client side enforced policy, rather than something that is handled via the protocol itself. I initially thought that this is done via convergent encryption, but it looks like just a counter in the message, and then you the client side shows the warning as well as applies limits to it.
Searching for information on internal website is often not an easy task. Indeed, you can't use your favorite search engine to find information. So, you need to provide your own solution. Of course, you can't rebuild Google or Bing 😦 But, you can still provide a sufficient search engine. In this se
Recently we had to tackle a seriously strange bug. A customer reported that under a specific set of circumstances, when loading the database with many concurrent requests, they would get an optimistic concurrency violation from RavenDB.That is the sort of errors that we look at and go: “Well, that is what you expect it to do, no?”The customer code looked something like this:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Show hidden characters
def worker(self): # runs on multiple threads
with session as self.store.open_session(use_optimistic_concurrency = True):
lock = session.load(self.lock_id)
if lock is not None:
raise "Lock busy"
session.store(self.lock_id, {})
session.save_changes() # ensures that we get the lock or throws
self.do_work(session)
session.delete(self.lock_id) # release the lock
session.save_changes()
view raw
work.py
hosted with ❤ by GitHub
As you can imagine, this code is expected to deal with concurrency errors, since it is explicitly meant to run from multiple threads (and processes) all at once.However, in the scenario that the user was using, it had two separate types of workers running, one type was accessing the same set of locks, and there you might reasonably expect to get a concurrency error. However, the second type was run in the main thread only, and there should be no contention on that at all. However, the user was getting a concurrency error on that particular lock, which shouldn’t happen. Looking at the logs, it was even more confusing. Leaving aside that we could only reproduce this issue when we had a high contention rate, we saw some really strange details. Somehow, we had a write to the document coming out of nowhere?It took a while to figure out what was going on, but we finally figured out what the root cause was. RavenDB internally does not execute transactions independently. That would be far too costly. Instead, we use a process call transaction merging.Here is what this looks like:The idea is that we need to write to the disk to commit the transaction. That is an expensive operation. By merging multiple concurrent transactions in this manner, we are able to significantly reduce the number of disk writes, increasing the overall speed of the system. That works great, until you have an error. The most common error, of course, is a concurrency error. At this point, the entire merged transaction is rolled back and we will execute each transaction independently. In this manner, we suffer a (small) performance hit when running into such an error, but the correctness of the system is preserved.The problem in this case would only happen when we had enough load to start doing transaction merging. Furthermore, it would only happen if the actual failure would happen on the second transaction in the merged sequence, not the first one.Internally, each transaction does something like this:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Show hidden characters
class TxBatch:
def __init(self, operations):
self.replies = []
self.operations = operations
def execute(self, tx):
for op in self.operations:
result = op.execute(tx)
self.replies.append(result)
view raw
batch.py
hosted with ❤ by GitHub
Remember that aside from merging transactions, a single transaction in RavenDB can also have multiple operations. Do you see the bug here?The problem was that we successfully executed the transaction batch, but the next one in the same transaction failed. RavenDB did the right thing and execute the transaction batch again independently (this is safe to do, since we rolled back the previous transaction and are operating on a clean slate).The problem is that the transaction batch itself, on the other hand, had its own state. In this case, the problem was that when we replied back to the caller, we used the old state, the one that came from the old transaction that was rolled back.The only state we return to the user is the change vector, which is used for concurrency checks. What happened was that we got the right result, for the old change vector. Everything worked, and the actual state of the database was fine. The only way to discover this issue is if you are continue to make modifications to the same document on the same session. That is a rare scenario, since you typically discard the session after you save its changes.In any other scenario, you’ll re-load the document from the server, which will give you the right change vector and make everything work. Like all such bugs, when we look back at it, this is pretty obvious, but to get to this point was quite a long road.