skip to content
Relatively General .NET

Badly implementing encryption

by Oren Eini

posted on: February 17, 2022

In the previous post, I moved to using HMAC for the key stream generation. Together with a random nonce, we ensure that each time that we encrypt a value, we’ll get a different encrypted value. However, I want to stop for a while and talk about what will happen if we reuse a nonce. The short answer, a single nonce reuse results in catastrophic results for our security scheme. Let’s take a look at how that works, shall we? We are going to use: Key: jMnNRO9K7DEGmrhPS6awT3w4AAjCMgaNNqPSiwTL//s= Nonce: 3ilsaRYYOls4SA6XHd70jA== Here is the encryption results: attack at dawn! 414CE53F71D47A36FF099792858F58 defend at dusk! 445DF73B7CDB7A36FF099786818A58 Do you notice anything? At this point, we can see that we have some similarities between the texts, which is interesting. If we XOR those two texts, what will we get? Well, if you think about it, we have: “attack at dawn!” XOR keystream = 414CE53F71D47A36FF099792858F58”defend at dusk!” XOR keystream = 445DF73B7CDB7A36FF099786818A58 The XOR operation is cumulative, so if we XOR those two values, we have: 445DF73B7CDB7A36FF099786818A58 XOR 445DF73B7CDB7A36FF099786818A58 =     ( “attack at dawn!” XOR keystream ) XOR (”defend at dusk!” XOR keystream) =     “attack at dawn!” XOR ”defend at dusk!” =    051112040D0F000000000014040500 Sorry for the math here, but the basic idea should be clear. Note that in this case, we don’t know either of those messages, but what we have been able to do is to get the XOR of the plain texts. At this point, an actual cryptographer will go look at frequency tables for symbols in English and start making guesses about those values. I’m certain that there are better techniques for that, but given the computing power that I have, I decided to just break it using the following code: This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters Show hidden characters var words = std.StringHashMap(void).init(allocator); defer { var it = words.keyIterator(); while (it.next()) |w| { allocator.free(w.*); } words.deinit(); } var f = try std.fs.openFileAbsolute("/usr/share/dict/words", .{}); defer f.close(); var reader = f.reader(); while (try reader.readUntilDelimiterOrEofAlloc(allocator, '\n', 256)) |word| { try words.put(word, {}); } var wit = words.keyIterator(); while (wit.next()) |w| { if (w.*.len > plainTextXor.len or w.*.len < 4) continue; var attempt = try allocator.dupe(u8, plainTextXor[0..w.*.len]); defer allocator.free(attempt); for (w.*) |c, i| { attempt[i] ^= c; } if (words.get(attempt)) |_| { std.log.info("word: {s}, match: {s}", .{ w.*, attempt }); } } view raw poorManCryptoBreaker.zig hosted with ❤ by GitHub When running this code on the XOR of the encrypted texts (which is the equivalent of the XOR of the plain texts), we get the following outputs: info: xor: 051112040D0F000000000014040500 info: word: defends, match: attacks info: word: attacker's, match: defender's info: word: attacker, match: defender info: word: defenders, match: attackers info: word: adapt, match: dusty info: word: attack, match: defend info: word: defending, match: attacking info: word: attacks, match: defends info: word: dusty, match: adapt info: word: defender's, match: attacker's info: word: defender, match: attacker info: word: defend, match: attack info: word: attackers, match: defenders info: word: attacking, match: defending info: word: attacked, match: defended info: word: defended, match: attacked As you can see, we have a bunch of options for the plain text. Removing redundancies, we have the following pairs: attack defend adapt dusty That was easy, I have to say. Now I can move on to the next word, eventually cracking the whole message. For the values that are the same, I’ll get zero bytes, of course. At that point, I can simply try guessing based on context, but the process shouldn’t take long at all. The question is how to solve this? Currently, the commonly accepted practice is to shout at you very loudly in code reviews and to put big notices in the documentation: Thou shall not reuse a nonce. There is something called SIV mode, which aims to help in this, but I want to keep that for a future post.

Collections and Primitive Obsession

by Vladimir Khorikov

posted on: February 16, 2022

Does the primitive obsession anti-pattern apply to collections? In other words, should you introduce a custom class for a collection?

Badly implementing encryption

by Oren Eini

posted on: February 16, 2022

I’m trying to not get too deep into the theory of encryption. I’m happy to say that so far I was able to avoid any math whatsoever and hopefully this is an interesting series. I do have to touch on an important topic. I’m using MD5 here for the purpose of generating a random bitstream to be used as a stream cypher. In the previous post, we looked into a key issue. If we don’t do things properly, we can easily get to the point where a single 16 bytes block that we guess can allow us to decrypt the entire message. I “solved” that in the previous post by adding the key back again into the MD5 computation. That works, but it isn’t ideal. There are all sorts of subtle issues that you have to take into account (length extensions, for example) and probably other stuff that I’m not even aware of. There is the HMAC family of functions. That is a keyed hash function that has far stronger security properties. Wikipedia does a great job explaining it. Note that there is a cost, HMAC is more expensive than the underlying hash function it uses. The best practical explanation, by the way, I found here. Our previous method of adding a key to the mix was to concatenate it in front of the message. The problem is that if md5(msgA) == md5(msgB), then md5(key || msgA) == md5(key || msgB). And that isn’t something that we want. I (personally) can’t think of a way to abuse that property to get something nasty going on with the way we use it in this encryption algorithm, but I’m very much not an expert. The HMAC model, on the other hand, would use: md5( key1 || md5(key2 || msgA)). And there is no way to get that to collide, even if the messages generate collisions for MD5. Let’s see what we need to do to switch to HMAC-MD5 instead of MD5. Here is what the code now looks like: This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters Show hidden characters const HmacMd5 = std.crypto.auth.hmac.HmacMd5; const MyCryptoAlgo = struct { state: [HmacMd5.mac_length]u8 = undefined, nonce: [HmacMd5.mac_length]u8 = undefined, key: [HmacMd5.key_length]u8 = undefined, pos: u8 = 0, count: [1]u64 = undefined, pub fn initWithNonce(key: [HmacMd5.key_length]u8, nonce: [HmacMd5.mac_length]u8) MyCryptoAlgo { var self = MyCryptoAlgo{}; self.count[0] = 0; std.mem.copy(u8, &self.nonce, &nonce); std.mem.copy(u8, &self.key, &key); var md5 = HmacMd5.init(&key); md5.update(&self.nonce); md5.update(std.mem.sliceAsBytes(&self.count)); md5.final(&self.state); return self; } pub fn init(key: [HmacMd5.key_length]u8) MyCryptoAlgo { var nonce: [HmacMd5.mac_length]u8 = undefined; std.crypto.random.bytes(&nonce); return initWithNonce(key, nonce); } pub fn run(self: *MyCryptoAlgo, data: []u8) void { for (data) |c, i| { if (self.pos == self.state.len) { var md5 = HmacMd5.init(&self.key); md5.update(&self.nonce); self.count[0] += 1; md5.update(std.mem.sliceAsBytes(&self.count)); md5.final(&self.state); self.pos = 0; } data[i] = self.state[self.pos] ^ c; self.pos += 1; } } }; view raw badEncryptionWithHMac.zig hosted with ❤ by GitHub There are a few things that are worth noting here. We changed the size of the key (since HMAC-MD5 uses 32 bytes, not 16 bytes). Again, I have no comment on the actual security of such a scheme, mostly because I wouldn’t know where to even begin doing this analysis. We are also no longer using the previous block as the input to the next block. Instead, we use a counter mode, where we hash the nonce and an ever incrementing counter using the provided key. That gives us some additional safety from the previously seen issue where we could figure out what the rest of the key stream would be. Of course, better than before doesn’t mean that this is actually good. There are several other problems that I (as a non cryptographer) still need to fix, and probably more that I’m not seeing. Another aspect of using this sort of construction is the additional cost that is involved. The HMAC computation isn’t that much more expensive. Looking at some benchmarks, this is about 3% slower, which is quite reasonable. Next, we are going to see how we can abuse the malleability of this encryption system for fun and nefarious people’s profit.

GitHub Diagrams with Mermaid

by Ardalis

posted on: February 16, 2022

GitHub recently announced support for diagrams embedded directly in markdown files. The new feature leverages the Mermaid diagramming and…Keep Reading →

Badly implementing encryption

by Oren Eini

posted on: February 15, 2022

In the previous post, I showed how the lack of nonce means that encrypted similar values will expose their content. Now I want to discuss a different matter, let’s assume that I have some control or knowledge about the plain text of an encrypted message. That is easy enough to obtain, I can simply ask you to encrypt a value for me. A great scenario for that may be when you are sending data based on something that I do. Let’s assume that I get you to include the following plain text in your message: “red tanks are over the big hills”. I am then able to intercept your message, which looks like this: FE485CEECED5BA4CCA281D1F586E67233D924652E5BD690357F6E29C1C36DC446001DDDF16536DB427337089D27A9C6FCCED553FA4982E58F8B7B5FDD02A11C0A1C08E93FA2C29582A15DC34CFCFB61AB2975CC0F4D29F9C6715D0F9E2CE661C816E047590389A9064BA5F3E3D8461D59B7C3407A76F248A71 This was encrypted with the nonce: DE296C6916183A5B38480E971DDEF48C (remember, the nonce itself is public and has no intrinsic meaning), but I don’t actually need the nonce in this case! Now, here is what I can do. I know that the message is bigger than 16 bytes, so I can XOR parts of the encrypted message with the known plain text. If I do this properly, I then get the key stream. Since the algorithm in question is using the key stream to compute the data directly, I can now just decrypt everything. To give some context, here is the full code that I need to decrypt this message: This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters Show hidden characters var cryptedString = "/khc7s7VukzKKB0fWG5nIz2SRlLlvWkDV/binBw23ERgAd3fFlNttCczcInSepxvzO1VP6SYLlj4t7X90CoRwKHAjpP6LClYKhXcNM/Pthqyl1zA9NKfnGcV0PnizmYcgW4EdZA4mpBkul8+PYRh1Zt8NAenbySKcQ=="; var cryptedBuf: []u8 = try allocator.alloc(u8, try std.base64.standard.Decoder.calcSizeForSlice(cryptedString)); defer allocator.free(cryptedBuf); try std.base64.standard.Decoder.decode(cryptedBuf, cryptedString); const knownText = "red tanks are over the big hills"; var cIx: usize = 16; while (cIx + 16 < cryptedBuf.len) : (cIx += 16) { var pIx: usize = 0; while (pIx + 16 < knownText.len) : (pIx += 1) { var cur = try allocator.dupe(u8, cryptedBuf[cIx..]); defer allocator.free(cur); for (knownText[pIx..(pIx + 16)]) |s, i| { cur[i] ^= s; } var mca = MyCryptoAlgo{ .state = undefined, .nonce = undefined, .pos = 16 }; std.mem.copy(u8, &mca.state, cur[0..16]); mca.run(cur[16..]); if (std.mem.indexOf(u8, cur[16..], knownText[pIx + 16 .. pIx + 18]) != null) { std.log.info("decrypted: {s}", .{cur[16..]}); break; } } } view raw breakingEncryption.zig hosted with ❤ by GitHub I’m scanning through the encrypted text, at 16 bytes intervals (since that is the block size of our encryption routine) and try to XOR that value with the relevant matches from the known text. That gives me the key stream, which I then use to decrypt the encrypted text from that point (and compute the rest of the key stream for future values). This code will output the following decrypted text: info: decrypted: the big hills  options: cower in fear, storm the castle, play again?  action plan: zulu-9 And the full message that I encrytped was: enemy said: red tanks are over the big hills options: cower in fear, storm the castle, play again? action plan: zulu-9 The problem was that by XORing the known plain text with the encrypted text, we exposed the key stream, which we also use to compute the next part of the keystream. At this point, I’m entirely exposed. In order to fix that, we need to add something secret back to the mix. The secret key is the obvious answer, and here is the code fix for this issue: This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters Show hidden characters const Md5 = std.crypto.hash.Md5; const MyCryptoAlgo = struct { state: [Md5.digest_length]u8 = undefined, nonce: [Md5.digest_length]u8 = undefined, key: [Md5.digest_length]u8 = undefined, pos: u8 = 0, pub fn initWithNonce(key: [Md5.digest_length]u8, nonce: [Md5.digest_length]u8) MyCryptoAlgo { var self = MyCryptoAlgo{}; std.mem.copy(u8, &self.nonce, &nonce); std.mem.copy(u8, &self.key, &key); var md5 = Md5.init(.{}); md5.update(&key); md5.update(&self.nonce); md5.final(&self.state); return self; } pub fn init(key: [Md5.digest_length]u8) MyCryptoAlgo { var nonce: [Md5.digest_length]u8 = undefined; std.crypto.random.bytes(&nonce); return initWithNonce(key, nonce); } pub fn run(self: *MyCryptoAlgo, data: []u8) void { for (data) |c, i| { if (self.pos == Md5.digest_length) { var md5 = Md5.init(.{}); md5.update(&self.key); md5.update(&self.state); md5.final(&self.state); self.pos = 0; } data[i] = self.state[self.pos] ^ c; self.pos += 1; } } }; view raw badEncryptedWithBetterKeyStream.zig hosted with ❤ by GitHub That would fix this problem. Even if we tried this again, we’ll get a part of the key stream, but we won’t be able to compute the next block of the encrypted values, since we need the key for that. And yes, I know about HMAC, I’m planning to discuss that in the next post.

Badly implementing encryption

by Oren Eini

posted on: February 14, 2022

In my previous post, I implemented an “encryption system” using a stream cipher based on top of Md5. The idea is to start with a given key (128 bits in size) and feed it to Md5 recursively, leading to a random looking set of bits that we can generate for both encryption and decryption. I mentioned, but it is worth repeating, this is not a secure scheme, you should not be using this for anything except (maybe) to learn. As a reminder, here is what the result of this algorithm looks like when we run it: Key Plain text Encrypted text +oupDG0cMVQr7hWqctWZEA== attack at dawn! 4771EC89753EEB16A899E2F79DE9D6 +oupDG0cMVQr7hWqctWZEA== attack at dusk! 4771EC89753EEB16A899E2E399ECD6 You might notice something pretty awful about the outputs that we see here. Take a look at the encrypted text, do you see the problem? Here is what happens when I diff the output of those two strings: 4771EC89753EEB16A899E2F79DE9D6 4771EC89753EEB16A899E2E399ECD6 Do you see the problem? In my encryption algorithm, there is a huge hole. If I have the ability to ask you to encrypt a text that I control, I can compare that to other encrypted texts that I have. And in many cases, it is fairly easy to get the other side to encrypt a value that I control. For that matter, using this approach, I can also simply iterate over the values one byte at a time, until it matches what I expect. In this manner, I can figure out the plain text of the messages very quickly. This isn’t the sole attack that we have on the system, however. Even if I don’t have the ability to choose the plain text, I have other options available to me. I can inspect traffic and see that those two messages are very similar. If I correlate them to the times of attack, I can watch out for the next time that I get this message and know what it means, even if I wasn’t able to decrypt it. To resolve this issue, we need to ensure that two messages, encrypted with the same key, will never look the same, even if they have repeated sequences. How do we do that? We introduce a random variable (which is public), that will ensure that we’ll generate a new message each time. Here is the code: This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters Show hidden characters const MyCryptoAlgo = struct { state: [Md5.digest_length]u8, nonce: [Md5.digest_length]u8, pos: u8, pub fn initWithNonce(key: [Md5.digest_length]u8, nonce: [Md5.digest_length]u8) MyCryptoAlgo { var self = MyCryptoAlgo{ .state = undefined, .pos = 0, .nonce = undefined }; std.mem.copy(u8, &self.nonce, &nonce); var md5 = Md5.init(.{}); md5.update(&key); md5.update(&self.nonce); md5.final(&self.state); return self; } pub fn init(key: [Md5.digest_length]u8) MyCryptoAlgo { var nonce: [Md5.digest_length]u8 = undefined; std.crypto.random.bytes(&nonce); return initWithNonce(key, nonce); } pub fn run(self: *MyCryptoAlgo, data: []u8) void { for (data) |c, i| { if (self.pos == Md5.digest_length) { std.crypto.hash.Md5.hash(&self.state, &self.state, .{}); self.pos = 0; } data[i] = self.state[self.pos] ^ c; self.pos += 1; } } }; view raw badlyEncryptWithNonce.zig hosted with ❤ by GitHub In order to encrypt, I’m generating a nonce (number only used once) randomly. That number is 128 bits in length and will make sure that even if we encrypt the same value twice, we’ll always get a different encrypted string. Here are the results: Key Plain text Nonce Encrypted text +oupDG0cMVQr7hWqctWZEA== attack at dusk! 91AA2DB93DE35785D7FC1D5394F524C4 FC8638CA6BDEEDEE7F79D1BF3F72D6 +oupDG0cMVQr7hWqctWZEA== attack at dusk! 16C915439BA0EC862307D091293B0D7E 2080498822D299FDEE6B2C31C1AE87 It is quite apparent that even though we encrypted the same string, the output is entirely different from one another. One thing to point out, I’m using 128 bits nonce here, but I didn’t bother to actually check what level of security adding a nonce of this size provides. It permutes the output of the encrypted text, but to what level, you’ll need an actual cryptographer to tell you. I’m simply using 128 bits as a nice value. The text attack at dawn! on the other hand, is encrypted to BF5946EAC0C7BA5EDF611D440F7223 using the nonce: DE296C6916183A5B38480E971DDEF48C. This is radically different from the previous example, and does not provide us with a place to start analyzing the encrypted text for similarities between messages. I’m happy, since we dealt with one (of the many) weaknesses in the encryption algorithm. In my next post, I’m going to look at a few more.

Debugging a WebView2 component using Playwright in .NET

by Gérald Barré

posted on: February 14, 2022

To write end-to-end tests of an application hosting a WebView2 component, you may want to use Playwright. Playwright allows to control the browser and to interact with the DOM. This is a well-known tool to test web applications.To attach Playwright to a WebView2 component, you need to use the Chrom

Badly implementing encryption

by Oren Eini

posted on: February 11, 2022

One of the important adages in computer programming is: “Thou shall not write your own encryption”. That is because building your own encryption system is fraught will all sorts of really subtle details that are easy to get wrong, which will entirely eliminate any security that you have in place. For that matter, professional cryptographers are getting those details wrong often enough that WEP has been completely broken. Do not make use of any code that I’m showing in this series of posts to do anything except expand your horizons. This is incredibly insecure. If you want to use encryption, go and use something like Sodium, which is a well reviewed and thought-out encryption library. With that said, let’s see what kind of encryption I can implement. My math skills are lousy, so there is going to be absolutely no math whatsoever in this series. That sounds like it makes it tough to implement encryption, but it isn’t so bad. The absolute best encryption method, in terms of secrecy, is  the One Time Pad. There is a proof that it is unbreakable, which I linked but won’t discuss (nor do I claim to understand). The idea is simple, given a random set of bits, we can XOR that with our message, and be sure that there is no way to derive what the original message was. All possible values are equally likely. While One Time Pad is ideal, its use in practice has severe limitations: You can use the One Time Pad once. Using it twice invalidates the entire scheme. Here is some unfortunate history on Russian spies that reused the pads. You need to have a source of true randomness. Your One Time Pad must be at least as big as the message. The key problem with One Time Pad (pun intended) is how you distributed the pad material. Your security is only as valid as the pad transfer mechanism. And if you want to exchange complex messages, you need big pads. The key to modern encryption is that while we can’t mathematically prove that our method is as secured as One Time Pad, we can become confident enough in our methods to make practical use of that. A One Time Pad is a shared source of truly random data. If we had a way to create a random stream of bits for a smaller shared state, that might do it, right? It wouldn’t be true randomness, of course, but if there is no way to predict what the output should be without knowing a shared secret, that should be enough, we hope. How can we generate some random data from a shared secret? Well, I’m going to use the MD5 primitive as the key function here. That is a cryptographic hash function that you can feed it some buffer and it will compute a cryptographic hash of the buffer. The output of the hash function is pseudo random, so that might be a good building block to create a stream of random bits. I’m using MD5 intentionally here. I assume that by now anyone that pays any attention to cryptography will know that MD5 is considered broken. I don’t want anyone to use this method. In general, the concept that I’m going for is something like this: This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters Show hidden characters const Md5 = std.crypto.hash.Md5; pub fn generateStream(key: [Md5.digest_length]u8, size: usize, output: std.fs.File.Writer) !void { var buffer: [Md5.digest_length]u8 = undefined; Md5.hash(&key, &buffer, .{}); while (size > 0) : (size -= Md5.digest_length) { if (size < Md5.digest_length) { try output.write(buffer[0..size]); break; } try output.write(buffer); Md5.hash(&buffer, &buffer, .{}); } } view raw keyStreamGen.zig hosted with ❤ by GitHub In other words, I’m generating a stream of bits by continuously feeding the output of Md5 into itself, starting from the provided key. Again, this is not secured, I’ll touch on exactly why in a future post. A small optimization that I can make is that I don’t actually need to store all the full key bits-stream ahead of time, I can generate that on the fly, as we get data to encrypt. Here is what the encryption algorithm looks like, we generate the key stream and compute the XOR over the provided data: This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters Show hidden characters const Md5 = std.crypto.hash.Md5; const MyCryptoAlgo = struct { state: [Md5.digest_length]u8, pos: u8, pub fn init(key: [Md5.digest_length]u8) MyCryptoAlgo { var self = MyCryptoAlgo{ .state = undefined, .pos = 0 }; Md5.hash(&key, &self.state, .{}); return self; } pub fn run(self: *MyCryptoAlgo, data: []u8) void { for (data) |c, i| { if (self.pos == Md5.digest_length) { std.crypto.hash.Md5.hash(&self.state, &self.state, .{}); self.pos = 0; } data[i] = self.state[self.pos] ^ c; self.pos += 1; } } }; view raw badlyEncrpt.zig hosted with ❤ by GitHub Note that we don’t have a distinction here between encryption and decryption. We generate the key stream and XOR it with the caller’s data. If the data is plain text, it is encrypted, if it is encrypted text, it is decrypted. Here is how this will work: Key (base64) +oupDG0cMVQr7hWqctWZEA== Text attack at dawn! Encrypted (hex) 4771EC89753EEB16A899E2F79DE9D6 And the code to make it happen: This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters Show hidden characters var mca = MyCryptoAlgo.init(key); mca.run(data); view raw usage.zig hosted with ❤ by GitHub We mutate the data in place, I should note. Given that we are not changing the size of the data, that is the simplest way to write the API. This “encryption” method works, and we go from a reasonable looking plain text to something that certainly looks like it is encrypted. This is also a fairly horrible encryption scheme, of course, but I’ll touch on that in my next post.