In the previous post I showed how we can mess around with the encrypted text, resulting in a valid (but not the original) plain text. We can use that for many nefarious reasons, as you can imagine. Luckily, there is a straightforward solution for this issue. We can implement something called MAC (message authentication code) to ensure that the encrypted data wasn’t tampered with. That is pretty easy to do, actually, since we have HMAC already, which is meant for exactly this purpose.
The issue here is an interesting one, what shall we sign? Here are the options:
Sign the plain text of the message, using a hashed key function (HMAC-MD5, in our case). Because we are using the secret key to compute the hash, just looking at the hashed value will not tell us anything about the plain text (for example, if we were using just MD5, we could use rainbow tables to figure out what the plain text was from the hash). Since there is no security issue with making the signature public, we can just append that to the output of the encryption as plain text. At least, I don’t think it does. I’ll remind you again that I’m not a cryptographer by trade.
Sign the plain text of the message (using a hashed key function or a regular cryptographic hash function) and append the hash (encrypted) to the output message.
Sign the encrypted value of the message, and append that hash to the output message.
A visualization might make it easier to understand. If you want to read more, there is a great presentation here.
The first two options are bad. Using those methods will leak your data in various ways. There is something that is apparently called the Cryptographic Doom principle, which is very relevant here. The idea is simple, we don’t trust the encrypted value, it may have been modified by an adversary. The first two options that we have require us to first take an action (decrypting the data) before we authenticate the message. We can then use various tricks to rip apart the whole scheme. That means that the very first thing we do is verify that the encrypted text we were handled was indeed signed by a trusted party (that has the secret key).
If you’ll look closely at the image above, you can see that I’m using two keys here, instead of one: key1 and key2. What is that about?
In cryptography, there is a strong reluctance to reuse the same key in different contexts. The issue is that if we use a single key in multiple scenarios (such as encryption and authentication), a weakness in one of them can be exploited in the other. Remember, cryptography is just math, and the fear is that given two values that were computed with the same key, but using different algorithms, you can do something with that. That has led to practical attacks in the past, so the general practice is to avoid reusing keys. The good thing is that given a single cryptographic key, it is easy to generate a new key using a key derivation function.
I’m still going to limit myself to HMAC-Md5 only (remember, none of this code is meant to actually be used), so I can derive a new key from an existing one using the following mechanism:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Show hidden characters
fn deriveKey(key: *const [32]u8, nonce: []const u8, domain: []const u8, newKey: *[32]u8) void {
var buf: [4]u8 = undefined;
std.mem.writeIntNative(u32, &buf, 1);
var mac: [16]u8 = undefined;
var hmacMd5 = HmacMd5.init(key);
hmacMd5.update(domain);
hmacMd5.update(nonce);
hmacMd5.update(&buf);
hmacMd5.final(&mac);
std.mem.copy(u8, newKey, &mac);
std.mem.writeIntNative(u32, &buf, 2);
hmacMd5 = HmacMd5.init(key);
hmacMd5.update(nonce);
hmacMd5.update(domain);
hmacMd5.update(&buf);
hmacMd5.final(&mac);
std.mem.copy(u8, newKey[16..], &mac);
}
view raw
deriveKey.zig
hosted with ❤ by GitHub
The idea is that we use the HMAC and the static domain string we get to generate the new key. In this case, we actually use it twice, with the nonce being used to inject even more entropy into the mix. Since HMAC-Md5 outputs 16 bytes and I need a 32 bytes key, I’m doing that twice, with a different counter value each time. I also rearrange the order of the (nonce, domain) and (domain, nonce) fields on each hashing attempt to make it more interesting.
A reminder: I didn’t spend any time trying to figure out what kind of security this sort of system brings. It looks very much like what Sodium does for key derivation, but I wouldn’t trust it with anything.
With that in place, here is the new code for encryption:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Show hidden characters
const HmacMd5 = std.crypto.auth.hmac.HmacMd5;
const MyCryptoAlgo = struct {
state: [HmacMd5.mac_length]u8 = undefined,
nonce: [HmacMd5.mac_length]u8 = undefined,
keyEnc: [HmacMd5.key_length]u8 = undefined,
keyMac: [HmacMd5.key_length]u8 = undefined,
macGen: HmacMd5 = undefined,
pos: u8 = 0,
count: [1]u64 = undefined,
pub const NonceLen = HmacMd5.mac_length;
pub const MacLen = HmacMd5.mac_length;
pub const ExtraLen = MacLen + NonceLen;
pub fn initWithNonce(key: [HmacMd5.key_length]u8, nonce: [HmacMd5.mac_length]u8) MyCryptoAlgo {
var self = MyCryptoAlgo{};
self.count[0] = 0;
std.mem.copy(u8, &self.nonce, &nonce);
deriveKey(&key, "encryption", &nonce, &self.keyEnc);
deriveKey(&key, "authentication", &nonce, &self.keyMac);
var md5 = HmacMd5.init(&self.keyEnc);
md5.update(&self.nonce);
md5.update(std.mem.sliceAsBytes(&self.count));
md5.final(&self.state);
self.macGen = HmacMd5.init(&self.keyMac);
self.macGen.update(&self.nonce);
return self;
}
pub fn init(key: [HmacMd5.key_length]u8) MyCryptoAlgo {
var nonce: [HmacMd5.mac_length]u8 = undefined;
std.crypto.random.bytes(&nonce);
return initWithNonce(key, nonce);
}
pub fn encrypt(key: [32]u8, plainText: []const u8, cipherText: []u8) void {
var mca = MyCryptoAlgo.init(key);
mca.encryptBlock(plainText, cipherText);
mca.finalize(cipherText[plainText.len..]);
}
pub fn finalize(self: *MyCryptoAlgo, data: []u8) void {
std.mem.copy(u8, data, &self.nonce);
var mac: [16]u8 = undefined;
self.macGen.final(&mac);
std.mem.copy(u8, data[NonceLen..], &mac);
}
pub fn encryptBlock(self: *MyCryptoAlgo, input: []const u8, output: []u8) void {
self.xorWithKeyStream(input, output);
self.macGen.update(output[0..input.len]); // process the encrypted stream
}
fn xorWithKeyStream(self: *MyCryptoAlgo, input: []const u8, output: []u8) void {
for (input) |c, i| {
if (self.pos == self.state.len) {
self.genKeyStreamBlock();
}
output[i] = self.state[self.pos] ^ c;
self.pos += 1;
}
}
fn genKeyStreamBlock(self: *MyCryptoAlgo) void {
var md5 = HmacMd5.init(&self.keyEnc);
md5.update(&self.nonce);
self.count[0] += 1;
md5.update(std.mem.sliceAsBytes(&self.count));
md5.final(&self.state);
self.pos = 0;
}
};
view raw
badEncryption.zig
hosted with ❤ by GitHub
We have a lot going on here. In the initWithNonce() function, we generate the derived keys for two domains. Then we generate the block of key stream, as we previously did. The last stage in the initWithNonce() function is initializing the MAC computation. Note that in addition to using a derived key for the MAC, I’m also adding the nonce as the first thing that we’ll hash. That should have no effect on security, but it ties the output hash even closer to this specific encryption.
In the xorWithKeyStream() function, you’ll note that I’m now passing both an input and an output buffer, aside from that, this is exactly the same as before (with the actual key stream generation moved to genKeyStreamBlock()). Things get interesting in the encrypytBlock() function. There we XOR the value that we encrypt with the key stream and push that to the output. We also add the encrypted value to the MAC that we generate.
The idea with encryptBlock() is to allow you to build an encrypted message in a piecemeal fashion. Once you are done with the data you want to encrypt, you need to call to finalize(). That would copy the nonce and complete the computation of the MAC of the encrypted portion.
The encrypt() function is provided in order to make it easier to work with the data when you want to encrypt a single buffer. (And yes, I’m not doing any explicit bounds check here, relying on Zig’s to panic if we need to. I mentioned that this isn’t production level code already?)
For encryption, we can pass either a single buffer to encrypt or we can pass it in pieces. For decryption, on the other hand, the situation isn’t as simple. To decrypt the data properly, we first need to verify that it wasn’t modified. That means that to decrypt the data, we need all of it. The API reflects this behavior:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Show hidden characters
pub fn decrypt(key: [32]u8, cipherText: []u8, plainText: []u8) !void {
if (cipherText.len < ExtraLen)
return error.CihpherTextBufferTooSmall;
if (cipherText.len - ExtraLen > plainText.len)
return error.PlainTextBufferTooSmall;
var nonce: [HmacMd5.mac_length]u8 = undefined;
std.mem.copy(u8, &nonce, cipherText[cipherText.len - ExtraLen .. cipherText.len - NonceLen]);
var mac = cipherText[cipherText.len - NonceLen .. cipherText.len];
var dec = MyCryptoAlgo.initWithNonce(key, nonce);
var msgText = cipherText[0 .. cipherText.len - ExtraLen];
var myMac: [16]u8 = undefined;
dec.macGen.update(msgText);
dec.macGen.final(&myMac);
if (std.mem.eql(u8, mac, &myMac) == false) {
return error.InvalidMac;
}
dec.xorWithKeyStream(msgText, plainText);
}
view raw
decrypt.zig
hosted with ❤ by GitHub
The decrypt() function does do some checks. We are dealing here with input that is expected to be malicious. As such, the first thing that we do is to extract the MAC and the nonce from the cipher text buffer. I decided it would be simpler to require that as a single buffer (although, as you can imagine, it would be very simple to change the API to have that as independent values).
Once we have the nonce, we can initialize the struct with the key and nonce (which will also derive the keys and setup the macGen properly). The next step is to compute the hash over the encrypted text and verify that it matches our expectation.
Yes, I’m using eql() here for the comparison. This is a short circuiting operation, and I’m doing that intentionally so I can talk about this in a future post.
If the MAC that I compute is a match to the MAC that was provided, we know that the message hasn’t been tampered with. At that point we can simply XOR the encrypted text with the key stream to get the original value back.
A single bit out of line with this model will ensure that the decryption fails. What is more, note that we don’t do anything with the decryption until we validate the provided MAC and cipher text. To do anything else would invite cryptographic doom, so it is nice that we were able to avoid it.
In the next post, I’m going to cover timings attacks.
When you have multiple Visual Studio instances, the Visual Studio Installer allows you to update them one by one. This is not very convenient as the installer takes time to download and install newer versions. Daniel Cazzulino made a .NET tool to manage Visual Studio instances. This tool allows you
In the previous post, we managed to get to a fairly complete state, the full code is less than 50 lines of code, but has enough functionality to be able to make use of it.
Don’t actually do that. This code is horribly broken, and the adage of “don’t implement your own encryption” holds very strongly here.
Let’s consider a fairly typical encryption setup. You log in with me, I generate an encrypted cookie and hand it back to you. In all future interactions, you give me back the cookie. I can decrypt that and make decisions based on that.
For example, let’s assume that we want to send the user the following:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Show hidden characters
// plain text: {"role":"users","name":"oren"}
// nonce: 8E92B0E4AE4BE6BEFEF2638D02416E61
// encrypted: 6763169603E0BFA8A6BC6B2C768EABAA930E15CB7D11901C9E932ED0DBD8
// structure...
pub const UserCookie = struct {
role: []const u8,
name: []const u8,
pub fn deinit(self: *UserCookie, allocator: std.mem.Allocator) void {
allocator.free(self.role);
allocator.free(self.name);
}
};
fn genCookie(allocator: std.mem.Allocator, key: [32]u8, uc: *UserCookie) ![]u8 {
var json = try std.json.stringifyAlloc(allocator, uc, .{});
defer allocator.free(json);
var encrypted = try allocator.dupe(u8, json);
defer allocator.free(encrypted);
var mca = MyCryptoAlgo.init(key);
mca.run(encrypted);
return try std.fmt.allocPrint(allocator, "{},{}", .{
std.fmt.fmtSliceHexUpper(&mca.nonce),
std.fmt.fmtSliceHexUpper(encrypted),
});
}
view raw
data.zig
hosted with ❤ by GitHub
We compute the user name and role, pass it to the genCookie() function and have an encrypted string to work with. In this case, here is the cookie in question:
8E92B0E4AE4BE6BEFEF2638D02416E61,6763169603E0BFA8A6BC6B2C768EABAA930E15CB7D11901C9E932ED0DBD8
To go the other way, we decrypt the cookie using the nonce and the key, then parse the JSON into the cookie struct, like so:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Show hidden characters
fn parseEncryptedCookie(allocator: std.mem.Allocator, cookie: []const u8, key: [32]u8) !UserCookie {
var comma = std.mem.indexOf(u8, cookie, ",") orelse return error.InvalidCookieFormat;
var nonce: [16]u8 = undefined;
try parseHex(cookie[0..comma], &nonce);
var rest = cookie[comma + 1 ..];
if (rest.len > 1042)
return error.CookieSizeIsTooBig;
var buf: [1024]u8 = undefined;
var raw = buf[0 .. rest.len / 2];
try parseHex(rest, raw);
var mca = MyCryptoAlgo.initWithNonce(key, nonce);
mca.run(raw);
return try std.json.parse(UserCookie, &std.json.TokenStream.init(raw), .{
.allocator = allocator,
});
}
fn parseHex(hexStr: []const u8, bytes: []u8) !void {
if (hexStr.len != bytes.len * 2)
return error.MismatchedBufferSizes;
var idx: usize = 0;
while (idx < hexStr.len) : (idx += 2) {
bytes[idx / 2] = try std.fmt.parseInt(u8, hexStr[idx .. idx + 2], 16);
}
}
view raw
parse.zig
hosted with ❤ by GitHub
With this in place, we can start making use of this foundation. Here is some user level code:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Show hidden characters
var user = try parseEncryptedCookie(allocator, cookieStr, key);
defer user.deinit(allocator);
if(!std.mem.eql(u8, "admin", user.role)){
return error.PermissionDenied;
}
// privileged action
view raw
usage.zig
hosted with ❤ by GitHub
At this point, this is awesome. We know that the cookie we got was encrypted with my key (which the user doesn’t have, obviously). So I handed an encrypted blob to the user, got it back and now I can make decisions based on this.
Or can I? A proper crypto system is defined as one where everything is known, aside from the secret key, and it maintains all its properties. The plain text of the cookie is:
{"role":"users","name":"oren"}
But I don’t have that. However… can I play with this? Remember the last post, we saw that XOR on the encrypted text can give us a lot of insight. What about in this case? Here is what I know:
The role value starts at position 9 and lasts 5 characters.
The value it currently has is “users”, we would like it to be “admin”.
Since we got the encrypted text from the server, we can return something else at a later point. Can we take advantage of that? Well, XOR is cumulative, right? So we can do this:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Show hidden characters
fn hackTheCoookie(allocator: std.mem.Allocator, cookie: []const u8) ![]u8 {
var comma = std.mem.indexOf(u8, cookie, ",") orelse return error.InvalidCookieFormat;
var buf: [1024]u8 = undefined;
var rest = cookie[comma + 1 ..];
var raw = buf[0 .. rest.len / 2];
try parseHex(rest, raw);
var oldRole = "users";
var newRole = "admin";
for (oldRole) |c, i| {
raw[9 + i] ^= c ^ newRole[i];
}
return std.fmt.allocPrint(allocator, "{s},{s}", .{
cookie[0..comma],
std.fmt.fmtSliceHexUpper(raw),
});
}
view raw
hackTheCoookie.zig
hosted with ❤ by GitHub
Let’s see what we’ll get when we run this on this cookie:
8E92B0E4AE4BE6BEFEF2638D02416E61,6763169603E0BFA8A6BC6B2C768EABAA930E15CB7D11901C9E932ED0DBD8
And the output would be:
8E92B0E4AE4BE6BEFEF2638D02416E61,6763169603E0BFA8A6A87C246D93ABAA930E15CB7D11901C9E932ED0DBD8
Now, if I send this to the server, it will properly decrypt this into:
{"role":"admin","name":"oren"}
And we are off to the races with full administrator privileges.
This is not a theoretical issue, it has been exploited in the past, to a devastating effect.
The question is, how do we prevent that? The key issue is that encryption (for stream ciphers) is basically just XORing with a secret key stream. Even for block ciphers, the encrypted data is malleable. We can’t assume that it will not decrypt properly, or that the decryption of the modified encrypted text wouldn’t result in valid plain text.
Luckily, cryptography also has an answer, we can create a signature of the data (with the secret key) and then use that to verify that the data hasn’t been tampered with. In fact, we are already using HMAC, which is meant for this exact purpose. This is a pretty big topic, so I’ll discuss that in the next post.
In the previous post, I moved to using HMAC for the key stream generation. Together with a random nonce, we ensure that each time that we encrypt a value, we’ll get a different encrypted value. However, I want to stop for a while and talk about what will happen if we reuse a nonce.
The short answer, a single nonce reuse results in catastrophic results for our security scheme. Let’s take a look at how that works, shall we?
We are going to use:
Key: jMnNRO9K7DEGmrhPS6awT3w4AAjCMgaNNqPSiwTL//s=
Nonce: 3ilsaRYYOls4SA6XHd70jA==
Here is the encryption results:
attack at dawn!
414CE53F71D47A36FF099792858F58
defend at dusk!
445DF73B7CDB7A36FF099786818A58
Do you notice anything? At this point, we can see that we have some similarities between the texts, which is interesting. If we XOR those two texts, what will we get?
Well, if you think about it, we have:
“attack at dawn!” XOR keystream = 414CE53F71D47A36FF099792858F58”defend at dusk!” XOR keystream = 445DF73B7CDB7A36FF099786818A58
The XOR operation is cumulative, so if we XOR those two values, we have:
445DF73B7CDB7A36FF099786818A58 XOR 445DF73B7CDB7A36FF099786818A58 = ( “attack at dawn!” XOR keystream ) XOR (”defend at dusk!” XOR keystream) = “attack at dawn!” XOR ”defend at dusk!” = 051112040D0F000000000014040500
Sorry for the math here, but the basic idea should be clear. Note that in this case, we don’t know either of those messages, but what we have been able to do is to get the XOR of the plain texts. At this point, an actual cryptographer will go look at frequency tables for symbols in English and start making guesses about those values. I’m certain that there are better techniques for that, but given the computing power that I have, I decided to just break it using the following code:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Show hidden characters
var words = std.StringHashMap(void).init(allocator);
defer {
var it = words.keyIterator();
while (it.next()) |w| {
allocator.free(w.*);
}
words.deinit();
}
var f = try std.fs.openFileAbsolute("/usr/share/dict/words", .{});
defer f.close();
var reader = f.reader();
while (try reader.readUntilDelimiterOrEofAlloc(allocator, '\n', 256)) |word| {
try words.put(word, {});
}
var wit = words.keyIterator();
while (wit.next()) |w| {
if (w.*.len > plainTextXor.len or w.*.len < 4)
continue;
var attempt = try allocator.dupe(u8, plainTextXor[0..w.*.len]);
defer allocator.free(attempt);
for (w.*) |c, i| {
attempt[i] ^= c;
}
if (words.get(attempt)) |_| {
std.log.info("word: {s}, match: {s}", .{ w.*, attempt });
}
}
view raw
poorManCryptoBreaker.zig
hosted with ❤ by GitHub
When running this code on the XOR of the encrypted texts (which is the equivalent of the XOR of the plain texts), we get the following outputs:
info: xor: 051112040D0F000000000014040500 info: word: defends, match: attacks info: word: attacker's, match: defender's info: word: attacker, match: defender info: word: defenders, match: attackers info: word: adapt, match: dusty info: word: attack, match: defend info: word: defending, match: attacking info: word: attacks, match: defends info: word: dusty, match: adapt info: word: defender's, match: attacker's info: word: defender, match: attacker info: word: defend, match: attack info: word: attackers, match: defenders info: word: attacking, match: defending info: word: attacked, match: defended info: word: defended, match: attacked
As you can see, we have a bunch of options for the plain text. Removing redundancies, we have the following pairs:
attack
defend
adapt
dusty
That was easy, I have to say. Now I can move on to the next word, eventually cracking the whole message. For the values that are the same, I’ll get zero bytes, of course. At that point, I can simply try guessing based on context, but the process shouldn’t take long at all.
The question is how to solve this? Currently, the commonly accepted practice is to shout at you very loudly in code reviews and to put big notices in the documentation: Thou shall not reuse a nonce.
There is something called SIV mode, which aims to help in this, but I want to keep that for a future post.
I’m trying to not get too deep into the theory of encryption. I’m happy to say that so far I was able to avoid any math whatsoever and hopefully this is an interesting series. I do have to touch on an important topic.
I’m using MD5 here for the purpose of generating a random bitstream to be used as a stream cypher. In the previous post, we looked into a key issue. If we don’t do things properly, we can easily get to the point where a single 16 bytes block that we guess can allow us to decrypt the entire message.
I “solved” that in the previous post by adding the key back again into the MD5 computation. That works, but it isn’t ideal. There are all sorts of subtle issues that you have to take into account (length extensions, for example) and probably other stuff that I’m not even aware of. There is the HMAC family of functions. That is a keyed hash function that has far stronger security properties. Wikipedia does a great job explaining it. Note that there is a cost, HMAC is more expensive than the underlying hash function it uses.
The best practical explanation, by the way, I found here. Our previous method of adding a key to the mix was to concatenate it in front of the message. The problem is that if md5(msgA) == md5(msgB), then md5(key || msgA) == md5(key || msgB). And that isn’t something that we want. I (personally) can’t think of a way to abuse that property to get something nasty going on with the way we use it in this encryption algorithm, but I’m very much not an expert. The HMAC model, on the other hand, would use: md5( key1 || md5(key2 || msgA)). And there is no way to get that to collide, even if the messages generate collisions for MD5.
Let’s see what we need to do to switch to HMAC-MD5 instead of MD5. Here is what the code now looks like:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Show hidden characters
const HmacMd5 = std.crypto.auth.hmac.HmacMd5;
const MyCryptoAlgo = struct {
state: [HmacMd5.mac_length]u8 = undefined,
nonce: [HmacMd5.mac_length]u8 = undefined,
key: [HmacMd5.key_length]u8 = undefined,
pos: u8 = 0,
count: [1]u64 = undefined,
pub fn initWithNonce(key: [HmacMd5.key_length]u8, nonce: [HmacMd5.mac_length]u8) MyCryptoAlgo {
var self = MyCryptoAlgo{};
self.count[0] = 0;
std.mem.copy(u8, &self.nonce, &nonce);
std.mem.copy(u8, &self.key, &key);
var md5 = HmacMd5.init(&key);
md5.update(&self.nonce);
md5.update(std.mem.sliceAsBytes(&self.count));
md5.final(&self.state);
return self;
}
pub fn init(key: [HmacMd5.key_length]u8) MyCryptoAlgo {
var nonce: [HmacMd5.mac_length]u8 = undefined;
std.crypto.random.bytes(&nonce);
return initWithNonce(key, nonce);
}
pub fn run(self: *MyCryptoAlgo, data: []u8) void {
for (data) |c, i| {
if (self.pos == self.state.len) {
var md5 = HmacMd5.init(&self.key);
md5.update(&self.nonce);
self.count[0] += 1;
md5.update(std.mem.sliceAsBytes(&self.count));
md5.final(&self.state);
self.pos = 0;
}
data[i] = self.state[self.pos] ^ c;
self.pos += 1;
}
}
};
view raw
badEncryptionWithHMac.zig
hosted with ❤ by GitHub
There are a few things that are worth noting here. We changed the size of the key (since HMAC-MD5 uses 32 bytes, not 16 bytes). Again, I have no comment on the actual security of such a scheme, mostly because I wouldn’t know where to even begin doing this analysis.
We are also no longer using the previous block as the input to the next block. Instead, we use a counter mode, where we hash the nonce and an ever incrementing counter using the provided key. That gives us some additional safety from the previously seen issue where we could figure out what the rest of the key stream would be.
Of course, better than before doesn’t mean that this is actually good. There are several other problems that I (as a non cryptographer) still need to fix, and probably more that I’m not seeing.
Another aspect of using this sort of construction is the additional cost that is involved. The HMAC computation isn’t that much more expensive. Looking at some benchmarks, this is about 3% slower, which is quite reasonable.
Next, we are going to see how we can abuse the malleability of this encryption system for fun and nefarious people’s profit.
GitHub recently announced support for diagrams embedded directly in markdown files. The new feature leverages the Mermaid diagramming and…Keep Reading →
In the previous post, I showed how the lack of nonce means that encrypted similar values will expose their content. Now I want to discuss a different matter, let’s assume that I have some control or knowledge about the plain text of an encrypted message. That is easy enough to obtain, I can simply ask you to encrypt a value for me. A great scenario for that may be when you are sending data based on something that I do. Let’s assume that I get you to include the following plain text in your message: “red tanks are over the big hills”.
I am then able to intercept your message, which looks like this:
FE485CEECED5BA4CCA281D1F586E67233D924652E5BD690357F6E29C1C36DC446001DDDF16536DB427337089D27A9C6FCCED553FA4982E58F8B7B5FDD02A11C0A1C08E93FA2C29582A15DC34CFCFB61AB2975CC0F4D29F9C6715D0F9E2CE661C816E047590389A9064BA5F3E3D8461D59B7C3407A76F248A71
This was encrypted with the nonce: DE296C6916183A5B38480E971DDEF48C (remember, the nonce itself is public and has no intrinsic meaning), but I don’t actually need the nonce in this case!
Now, here is what I can do. I know that the message is bigger than 16 bytes, so I can XOR parts of the encrypted message with the known plain text. If I do this properly, I then get the key stream. Since the algorithm in question is using the key stream to compute the data directly, I can now just decrypt everything.
To give some context, here is the full code that I need to decrypt this message:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Show hidden characters
var cryptedString = "/khc7s7VukzKKB0fWG5nIz2SRlLlvWkDV/binBw23ERgAd3fFlNttCczcInSepxvzO1VP6SYLlj4t7X90CoRwKHAjpP6LClYKhXcNM/Pthqyl1zA9NKfnGcV0PnizmYcgW4EdZA4mpBkul8+PYRh1Zt8NAenbySKcQ==";
var cryptedBuf: []u8 = try allocator.alloc(u8, try std.base64.standard.Decoder.calcSizeForSlice(cryptedString));
defer allocator.free(cryptedBuf);
try std.base64.standard.Decoder.decode(cryptedBuf, cryptedString);
const knownText = "red tanks are over the big hills";
var cIx: usize = 16;
while (cIx + 16 < cryptedBuf.len) : (cIx += 16) {
var pIx: usize = 0;
while (pIx + 16 < knownText.len) : (pIx += 1) {
var cur = try allocator.dupe(u8, cryptedBuf[cIx..]);
defer allocator.free(cur);
for (knownText[pIx..(pIx + 16)]) |s, i| {
cur[i] ^= s;
}
var mca = MyCryptoAlgo{ .state = undefined, .nonce = undefined, .pos = 16 };
std.mem.copy(u8, &mca.state, cur[0..16]);
mca.run(cur[16..]);
if (std.mem.indexOf(u8, cur[16..], knownText[pIx + 16 .. pIx + 18]) != null) {
std.log.info("decrypted: {s}", .{cur[16..]});
break;
}
}
}
view raw
breakingEncryption.zig
hosted with ❤ by GitHub
I’m scanning through the encrypted text, at 16 bytes intervals (since that is the block size of our encryption routine) and try to XOR that value with the relevant matches from the known text. That gives me the key stream, which I then use to decrypt the encrypted text from that point (and compute the rest of the key stream for future values).
This code will output the following decrypted text:
info: decrypted: the big hills options: cower in fear, storm the castle, play again? action plan: zulu-9
And the full message that I encrytped was:
enemy said: red tanks are over the big hills options: cower in fear, storm the castle, play again? action plan: zulu-9
The problem was that by XORing the known plain text with the encrypted text, we exposed the key stream, which we also use to compute the next part of the keystream. At this point, I’m entirely exposed.
In order to fix that, we need to add something secret back to the mix. The secret key is the obvious answer, and here is the code fix for this issue:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Show hidden characters
const Md5 = std.crypto.hash.Md5;
const MyCryptoAlgo = struct {
state: [Md5.digest_length]u8 = undefined,
nonce: [Md5.digest_length]u8 = undefined,
key: [Md5.digest_length]u8 = undefined,
pos: u8 = 0,
pub fn initWithNonce(key: [Md5.digest_length]u8, nonce: [Md5.digest_length]u8) MyCryptoAlgo {
var self = MyCryptoAlgo{};
std.mem.copy(u8, &self.nonce, &nonce);
std.mem.copy(u8, &self.key, &key);
var md5 = Md5.init(.{});
md5.update(&key);
md5.update(&self.nonce);
md5.final(&self.state);
return self;
}
pub fn init(key: [Md5.digest_length]u8) MyCryptoAlgo {
var nonce: [Md5.digest_length]u8 = undefined;
std.crypto.random.bytes(&nonce);
return initWithNonce(key, nonce);
}
pub fn run(self: *MyCryptoAlgo, data: []u8) void {
for (data) |c, i| {
if (self.pos == Md5.digest_length) {
var md5 = Md5.init(.{});
md5.update(&self.key);
md5.update(&self.state);
md5.final(&self.state);
self.pos = 0;
}
data[i] = self.state[self.pos] ^ c;
self.pos += 1;
}
}
};
view raw
badEncryptedWithBetterKeyStream.zig
hosted with ❤ by GitHub
That would fix this problem. Even if we tried this again, we’ll get a part of the key stream, but we won’t be able to compute the next block of the encrypted values, since we need the key for that.
And yes, I know about HMAC, I’m planning to discuss that in the next post.
In this post I describe how you can wait for your ASP.NET Core app to be ready to receive requests from inside an IHostedService/BackgroundService in .NET 6…
In my previous post, I implemented an “encryption system” using a stream cipher based on top of Md5. The idea is to start with a given key (128 bits in size) and feed it to Md5 recursively, leading to a random looking set of bits that we can generate for both encryption and decryption.
I mentioned, but it is worth repeating, this is not a secure scheme, you should not be using this for anything except (maybe) to learn.
As a reminder, here is what the result of this algorithm looks like when we run it:
Key
Plain text
Encrypted text
+oupDG0cMVQr7hWqctWZEA==
attack at dawn!
4771EC89753EEB16A899E2F79DE9D6
+oupDG0cMVQr7hWqctWZEA==
attack at dusk!
4771EC89753EEB16A899E2E399ECD6
You might notice something pretty awful about the outputs that we see here. Take a look at the encrypted text, do you see the problem?
Here is what happens when I diff the output of those two strings:
4771EC89753EEB16A899E2F79DE9D6
4771EC89753EEB16A899E2E399ECD6
Do you see the problem?
In my encryption algorithm, there is a huge hole. If I have the ability to ask you to encrypt a text that I control, I can compare that to other encrypted texts that I have. And in many cases, it is fairly easy to get the other side to encrypt a value that I control.
For that matter, using this approach, I can also simply iterate over the values one byte at a time, until it matches what I expect. In this manner, I can figure out the plain text of the messages very quickly.
This isn’t the sole attack that we have on the system, however. Even if I don’t have the ability to choose the plain text, I have other options available to me. I can inspect traffic and see that those two messages are very similar. If I correlate them to the times of attack, I can watch out for the next time that I get this message and know what it means, even if I wasn’t able to decrypt it.
To resolve this issue, we need to ensure that two messages, encrypted with the same key, will never look the same, even if they have repeated sequences. How do we do that? We introduce a random variable (which is public), that will ensure that we’ll generate a new message each time.
Here is the code:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Show hidden characters
const MyCryptoAlgo = struct {
state: [Md5.digest_length]u8,
nonce: [Md5.digest_length]u8,
pos: u8,
pub fn initWithNonce(key: [Md5.digest_length]u8, nonce: [Md5.digest_length]u8) MyCryptoAlgo {
var self = MyCryptoAlgo{ .state = undefined, .pos = 0, .nonce = undefined };
std.mem.copy(u8, &self.nonce, &nonce);
var md5 = Md5.init(.{});
md5.update(&key);
md5.update(&self.nonce);
md5.final(&self.state);
return self;
}
pub fn init(key: [Md5.digest_length]u8) MyCryptoAlgo {
var nonce: [Md5.digest_length]u8 = undefined;
std.crypto.random.bytes(&nonce);
return initWithNonce(key, nonce);
}
pub fn run(self: *MyCryptoAlgo, data: []u8) void {
for (data) |c, i| {
if (self.pos == Md5.digest_length) {
std.crypto.hash.Md5.hash(&self.state, &self.state, .{});
self.pos = 0;
}
data[i] = self.state[self.pos] ^ c;
self.pos += 1;
}
}
};
view raw
badlyEncryptWithNonce.zig
hosted with ❤ by GitHub
In order to encrypt, I’m generating a nonce (number only used once) randomly. That number is 128 bits in length and will make sure that even if we encrypt the same value twice, we’ll always get a different encrypted string. Here are the results:
Key
Plain text
Nonce
Encrypted text
+oupDG0cMVQr7hWqctWZEA==
attack at dusk!
91AA2DB93DE35785D7FC1D5394F524C4
FC8638CA6BDEEDEE7F79D1BF3F72D6
+oupDG0cMVQr7hWqctWZEA==
attack at dusk!
16C915439BA0EC862307D091293B0D7E
2080498822D299FDEE6B2C31C1AE87
It is quite apparent that even though we encrypted the same string, the output is entirely different from one another.
One thing to point out, I’m using 128 bits nonce here, but I didn’t bother to actually check what level of security adding a nonce of this size provides. It permutes the output of the encrypted text, but to what level, you’ll need an actual cryptographer to tell you. I’m simply using 128 bits as a nice value.
The text attack at dawn! on the other hand, is encrypted to BF5946EAC0C7BA5EDF611D440F7223 using the nonce: DE296C6916183A5B38480E971DDEF48C. This is radically different from the previous example, and does not provide us with a place to start analyzing the encrypted text for similarities between messages.
I’m happy, since we dealt with one (of the many) weaknesses in the encryption algorithm. In my next post, I’m going to look at a few more.
We use cookies to analyze our website traffic and provide a better browsing experience. By
continuing to use our site, you agree to our use of cookies.