Relatively General .NET

Choose Excitement over Fear

by Ardalis

posted on: January 26, 2021

There's virtually no difference, physiologically, between the sensations and symptoms of fear and excitement. But there is a big difference…

Building a social media platform without going bankrupt

by Oren Eini

posted on: January 25, 2021

Following the discussion a few days ago, I thought I would share my high-level architecture for building a social media platform in a way that makes sense. In other words, building software that is performant, efficient, and doesn't waste multiples of your yearly budget on unnecessary hardware. It turns out that 12 years ago, I wrote a post that discusses how I would re-architect Twitter. At the time, the Fail Whale made repeated appearances several times a week and Twitter couldn't really handle its load. I think a lot of what I wrote then is still applicable and would probably allow you to scale your system without breaking the bank. That said, I would like to think I have learned a lot since then, so it is worth revisiting the topic.

Let's outline the scenario. In terms of features, we are talking about cloning the key parts of Twitter. Core features include:

- Tweets
- Replies
- Mentions
- Tags

Such an application does quite a lot at the frontend, which I'm not going to touch. I'm focusing solely on the backend processing here. There are also a lot of other things we'll likely need to deal with (metrics, analytics, etc.), which are separate and not that interesting; they can be handled via existing analytics platforms and don't require specialized behavior. One of the best parts of a social media platform is that by its very nature, it is eventually consistent. It doesn't matter if I post a tweet and you see it now or in 5 seconds. That gives us a huge amount of flexibility in how we can implement this system efficiently. Let's talk about numbers I can easily find:

- Twitter has 340 million users, with 42% of them active.
- Twitter sees about 6,000 tweets per second, or roughly 15.7 billion tweets a month.
- 80% of the content is generated by 10% of the users.
- Out of 1.3 billion Twitter accounts, only 550 million ever sent a tweet.
- A very small percentage of accounts have large follower counts.
The latest numbers I can get are from early in the previous decade; back then, 0.06% of Twitter accounts had more than 20,000 followers, and 2.12% had more than 1,000 followers. Common Twitter usage ranges from 2 tweets a month (common) to 138 a month (prolific). There are problems with those stats, however. A lot of them are old, some of them very old, nearly a decade! Given that I'm writing this blog for myself (and you, my dear reader), I'm going to make some assumptions so we can move forward:

- 50 million users, but we'll assume they are more engaged than the usual group.
- Of those, 50% (25,000,000) would post in a given month.
- 80% of those users post fewer than 5 posts a month. That means 20 million users that post very rarely.
- 20% of those users (5 million or so) post more frequently, with a maximum of around 300 posts a month.
- 1% of the active users (50,000) post even more frequently, to the tune of a couple of hundred posts a day.

Checking my math, that means:

- 50,000 highly active users with 150 posts a day, for a total of 225 million posts a month.
- 5 million active users with 300 posts a month, for another 1.5 billion posts.
- 20 million other users with 5 posts a month, giving us another 100 million posts.

Total monthly posts would then be about 1.825 billion, which is roughly 2.5 million posts an hour, or around 700 posts a second. That assumes a constant load on the system, which is probably not correct. For example, the 2016 Super Bowl saw a record of 152,000 tweets per minute, with close to 17 million tweets posted during the game. What this means is that the load is highly variable; we may see anywhere from low hundreds of posts per second to thousands. Note that 152,000 posts per minute is "just" 2,533 posts per second, which is a lot less scary, even if it means the same thing. We'll start by stating that we want to process 2,500 posts per second as the current maximum acceptable target.
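A quick sanity check of those assumptions (pure arithmetic on the estimates above, nothing measured):

```python
# Back-of-the-envelope check of the monthly post volume estimates.
# All inputs are the assumptions from the post, not real data.

HOURS_PER_MONTH = 30 * 24

highly_active = 50_000 * 150 * 30   # 50k users, 150 posts/day, 30 days
active        = 5_000_000 * 300     # 5M users, 300 posts/month
occasional    = 20_000_000 * 5      # 20M users, 5 posts/month

total_per_month = highly_active + active + occasional
per_hour   = total_per_month / HOURS_PER_MONTH
per_second = per_hour / 3600

print(f"{total_per_month:,} posts/month")  # 1,825,000,000
print(f"{per_hour:,.0f} posts/hour")       # ~2,534,722
print(f"{per_second:,.0f} posts/second")   # ~704
```

Even the Super Bowl spike of 2,533 posts per second is only about 3.6x that steady-state figure, which is why 2,500/sec is a sensible peak target.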
One very important factor that we have to understand is what exactly we mean by "processing" a post. Here it means recording the act of the post, and doing so within an acceptable time frame; we'll call that a 200 ms latency at the 99.99th percentile. I'm going to focus on text-only mode, because for binaries (pictures and movies) the solution is to throw the data on a CDN and link to it; nothing more really needs to be done. Most CDNs will already handle things like re-encoding, formatting, etc., so that isn't something you need to worry about to start with. Now that I have laid the groundwork, we can start thinking about how we can actually process this. That is going to be handled in a few separate pieces: first, how we accept and process the post, and then how we distribute it to all the followers. I'll start dissecting those issues in my next post.
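The split between accepting a post and distributing it can be sketched in a few lines. This is an illustrative sketch, not the author's actual design; the in-memory dict and deque stand in for a durable store and a real message queue:

```python
# Sketch: keep only the durable write on the latency-critical path,
# and defer follower fan-out to a background queue.
import time
import uuid
from collections import deque

post_store = {}         # stand-in for a durable log / database
fanout_queue = deque()  # stand-in for a message queue consumed by workers

def accept_post(author_id: int, text: str) -> str:
    """Record the post and enqueue it for distribution; return the post id.

    Only the write to post_store must finish within the 200 ms budget;
    delivering the post to followers happens later, asynchronously.
    """
    post_id = uuid.uuid4().hex
    post_store[post_id] = {
        "author": author_id,
        "text": text,
        "ts": time.time(),
    }
    fanout_queue.append(post_id)  # background workers pick this up
    return post_id

pid = accept_post(42, "hello world")
print(pid in post_store, len(fanout_queue))  # True 1
```

The eventual consistency noted above is exactly what makes this deferral acceptable: followers seeing the post 5 seconds later is fine.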

Simplifying paths handling in .NET code with the FullPath type

by Gérald Barré

posted on: January 25, 2021

In my projects, I often deal with paths. However, there are many traps when working with paths: a path can be absolute or relative (c:\a\b, .\a\b); a path can contain navigation segments (., ..); a path can end with a directory separator or not (c:\a or c:\a\); a path can contain empty segments (c:\a\\b); a…
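These traps aren't unique to .NET. The article's FullPath type is C#, but as a language-neutral illustration, Python's posixpath module exhibits the same pitfalls (POSIX-style separators used so the demo is platform-independent):

```python
import posixpath  # POSIX path rules regardless of host OS

# Navigation segments ('.', '..') mean two spellings can name one location:
assert posixpath.normpath("/a/./b/../c") == "/a/c"

# A trailing separator changes the string but not the location:
assert posixpath.normpath("/a/b/") == posixpath.normpath("/a/b")

# Empty segments collapse away:
assert posixpath.normpath("/a//b") == "/a/b"

# Absolute vs relative is a property worth checking explicitly:
assert posixpath.isabs("/a/b") and not posixpath.isabs("a/b")
```

Normalizing to one canonical absolute form at a type boundary, as the article's FullPath type does, sidesteps all four traps at once.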

Hiring again (Poland): PHP Dev & QA Engineer

by Oren Eini

posted on: January 22, 2021

We have two positions open in Poland now: Backend PHP Developer and QA Engineer. You can see the full job details on the provided links. These are both for working with RavenDB, so it is going to be very interesting work.

Looking at Parler specs and their architecture

by Oren Eini

posted on: January 21, 2021

I ran into the following tweet, which lists some of Parler's requirements (using the upper limits specified):

- Scylla cluster – 40 nodes with 64 cores, 512 GB RAM, and 14 TB NVMe drives per node. That is a total of 2,560 cores, 20 TB RAM, and 560 TB of disks.
- PostgreSQL cluster – 100 nodes with 96 cores, 768 GB RAM, and 4 TB NVMe per node. That is a total of 9,600 cores, 75 TB RAM, and 400 TB of disks.
- 400 application instances – 16 cores & 64 GB RAM each.

Their internal traffic is about 6.6 GB/sec and their external traffic is about 2 GB/sec. There is a lot of interesting discussion in the Twitter thread on these numbers, but I thought it would be interesting to see how much it would cost to build that. The 64 cores & 512 GB RAM can be handled via a Gigabyte R282-Z90; the given specs say a single one would cost 27,000 USD. That means the Scylla cluster alone would be about a million dollars, and I haven't even touched on the drives. I couldn't find a 14 TB NVMe drive in a cursory search, but a 15.36 TB drive (Micron 9300 Pro 15.36TB NVMe) costs 2,500 USD per unit. That puts the cost of the hardware alone for the Scylla cluster at about 1.18 million USD. I would expect about twice that much for the PostgreSQL cluster, for what it's worth. The application servers cost a lot less, about 4,000 USD per instance, which comes to another 1.6 million USD. Total cost is roughly 5 million USD, and we aren't talking about the other stuff (power, network, racks, etc.). I'm not a hardware guy, mind! I'm probably missing a lot of stuff. At that size, you can likely get volume discounts, but I'm guessing that the stuff I'm missing would cost quite a lot as well. Call it a minimum of 7.5 million USD to set up a data center with those numbers. That does not include labor and licensing costs, I want to add. Also, note that that kind of capacity is likely something you can't get on a quick-turnaround basis from anyone but the big cloud providers.
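A rough reconstruction of that cost estimate, using only the list prices quoted in the post:

```python
# Hardware cost estimate for the quoted Parler specs (list prices from the post).
scylla_servers = 40 * 27_000       # Gigabyte R282-Z90, one per Scylla node
scylla_drives  = 40 * 2_500        # Micron 9300 Pro 15.36TB, one per node
scylla_total   = scylla_servers + scylla_drives

postgres_total = 2 * scylla_total  # "about twice that much", per the post
app_total      = 400 * 4_000       # 400 instances at ~4,000 USD each

total = scylla_total + postgres_total + app_total
print(f"Scylla:   ${scylla_total:,}")    # $1,180,000
print(f"Postgres: ${postgres_total:,}")  # $2,360,000
print(f"Apps:     ${app_total:,}")       # $1,600,000
print(f"Total:    ${total:,}")           # $5,140,000 -- "roughly 5 million USD"
```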
I'd estimate that it would take multiple months just to order the parts, to be honest. In other words, they would be looking at a major financial commitment and some significant lead time. Then again… Given their location in Henderson, Nevada, the average developer salary is 77,000 USD per year. That means that the personnel cost, which is typically significantly higher than any other expense, is actually not that big. As of Nov 2020, they had about 30 people working for Parler. Assuming all of them are developers paid 100,000 USD a year (significantly higher than the average salary in their location), the employment costs of the entire company would likely be under half the cost of the hardware required. All of that said… what we can really see here is a display of incompetence. At the time it was closed, Parler had roughly 15 to 20 million users. A lot of them were recent registrations, of course, but Parler had already experienced several spikes in user registrations in the past. In June of 2020 it saw 500,000 users registering for its service within 3 days, for example. Let's take 20 million as the number of users, and assume that all of them are in the States and have the same hours of activity. We'll further assume that we have high participation numbers and all of those users are actively viewing. Remember the 1% rule: only a small minority of users actually generate content on most platforms. The vast majority are silent observers. That would give us roughly 200,000 users that generate content, but even then, not all content is made equal. We have posts and comments, basically, and treating them differently is a basic part of building an efficient system. On Twitter, Katy Perry has just under 110 million followers. Let's assume that the Parler ecosystem was highly interconnected and most of the high-profile accounts were followed by the majority of the users.
That means the top 20,000 users would be followed by all the other 20 million. The rest of the 180,000 users that actively post will likely do so in reaction, not independently, and have comparatively smaller audiences. Now we need to estimate how much these people will post. I looked at Dave Weigel's account (591.7K followers); he covers politics for the Washington Post. I'm writing this on Jan 20, the day of the Biden inauguration, and I'm assuming this is a busy time for political correspondents. Looking at his Twitter feed, he posted 3,220 tweets this month, and Jan 6, which had a lot to report on, had 377 total tweets. Let's take 500 as a reasonable upper bound for the number of daily posts from most of the top users in the system, shall we? That means we have 20,000 high-profile users, each posting a maximum of 500 times a day. Let's assume this all happens in 8 hours, instead of over the entire day. That translates to roughly 1,250,000 posts an hour, or about 348 posts per second. Go and look at the specs above. Using these metrics, you could dedicate a machine to each one of those posts. Given the number of cores requested for application instances (400 x 16 = 6,400 cores), this is beyond ridiculous. Just to give you some context, when we run benchmarks of RavenDB, we run it on a Raspberry Pi 3. That is a $25 machine, with a shady power supply and heating issues. We were able to reach over 1,000 writes per second on a sustained basis. Now, that is for simple writes, sure, but again, that is a Raspberry Pi doing three times as much as we would need to handle Parler's expected load (which I think I was overestimating). This post is getting a bit long, but I want to point out another social network: Stack Exchange (Stack Overflow), with 1.3 billion page views per month (assuming perfect distribution, roughly 485 page views per second, each generating multiple requests).
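The peak-write estimate above works out as follows (again, these are the post's assumed inputs, not measurements):

```python
# Estimated peak write load for the hypothetical Parler-like workload.
top_users     = 20_000  # high-profile accounts generating most content
posts_per_day = 500     # assumed upper bound per top user
active_hours  = 8       # assume all posting happens in an 8-hour window

posts_per_hour   = top_users * posts_per_day / active_hours
posts_per_second = posts_per_hour / 3600

print(f"{posts_per_hour:,.0f} posts/hour")  # 1,250,000
print(f"{posts_per_second:.1f} posts/sec")  # ~347.2 (the post rounds to 348)
```

Against a requested 6,400 application cores, that is roughly one core per 0.05 posts per second, which is what makes the quoted specs look so far out of proportion.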
Their web servers handle 450 req/sec at peak across 9 web servers (a max of 4,050 req/sec), with peak CPU usage of 12%. Two SQL Server clusters with 4 machines in total handle an aggregate of 23,800 queries/sec with peak CPU usage of 15%. Render time across the board is under 20 ms. The hardware used for those servers:

- 9 web servers – 48 cores + 64 GB RAM each
- 4 DB servers – 32 cores + 768 GB RAM each

There are a few other types of servers there, and I recommend looking into the links, because there are a lot of interesting details there. The key here is that they are running a top-200 site on significantly less hardware, and are able to serve requests and provide great quality of service. To be fair, Stack Overflow is a read-heavy site, with under half a million questions and answers in a month. In other words, less than 0.04% of the views generate a write. That said, I'm not certain the numbers would be meaningfully different on other social media platforms. In my next post, I'm going to explore how you can build a social media platform without going bankrupt.
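The read-heaviness claim is easy to verify from the two numbers given:

```python
# Stack Overflow read/write ratio, from the figures quoted in the post.
page_views_month = 1_300_000_000
writes_month     = 500_000  # questions + answers, "under half a million"

views_per_sec = page_views_month / (31 * 24 * 3600)
write_ratio   = writes_month / page_views_month

print(f"{views_per_sec:.0f} views/sec")  # ~485
print(f"{write_ratio:.2%} of views generate a write")  # 0.04%
```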

Seeking feedback on the RavenDB Cluster Dashboard

by Oren Eini

posted on: January 20, 2021

I am really proud of the level of transparency and visibility that RavenDB gives its users out of the box. The server dashboard gives you all the critical information about what a node is doing and can serve as a great mechanism to tell at a glance what the health of a node is.

A repeated customer request is to take that up a notch: not just showing a single server's status, but showing the entire cluster state. This isn't a replacement for a full monitoring solution, but it is meant to provide you with a quick overview of exactly what is going on in your cluster.

I want to present some early prototypes of how we are thinking about showing this information, and I wanted to get your feedback on them, as well as any other information that you think would be relevant for the cluster dashboard. Here is the resource utilization portion of the dashboard. We aren't settled yet on the graphs, but we'll likely have CPU usage and memory (including memory breakdowns). Some of this information may be hidden by default, and you can expand it. You can also get more details about what the cluster is doing, see the overall view of task assignment in the cluster, and drill down to a particular server's status.

This is early stages yet; we pretty much have just the mockups, so this is the right time to ask for what you want to see there.

Canceling background tasks when a user navigates away from a Blazor component

by Gérald Barré

posted on: January 18, 2021

In an application, you don't want to waste resources. When an ongoing operation is no longer needed, you should cancel it. For instance, if your application downloads data to show in a view, and the user navigates away from that view, you should cancel the download. In a .NET application, the
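The article itself is about Blazor and C#, but the underlying idea (propagate a cancellation signal to work the user no longer needs) is language-neutral. Here is an analogous sketch using Python's asyncio, with a fake download standing in for the real network call:

```python
# Sketch: cancel an in-flight "download" once its result is no longer needed,
# analogous to cancelling a CancellationTokenSource when a view is disposed.
import asyncio

async def download_data():
    """Stand-in for a long-running download a view kicked off."""
    try:
        await asyncio.sleep(60)  # pretend this is the network transfer
        return "data"
    except asyncio.CancelledError:
        # Clean-up (close connections, release buffers) would go here.
        raise

async def main():
    task = asyncio.create_task(download_data())
    await asyncio.sleep(0)  # let the download get underway
    # The user "navigated away": the result is no longer needed.
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        print("download cancelled")  # resources released promptly

asyncio.run(main())
```

In Blazor the equivalent trigger is component disposal, where the component cancels the token that its pending operations observe.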