skip to content
Relatively General .NET

Fundamental knowledge

by Oren Eini

posted on: November 29, 2022

I’ve been calling myself a professional software developer for just over 20 years at this point. In the past few years, I have gotten into teaching university courses in the Computer Science curriculum. I have recently had the experience of supporting a non-techie as they went through a(n intense) coding bootcamp (aiming at full stack / front end roles). I’m also building a distributed database engine and all the associated software. I list all of those details because I want to make an observation about the distinction between fundamental and transient knowledge. My first thought is that there is so much to learn. Comparing the structure of C# today to what it was when I learned it (pre-beta days, IIRC), it is a very different language. I had literally decades to adjust to some of those changes, but someone that is just getting started needs to grasp everything all at once. When I learned JavaScript you still had browsers in the market that didn’t recognize it, so you had to do the “//<!—” trick to get things to work (don’t ask!). This goes far beyond mere syntax and familiarity with language constructs. The overall environment is also critically important. One of the basic tasks that I give in class is something similar to: “Write a network service that would serve as a remote dictionary for key/value operations”.  Most students have a hard time grasping details such as IP vs. host, TCP ports, how to read from the network, error handling, etc. Adding a relatively simple requirement (make it secure from eavesdroppers) will take it entirely out of their capabilities. Even taking a “simple” problem, such as building a CRUD website is fraught with many important details that aren’t really visible. Responsive design, mobile friendly, state management and user experience, to name a few. Add requirements such as accessibility and you are setting the bar too high to reach. I intentionally choose the examples of accessibility and security, because those are “invisible” requirements. It is easy to miss them if you don’t know that they should be there. My first website was a PHP page that I pushed to the server using FTP and updated live in “production”. I was exposed to all the details about DNS and IPs, understood exactly that the server side was just a machine in a closet, and had very low levels of abstractions. (Naturally, the solution had no security or any other –ities). However, that knowledge from those early experiments has served me very well for decades. Same for details such as how TCP works or the basics of operating system design. Good familiarity with the basic data structures (heap, stack, tree, list, set, map, queue) paid itself many times over. The amount of time that I spent learning WinForms… still usable and widely applicable even in other platforms and environments. WPF or jQuery? Not so much. Learning patterns paid many dividends and was applicable on a wide range of applications and topics. I looked into the topics that are being taught (both for bootcamps and universities) and I understand why in many cases, those are being skipped. You can actually be a front end developer without understanding much (if at all) about networks. And the breadth of details you need to know is immense. My own tendency is to look at the low level stuff, and given that I work on a database engine, that is obviously quite useful. What I have found, however, is that whenever I dug deep into a topic, I found ways to utilize that knowledge at a later point in time. Sometimes I was able to solve a problem in a way that would be utterly inconceivable to me previously. I’m not just talking about being able to immediately apply new knowledge to a problem. If that were the case, I would attribute that to wanting to use the new thing I just learned. However, I’m talking about scenarios where months or years later I ran into a problem, and was then able to find the right solution given what was then totally useless knowledge. In short, I understand that chasing the 0.23-alpha-stage-2.3.1-dev updates on the left-pad package is important, but I found that spending time deep in the stack has a great cumulative effect. Joel Spolsky wrote about leaky abstractions, that was 20 years ago. I remember reading that blog post and grokking that. And it is true, being able to dig one or two layers down from where you usually live has a huge amount of leverage on your capabilities.

Managing delays in distributed systems with RavenDB

by Oren Eini

posted on: November 28, 2022

A user came to us with an interesting scenario. They have a RavenDB cluster, which is running in a distributed manner. At some point, we have a user that creates a document inside of RavenDB as well as posts a message using SQS (Amazon queuing system) to be picked up by a separate process. The flow of the system is shown below: The problem they run into is that there is an inherent race condition in the way they work. The backend worker that picks up the messages may use a different node to read the data than the one that it was written to. RavenDB uses asynchronous replication model, which means that if the queue and the backend workers are fast enough, they may try to load the relevant document from the server before it was replicated to it. Amusingly enough, that typically happens on light load (not a mistake, mind). On high load, the message processing time usually is sufficient to delay things for replication to happen. In light load, the message is picked up immediately, exposing the race condition. The question is, how do you deal with this scenario? If this was just a missing document, that was one thing, but we also need to handle another scenario. While the message is waiting to be processed in the queue, it may be updated by the user. So the question now is, how do we handle distributed concurrency in a good manner using RavenDB. The answer to this question is the usage of change vectors.  A change vector is a string that represents the version of a document in a distributed environment. The change vector is used by RavenDB to manage optimistic concurrency. This is typically used to detect changes in a document behind our backs, but we can use that in this scenario as well. The idea is that when we put the message on the queue, we’ll include the change vector that we got from persisting the document. That way, when the backend worker picks up the message and starts using it, the worker can compare the change vectors. If the document doesn’t exist, we can assume that there is a replication delay and try later. If the document exists and the change vector is different, we know the document was modified, and we may need to use different logic to handle the message in question.

Starting a process as normal user from a process running as Administrator

by Gérald Barré

posted on: November 28, 2022

If you start a process from a process running as Administrator, the child process also runs as Administrator. Running a process as Administrator could be a security risk. So, a good practice is to use the minimum privileges required to run a process.If you want to start a process as a normal user f

RavenDB Index Cleanup feature

by Oren Eini

posted on: November 25, 2022

A database indexing strategy is a core part of achieving good performance. About 99.9% of all developers have a story where adding an index to a particular query cut the runtime from seconds or minutes to milliseconds. That percentage is 100% for DBAs, but the query was cut from hours or days to milliseconds. The appropriate indexing strategy is often a fairly complex balancing act between multiple competing needs. More indexing means more I/O and cost on writes, but faster reads. RavenDB has a query optimizer engine that will analyze your queries and generate the appropriate set of indexes on the fly, without you needing to think much about it. That means that RavenDB will continuously respond to your operational environment and changes in it. The end result is an optimal indexing strategy at all times. This automatic behavior applies only to automatic indexes, however. RavenDB also allows you to define your own indexes and many customers run critical business logic in those indexes. RavenDB now has a feature that aims to help you manage/organize your indexes by detecting redundant definitions & unqueried indexes which can be removed or merged. The index cleanup feature is now exposed in the Studio (since build 5.4.5): When you select it, the Studio will show you the following options: You can see that RavenDB detected that two indexes can be merged into a single one, and additionally there are some indexes that haven’t been used in a while or have been completely superseded by other indexes. RavenDB will even go ahead and suggest the merged index for you: The idea is to leverage RavenDB’s smarts so you won’t have to spend too much time thinking about index optimization and can focus on the real value-added portions of your system.

Killing all child processes when the parent exits (Job Object)

by Gérald Barré

posted on: November 21, 2022

A Job Object allows groups of processes to be managed as a unit. This is useful for managing the lifetime of a group of processes, for example, when you want to terminate a group of processes when one of them terminates. It is also useful for managing the resources consumed by a group of processes,

RavenDB PHP Client has been released

by Oren Eini

posted on: November 18, 2022

I recently talked about the beta release of the RavenDB PHP client a few months ago. I’m now really pleased to announce that we have just released the official RavenDB PHP client. Here is what this looks like:   This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters Show hidden characters use RavenDB\Documents\DocumentStore; use YourClass\Company; $store = new DocumentStore(["http://localhost:8080" ], "Northwind") $store->initialize(); { // save a new document $session = $store->openSession(); $entity = new Company("My Company Name"); $session->store($entity); $session->saveChanges(); } { // query on the data $session = $store->openSession(); $companies = $session->query(Company::class) ->whereGreaterThan("numberOfEmployees", 12) ->toList(); } view raw sample.php hosted with ❤ by GitHub

Recording

by Oren Eini

posted on: November 17, 2022

I had a lot of fun in this webinar, showing off some of RavenDB’s capabilities in really complex distributed systems:

Scaling Redis

by Ardalis

posted on: November 14, 2022

Redis is a popular open source cache server. When you have a web application that reaches the point of needing more than one front end…Keep Reading →