RavenDB is an OLTP database; it is meant to be the backend of business applications. There are some features in RavenDB that are meant for reporting purposes, but that is quite explicitly not our main focus. That is part of why RavenDB has such good integration with the rest of the environment: to give you the ability to use the best tool for the job.
With RavenDB 5.3, we now allow you to integrate directly with Power BI, so you can pull data from RavenDB into Power BI, write reports, and in general utilize the full power of Power BI with your RavenDB data.
The image on the right, for example, is a report generated inside of Power BI on top of the sample data from RavenDB.
As you can imagine, I’m particularly stoked about this feature. Not only does it make reporting integration with RavenDB a lot simpler, but the way we do it is also quite interesting. Instead of diving into the technical details, it would probably be more fun to show you how it works, from the perspective of Power BI.
The first thing we need to do is connect Power BI to RavenDB. Using your existing Power BI system, you can simply click on Get Data and select:
You’ll then need to provide the connection details:
You’ll then be presented with the following dialog:
As you can see, we are translating the JSON documents inside of RavenDB to a columnar format for ease of processing inside of Power BI.
You can even take this further and issue RQL queries directly from inside of Power BI and transform the data. That means that you can utilize indexes, map/reduce operations, etc. Take a look:
And the result inside of Power BI:
As you can imagine, this is going to be a powerful tool in how you work with your RavenDB data. You can also integrate this with Power BI on Azure, of course.
The way this works behind the scenes is that we can now understand the PostgreSQL wire protocol. That means that RavenDB can now be accessed from anywhere that can connect to Postgres. While the PostgreSQL protocol implementation is still marked as experimental, we have put the Power BI integration through its paces and we consider it stable enough for regular use.
Happy reporting.
This feature is part of the RavenDB 5.3 release (expected in mid November) and is available in the Enterprise edition of RavenDB.
File-scoped namespaces are a new feature of C# 10. The idea is to remove one level of indentation from source files that contain only a single namespace. The goal is to reduce horizontal and vertical scrolling and make the code more readable. Here is the traditional, block-scoped form:

namespace MyProject
{
    class Demo
    {
    }
}
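For comparison, the file-scoped form ends the namespace declaration with a semicolon, applies it to the entire file, and drops one level of indentation:

namespace MyProject;

class Demo
{
}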
Next week is Black Friday, which has become a global phenomenon. It is a fun day for shoppers, and a nerve-wracking one for IT admins everywhere. It is not uncommon to see traffic double or triple, and the actual load (processing more heavyweight requests) can go up an order of magnitude. Preparing for Black Friday can be a harrowing experience, since you have a narrow window of opportunity and it is hard to know exactly where the stress points are.
This year, I decided to make your life easier, and RavenDB is offering a Black Friday Surge to all our customers. No, we aren’t offering you 50% off and everything must go. What we do instead is try to be of help.
This Black Friday (and Cyber Monday as well), we are offering all our customers double what they paid for. When running RavenDB on premises, if you purchased a RavenDB license for a 12-core cluster (running on 3 nodes of 4 cores each), we’ll offer you 30 days of double the core count. In other words, you can scale your system to be twice as powerful, and it won’t cost you a cent.
On the cloud, we will likewise provide users with credits to upgrade their clusters to the next level up (doubling their power) for a full week during the next 30 days. Again, there is no extra cost here.
You can register for the Surge here to request the upgrade and you’ll get twice as much power to handle the increased load.
Enjoy the power up!
A really nice feature that we have in RavenDB 5.3 is support for wire protocol compatibility with PostgreSQL. That opens up RavenDB to the entire PostgreSQL ecosystem. You are now able to connect to a RavenDB instance using tools such as psql, Npgsql, etc. This feature is both surprisingly simple and incredibly complex at the same time.
The actual wire protocol from Postgres is well documented and pretty clean. Doing a clean-room implementation of it is a straightforward process. Adding that to RavenDB, on the other hand, led to a number of interesting challenges. To start with, the protocol assumes, at the wire level, that all the rows you return for a query have the same structure. This is a very reasonable assumption to make for a relational database protocol, but it doesn’t hold true when you are talking about a schema-less database such as RavenDB.
Then there is the fact that clients will generate all sorts of pretty scary queries before you even get to running a user’s query. For example, take a look at how Npgsql detects the capabilities of the database it connects to. Just supporting the wire protocol isn’t sufficient; you also need to support quite a bit of additional behavior.
When we implemented this feature, we decided that we would support the wire protocol, so you are able to connect, issue queries, and get results. However, the query language itself is going to be RQL. We aren’t attempting to pretend to the outside world that we are a PostgreSQL instance, only to implement enough to make integration and compatibility work.
Here is an example of running a query through the Postgres integration.
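If you want to get a feel for it from code, here is a minimal sketch using the Npgsql client; the host, database name, credentials, and the query itself are placeholder values you would replace with your own:

using System;
using System.Threading.Tasks;
using Npgsql;

class PostgresIntegrationDemo
{
    static async Task Main()
    {
        // Placeholder connection details - point this at your RavenDB server's
        // PostgreSQL port (5433 by default) and the database you want to query.
        var connString = "Host=my-ravendb-server;Port=5433;Database=SampleDB;" +
                         "Username=my-user;Password=my-password";

        await using var conn = new NpgsqlConnection(connString);
        await conn.OpenAsync();

        // Note that the query language is RQL, not SQL - RavenDB speaks the
        // Postgres wire protocol, it does not pretend to be a SQL engine.
        await using var cmd = new NpgsqlCommand("from Orders where Company = 'companies/1-A'", conn);
        await using var reader = await cmd.ExecuteReaderAsync();
        while (await reader.ReadAsync())
        {
            Console.WriteLine(reader[0]);
        }
    }
}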
This is an experimental feature, mind you. It is showing a lot of promise, but we want to get some more feedback from our users about which way we should take it. The feature opens up many doors, but it also brings with it a non-trivial amount of complexity.
This feature requires that we open up another port to the world, which is something we require the user to explicitly allow. To enable this feature, you’ll need to set the following options in the settings.json configuration file:
"Integrations.PostgreSQL.Enabled": true"Features.Availability" : "Experimental"
You can also control which port it will use via the Integrations.PostgreSQL.Port configuration option. We default to 5433 if none is specified.
At the current time, we only allow issuing queries, not modifying data, through the Postgres integration. This is something that we would very much like more feedback on: in what kind of scenarios would you like writes to be supported? What kind of writes do you expect to issue at that point?
Finally, a word about security. The PostgreSQL protocol supports using TLS for encryption. When running in insecure mode, RavenDB will reject SSL/TLS connections from Postgres clients. When running in secured mode (the default), the same server certificate that is used inside of RavenDB will also be used for the Postgres connection. However, while we usually require that the other side also authenticates using a client certificate, in the case of Postgres connections we run into a problem. In quite a few scenarios we found that while the Postgres protocol supports mutual authentication using client certificates, the clients don’t support it.
For that reason, we are allowing user & password authentication (on top of a TLS connection, obviously) for Postgres connections at this time. Note that there is no correlation between the Postgres login and access to any other RavenDB features (where client certificates are the only option).
This is part of the RavenDB 5.3 release (expected in mid November) and will be available in the Professional and Enterprise editions.
RavenDB tries to be a good neighbor in your systems. RavenDB is typically used in polyglot solutions, and we are often brought into existing ecosystems. One of the things we do to make RavenDB easier to use is provide a full suite of built-in tools for pushing data to other destinations.
For example, you can define an ETL process that will push document changes from RavenDB (potentially transforming & filtering them) to a relational database, another RavenDB instance, a data lake / OLAP system and much more.
In RavenDB 5.3 we have added Elasticsearch as an ETL target for RavenDB. If you are familiar with RavenDB ETL processes, the behavior is pretty much what you would expect. You select which collections you want to push to Elasticsearch, you provide a script that filters and transforms the data, and then you are done. From that point on, it is RavenDB’s responsibility to keep the Elasticsearch target up to date with any changes that happen inside of RavenDB.
I’ll discuss the exact details of how to make it work shortly, but first I want to talk a bit about the usage scenario for this. Elasticsearch, just like RavenDB, is using Lucene behind the scenes to implement indexes. Unlike RavenDB, however, Elasticsearch is all about… well, searching. In that context, there is a pretty big overlap between RavenDB and Elasticsearch. In fact, one of the primary reasons we see people selecting RavenDB is that they no longer need to maintain multiple environments (one to store the data and an Elasticsearch cluster for searching it); RavenDB is able to cover both needs in a single, highly integrated and performant package.
The most common scenario for Elasticsearch ETL is when you already have an existing investment in Elasticsearch. RavenDB will naturally integrate into your environment, without needing to make any significant changes. That can enable you to start running queries, Kibana dashboards, etc. on your RavenDB documents.
Here is the transformation script:
And the configuration telling RavenDB where to go:
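As a rough sketch, here is what defining such an ETL task might look like through the C# client API. The connection string name, node URL, index name, and the configuration class and property names below are assumptions for illustration, not necessarily the exact shape of the released API:

using System.Collections.Generic;
using Raven.Client.Documents.Operations.ConnectionStrings;
using Raven.Client.Documents.Operations.ETL;
using Raven.Client.Documents.Operations.ETL.ElasticSearch;

// Register where the Elasticsearch cluster lives (node URL is a placeholder).
store.Maintenance.Send(new PutConnectionStringOperation<ElasticSearchConnectionString>(
    new ElasticSearchConnectionString
    {
        Name = "elastic-local",
        Nodes = new[] { "http://localhost:9200" }
    }));

// Define the ETL task: which collection to push, the transformation script,
// and the destination index. Note that the document id is included in the
// script (DocId) and referenced in the index configuration, as required.
store.Maintenance.Send(new AddEtlOperation<ElasticSearchConnectionString>(
    new ElasticSearchEtlConfiguration
    {
        Name = "Orders to Elasticsearch",
        ConnectionStringName = "elastic-local",
        ElasticIndexes = new List<ElasticSearchIndex>
        {
            new ElasticSearchIndex { IndexName = "orders", DocumentIdProperty = "DocId" }
        },
        Transforms = new List<Transformation>
        {
            new Transformation
            {
                Name = "OrdersToElastic",
                Collections = new List<string> { "Orders" },
                Script = @"loadToOrders({
                    DocId: id(this),
                    Company: this.Company,
                    OrderedAt: this.OrderedAt,
                    LinesCount: this.Lines.length
                });"
            }
        }
    }));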
You can push multiple collections to multiple Elasticsearch indexes. It is important to note that you must include the RavenDB document Id as a property in the script and also set it in the destination index configuration. If the Elasticsearch index doesn't already exist, RavenDB will create it for you on the fly.
This is… pretty much it. The actual feature is fully fledged, of course. You get monitoring and tracking, it will run in high-availability mode, it will be assigned an owner node in the cluster, etc. If there is a failure on the Elasticsearch side, there is no data loss; RavenDB will wait for the target to come back up and push all the data that was changed in the meantime. The ETL process is an online process, which means that you can expect to see changes in RavenDB reflected in the Elasticsearch index within a few milliseconds of the transaction commit.
This feature is available in the Professional and Enterprise editions of RavenDB and will be included in the RavenDB 5.3 release, scheduled for mid November.
About a year and a half ago, Kamran wrote an article showing how you can implement rate limits in RavenDB. He used both counters and expiration to handle this scenario. I think that incremental time series make this so much easier to work with, it’s not even funny. Consider the following code:
using var session = GetDocumentStore().OpenAsyncSession();
var now = DateTime.UtcNow;
// sum the usage recorded for this document over the past 30 seconds
var result = await session.Query<Watch>()
    .Where(x => x.Id == "api/stock-exchange")
    .Select(w =>
        RavenQuery.TimeSeries(w, "INC:Usage", now.AddSeconds(-30), now) // past 30 seconds
            .GroupBy(x => x.Days(1)) // only single result
            .Select(x => x.Sum())
            .ToList()
    ).SingleOrDefaultAsync();

if (result?.Results[0]?.Sum[0] > 50)
{
    throw new RateLimitedExceededException("More than 50 requests in the last 30 seconds, try again later");
}

// do work here....

// record this request in the incremental time series
session.IncrementalTimeSeriesFor("api/stock-exchange", "INC:Usage")
    .Increment(1);
await session.SaveChangesAsync();
Those are ~20 lines of code, and that is all you need to handle a sliding window for rate limits. This will work quite nicely even with high levels of concurrency. You can also establish more interesting scenarios, such as a maximum of 50 requests in the past 30 seconds, but also a maximum of 200 requests in the past 5 minutes, etc. If you ever tried to make sense of the way Let’s Encrypt handles rate limits, for example, you can see how you can get some really crazy complexity. That is now all handled for you. We have the part where we record the number of operations under the limit, and the rest is limited only by your imagination.

For fun, you can now decide what you want to do with this data. If you are only interested in it for rate limiting purposes, you can tell RavenDB to just delete the old data once it is no longer of use, using time series retention. You can also roll it up to a 5-minute boundary and keep that information (monitoring, audits, and billing purposes come to mind).

The whole thing is far more cohesive and easier to work with.
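To illustrate the layered limits mentioned above, a sketch of checking both windows against the same incremental time series could look like this (the 50-per-30-seconds and 200-per-5-minutes thresholds are just the example numbers from this post, and the projection shape is an assumption):

var now = DateTime.UtcNow;
var usage = await session.Query<Watch>()
    .Where(x => x.Id == "api/stock-exchange")
    .Select(w => new
    {
        // both windows read the same "INC:Usage" incremental time series
        Last30Seconds = RavenQuery.TimeSeries(w, "INC:Usage", now.AddSeconds(-30), now)
            .GroupBy(g => g.Days(1)).Select(x => x.Sum()).ToList(),
        Last5Minutes = RavenQuery.TimeSeries(w, "INC:Usage", now.AddMinutes(-5), now)
            .GroupBy(g => g.Days(1)).Select(x => x.Sum()).ToList()
    })
    .SingleOrDefaultAsync();

if (usage?.Last30Seconds.Results[0]?.Sum[0] > 50 ||
    usage?.Last5Minutes.Results[0]?.Sum[0] > 200)
{
    throw new RateLimitedExceededException("Rate limit exceeded, try again later");
}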
Everyone is on the cloud these days, and one of the things that I keep seeing pushed is the notion of usage-based billing. Basically, the idea is that you pay for what you use.
Let’s assume that we are building a software-as-a-service product where users submit an image and we do some computation on it. The actual details aren’t relevant. What matters is that our pricing model is based on how much time processing each image takes and how much memory is used. We are running this on many machines and need to figure out how to do billing at the end of the month. It turns out that this can be quite a challenge. With incremental time series, a lot of the details around that just go away.
Here is how you can implement this:
async Task Process(ProcessImageCmd cmd)
{
    var sp = Stopwatch.StartNew();
    var size = cmd.Image.SizeInBytes * GetMemMultiplierFor(cmd.Image.Format);
    await ProcessImage(cmd.Image.GetStream()); // do actual work
    sp.Stop();

    using var session = store.OpenAsyncSession();
    // keep a per-run record for the account, in the same transaction
    await session.StoreAsync(new ProcessingRecord(cmd.AccountId, cmd.Image.Name, sp.ElapsedMilliseconds));
    // record the details for billing
    session.IncrementalTimeSeriesFor(cmd.AccountId, "INC:Processing")
        .Increment(new double[] { sp.ElapsedMilliseconds, size });
    await session.SaveChangesAsync();
}
We count the required memory as well as the actual runtime and record that in an incremental time series. We also store the details of that particular run in a separate document, in the same transaction (if the user cares about that level of detail). The interesting bit is that the data is now immediately available for the user to see how much they are going to be billed.
Typically, a lot of time is spent figuring out how to record those details efficiently and then how to query and aggregate them. We have tested time series in RavenDB with billions of data points, and the internal format lends itself very well to aggregated queries.
Now you can take the code above, run it on hundreds of machines, and it will give you the proper result in the end.
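To turn the recorded points into a bill, you can aggregate them with a time series query. Here is a minimal sketch, where the Account class, the account id, and the billing period are made-up assumptions, and "INC:Processing" is the series from the code above:

using var session = store.OpenAsyncSession();
var start = new DateTime(2021, 11, 1);
var end = start.AddMonths(1);

var usage = await session.Query<Account>()
    .Where(a => a.Id == "accounts/1-A")
    .Select(a => RavenQuery.TimeSeries(a, "INC:Processing", start, end)
        .GroupBy(g => g.Days(1))   // one bucket per day
        .Select(x => x.Sum())      // sums both values recorded by Increment()
        .ToList())
    .SingleOrDefaultAsync();

// Each entry in usage.Results has Sum[0] (total processing milliseconds) and
// Sum[1] (total memory) for that day - ready to feed into the invoice.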