Near-Real-Time rule restrictions

Near-Real-Time (NRT) rules are a fairly new addition to Microsoft Sentinel. There are already blog posts detailing the functionality of this rule type and explaining the scenarios in which it can be useful. However, some of the information on Microsoft’s site has left people confused: neither the limitations nor the NRT rules themselves work the way many people expect. So instead of a general introduction to NRT rules, I will focus on these topics.

If you are not familiar with Near-Real-Time rules yet, I recommend you read the official documentation of this feature or this blog post.

If you think you know enough about these rules and how they work, then let’s take a look at the above-mentioned topics.

1. Unions and joins

According to Microsoft, since this type of rule is new, these two restrictions are in effect (among others):

The query defined in an NRT rule can reference only one table. Queries can, however, refer to multiple watchlists and to threat intelligence feeds.
You cannot use unions or joins.

From another Microsoft site:

The query itself can refer to only one table, and cannot contain unions or joins.

It is important to clarify here that the limitation regarding joins and unions is not accurate as stated. It only applies if you want to join two different data tables. The phrasing on Microsoft’s site made some people believe that unions cannot be used in an NRT rule at all. The join and union (and also lookup) operators can be used in an NRT rule as long as the second operand is not a second data table but a different entity.

So, unions and joins can still be used in the following scenarios:

Union/join the table with itself: KQL is a capable query language, so this option is rarely needed. With other SIEMs and less capable languages, you sometimes have to join a table with itself. This is rarely the case with Sentinel, but it is good to know that the option exists: a statement such as “union TableA, TableA” can be used in an NRT rule.
Use any watchlist: You are free to use these operators with watchlists, because a watchlist is not another data table.
Use externaldata/datatable: You can use the externaldata or datatable operator in an NRT rule and correlate its output with the events of a table by using union or join.

The following query can be saved as an NRT rule even though it contains joins and unions. It is just a random query that I saved successfully as an NRT rule:

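// Inline lookup table built with the datatable operator (not a Sentinel data table)
let Lookup = datatable (UserName:string, Risk:int)
    [ 'Alice', 70, 'Bob', 100 ];
// External CSV file loaded with the externaldata operator
let csv_from_repo = externaldata(network:string, geoname_id:string, continent_code:string)
    [ h@"https://raw.githubusercontent.com/<CSV_FILE>" ] with(format="csv");
// The only Sentinel data table referenced by the rule
HoneyDoc_Detectionv2_CL
| join kind=leftouter _GetWatchlist('HoneyToken_HashUser') on $left.CodeHash_s == $right.CodeHash
| union csv_from_repo
| union Lookup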

Even the rule creation page says that I can only use a single table and watchlists (there is no mention of joins or unions there). But again, in my example I also used the ’externaldata’ and ‘datatable’ operators. With these operators, you can also work around some of the table limitations in some scenarios if needed.

Please be aware that I call the tables in Sentinel ’tables’ or ‘data tables’. The latter term is not the same as the ‘datatable’ operator.

See how union and join are allowed if there is no second table in the query:

Notification on the rule creation page

And here is another image showing that an error appears (bottom of the picture) if I try to union Syslog with my data table (2 tables in 1 rule):

Not allowed NRT rule notification

So, it is true that you can reference only one table, but in all other cases you are free to use join and union. These operators are not blocked by default.

2. Ignoring delay and ingestion latency

According to MS: “Since NRT rules use the ingestion time rather than the event generation time (represented by the TimeGenerated field), you can safely ignore the data source delay and the ingestion time latency …”

So, again, it is important to clarify that this statement is true only if you use NRT rules as they were intended to be used. But people are frequently unaware of what they should and should not cover with these rules. Some of the limitations already provide guidance, but NRT rules also allow some things that you should not do. NRT rules were not designed for, and cannot properly cover, the following scenarios:

Correlate logs from two separate tables: This is prohibited in NRT rules by default. You cannot reference two tables in an NRT rule, so you cannot even make the mistake of creating a rule like this. Otherwise, the different ingestion delays of the various tables could be a problem.
Create rules based on a group of events (covering just one table): If a detection is based on multiple events, there is a chance your NRT rule won’t work. An NRT rule always checks one minute’s worth of logs based on the ingestion time. If one log arrives before the 1-minute mark and another one after it, they won’t be evaluated by the same execution of the NRT rule. So, any rule in which events must reach a threshold to trigger, or in which the ordering of the events is important, is not a good fit for an NRT rule.
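
The window-splitting problem can be illustrated with a small Python sketch. It is only a simplified model: it assumes that each NRT execution evaluates one whole-minute window of ingestion time, and the function names and sample timestamps are hypothetical.

```python
from collections import Counter


def events_per_window(ingestion_seconds):
    """Count events per one-minute ingestion window [k*60, (k+1)*60)."""
    return Counter(int(t // 60) for t in ingestion_seconds)


def threshold_fires(ingestion_seconds, threshold):
    """True if any single one-minute window holds at least `threshold` events."""
    counts = events_per_window(ingestion_seconds)
    return any(n >= threshold for n in counts.values())


# Three events ingested within 10 seconds, but straddling the 1-minute mark.
straddling = [55, 58, 65]
# Three events ingested inside the same one-minute window.
same_window = [10, 13, 20]

print(threshold_fires(straddling, threshold=3))   # False
print(threshold_fires(same_window, threshold=3))  # True
```

Although the first group of events is just as dense as the second one, no single execution sees all three events, so a threshold of 3 is never reached.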

So, it is clear that NRT rules should be used to detect single-event behaviors, such as a user opening a malicious site from a watchlist that you want to alert on immediately, or a honeypot or honeytoken alert. Another typical scenario is break-glass account usage, where quick detection is important. All these events are interesting on their own; you do not have to correlate them with other logs.

If you want to create a single event detection, then NRT rules can be the way to go (if you can accept the limitations). In this case, you can indeed safely ignore both the ingestion delay and the ingestion delay variance. While you can miss some ingested logs with normal Scheduled Analytics rules, you will never miss any event with an NRT rule, because as soon as a log is ingested the NRT rule is going to check it (with an approx. 2 minutes delay).

So, again, “you can safely ignore the data source delay and the ingestion time latency” is true in case of single event detections, but you should avoid multi-event detections at all costs.

Multi-event detection is a problem when multiple events come from the data table. However, you can still use multi-event detection if all events but one come from an ’externaldata’ source or a watchlist, since these entries are static: they have no ingestion time or ingestion delay.

3. General limitations

And here are the general limitations of NRT rules compared to the normal Scheduled Analytics rules:

	Scheduled rule	NRT rule
Built-in delay	5 minutes	2 minutes
Scheduling (frequency)	Most frequent: 5 minutes	Fixed 1 minute
Quantity	512 rules	20 rules
Tables	Multi-table	Single table
Workspace	Multi-workspace	Single workspace

One important takeaway from this table is that Scheduled rules can be executed every 5 minutes at most and have a built-in 5-minute delay, while NRT rules are executed every minute and have a built-in 2-minute delay.

Here is an example Scheduled rule with a 5-minute lookback and a 5-minute execution frequency. As you can see, it only detected the event after 9 minutes and 50 seconds due to the built-in delay (the previous execution around 1:19:32 missed this event because of the delay):

Scheduled rule delay example

Meanwhile, the NRT rule detected the HoneyDoc_Detectionv2_CL event after 2 minutes and 35 seconds (the built-in delay is 2 minutes; the ordering is the opposite here):

NRT rule delay example

This means that a Scheduled rule does not check the last 5 minutes during its execution (assuming the lookback time is configured to be 5 minutes), but the time frame between the last 10 and 5 minutes (a 5-minute time window). Similarly, an NRT rule that is executed every minute does not check the last minute. With the 2-minute built-in delay, it actually checks the time window between the last 3 and 2 minutes (a 1-minute time window).
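
The window arithmetic can be written down as a tiny Python helper. This is just a sketch of the model described here (window end = run time minus built-in delay, window start = window end minus lookback); the function name is hypothetical and all values are in minutes.

```python
def checked_window(run_minute, lookback, built_in_delay):
    """Return (start, end) of the time window a rule evaluates at `run_minute`."""
    end = run_minute - built_in_delay
    start = end - lookback
    return start, end


# Scheduled rule executed at minute 10: 5-minute lookback, 5-minute built-in delay.
print(checked_window(10, lookback=5, built_in_delay=5))  # (0, 5)

# NRT rule executed at minute 10: 1-minute window, 2-minute built-in delay.
print(checked_window(10, lookback=1, built_in_delay=2))  # (7, 8)
```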

Thus, the detection delay from ingestion for an NRT rule is between 2 and 3 minutes. By the same logic, the detection delay from event creation (assuming zero ingestion delay) for a Scheduled rule like the one above is between 5 and 10 minutes.

But with a lucky ingestion delay, the detection latency of a Scheduled rule can be as low as 0 minutes from ingestion, while it is, again, 2-3 minutes for an NRT rule. So from time to time, an event can be processed quicker by a Scheduled rule.

Let me explain this through two examples.

1. Example

In the first one, assume that a log is created at 00:20 (minutes:seconds) and the log ingestion takes 4 minutes.

This is what the timeline looks like with NRT rules:

Timestamp	Activity
00:20	The event is created, let’s call it Event A.
04:20	Event A arrives into Sentinel.
05:00	The NRT rule runs, but it does not check the last 1 minute, it checks the time window between minutes 2 and 3. Thus, it won’t see Event A.
06:00	The NRT rule runs again, it checks the logs between minutes 3 and 4.
07:00	The NRT rule runs again and this time it checks the logs between minutes 4 and 5 and it will find the event generated at 00:20 and ingested at 4:20. An alert will be created around minute 7.

Here is the same activity with a Scheduled rule that runs every 5 minutes (the most frequent execution) and has a 5-minute lookback. Also, assume that this rule does not have an ingestion_time() filter in it at all.

Timestamp	Activity
00:20	The event is created, let’s call it Event A.
04:20	Event A arrives into Sentinel.
05:00	The Scheduled rule is executed and due to the built-in delay it checks the logs between –5 and 0 minutes (TimeGenerated field).
10:00	The Scheduled rule runs again and checks the logs between minutes 0 and 5 (TimeGenerated field). Our event was created at 00:20 (in the range) and ingested at 4:20 (so it is already in Sentinel) so an alert will be created at minute 10.

2. Example

Let’s see another example, in which every parameter is the same, but the ingestion delay is 9 minutes instead of 4 minutes.

This is what the timeline looks like with NRT rules:

Timestamp	Activity
00:20	The event is created, let’s call it Event A.
09:20	Event A arrives into Sentinel.
10:00	The NRT rule runs and it checks the time window between minutes 7 and 8. Thus, it won’t see Event A.
11:00	The NRT rule runs again, it checks the logs between minutes 8 and 9.
12:00	The NRT rule runs again, and this time it checks the logs between 9 and 10, and it will find the event generated at 00:20 and ingested at 9:20. An alert will be created around minute 12.

The same with the Scheduled rule:

Timestamp	Activity
00:20	The event is created, let’s call it Event A.
09:20	Event A arrives into Sentinel.
10:00	The Scheduled rule is executed. Due to the built-in delay, it does not monitor the last 5 minutes but the time window between the 0- and 5-minute mark. However, instead of ingestion_time() the Scheduled rule uses the TimeGenerated field. So, the rule will check the events with a TimeGenerated field greater than 0 minutes and less than the 5-minute mark. It ignores the ingestion_time. Event A was created at 00:20 (TimeGenerated) and ingested at 9:20 (so it is already in Sentinel). So, this execution detects it almost immediately after ingestion (with 40 seconds delay).

I only marked the relevant actions and timings in the tables.
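
Both examples can be replayed with a short Python simulation. It is a sketch under the assumptions used in the tables: executions are aligned to whole minutes starting at 00:00, the NRT rule filters on ingestion time, and the Scheduled rule filters on TimeGenerated only. The function names are hypothetical and all times are in seconds.

```python
def nrt_detection(ingested, frequency=60, delay=120):
    """Return the first run whose ingestion-time window
    [run - delay - frequency, run - delay) contains `ingested`."""
    run = 0
    while not (run - delay - frequency <= ingested < run - delay):
        run += frequency
    return run


def scheduled_detection(generated, ingested, frequency=300, lookback=300, delay=300):
    """Return the run whose TimeGenerated window contains `generated`,
    or None if the event had not arrived yet when that run executed."""
    run = 0
    while not (run - delay - lookback <= generated < run - delay):
        run += frequency
    return run if ingested <= run else None


def mmss(seconds):
    """Format seconds as minutes:seconds, like the timestamps in the tables."""
    return f"{seconds // 60:02d}:{seconds % 60:02d}"


# Event A is generated at 00:20 and ingested 4 or 9 minutes later.
for label, ingestion_delay in (("Example 1", 4 * 60), ("Example 2", 9 * 60)):
    generated = 20
    ingested = generated + ingestion_delay
    print(label,
          "NRT:", mmss(nrt_detection(ingested)),
          "Scheduled:", mmss(scheduled_detection(generated, ingested)))
# Example 1 NRT: 07:00 Scheduled: 10:00
# Example 2 NRT: 12:00 Scheduled: 10:00
```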

I used these two examples to show that a Scheduled rule can sometimes pick up an event quicker than an NRT rule. So, compared to this Scheduled rule, NRT sometimes stands for Non-Real-Time rule instead of Near-Real-Time rule (just joking). This is rarely the case though, because Scheduled rules typically run less often than every 5 minutes. It is also important to point out that, with the rules above, the Scheduled rule would have missed every event with more than 10 minutes of delay, while the NRT rule could still process them successfully.

So, the benefit of an NRT rule is not only its speed, but also the fact that, in the case of a single-event detection, it won’t miss any event.

The Scheduled rule misses these late events because, in this example, it only sees events that were generated between 10 and 5 minutes ago and that have already arrived by the time it runs, so an event can be at most 10 minutes late (lookback time + built-in delay). An event with more than 10 minutes of delay shows up in Sentinel when its TimeGenerated value is already more than 10 minutes ‘old’.
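
The 10-minute limit can also be checked with a few lines of Python. This is a sketch of the same simplified model (a Scheduled rule running every 5 minutes from 00:00 with a 5-minute lookback and a 5-minute built-in delay, filtering on TimeGenerated only); the function names are hypothetical and all times are in seconds.

```python
def covering_run(generated, frequency=300, delay=300):
    """Return the only run whose TimeGenerated window contains `generated`.
    Assumes the lookback equals the frequency, so the windows
    [run - delay - frequency, run - delay) tile the timeline without gaps."""
    return ((generated + delay) // frequency + 1) * frequency


def is_missed(generated, ingestion_delay):
    """True if the event arrives only after its covering run has executed."""
    return generated + ingestion_delay > covering_run(generated)


print(is_missed(20, 9 * 60))   # False: Event A with a 9-minute delay is caught
print(is_missed(20, 11 * 60))  # True: with an 11-minute delay it is missed
# Any event that is more than 10 minutes late is missed, whenever it was generated.
print(all(is_missed(g, 601) for g in range(0, 3600)))  # True
```

In this model, 10 minutes is only the upper bound. An event generated right before the end of its window is covered by a run about 5 minutes later, so it tolerates less delay.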

The end

The key takeaways:

You can use unions and joins in NRT rules.
You can’t always ignore ingestion delay.
Sometimes a Scheduled rule processes a log before an NRT rule could do the same.

Be cautious and always test new features before you start using them in a production environment. Near-Real-Time rules are pretty new at this point, so I expect them to gain more functionality and have fewer limitations over time.