Data Collection Rules (DCRs) are the modern backbone of data ingestion for Microsoft Sentinel, replacing legacy methods with a scalable, flexible, and consistent approach that uses a common data ingestion pipeline for all data sources. DCRs enable advanced filtering, transformation, and routing of data before it reaches your storage solution. Beyond these core features, DCRs offer a range of lesser-known functionalities that are well worth investigating for organizations looking to maximize their monitoring capabilities.
While DCRs are best known for gathering and sending logs and events to Microsoft Sentinel, their capabilities extend well beyond serving as part of Sentinel’s data ingestion process.
In upcoming posts, we’ll explore some of these less familiar but powerful capabilities of DCRs, including alternative storage options, lesser-used data source connections, and agent configuration through DCRs. In this first article, I will provide an overview of DCR types and focus specifically on the ‘Direct’ DCR, which enables onboarding of API-based log sources without requiring an external Data Collection Endpoint.
Types of DCRs
Azure Monitor supports several different kinds of DCRs, each tailored for specific usage scenarios. The following table provides a comprehensive overview of all DCR types:
Kind | Description | Use Case / Good For | Additional information |
---|---|---|---|
Direct | Direct ingestion using the Logs ingestion API. Endpoints are created for the DCR automatically. | Custom apps, direct log ingestion via REST API. | Sample ARM code |
Windows | Collect events and performance data from Windows machines. | Windows VMs, servers, client devices. | - |
Linux | Collect events and performance data from Linux machines. | Linux VMs, servers, syslog collection. | - |
AgentDirectToStore | Send collected data directly to Azure Storage and Event Hubs. | Data archival, long-term storage, cheap storage for Azure-based VMs. | Sample ARM code |
AgentSettings | Configure Azure Monitor agent parameters. | Agent configuration management, centralized agent settings. | - |
PlatformTelemetry | Export platform metrics. | Azure resource metrics export, platform monitoring. | MS Github |
WorkspaceTransforms | Workspace transformation DCR for sources that do not support DCR directly. | Data transformation at workspace level for legacy agents and some legacy log sources. | - |
All | This is not a separate type - it’s simply what you get if you don’t define the ‘kind’ in your code, or if you choose both Linux and Windows in the GUI. | Other than for testing purposes, I generally do not recommend using this DCR type. | - |
Most people are fairly familiar with the general log collection DCRs (Windows, Linux, WorkspaceTransforms, All). In this series of articles, I will highlight some of the less commonly known types and options, and share sample deployment code to assist with your testing.
This is going to be a four-part series:
- Direct DCRs: In this post, I concentrate on the Direct DCR type, which simplifies log collection configuration for API-based sources.
- AgentDirectToStore DCRs: In the second episode, I show how to use DCRs to directly forward events from your Azure VMs to either a Storage Account or an Event Hub.
- Event Hub as a Log Relay: The third installment of this article series explores the advantages and disadvantages of utilizing Event Hub as a log relay to Sentinel - a capability now supported by DCRs.
- Coming soon
‘Direct’ DCRs
1. Role
The ‘Direct’ DCR type is designed to streamline the ingestion and management of API-based log sources, while maintaining backward compatibility with existing methods.
By eliminating the need for a dedicated Data Collection Endpoint (DCE), the ‘Direct’ DCR simplifies API-based log collection. This is accomplished by embedding a logs ingestion endpoint directly within the DCR, allowing you to send logs without setting up a separate DCE.
However, to fully appreciate these advantages and understand how they work, it’s important to first explore the roles and interactions of DCEs and DCRs within Azure.
2. DCEs and DCRs
Direct Data Collection Rules (Direct DCRs) enable applications to send logs directly to a Log Analytics workspace using the Logs Ingestion API without requiring a separate Data Collection Endpoint (DCE). They are optimized for API-based log sources where agentless ingestion is preferred.
Data Collection Endpoints are often not needed in log collection scenarios, as many sources can send data directly to a public endpoint or use the ingestion endpoint specified in the Data Collection Rule (DCR). However, when a DCE is needed, it typically serves two primary functions in the log ingestion process:
Act as a… | Description | Regionality considerations | Data collection rule configuration |
---|---|---|---|
Logs ingestion endpoint | The endpoint that ingests logs into the data ingestion pipeline. | Same region as the destination workspace | Set on the Basics tab when you create a data collection rule using the portal. |
Configuration access endpoint | The endpoint from which the AMA agent retrieves data collection rule configuration. | Same region as the monitored machine | Set on the Resources tab when you create a data collection rule using the portal. |
When using AMA-agent-based log collection, the agent must first download a configuration file and then upload events to the endpoint as specified in that configuration. This process leverages both roles of the Data Collection Endpoint.
Please note that dedicated Data Collection Endpoints (DCEs) are not always necessary for agent-based log collection. Refer to the relevant scenarios for more details.
On the other hand, API-based log sources do not need to fetch any configurations. They independently send data to the DCE/DCR based on their own local settings. As a result, these sources only require a log ingestion endpoint, which a ‘Direct’ DCR provides without the need for an external DCE.
To illustrate this difference, let’s examine the code structure of a traditional non-Direct DCR setup first:
Figure: A non-Direct DCR without an embedded DCE. The external DCE is attached to the DCR in this case.
Figure: An external DCE with both the logsIngestion and the configurationAccess endpoints.
So, using Direct DCRs removes the need for conventional data collection endpoints. When you create a DCR with the "kind": "Direct" setting, Azure automatically provisions embedded ingestion endpoints within the DCR. As a result, you only need to manage a single Azure resource - the DCR itself - while the DCE functionality is handled internally, accessible via the embedded ingestion endpoint URI.
Figure: A Direct DCR with built-in logsIngestion and metricsIngestion endpoints, but no configurationAccess endpoint.
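To make the embedded endpoints concrete, here is a minimal Python sketch that pulls the values a log sender needs out of a DCR resource document. The property paths (`properties.immutableId`, `properties.endpoints.logsIngestion`, `properties.endpoints.metricsIngestion`) mirror the outputs of the ARM template in this post; the function name, the sample document, its ID, and its hostnames are hypothetical placeholders.

```python
from typing import Any


def ingestion_details(dcr_resource: dict[str, Any]) -> dict[str, str]:
    """Return the immutable ID and embedded endpoints of a DCR resource document."""
    props = dcr_resource.get("properties", {})
    endpoints = props.get("endpoints") or {}
    if "logsIngestion" not in endpoints:
        # Assumption: a DCR without embedded endpoints relies on an external DCE.
        raise ValueError("no embedded logsIngestion endpoint found (is kind 'Direct'?)")
    return {
        "immutable_id": props["immutableId"],
        "logs_ingestion": endpoints["logsIngestion"],
        "metrics_ingestion": endpoints.get("metricsIngestion", ""),
    }


# Hypothetical, trimmed-down resource document with placeholder values.
sample_dcr = {
    "kind": "Direct",
    "properties": {
        "immutableId": "dcr-00000000000000000000000000000000",
        "endpoints": {
            "logsIngestion": "https://example-dcr.westeurope-1.ingest.monitor.azure.com",
            "metricsIngestion": "https://example-dcr.westeurope-1.metrics.ingest.monitor.azure.com",
        },
    },
}

details = ingestion_details(sample_dcr)
print(details["logs_ingestion"])
```

In practice the document would come from reading the deployed DCR resource rather than from a hard-coded dictionary.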
To send custom data to the Logs Ingestion API, you need the appropriate permissions, the immutable ID of the DCR, the logs ingestion endpoint URI, and the stream name (which some tools incorrectly call the table name). All of these details are available from a ‘Direct’ DCR, making this type of DCR fully suitable for custom API-based data ingestion scenarios. As a result, existing Data Connectors that already support the separate DCE setup for API-based log forwarding need no changes.
Figure: Simplified ingestion without an external DCE.
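The following Python sketch shows how those pieces (endpoint URI, immutable ID, stream name, and a bearer token) could be combined into a Logs Ingestion API request. It only builds the request and does not send it. The function name, the placeholder values, and the `api-version` value are assumptions to verify against the current documentation, and acquiring the Microsoft Entra token is out of scope here.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumption: Logs Ingestion API version; check the current documentation.
API_VERSION = "2023-01-01"


def build_ingestion_request(endpoint, immutable_id, stream_name, records, bearer_token):
    """Build (but do not send) a POST request for the Logs Ingestion API."""
    url = (
        f"{endpoint.rstrip('/')}/dataCollectionRules/{quote(immutable_id)}"
        f"/streams/{quote(stream_name)}?api-version={API_VERSION}"
    )
    # The request body is a JSON array of records matching the stream declaration.
    body = json.dumps(records).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; with a Direct DCR the endpoint is the embedded logsIngestion URI.
req = build_ingestion_request(
    endpoint="https://example-dcr.westeurope-1.ingest.monitor.azure.com",
    immutable_id="dcr-00000000000000000000000000000000",
    stream_name="Custom-DataTable",
    records=[{"TimeGenerated": "2025-01-01T00:00:00Z", "RawData": "hello"}],
    bearer_token="<token>",
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

Because the request shape is the same whether the endpoint belongs to a DCE or is embedded in the DCR, a sender written for the DCE setup only needs a different endpoint value.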
3. Why Direct DCRs are not commonly used
- The Azure Portal’s table-creation wizard defaults to creating DCRs without a kind attribute (shown as ‘All’ in the GUI) and requires a separate DCE. So, if you use the GUI for table creation, you will automatically end up with a different DCR type and a DCE.
- Official samples and documentation rarely use Direct DCRs; most tutorials focus on agent-based collection or DCE-backed DCRs.
4. My recommendations
Consider the following DCR management and implementation best practices to optimize your data collection strategy:
- Create separate DCRs for multiple sources: While it’s technically possible to ingest multiple data streams through a single Direct DCR, I recommend creating individual DCRs when you have more than two sources. This approach simplifies both management and troubleshooting processes.
- Use programmatic table creation: Create DCR-based tables programmatically rather than through the Azure Portal GUI to avoid unnecessary DCR and DCE resource creation.
- Use the latest API version: Deploy using API version 2023-03-11 (the newest version at the time of writing) to access all configuration options and detailed information for Direct DCR types.
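As an illustration of the programmatic table creation recommended above, this Python sketch assembles the URL and body for a table create-or-update call against the Log Analytics workspace Tables REST API. It only builds the payload; sending it as an authenticated PUT is left out. The API version, the helper name, and the workspace resource ID are assumptions, so check them against the current REST reference before use.

```python
import json

# Assumption: Tables API version; verify against the current documentation.
TABLES_API_VERSION = "2022-10-01"


def build_table_request(workspace_resource_id, table_name, columns):
    """Return the management URL and JSON body for creating a custom table."""
    if not table_name.endswith("_CL"):
        raise ValueError("custom table names must end with _CL")
    url = (
        f"https://management.azure.com{workspace_resource_id}"
        f"/tables/{table_name}?api-version={TABLES_API_VERSION}"
    )
    body = {"properties": {"schema": {"name": table_name, "columns": columns}}}
    return url, body


# Hypothetical workspace ID; the columns match the default stream of the template in this post.
url, body = build_table_request(
    "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-sentinel"
    "/providers/Microsoft.OperationalInsights/workspaces/law-sentinel",
    "DataTable_CL",
    [
        {"name": "TimeGenerated", "type": "datetime"},
        {"name": "RawData", "type": "string"},
    ],
)
print(url)
print(json.dumps(body, indent=2))
```

Creating the table this way leaves you free to deploy a Direct DCR yourself, instead of inheriting the DCR and DCE that the portal wizard creates.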
5. Getting started
Use my ARM template available on GitLab to experiment with Direct DCR implementation. The template includes a default stream structure that you can customize before or after deployment. The creation of a ‘Direct’ DCR is not available on the GUI in the Azure Portal, so this ARM template is a good starting point.
{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "dataCollectionRuleName": {
      "type": "string",
      "metadata": {
        "description": "Specifies the name of the Data Collection Rule to create."
      }
    },
    "location": {
      "defaultValue": "[resourceGroup().location]",
      "type": "string",
      "metadata": {
        "description": "Specifies the location in which to create the Data Collection Rule."
      }
    },
    "workspaceResourceId": {
      "type": "string",
      "metadata": {
        "description": "Specifies the Azure resource ID of the Log Analytics workspace (Sentinel) to use."
      }
    }
  },
  "resources": [
    {
      "type": "Microsoft.Insights/dataCollectionRules",
      "apiVersion": "2023-03-11",
      "name": "[parameters('dataCollectionRuleName')]",
      "location": "[parameters('location')]",
      "kind": "Direct",
      "properties": {
        "streamDeclarations": {
          "Custom-DataTable": {
            "columns": [
              {
                "name": "TimeGenerated",
                "type": "datetime"
              },
              {
                "name": "RawData",
                "type": "string"
              }
            ]
          }
        },
        "destinations": {
          "logAnalytics": [
            {
              "workspaceResourceId": "[parameters('workspaceResourceId')]",
              "name": "DestLogAnalytics"
            }
          ]
        },
        "dataFlows": [
          {
            "streams": [
              "Custom-DataTable"
            ],
            "destinations": [
              "DestLogAnalytics"
            ],
            "transformKql": "source",
            "outputStream": "Custom-DataTable_CL"
          }
        ]
      }
    }
  ],
  "outputs": {
    "dataCollectionRuleResourceId": {
      "type": "string",
      "value": "[resourceId('Microsoft.Insights/dataCollectionRules', parameters('dataCollectionRuleName'))]"
    },
    "immutableId": {
      "type": "string",
      "value": "[reference(resourceId('Microsoft.Insights/dataCollectionRules', parameters('dataCollectionRuleName'))).immutableId]"
    },
    "logsIngestionEndpoint": {
      "type": "string",
      "value": "[reference(resourceId('Microsoft.Insights/dataCollectionRules', parameters('dataCollectionRuleName'))).endpoints.logsIngestion]"
    },
    "metricsIngestionEndpoint": {
      "type": "string",
      "value": "[reference(resourceId('Microsoft.Insights/dataCollectionRules', parameters('dataCollectionRuleName'))).endpoints.metricsIngestion]"
    }
  }
}
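If you customize the stream structure before deploying, a quick local consistency check can catch naming mismatches early. The Python sketch below is a hypothetical helper, not part of any Azure tooling: it verifies that every custom stream used in `dataFlows` is declared under `streamDeclarations` and that every referenced destination name exists. It only understands list-valued destination types such as `logAnalytics`, which is an assumption that holds for the template above.

```python
from typing import Any


def check_dcr_properties(props: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems found in a DCR 'properties' object."""
    problems: list[str] = []
    declared = set(props.get("streamDeclarations", {}))
    destination_names: set[str] = set()
    for entries in props.get("destinations", {}).values():
        if isinstance(entries, list):  # e.g. the "logAnalytics" destination list
            destination_names.update(e["name"] for e in entries if "name" in e)
    for index, flow in enumerate(props.get("dataFlows", [])):
        for stream in flow.get("streams", []):
            # Only custom streams need a matching declaration in the DCR.
            if stream.startswith("Custom-") and stream not in declared:
                problems.append(f"dataFlows[{index}]: stream '{stream}' is not declared")
        for dest in flow.get("destinations", []):
            if dest not in destination_names:
                problems.append(f"dataFlows[{index}]: destination '{dest}' is not defined")
    return problems


# The 'properties' section of the template, with a placeholder workspace ID.
template_properties = {
    "streamDeclarations": {
        "Custom-DataTable": {
            "columns": [
                {"name": "TimeGenerated", "type": "datetime"},
                {"name": "RawData", "type": "string"},
            ]
        }
    },
    "destinations": {
        "logAnalytics": [
            {"workspaceResourceId": "<workspace resource ID>", "name": "DestLogAnalytics"}
        ]
    },
    "dataFlows": [
        {
            "streams": ["Custom-DataTable"],
            "destinations": ["DestLogAnalytics"],
            "transformKql": "source",
            "outputStream": "Custom-DataTable_CL",
        }
    ],
}

print(check_dcr_properties(template_properties))
```

An empty list means the stream and destination names line up; it says nothing about whether the target table exists in the workspace.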
Continuing the series
If you are interested in learning about other lesser-known aspects of DCRs, check out the other articles in this four-part series:
- Direct DCRs: In this current post I’m focusing on the Direct DCR kind that can simplify the log collection config for API-based sources.
- AgentDirectToStore DCRs: In the second episode, I show how to use DCRs to directly forward events from your Azure VMs to either a Storage Account or an Event Hub.
- Event Hub as a Log Relay: The third installment of this article series explores the advantages and disadvantages of utilizing Event Hub as a log relay to Sentinel - a capability now enabled by DCRs.
- Coming soon