We built a serverless Lambda pipeline that ships FSx for ONTAP audit logs to Dynatrace via the Log Ingest API v2. The real value: Dynatrace's Davis AI can automatically correlate file access anomalies with application performance degradation, answering "why is the app slow?" with "because 500 users hit the same NFS share simultaneously."

    FSx for ONTAP → S3 Access Point → EventBridge Scheduler → Lambda → Dynatrace Log Ingest API v2 → Davis AI

    Davis AI correlates:
      • File access anomalies
      • APM metrics
      • Infrastructure health
      → Root cause in seconds

Verified on Dynatrace SaaS Trial (Tokyo-equivalent region). Logs visible in Logs Viewer within 1-2 minutes.
This is Part 11 of the Serverless Observability for FSx for ONTAP series.
Most observability tools treat storage logs as isolated data. Dynatrace instead builds a topology map of your entire stack and uses Davis AI to find causal relationships through time-window correlation and entity connectivity:
| Scenario | Without Dynatrace | With Dynatrace |
|---|---|---|
| App latency spike | "Check the logs" | Davis AI detects temporal correlation: file access to /vol/data/ increased 10x within the same 5-minute window as app response time degradation, connected via topology (app → NFS mount → SVM) |
| Storage I/O anomaly | Manual investigation | Automatic correlation via shared topology entities: Davis identifies which services are affected based on entity relationships |
| User reports slow file access | Grep through audit logs | DQL query + topology view showing the full dependency path from user request to storage operation |
The key differentiator: Davis AI correlates events across entities that share topology connections within overlapping time windows, rather than relying on keyword matching or manual dashboard correlation.

    Event Sources

    Audit logs: EventBridge Scheduler, rate(5 minutes)
                  → Lambda (lists new files via S3 Access Point, checkpoint in SSM)
                  → Dynatrace Log Ingest API v2 (Api-Token auth)

    EMS:        EMS Webhook → API GW → Lambda (ems_handler)
                  → Dynatrace Log Ingest API v2

    FPolicy:    FPolicy → ECS Fargate → SQS → Bridge Lambda → EventBridge
                  → Lambda (fpolicy_handler)
                  → Dynatrace Log Ingest API v2

    Dynatrace:  Logs Viewer, Davis AI, DQL, Dashboards

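The poller's checkpoint step can be sketched as a pure function. This is a minimal illustration, not the repository's implementation: it assumes object keys sort chronologically (as a layout like `audit/2026/01/15/audit-001.json` suggests) and that the checkpoint stored in SSM is the last key processed. `select_new_keys` is a hypothetical name.

```python
from typing import Iterable, List, Optional, Tuple


def select_new_keys(
    keys: Iterable[str], checkpoint: Optional[str]
) -> Tuple[List[str], Optional[str]]:
    """Return the keys newer than the checkpoint, plus the next checkpoint.

    Assumes keys sort lexicographically in chronological order.
    """
    ordered = sorted(keys)
    if checkpoint is None:
        new_keys = ordered  # first run: nothing processed yet
    else:
        new_keys = [key for key in ordered if key > checkpoint]
    # If nothing is new, the checkpoint stays where it was.
    next_checkpoint = new_keys[-1] if new_keys else checkpoint
    return new_keys, next_checkpoint


listed = [
    "audit/2026/01/15/audit-002.json",
    "audit/2026/01/15/audit-001.json",
    "audit/2026/01/14/audit-009.json",
]
new_keys, next_checkpoint = select_new_keys(
    listed, "audit/2026/01/15/audit-001.json"
)
print(new_keys)
print(next_checkpoint)
```

Persisting `next_checkpoint` only after Dynatrace has accepted the batch keeps a failed run retryable on the next 5-minute tick.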
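When you ship FSx for ONTAP logs to Dynatrace alongside your APM data, Davis AI can detect patterns like the one in the table above: file access to /vol/data/ rising 10x in the same 5-minute window as an application response time degradation.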
This works because Dynatrace maps your FSx for ONTAP SVM as a custom device entity in its topology, connecting it to the applications that access it.
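Create a Dynatrace API token with the `logs.ingest` scope. The token has the format `dt0c01.<TOKEN_ID>.<TOKEN_SECRET>`. Store it in AWS Secrets Manager:

    aws secretsmanager create-secret \
      --name "dynatrace/fsxn-api-token" \
      --secret-string '{"api_token":"dt0c01.XXXXXXXX.YYYYYYYY"}' \
      --region ap-northeast-1
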
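A malformed secret is easier to catch before the first ingest call than after it. The sketch below parses the secret string and checks that the token has the `dt0c01.<TOKEN_ID>.<TOKEN_SECRET>` shape. The pattern is deliberately loose (segment lengths are not enforced), and `extract_api_token` is a hypothetical helper, not part of the repository.

```python
import json
import re

# Loose shape check only; segment lengths and alphabet are assumptions.
TOKEN_PATTERN = re.compile(r"^dt0c01\.[A-Za-z0-9]+\.[A-Za-z0-9]+$")


def extract_api_token(secret_string: str) -> str:
    """Parse the secret JSON and return the token if it looks well formed."""
    payload = json.loads(secret_string)
    token = payload.get("api_token", "")
    if not TOKEN_PATTERN.match(token):
        raise ValueError(
            "api_token does not match dt0c01.<TOKEN_ID>.<TOKEN_SECRET>"
        )
    return token


token = extract_api_token('{"api_token":"dt0c01.XXXXXXXX.YYYYYYYY"}')
print(token.split(".")[0])
```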
Deploy the CloudFormation stack:

    aws cloudformation deploy \
      --template-file integrations/dynatrace/template.yaml \
      --stack-name fsxn-dynatrace-integration \
      --parameter-overrides \
        S3AccessPointArn=arn:aws:s3:ap-northeast-1:123456789012:accesspoint/fsxn-audit-ap \
        DynatraceApiTokenSecretArn=arn:aws:secretsmanager:ap-northeast-1:123456789012:secret:dynatrace/fsxn-api-token-XXXXXX \
        DynatraceEnvUrl=https://abc12345.live.dynatrace.com \
        S3BucketName=my-fsxn-audit-bucket \
      --capabilities CAPABILITY_NAMED_IAM \
      --region ap-northeast-1

Navigate to Logs → View logs → Run query:

    fetch logs
    | filter log.source == "fsxn-ontap"

Logs should appear within 1-2 minutes.
Each audit log event is shipped with structured attributes for DQL querying:

    {
      "content": "{\"EventID\":\"4663\",\"UserName\":\"admin@corp.local\",...}",
      "log.source": "fsxn-ontap",
      "dt.source_entity": "CUSTOM_DEVICE-fsxn-svm-prod-01",
      "timestamp": "2026-01-15T12:00:00Z",
      "severity": "info",
      "fsxn.svm": "svm-prod-01",
      "fsxn.operation": "ReadData",
      "fsxn.user": "admin@corp.local",
      "fsxn.path": "/vol/data/file.txt",
      "fsxn.s3_key": "audit/2026/01/15/audit-001.json"
    }

The `dt.source_entity` field links logs to a custom device in Dynatrace's topology, enabling Davis AI correlation.
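The mapping from a parsed audit event to this record shape can be sketched as follows. Only `EventID` and `UserName` appear in the article's sample content; the source keys `Operation` and `ObjectName` are hypothetical placeholders for wherever the operation and path live in your audit events, and `build_log_record` is an illustrative name.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def build_log_record(
    event: Dict[str, Any],
    svm: str,
    s3_key: str,
    now: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Map one parsed audit event to a structured log record."""
    now = now or datetime.now(timezone.utc)
    return {
        # Raw event kept as a compact JSON string.
        "content": json.dumps(event, separators=(",", ":")),
        "log.source": "fsxn-ontap",
        "dt.source_entity": f"CUSTOM_DEVICE-fsxn-{svm}",
        "timestamp": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "severity": "info",
        "fsxn.svm": svm,
        # "Operation" and "ObjectName" are assumed source keys.
        "fsxn.operation": event.get("Operation", "unknown"),
        "fsxn.user": event.get("UserName", "unknown"),
        "fsxn.path": event.get("ObjectName", "unknown"),
        "fsxn.s3_key": s3_key,
    }


record = build_log_record(
    {
        "EventID": "4663",
        "UserName": "admin@corp.local",
        "Operation": "ReadData",
        "ObjectName": "/vol/data/file.txt",
    },
    svm="svm-prod-01",
    s3_key="audit/2026/01/15/audit-001.json",
    now=datetime(2026, 1, 15, 12, 0, 0, tzinfo=timezone.utc),
)
print(json.dumps(record, indent=2))
```

Deriving `dt.source_entity` from the SVM name in one place keeps the entity reference identical across every record for that SVM.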
Dynatrace Query Language (DQL) lets you filter, aggregate, and chart these logs by their structured attributes:

    // All failed file access attempts (using structured attributes)
    fetch logs
    | filter log.source == "fsxn-ontap"
    | filter fsxn.result == "Failure"
    | summarize count(), by: {fsxn.user, fsxn.path}

    // Top operations by volume
    fetch logs
    | filter log.source == "fsxn-ontap"
    | summarize ops = count(), by: {fsxn.operation}
    | sort ops desc

    // Access timeline for a specific SVM
    fetch logs
    | filter fsxn.svm == "svm-prod-01"
    | makeTimeseries count(), interval: 5m


    // File access volume vs app response time (side-by-side)
    fetch logs
    | filter log.source == "fsxn-ontap"
    | makeTimeseries file_ops = count(), interval: 5m
    // Correlate with service metrics in a dashboard
    // (Place this next to a service response time tile)

    // Find users causing the most I/O during a performance incident
    fetch logs
    | filter log.source == "fsxn-ontap"
    | filter timestamp >= now() - 1h
    | summarize ops = count(), by: {fsxn.user}
    | sort ops desc
    | limit 10


    // Detect potential ransomware (mass file modifications)
    // Counts writes and deletes per 1-minute bucket and keeps buckets above 100
    fetch logs
    | filter log.source == "fsxn-ontap"
    | filter fsxn.operation == "WriteData" or fsxn.operation == "Delete"
    | summarize write_ops = count(), by: {window = bin(timestamp, 1m)}
    | filter write_ops > 100

    // After-hours access
    fetch logs
    | filter log.source == "fsxn-ontap"
    | filter getHour(timestamp) < 7 or getHour(timestamp) > 19
    | summarize count(), by: {fsxn.user, fsxn.path}

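To sanity-check the threshold rule before wiring it to an alert, the per-minute count behind the mass-modification query above can be reproduced offline against sample records. This is a sketch under the assumption that records carry the `timestamp` and `fsxn.operation` attributes from the log event format; `busy_minutes` is a hypothetical helper, and the threshold of 100 mirrors the query.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable, List, Tuple

WRITE_LIKE = {"WriteData", "Delete"}


def busy_minutes(
    records: Iterable[Dict[str, str]], threshold: int = 100
) -> List[Tuple[str, int]]:
    """Count write/delete events per minute; return minutes above threshold."""
    per_minute: Counter = Counter()
    for rec in records:
        if rec.get("fsxn.operation") in WRITE_LIKE:
            ts = datetime.strptime(rec["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
            per_minute[ts.strftime("%Y-%m-%dT%H:%MZ")] += 1
    return sorted((m, n) for m, n in per_minute.items() if n > threshold)


# Synthetic data: a burst of writes at 12:00, light deletes at 12:01,
# and heavy reads that the rule should ignore.
sample = (
    [{"fsxn.operation": "WriteData",
      "timestamp": f"2026-01-15T12:00:{i % 60:02d}Z"} for i in range(150)]
    + [{"fsxn.operation": "Delete",
        "timestamp": f"2026-01-15T12:01:{i:02d}Z"} for i in range(5)]
    + [{"fsxn.operation": "ReadData",
        "timestamp": f"2026-01-15T12:00:{i % 60:02d}Z"} for i in range(200)]
)
flagged = busy_minutes(sample)
print(flagged)
```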
| Deployment | URL Format | Data Location |
|---|---|---|
| SaaS | `https://<env-id>.live.dynatrace.com` | Dynatrace-managed (region-specific) |
| Managed | `https://<your-domain>/e/<env-id>` | Your infrastructure |
| ActiveGate | `https://<host>:9999/e/<env-id>` | Your network (proxy) |
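For data sovereignty requirements, Dynatrace Managed keeps all data within your infrastructure. An ActiveGate keeps the ingest endpoint inside your network, but it forwards data to the Dynatrace cluster it is connected to, so the data location follows that cluster.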
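Whichever deployment you use, the ingest endpoint is the environment URL followed by `/api/v2/logs/ingest`. A small sketch of that derivation, with a guard against non-HTTPS values; the Managed and ActiveGate hostnames below are placeholders, and `ingest_url` is an illustrative helper rather than the repository's code.

```python
from urllib.parse import urlparse

INGEST_PATH = "/api/v2/logs/ingest"


def ingest_url(env_url: str) -> str:
    """Append the Log Ingest API v2 path to a Dynatrace environment URL."""
    parsed = urlparse(env_url)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"expected an https environment URL, got {env_url!r}")
    return env_url.rstrip("/") + INGEST_PATH


for url in (
    "https://abc12345.live.dynatrace.com",               # SaaS
    "https://dynatrace.example.com/e/abc12345",          # Managed (placeholder host)
    "https://activegate.example.com:9999/e/abc12345/",   # ActiveGate (placeholder host)
):
    print(ingest_url(url))
```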
Dynatrace pricing is based on Davis Data Units (DDU):
| Monthly Log Volume | DDU/month (est.) | Monthly DDU Cost |
|---|---|---|
| 1 GB | ~1 DDU | Minimal (within base allocation) |
| 10 GB | ~10 DDU | ~$25/month (at $2.50/DDU) |
| 100 GB | ~100 DDU | ~$250/month |
| Component | Monthly Cost (10 GB/month) |
|---|---|
| Lambda (5-min polling) | ~$3 |
| EventBridge Scheduler | ~$1 |
| Secrets Manager | ~$1 |
| Dynatrace DDU | ~$25 |
| Total | ~$30 |
DDU pricing varies by contract. The 14-day trial includes a generous DDU allocation for validation. Check your license terms before estimating production costs.
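The arithmetic behind these tables can be written out as a small function. It uses only the article's example figures: roughly 1 DDU per GB, $2.50 per DDU, and about $5 of AWS charges at the 10 GB tier. All three are estimates that will differ per contract and workload, and `estimate_monthly_cost` is a hypothetical name.

```python
def estimate_monthly_cost(
    log_gb: float,
    ddu_per_gb: float = 1.0,     # article's estimate: ~1 DDU per GB
    usd_per_ddu: float = 2.50,   # article's example rate
    aws_fixed_usd: float = 5.0,  # Lambda ~$3 + Scheduler ~$1 + Secrets Manager ~$1
) -> float:
    """Estimate total monthly cost in USD for a given monthly log volume."""
    dynatrace_usd = log_gb * ddu_per_gb * usd_per_ddu
    return dynatrace_usd + aws_fixed_usd


for gb in (1, 10, 100):
    print(gb, "GB/month ->", estimate_monthly_cost(gb), "USD")
```

The AWS portion is held constant here for simplicity; Lambda cost in particular will grow with log volume, so treat results far from 10 GB/month as rough.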
| # | Discovery | Impact |
|---|---|---|
| 1 | API returns HTTP 204 on success (not 200) | Lambda must treat 204 as success |
| 2 | Trial environment has 1-2 minute ingestion lag | Wait before checking Logs Viewer |
| 3 | `logs.ingest` scope is required; `ReadConfig`/`WriteConfig` won't work | Token creation must select correct scope |
| 4 | `logs.read` scope needed separately for API-based queries | Create a second token for automation |
| 5 | Log entries older than 24 hours may be rejected | Use current timestamps in test data |
| 6 | Max 1MB per request (smallest batch limit in this series) | Lambda splits large batches |
| 7 | Firehose delivery requires ActiveGate (not direct to SaaS) | Use Lambda direct for simplicity |
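Discoveries 1, 5, and 6 translate directly into sender logic: drop records older than 24 hours, split what remains into batches under the size limit, and count only HTTP 204 as success. The following is a minimal standard-library sketch of that logic, not the repository's Lambda code. The 1,000,000-byte cap is an assumed safety margin under the 1 MB limit, and the function names are illustrative.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Iterator, List

Record = Dict[str, Any]
MAX_BATCH_BYTES = 1_000_000  # assumed margin under the 1 MB request limit


def drop_stale(records: List[Record], now: datetime,
               max_age: timedelta = timedelta(hours=24)) -> List[Record]:
    """Keep only records whose timestamp is within max_age of now."""
    cutoff = now - max_age
    fresh = []
    for rec in records:
        ts = datetime.strptime(rec["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
        if ts.replace(tzinfo=timezone.utc) >= cutoff:
            fresh.append(rec)
    return fresh


def encode(obj: Any) -> bytes:
    """Serialize compactly so size estimates match what is sent."""
    return json.dumps(obj, separators=(",", ":")).encode("utf-8")


def split_batches(records: List[Record],
                  max_bytes: int = MAX_BATCH_BYTES) -> Iterator[List[Record]]:
    """Yield record lists whose JSON array encoding fits in max_bytes.

    A single record larger than max_bytes is still yielded on its own.
    """
    batch: List[Record] = []
    size = 2  # the enclosing "[" and "]"
    for rec in records:
        rec_size = len(encode(rec)) + 1  # +1 for the separating comma
        if batch and size + rec_size > max_bytes:
            yield batch
            batch, size = [], 2
        batch.append(rec)
        size += rec_size
    if batch:
        yield batch


def send_batch(env_url: str, api_token: str, batch: List[Record]) -> bool:
    """POST one batch; only HTTP 204 is counted as success here."""
    request = urllib.request.Request(
        env_url.rstrip("/") + "/api/v2/logs/ingest",
        data=encode(batch),
        method="POST",
        headers={
            "Authorization": f"Api-Token {api_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status == 204


now = datetime(2026, 1, 15, 12, 0, tzinfo=timezone.utc)
sample = [
    {"content": "a" * 300, "timestamp": "2026-01-15T11:59:00Z"},
    {"content": "b" * 300, "timestamp": "2026-01-15T11:58:00Z"},
    {"content": "c" * 300, "timestamp": "2026-01-13T00:00:00Z"},  # older than 24 h
]
fresh = drop_stale(sample, now)
batches = list(split_batches(fresh, max_bytes=500))
print(len(fresh), len(batches))
```

The example exercises only the filtering and batching steps; `send_batch` needs a real environment URL and token, so it is defined but not called.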
To get the most from Davis AI correlation, all three prerequisites must be in place:
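- OneAgent installed on the application hosts that access the share
- Logs shipped with the `dt.source_entity` field set
- A custom device entity (matching `dt.source_entity`): this creates the storage-side topology node. Use the custom device endpoint (`POST /api/v2/entities/custom`) or Settings API to pre-create the device entity before first log ingestion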
Prerequisites for correlation: Davis AI correlation only activates when all three components are connected in the topology. Without OneAgent on the application hosts, Davis AI cannot establish the causal link between file access patterns and application performance. The custom device entity must use a consistent naming convention (e.g., `CUSTOM_DEVICE-fsxn-{svm-name}`) across all log entries.

    Application (OneAgent) ──▶ NFS/SMB ──▶ FSx for ONTAP (SVM)
             │                                     │
             │ APM metrics                         │ Audit logs
             ▼                                     ▼
                       Dynatrace Davis AI
                     (automatic correlation)

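Because correlation depends on every log entry pointing at the same entity name, it can help to check the naming convention before sending. A short sketch, assuming records carry the `fsxn.svm` and `dt.source_entity` attributes; `expected_entity` and `find_mismatches` are hypothetical helpers.

```python
from typing import Dict, Iterable, List


def expected_entity(svm: str) -> str:
    """Entity name under the CUSTOM_DEVICE-fsxn-{svm-name} convention."""
    return f"CUSTOM_DEVICE-fsxn-{svm}"


def find_mismatches(records: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return records whose dt.source_entity does not match their SVM."""
    return [
        rec for rec in records
        if rec.get("dt.source_entity") != expected_entity(rec.get("fsxn.svm", ""))
    ]


records = [
    {"fsxn.svm": "svm-prod-01",
     "dt.source_entity": "CUSTOM_DEVICE-fsxn-svm-prod-01"},
    {"fsxn.svm": "svm-prod-01",
     "dt.source_entity": "CUSTOM_DEVICE-fsxn-SVM-PROD-01"},  # case drift
    {"fsxn.svm": "svm-prod-02"},                              # field missing
]
bad = find_mismatches(records)
print(len(bad))
```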
This integration follows the project's Production Readiness Levels:
| Level | What You Get | Go/No-Go to Next |
|---|---|---|
| Level 1 (this Quick Start) | Audit poller + DLQ | Logs arrive, checkpoint advances, DLQ empty 24h |
| Level 2 | + DQL dashboards + alerts | SLOs met 7 days, security review done |
| Level 3 | + DynamoDB ledger + Davis AI correlation | SLOs met 30 days, compliance pack |
| Level 4 | + OTel Collector + redaction + OneAgent | Multi-backend, PII redaction, full topology |
Data classification: Dynatrace receives the `fsxn.user` and `fsxn.path` fields (PII/sensitive). Dynatrace SaaS environments are region-specific, so select a region matching your data residency requirements. With Dynatrace Managed, data stays in your infrastructure; an ActiveGate proxies ingest traffic from inside your network to the cluster it is connected to. See the Data Classification Guide.
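If these fields must not leave your account in clear text, one option is to pseudonymize them in the Lambda before ingestion. The sketch below is an illustration of that idea using a keyed hash, not the OTel Collector redaction that Level 4 refers to. The key handling, the choice to keep the first two path components, and the function names are all assumptions.

```python
import hashlib
import hmac
from typing import Dict


def pseudonymize(value: str, key: bytes) -> str:
    """Return a stable, keyed 16-hex-character digest of value."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def redact_record(record: Dict[str, str], key: bytes) -> Dict[str, str]:
    """Return a copy with fsxn.user and the tail of fsxn.path pseudonymized."""
    redacted = dict(record)
    user = redacted.get("fsxn.user")
    if user is not None:
        redacted["fsxn.user"] = "user-" + pseudonymize(user, key)
    path = redacted.get("fsxn.path")
    if path is not None:
        parts = path.split("/")       # "/vol/data/file.txt" -> ["", "vol", "data", "file.txt"]
        prefix = "/".join(parts[:3])  # keep "/vol/data" so volume-level queries still work
        rest = "/".join(parts[3:])
        redacted["fsxn.path"] = (
            f"{prefix}/{pseudonymize(rest, key)}" if rest else prefix
        )
    return redacted


key = b"example-key-load-from-secrets-manager"  # placeholder, not a real key
original = {"fsxn.user": "admin@corp.local", "fsxn.path": "/vol/data/file.txt"}
redacted = redact_record(original, key)
print(redacted)
```

A keyed hash keeps per-user and per-file aggregation in DQL working while hiding the raw values. Note that the raw event in `content` carries the same identifiers and would need the same treatment.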
Full criteria: Pipeline SLO Definitions | DLQ Replay Runbook
| Template | Purpose | Key Parameters |
|---|---|---|
| `template.yaml` | FSx audit log poller | S3AccessPointArn, DynatraceApiTokenSecretArn, DynatraceEnvUrl |
| `template-ems.yaml` | EMS webhook handler | DynatraceApiTokenSecretArn, DynatraceEnvUrl |
| `template-fpolicy.yaml` | FPolicy EventBridge handler | DynatraceApiTokenSecretArn, DynatraceEnvUrl, EventBusName |
Questions about the Dynatrace integration or Davis AI correlation? Drop a comment below.
GitHub: github.com/Yoshiki0705/fsxn-observability-integrations