Drift Detection
Monitor and remediate policy drift in your clusters
Learn how to detect and remediate configuration drift between your declared policies and actual cluster state.
What is Drift?
Policy drift occurs when the actual policies deployed in your cluster differ from the desired state defined in ClusterSpecification resources.
Types of Drift
- Missing Policies: Expected policy not found in cluster
- Modified Policies: Policy exists but has been changed (enforcement mode, rules, etc.)
- Extra Policies: Unexpected policies found in cluster (not in specification)
How It Works
kspec continuously monitors your clusters for drift:
- Fetch Desired State: Read ClusterSpecification resources
- Fetch Actual State: Query cluster for deployed Kyverno policies
- Compare: Detect differences
- Report: Generate DriftReport with findings
- Remediate: (Optional) Auto-fix drift
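The compare step can be illustrated with a small shell sketch. This is not kspec's implementation. It assumes the desired and actual policy names have already been written to two files, one name per line; the function name `classify_drift` and both input files are hypothetical. It covers only missing and extra policies, because detecting modified policies requires comparing policy contents, not just names.

```shell
# Sketch only: classify drift from two lists of policy names.
# $1 = file with desired policy names, $2 = file with actual policy names
# (both hypothetical inputs, one name per line).
classify_drift() {
  desired="$1"
  actual="$2"
  # Lines in the desired list with no exact match in the actual list
  grep -Fxv -f "$actual" "$desired" | sed 's/^/missing /'
  # Lines in the actual list with no exact match in the desired list
  grep -Fxv -f "$desired" "$actual" | sed 's/^/extra /'
}

# One way to produce the actual list (standard kubectl JSONPath output):
#   kubectl get clusterpolicy \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' > actual.txt
```

Each output line starts with the drift type, followed by the policy name, which keeps the result easy to filter with standard tools.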
Enabling Drift Detection
Drift detection runs automatically when you create a ClusterSpecification:
apiVersion: kspec.io/v1alpha1
kind: ClusterSpecification
metadata:
  name: production-spec
  namespace: kspec-system
spec:
  targetClusterRef:
    name: production-cluster
  enforcementMode: enforce
  driftDetection:
    enabled: true
    interval: "5m" # Check every 5 minutes
    autoRemediate: false # Don't auto-fix (default)
  policies:
    - id: "pod-security"
      # ... policy definition
Viewing Drift Reports
Get All Drift Reports
kubectl get driftreport -n kspec-system
View Detailed Report
kubectl get driftreport production-spec-drift -n kspec-system -o yaml
Example output:
apiVersion: kspec.io/v1alpha1
kind: DriftReport
metadata:
  name: production-spec-drift
  namespace: kspec-system
spec:
  clusterRef:
    name: production-cluster
  specRef:
    name: production-spec
  timestamp: "2025-12-30T15:30:00Z"
status:
  driftDetected: true
  driftEvents:
    - type: "modified"
      policyID: "pod-security"
      resourceName: "require-run-as-non-root"
      message: "Policy validationFailureAction changed from 'enforce' to 'audit'"
      detectedAt: "2025-12-30T15:30:00Z"
    - type: "missing"
      policyID: "network-policies"
      resourceName: "require-network-policy"
      message: "Expected policy not found in cluster"
      detectedAt: "2025-12-30T15:30:00Z"
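For a quick summary instead of the full YAML, the events can be flattened and counted per type. The sketch below assumes the `status.driftEvents` fields shown in the example output; the helper name `count_by_type` is hypothetical and not part of kspec.

```shell
# Sketch only: count drift events per type.
# Reads "type<TAB>policyID<TAB>resourceName" lines on stdin and prints
# one "type count" line per type, sorted by type.
count_by_type() {
  awk -F '\t' '{ n[$1]++ } END { for (t in n) print t, n[t] }' | sort
}

# Usage, assuming the DriftReport layout shown above:
#   kubectl get driftreport production-spec-drift -n kspec-system \
#     -o jsonpath='{range .status.driftEvents[*]}{.type}{"\t"}{.policyID}{"\t"}{.resourceName}{"\n"}{end}' \
#     | count_by_type
```

Keeping the counting separate from the `kubectl` call means the same helper works on saved report output.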
Manual Remediation
When drift is detected, you can manually remediate:
Option 1: Re-apply ClusterSpecification
kubectl apply -f clusterspec.yaml
The kspec controller will reconcile and fix drift.
Option 2: Force Sync
Trigger immediate reconciliation:
# --overwrite allows re-running this when the annotation already exists
kubectl annotate clusterspecification production-spec \
  kspec.io/force-sync="$(date +%s)" \
  --overwrite \
  -n kspec-system
Option 3: Delete Unexpected Resources
For extra policies (not in spec):
# List all policies in cluster
kubectl get clusterpolicy
# Delete unwanted policy
kubectl delete clusterpolicy unwanted-policy
Automatic Remediation
Enable auto-remediation to automatically fix drift:
apiVersion: kspec.io/v1alpha1
kind: ClusterSpecification
metadata:
  name: production-spec
spec:
  driftDetection:
    enabled: true
    autoRemediate: true
    remediationStrategy:
      onMissing: create # Create missing policies
      onModified: update # Update modified policies
      onExtra: ignore # Don't delete extra policies (safe default)
Remediation Strategies
onMissing
- create: Create missing policies
- report: Only report, don't fix
onModified
- update: Update policies to match spec
- report: Only report, don't fix
onExtra
- delete: Remove unexpected policies (dangerous!)
- ignore: Leave them alone (recommended)
- report: Only report
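The mapping from drift type to action can be pictured as a simple lookup. The following sketch is purely illustrative and is not how the kspec controller is implemented; the function name `decide_action` is hypothetical, and the fallback values (`create`, `update`, `ignore`) are taken from the example configuration rather than from documented defaults.

```shell
# Sketch only: pick the remediation action for a drift event.
# $1 = drift type (missing | modified | extra)
# $2..$4 = onMissing, onModified, onExtra settings
# (fallbacks mirror the example above; they are assumptions)
decide_action() {
  event_type="$1"
  on_missing="${2:-create}"
  on_modified="${3:-update}"
  on_extra="${4:-ignore}"
  case "$event_type" in
    missing)  echo "$on_missing" ;;
    modified) echo "$on_modified" ;;
    extra)    echo "$on_extra" ;;
    *) echo "unknown drift type: $event_type" >&2; return 1 ;;
  esac
}
```

For example, `decide_action extra create update delete` prints `delete`, which is the combination the list above flags as dangerous.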
Monitoring Drift
Prometheus Metrics
kspec exposes drift metrics:
# Total drift events
kspec_drift_events_total
# Drift events by type
kspec_drift_events_total{type="missing"}
kspec_drift_events_total{type="modified"}
kspec_drift_events_total{type="extra"}
# Current clusters with drift
kspec_clusters_with_drift
Alerting
Set up alerts for drift detection:
# Prometheus Alert
groups:
  - name: kspec_drift
    rules:
      - alert: PolicyDriftDetected
        expr: kspec_drift_events_total > 0
        for: 5m
        annotations:
          summary: "Policy drift detected in cluster"
          description: "{{ $labels.cluster }} has policy drift"
Best Practices
Start with Monitoring
Begin with autoRemediate: false to understand drift patterns:
driftDetection:
  enabled: true
  autoRemediate: false
Review drift reports for a week before enabling auto-remediation.
Use Safe Remediation
When enabling auto-remediation, use safe defaults:
remediationStrategy:
  onMissing: create
  onModified: update
  onExtra: ignore # Never auto-delete
Schedule Drift Checks
Run drift detection during off-hours:
driftDetection:
  enabled: true
  schedule: "0 2 * * *" # 2 AM daily (cron format)
  timezone: "America/New_York"
Exclude Namespaces
Exclude certain namespaces from drift detection:
driftDetection:
  enabled: true
  excludeNamespaces:
    - kube-system
    - kube-public
    - development
Common Drift Scenarios
Scenario 1: Manual Policy Changes
Someone manually edited a policy in the cluster.
Detection:
type: "modified"
message: "Policy validationFailureAction changed"
Remediation:
Re-apply ClusterSpecification or enable autoRemediate.
Scenario 2: Policy Deleted
A policy was accidentally deleted.
Detection:
type: "missing"
message: "Expected policy not found"
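Remediation: kspec recreates it on the next reconciliation if auto-remediation is enabled with onMissing: create. Otherwise, re-apply the ClusterSpecification.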
Scenario 3: External Tool Added Policy
Another tool (CI/CD, GitOps) created a policy.
Detection:
type: "extra"
message: "Unexpected policy found"
Remediation:
- Add to ClusterSpecification if needed
- Or set onExtra: ignore
Troubleshooting
Drift Not Detected
Check that drift detection is enabled:
kubectl get clusterspecification production-spec \
  -n kspec-system \
  -o jsonpath='{.spec.driftDetection.enabled}'
Check controller logs:
kubectl logs -n kspec-system \
  -l control-plane=controller-manager \
  --tail=100 | grep drift
False Positives
kspec may report drift for expected changes. Use exemptions:
driftDetection:
  enabled: true
  exemptions:
    - policyID: "temporary-policy"
      reason: "Testing new policy"
      expiresAt: "2026-01-01T00:00:00Z"
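Because exemptions carry an expiresAt timestamp, it can help to check which ones are still in effect when reviewing a report. The sketch below is an illustration, not kspec behavior; the function name `exemption_active` is hypothetical, and it assumes both timestamps use the same UTC format as the example, so that they sort lexicographically in chronological order.

```shell
# Sketch only: succeed if an exemption has not yet expired.
# $1 = expiresAt value, e.g. "2026-01-01T00:00:00Z"
# $2 = current time in the same format (defaults to now, UTC)
exemption_active() {
  expires_at="$1"
  now="${2:-$(date -u +%Y-%m-%dT%H:%M:%SZ)}"
  # The earlier of the two timestamps sorts first in the C locale
  first=$(printf '%s\n%s\n' "$now" "$expires_at" | LC_ALL=C sort | head -n 1)
  # Active only when "now" is strictly before the expiry time
  [ "$now" != "$expires_at" ] && [ "$first" = "$now" ]
}
```

For example, `exemption_active "2026-01-01T00:00:00Z" "2025-12-30T15:30:00Z"` succeeds, while passing a current time after the expiry makes it fail.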
Performance Impact
Each drift check queries your clusters. For large clusters, increase the interval to reduce load:
driftDetection:
  enabled: true
  interval: "15m" # Less frequent checks
Next Steps
- Multi-Cluster Setup - Manage multiple clusters
- Writing Policies - Create effective policies
- API Reference - DriftReport CRD documentation