Troubleshooting the Collector
Decision Tree
flowchart TD
A[Issue observed] --> B{validate passes?}
B -->|No| C[Configuration errors]
B -->|Yes| D{test-connection result}
D -->|AADSTS500011 or 401| E[Microsoft authentication]
D -->|Timeout or DNS error| F[Microsoft Graph connectivity]
D -->|TLS or certificate error| G[TLS and certificates]
D -->|Backend rejected export| H[Data export]
C --> I[Run validate again]
E --> J[Run test-connection again]
F --> J
G --> J
H --> K[Run one collection cycle]
J --> K
K --> L{Service stable?}
L -->|No| M[Service and systemd]
L -->|Yes but data odd| N[State management]
Work through the checks in this order: validate -> test-connection -> one real run -> backend-specific troubleshooting.
1. Configuration Errors
validate returns schema or YAML errors
Symptom
ms-teams-agent validate --config ./config.yaml exits with one or more errors.
Root cause
Required sections are missing, YAML indentation is invalid, or enabled outputs are misconfigured.
Fix
- Ensure all required sections exist: microsoft_authentication, license, output, collection_config.
- Check YAML indentation and key names.
- Confirm license.filepath points to a readable file.
- Confirm at least one output backend is enabled: true.
- Re-run validation after each correction.
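A quick pre-check can catch a missing section before running validate. The sketch below assumes the four required sections are top-level keys written at column 0 of the YAML file; the function name missing_sections is hypothetical, and the check is not a substitute for ms-teams-agent validate.

```shell
# Print each required top-level section that is absent from the given YAML file.
# Assumption: sections appear at column 0 as "name:".
missing_sections() {
  local file="$1" section
  for section in microsoft_authentication license output collection_config; do
    grep -Eq "^${section}:" "$file" || echo "$section"
  done
}
```

Calling missing_sections ./config.yaml prints nothing when all four sections are present, and one section name per line otherwise.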
Verification command
ms-teams-agent validate --config ./config.yaml
2. Microsoft Authentication
Error AADSTS500011 (resource principal not found)
Symptom
test-connection fails and logs include AADSTS500011.
Root cause
The tenant cannot find the expected app registration or service principal.
Fix
- Confirm tenant_id points to the intended tenant.
- Confirm client_id matches the app registration in that tenant.
- Re-check app registration visibility in Microsoft Entra ID.
- Re-grant admin consent for required Graph application permissions.
- In multi-tenant deployments, confirm a service principal exists in the target tenant.
Verification command
ms-teams-agent test-connection --config ./config.yaml
Error 401 InvalidAuthenticationToken
Symptom
Collector logs show APIError Code: 401, Message: InvalidAuthenticationToken.
Root cause
The token is expired, invalid, or generated with inconsistent credentials.
Fix
- Rotate the client secret or re-check the certificate/key pair.
- Confirm credentials belong to the same tenant as tenant_id.
- Confirm grant_type: "client_credentials".
- Sync host time to avoid token validation failures.
- Re-run validation and connection tests.
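To judge whether host time is a plausible cause, the local clock can be compared with the Date header returned by the login endpoint. This is a minimal sketch that assumes curl and GNU date (date -d) are available; both function names are hypothetical and nothing here is part of the collector.

```shell
# Absolute difference between two epoch timestamps, in seconds.
abs_skew() {
  local diff=$(( $1 - $2 ))
  echo "${diff#-}"
}

# Compare the local clock with the Date header from login.microsoftonline.com.
# Assumes curl and GNU date; prints the skew in seconds.
clock_skew_against_login() {
  local header remote
  header="$(curl -sI https://login.microsoftonline.com \
    | tr -d '\r' \
    | awk 'tolower($1) == "date:" { $1 = ""; print }')"
  remote="$(date -d "$header" +%s)" || return 1
  abs_skew "$(date +%s)" "$remote"
}
```

A skew of more than a few minutes is worth correcting with the host's time synchronization service before rotating any credentials.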
Verification command
ms-teams-agent validate --config ./config.yaml
ms-teams-agent test-connection --config ./config.yaml
Error 401 InvalidCloudInstance
Symptom
Logs show InvalidAuthenticationToken / InvalidCloudInstance.
Root cause
cloud_deployment and token scope target different Microsoft cloud instances.
Fix
- Set microsoft_authentication.graph.cloud_deployment to the correct value.
- If scope is set manually, align it with the same cloud instance.
- Keep authority and Graph endpoints in the same cloud family.
- Re-test with debug logs enabled.
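The pairing of Graph and login hosts per cloud can be expressed as a small lookup. The sketch below uses the host pairs from Microsoft's published national cloud endpoints; the function name expected_login_host is hypothetical, and the accepted values of cloud_deployment are not covered here.

```shell
# Map a manually set Graph scope to the login host of the same cloud.
# Illustrative helper only; it does not read the collector configuration.
expected_login_host() {
  case "$1" in
    https://graph.microsoft.com/*)
      echo "login.microsoftonline.com" ;;
    https://graph.microsoft.us/*|https://dod-graph.microsoft.us/*)
      echo "login.microsoftonline.us" ;;
    https://microsoftgraph.chinacloudapi.cn/*)
      echo "login.chinacloudapi.cn" ;;
    *)
      echo "unknown" ;;
  esac
}
```

For example, expected_login_host https://graph.microsoft.us/.default prints login.microsoftonline.us; if the authority in use differs from the printed host, the scope and authority belong to different clouds.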
Verification command
ms-teams-agent test-connection --config ./config.yaml
ms-teams-agent run --config ./config.yaml --log-level DEBUG --dry-run
Token errors after certificate or secret rotation
Symptom
Authentication fails immediately after credential updates.
Root cause
Old credentials are still used by the running process, or the new value is malformed.
Fix
- Check whether the service loads the expected config.yaml path.
- Validate PEM format for certificate authentication.
- Remove trailing spaces in secret values.
- Restart the service after changes.
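Two of these checks are easy to script. The sketch below assumes grep and openssl are installed; the function names are hypothetical and the file paths passed to them are placeholders for wherever the config and certificate actually live.

```shell
# List lines that end in whitespace, including whitespace just inside a
# closing quote (for example: secret: "abc "). Output is "line-number:content".
trailing_whitespace_lines() {
  grep -nE "[[:space:]]+[\"']?\$" "$1"
}

# Parse a PEM certificate and print its subject and expiry date.
# A non-zero exit status means openssl could not read the file as PEM.
check_pem_certificate() {
  openssl x509 -in "$1" -noout -subject -enddate
}
```

trailing_whitespace_lines prints nothing for a clean file. Files with Windows line endings are reported on every line, which is itself a hint that the file was edited on another platform.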
Verification command
ms-teams-agent service status --config /absolute/path/config.yaml
ms-teams-agent test-connection --config /absolute/path/config.yaml
3. Microsoft Graph Connectivity
Timeouts, DNS errors, or proxy failures
Symptom
test-connection fails with timeout, connection reset, or DNS resolution errors.
Root cause
The collector host cannot reach required Microsoft endpoints.
Fix
- Confirm outbound HTTPS access to graph.microsoft.com and login.microsoftonline.com.
- Confirm access to reportsncu.office.com for report download redirects.
- Validate DNS resolution from the collector host.
- Check proxy configuration for the collector runtime user.
- Re-test from the same host context as the collector service.
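The endpoint checks above can be looped over all three hosts. This is a sketch assuming nslookup and curl are installed; the function names and the 10-second timeout are illustrative choices, not collector defaults.

```shell
# Hosts named in the steps above.
ENDPOINTS="graph.microsoft.com login.microsoftonline.com reportsncu.office.com"

# Probe one host: DNS lookup first, then an HTTPS HEAD request.
# curl fails on connection, timeout, or TLS errors, not on HTTP status codes.
probe_host() {
  nslookup "$1" >/dev/null 2>&1 || { echo "dns-failed"; return; }
  curl -sI --max-time 10 -o /dev/null "https://$1" || { echo "https-failed"; return; }
  echo "ok"
}

# Print one "host: result" line per endpoint.
check_endpoints() {
  local host
  for host in $ENDPOINTS; do
    echo "$host: $(probe_host "$host")"
  done
}
```

Run check_endpoints as the same user and with the same proxy environment as the collector service, otherwise the result may not reflect what the service sees.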
Verification command
nslookup graph.microsoft.com
curl -I https://graph.microsoft.com
ms-teams-agent test-connection --config ./config.yaml
4. TLS / Certificates
TLS verification failed (reportsncu.office.com or backend endpoint)
Symptom
Logs show TLS certificate verification failed.
Root cause
The OS trust store does not contain the issuing CA, often due to enterprise TLS inspection.
Fix
- Update the host CA trust store.
- If required, configure advanced.ca_bundle_path with your PEM CA bundle.
- Ensure endpoint hostname matches the certificate SAN.
- Re-test with test-connection.
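Before changing the trust store, it can help to see which CA actually signs the certificate the host receives, and to confirm the bundle file is PEM. The sketch below assumes openssl and grep are installed; both function names are hypothetical.

```shell
# Count PEM certificate blocks in a CA bundle; 0 suggests the file is not PEM.
count_pem_certs() {
  grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}

# Show issuer and subject of the certificate a host presents on port 443.
# Under TLS inspection the issuer is typically an enterprise CA.
show_presented_cert() {
  openssl s_client -connect "$1:443" -servername "$1" </dev/null 2>/dev/null \
    | openssl x509 -noout -issuer -subject
}
```

For example, show_presented_cert reportsncu.office.com prints the issuer to look for in the bundle referenced by advanced.ca_bundle_path.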
Verification command
ms-teams-agent test-connection --config ./config.yaml
5. Data Export
Collector runs but no data appears in backend
Symptom
The collector process is healthy, but dashboards and searches remain empty.
Root cause
Output is disabled, export credentials are invalid, or the collection interval has not yet elapsed.
Fix
- Confirm at least one output backend is enabled.
- Run test-connection to validate backend reachability, credentials, and OTLP headers (for example Grafana Cloud or Datadog).
- Run one cycle without previous state to observe fresh export.
- Wait at least one interval_collection_minutes cycle.
- If using --dry-run, remove it for live export.
Verification command
ms-teams-agent test-connection --config ./config.yaml
ms-teams-agent run --config ./config.yaml --ignore-state
6. Service / systemd
Service fails to start or restarts repeatedly
Symptom
systemd reports failed state or restart loops.
Root cause
The service unit points to an invalid config path, the service user lacks file permissions, or the runtime environment is invalid.
Fix
- Check service logs with journalctl.
- Ensure --config path is absolute in the service unit.
- Confirm service user can read config and license files.
- Re-enable service with a validated config path.
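The path and permission checks can be scripted as follows. This is a sketch: the function names are hypothetical, the service account name is not specified in this guide and must be supplied by the operator, and readable_by requires sudo.

```shell
# Succeeds when the given path is absolute (starts with "/").
is_absolute_path() {
  case "$1" in
    /*) return 0 ;;
    *)  return 1 ;;
  esac
}

# Succeeds when the given user can read the file.
# Arguments: user name (placeholder for the service account), file path.
readable_by() {
  sudo -u "$1" test -r "$2"
}
```

To see the --config value that systemd actually uses, systemctl cat ms-teams-observability-agent@default.service prints the unit file together with any drop-in overrides.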
Verification command
journalctl -u ms-teams-observability-agent@default.service -f
sudo ms-teams-agent service enable-service --config /absolute/path/config.yaml
7. State Management
Duplicate or skipped records after restart
Symptom
Data appears duplicated or expected records are not exported after restart.
Root cause
State cache is stale or inconsistent with the expected collection window.
Fix
- Inspect current state.
- First purge stale pending outbox rows (dry-run, then execute) to avoid unnecessary full resets.
- Reset state only when reprocessing is acceptable.
- Run one controlled cycle and verify export behavior.
- Return to normal service mode once validated.
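The dry-run-then-execute step can be wrapped so the real purge only runs after an explicit confirmation. This is a minimal sketch: the wrapper name and the AGENT variable are hypothetical, only subcommands and flags shown in this guide are used, and the default of 168 mirrors the commands below without asserting its unit.

```shell
# Binary to call; override AGENT to point at a different path if needed.
AGENT="${AGENT:-ms-teams-agent}"

# Preview the purge, then execute it only after the operator types "y".
purge_stale_with_preview() {
  local older_than="${1:-168}" answer
  "$AGENT" state purge-stale --older-than "$older_than" --dry-run || return 1
  printf 'Execute purge? [y/N] '
  read -r answer
  [ "$answer" = "y" ] || { echo "skipped"; return 0; }
  "$AGENT" state purge-stale --older-than "$older_than"
}
```

Keeping the preview and the confirmation in one function makes it harder to run the destructive command without first reading the dry-run output.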
Verification command
ms-teams-agent state show
ms-teams-agent state purge-stale --older-than 168 --dry-run
ms-teams-agent state purge-stale --older-than 168
ms-teams-agent state reset
ms-teams-agent run --config ./config.yaml --ignore-state
8. Diagnostic Workflow
flowchart TD
A[Unknown issue] --> B[validate]
B -->|Errors| C[Fix configuration]
C --> B
B -->|OK| D[test-connection]
D -->|Auth errors| E[Fix Entra ID credentials and consent]
D -->|Network or TLS errors| F[Fix connectivity and trust]
E --> D
F --> D
D -->|OK| G[Run one collection cycle]
G -->|Export errors| H[Fix output backend credentials]
G -->|No export errors| I[Check backend UI and queries]
Recommended sequence:
ms-teams-agent validate --config ./config.yaml
ms-teams-agent test-connection --config ./config.yaml
ms-teams-agent run --config ./config.yaml --ignore-state
- Backend-specific troubleshooting when collector checks are green
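The sequence can be run as one function that stops at the first failing step. This sketch assumes each subcommand exits with a non-zero status on failure, which this guide does not state explicitly; the function name and the AGENT variable are hypothetical.

```shell
# Binary to call; override AGENT to point at a different path if needed.
AGENT="${AGENT:-ms-teams-agent}"

# Run validate, test-connection, and one collection cycle in order.
# Assumption: each subcommand exits non-zero when it fails.
run_diagnostics() {
  local config="${1:-./config.yaml}"
  "$AGENT" validate --config "$config" \
    || { echo "stopped at: validate"; return 1; }
  "$AGENT" test-connection --config "$config" \
    || { echo "stopped at: test-connection"; return 2; }
  "$AGENT" run --config "$config" --ignore-state \
    || { echo "stopped at: run"; return 3; }
  echo "collector checks are green"
}
```

The return code identifies the failing step (1, 2, or 3), which maps directly onto the sections of this guide to consult next.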