Orchestrator Health Monitoring

Monitoring Orchestrator health involves both Orchestrator KPIs exposed via the API and OS-level resource monitoring of the VM running Orchestrator. This page outlines both approaches and when to use each.

When to Use What

  • Orchestrator APIs (recommended for app health):
    • Use /gms/rest/gmsserver/ping for reachability health and /gms/rest/stats/timeseries/metrics for heap/memory stats and trend analysis.
  • OS/Hypervisor Monitoring (for VM resources):
    • Monitor CPU, memory, swap, and disk from your hypervisor or your standard Linux monitoring stack. Orchestrator also raises disk-usage alarms at 70% (Warning) and >90% (Major).
  • SNMP on the Orchestrator VM (optional):
    • If your NMS standardizes on SNMP for OS metrics, you can enable snmpd in the Orchestrator's underlying Linux to poll CPU/memory/disk. Read the supportability notes below first.

Supportability & Security Notes

  • Aruba docs primarily recommend API-based monitoring for Orchestrator and Linux/hypervisor tools for VM resources; enabling extra OS packages is outside the standard product configuration and may be overwritten by upgrades or conflict with hardening baselines. Validate with your Ops/Security policy before enabling.
  • If you enable SNMP, use SNMPv3 wherever possible. Appliance-side SNMP features (for ECOS gateways) are documented separately and are not the same as enabling snmpd on the Orchestrator VM.

Orchestrator Health & Reachability (API)

  1. Reachability & DB health: GET /gms/rest/gmsserver/ping
    1. An HTTP 200 response means Orchestrator is reachable; track response latency over time (hundreds of milliseconds is typical).
    2. Response fields include dbHealth (database link status) and uptime.
  2. Reboot history (last 12 months): GET /gms/rest/gms/rebootHistory
  3. Heap/memory trends (hour/minute granularity): GET /gms/rest/stats/timeseries/metrics?startTime=&endTime= → monitor totalHeapMemory and usedHeapMemory.
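
The reachability check above could be scripted roughly as follows. This is a minimal sketch, not an official client: the function names db_healthy and ping_orchestrator, the ORCH_COOKIES variable, and the cookie-jar authentication are all assumptions made for illustration, so adapt the authentication to whatever your deployment uses.

```shell
# Hypothetical helpers (not part of Orchestrator); names are illustrative.

# db_healthy BODY
# Succeeds when the JSON body returned by /gms/rest/gmsserver/ping
# contains "dbHealth": true. A grep match keeps the sketch free of
# dependencies; a real JSON parser is the safer choice in production.
db_healthy() {
  printf '%s' "$1" | grep -Eq '"dbHealth"[[:space:]]*:[[:space:]]*true'
}

# ping_orchestrator BASE_URL
# Prints "<http_code> <seconds>" for one ping request so that latency
# can be trended. Assumes authentication is already handled (here: a
# curl cookie jar whose path is in $ORCH_COOKIES, an assumed variable).
ping_orchestrator() {
  curl -sS --max-time 10 -o /dev/null \
    -b "${ORCH_COOKIES:-/dev/null}" \
    -w '%{http_code} %{time_total}\n' \
    "$1/gms/rest/gmsserver/ping"
}

# Example (requires network access to your Orchestrator):
#   ping_orchestrator "https://orchestrator.example.com"
```

Keeping the dbHealth test separate from the HTTP call makes it possible to alert on "reachable but database unhealthy" as a distinct condition.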

Orchestrator VM Disk Space & Linux Tools

  • Orchestrator raises disk alarms: 70% Warning and >90% Major.
  • For CPU/mem/swap/disk, standard Linux tools (e.g., top, free, df) or your hypervisor monitoring are recommended.
  • For VM deployments, reserve CPU & memory for the VM.
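
If you also want an external monitor to apply the same thresholds, a sketch along these lines may help. The function names are hypothetical, and treating exactly 70% as Warning is an assumption; the code mirrors the documented thresholds, not Orchestrator's internal alarm logic.

```shell
# disk_severity PCT
# Maps a disk-usage percentage (bare integer, no % sign) to a label
# using the thresholds quoted above: above 90 is major, 70 to 90 is
# warning (assumption: 70 itself counts as warning), otherwise ok.
disk_severity() {
  if [ "$1" -gt 90 ]; then
    echo major
  elif [ "$1" -ge 70 ]; then
    echo warning
  else
    echo ok
  fi
}

# root_usage_pct
# Prints the usage percentage of / as a bare integer, taken from the
# Capacity column of POSIX-format df output.
root_usage_pct() {
  df -P / | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Example:
#   disk_severity "$(root_usage_pct)"
```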

SNMP on the Orchestrator OS (Optional)

This section describes how to install and configure the Linux snmpd service on self-hosted Orchestrator for OS-level polling (CPU, memory, disk).

This does not configure SNMP for EdgeConnect appliances (that is done in the Orchestrator UI/templates and is documented separately).

Note: The steps below assume Orchestrator version 9.5+ running on Rocky Linux.

SNMPv3 or v2c

 | SNMPv3 (recommended) | SNMPv2c (fallback)
Use when | Platform supports SNMPv3 | SNMPv3 not supported
Security | Auth + encryption | Cleartext community string

Required Values

SNMPv3

Placeholder | Description
<LOCATION> | System location (e.g. Datacenter Rack A3)
<CONTACT> | Admin contact (e.g. an email address)
<V3_USER> | SNMPv3 username (e.g. snmpmon)
<AUTH_PASS> | Auth password, 8+ chars
<PRIV_PASS> | Encryption password, 8+ chars
<NMS_IP> | Monitoring server IP

SNMPv2c: same as above, but replace <V3_USER>, <AUTH_PASS>, and <PRIV_PASS> with:

Placeholder | Description
<COMMUNITY> | Community string (e.g. SpeakNet)
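
Since the passphrases must be at least 8 characters, a quick pre-check can catch a short value before the user-creation step. This is only a sketch; valid_passphrase is a hypothetical helper name.

```shell
# valid_passphrase VALUE
# Succeeds when VALUE is at least 8 characters long, matching the
# "8+ chars" requirement in the table above.
valid_passphrase() {
  [ "${#1}" -ge 8 ]
}

# Example:
#   valid_passphrase "$AUTH_PASS" || echo "AUTH_PASS is too short" >&2
#   valid_passphrase "$PRIV_PASS" || echo "PRIV_PASS is too short" >&2
```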

Common Steps (Both Paths)

1. Elevate to root

su - root

2. (Optional) Set DNF proxy

echo "proxy=http://<PROXY>:80" >> /etc/dnf/dnf.conf

3. Install packages

dnf install -y net-snmp net-snmp-utils

4. Back up existing config

cp -p /etc/snmp/snmpd.conf /etc/snmp/snmpd.conf.$(date +"%m_%d_%Y")

SNMPv3 Setup

5. Write config

cat <<EOF > /etc/snmp/snmpd.conf
agentAddress udp:161
syslocation "<LOCATION>"
syscontact "<CONTACT>"
view systemview included .1.3.6.1.2.1.1
view systemview included .1.3.6.1.2.1.2
view systemview included .1.3.6.1.2.1.25.1
rouser <V3_USER> priv -V systemview
EOF

6. Create SNMPv3 user

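snmpd must be stopped before creating the user, or the credentials will not be saved. Note that net-snmp-create-v3-user typically also appends its own rouser <V3_USER> line to /etc/snmp/snmpd.conf; check the file afterward and remove that extra line if access should stay limited to systemview.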

systemctl stop snmpd
net-snmp-create-v3-user -ro -A '<AUTH_PASS>' -a SHA -X '<PRIV_PASS>' -x AES <V3_USER>

7. Start and verify daemon

systemctl enable --now snmpd
systemctl status snmpd

8. Restrict firewall to NMS only

firewall-cmd --permanent --remove-service=snmp
firewall-cmd --permanent \
  --add-rich-rule='rule family="ipv4" source address="<NMS_IP>/32" port port="161" protocol="udp" accept'
firewall-cmd --reload

9. Test

snmpwalk -v3 -l authPriv -u <V3_USER> -a SHA -A '<AUTH_PASS>' -x AES -X '<PRIV_PASS>' localhost sysName

Expected: SNMPv2-MIB::sysName.0 = STRING: hostname.example.com
Timeout? Check: systemctl status snmpd · firewall-cmd --list-all · grep <V3_USER> /var/lib/net-snmp/snmpd.conf
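
Before writing the SNMPv3 configuration in step 5, it can be useful to preview the rendered file with real values. The sketch below prints the same seven lines to stdout and writes nothing to disk; render_snmpd_v3 is a hypothetical helper name and the sample arguments are placeholders.

```shell
# render_snmpd_v3 LOCATION CONTACT V3_USER
# Prints the SNMPv3 snmpd.conf from step 5 with the three values
# substituted. Output goes to stdout only, so it can be reviewed
# before being redirected to /etc/snmp/snmpd.conf.
render_snmpd_v3() {
  cat <<EOF
agentAddress udp:161
syslocation "$1"
syscontact "$2"
view systemview included .1.3.6.1.2.1.1
view systemview included .1.3.6.1.2.1.2
view systemview included .1.3.6.1.2.1.25.1
rouser $3 priv -V systemview
EOF
}

# Example (review the output first, then redirect as root):
#   render_snmpd_v3 "Datacenter Rack A3" "admin@example.com" snmpmon
#   render_snmpd_v3 "Datacenter Rack A3" "admin@example.com" snmpmon > /etc/snmp/snmpd.conf
```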

SNMPv2c Setup

5. Write config

cat <<EOF > /etc/snmp/snmpd.conf
agentAddress udp:161
syslocation "<LOCATION>"
syscontact "<CONTACT>"
view systemview included .1.3.6.1.2.1.1
view systemview included .1.3.6.1.2.1.2
view systemview included .1.3.6.1.2.1.25.1
rocommunity <COMMUNITY> <NMS_IP> -V systemview
EOF

6. Start and verify daemon

systemctl enable --now snmpd
systemctl status snmpd

7. Restrict firewall to NMS only

firewall-cmd --permanent --remove-service=snmp
firewall-cmd --permanent \
  --add-rich-rule='rule family="ipv4" source address="<NMS_IP>/32" port port="161" protocol="udp" accept'
firewall-cmd --reload

8. Test

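Because rocommunity is limited to <NMS_IP> in the config above, a query sent from localhost is not authorized and will time out. Run the test from the NMS host instead, where <ORCH_IP> (a placeholder not listed under Required Values) is the Orchestrator VM address:

snmpwalk -v2c -c <COMMUNITY> <ORCH_IP> sysName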

Expected: SNMPv2-MIB::sysName.0 = STRING: hostname.example.com
Timeout? Check: systemctl status snmpd · firewall-cmd --list-all
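
The firewall step in both paths admits a single source address. If more than one poller needs access, one possible approach is to generate the same rich rule per address, as in this sketch. snmp_rich_rule is a hypothetical helper, and the addresses in the example are documentation placeholders.

```shell
# snmp_rich_rule NMS_IP
# Prints the firewalld rich rule used in the steps above for one
# IPv4 source address (UDP port 161, /32 mask).
snmp_rich_rule() {
  printf 'rule family="ipv4" source address="%s/32" port port="161" protocol="udp" accept\n' "$1"
}

# Example (as root) for two pollers:
#   for ip in 192.0.2.10 192.0.2.11; do
#     firewall-cmd --permanent --add-rich-rule="$(snmp_rich_rule "$ip")"
#   done
#   firewall-cmd --reload
#   firewall-cmd --list-rich-rules
```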

Notes

  • Bind to management interface only: Replace agentAddress udp:161 with agentAddress udp:<MGMT_IP>:161
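  • Additional OIDs: The systemview above exposes only the system, interfaces, and hrSystem branches. To poll CPU, memory, and disk, add the branches your NMS needs, for example hrStorage (.1.3.6.1.2.1.25.2), hrDevice (.1.3.6.1.2.1.25.3), or UCD-SNMP-MIB (.1.3.6.1.4.1.2021). Avoid enabling .1 (the full tree).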
  • Daemon failed to start: journalctl -u snmpd -n 50 --no-pager

If your tooling supports Orchestrator/ECOS natively via API or Notification Service, prefer those for SD-WAN alarms/telemetry and reserve SNMP here for OS metrics only.


Examples: API Requests

  • Reachability/DB Health

GET /gms/rest/gmsserver/ping

Example response:

{
  "hostname": "orchestrator.localdomain",
  "dbHealth": true,
  "timeStr": "Thu Jan 30 16:51:31 PDT 2020",
  "time": 1580429731303,
  "message": "I am alive!",
  "version": "9.3.1.40717",
  "uptime": "9d 2h 52m 39s"
}

  • Heap/memory trend

GET /gms/rest/stats/timeseries/metrics?startTime=<EPOCH_SEC>&endTime=<EPOCH_SEC>

Monitor totalHeapMemory and usedHeapMemory.

Example response:

[{
  "stats": {
    "buffersMemory": 2704,
    "usedMemory": 124163080,
    "totalMemory": 387740160,
    "applianceCount": 74,
    "usedSwapMemory": 0,
    "freeSwapMemory": 0,
    "totalHeapMemory": 1933049856,
    "usedHeapMemory": 1052471048,
    "totalSwapMemory": 0,
    "cachedSwapMemory": 77150288,
    "freeMemory": 263577080
  },
  "key": null,
  "timestamp": 1580407637
},
…
]
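
To turn the heap fields into a single trend value, a monitor could compute used heap as a percentage of total heap for each sample. The sketch below also builds the request path for a time window. Both function names are hypothetical, and fetching or parsing the JSON is left to your own tooling.

```shell
# metrics_path START END
# Builds the request path for the heap/memory time series.
# START and END are epoch seconds.
metrics_path() {
  printf '/gms/rest/stats/timeseries/metrics?startTime=%s&endTime=%s\n' "$1" "$2"
}

# heap_pct USED TOTAL
# Prints usedHeapMemory as a whole-number percentage of
# totalHeapMemory, rounded down. Fails if TOTAL is not positive.
heap_pct() {
  [ "$2" -gt 0 ] || return 1
  echo $(( $1 * 100 / $2 ))
}

# Example: path for the last hour, then the ratio from the sample response.
#   end=$(date +%s); start=$(( end - 3600 ))
#   metrics_path "$start" "$end"
#   heap_pct 1052471048 1933049856   # prints 54
```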