ECOS REST API for collecting and monitoring statistics

Retrieving SD-WAN Statistics

The HPE Aruba Networking EdgeConnect SD-WAN solution generates volumes of statistics every minute. These statistics help us debug SD-WAN networks. Other network devices generate orders of magnitude less data per device. This disparity is partly due to the meshing capabilities of EdgeConnect and the collection of data on every tunnel. The other factor is the frequency at which data is collected. HPE Aruba Networking EdgeConnect products produce statistics every minute while the industry standard is generally every five minutes.

While the HPE Aruba Networking Orchestrator is the main consumer of this data and uses it extensively in reporting and troubleshooting, there are many situations where customers need to access these statistics themselves. Use cases vary widely, from integration with diagnostic tools and external reporting tools to the development of customized health dashboards.

Orchestrator offers a REST API with access to these statistics. However, to preserve the best Orchestrator UI experience for SD-WAN administrators, retrieving statistics directly from the Orchestrator API is not recommended. Orchestrator is not designed to support these use cases and is therefore not the right integration point.

The Orchestrator REST API is not suited for high-frequency polling of granular statistics or “data replication” into an external data store for two reasons: 1) the volume of requested data scales linearly with the number of EdgeConnect devices and the granularity of the requested statistics, and 2) Orchestrator does not have the resources for the load that frequent REST API calls would impose. Orchestrator serves multiple purposes and has limited resources tuned for many functions, including:

  • Supporting 5-50 concurrent users accessing the UI
  • Orchestrating policies across the SD-WAN fabric with 2 to 1,000 devices
  • Collecting and reporting statistics from these devices for Orchestrator users
  • Servicing third-party API requests from external systems

Collecting and reporting statistics consumes large amounts of CPU and memory resources. Since Orchestrator is not designed to provide end applications with a copy of its statistics, HPE Aruba Networking provides REST APIs that allow customers to obtain this data directly from the appliances for use by third-party applications. This approach is required when the use case calls for fine-grained statistics, frequent access to the statistics, or both. It avoids overloading Orchestrator by distributing the load across the SD-WAN.

Pre-Requisites

Before collecting statistics using the EdgeConnect REST APIs, the following prerequisites must be met:

  1. Appliance access: To access the EdgeConnect REST APIs directly, customers must have an up-to-date list of appliances and a method to access their REST API endpoints. The list of appliances and their configuration information can be retrieved from the Orchestrator REST API as part of inventory discovery.

  2. Resource name or ID: Depending on which resource the customer is interested in (for example, tunnels, interfaces, flows, and so on), the identity of that resource needs to be known. This can be obtained from the Orchestrator REST API as part of the inventory discovery. The resource's ID or name is used to identify and associate the statistics in the files.

  3. Login access to EdgeConnect gateways: File retrieval from EdgeConnect requires login access to EdgeConnect to ensure authorized access. HPE Aruba Networking recommends that customers create or orchestrate an "api-user" login on the EdgeConnect appliances for use by the third-party application that is retrieving the files.

Customers can follow these procedures to obtain detailed statistics, including the popular request for tunnel metrics such as loss, latency, and jitter, as well as interface and flow statistics.

As a best practice, retrieve statistics directly from the appliances. For each appliance, you need to log in, request the minute statistics, and then log out to terminate the session. This procedure must be repeated for every appliance.

It is recommended to create a single user account with a password and use this account across all appliances. This approach simplifies the process of requesting statistics, especially when using scripts for automation.
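The log in, request, log out sequence can be wrapped so that the session is always closed, even when a request fails. The following Python sketch is illustrative only. This guide does not list the login and logout endpoints, so `login`, `logout`, and `fetch` are hypothetical callables that you would implement against the appliance REST API, and the host names and credentials are placeholders.

```python
from contextlib import contextmanager
from typing import Callable, Dict, Iterable, Iterator

# Hypothetical callables: the login/logout endpoints are not specified in
# this guide, so the caller supplies functions that implement them.
Login = Callable[[str, str, str], object]   # (host, user, password) -> session
Logout = Callable[[object], None]           # (session) -> None


@contextmanager
def appliance_session(host: str, user: str, password: str,
                      login: Login, logout: Logout) -> Iterator[object]:
    """Log in to one appliance and always log out afterwards."""
    session = login(host, user, password)
    try:
        yield session
    finally:
        logout(session)  # terminate the session even if a request failed


def collect_from_all(hosts: Iterable[str], user: str, password: str,
                     login: Login, logout: Logout,
                     fetch: Callable[[object], object]) -> Dict[str, object]:
    """Use the same account on every appliance and gather the results."""
    results: Dict[str, object] = {}
    for host in hosts:
        with appliance_session(host, user, password, login, logout) as session:
            results[host] = fetch(session)
    return results
```

Because the same account is used everywhere, the script only needs the list of appliance addresses from inventory discovery plus one set of credentials.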

Additionally, configuring loopback interfaces on the appliances is advisable. Methods like loopback orchestration can assist in configuring multiple appliances using the Orchestrator, thereby avoiding the need for manual configuration of loopback adapters.

Overview of File Retrieval Methodology

To use EdgeConnect statistics in an external application, poll each appliance directly using the ECOS REST APIs. The following steps outline how to do this. Examples are also provided in Best Practices: Capacity Management.

  1. Invoke the EdgeConnect REST API to determine which statistics files are available. EdgeConnect stores a fixed number of statistics files. The REST API call provides information about the time range that is currently available.
  2. Request the stats file for the minute(s) of interest. The stats files are compressed. Information about which stats are contained in each file is described below.
  3. Store the files of interest and parse or ingest them as needed. The file formats for various statistics files are described below.
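The three steps above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: `get` is a hypothetical transport function that takes a REST path and returns the response body as bytes (a real one would issue an authenticated HTTPS request to the appliance), and `fetch_newest_minute` is an illustrative name. The two REST paths are the ones described later in this guide.

```python
import json
from pathlib import Path
from typing import Callable

# Hypothetical transport: REST path in, response body (bytes) out.
Get = Callable[[str], bytes]


def fetch_newest_minute(get: Get, out_dir: Path) -> Path:
    """Run the three steps once for the newest available minute."""
    # Step 1: ask which minutes are currently available.
    minute_range = json.loads(get("/rest/json/stats/minuteRange"))
    newest = int(minute_range["newest"])

    # Step 2: request the stats archive for that minute.
    name = f"st2-{newest}.tgz"
    archive = get(f"/rest/json/stats/minuteStats/{name}")

    # Step 3: store the archive for later parsing or ingestion.
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / name
    path.write_bytes(archive)
    return path
```

Passing the transport in as a parameter keeps the polling logic separate from authentication and TLS handling, and makes it easy to exercise with a stub.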

Customers can use this procedure to obtain detailed statistics, including the popular requests for Tunnel stats (loss, latency, jitter, and so on), interface stats, and flow stats.


📘

Note: CPU and memory stats for each EdgeConnect are retrieved through a different process that is not file-based.

Getting Statistics Time Range

Appliances generate statistics every minute. Each minute, appliances write statistics of different types to separate CSV or text files. These CSV and TXT files are compressed into a single file for convenience. Appliances keep only a limited number of these files, so statistics are retained for a limited time period only. The polling systems (Orchestrator or third-party systems) must get their data before these files age out. Most appliances are configured to keep this data for at least a few hours. In Orchestrator, you can configure how long appliances keep this data.

A poller calls the following REST API to get a range of timestamps for which data is present on the appliance:

GET /rest/json/stats/minuteRange

This will return the following output:

{ "newest": "1649778540", "oldest": "1649691900" }

These two numbers are minute boundaries expressed in standard epoch seconds. In this example, the oldest available minute is Monday, April 11, 2022, 3:45:00 PM UTC and the newest is Tuesday, April 12, 2022, 3:49:00 PM UTC.
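As a quick check of the example response, the following Python sketch parses it and converts both values to UTC. Note that the values arrive as strings. The count of available minutes assumes one file per minute boundary, inclusive of both ends, which this guide implies but does not state outright.

```python
import json
from datetime import datetime, timezone

response = '{ "newest": "1649778540", "oldest": "1649691900" }'
minute_range = json.loads(response)

# The values are strings, so convert them to integers first.
oldest = int(minute_range["oldest"])
newest = int(minute_range["newest"])

oldest_utc = datetime.fromtimestamp(oldest, tz=timezone.utc)
newest_utc = datetime.fromtimestamp(newest, tz=timezone.utc)

# Assumption: one file per minute boundary, both ends included.
minutes_available = (newest - oldest) // 60 + 1

print(oldest_utc.isoformat())  # 2022-04-11T15:45:00+00:00
print(newest_utc.isoformat())  # 2022-04-12T15:49:00+00:00
print(minutes_available)       # 1445
```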

Polling for a Specific Minute

Once you obtain the time range, you can iterate over every minute (or start from the last minute not yet retrieved). For example:

for (int i = 1649691900; i <= 1649778540; i += 60) {
 // retrieve the compressed file for minute i
 // extract it
 // insert stats into your database
}
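The option of starting from the last minute not yet retrieved can be sketched in Python as below. The function name `minutes_to_poll` and its `last_retrieved` parameter are illustrative, not part of any API. If the last retrieved minute is older than the oldest minute still on the appliance, the intervening files have aged out and polling restarts from the oldest available minute.

```python
from typing import Iterator, Optional


def minutes_to_poll(oldest: int, newest: int,
                    last_retrieved: Optional[int] = None) -> Iterator[int]:
    """Yield the minute timestamps (epoch seconds) still to fetch, oldest first."""
    if last_retrieved is None or last_retrieved < oldest:
        # First run, or the files after last_retrieved have already aged out.
        start = oldest
    else:
        start = last_retrieved + 60
    yield from range(start, newest + 1, 60)


# Resume after a previous run that ended two minutes before the newest file.
print(list(minutes_to_poll(1649691900, 1649778540, last_retrieved=1649778420)))
# [1649778480, 1649778540]
```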

The request for retrieving the compressed file for a specific minute looks like this:

GET /rest/json/stats/minuteStats/st2-1649691900.tgz

All the file names start with st2- and are followed by a minute timestamp in epoch seconds. The file extension is .tgz.
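The naming rule and the extraction step can be sketched in Python as follows. This assumes that the .tgz extension carries its conventional meaning, a gzip-compressed tar archive, which this guide does not confirm explicitly. The function names are illustrative.

```python
import io
import tarfile
from typing import Dict


def stats_file_name(minute: int) -> str:
    """Build the file name for a minute boundary given in epoch seconds."""
    if minute % 60 != 0:
        raise ValueError("timestamp must fall on a minute boundary")
    return f"st2-{minute}.tgz"


def read_members(archive: bytes) -> Dict[str, bytes]:
    """Return {member name: content} for the regular files in the archive.

    Assumption: the archive is a gzip-compressed tar file.
    """
    contents: Dict[str, bytes] = {}
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile():
                extracted = tar.extractfile(member)
                if extracted is not None:
                    contents[member.name] = extracted.read()
    return contents


print(stats_file_name(1649691900))  # st2-1649691900.tgz
```

Reading the members into memory avoids writing untrusted paths to disk. For large archives, extracting to a dedicated directory may be preferable.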

In version 9.3, the compressed statistics file contains the following files:

  1. tunnel.csv, tunnel_peak.csv, tunnel_v2.txt
  2. interface.csv, interface_peak.csv, interface_v2.txt
  3. flow.csv, flow_peak.csv, flow_v2.txt
  4. dscp.csv, dscp_peak.csv, dscp_v2.txt
  5. tclass.csv, tclass_peak.csv, tclass_v2.txt
  6. shaper.csv, shaper_v2.txt
  7. jitter.csv, mos.csv
  8. zone_pair.csv, zone_pair_v2.txt
  9. boost.csv, boost_v2.txt
  10. drops.csv, drops_v2.txt
  11. drc.csv
  12. ftype.csv, ftype_peak.csv, ftype_v2.txt
  13. interface_overlay.csv, interface_overlay_v2.txt
  14. health_v2.txt
  15. tunnel_availability_v2.txt
  16. interface_availability_v2.txt
  17. appliance_reachability_v2.txt
  18. probe.csv, probe_v2.txt
  19. inet_bkout.csv, inet_bkout_v2.txt
  20. appperf.csv, appperf_v2.txt

📘

Note: CSV and TXT differences:

The statistics are provided in two formats: .csv files and _v2.txt files. The CSV files are what the default stats collector uses. The _v2.txt files are used when the new Stats Collector is enabled.
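A consumer that wants to ingest only one of the two formats can filter the file names by suffix. The Python sketch below is illustrative. The function name and the `new_collector` flag are hypothetical, and the sample names are taken from the list above.

```python
from typing import Iterable, List


def select_stats_files(names: Iterable[str], new_collector: bool) -> List[str]:
    """Keep only the files in the format matching the stats collector in use."""
    suffix = "_v2.txt" if new_collector else ".csv"
    return sorted(n for n in names if n.endswith(suffix))


names = ["tunnel.csv", "tunnel_peak.csv", "tunnel_v2.txt",
         "jitter.csv", "health_v2.txt"]

print(select_stats_files(names, new_collector=False))
# ['jitter.csv', 'tunnel.csv', 'tunnel_peak.csv']
print(select_stats_files(names, new_collector=True))
# ['health_v2.txt', 'tunnel_v2.txt']
```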

