Analytics - Sizing Requirements

Analytics Backend and Analytics Datastore

Important! Database systems and database storage must be fail-safe and redundant. This topic is beyond the scope of this section.

| Modules | Large Configuration (No. x CPU / Memory / Disk) | High End Configuration (No. x CPU / Memory / Disk) |
| --- | --- | --- |
| Analytics & Datastore | 1 x 32 Cores / 256 GB / 2 TB | 1 x 32 Cores / 256+ GB / 4 TB |

| Number of | Large Configuration | High End Configuration |
| --- | --- | --- |
| Concurrent users | < 200 | > 200 |
| Agents | < 1 000 | > 1 000 |
| Object definitions | < 100 000 | > 100 000 |
| Total Executions per day | < 1 500 000 | > 1 500 000 |

| Process | Large Configuration | High End Configuration |
| --- | --- | --- |
| WP | 2 x 16 | 4 x 16 |
| DWP* | 2 x 45 | 4 x 45 |
| JWP* | 2 x 10 | 4 x 10 |
| CP | 2 x 2 | 4 x 4 |
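The sizing thresholds above can be expressed as a small helper that decides which configuration tier applies. This is an illustrative sketch; the function name and return labels are assumptions, not part of the product:

```python
def sizing_tier(concurrent_users: int, agents: int,
                object_definitions: int, executions_per_day: int) -> str:
    """Return the configuration tier based on the sizing table above.

    If any metric exceeds the Large Configuration limits, the
    High End Configuration applies.
    """
    exceeds_large = (concurrent_users > 200
                     or agents > 1_000
                     or object_definitions > 100_000
                     or executions_per_day > 1_500_000)
    return "High End Configuration" if exceeds_large else "Large Configuration"

# A system with 150 users, 800 agents, 90,000 objects, 1M executions/day
print(sizing_tier(150, 800, 90_000, 1_000_000))  # Large Configuration
```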

Sizing and Storage Recommendations

Note: For medium-sized and larger installations, we recommend setting up a regular backup-and-truncate process for the Analytics Datastore. To keep chart performance stable, back up and truncate regularly, keeping only the last 3-12 months of data in the Datastore.

For further information, see: Analytics Datastore Delete Action

Setup Recommendations

  • The UI plug-in is always added to one or more hosts where the AWI is installed.
  • The Datastore and the Backend should both be installed on a dedicated host.
  • The Backend must be accessible via HTTP or HTTPS from the AWI host. The Backend must be able to connect to the Datastore and to all required databases (AE, ARA).

What disk space is required?

One GB for every hundred thousand executions in the Automation Engine.

Must I back up the Datastore?

The Analytics Datastore was designed to store large amounts of data. To save space, remove data older than 1 year from the Analytics Datastore. You can use the backup actions in the ANALYTICS ACTION PACK.
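The rule of thumb above (roughly 1 GB per 100,000 executions) can be sketched as a quick estimate. The function name and the retention handling are illustrative assumptions; actual usage also depends on object types and indexes:

```python
def datastore_disk_gb(executions_per_day: int, retention_days: int) -> float:
    """Estimate Analytics Datastore disk usage: ~1 GB per 100,000 executions.

    Illustrative sketch only, based on the sizing rule in this section.
    """
    total_executions = executions_per_day * retention_days
    return total_executions / 100_000

# Example: 1,500,000 executions/day retained for 90 days
print(datastore_disk_gb(1_500_000, 90))  # 1350.0 GB
```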

General Database Rules

The following information is valid for all database vendors. Log files must be placed on the fastest available disks (for example, SSDs):

  • Oracle: REDO LOG file destination
  • SQL Server: TRANSACTION LOG and TEMPDB files
  • LOG and DATA files must always be on separate disks/LUNs

Maximizing Efficiency with the Analytics Datastore

We recommend installing PostgreSQL 11 for large and high end configurations. This version lets you benefit from the parallel query feature.

To enable parallel queries, set two parameters before PostgreSQL is started:

  • max_worker_processes = 8
    The default value is 8. Set this value according to the number of cores the database administrator allocates to the PostgreSQL database.
  • max_parallel_workers_per_gather = 7
    Set this value to max_worker_processes minus one.

You can configure these parameters in the Customized Options section of the postgresql.conf file. Its default location includes the installed major version, for example:

Windows: C:\Program Files\PostgreSQL\10\data\postgresql.conf

Linux: /etc/postgresql/10/main/postgresql.conf

Example:

On a 32-core host running PostgreSQL where 4 cores are reserved for the Backend:

  • max_worker_processes = 28
  • max_parallel_workers_per_gather = 27
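The arithmetic behind this example can be sketched as follows; the function name is an illustrative assumption:

```python
def parallel_query_settings(total_cores: int, reserved_cores: int) -> dict:
    """Derive the postgresql.conf parallel-query settings from core counts.

    Rule described above: max_worker_processes = cores allocated to
    PostgreSQL; max_parallel_workers_per_gather = that value minus one.
    """
    workers = total_cores - reserved_cores
    return {
        "max_worker_processes": workers,
        "max_parallel_workers_per_gather": workers - 1,
    }

# 32-core host, 4 cores reserved for the Backend
print(parallel_query_settings(32, 4))
# {'max_worker_processes': 28, 'max_parallel_workers_per_gather': 27}
```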

Analytics Rule Engine

Important! Message queue systems and database storage must always be fail-safe and redundant. This topic is beyond the scope of this section.

Sizing and Storage Recommendations

  • IA Agent Nodes

    • See the existing recommendations for the Analytics Backend, previously mentioned.
    • On a single box: 16 cores for a small configuration and 32 cores for a medium-sized configuration.
    • Add 8-16 GB RAM to the existing memory recommendations.
  • Streaming Platform Nodes

    • 1 x 4 cores
    • 16 GB RAM
    • Disk: expected event size * expected events per second * retention period in seconds * replication factor / number of brokers.
      For example: 80 bytes * 30,000 events per second * 86,400 seconds (= 1 day) of retention * 1 (no replication) / 1 (one broker) ≈ 210 GB. A single 80-byte raw event results in around 3 KB of disk usage in the Streaming Platform.
    • The disk buffer cache lives in memory, so each broker needs sufficient RAM. The RAM requirement depends on how often the Streaming Platform flushes: the more flushes, the lower the throughput.
    • A single broker can host only a single replica per partition; hence, the number of brokers must be at least the number of replicas.
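The disk formula above can be sketched as a quick calculation; the function name is an illustrative assumption:

```python
def broker_disk_gb(event_size_bytes: int, events_per_sec: int,
                   retention_sec: int, replication_factor: int,
                   num_brokers: int) -> float:
    """Per-broker disk needed to hold the Streaming Platform retention window."""
    total_bytes = (event_size_bytes * events_per_sec * retention_sec
                   * replication_factor) / num_brokers
    return total_bytes / 10**9

# 80-byte events, 30,000 events/s, 1 day retention, no replication, one broker
print(round(broker_disk_gb(80, 30_000, 86_400, 1, 1)))  # 207 (i.e. ~210 GB)
```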

  • Rule Engine Nodes

    • 1 x 8 cores
    • 32 GB RAM
    • Disk: 32 GB
    • Memory is critical: with insufficient memory, the Rule Engine starts spilling to disk, which decreases throughput.

    Other Factors

    • To increase throughput by a factor of 5-10 (depending on the batch size), run the Rule Engine, the Automation Engine processes, and the Streaming Platform on separate machines.
    • Maximum throughput on a single box is reached at 1000 concurrent users; beyond that, backpressure occurs.
    • Throughput scales with batch size.
    • A Streaming Platform logs.dir size of 22.9 GB for ~67 m events corresponds to ~3 KB per event.

    Note: Event ingestion on a single-box installation is limited to approximately 2,500 events per second. The ingestion rate can be improved by distributing services, selecting a higher batch size, or using more than one IA Agent.

Analytics Streaming Platform

Important! Streaming Platform systems and database storage must always be fail-safe and redundant. This topic is beyond the scope of this section.

| Modules | Large Configuration (No. x CPU / Memory / Disk) | High End Configuration (No. x CPU / Memory / Disk) |
| --- | --- | --- |
| Streaming Platform | 1 x 32 Cores / 256 GB / 2 TB | 1 x 32 Cores / 256+ GB / 4 TB |

| Number of | Large Configuration | High End Configuration |
| --- | --- | --- |
| Concurrent users | < 200 | > 200 |
| Agents | < 1 000 | > 1 000 |
| Object definitions | < 100 000 | > 100 000 |
| Total Executions per day | < 1 500 000 | > 1 500 000 |