Architecture of the Automation Engine Agent for the HP NonStop Server

This document explains how a job runs on an HP NonStop server via an AE NSK agent.

Description of Internal Procedures:

1. Job Start

The Automation Engine initiates the job start by informing the AE NSK agent that the job has started. The agent then creates an entry in the AE status file.

The AE NSK agent sends a message to the AE Output Collector (via IPC), which contains the following information:

The following information is retrieved:

If required, the agent starts a new TACL process. Because the AE Output Collector has been specified as its output device, this process then logs on to the Output Collector.
The Output Collector generates the report file and then configures the job's TACL (user, priority, etc.). Finally, the job file is assigned to the job's TACL as an obey file, and the TACL starts processing the job.

2. Job Run

While the job is running, all output that it generates is sent to the AE Output Collector and written to the job report. If the job expects input and a terminal has been configured for it, the input is read from that terminal.
The job report is linked to the job through the # Location that the job uses to address the Output Collector.
For example, output sent to the Location $UC4OC.#AAL can be written to the report file $DATA.REPORTS.FFXX, and output sent to $UC4OC.#AAM to the report file $DATA.REPORTS.FFXY.
The agent or the Output Collector assigns the names of the # Locations and the report files.
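
The link between # Locations and report files can be pictured as a simple lookup. The following Python sketch is only an illustration and not the Output Collector's actual implementation: the two mappings are the examples given above, while the function name write_output and the in-memory stand-in for the report files are hypothetical.

```python
from collections import defaultdict

# # Location -> report file (the two example pairs from the text).
LOCATION_TO_REPORT = {
    "$UC4OC.#AAL": "$DATA.REPORTS.FFXX",
    "$UC4OC.#AAM": "$DATA.REPORTS.FFXY",
}

# Hypothetical in-memory stand-in for the report files on disk.
reports = defaultdict(list)


def write_output(location: str, line: str) -> str:
    """Append one output line to the report that belongs to the # Location."""
    report_file = LOCATION_TO_REPORT[location]  # KeyError for unknown Locations
    reports[report_file].append(line)
    return report_file


# Two jobs write interleaved output; each line lands in its own report.
write_output("$UC4OC.#AAL", "job A: step 1 done")
write_output("$UC4OC.#AAM", "job B: step 1 done")
write_output("$UC4OC.#AAL", "job A: step 2 done")
```

Because each job addresses its own # Location, output from jobs that run at the same time does not get mixed in a single report.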

3. Job End

The Output Collector recognizes that the job has ended when the job's TACL process requests input or closes the Output Collector (if an error occurs). The Output Collector then writes this information to the job's status file and sends a corresponding message to the agent (via IPC). Finally, the agent reports the job to the Server as finished.
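
The end-of-job rule can be sketched as a small event loop. This Python sketch rests on assumptions: the event names, the status values and the message text "JOB_ENDED" are hypothetical, and INPUT_REQUEST stands for the job's TACL process itself asking for input, not for a program within the job that reads from a configured terminal.

```python
from enum import Enum


class TaclEvent(Enum):
    OUTPUT = "output"        # the job wrote a line of output
    INPUT_REQUEST = "input"  # the job's TACL process asks for input
    CLOSE = "close"          # the TACL process closed the Output Collector


def job_has_ended(event: TaclEvent) -> bool:
    """The job counts as ended on an input request or a close."""
    return event in (TaclEvent.INPUT_REQUEST, TaclEvent.CLOSE)


def process_events(events):
    """Return the status entry and the IPC messages after handling events."""
    status_file = {"status": "RUNNING"}  # hypothetical status file entry
    ipc_messages = []                    # hypothetical messages to the agent
    for event in events:
        if job_has_ended(event):
            status_file["status"] = "ENDED"   # record the end first
            ipc_messages.append("JOB_ENDED")  # then notify the agent
            break
    return status_file, ipc_messages
```

Writing the status entry before the message is sent follows the order described above, so the end of the job is on record even if the agent is not reachable at that moment.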

Notes:

1. Purpose of the Status File

The job's status file improves job recovery when the agent or the Output Collector fails. The context needed to restart the failed process is retrieved from the job's status file. As a result, many jobs survive such failures and continue processing undisturbed. Jobs that ended while the agent or the Output Collector was down are identified and registered. Hence, the Automation Engine always shows a correct image of the system status.
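
A recovery pass over the status file might look like the following Python sketch. The text does not say how ended jobs are identified, so the liveness check used here is an assumption, as are the entry layout, the process names and the function name recover_jobs.

```python
def recover_jobs(status_entries, live_processes):
    """Split the jobs recorded as running into resumed and ended ones.

    status_entries: job id -> {"status": ..., "process": ...} (assumed layout)
    live_processes: names of the job processes that still exist
    """
    resumed, ended_during_outage = [], []
    for job_id, entry in status_entries.items():
        if entry["status"] != "RUNNING":
            continue  # end of job was already recorded before the failure
        if entry["process"] in live_processes:
            resumed.append(job_id)              # keep tracking this job
        else:
            ended_during_outage.append(job_id)  # report as ended afterwards
    return resumed, ended_during_outage


status_entries = {
    1001: {"status": "RUNNING", "process": "$J1"},
    1002: {"status": "RUNNING", "process": "$J2"},
    1003: {"status": "ENDED", "process": "$J3"},
}
resumed, ended = recover_jobs(status_entries, live_processes={"$J1"})
```

In this example, job 1001 continues to be tracked, job 1002 is registered as having ended during the outage, and job 1003 needs no further action.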

2. Mutual Monitoring

The agent and the Output Collector monitor each other. If one of the two processes ends unexpectedly (unplanned stop, CPU failure, software failure, etc.), the surviving process automatically restarts it. If the CPU of the failed process is no longer available, another available CPU is selected, preferably one that differs from the CPU of the surviving process. This allows the system to tolerate a variety of failures.
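
The CPU selection rule can be written down as a short function. This Python sketch is an illustration under assumptions: the function name choose_restart_cpu is hypothetical, and picking the lowest-numbered candidate is an arbitrary tie-breaker that the text does not specify.

```python
def choose_restart_cpu(failed_cpu, survivor_cpu, available_cpus):
    """Pick the CPU on which the failed process is restarted."""
    # First choice: the CPU the failed process was running on.
    if failed_cpu in available_cpus:
        return failed_cpu
    # Otherwise prefer a CPU that differs from the survivor's CPU.
    others = sorted(cpu for cpu in available_cpus if cpu != survivor_cpu)
    if others:
        return others[0]  # assumption: lowest-numbered candidate wins
    # Last resort: share the CPU with the surviving process.
    if survivor_cpu in available_cpus:
        return survivor_cpu
    raise RuntimeError("no CPU available for restart")
```

For example, with the failed process on CPU 1, the survivor on CPU 0 and CPUs 0, 2 and 3 available, the sketch selects CPU 2. Keeping the two processes on different CPUs means that a single CPU failure cannot take down both of them.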

See also:

Agent - Combining AE and NSK