Troubleshooting, Root-Cause Analysis and Remediation

The Process Monitoring perspective provides the essential toolset for identifying, investigating, and resolving abnormal system behaviors. By combining detailed execution data and the Automation Assistant, you can quickly perform root-cause analysis and initiate targeted remediation or escalation. This topic outlines the core activities required to maintain operational stability, from initial failure recognition in the task list to AI-assisted diagnostics that accelerate the path to resolution.

Monitoring Processes and Recognizing Abnormal Behaviors

As soon as an object is executed, it is displayed as a task in the Tasks list, which provides key execution and status data. There you can check the task's properties (activation, start and end times, runID) and its status.

Manual Monitoring

Use the Filter pane to isolate Aborted or ENDED_NOT_OK tasks.

AI-Enhanced Method

Use the Automation Assistant (powered by the MCP server) to find failures using natural language, for example, by typing "Show me all failed Jobs from the last hour." This automatically updates the filter to display only the relevant tasks.
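To make the filter logic concrete, the following minimal sketch shows what a query such as "all failed Jobs from the last hour" amounts to when applied to a list of task records. It is an illustration only and does not use any Automic Automation API: the `Task` record, the `failed_jobs_since` function, the object type labels, and the sample data are all hypothetical, and the status names are modeled on the filter values mentioned above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Status labels modeled on the filter values named in the text;
# the exact spelling used by the product may differ (assumption).
FAILED_STATUSES = {"ENDED_NOT_OK", "ABORTED"}


@dataclass
class Task:
    """Hypothetical task record, not an Automic Automation data structure."""
    run_id: int
    name: str
    object_type: str  # illustrative labels such as "JOB" or "WORKFLOW"
    status: str
    end_time: datetime


def failed_jobs_since(tasks, now, window=timedelta(hours=1)):
    """Return failed Job tasks that ended within `window` before `now`."""
    cutoff = now - window
    return [
        t for t in tasks
        if t.object_type == "JOB"
        and t.status in FAILED_STATUSES
        and cutoff <= t.end_time <= now
    ]


# Made-up sample data for illustration.
now = datetime(2024, 1, 1, 12, 0)
tasks = [
    Task(1001, "LOAD_DATA", "JOB", "ENDED_NOT_OK", now - timedelta(minutes=10)),
    Task(1002, "NIGHTLY_FLOW", "WORKFLOW", "ENDED_NOT_OK", now - timedelta(minutes=5)),
    Task(1003, "SEND_MAIL", "JOB", "ENDED_OK", now - timedelta(minutes=20)),
    Task(1004, "OLD_EXPORT", "JOB", "ABORTED", now - timedelta(hours=3)),
]

recent_failures = failed_jobs_since(tasks, now)
print([t.run_id for t in recent_failures])
```

In this sample, only the task with run ID 1001 matches: it is a Job, it has a failed status, and it ended within the last hour.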

Investigating the Abnormal Behaviors

Once you have identified the tasks that require attention, the investigation phase allows you to transform raw execution data into actions. A number of tools are available for this purpose; see Tools for Investigation. The Automation Assistant helps you not only identify that a failure occurred, but also understand exactly why it happened.

This hybrid approach combines the precision of structured logs with the speed of AI-driven summarization. The Automation Assistant reduces the time required to navigate complex Workflows and large Job reports by pinpointing anomalies and explaining technical errors in plain language.

Automated Report Analysis

The Automation Assistant reads reports and log files across different Agents and platforms. When you ask it to analyze a failure, it scans these technical resources for specific error patterns, return codes, and system messages.

Cross-Run Comparison

The Automation Assistant can instantly compare the attributes and logs of a failed execution against a previous successful run to identify environmental or configuration changes.

Plain Language Summaries

The primary effect is the translation of complex, nested error strings into actionable summaries.
Example: Instead of showing only Return Code 9009, the Automation Assistant explains something along these lines: "The job failed because the executable 'xyz.sh' was not found in the target directory."
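The idea of scanning a report for error patterns and turning the result into a plain-language summary can be sketched with a few regular expressions. This is a deliberately simplified, rule-based illustration and not how the Automation Assistant works internally: the `summarize_report` function, the `SUMMARIES` lookup table, the regular expressions, and the sample report text are all assumptions made for this example.

```python
import re

# Hypothetical lookup table: return code -> plain-language summary template.
SUMMARIES = {
    9009: "The job failed because the command or executable {detail} was not found.",
}

# Illustrative patterns; real reports vary by Agent and platform.
RETURN_CODE_RE = re.compile(r"return code[:=\s]+(\d+)", re.IGNORECASE)
NOT_FOUND_RE = re.compile(r"'([^']+)'.*not (?:found|recognized)", re.IGNORECASE)


def summarize_report(report_text):
    """Scan report text for a return code and build a readable summary."""
    rc_match = RETURN_CODE_RE.search(report_text)
    if rc_match is None:
        return "No return code found in the report."

    code = int(rc_match.group(1))
    template = SUMMARIES.get(code)
    if template is None:
        return f"The job ended with return code {code}; no summary is defined for it."

    # Look for a quoted name on a line that reports something missing.
    detail_match = NOT_FOUND_RE.search(report_text)
    detail = f"'{detail_match.group(1)}'" if detail_match else "(unknown)"
    return template.format(detail=detail)


# Made-up report excerpt for illustration.
report = """\
2024-01-01 12:00:01 starting job
'xyz.sh' is not recognized as an internal or external command
2024-01-01 12:00:02 Return Code: 9009
"""

summary = summarize_report(report)
print(summary)
```

A fixed lookup table only covers the codes it lists; the value of an AI-driven summary lies in handling messages that no such table anticipates.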

Tools for Investigation

The following tools are available for investigating failed tasks and abnormal behaviors:

Execution lists

Use these lists to compare different runs of a task (for example, compare a failed execution against a successful one) to discover discrepancies that may explain the failure.
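Conceptually, comparing a failed run against a successful one means listing the attributes whose values differ. The sketch below shows this on two plain dictionaries. The `diff_runs` function, the attribute names, and the values are hypothetical and serve only to illustrate the comparison; they are not taken from the product.

```python
def diff_runs(successful, failed):
    """Return {attribute: (value_in_successful, value_in_failed)} for every
    attribute whose value differs between the two runs."""
    keys = sorted(set(successful) | set(failed))
    return {
        key: (successful.get(key), failed.get(key))
        for key in keys
        if successful.get(key) != failed.get(key)
    }


# Made-up execution attributes for illustration.
successful_run = {"agent": "UNIX01", "login": "LOGIN.PROD", "return_code": 0}
failed_run = {"agent": "UNIX02", "login": "LOGIN.PROD", "return_code": 9009}

differences = diff_runs(successful_run, failed_run)
for attribute, (ok_value, bad_value) in differences.items():
    print(f"{attribute}: {ok_value!r} -> {bad_value!r}")
```

Here the comparison surfaces two discrepancies: the run used a different agent and ended with a different return code. The changed agent would be the natural starting point for further investigation.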

Use these lists in combination with the information in the reports and the task details to get a clear picture of what has happened. Execution lists let you drill down into all aspects of a run; in case of compound tasks, such as Workflows or Schedules, you can access their child task executions. Likewise, from a child task execution you can access its parent execution.

For more information about execution data in general, see Execution Data. For more information about object and task details, see Showing Object and Task Details.

Tip: Right-click an execution and select Analyze Execution to get assistance from Automic Automation's Gen AI capabilities. For more information, see:

Reports

Reports provide the technical trail for every execution. There are different types of reports, depending on the type of task. For more information about reports in general, see:

Monitors

Many task types have their own monitor view. For some task types, all the information you need is already contained in the Tasks list, in the reports, and in the Executions lists. Other task types require more extensive monitoring information and therefore provide a dedicated monitor. For more information, see:

Escalating and Remediating

Once the root cause is identified, the final step is to restore service and prevent recurrences. From the Tasks list, the Executions lists, and the various monitors, you have access to the functions designed to resolve issues, subject to your rights and permissions.

The Automation Assistant streamlines the transition from diagnosis to action by providing guided remediation. Instead of manually navigating through menus to find the correct recovery command, you can use the assistant to initiate fixes directly through natural language.

Manual Remediation

Access the context menu of a failed task to perform actions such as Restart, Cancel, or Modify. These actions are context-sensitive and depend on the current status of the task. For more information, see Available Functions Depending on the Task Status.

AI-Assisted Remediation

The Automation Assistant can suggest the most appropriate next step based on its analysis of the failure. For example, if a Job failed due to a temporary network timeout, the assistant may suggest a simple restart. If a script error is detected, it can provide a direct link to open the object for editing.
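The mapping from a diagnosed cause to a suggested next step can be pictured as a simple decision table. The sketch below is a rule-based stand-in for illustration only and does not reflect how the Automation Assistant derives its suggestions: the `Cause` categories, the keyword matching in `classify`, and the suggestion texts are all assumptions.

```python
from enum import Enum


class Cause(Enum):
    """Hypothetical failure categories used only in this example."""
    NETWORK_TIMEOUT = "network_timeout"
    SCRIPT_ERROR = "script_error"
    UNKNOWN = "unknown"


# Hypothetical decision table: diagnosed cause -> suggested next step.
SUGGESTIONS = {
    Cause.NETWORK_TIMEOUT: "Restart the task.",
    Cause.SCRIPT_ERROR: "Open the object for editing and correct the script.",
    Cause.UNKNOWN: "Review the report and escalate if necessary.",
}


def classify(message):
    """Naive keyword-based classification of an error message."""
    text = message.lower()
    if "timeout" in text or "timed out" in text:
        return Cause.NETWORK_TIMEOUT
    if "syntax error" in text or "script" in text:
        return Cause.SCRIPT_ERROR
    return Cause.UNKNOWN


def suggest_next_step(message):
    """Return the suggested action for an error message."""
    return SUGGESTIONS[classify(message)]


print(suggest_next_step("Connection timed out after 30s"))
print(suggest_next_step("Syntax error near line 12"))
```

Keyword matching of this kind is brittle; it is shown only to make the cause-to-action relationship explicit.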

Direct Execution

You can instruct the Automation Assistant to perform bulk operations or complex restarts, such as "Restart the parent workflow for this failed job" or "Cancel all blocked tasks in Client 100."

Whether you are performing a manual deep dive or using AI-led analysis, these tools provide the transparency needed to ensure system reliability and informed decision-making.