Root Cause Analysis (Web Interface)

When monitoring jobstream runs, if you see a run that has an NPTF (not predicted to finish) run status, in one click you can open the Gantt View for that run and see what the root cause of the problem is. The root cause that AAI identifies helps you move quickly towards resolving the issue that prevents the jobstream target job from completing successfully.

Understanding Root Cause Analysis

When the AAI prediction engine determines that the target job of a jobstream run is not likely to complete successfully, AAI sets the status of the jobstream run to NPTF. This triggers the root cause analyzer. The AAI root cause analyzer tries to identify the most likely reason that the run cannot complete successfully. You can see the reason that the analyzer identified in the Gantt View for the run.

Using Root Cause Analysis

When a jobstream run is NPTF, the result of the AAI root cause analysis is shown on the Gantt View for that jobstream run. For more information, see Accessing the Gantt View.

When the Gantt chart opens, the following settings and configurations are already selected so that you can immediately focus on the problem that the root cause analyzer has identified:

  • On the table, the filter Root Cause of NPTF is already selected, which means that the table lists only the problematic jobs.

  • Likewise, the Show filtered view in Gantt option is on, so the timeline shows only the problematic jobs.

  • The table contains two additional columns with information that is relevant to NPTF jobstreams:

    • Reason

      The reason that AAI determined is most likely to prevent a successful completion. For a list of possible reasons, see List of Root Causes.

    • Detail

      The specific information about the object, flag, or missing input that led to the NPTF status. For example, if the root cause is "Parent success/failure conditions not met," the detail shows the job name and the status of the job runs.

The additional filter for Root Cause of NPTF and the additional columns for Reason and Detail are available on the Gantt View only for jobstream runs that are in NPTF status. They are available regardless of how you open the Gantt View, not just when you click an NPTF link from another page or view.

List of Root Causes

The following table lists all possible root causes that can appear for NPTF jobstream runs and describes what each one means. Each root cause also has an internal code that is not shown on the screen. The codes are listed here because they can be helpful to developers and in queries run by external processes.

Reason Code    Root Cause Reason and Meaning

1

Job start condition is not met (internal code = 1)

Meaning: One or more of the job's start conditions have not been met.

2

Exit code condition is not met

Meaning: The job's exit code did not match the condition specified in the job definition.

3

Run window is not met

Meaning: The job's start conditions were not met during the run window defined on the job.

4

N/A (this code is not currently used)

5

Calendar is expired

Meaning: The calendar specified in the job's start conditions has expired.

6

Job is on hold

Meaning: The job cannot start because it has been put into On Hold status.

7

Job is on ice

Meaning: The job cannot start because it has been put into On Ice status.

8

Parent already completed

Meaning: The job cannot start because its parent job is not running.

9

Parent success/failure conditions not met

Meaning: The job is a parent job and the success/failure conditions defined for it have not been met.

10

Predecessor could not be analyzed

Meaning: The job's start conditions include a reference to a job that does not exist in the scheduler.

11

Target Job didn’t complete with success status

Meaning: The target job ran but did not complete successfully.

12

Machine is offline (supported in AutoSys r11)

Meaning: The job cannot run because the machine on which the job has been defined to run is not active.

13

Exceeded Lookback seconds, stale status

Meaning: The job has a predecessor that has not run to satisfy a start condition within the lookback secs setting in the job definition.

14

Job depends on itself

Meaning: The job cannot start because one of its start conditions is a condition on the job itself.

15

Root cause cannot be determined

Meaning: AAI could not identify a cause for the NPTF status of the jobstream run.

16

Job timed out

Meaning: The target job has timed out.

17

Job has no start conditions

Meaning: The scheduler will not run the job because its definition contains no start conditions.

18

Predecessor job has been trimmed from the jobstream

Meaning: The job has a predecessor that has been trimmed from the jobstream.

19

Dependency on a job in an instance not defined to AAI

Meaning: The job depends on a job in a scheduler instance that is not defined in AAI.

20

Target job not predicted to complete with Success status

Meaning: The target job of the jobstream is predicted to finish but to fail to meet the SLA.

21

Global variable condition not met

Meaning: The job has a global variable condition that was not met.
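
For developers who work with the internal codes in an external process, the table above can be captured in a small lookup. The following Python sketch is illustrative only: the codes and reason texts are transcribed from the table, but the names ROOT_CAUSE_REASONS and describe_root_cause are hypothetical and are not part of any AAI API. How an external process obtains the code for a given run is not covered here.

```python
# Reason codes and reason texts transcribed from the table above.
# Code 4 is listed as not currently used, so it is handled separately.
# ROOT_CAUSE_REASONS and describe_root_cause are hypothetical names,
# not part of AAI.
ROOT_CAUSE_REASONS = {
    1: "Job start condition is not met",
    2: "Exit code condition is not met",
    3: "Run window is not met",
    5: "Calendar is expired",
    6: "Job is on hold",
    7: "Job is on ice",
    8: "Parent already completed",
    9: "Parent success/failure conditions not met",
    10: "Predecessor could not be analyzed",
    11: "Target Job didn't complete with success status",
    12: "Machine is offline",
    13: "Exceeded Lookback seconds, stale status",
    14: "Job depends on itself",
    15: "Root cause cannot be determined",
    16: "Job timed out",
    17: "Job has no start conditions",
    18: "Predecessor job has been trimmed from the jobstream",
    19: "Dependency on a job in an instance not defined to AAI",
    20: "Target job not predicted to complete with Success status",
    21: "Global variable condition not met",
}


def describe_root_cause(code: int) -> str:
    """Return the reason text for an internal root cause code."""
    if code == 4:
        return "N/A (this code is not currently used)"
    # Fall back to a readable message for codes outside the table.
    return ROOT_CAUSE_REASONS.get(code, f"Unknown root cause code: {code}")


if __name__ == "__main__":
    print(describe_root_cause(6))
```

Keeping the mapping in one place means that a report or query script can translate a stored code into the same reason text that appears in the Reason column of the Gantt View.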

See also: