Job Details - Average Runs

The Average Runs tab provides a wealth of data derived from AAI's historical database and from its prediction engine. AAI continuously predicts every job. Because it constantly recalculates how long a job will run based on the job's run history, this information is always up to date. The information and graphics on this tab help you identify trends in job run duration over time and latency issues caused by different kinds of run delays.

Overview of the Average Runs Tab

The Average Runs tab presents aggregates and predictions about a job's run history and trends in three areas of the tab page:

  • Avg Run

    Here you find the average run duration and average finish-to-start delays for the job's runs over time.

  • Last Run

    This section contains the details of the last run times as well as all delays that the run encountered.

  • Job Runs, at the bottom of the tab page, presents aggregated run information in three kinds of views:

    • With Moving Average shows a line graph of runs over time plotted against moving averages.

    • With Trend shows a line graph of the run duration trend over time.

    • By Day of the Week aggregates the run times by day of the week, as a table and as a bar graph.

      Here you can also define a predicted duration override to be used in future predictions.

Job Run Averages

The Avg Run box in the top left section of the Average Runs tab provides information about the job's average run times and the average finish-to-start delay time. The header row shows how many runs the averages are based on. From top to bottom, this section contains:

  • Next to the Avg Run title, the number of runs and the date from which they were collected, in the format "based on nn of yy runs since MM/DD/YY HH:MM:SS AM/PM time-zone".

    For example, based on 101 of 101 runs since 2024/09/16 05:03:51 PM GMT means that all 101 historical runs that are available for this job since the indicated time stamp were used in the calculation. This information is useful because it tells you how reliable the average is. Generally, upwards of three weeks' worth of data results in a meaningful average.

    Distinct outliers are excluded from the calculations. For example, if this job had 10 outliers, this value would read based on 91 of 101 runs.

  • Under the Duration header you find the average run duration, based on the runs mentioned above.

    Next to it, under Std Deviation, is the standard deviation of these run durations.

  • Under Finish-to-Start Delay you see the average of all finish-to-start delays (if any) for the same runs, also with a Duration and a Std Deviation amount.

    These delays are good indicators that either the scheduler needs optimization or the job design is flawed and should be corrected.
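
The following is a minimal sketch of how an average, a standard deviation, and the "based on nn of yy runs" count relate to each other. It is not AAI's implementation: the function names, the sample durations, and the assumption that outliers are already flagged are all hypothetical, and the text above does not state which outlier rule or which standard deviation formula AAI uses.

```python
from statistics import mean, pstdev

def summarize_runs(durations_s, outlier_flags):
    """Return (used, total, average, std_dev) over runs not flagged as outliers.

    Assumption: the outlier flags are supplied by the caller; how AAI
    detects outliers is not described in this document.
    """
    kept = [d for d, is_outlier in zip(durations_s, outlier_flags) if not is_outlier]
    # Population standard deviation is an arbitrary choice for this sketch.
    return len(kept), len(durations_s), mean(kept), pstdev(kept)

def hms(seconds):
    """Format a number of seconds as hh:mm:ss."""
    total = round(seconds)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"

# Hypothetical data: five runs (in seconds), one flagged as an outlier.
durations = [600, 620, 580, 610, 3600]
flags = [False, False, False, False, True]

used, total, avg, sd = summarize_runs(durations, flags)
print(f"based on {used} of {total} runs: avg {hms(avg)}, std deviation {hms(sd)}")
```

With the outlier excluded, the count reads 4 of 5 and the average reflects only the four typical runs.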

Last Run and Delay Statistics for a Job

The box in the top right section of the Average Runs tab provides information about the performance of the job in its most recent run, as well as all delay values. This information appears under two headings, Last Run and Delay:

  • Under the Last Run heading, you find the run statistics for the most recent run of the job. These include the Duration and the Start and End dates and timestamps of the run.

  • Under the Delay heading you find a list of all delay types and their durations. If the most recent job run experienced any delays, you see them here.

    • Start-to-Running Delay

      Jobs go through various statuses as they execute. Before it actually runs, a job goes into a start status. The scheduler gathers resources and lines up everything that the job needs to be able to execute. When this is done, the job goes into a running status. This delay measures the time that elapses between the start status and the running status of a job.

      This value is a good performance indicator of the scheduler. A long delay means that the scheduler is working hard, which might indicate an overload.

    • Finish-to-Start Delay

      Time that elapses between the completion of one job and the start of the next job in the jobstream. Like the start-to-running delay, the finish-to-start delay is a good performance indicator of the scheduler. You can use the information provided by these delays to suggest optimizations to the team in charge of the scheduler.

      Important!

      System delays that keep growing indicate that it is necessary to reorganize the scheduler, redistribute the load, or both. For example, if you always have a peak at certain times of the day, it might be necessary to spread that load over the next few hours or to increase capacity (additional resources, additional remote agents, and so forth).

    • Operational Delay

      Delay that measures the amount of time that the operations team takes to fix an issue in a job. For example, a job fails at 01:00 a.m. and manual intervention is necessary. An operator restarts it at 01:30 a.m. The 30 minutes in between are the operational delay.

      This delay is a good indicator of how the operations team reacts to issues in schedulers. It can serve as a starting point to suggest improvements in the team's procedures, protocols, and so forth.

    • Built-in Delay

      Delay that is caused by the design of the job itself, usually due to semantic errors in its definition. For example:

      • The job has a hard-coded start time.

      • The job has a reference to another job that no longer exists in the scheduler.

      • The job has references to a calendar that no longer exists, and so forth.

      Tip:

      Hand over the information about these delays to the development team of the corresponding scheduler so that they can redesign the jobs accordingly.
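
To make the delay definitions above concrete, here is a small sketch that derives three of the delays from timestamps. The timestamps and variable names are hypothetical and only illustrate the arithmetic, reusing the 01:00 a.m. failure and 01:30 a.m. restart from the operational delay example. How AAI records these events internally is not described here.

```python
from datetime import datetime, timedelta

def elapsed(earlier, later):
    """Time between two events, clamped so that it is never negative."""
    return max(later - earlier, timedelta(0))

FMT = "%Y-%m-%d %H:%M"

# Hypothetical event timestamps for one job run.
predecessor_finished = datetime.strptime("2024-09-16 00:40", FMT)
job_started = datetime.strptime("2024-09-16 00:45", FMT)    # job enters start status
job_running = datetime.strptime("2024-09-16 00:47", FMT)    # job enters running status
job_failed = datetime.strptime("2024-09-16 01:00", FMT)
job_restarted = datetime.strptime("2024-09-16 01:30", FMT)  # operator restarts the job

finish_to_start = elapsed(predecessor_finished, job_started)
start_to_running = elapsed(job_started, job_running)
operational = elapsed(job_failed, job_restarted)

print("Finish-to-Start Delay:", finish_to_start)
print("Start-to-Running Delay:", start_to_running)
print("Operational Delay:", operational)
```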

Using the Moving Average Graphic for a Job's Runs

The With Moving Average view in the Job Runs section of the Average Runs tab contains a graphic showing job runs per day, with one data point for each run. Alongside these data points are two lines showing the moving averages over a short and a longer time range, so that you can compare the job runs against the moving trend. AAI can go back as far as 90 days of runs to calculate and display this data.

Selecting the Moving Average Time Frame

The two dropdown lists with the legend to the right of the graphic let you select two different time frames for the moving average calculations:

  • Short moving average time frame (5, 10, or 15 days). The default is 10.

  • Long moving average time frame (30, 60, or 90 days). The default is 30.

For example, if you select 10-Day Moving Average, AAI takes the last 10 days, assembles one data point per day, adds up the 10 data points, and divides the sum by 10. If there are no runs on a given day, that day's data point is 0. If there are 10 runs on a given day, that day's data point is the average of those 10 runs.
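
The calculation described above can be sketched as follows. This is an illustration of the description only, not AAI's code: the function names and the sample runs are hypothetical.

```python
from datetime import date, timedelta
from statistics import mean

def daily_points(runs, end_day, window_days):
    """One data point per day: the mean duration of that day's runs, or 0 if none."""
    points = []
    for offset in range(window_days - 1, -1, -1):
        day = end_day - timedelta(days=offset)
        todays = [duration for run_day, duration in runs if run_day == day]
        points.append(mean(todays) if todays else 0)
    return points

def moving_average(runs, end_day, window_days=10):
    """Sum the daily data points and divide by the number of days in the window."""
    return sum(daily_points(runs, end_day, window_days)) / window_days

# Hypothetical runs as (run date, duration in seconds).
runs = [
    (date(2024, 9, 15), 600),
    (date(2024, 9, 16), 300),
    (date(2024, 9, 16), 500),
]

ten_day = moving_average(runs, end_day=date(2024, 9, 16), window_days=10)
print(ten_day)
```

In this sample, the two runs on the last day collapse into one data point of 400 seconds, and the eight days without runs contribute 0, so days without runs pull the moving average down.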

How the Graphic Shows Changes in Trends

Each data point in the graphic represents a run of the job. The x axis indicates the date, timestamp, and time zone of the run. The y axis indicates the job run duration in hh:mm:ss. The red line represents the short time frame moving average, and the green dotted line represents the long time frame moving average. The combination of these lines lets you identify trends over time. For example, if you see the short-term average going up relative to the long-term one, investigate why the job is taking longer to finish. Could it be due to scheduler load? Could it be due to dependencies within the job?

Note:

The more data you base a trend calculation on, the more reliable it is. Therefore, an average over a longer period of time is a more reliable indication of the job trend.

Mouse over a bar in the graphic to display a pop-up tooltip with the details of the particular run, as follows:

  • The start date and time

  • The Job Run duration

  • The shorter 10-Day Moving Average amount

    The default is 10 days. If you selected 5-day or 15-day instead, the graph and the pop-up tooltip show that value.

  • The longer 30-Day Moving Average amount

    The default is 30 days. If you selected 60-day or 90-day instead, the graph and the pop-up tooltip show that value.

Navigating through the Execution Timeline

Use the slider beneath the graphic to navigate back and forth through the execution timeline and to zoom in or out of a time range on the graphic:

  1. Mouse over the handle on either side of the slider until it turns to a double arrow.

  2. Click and hold the handle, then drag that side of the slider to the left or right to move backward or forward in time and change the time frame you are viewing.

    By narrowing the time frame, you effectively zoom in on run data. By widening the time frame, you effectively zoom out on the run data.

Using the Trend Graphic for a Job's Runs

The With Trend view in the Job Runs section of the Average Runs tab contains a graph that shows the trend of run durations for a job. The horizontal x-axis shows the dates and times, and the y-axis measures the run durations. You see two things on the graph:

  • A data point for each job run in the range that you are looking at

  • A trend line that shows a calculated trend based on the historical run durations and their consistency. An ascending line means that the job is trending towards running longer. A descending line means that the job is trending towards running shorter.
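
As an illustration of what an ascending or descending trend line means, the sketch below fits a straight line through run durations with ordinary least squares. This is a stand-in technique chosen for the example. The text above only says that AAI's trend is based on historical run durations and their consistency, so the actual calculation may differ. The function names and sample data are hypothetical.

```python
def trend_line(durations_s):
    """Fit duration = slope * run_index + intercept by ordinary least squares.

    Requires at least two runs. Runs are assumed to be in chronological order.
    """
    n = len(durations_s)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(durations_s) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, durations_s))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def describe(slope):
    """Translate the sign of the slope into the wording used for the trend line."""
    if slope > 0:
        return "trending towards running longer"
    if slope < 0:
        return "trending towards running shorter"
    return "flat"

# Hypothetical durations in seconds for four consecutive runs.
slope, intercept = trend_line([600, 610, 620, 630])
print(slope, intercept, describe(slope))
```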

Use the Trend Length field to the right of the graph to choose how many days of history are included in the trend calculation. There are two options:

  • 30 Days

  • 90 Days

This graph, which shows the run duration trend for one job, is very similar to the graph on the Trending Jobs data insight, where you can view run trends of all jobs across your entire system or across a segment of it, depending on the filters that you set. You can also jump from that data insight to the Average Runs tab for a specific job by clicking the job's linked name. For more information, see The Trending Jobs Data Insight.

Using the Averages by Day of Week Views for a Job's Runs

The By Day of the Week view in the Job Runs section of the Average Runs tab contains the average run durations for each day of the week. You can view these either as a table that shows the Average Duration per day of the week or as a bar graph of the same data. Use the Table View and Graph View buttons at the top right of the section to switch between the views.

The Predicted Run Duration for a Job

Above the table or graph, a subtitle shows the Predicted duration for the job runs and what this value is based on. However it is calculated or supplied, the predicted duration becomes the basis of AAI's predictions of future run durations. The predicted duration can be based on one of the following:

System Calculated duration, which is based on one of the two job run prediction models that AAI uses. They are as follows:

  • Moving average

    This is AAI's default prediction model. AAI calculates the average durations based on the run times of the entire job run history, excluding the outliers. AAI then uses the resulting average duration values for all its subsequent predictions and calculations.

  • Day of the week

    AAI recognizes when jobs have inconsistent average durations for different days of the week and when there are significant peaks or drops on certain days. When this happens, AAI does not use the moving average method. Instead, AAI automatically uses a day-of-the-week prediction model. With this model AAI calculates job run average durations per weekday.

User Supplied duration, which is based on an override value that you or another user enters from this view on the Average Runs tab. When an override is applied, you can edit it to change the value or delete it to revert to a system-calculated predicted duration. For more information, see Supplying a Run Predicted Duration for a Job.
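
The per-weekday averages used by the day-of-the-week model described above can be sketched as follows. The function name and sample runs are hypothetical, outlier handling is omitted, and the criteria AAI uses to switch between the two models are not covered by this sketch.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def averages_by_weekday(runs):
    """Group run durations by weekday and average each group."""
    buckets = defaultdict(list)
    for run_day, duration in runs:
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        buckets[WEEKDAYS[run_day.weekday()]].append(duration)
    return {day: mean(durations) for day, durations in buckets.items()}

# Hypothetical runs as (run date, duration in seconds):
# short runs on Mondays, much longer runs on Fridays.
runs = [
    (date(2024, 9, 16), 600),
    (date(2024, 9, 20), 1800),
    (date(2024, 9, 23), 620),
    (date(2024, 9, 27), 2000),
]

per_day = averages_by_weekday(runs)
print(per_day)
```

A single overall average of these four runs would fit neither the Monday runs nor the Friday runs well, which is the kind of situation where per-weekday averages are more useful.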

Supplying a Run Predicted Duration for a Job

The New Predicted Duration button allows you to override the job run duration that AAI has automatically predicted. Click it to open the Add Predicted Duration dialog and enter the new predicted duration. You can also enter a description that explains your reasons for modifying the system-calculated duration.

In Expiration Information you can specify whether this modification is permanent or not.

Examples

  • A job has failed and an operator must restart it. Let's suppose that this fix takes 30 minutes. You add the extra 30 minutes to the system-calculated average duration.

  • You know that you will have a one-off situation where a particular job will need much longer than usual to complete, and you can anticipate that it will take 30 minutes longer.

You can plug those extra 30 minutes into AAI's prediction engine and thus get accurate completion times for the jobs (and consequently, for the jobstreams).
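
The arithmetic behind these examples can be sketched as follows. The function name, the 45-minute system-calculated duration, and the start time are hypothetical, and expiration of the override is not modeled.

```python
from datetime import datetime, timedelta

def predicted_end(start, system_duration, override=None):
    """Predicted completion time; a user-supplied override replaces the system value."""
    duration = override if override is not None else system_duration
    return start + duration

start = datetime(2024, 9, 16, 2, 0)
system_duration = timedelta(minutes=45)                  # hypothetical system-calculated value
override = system_duration + timedelta(minutes=30)       # the extra 30 minutes from the examples

without_override = predicted_end(start, system_duration)
with_override = predicted_end(start, system_duration, override)
print(without_override, with_override)
```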
