Creating Hive Jobs

To run a Hive script with the RA Hadoop Agent, you need to create a Hive Job. Hive Jobs are very much like Pig Jobs.

To create an RA Hadoop Agent Hive Job:

  1. Add a Job object of type Hadoop > Hive and select your RA Hadoop Agent in the Host field on the Attributes page.
  2. Go to the Hadoop page.
  3. Respond to the fields in the General section.
    • Connection: An RA Hadoop Agent Connection object. For more information on Hadoop Connection objects, see Creating RA Hadoop Agent Connection Objects.

    • Write response to job Log: When checked, the RA Hadoop Agent writes the whole response to the Job log.

    • Create request report: Creates and registers a report with the request data.

    • Create response report: Creates and registers a report with the response data.

    • Connection Timeout: The number of seconds before timing out while attempting to connect to the URL endpoint. When set to 0, the connection never times out. When no timeout is specified, the Job's Connection object setting is used.

    • Read Timeout: The number of seconds before timing out while waiting for a response to a method call. When set to 0, the read never times out. When no timeout is specified, the Job's Connection object setting is used.

    • Trace Performance Metrics: When checked, statistics on how long the call took are written to the Job report. Additionally, the following object variables are set with the metrics so that they can be reported on (see the sketch after this list):

      • COMPILE_REQUEST_DURATION
      • COMPILE_RESPONSE_DURATION
      • ROUNDTRIP_DURATION

    • URL Endpoint: Displays the dynamically generated full URI in a non-editable field.
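
    When Trace Performance Metrics is checked, you can report on the metric variables from the Job's Post Process page. The following is a minimal sketch in Automic scripting language; it assumes the variables can be referenced directly by name once they are set:

      ! Log the performance metrics captured for the WebHCat call
      ! (assumption: Trace Performance Metrics is checked on the Hadoop page).
      :PRINT "Compile request duration:  &COMPILE_REQUEST_DURATION#"
      :PRINT "Compile response duration: &COMPILE_RESPONSE_DURATION#"
      :PRINT "Round-trip duration:       &ROUNDTRIP_DURATION#"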

  4. Respond to the fields in the Definition section.
    • Hive Script: The Hive script for the Job. You can use the folder icon to browse to the file, and the Preview button to open a read-only pop-up dialog showing the beginning of the Hive script for reference. For an example of what such a script might contain, see the sketch after this list.

    • User Name: The Hadoop user name.

    • Status Directory: The Hadoop status directory. You can use the replacement value {runid}, which resolves to the run ID of the Job (for example, /hive/status/{runid}). You can use the folder icon to browse to the directory.

    • Arguments: Allowable WebHCat interface arguments.

    • Enable Log: Enables Hadoop logging.
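
    For reference, the file referenced by the Hive Script field contains ordinary HiveQL. The following is a minimal illustrative sketch; the table, columns, and input path are assumptions for the example, not part of the product:

      -- Illustrative HiveQL: load a tab-separated log file and
      -- count requests per HTTP status code.
      CREATE TABLE IF NOT EXISTS web_logs (
        ts     STRING,
        status INT,
        url    STRING
      )
      ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

      LOAD DATA INPATH '/data/weblogs/day01.tsv' INTO TABLE web_logs;

      SELECT status, COUNT(*) AS hits
      FROM web_logs
      GROUP BY status;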

  5. Click Save to save the Job.

Post Processing Variables

The following variables are available on the Job's Post Process page.

For the:       Use the Variable:
Exit value     &exitValue#
Status name    &statusName#
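
For example, a Post Process snippet can branch on these values. The following is a minimal sketch in Automic scripting language; the messages and the return code are illustrative:

  ! Illustrative Post Process logic for a Hive Job.
  :IF &exitValue# = "0"
  :   PRINT "Hive Job ended with status: &statusName#"
  :ELSE
  :   PRINT "Hive Job failed, exit value: &exitValue#"
  !   Use an illustrative return code to set the Job to a failed state.
  :   STOP NOMSG, 50, "Hive script returned exit value &exitValue#"
  :ENDIF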

Workflow Variables

The following variables are available to the next Job when this Job is included in a Workflow.

For the:                 Use the Variable:
Standard output          &stdout#
Standard error output    &stderr#
Status directory         &statusdir#
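
For example, the next Job in the Workflow can log where the Hive Job wrote its output. A minimal sketch, again in Automic scripting language:

  ! Illustrative Process tab lines in the Job that follows the Hive Job.
  :PRINT "Hive stdout:           &stdout#"
  :PRINT "Hive stderr:           &stderr#"
  :PRINT "Hive status directory: &statusdir#"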