
Chapter 2. The Web Interface

2.1. Job Configuration
2.2. Job Input
2.3. Mapper and Reducer Source Code
2.4. Submitting and Monitoring a Job
2.4.1. Test Jobs
2.4.2. Standard Jobs
2.5. Job Quotas
2.6. Example Job
2.6.1. Log In
2.6.2. Configure & Write Job
2.6.3. Submit as a Test Job
2.6.4. Submit as a Standard Job

2.1. Job Configuration

The web interface exposes these general configuration options:
  • Job Name: A brief name that signals the purpose of the job to the user and to cluster administrators. It should consist only of alphanumeric characters, dashes, and underscores.
  • Source Code Language: The programming language to use for the mapper and reducer. The available options are system- and configuration-dependent.
  • Number of Map Tasks: A suggested number of discrete "tasks" to divide the map phase of the job into. More tasks allow data to be divided more evenly, but can also add overhead. There is no guarantee that the suggestion will be followed, and it usually does not need to be set manually.
  • Number of Reduce Tasks: A suggested number of tasks to divide the reduce phase of the job into. As with the number of map tasks, it usually does not need to be set manually, but the map-reduce system follows this suggestion more closely than it follows the suggested number of map tasks. The default is 1. Set it higher for large jobs where the reduce phase becomes a bottleneck.

    Note

    For test jobs, neither the map phase nor the reduce phase is split into tasks, so these numbers are ignored.
  • Sort Order: Controls whether keys should be sorted numerically or alphabetically in the final output. In alphabetic sorting, a key of 10 would come before 9 because the first digit (1) has a lower value. With numeric sorting, the opposite would be true, because the entire numerical value is compared. Numeric sorting supports both integer and floating-point values, and sorts alphabetic characters before numbers.
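
The job name rule and the two sort orders described above can be illustrated with a short Python sketch. This is not the web interface's actual implementation: the function names are hypothetical, and the handling of non-numeric keys under numeric sorting (placing them first, in alphabetic order) is one reading of "sorts alphabetic characters before numbers".

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits.
JOB_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_job_name(name):
    """Return True if the name uses only letters, digits, dashes, and underscores."""
    return JOB_NAME_PATTERN.fullmatch(name) is not None


def alphabetic_sort(keys):
    """Compare keys character by character, so "10" sorts before "9"."""
    return sorted(keys)


def _numeric_sort_key(key):
    # Keys that parse as a number are compared by their full numeric value.
    # Keys that do not parse are grouped first (an assumed interpretation)
    # and ordered alphabetically among themselves.
    try:
        return (1, float(key), "")
    except ValueError:
        return (0, 0.0, key)


def numeric_sort(keys):
    """Compare keys by numeric value, so "9" sorts before "10"."""
    return sorted(keys, key=_numeric_sort_key)


if __name__ == "__main__":
    keys = ["10", "9", "2.5", "apple"]
    print(alphabetic_sort(keys))
    print(numeric_sort(keys))
    print(is_valid_job_name("word-count_01"), is_valid_job_name("word count"))
```

With the sample keys, alphabetic sorting places 10 first because only its leading character is compared against the others, while numeric sorting compares whole values and handles the integer and floating-point keys together.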