With Amazon Batch, you simply package the code for your batch jobs, specify their dependencies, and submit your batch job using the Amazon Web Services Management Console, CLIs, or SDKs. Amazon Batch allows you to specify execution parameters and job dependencies, and facilitates integration with a broad range of popular batch computing workflow engines and languages (e.g., Pegasus WMS, Luigi, and Amazon Step Functions). Amazon Batch efficiently and dynamically provisions and scales Amazon EC2 and Spot Instances based on the requirements of your jobs. Amazon Batch provides default job queues and compute environment definitions that enable you to get started quickly.
Support for tightly-coupled HPC workloads
Amazon Batch supports multi-node parallel jobs, which enables you to run single jobs that span multiple EC2 instances. This feature lets you use Amazon Batch to easily and efficiently run workloads such as large-scale, tightly-coupled High Performance Computing (HPC) applications or distributed GPU model training. Amazon Batch also supports Elastic Fabric Adapter, a network interface that enables you to run applications that require high levels of inter-node communication at scale on Amazon Web Services.
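As a rough illustration, a multi-node parallel job can be described by a job definition of type "multinode". The sketch below only builds the request payload; the function name, image, command, and resource sizes are hypothetical placeholders, and Elastic Fabric Adapter settings are not shown.

```python
def build_multinode_job_definition(name, image, num_nodes):
    """Return keyword arguments shaped for Batch's RegisterJobDefinition call."""
    container = {
        "image": image,
        "command": ["mpirun", "./solver"],  # hypothetical entry point
        "resourceRequirements": [
            {"type": "VCPU", "value": "8"},       # values are strings in this API
            {"type": "MEMORY", "value": "32768"},  # MiB
        ],
    }
    return {
        "jobDefinitionName": name,
        "type": "multinode",
        "nodeProperties": {
            "numNodes": num_nodes,
            "mainNode": 0,  # index of the node that coordinates the job
            # "0:" applies the same container settings to every node
            "nodeRangeProperties": [{"targetNodes": "0:", "container": container}],
        },
    }


job_definition = build_multinode_job_definition("mnp-demo", "example/solver:latest", 4)
```

With boto3 installed and credentials configured, the payload could then be passed as boto3.client("batch").register_job_definition(**job_definition).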
Granular job definitions
Amazon Batch allows you to specify resource requirements, such as vCPU and memory, Amazon Identity and Access Management (IAM) roles, volume mount points, container properties, and environment variables, to define how jobs are to be run. Amazon Batch executes your jobs as containerized applications running on Amazon ECS.
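The settings listed above map onto a container job definition. Here is a minimal sketch in Python that assembles one as a plain dictionary; the helper names, image, command, role ARN, paths, and sizes are all illustrative assumptions, not values from this page.

```python
def build_job_definition(name, image, job_role_arn):
    """Return keyword arguments shaped for Batch's RegisterJobDefinition call."""
    return {
        "jobDefinitionName": name,
        "type": "container",
        "containerProperties": {
            "image": image,
            "command": ["python", "process.py"],  # hypothetical command
            "jobRoleArn": job_role_arn,           # IAM role the job assumes
            "resourceRequirements": [
                {"type": "VCPU", "value": "2"},
                {"type": "MEMORY", "value": "4096"},  # MiB
            ],
            "environment": [{"name": "STAGE", "value": "test"}],
            "volumes": [{"name": "scratch", "host": {"sourcePath": "/mnt/scratch"}}],
            "mountPoints": [
                {"sourceVolume": "scratch", "containerPath": "/scratch", "readOnly": False}
            ],
        },
    }


def register(job_definition):
    """Send the definition to Batch (needs boto3 and configured credentials)."""
    import boto3  # imported lazily so the builder above runs without AWS access

    return boto3.client("batch").register_job_definition(**job_definition)
```

Keeping the payload builder separate from the API call makes the definition easy to inspect or unit test before anything is registered.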
Simple job dependency modeling
Amazon Batch enables you to define dependencies between different jobs. For example, your batch job can be composed of three different stages of processing with differing resource needs. With dependencies, you can create three jobs with different resource requirements where each successive job depends on the previous job.
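The three-stage example could be sketched as follows. The function takes a Batch client (for instance one created with boto3.client("batch")) and chains each job to the previous one through the dependsOn field of SubmitJob; the queue, job names, and job definition names are hypothetical.

```python
def submit_pipeline(batch, job_queue, stages):
    """Submit stages in order so that each job depends on the one before it.

    `stages` is a list of (job_name, job_definition) pairs.
    Returns the job IDs in submission order.
    """
    job_ids = []
    for job_name, job_definition in stages:
        request = {
            "jobName": job_name,
            "jobQueue": job_queue,
            "jobDefinition": job_definition,
        }
        if job_ids:
            # Hold this job until the previous stage has finished successfully.
            request["dependsOn"] = [{"jobId": job_ids[-1]}]
        response = batch.submit_job(**request)
        job_ids.append(response["jobId"])
    return job_ids
```

A call might look like submit_pipeline(client, "default-queue", [("extract", "extract-def"), ("transform", "transform-def"), ("load", "load-def")]), where each job definition carries its own resource requirements.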
Support for popular workflow engines
Amazon Batch can be integrated with commercial and open-source workflow engines and languages such as Pegasus WMS and Luigi, enabling you to use familiar workflow languages to model your batch computing pipelines.
Dynamic compute resource provisioning and scaling
Amazon Batch provides Managed Compute Environments that dynamically provision and scale compute resources based on the volume and resource requirements of your submitted jobs. You can configure your Amazon Batch Managed Compute Environments with requirements such as the type of EC2 instances, VPC subnet configurations, the minimum, maximum, and desired vCPUs across all instances, and the amount you are willing to pay for Spot Instances as a percentage of the On-Demand Instance price.
Alternatively, you can provision and manage your own compute resources within Amazon Batch Unmanaged Compute Environments if you need to use different configurations (e.g., larger EBS volumes or a different operating system) for your EC2 instances than what is provided by Amazon Batch Managed Compute Environments. You just need to provision EC2 instances that include the Amazon ECS agent and run supported versions of Linux and Docker. Amazon Batch will then run batch jobs on the EC2 instances that you provision.
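To make the two options concrete, the sketch below builds request payloads for a managed Spot compute environment and for an unmanaged one. The helper names, instance types, vCPU limits, and the 60 percent bid are illustrative assumptions; a real environment may also need settings not shown here, such as a service role.

```python
def build_spot_compute_environment(name, subnets, security_group_ids, instance_role):
    """Return keyword arguments shaped for Batch's CreateComputeEnvironment call."""
    return {
        "computeEnvironmentName": name,
        "type": "MANAGED",
        "state": "ENABLED",
        "computeResources": {
            "type": "SPOT",
            "allocationStrategy": "SPOT_CAPACITY_OPTIMIZED",
            "bidPercentage": 60,  # pay at most 60% of the On-Demand price
            "minvCpus": 0,        # scale to zero when no jobs are queued
            "desiredvCpus": 0,
            "maxvCpus": 256,
            "instanceTypes": ["c5.large", "m5.large"],
            "subnets": list(subnets),
            "securityGroupIds": list(security_group_ids),
            "instanceRole": instance_role,
        },
    }


def build_unmanaged_compute_environment(name):
    """Unmanaged environments carry no computeResources; you supply the instances."""
    return {"computeEnvironmentName": name, "type": "UNMANAGED", "state": "ENABLED"}
```

Either payload could be passed to boto3.client("batch").create_compute_environment(**payload) once boto3 and credentials are available.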
Priority-based job scheduling
Amazon Batch enables you to set up multiple queues with different priority levels. Batch jobs are stored in the queues until compute resources are available to execute them. The Amazon Batch scheduler evaluates when, where, and how to run jobs that have been submitted to a queue based on the resource requirements of each job. The scheduler evaluates the priority of each queue and runs jobs in priority order on optimal compute resources (e.g., memory-optimized vs. CPU-optimized instances), as long as those jobs have no outstanding dependencies.
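A possible way to express two queues with different priorities is shown below. The sketch builds CreateJobQueue payloads only; queue names, priority values, and compute environment names are hypothetical. In this API, a larger priority integer is evaluated first among queues that share a compute environment.

```python
def build_job_queue(name, priority, compute_environments):
    """Return keyword arguments shaped for Batch's CreateJobQueue call."""
    return {
        "jobQueueName": name,
        "state": "ENABLED",
        "priority": priority,
        # Compute environments are tried in ascending "order".
        "computeEnvironmentOrder": [
            {"order": index, "computeEnvironment": environment}
            for index, environment in enumerate(compute_environments, start=1)
        ],
    }


queues = [
    build_job_queue("urgent", 100, ["on-demand-ce"]),
    build_job_queue("background", 1, ["spot-ce", "on-demand-ce"]),
]
```

Each entry could be sent with boto3.client("batch").create_job_queue(**queue); jobs submitted to "urgent" would then be considered before those in "background" when both compete for "on-demand-ce".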
Integrated monitoring and logging
Amazon Batch displays key operational metrics for your batch jobs in the Amazon Web Services Management Console. You can view metrics related to compute capacity, as well as running, pending, and completed jobs. Logs for your jobs (e.g., STDERR and STDOUT) are available in the Amazon Web Services Management Console and are also written to Amazon CloudWatch Logs.
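For programmatic access, job output can also be read back from CloudWatch Logs. The sketch below assumes the job uses the default log configuration, in which container logs go to the /aws/batch/job log group and the stream name is reported by DescribeJobs; the function name is hypothetical, and only the first page of events is fetched.

```python
DEFAULT_LOG_GROUP = "/aws/batch/job"  # assumed default; custom log configs differ


def fetch_job_log(batch, logs, job_id, log_group=DEFAULT_LOG_GROUP):
    """Return the log messages of a single-container job as a list of strings.

    `batch` and `logs` are clients, e.g. boto3.client("batch") and
    boto3.client("logs").
    """
    job = batch.describe_jobs(jobs=[job_id])["jobs"][0]
    stream = job.get("container", {}).get("logStreamName")
    if stream is None:
        # The job has not started yet, so no log stream exists.
        return []
    response = logs.get_log_events(
        logGroupName=log_group,
        logStreamName=stream,
        startFromHead=True,  # oldest events first
    )
    return [event["message"] for event in response["events"]]
```

Passing the clients in as arguments keeps the function easy to exercise with stand-in objects before pointing it at a real account.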
Fine-grained access control
Amazon Batch uses IAM to control and monitor the Amazon Web Services resources that your jobs can access, such as Amazon DynamoDB tables. Through IAM, you can also define policies for different users in your organization. For example, admins can be granted full access permissions to any Amazon Batch API operation, developers can have limited permissions related to configuring compute environments and registering jobs, and end users can be restricted to the permissions needed to submit and delete jobs.
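As an example of the end-user case, the sketch below assembles an IAM policy document that allows submitting, inspecting, and stopping jobs. The exact action list is an assumption: "delete" is interpreted here as the CancelJob and TerminateJob operations, and the wildcard resource would normally be narrowed to specific job queues and job definitions.

```python
import json


def build_end_user_policy():
    """Return an IAM policy document limited to job submission and removal."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "SubmitAndStopJobs",
                "Effect": "Allow",
                "Action": [
                    "batch:SubmitJob",
                    "batch:DescribeJobs",
                    "batch:ListJobs",
                    "batch:CancelJob",
                    "batch:TerminateJob",
                ],
                "Resource": "*",  # placeholder; scope down in real use
            }
        ],
    }


policy_json = json.dumps(build_end_user_policy(), indent=2)
```

The resulting JSON string can be attached to a user, group, or role through the IAM console or API.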