AWS Lambda & AWS Batch

AWS offers many other compute services, and some of them are serverless. Serverless is a paradigm in which developers no longer manage or provision servers; they just deploy code or functions. The term initially meant FaaS (Function as a Service), which AWS Lambda pioneered, but it now covers a broader range of managed services, such as databases, messaging, and storage.

AWS Lambda

AWS Lambda is a service that lets you run code without provisioning or managing servers. You pay only for the compute time that you consume. Lambda is built for short executions: invocations typically finish in about a second or less, and a single invocation can run for at most 15 minutes. It scales automatically with the number of incoming requests.
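As a minimal sketch of what "just deploying a function" looks like, the Python handler below follows the standard `handler(event, context)` shape that Lambda invokes once per event. The `name` field in the event and the response layout are hypothetical and chosen only for illustration.

```python
import json


def lambda_handler(event, context):
    """Entry point that Lambda calls once per incoming event."""
    # `event` carries the trigger payload; "name" is a hypothetical field.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Local smoke test: in AWS, Lambda supplies `event` and `context` itself.
    print(lambda_handler({"name": "Lambda"}, None))
```

There is no server setup anywhere in the code: the function only describes what to do with one event, and the platform decides where and how many copies of it run.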

Virtual servers in the cloud                        | Virtual functions: no servers to manage
Limited by RAM and CPU                              | Limited by time (short executions)
Continuously running                                | Run on demand
Scaling means intervention to add or remove servers | Scaling is automated

Benefits of AWS Lambda

  • Easy Pricing:
    • Pay per request and compute time
    • Free tier: 1 million requests per month and 400,000 GB-seconds of compute time per month. This free tier does not expire at the end of your 12-month AWS Free Tier term; it is available indefinitely.
  • Integrated with the whole AWS suite of services
  • Event-Driven: functions get invoked by AWS when needed
  • Integrated with many programming languages
  • Easy monitoring through AWS CloudWatch
  • Easy to get more resources per function (up to 10 GB of RAM)
  • Increasing RAM also improves CPU and network performance

AWS Lambda language support

  • Node.js (JavaScript)
  • Python
  • Java (Java 8 compatible)
  • C# (.NET Core)
  • Golang
  • C# / PowerShell
  • Ruby
  • Custom Runtime API (community supported, for example Rust)
  • Lambda Container Image
    • The container image must implement the Lambda Runtime API
    • ECS / Fargate is preferred for running arbitrary Docker images
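To make "implementing the Lambda Runtime API" more concrete, here is a rough sketch of the loop a custom runtime or container image has to run: fetch the next invocation over HTTP, call the handler, and post the result back. The endpoint paths and the `Lambda-Runtime-Aws-Request-Id` header follow the Runtime API as documented by AWS. The `handle` function and the injectable `urlopen` parameter are assumptions made for this example, and error reporting is left out.

```python
import json
import os
import urllib.request

API_VERSION = "2018-06-01"


def handle(event):
    # Hypothetical business logic: echo the event back.
    return {"echo": event}


def process_next_invocation(runtime_api, handler, urlopen=urllib.request.urlopen):
    """Fetch one invocation, run the handler, and post the result back."""
    base = f"http://{runtime_api}/{API_VERSION}/runtime/invocation"

    # Long-poll until Lambda has an event for this execution environment.
    with urlopen(f"{base}/next") as resp:
        request_id = resp.headers["Lambda-Runtime-Aws-Request-Id"]
        event = json.loads(resp.read())

    payload = json.dumps(handler(event)).encode("utf-8")
    request = urllib.request.Request(
        f"{base}/{request_id}/response", data=payload, method="POST"
    )
    with urlopen(request):
        pass  # the response body is not needed
    return request_id


def main():
    # Lambda sets AWS_LAMBDA_RUNTIME_API (host:port) inside the environment.
    runtime_api = os.environ["AWS_LAMBDA_RUNTIME_API"]
    while True:
        process_next_invocation(runtime_api, handle)
```

In a real image the entry point would call `main()`. A more complete runtime would also report handler failures to the Runtime API's error endpoint instead of letting the exception end the loop.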

AWS Lambda Pricing: example

  • Overall pricing information is available on the AWS Lambda pricing page
  • Pay per calls:
    • First 1,000,000 requests are free
    • $0.20 per 1 million requests thereafter ($0.0000002 per request)
  • Pay per duration (in increments of 1 ms):
    • 400,000 GB-seconds of compute time per month for FREE
    • == 400,000 seconds if function is 1GB RAM
    • == 3,200,000 seconds if function is 128 MB RAM
    • After that, about $1.00 per 60,000 GB-seconds ($0.0000166667 per GB-second for x86 functions; rates vary by region and architecture)
  • It is usually very cheap to run AWS Lambda so it’s very popular
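The pricing rules above can be turned into a small estimator. This is a sketch only: the free-tier numbers and the request price come from the list above, while the default per-GB-second rate is an assumed value (the published x86 rate in many regions) and is exposed as a parameter so it can be replaced with the current rate for your region and architecture. The function names are illustrative.

```python
FREE_REQUESTS = 1_000_000            # free requests per month
FREE_GB_SECONDS = 400_000            # free compute per month
PRICE_PER_REQUEST = 0.20 / 1_000_000
# Assumed rate; check the pricing page for your region and architecture.
PRICE_PER_GB_SECOND = 0.0000166667


def free_seconds_at(memory_mb):
    """Seconds of free compute per month for a given memory size."""
    return FREE_GB_SECONDS / (memory_mb / 1024)


def monthly_lambda_cost(requests, avg_duration_ms, memory_mb,
                        price_per_gb_second=PRICE_PER_GB_SECOND):
    """Estimate the monthly bill in USD after the free tier is applied."""
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    request_cost = max(0, requests - FREE_REQUESTS) * PRICE_PER_REQUEST
    compute_cost = max(0.0, gb_seconds - FREE_GB_SECONDS) * price_per_gb_second
    return request_cost + compute_cost


if __name__ == "__main__":
    # 3 million requests of 200 ms each at 512 MB use 300,000 GB-seconds,
    # which stays inside the free compute tier; only 2 million requests bill.
    print(round(monthly_lambda_cost(3_000_000, 200, 512), 2))  # 0.4
    print(free_seconds_at(128))  # 3200000.0
```

The example shows why Lambda is often cheap for small workloads: a few million short invocations per month can cost well under a dollar.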

AWS Batch

AWS Batch dynamically provisions the optimal quantity and type of compute resources based on the volume and specific resource requirements of the batch jobs submitted. AWS Batch plans, schedules, and executes your batch computing workloads across the full range of AWS compute services and features, such as Amazon EC2 and Spot Instances.

  • Batch jobs are defined as Docker images and run on ECS
  • Batch figures out the optimal quantity and type based on volume of jobs and requirements
  • No need to manage clusters: Batch handles the infrastructure for you, much like AWS Glue
  • You pay only for the underlying EC2 instances that Batch itself creates
  • You can schedule Batch jobs using CloudWatch Events (now Amazon EventBridge)
  • Orchestrate Batch Jobs using AWS Step Functions
  • Fully managed batch processing at any scale
  • Efficiently run hundreds of thousands of batch computing jobs on AWS
  • A “batch” job is a job with a start and an end (as opposed to a continuous one)
  • You don't provision the instances yourself; Batch provisions them dynamically (EC2 or Spot Instances)
  • AWS Batch provisions the right amount of compute / memory
  • You submit or schedule batch jobs and AWS Batch does the rest!
  • Helpful for cost optimizations and focusing less on the infrastructure
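As a hedged illustration of "you submit a job and AWS Batch does the rest", the sketch below builds a job submission and sends it with boto3's `submit_job` call. The job queue and job definition must already exist in your account. All names passed in, the default resource sizes, and the helper functions themselves are assumptions for the example.

```python
def build_submit_job_request(job_name, job_queue, job_definition, command,
                             vcpus=1, memory_mib=2048):
    """Build the keyword arguments for the Batch SubmitJob call."""
    return {
        "jobName": job_name,
        "jobQueue": job_queue,            # name or ARN of an existing queue
        "jobDefinition": job_definition,  # points at the Docker image to run
        "containerOverrides": {
            "command": list(command),
            # Batch uses these requirements to choose the compute to provision.
            "resourceRequirements": [
                {"type": "VCPU", "value": str(vcpus)},
                {"type": "MEMORY", "value": str(memory_mib)},
            ],
        },
    }


def submit(request, region_name="us-east-1"):
    """Send the request to AWS Batch and return the new job's ID."""
    # boto3 is the third-party AWS SDK; it is imported lazily so that the
    # request builder above can be used and tested without it installed.
    import boto3

    client = boto3.client("batch", region_name=region_name)
    response = client.submit_job(**request)
    return response["jobId"]
```

A call might look like `submit(build_submit_job_request("nightly-train", "my-queue", "my-job-def", ["python", "train.py"]))`, where the queue and definition names are placeholders. Running it requires AWS credentials and an existing Batch setup.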

Batch vs Lambda

AWS Batch                                            | AWS Lambda
No time limit                                        | Time limit (15 minutes)
Any runtime, as long as it’s packaged as a Docker image | Limited set of runtimes
Relies on EBS / instance store for disk space        | Limited temporary disk space
Relies on EC2 (can be managed by AWS)                | Serverless
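The comparison above can be condensed into a rough rule of thumb. The helper below is only an illustrative heuristic, not an official decision procedure. It encodes the 15-minute limit and the packaging and disk constraints from the table, and its function name and parameters are hypothetical.

```python
LAMBDA_MAX_SECONDS = 15 * 60  # maximum duration of one Lambda invocation


def suggest_service(expected_runtime_seconds,
                    needs_arbitrary_docker_image=False,
                    needs_large_disk=False):
    """Suggest AWS Batch or AWS Lambda for a job (rough heuristic only)."""
    if expected_runtime_seconds > LAMBDA_MAX_SECONDS:
        return "AWS Batch"   # Lambda would time out
    if needs_arbitrary_docker_image or needs_large_disk:
        return "AWS Batch"   # Lambda limits runtimes and temporary disk space
    return "AWS Lambda"      # short, on-demand work fits Lambda


if __name__ == "__main__":
    print(suggest_service(30))         # AWS Lambda
    print(suggest_service(2 * 3600))   # AWS Batch
```

Real decisions also depend on cost, startup latency, and how the job is triggered, none of which this sketch models.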