Higher-Level AI/ML Services

These are higher-level services that don't require ML expertise.

Amazon Comprehend

  • General-purpose NLP service

  • Natural Language Processing and Text Analytics

  • Input: social media, emails, web pages, documents, transcripts, medical records (Comprehend Medical for medical text and its privacy requirements)

  • Extract key phrases, entities, sentiment, language, syntax, topics, and document classifications from your documents automatically

    • Entities - what does it think this thing is?

    • Key phrases - the phrases that matter most in a sentence

    • Syntax - classifies each token: proper noun, punctuation, auxiliary verb, etc.

      • Could be a precursor to further NLP processing
  • Can train on your own data

Amazon Translate

  • Uses deep learning for translation

  • Supports custom terminology

    • In CSV or TMX format (standard in translation)

    • Things that aren't in dictionary, like proper nouns, to handle specially

    • Appropriate for proper names, brand names, etc.

Can also be used directly through the API.
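
A minimal sketch of what a custom terminology CSV could look like, assuming the layout where the first row lists language codes and each later row gives a source term followed by its target-language equivalents. The helper name `build_terminology_csv` and the sample brand term are made up for illustration.

```python
import csv
import io


def build_terminology_csv(source_lang, target_langs, terms):
    """Build custom-terminology CSV text.

    terms is a list of rows, each [source_term, target_term_1, ...],
    with one target term per entry in target_langs.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header row: source language code first, then the target codes.
    writer.writerow([source_lang, *target_langs])
    for row in terms:
        if len(row) != 1 + len(target_langs):
            raise ValueError(f"Row has the wrong number of columns: {row}")
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical brand name that should be translated in a fixed way.
csv_text = build_terminology_csv("en", ["fr"], [["Amazon Family", "Amazon Famille"]])
print(csv_text)
```

The resulting text is what you would upload as the terminology file; the upload call itself is left out here.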

Amazon Transcribe

  • Speech to text

    • Input in FLAC, MP3, MP4, or WAV, in a specified language

    • Streaming audio supported (HTTP/2 or WebSocket)

      • Fewer languages than batch (originally French, English, and Spanish only)

      • Useful for live closed captions, for example

  • Speaker Identification

    • Specify number of speakers

    • Can tell who is speaking

  • Channel Identification

    • e.g., two callers could be transcribed separately

    • Merging based on timing of "utterances"

  • Automatic Language Identification

    • You don't have to specify a language; it detects the dominant one being spoken

  • Custom Vocabularies

    • Vocabulary Lists (just a list of special words - names, acronyms)

    • Vocabulary Tables (can include "SoundsLike", "IPA", and "DisplayAs")

    • Useful for supplying technical terms, e.g., when transcribing a course like this one
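
To make the channel-identification idea concrete, here is a toy sketch of merging two separately transcribed channels by the timing of their utterances. The `Utterance` class, the channel names, and the timings are all hypothetical; this only illustrates the merge step and is not how Transcribe implements it.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    channel: str   # e.g. "agent" or "caller" (made-up labels)
    start: float   # seconds from the start of the recording
    end: float
    text: str


def merge_channels(*channels):
    """Interleave per-channel utterance lists into one list ordered by start time."""
    merged = [utterance for channel in channels for utterance in channel]
    return sorted(merged, key=lambda utterance: utterance.start)


agent = [
    Utterance("agent", 0.0, 1.8, "Thanks for calling."),
    Utterance("agent", 4.1, 5.0, "Sure, one moment."),
]
caller = [
    Utterance("caller", 2.0, 3.9, "Hi, I need help with my order."),
]

for utterance in merge_channels(agent, caller):
    print(f"[{utterance.start:5.1f}s] {utterance.channel}: {utterance.text}")
```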

Amazon Polly

  • Neural Text-To-Speech, many voices & languages

  • Lexicons

    • Customize pronunciation of specific words & phrases

    • Example: "World Wide Web Consortium" instead of "W3C"

      • Good for handling acronyms
  • SSML

    • Speech Synthesis Markup Language

    • Alternative to plain text

    • Gives control over emphasis, pronunciation, breathing, whispering, speech rate, pitch, pauses.

  • Speech Marks

    • Can encode when sentence / word starts and ends in the audio stream

    • Useful for lip-synching animation

      • The ancillary data marks where each sentence and each word begins and ends, which you could use to lip-sync an animation
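
A small sketch of reading speech marks, assuming they arrive as one JSON object per line with `time` (milliseconds from the start of the audio), `type`, and `value` fields. The sample data below is invented, and `word_timeline` is a hypothetical helper, not part of any SDK.

```python
import json

# Invented speech-mark output: one JSON object per line.
SAMPLE_MARKS = (
    '{"time": 0, "type": "sentence", "start": 0, "end": 11, "value": "Hello world"}\n'
    '{"time": 6, "type": "word", "start": 0, "end": 5, "value": "Hello"}\n'
    '{"time": 373, "type": "word", "start": 6, "end": 11, "value": "world"}\n'
)


def word_timeline(marks_text):
    """Return (time_ms, word) pairs for every 'word' mark, in stream order."""
    timeline = []
    for line in marks_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        mark = json.loads(line)
        if mark["type"] == "word":
            timeline.append((mark["time"], mark["value"]))
    return timeline


for time_ms, word in word_timeline(SAMPLE_MARKS):
    print(f"{time_ms:>5} ms  {word}")
```

An animation could step through this timeline and switch mouth shapes as each word begins.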


Amazon Rekognition

  • Computer vision

  • Object and scene detection

    • What is in this image?

    • Can use your own face collection

      • Identify individuals in photos
  • Image moderation

    • e.g., screen images uploaded to a social media site
  • Facial analysis

  • Celebrity recognition

  • Face comparison

  • Text in image

  • Video analysis

    • Objects / people / celebrities marked on timeline

    • e.g., the points on the timeline where a specific person was found in the video

    • People Pathing

      • Show the path people are following over time

Rekognition: The Nitty Gritty

  • Images come from S3, or provide image bytes as part of request

    • S3 will be faster if the image is already there
  • Facial recognition depends on good lighting, angle, visibility of eyes, resolution

  • Streaming video must come from Kinesis Video Streams (stored video can be analyzed from S3)

    • H.264 encoded

    • 5 - 30 FPS

    • Favor resolution over framerate

  • Can use with AWS Lambda to trigger image analysis upon upload

  • Rekognition has no "edge mode" (though it does integrate with DeepLens), and out of the box it can't handle very specific classification tasks such as identifying different street signs and what they mean.
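
A sketch of the Lambda-on-upload pattern, assuming the standard S3 event notification shape. It only builds the request arguments for label detection; in a real handler each dict would be passed to something like `boto3.client("rekognition").detect_labels(**request)`. The function name, bucket, and key below are hypothetical.

```python
from urllib.parse import unquote_plus


def rekognition_requests_from_s3_event(event, max_labels=10):
    """Turn an S3 upload event into one label-detection request per uploaded object."""
    requests = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 events (spaces become '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        requests.append(
            {
                # Point Rekognition at the image already sitting in S3.
                "Image": {"S3Object": {"Bucket": bucket, "Name": key}},
                "MaxLabels": max_labels,
            }
        )
    return requests


# Made-up event for illustration.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "my-uploads"}, "object": {"key": "photos/my+cat.jpg"}}}
    ]
}
print(rekognition_requests_from_s3_event(sample_event))
```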

Rekognition Custom Labels

  • Train Rekognition with a small set of labeled images

  • Use your own labels for unique items

  • Example: the NFL (National Football League in the US) uses custom labels to identify team logos, pylons, and foam fingers in images.

Amazon Forecast

  • Time Series Forecasting

  • Fully-managed service to deliver highly accurate forecasts with ML

  • "AutoML" chooses best model for your time series data

    • ARIMA, DeepAR, ETS, NPTS, Prophet
  • Works with any time series

    • Price, promotions, economic performance, etc.

    • Can combine with associated data to find relationships

  • Inventory planning, financial planning, resource planning

  • Based on "dataset groups," "predictors," and "forecasts."

Amazon Lex

  • Billed as the inner workings of Alexa

  • Natural-language chatbot engine

    • Not as complicated
  • A Bot is built around Intents

    • Utterances invoke intents ("I want to order a pizza")

    • Lambda functions are invoked to fulfill the intent

    • Slots specify extra information needed by the intent

      • Pizza size, toppings, crust type, when to deliver, etc.
  • Can deploy to AWS Mobile SDK, Facebook Messenger, Slack, and Twilio
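
A toy model of the intent / utterance / slot vocabulary above. This is not the Lex API: real Lex matches utterances with ML rather than exact string comparison, and `fulfill` here is just a stand-in for the Lambda function. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Intent:
    name: str
    utterances: list  # sample phrases that invoke the intent
    slots: list       # names of the values the intent needs


ORDER_PIZZA = Intent(
    name="OrderPizza",
    utterances=["i want to order a pizza", "order a pizza"],
    slots=["size", "toppings", "crust"],
)


def match_intent(text, intents):
    """Return the intent whose sample utterance matches the text, or None."""
    normalized = text.strip().lower()
    for intent in intents:
        if normalized in intent.utterances:
            return intent
    return None


def next_missing_slot(intent, filled):
    """Return the first slot that still needs a value, or None when all are filled."""
    for slot in intent.slots:
        if slot not in filled:
            return slot
    return None


def fulfill(intent, filled):
    """Stand-in for the Lambda function that would fulfill the intent."""
    details = ", ".join(f"{slot}={filled[slot]}" for slot in intent.slots)
    return f"{intent.name} fulfilled with {details}"


intent = match_intent("I want to order a pizza", [ORDER_PIZZA])
filled = {"size": "large"}
print("Ask the user for:", next_missing_slot(intent, filled))
filled.update(toppings="mushroom", crust="thin")
print(fulfill(intent, filled))
```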

Amazon Lex Automated Chatbot Designer

  • Automate process of making a chatbot

  • You provide existing conversation transcripts

  • Lex applies NLP & deep learning, removing overlaps & ambiguity

  • Extracts Intents, user requests, phrases, values for slots automatically

  • Ensures intents are well defined and clearly separated from one another

  • Integrates with Amazon Connect transcripts for customer service

Amazon Personalize

  • Fully-managed recommender engine

    • Same one Amazon uses
  • Primarily use API access

    • Feed in data that indicates user interests (purchases, ratings, impressions, cart adds, catalog, user demographics, etc.)

      • via S3 or API integration

      • API is good for real-time

    • You provide an explicit schema in Avro format - catalog data

    • JavaScript or SDK

      • Website can communicate directly
    • GetRecommendations

      • Recommended products, content, etc.

      • For a given user

      • Similar items

    • GetPersonalizedRanking

      • Rank a list of items provided

      • Allows editorial control / curation

      • Use case: you don't want to show people their purely organic recommendations because you want to push specific products

      • Rank those items for a specific person

  • Console and CLI too
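
A sketch of the explicit Avro schema mentioned above, for a minimal Interactions dataset. The three fields shown are the required ones as far as I know; treat the exact layout as an assumption and check the Personalize documentation before relying on it.

```python
import json

# Minimal Interactions schema in Avro (JSON) form.
interactions_schema = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "USER_ID", "type": "string"},
        {"name": "ITEM_ID", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"},  # Unix epoch seconds
    ],
    "version": "1.0",
}

# The schema is supplied as a JSON string when it is registered.
schema_json = json.dumps(interactions_schema, indent=2)
print(schema_json)
```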

Amazon Personalize Features

  • Real-time or batch recommendations

  • Recommendations for new users and new items (the cold start problem)

    • Just recommend popular items
  • Contextual recommendations

    • Device type, time of day, etc.
  • Similar items

  • Unstructured text input

    • Can still give the catalog data
  • Intelligent user segmentation

    • For marketing campaigns

Amazon Personalize Terminology

  • Datasets

    • Users, Items, Interactions
  • Recipes

    • User-Personalization: recommends items for a user based on their interactions

    • Personalized-Ranking: ranks a given set of items for a specific user

    • Similar-Items: no user context; finds items similar to a given item
  • Solutions

    • Trains the model

    • Optimizes for relevance as well as your additional objectives

      • Video length, price, etc. - must be numeric

      • e.g., optimize for price to give a boost to high-ticket items

    • Hyperparameter Optimization (HPO)

  • Campaigns

    • Operation side

    • Deploys your "solution version"

    • Deploys capacity for generating real-time recommendations

Amazon Personalize Hyperparameters

  • User-Personalization, Personalized-Ranking

    • hidden_dimension (HPO)

    • bptt (back-propagation through time - RNN)

      • Lets time play a factor

      • Older events count less

    • recency_mask (weights recent events)

      • Simpler
    • min/max_user_history_length_percentile

      • Crawlers would otherwise skew the algorithm

      • Big institutions / large buyers would outweigh individuals and misrepresent what individuals are interested in

      • (filters out robots, crawlers, institutional buyers)

    • exploration_weight (0-1) - trades relevance for exploration

      • Why accept less relevance?

      • To show more varied items and fill out the space

    • exploration_item_age_cut_off - how far back in time you go

      • e.g., don't go back more than 5 years for a person
  • Similar-items

    • item_id_hidden_dimension (HPO)

    • item_metadata_hidden_dimension (HPO with min & max range specified)
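
An illustration of the idea behind min/max_user_history_length_percentile: drop users whose interaction counts fall outside a percentile band, so crawlers and institutional buyers do not dominate training. This is only a sketch of the concept with a simple nearest-rank percentile; how Personalize computes it internally is not something these notes cover, and the function names, defaults, and data are assumptions.

```python
import math


def percentile(sorted_values, p):
    """Nearest-rank percentile of an ascending list, with p between 0 and 1."""
    if not sorted_values:
        raise ValueError("need at least one value")
    rank = max(1, math.ceil(p * len(sorted_values)))
    return sorted_values[rank - 1]


def filter_users_by_history(history_lengths, min_pct=0.0, max_pct=0.99):
    """Keep users whose history length lies within the given percentile band."""
    lengths = sorted(history_lengths.values())
    low = percentile(lengths, min_pct)
    high = percentile(lengths, max_pct)
    return {user: n for user, n in history_lengths.items() if low <= n <= high}


# Made-up interaction counts; "crawler" has an implausibly long history.
histories = {"alice": 12, "bob": 8, "carol": 15, "dave": 9, "crawler": 5000}
print(filter_users_by_history(histories, max_pct=0.8))
```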

Maintaining Relevance

  • Keep your datasets current

    • Incremental data import frequently

    • Feed in most recent data about user behavior for example

  • Use PutEvents operation to feed in real-time user behavior

  • Retrain the model

    • They call this a new solution version

    • Updates every 2 hours by default

    • Should do a full retrain (trainingMode=FULL) weekly
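
A sketch of the payload for feeding in a real-time event with PutEvents. It only builds the arguments; in practice they would be passed to something like `boto3.client("personalize-events").put_events(**request)`. The field names follow my understanding of that operation and should be checked against the API reference; the IDs are placeholders.

```python
from datetime import datetime, timezone


def build_put_events_request(tracking_id, user_id, session_id, item_id,
                             event_type="click", sent_at=None):
    """Build the keyword arguments for a single-event PutEvents call."""
    return {
        "trackingId": tracking_id,  # identifies the event tracker for the dataset group
        "userId": user_id,
        "sessionId": session_id,
        "eventList": [
            {
                "eventType": event_type,
                "itemId": item_id,
                # When the event happened; defaults to now in UTC.
                "sentAt": sent_at or datetime.now(timezone.utc),
            }
        ],
    }


request = build_put_events_request(
    tracking_id="example-tracking-id",  # placeholder value
    user_id="user-42",
    session_id="session-1",
    item_id="item-7",
    sent_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(request)
```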

Amazon Personalize Security

  • Data NOT shared across accounts

  • Data may be encrypted with KMS

  • Data may be encrypted at rest in your region (SSE-S3)

  • Data in transit between your account and Amazon's internal systems encrypted with TLS 1.2

  • Access control via IAM

  • Data in S3 must have appropriate bucket policy for Amazon Personalize to process it

  • Monitoring & logging via CloudWatch and CloudTrail

Amazon Personalize Pricing

  • Data ingestion: per-GB

  • Training: per training-hour

  • Inference: per TPS-hour

    • Transactions per second averaged over each hour
  • Batch recommendations: per user or per item
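
The TPS-hour unit is easy to get wrong, so here is the arithmetic as a sketch: average the transactions per second within each hour, then add up the hours. The floor at a minimum provisioned TPS is an assumption added for illustration, and no real prices are used.

```python
def tps_hours(requests_per_hour, min_provisioned_tps=1.0):
    """Sum TPS-hours over a list of hourly request counts.

    Each hour contributes its average TPS (requests / 3600 seconds),
    floored at min_provisioned_tps (an assumed billing floor).
    """
    total = 0.0
    for count in requests_per_hour:
        average_tps = count / 3600
        total += max(average_tps, min_provisioned_tps)
    return total


# Three hours of made-up traffic: 7,200, 1,800, and 36,000 requests.
usage = tps_hours([7200, 1800, 36000])
print(f"{usage} TPS-hours")
```

With these numbers the hours average 2.0, 0.5, and 10.0 TPS, and the quiet hour is raised to the assumed floor of 1.0.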

Other ML Services

  1. Amazon Textract

    • OCR with forms, fields, tables support

    • Smart OCR and structure that data for you as it is being scanned in

  2. AWS DeepRacer

    • Reinforcement learning powered 1/18-scale race car
  3. DeepLens

    • Deep learning-enabled video camera

    • Integrated with Rekognition, SageMaker, Polly, TensorFlow, MXNet, Caffe

    • Perform computer vision on a camera

Industrial Applications

  1. Amazon Lookout

    • Equipment, metrics, vision

    • Detects abnormalities from sensor data automatically to detect equipment issues

    • Monitors metrics from S3, RDS, Redshift, 3rd party SaaS apps

    • Vision uses computer vision to detect defects in silicon wafers, circuit boards, etc.

  2. Amazon Monitron

    • A self-contained, higher-level Lookout

    • End to end system for monitoring industrial equipment & predictive maintenance

  3. TorchServe

    • Model serving framework for PyTorch

    • Part of the PyTorch open source project from Facebook (Meta)

    • Operational side of PyTorch

  4. AWS Neuron

    • The trend lately is for companies to develop their own chips for deep learning, and Amazon is no exception

    • SDK for ML inference specifically on AWS Inferentia chips

    • EC2 Inf1 instance type

    • Provisioning an EC2 Inf1 instance gives you a machine with the purpose-built Inferentia chip, designed specifically for ML inference

    • Integrated with SageMaker or whatever else you want (Deep Learning AMIs, containers, TensorFlow, PyTorch, MXNet)

  5. AWS Panorama

    • Computer Vision at the edge

    • Deploy deep learning close to your cameras

    • More general than DeepLens

    • Brings computer vision to your existing IP cameras

    • Or use Panorama-enabled cameras

    • Results routed to S3, CloudWatch, whatever you want.

  6. AWS DeepComposer

    • AI-powered keyboard

    • Composes a melody into an entire song

    • For educational purposes

  7. Amazon Fraud Detector

    • Upload your own historical fraud data

    • Builds custom models from a template you choose

    • Exposes an API for your online application

    • Assess risk from:

      • New accounts

      • Guest checkout

      • "Try before you buy" abuse

      • Online payments

  8. Amazon CodeGuru

    • Automated code reviews!

    • Finds lines of code that hurt performance

    • Resource leaks, race conditions

    • Offers specific recommendations

    • Powered by ML

    • Supports Java and Python

  9. Contact Lens for Amazon Connect

    • For customer support call centers

    • Ingests audio data from recorded calls

    • Allows search on calls / chats

    • Sentiment analysis

    • Find "utterances" that correlate with successful calls

      • What led to more successful calls?

    • Categorize calls automatically

    • Measure talk speed and interruptions

    • Can give rep feedback

    • Theme detection: discovers emerging issues

  10. Amazon Kendra

    • Enterprise search with natural language for data from your organization

    • For example, "Where is the IT support desk?" "How do I connect to my VPN?"

    • Combines data from file systems, SharePoint, intranet, sharing services (JDBC, S3) into one searchable repository

    • ML-powered (of course) - uses thumbs up / down feedback

    • Relevance tuning - boost strength of document freshness, view counts, etc.

  11. Amazon Augmented AI (A2I)

    • Human review of ML predictions

    • Builds workflows for reviewing low-confidence predictions

    • Access the Mechanical Turk workforce or vendors to get human beings looking at the predictions you don't have faith in

    • Integrated into Amazon Textract and Rekognition

    • Integrates with SageMaker

    • Very similar to Ground Truth, but a little more flexible and general purpose: it focuses on building those review workflows intelligently, not just on connecting with Mechanical Turk
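
The core of a human-review workflow is a confidence threshold. A minimal sketch, assuming each prediction is a dict with a `confidence` between 0 and 1; the threshold value, field names, and sample data are hypothetical, and this is not the A2I API.

```python
def route_predictions(predictions, threshold=0.8):
    """Split predictions into auto-accepted ones and ones needing human review."""
    accepted, needs_review = [], []
    for prediction in predictions:
        if prediction["confidence"] < threshold:
            needs_review.append(prediction)  # low confidence: send to a human
        else:
            accepted.append(prediction)
    return accepted, needs_review


# Made-up predictions, e.g. fields extracted from a scanned form.
predictions = [
    {"field": "name", "value": "Jane Doe", "confidence": 0.97},
    {"field": "date", "value": "2O24-01-01", "confidence": 0.41},
    {"field": "total", "value": "19.99", "confidence": 0.80},
]

accepted, needs_review = route_predictions(predictions)
print("Accepted:", [p["field"] for p in accepted])
print("Human review:", [p["field"] for p in needs_review])
```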

Putting Them Together

  • Build your own Alexa!

    • Transcribe -> Lex -> Polly
  • Make a universal translator!

    • Transcribe -> Translate -> Polly
  • Build a Jeff Bezos detector!

    • DeepLens -> Rekognition
  • Are people on the phone happy?

    • Transcribe -> Comprehend
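
A sketch of the last pipeline (Transcribe -> Comprehend), assuming the transcript JSON keeps its text under `results.transcripts[].transcript` and that the sentiment call takes `Text` and `LanguageCode` and returns a `Sentiment` label. The sentiment function is injected so the example runs without AWS; `fake_detect_sentiment` is a stand-in, where real code would pass `boto3.client("comprehend").detect_sentiment`.

```python
def transcript_text(transcribe_output):
    """Join the transcript strings from a Transcribe-style result document."""
    transcripts = transcribe_output["results"]["transcripts"]
    return " ".join(item["transcript"] for item in transcripts)


def call_sentiment(transcribe_output, detect_sentiment, language_code="en"):
    """Run sentiment detection on the transcript and return the label."""
    text = transcript_text(transcribe_output)
    response = detect_sentiment(Text=text, LanguageCode=language_code)
    return response["Sentiment"]


def fake_detect_sentiment(Text, LanguageCode):
    """Stand-in for the real service call, using a trivial keyword rule."""
    sentiment = "POSITIVE" if "thank" in Text.lower() else "NEUTRAL"
    return {"Sentiment": sentiment}


# Made-up transcription result.
sample_output = {
    "results": {"transcripts": [{"transcript": "Thank you, that fixed my problem."}]}
}
print(call_sentiment(sample_output, fake_detect_sentiment))
```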