AWS AI Practitioner - Preparation / Last Minute Revision Sheet

Hardik Joshi ☁☸

Senior Technical Program Manager @AWS ☁️ ♦💎 ♦ AWS SA Professional❇ ♦ Azure Solutions Architect Expert ⭐⭐⭐ ♦ Cloud Evangelist

Published Sep 3, 2024

Are you ready to take the AWS AI Practitioner Certification Exam?

Below is the list of all the material and prep notes that helped me pass the exam.

Hope it will be helpful to you.

KEY NOTES: PLEASE READ FIRST

Below are my personal notes from the free AWS Skill Builder course I took for this exam. You can take it too by going to AWS Skill Builder.
This is not an exhaustive list and may not follow a specific sequence (these are my personal notes and follow my own way of noting and remembering), so use it as an additional aid to your planning.
My approach was to write everything down while learning and then go through it again; if I could not recall a topic, I would study it in more depth.
The material below should be read according to its indentation levels. If you don't see proper indentation, message me directly and I can email it to you.
I hope it helps your preparation in whatever way it can and, most importantly, I WISH YOU GOOD LUCK.

Domain-level revision notes from the course on AWS Skill Builder

Domain 1: Fundamentals of AI and ML

Known Data -> Features -> Algorithm -> Output

Adjustments

Inference

ML models can be trained on various types of data.

Structured data on RDS, S3 or Redshift

S3 is primary source of training data

Semi-structured data = DynamoDB & DocumentDB

Unstructured data - tokenization

Timeseries - sequential data

Model Training - Algorithm

Inference 2 options

- Real time

Low Latency

High throughput

persistent endpoint

- Batch Transform

Offline

Large datasets

Infrequent use

ML Types

Supervised Learning

Amazon SageMaker Ground Truth -> Amazon Mechanical Turk

Unsupervised Learning

Reinforcement Learning

Reward - AWS DeepRacer

Overfitting

Model does well on training data but not outside it

Underfitting

Model cannot capture a meaningful relationship between inputs and outputs. It gives poor results for both training data and new inputs

Bias and fairness

Diversity of training data

Feature importance

Fairness constraints

Deep Learning

Neural Networks

Input Layer -> Hidden Layers -> Output Layer

Machine Learning vs Deep Learning

Consider alternatives when

Costs outweigh the benefits

Models cannot meet the interpretability requirements

Systems must be deterministic rather than probabilistic

ML Models are probabilistic

Supervised learning -

Classification

Binary - Diabetic or not diabetic

MultiClass

Regression

Simple Linear regression

Multiple Linear regression

Logistic regression

Unsupervised Learning

Clustering

Define features

Similarity function

Number of clusters

Anomaly detection

Data points that diverge
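To make the anomaly detection note concrete, here is a minimal sketch that flags data points that diverge from the rest by using a z-score. The function name `find_anomalies`, the threshold of 2 standard deviations and the sample readings are all illustrative assumptions of mine, not something taken from the course.

```python
from statistics import mean, pstdev

def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all points are identical, so nothing diverges
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical sensor readings with one obvious outlier.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
print(find_anomalies(readings))
```

A z-score is only one simple way to define "diverge"; managed services use more robust techniques.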

Amazon Rekognition

Facial comparison and analysis

Text detection

Object detection and labelling

Content moderation

Can detect explicit content in images and videos

Amazon Textract

Extract text from scanned documents

Amazon Comprehend

Extract key phrases, entities and sentiment.

A key use is detecting PII data

Amazon Lex

Conversational voice and text

Amazon Transcribe

Converts speech to text

Amazon Polly

Converts Text to speech

Amazon Kendra

Intelligent document search

Amazon Personalize

Personalized product recommendations

Amazon Translate

Translates between 75 languages

Amazon Forecast

Predicts future points in time-series data

Amazon Fraud Detector

Detects fraud and fraudulent activities

Amazon Bedrock

Amazon SageMaker

ML Pipeline

Identify Business Goal -> Frame ML Problem -> Collect Data -> Pre-process Data -> Engineer Features -> Train, Tune, Evaluate -> Deploy -> Monitor

Collect Data

AWS Glue -

Cloud optimized ETL service

Contains its own data catalog

Built in transformations

AWS Glue DataBrew

Point and click data transformation

200+ transformations

Amazon SageMaker Ground Truth

Uses ML to label your training data

Can automatically label

Amazon SageMaker Canvas

Import, Prepare, Transform, Visualize and analyze

Amazon SageMaker Feature Store

Processes raw data into features by using a processing workflow

Amazon SageMaker Experiments

visual interface

Amazon SageMaker automatic model tuning

Deploy

Batch inference

Real-time inference

Self-managed

Hosted

Amazon SageMaker inference

Batch Transform

Offline inference

Large datasets

Asynchronous

Long processing times

Large payloads

Serverless

Intermittent traffic

Periods of no traffic

Real-time

Live predictions

Sustained traffic

Low latency

Consistent

Monitor the model

Configure alerts to notify and initiate actions if any drift

data drift / concept drift

Amazon SageMaker Model Monitor

MLOps

Amazon SageMaker Model Building Pipelines

Repository Options

AWS CodeCommit

Amazon SageMaker Feature Store

Amazon SageMaker Model Registry

3rd party repository

Orchestration options

Amazon SageMaker Pipelines

Amazon Managed Workflows for Apache Airflow

AWS Step Functions

Accuracy = (True Positives + True Negatives) / Total

Precision = True Positives / (True Positives + False Positives)

Recall = True Positives / (True Positives + False Negatives)

F1 = (2 * Precision * Recall) / (Precision + Recall)

False Positive Rate FPR = False Positives / (True Negatives + False Positives)

True Negative Rate = True Negatives / (True Negatives + False Positives)
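The classification formulas above can be checked with a small worked example. This is a minimal sketch: the function name `classification_metrics` and the confusion-matrix counts are made up for illustration, and it assumes none of the denominators is zero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the metrics listed above from confusion-matrix counts."""
    total = tp + tn + fp + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "fpr": fp / (tn + fp),  # False Positive Rate
        "tnr": tn / (tn + fp),  # True Negative Rate (specificity)
    }

# Hypothetical counts: 40 TP, 45 TN, 5 FP, 10 FN.
m = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Note that FPR and TNR always add up to 1, which is a quick sanity check when revising these formulas.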

Area Under Curve - AUC

Regression Model Errors

Mean Squared Error

Root mean squared error

Mean absolute error
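For the three regression errors, a short sketch shows how they relate to each other. The function name `regression_errors` and the sample values are assumptions for illustration only.

```python
from math import sqrt

def regression_errors(actual, predicted):
    """Return MSE, RMSE and MAE for paired actual/predicted values."""
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / len(errors)
    mae = sum(abs(e) for e in errors) / len(errors)
    return {"mse": mse, "rmse": sqrt(mse), "mae": mae}

# Hypothetical values: the errors are 1, 0, -2 and 0.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 9.0, 9.0]
result = regression_errors(actual, predicted)
print(result)
```

Because MSE squares each error, the single error of 2 contributes four times as much as the error of 1, which is why MSE and RMSE are more sensitive to outliers than MAE.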

Domain 2: Fundamentals of Generative AI

AI - ML - DL - GAI

Model

In-context learning

Prompts, prompt tuning, prompt engineering

Every NLP model has a tokenizer that converts text into token IDs.

Vector - ordered list of numbers.

Ability to encode relationships and capture associations

Embeddings

Numerical vector representations of tokens that capture the semantic meaning of each token
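To illustrate how embeddings capture meaning, here is a minimal sketch that compares vectors with cosine similarity. The three-dimensional vectors are invented toy values (real embeddings have hundreds or thousands of dimensions), and the `embeddings` dictionary is hypothetical.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings, made up for illustration only.
embeddings = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.82, 0.15],
    "banana": [0.1, 0.05, 0.95],
}

related = cosine_similarity(embeddings["king"], embeddings["queen"])
unrelated = cosine_similarity(embeddings["king"], embeddings["banana"])
print(f"king vs queen: {related:.3f}, king vs banana: {unrelated:.3f}")
```

Semantically related tokens end up with vectors pointing in similar directions, so their similarity score is higher.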

Self-attention

LLMs

Deep learning foundation models

Transformers

Unimodal or multimodal

Multimodal use cases

Multimodal tasks

Diffusion Models

Forward Diffusion

Reverse Diffusion

Stable Diffusion

Does not use pixel space of the image, uses a reduced-definition latent space

SageMaker + Amazon Q Developer

Amazon Nimble Studio and Amazon Sumerian

Gen AI Architectures

Generative Adversarial Networks GANs

Variational autoencoders VAE

Transformers

AI Project lifecycle

Identify use case

Experiment and select

Adapt, align and augment

Evaluate

Deploy and integrate

Monitor

Interpretability

Intrinsic analysis

Post hoc analysis

Traditional ML outputs are deterministic (the same input to a trained model gives the same output)

Gen AI outputs are non-deterministic

Gen AI Performance metrics

Recall-Oriented Understudy for Gisting Evaluation (ROUGE)

Bilingual Evaluation Understudy (BLEU)

Transfer learning

SageMaker JumpStart

Domain 3: Applications of Foundation Models

Considerations

Architecture

Complexity

Availability

Compatibility

Explainability

Interpretability

Inference

It is the process of generating an output from an input that you provided to the model.

Input = Prompt and inference parameters

Randomness and Diversity

Temperature (lower value = favors higher-probability outputs; higher value = makes lower-probability outputs more likely)

Top K (lower value = smaller pool of candidate tokens)

Top P

Length

Response Length

Penalties

Stop sequences
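The randomness and diversity parameters above are easier to remember with a small sketch. This is a simplified illustration in plain Python, not how any specific model on Amazon Bedrock implements sampling; the function names and the logits are assumptions.

```python
from math import exp

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; low temperature sharpens the distribution."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(probs, k):
    """Keep only the k most likely tokens and renormalize."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]

def top_p_filter(probs, p):
    """Keep the smallest set of most likely tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]

logits = [2.0, 1.0, 0.5, 0.1]  # hypothetical scores for four candidate tokens
cold = softmax_with_temperature(logits, temperature=0.5)
hot = softmax_with_temperature(logits, temperature=2.0)
print("low temperature: ", [round(x, 3) for x in cold])
print("high temperature:", [round(x, 3) for x in hot])
print("top k = 2:       ", [round(x, 3) for x in top_k_filter(hot, 2)])
```

With the lower temperature the top token takes most of the probability mass, while the higher temperature spreads it out; Top K and Top P then limit which tokens can be picked at all.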

Prompt

A specific set of inputs to guide LLMs to generate an appropriate output or completion

RAG - Retrieval Augmented Generation

Prompt enrichment and appending external data to your prompt

Vector Database

Collection of data stored as mathematical representations

AWS Services for Vector search databases

Amazon OpenSearch Service

Amazon OpenSearch Serverless

Amazon Aurora PostgreSQL

Amazon RDS PostgreSQL

Amazon Aurora

Amazon Neptune

Amazon DocumentDB [with MongoDB compatibility]
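The RAG and vector database notes above can be sketched end to end in a few lines. Everything here is a stand-in: the in-memory `DOCUMENTS` list plays the role of a vector database, the embeddings are invented three-number vectors, and `retrieve` and `build_rag_prompt` are hypothetical helper names. A real setup would use an embedding model and one of the services listed above.

```python
from math import sqrt

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

# Hypothetical "vector database": (text, embedding) pairs with made-up vectors.
DOCUMENTS = [
    ("Refunds are processed within 5 business days.", [0.9, 0.1, 0.0]),
    ("Our office is closed on public holidays.", [0.1, 0.9, 0.1]),
    ("Shipping is free for orders above 50 USD.", [0.2, 0.1, 0.9]),
]

def retrieve(query_embedding, top_n=1):
    """Return the top_n document texts most similar to the query embedding."""
    ranked = sorted(
        DOCUMENTS,
        key=lambda doc: cosine_similarity(query_embedding, doc[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_n]]

def build_rag_prompt(question, query_embedding, top_n=1):
    """Enrich the prompt by appending the retrieved context to the question."""
    context = "\n".join(retrieve(query_embedding, top_n))
    return f"Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {question}"

# The query embedding is also invented; it sits close to the refunds document.
prompt = build_rag_prompt("How long do refunds take?", [0.8, 0.2, 0.1])
print(prompt)
```

The model itself is unchanged in RAG; only the prompt is enriched with external data before it is sent.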

Amazon Bedrock Agents

Orchestrate prompt completion workflows

Prompt

Zero shot prompting

Few shot prompting

Prompt Template

Chain-of-thought prompting

Prompt tuning
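As a rough illustration of the prompting styles listed above, the sketch below builds zero-shot, few-shot and chain-of-thought prompts from one template. The task, the example reviews and the helper name `build_prompt` are my own assumptions, chosen only to show the shape of each prompt.

```python
INSTRUCTION = "Classify the sentiment of the review as Positive or Negative."

def build_prompt(review, examples=(), chain_of_thought=False):
    """Fill a simple prompt template; examples make it few-shot."""
    parts = [INSTRUCTION]
    if chain_of_thought:
        parts.append("Think step by step before giving the final label.")
    for example_review, label in examples:
        parts.append(f"Review: {example_review}\nSentiment: {label}")
    parts.append(f"Review: {review}\nSentiment:")
    return "\n\n".join(parts)

# Zero shot: only the instruction and the input.
zero = build_prompt("Great battery life.")

# Few shot: a couple of labeled examples come before the input.
few = build_prompt(
    "Great battery life.",
    examples=[("Loved it", "Positive"), ("Broke in a day", "Negative")],
)

# Chain of thought: ask the model to reason before answering.
cot = build_prompt("Great battery life.", chain_of_thought=True)
print(few)
```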

Latent space

The encoded knowledge of language in LLMs or the stored patterns of data that capture relationships and reconstruct the language from the patterns when prompted

Statistical database

Prompt Engineering risks and limitations

Exposure

Prompt Injection

Jailbreaking

Hijacking

Poisoning

Training process for foundation models

Pretraining - Self supervised learning

Fine-tuning - Supervised learning (risk: catastrophic forgetting)

Continuous pre-training

Fine-tuning techniques

Parameter-efficient fine-tuning (PEFT)

Low-Rank Adaptation (LoRA)

Representation fine-tuning (ReFT)

Multitask fine-tuning

Domain adaptation fine-tuning

Reinforcement learning from human feedback (RLHF)

Data preparation for fine-tuning

Prepare your training data

Select prompts

Calculate loss

Update weights

Define evaluation steps

Data preparation AWS Services

Amazon SageMaker Canvas

Open-source frameworks

Amazon SageMaker Studio - integrates with Amazon EMR, can use JupyterLab

AWS Glue

Amazon SageMaker Feature Store

Amazon SageMaker Clarify -- if you have bias in your data

Amazon SageMaker Ground Truth -- manage data labelling

Model performance

One option to reduce inference latency is to use a smaller LLM, but this might decrease the quality of its output

Gen AI Performance Metrics

Recall Oriented Understudy for Gisting Evaluation (ROUGE)

Automatic summarization tasks

Machine translation software

Bilingual Evaluation Understudy (BLEU)

Used for translation tasks

General Language Understanding Evaluation (GLUE)

Compare against benchmarks set by the experts

Assess model generalization across multiple tasks

Holistic Evaluation of Language Models (HELM)

Help improve model transparency

Massive Multitask Language Understanding (MMLU)

Evaluates knowledge and problem solving capabilities of the model

Tested against history, mathematics, laws, computer science and more

Beyond the Imitation Game Benchmark (BIG-bench)

Focuses on tasks that are beyond the capabilities of the current language models
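To get a feel for what ROUGE and BLEU measure, here is a heavily simplified sketch based on single-word overlap. It only shows the recall-versus-precision idea: real ROUGE has several variants, and real BLEU also uses longer n-grams and a brevity penalty. The function names and sentences are illustrative assumptions.

```python
from collections import Counter

def unigram_overlap(candidate, reference):
    """Count words shared by candidate and reference (clipped by frequency)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    return sum((cand & ref).values())

def rouge1_recall(candidate, reference):
    """ROUGE-1 style recall: share of reference words found in the candidate."""
    return unigram_overlap(candidate, reference) / len(reference.split())

def unigram_precision(candidate, reference):
    """BLEU-style unigram precision: share of candidate words found in the reference."""
    return unigram_overlap(candidate, reference) / len(candidate.split())

reference = "the cat sat on the mat"
candidate = "the cat sat"
print("recall:   ", rouge1_recall(candidate, reference))
print("precision:", unigram_precision(candidate, reference))
```

The short candidate gets perfect precision but only half the recall, which is one way to remember that ROUGE is recall-oriented (summaries) while BLEU is precision-oriented (translations).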

AWS Services for model evaluation

Amazon SageMaker JumpStart

Amazon SageMaker Clarify

Domain 4: Guidelines for Responsible AI

Responsible AI

Fairness

Explainability

Robustness

Privacy and security

Governance

Transparency

Effects of bias and variance

Demographic disparities

Inaccuracy

Overfitting

Underfitting

User Trust

Responsible datasets

Inclusivity

Diversity

Balanced datasets

Privacy protection

Consent and transparency

Regular audits

Responsible practices

Environmental considerations

Sustainability

Transparency

Accountability

Stakeholder engagement

AWS service for this

Amazon SageMaker Clarify

Detect bias

Explainability

SageMaker Processing jobs

SageMaker pre-training bias analysis

Class imbalance

Label imbalance

Demographic disparity

Difference in positive proportions

Specificity difference

Recall difference

Accuracy difference

Treatment equality
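Two of the bias metrics above, class imbalance and difference in positive proportions, are simple enough to compute by hand. The sketch below follows the definitions as I understand them from the SageMaker Clarify documentation; the function names, the "advantaged"/"disadvantaged" group labels and the counts are assumptions for illustration.

```python
def class_imbalance(n_advantaged, n_disadvantaged):
    """CI = (n_a - n_d) / (n_a + n_d); 0 means both groups are equally represented."""
    return (n_advantaged - n_disadvantaged) / (n_advantaged + n_disadvantaged)

def difference_in_positive_proportions(pos_a, n_a, pos_d, n_d):
    """DPL = share of positive labels in group a minus share in group d."""
    return pos_a / n_a - pos_d / n_d

# Hypothetical dataset: 800 rows for group a (400 positive labels),
# 200 rows for group d (50 positive labels).
ci = class_imbalance(800, 200)
dpl = difference_in_positive_proportions(400, 800, 50, 200)
print(f"class imbalance: {ci:.2f}, difference in positive proportions: {dpl:.2f}")
```

Values near 0 suggest balance on that metric; values far from 0 are a signal to look at the dataset more closely before training.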

Gen AI Risks

Hallucinations

Intellectual Property

Bias

Toxicity

Data privacy

Guardrails for Amazon Bedrock

Hate

Insults

Sexual

Violence

+ Denied topics

Model transparency

Interpretability - Deep analysis

Explainability - black box analysis

AI Service Card

Amazon SageMaker Model Cards

SageMaker provides

Feature attributions - SHAP Values

Partial dependence plots

Amazon Augmented AI (A2I) - send data to human reviewers to review random predictions.

Use your own reviewers or use Amazon Mechanical Turk

Domain 5: Security, Compliance, and Governance for AI Solutions

IAM Identity Center

Workforce users, Workforce identities

Logging with CloudTrail

Captures API calls and related events

Integrated with SageMaker

Amazon SageMaker Role Manager

Preconfigured permissions for 12 activities

Encryption at rest

Amazon SageMaker

Data is encrypted by default on ML storage volumes

Notebook instances, SageMaker jobs, and endpoints

AWS Key Management Service - KMS

Amazon Macie

Identifies and alerts you to sensitive data

Remove PII during ingestion

AI System Vulnerabilities

Training Data

Input Data

Output Data

Models

Inversion

Theft

LLMs

Prompt Injection

Amazon SageMaker Model Monitor

Capture data

Create a baseline

Define data quality monitoring jobs

Evaluate statistics
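The Model Monitor steps above (capture data, create a baseline, run a monitoring job, evaluate statistics) can be mimicked with a toy drift check. This is not the SageMaker Model Monitor API, which computes much richer statistics and constraints; the function names, the two-standard-deviation rule and the sample ages are all assumptions.

```python
from statistics import mean, pstdev

def create_baseline(values):
    """Summarize the training data for one feature."""
    return {"mean": mean(values), "std": pstdev(values)}

def has_drifted(baseline, captured, max_shift_in_std=2.0):
    """Flag drift when the captured mean moves too far from the baseline mean."""
    shift = abs(mean(captured) - baseline["mean"])
    return shift > max_shift_in_std * baseline["std"]

# Hypothetical "age" feature seen during training.
training_ages = [30, 32, 35, 31, 33, 34]
baseline = create_baseline(training_ages)

# Hypothetical data captured from the endpoint at two points in time.
similar_traffic = [31, 33, 34, 32]
shifted_traffic = [45, 50, 48, 47]
print(has_drifted(baseline, similar_traffic))
print(has_drifted(baseline, shifted_traffic))
```

In practice a drift result like the second one is what you would wire to an alert or a retraining action.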

Amazon SageMaker Model Registry

Amazon SageMaker Model Cards

Amazon SageMaker ML Lineage Tracking

Amazon SageMaker Feature Store

Amazon SageMaker Model Dashboard

Emerging AI compliance standards

ISO 42001 and ISO 23894

EU Artificial Intelligence Act

NIST AI Risk Management Framework (RMF)

AI Risk Management

Probability of occurrence

Severity of occurrence

Algorithmic Accountability Act

Transparency and explainability

Monitor for Bias

AWS Audit Manager

Audits AWS usage to assess compliance

Choose a framework

Gen AI

Customer frameworks

Collect evidence and add to audit report

Guardrails for Amazon Bedrock

Apply guardrails to any foundation model and agents for Amazon Bedrock

Configure harmful content filtering

Define and disallow denied topics

PII data

AWS Config

Continuously monitors and records configurations

AWS Config rules

Conformance packs

Operational best practices for AI and ML

Security best practices for Amazon SageMaker

Amazon Inspector

Works at application level

Performs automated security assessments on your applications

AWS Trusted Advisor

Provides guidance to help you

Reduce cost

Increase performance

Improve security

Data Governance

Curation

Discovery and understanding

Protection

Define roles

Data steward

Data owner

IT Roles

AWS Glue DataBrew for data governance

Data profiling

Data Lineage

AWS Glue Data Catalog

AWS Glue Data Quality

Curation

Data Quality Management

Data Integration

Data Management

Protection

Data Security

Data Compliance

Data Lifecycle management

GENERAL LINKS - For Revision

What are Transformers in Artificial Intelligence? -> aws.amazon.com/what-is/transformers-in-artificial-intelligence/

What are Foundation Models? -> aws.amazon.com/what-is/foundation-models/

What is Artificial Intelligence (AI)? -> aws.amazon.com/what-is/artificial-intelligence/

What is Machine Learning? -> aws.amazon.com/what-is/machine-learning/

What is Deep Learning? -> aws.amazon.com/what-is/deep-learning/

What is Generative AI? -> aws.amazon.com/what-is/generative-ai/

What’s the Difference Between Supervised and Unsupervised Learning? -> aws.amazon.com/compare/the-difference-between-machine-learning-supervised-and-unsupervised/

Machine Learning Concepts -> docs.aws.amazon.com/machine-learning/latest/dg/machine-learning-concepts.html

AWS AI Use Case Explorer -> aws.amazon.com/machine-learning/ai-use-cases/?use-cases

What is Amazon SageMaker? -> docs.aws.amazon.com/sagemaker/latest/dg/whatis.html

AWS Services - Machine Learning (ML) and Artificial Intelligence (AI) -> docs.aws.amazon.com/whitepapers/latest/aws-overview/machine-learning.html

AWS Deploy Serverless ML -> aws.amazon.com/blogs/machine-learning/deploy-a-serverless-ml-inference-endpoint-of-large-language-models-using-fastapi-aws-lambda-and-aws-cdk/

AWS Sagemaker - API Gateway - AWS Lambda -> aws.amazon.com/blogs/machine-learning/call-an-amazon-sagemaker-model-endpoint-using-amazon-api-gateway-and-aws-lambda/

Inference parameters -> docs.aws.amazon.com/bedrock/latest/userguide/inference-parameters.html

Amazon Bedrock or Amazon SageMaker? -> docs.aws.amazon.com/decision-guides/latest/bedrock-or-sagemaker/bedrock-or-sagemaker.html

Choosing a generative AI service -> docs.aws.amazon.com/decision-guides/latest/generative-ai-on-aws-how-to-choose/guide.html

AWS Bedrock Agents -> aws.amazon.com/bedrock/agents/

What is RAG? - Retrieval-Augmented Generation AI Explained - AWS (amazon.com)

docs.aws.amazon.com/awscloudtrail/latest/userguide/how-cloudtrail-works.html

docs.aws.amazon.com/bedrock/latest/userguide/usingVPC.html

aws.amazon.com/blogs/machine-learning/use-aws-privatelink-to-set-up-private-access-to-amazon-bedrock/
