Amazon Q and Amazon Bedrock, fully managed vs. custom - 2025-06-25

1. Amazon Q and Bedrock, fully managed vs custom Alessandra Bilardi Data & Automation Specialist @ Corley Cloud >>AI CONF 2025

2. AI Conf 2025 Amazon Q and Bedrock, fully managed vs custom

3. AI Conf 2025 Oltre 500 progetti su AWS Corley Cloud è una realtà certiﬁcata con innumerevoli riconoscimenti e un portfolio di centinaia di progetti AWS sviluppati in diversi ambiti: cloud native, migrazione, machine learning & AI, serverless, IoT, sicurezza e cloudOps. Advanced Partner AWS

4. AI Conf 2025 Data & Automation Specialist @ Corley Cloud alessandra.bilardi@corley.it corley.it Alessandra Bilardi

5. AI Conf 2025 Alessandra Bilardi Data & Automation Specialist @ Corley Cloud alessandra.bilardi@corley.it corley.it

6. AI Conf 2025 Alessandra Bilardi Data & Automation Specialist @ Corley Cloud alessandra.bilardi@corley.it corley.it

7. AI Conf 2025 SUMMARY Machine learning steps and actors Generative AI with Amazon Q Generative AI with Amazon Bedrock Chat bot

8. AI Conf 2025 Machine learning steps and actors

9. AI Conf 2025 What are the steps of ML ? ➔ The data may arrive ready for learning, but often some processing is needed ➔ Model training could be delegated to an AI system, except for custom steps ➔ Evaluation is a prediction for which we know the expected values, for which we can calculate metrics ➔ The prediction works on new data processed with point 1 with the best model saved in point 3 Preparation Training & Tuning Testing & Evaluation Prediction (inference)

10. AI Conf 2025 ML system

14. AI Conf 2025 ML

15. AI Conf 2025 ML

16. AI Conf 2025 Are there other steps or actors in ML ? ➔ Embeddings are objects that contain information about text, images, videos, audio or code ➔ The prompt is the text that contains the behavior that the model must have, the instructions to follow to respond to the request posed. ➔ Augmented Generation (AG) techniques allow us to exploit generalist models by providing them with instructions (the prompt), context (an extract of the embeddings) and a request to obtain a speciﬁc response. Question AG Answer Embeddings & Prompt LM

17. AI Conf 2025 Use case - Chat bot - preparation steps

27. AI Conf 2025 Generative AI with Amazon Q

28. AI Conf 2025 Amazon Q

31. AI Conf 2025 Amazon Q Business

32. AI Conf 2025 Amazon Q Business 1. Embedding (as needed)

35. AI Conf 2025 Amazon Q Business 1. Embedding (as needed) 2. Request (question) 3. RAG 4. Response (answer)

36. AI Conf 2025 Amazon Q Business 1. Embedding (as needed) 2. Request (question) 3. RAG 4. Response (answer)

37. AI Conf 2025 Amazon Q Developer 1. Request

38. AI Conf 2025 Amazon Q Developer 1. Request 2. Embedding (as needed)

39. AI Conf 2025 Amazon Q Developer 1. Request 2. Embedding (as needed) 3. Context

40. AI Conf 2025 Generative AI with Amazon Bedrock

41. AI Conf 2025 Amazon Bedrock 1 2 Knowledge base (embedding) 3 Agent Models 4 Prompt

42. AI Conf 2025 Models

49. AI Conf 2025 Knowledge base

54. AI Conf 2025 Agent

62. AI Conf 2025 Prompt

67. AI Conf 2025 Flows

70. AI Conf 2025 Amazon Bedrock 1 2 Model evaluation 3 Playground Data automation 4 Prompt routers / caching

71. AI Conf 2025 Chat bot

74. AI Conf 2025 Use case - Chat bot - production version 1.0

85. AI Conf 2025 Which infrastructure for the ChatBot ?

86. AI Conf 2025 Solutions 1. like ChatGPT, max 30s

87. AI Conf 2025 Solutions 1. like ChatGPT, max 30s 2. extend 30s timeout

88. AI Conf 2025 Solutions 1. like ChatGPT, max 30s 2. extend 30s timeout Goals ● ↓ the response time ● ↓ the inference costs

89. AI Conf 2025 Inference ➔ AWS Lambda ➔ Amazon Fargate ➔ Amazon EC2

90. AI Conf 2025 Inference ➔ AWS Lambda ➔ Amazon Fargate ➔ Amazon EC2 ➔ Amazon SageMaker

91. AI Conf 2025 Inference ➔ AWS Lambda ➔ Amazon Fargate ➔ Amazon EC2 ➔ Amazon SageMaker ➔ Amazon Bedrock / Q

92. AI Conf 2025 Inference Needs ➔ GPU ➔ AWS Lambda ➔ Amazon Fargate ➔ Amazon EC2 ➔ Amazon SageMaker ➔ Amazon Bedrock / Q

93. AI Conf 2025 Inference Needs ➔ GPU ➔ Model loading ✖ ✖ ➔ AWS Lambda ➔ Amazon Fargate ➔ Amazon EC2 ➔ Amazon SageMaker ➔ Amazon Bedrock / Q ✖

94. AI Conf 2025 Inference Needs ➔ GPU ➔ Model loading ✖ ✖ ➔ AWS Lambda ➔ Amazon Fargate ➔ Amazon EC2 ➔ Amazon SageMaker ➔ Amazon Bedrock / Q ✖

95. AI Conf 2025 Comparison of solutions

96. AI Conf 2025 AWS Services Comparison for a Chatbot Services Difficulty Embeddings Training $ Inference $ Amazon Q Business $0.264 / hour / 200MB $20 / user / mo Bedrock fine tuning $2 / 1000 queries $0.0079 / 1k tokens $30 / hour Bedrock on demand $0.00072 for input / 1k tokens and for output / 1k tokens Amazon SageMaker $2 / 1000 queries $0.921 / hour $0.921 / hour

100. AI Conf 2025 Services Embeddings $ Training $ Inference $ Amazon Q Business 190 20 (user / mo) Bedrock fine tuning 2 1.5089 22320 Bedrock on demand 1.0714 (per 1k token) Amazon SageMaker 2 0.0154 3.68 (serverless) 685.22 (provisioned) AWS Services Comparison for a Chatbot Excluded from costs: ML storage, data processing and provisioned concurrency (serverless only) Example: 1 training of 191011 tokens + 1 request of 20s for every hour, every day for a month

104. AI Conf 2025 Services Embeddings $ Training $ Inference $ Amazon Q Business 190 Bedrock fine tuning 2 1.5089 22320 Bedrock on demand 1.0714 (per 1k token) Amazon SageMaker 2 0.0154 3.68 (serverless) 685.22 (provisioned) AWS Services Comparison for a Chatbot Excluded from costs: ML storage, data processing and provisioned concurrency (serverless only) Example: 1 training of 191011 tokens + 1 request of 20s for every hour, every day for a month

105. AI Conf 2025 Thanks for listening!

106. Thank you! >>AI CONF 2025 👉 slides & videos: https://www.improove.tech/videos

Amazon Q and Amazon Bedrock, fully managed vs. custom - 2025-06-25

More Related Content

Similar to Amazon Q and Amazon Bedrock, fully managed vs. custom - 2025-06-25 (20)

More from Alessandra Bilardi (20)

Recently uploaded (20)

Amazon Q and Amazon Bedrock, fully managed vs. custom - 2025-06-25