AWS AI, the practical way

An architecture-first reference for the Amazon AI stack as of June 2026. From Amazon Bedrock and the Nova model family, to Bedrock AgentCore for production agents, to SageMaker for custom models, to the applied-AI services. Trade-offs, pricing shape, and risks. No marketing.

Refreshed June 2026Architecture-firstEnterprise focusVendor-neutral

TL;DR

AWS's AI story in 2026 has three layers. Amazon Bedrock is the managed gateway to dozens of foundation models (Anthropic Claude, Meta Llama, Mistral, DeepSeek, NVIDIA Nemotron, and Amazon's own Nova) behind one API, with Knowledge Bases, Guardrails, and customization. Bedrock AgentCore turned the 2025 agent preview into a managed runtime for production agents - memory, gateway/tools, identity, observability, web search, and (preview) payments. SageMaker - now repositioned as the unified center for data + analytics + AI - is where you train, fine-tune, and host custom models. If you already run on AWS, the data-gravity and IAM integration make this stack the path of least resistance.

The AWS AI mental model

Think of three layers. Most teams start at the top (consume a model via Bedrock) and only drop down when they need custom training.

Figure 1 - The AWS AI stack is layered. Start at Layer 3/2 (Bedrock); drop to Layer 1 only when you need custom training or chips.

What sets AWS apart in 2026

Differentiator	What it means in practice
Widest managed model catalog	Bedrock fronts Anthropic Claude, Meta Llama, Mistral, Cohere, AI21, DeepSeek, NVIDIA Nemotron, Stability, and Amazon Nova behind one API and one bill. Switching models is a parameter change, not a re-architecture.
Anthropic relationship + Trainium	Deep Anthropic partnership (Project Rainier Trainium clusters) means frontier Claude models are first-class on Bedrock, often with strong price/perf on AWS silicon.
AgentCore as managed runtime	Memory, Gateway (tools/MCP), Identity, Observability, Browser, Code Interpreter, Web Search, and Payments (preview) - framework-agnostic (Strands, LangChain, OpenAI Agents SDK, Claude Agent SDK).
Data gravity + IAM	If your data is already in S3/Redshift/Aurora, RAG ground truth and access control are native. No new identity plane.
Custom silicon economics	Trainium/Inferentia give a cost lever for training and high-volume inference that pure-GPU clouds cannot match on price.

Where AWS is weaker (be honest)

Own frontier model

Amazon Nova is competitive on price/latency and improving fast, but it is not the model you reach for when you need the absolute top of the reasoning leaderboard - that is usually Claude (also on Bedrock) or a competitor's flagship. Amazon's bet is breadth, integration, and silicon economics, not owning the #1 model.

Surface area & sprawl

The catalog of overlapping services (Bedrock vs SageMaker vs Q vs applied-AI, three vector stores, two studios) is large. Picking the right primitive is itself an architecture decision - see the Decision Matrix tab.

How to read this portal

Each service tab follows the same shape: what it is, architecture, when to use, and risks. If you only read one tab, read Risks & Gotchas. The other tabs tell you what something does; Risks tells you what bites you in production.

What's New - late 2025 through June 2026

Material changes that affect architecture, cost, or risk. Curated, not a press-release dump.

TL;DR

The dominant 2026 theme is agents going to production: Bedrock AgentCore added managed Knowledge Bases, a managed agent harness, native Web Search, and (preview) autonomous Payments. Model breadth widened (NVIDIA Nemotron 3 on Bedrock, Nova Forge for Nova customization, Reinforcement Fine-Tuning). And SageMaker was repositioned as the unified data+AI center, with SageMaker Unified Studio now GA and Amazon Q Developer embedded throughout.

Date	Release	Why it matters
Dec 2025	Next-gen SageMaker + Unified Studio (re:Invent)	SageMaker repositioned as the single center for data, analytics, and AI - Glue, EMR, Athena, Redshift, Bedrock, and SageMaker AI in one workspace with a lakehouse.
Dec 2025	Trainium3 announced	Next-gen training/inference silicon; continues AWS's price/perf lever vs pure-GPU stacks. Confirm region/instance availability before designing around it.
Jan 2026	SageMaker Unified Studio GA + Amazon Q Developer GA in Studio	Data professionals get GenAI assistance across the lifecycle; Bedrock and SageMaker AI usable from one IDE.
Feb 2026	Reinforcement Fine-Tuning in Bedrock	Tailor models to narrow tasks with reward signals - higher accuracy on domain workflows without full training.
Mar 2026	NVIDIA Nemotron 3 Super on Bedrock; Nova Forge SDK	Open-weight frontier reasoning model available managed; Nova Forge lets enterprises customize Nova on their data and deploy inside Bedrock.
Apr 2026	AgentCore Payments (preview)	Agents can autonomously pay for APIs, MCP servers, web content, and other agents - built with Coinbase and Stripe. New control-plane and audit considerations.
May 2026	Agent Toolkit for AWS; AgentCore managed harness	Declare and run an agent in ~3 API calls, no orchestration code. Lowers time-to-first-agent dramatically.
Jun 2026	AWS Summit NY: Managed Knowledge Bases (Smart Parsing, Agentic Retriever), Web Search on AgentCore (GA), Amazon Quick, S3 Annotations, EC2 G7 (RTX PRO Blackwell)	Fully-managed RAG with multi-format parsing; grounded answers with zero data egress; mutable per-object context in S3; new inference GPU tier.

Practical read

If you piloted Bedrock Agents in 2025, plan a migration review to AgentCore: the managed Memory, Gateway, Identity, and Observability replace a lot of custom glue. If you run SageMaker Studio (classic), plan the move to Unified Studio.

Service Map

The AWS AI services worth knowing, grouped by what you do with them.

COREAmazon Bedrock

Managed multi-model API: catalog, Knowledge Bases, Guardrails, Flows, Evaluations, customization, AgentCore.

MODELSAmazon Nova

Amazon's own FM family: Micro, Lite, Pro, Premier, plus Canvas (image), Reel (video), Sonic (speech). Forge to customize.

AGENTSBedrock AgentCore

Runtime, Memory, Gateway, Identity, Observability, Browser, Code Interpreter, Web Search, Payments (preview).

BUILDSageMaker AI + Unified Studio

Train, fine-tune, host custom models; HyperPod for FM training; one studio over data+analytics+AI.

ASSISTAmazon Q

Q Developer (coding/ops agent), Q Business (enterprise RAG assistant), Q in QuickSight/Connect, Amazon Quick.

APPLIEDApplied AI

Rekognition, Textract, Comprehend, Transcribe, Polly, Translate, Lex, Kendra, Personalize.

DATAVectors & Data

S3 Vectors, OpenSearch vector, Aurora/RDS pgvector, MemoryDB, Kendra GenAI Index, Bedrock Data Automation.

SILICONChips & GPUs

Trainium2/3, Inferentia2, EC2 P5/P6 (Blackwell), G7, UltraClusters, Capacity Blocks.

GOVERNGuardrails

Content filters, denied topics, PII redaction, contextual grounding, Automated Reasoning checks.

Amazon Bedrock

The managed, serverless gateway to foundation models. One API, one IAM model, one bill, many vendors.

Official documentation ↗

Overview

Capabilities

When to use

Risks

Bedrock exposes many foundation models through a unified API. You never manage servers; you call InvokeModel / Converse and pay per token (on-demand) or reserve capacity (Provisioned Throughput). It is the default starting point for almost any GenAI workload on AWS.

ServerlessConverse APIStreamingCross-region inferenceBatchPrompt caching

Capability	What it does
Model catalog & Marketplace	First-party and partner FMs, plus 100+ models via Bedrock Marketplace; import your own custom weights.
Knowledge Bases	Managed RAG: ingest from S3 and connectors, chunk/embed, retrieve. 2026 adds Smart Parsing and an Agentic Retriever.
Guardrails	Independent safety layer: content filters, denied topics, PII redaction, contextual grounding, Automated Reasoning checks.
Flows	Visual orchestration of prompts, models, KBs, and Lambda into a deployable workflow.
Evaluations	Automatic and LLM-as-judge evaluation of model and RAG quality before you ship.
Customization	Fine-tuning, continued pre-training, distillation, and Reinforcement Fine-Tuning.
Prompt caching & cross-region	Cut cost/latency on repeated context; route to capacity in other regions automatically.

You want model optionality without re-architecting - swap Claude / Llama / Nova with a parameter.
You need managed RAG, guardrails, and evaluation without standing up infrastructure.
Your data and identity already live in AWS.

Rule of thumb

Start in Bedrock. Drop to SageMaker only when you need custom training, exotic hosting, or a model not in the catalog.

Region/model availability

Not every model is in every region. Confirm the exact model+region before you design around it; cross-region inference helps but has data-residency implications.

Cost surprises

On-demand token pricing varies widely by model. A flagship model in a chatty agent loop can be 10-30x the cost of a small model. Set budgets, cache prompts, and right-size the model per task.

Foundation Model Catalog

Indicative view of model families on Bedrock in 2026. Exact versions and regions change frequently - confirm in the console.

Official documentation ↗

Provider	Families	Typical use
Anthropic	Claude (Opus / Sonnet / Haiku tiers)	Top-tier reasoning, agents, coding, long context. The frontier default on Bedrock.
Amazon	Nova Micro / Lite / Pro / Premier; Canvas, Reel, Sonic	Cost/latency-optimized text and multimodal; image, video, and speech generation.
Meta	Llama (open weights)	Open-weight workloads, customization, on-prem parity.
Mistral	Mistral / Mixtral	Efficient European open-weight options.
DeepSeek	DeepSeek-R1 and successors	Strong open reasoning at low cost.
NVIDIA	Nemotron 3 (Super)	Open-weight frontier reasoning/agentic, hosted managed.
Cohere / AI21 / Stability	Command / Embed / Rerank, Jamba, Stable Diffusion / Image	Embeddings, reranking, long-context, image generation.

Embeddings + rerank

For RAG, pair an embedding model (Amazon Titan Text Embeddings, Cohere Embed) with a reranker (Cohere Rerank) for a quality lift at low engineering cost.

Amazon Nova

Amazon's own foundation-model family - optimized for price, latency, and AWS integration.

Official documentation ↗

Model	Modality	Best for
Nova Micro	Text	Cheapest, fastest text - classification, routing, simple extraction at scale.
Nova Lite	Multimodal (text+image/video in)	Low-cost multimodal understanding, high-volume workloads.
Nova Pro	Multimodal	Balanced capability/cost for most enterprise tasks and agents.
Nova Premier	Multimodal, most capable	Complex reasoning; also the teacher model for distillation.
Nova Canvas	Image generation	Studio-quality images with content credentials/watermarking.
Nova Reel	Video generation	Short-form video from text/image prompts.
Nova Sonic	Speech-to-speech	Real-time voice interactions with low latency.

Nova Forge (2026)

Forge SDK lets you customize Nova on domain data (fine-tune/distill) and deploy directly within Bedrock - useful when you want Nova's economics with your own task accuracy.

Positioning

Use Nova where cost and latency dominate and the task is well-scoped. For the hardest reasoning, A/B it against Claude on the same Bedrock API before committing.

Amazon Bedrock AgentCore

The managed runtime for production agents. Framework-agnostic - bring Strands, LangChain, OpenAI Agents SDK, or the Claude Agent SDK.

Official documentation ↗

Figure 2 - AgentCore modules. Mix and match; you don't have to adopt all of them.

Module	What it gives you	Status
Runtime / Harness	Managed serverless execution; declare and run an agent in ~3 API calls, no orchestration code.	GA
Memory	Short-term and long-term memory stores so agents retain context across turns and sessions.	GA
Gateway	Turn APIs, Lambda, and MCP servers into governed agent tools with auth and access control.	GA
Identity	Scoped, least-privilege access for agents; policies verified by Automated Reasoning (same tech as IAM/S3).	GA
Observability	Traces of every step, tool call, and where the agent went off track; evaluation against real traffic.	GA
Browser & Code Interpreter	Headless browsing and sandboxed code execution as managed tools.	GA
Web Search	Grounded, cited answers from the live web with zero data egress from your AWS environment.	GA
Payments	Agents autonomously pay for APIs, MCP servers, content, and other agents (Coinbase/Stripe).	Preview

Agentic payments = new risk class

An agent that can spend money needs hard budget caps, human-in-the-loop thresholds, and immutable audit. Treat AgentCore Payments as a controlled pilot, not a default.

Knowledge Bases & RAG

Managed retrieval-augmented generation - the most common enterprise GenAI pattern.

Official documentation ↗

Bedrock Knowledge Bases ingest from S3 and connectors, chunk and embed content, store vectors (OpenSearch Serverless, Aurora pgvector, S3 Vectors, and more), and retrieve relevant passages at query time. The 2026 fully managed version adds Smart Parsing (automatic multi-format prep: PDFs, tables, images) and an Agentic Retriever for multi-step queries.

Use Knowledge Bases when

You want managed RAG with minimal code, your corpus is mostly documents, and you value Smart Parsing and built-in retrieval quality.

Build your own when

You need fine control over chunking, hybrid search, custom rerankers, or a vector store you already operate (e.g. OpenSearch with bespoke pipelines).

Bedrock Data Automation

For multimodal corpora (documents, images, audio, video), BDA extracts structured output you can feed into a Knowledge Base - a cleaner pipeline than rolling your own parsers.

Guardrails & Governance

An independent safety layer you apply to any model - first-party or imported.

Official documentation ↗

Control	What it catches
Content filters	Hate, insults, sexual, violence, misconduct, prompt attacks - tunable thresholds.
Denied topics	Block subjects out of scope for your application.
Sensitive info / PII	Detect and redact or block PII and custom regex patterns.
Contextual grounding	Score answers for grounding against source and relevance to the query - reduce hallucination.
Automated Reasoning checks	Mathematically verify outputs against encoded policies/rules - high-assurance domains.

Apply at the platform layer

Guardrails sit between the app and the model, so the same policy applies regardless of which model the agent picks. Validate prompts and responses here, not only in app code.

SageMaker AI

Where you train, fine-tune, and host models when Bedrock's managed path isn't enough.

Official documentation ↗

Component	Use
JumpStart	One-click deploy/fine-tune of open and partner foundation models.
Training & Inference	Managed training jobs and real-time/serverless/async/batch endpoints with autoscaling.
HyperPod	Resilient, large-scale clusters for foundation-model pre-training and heavy fine-tuning (self-healing across thousands of accelerators).
Pipelines / Model Registry	MLOps: reproducible pipelines, lineage, approval gates, deployment.
Clarify / Model Monitor	Bias/explainability and drift detection in production.

Bedrock vs SageMaker

Bedrock = consume/customize managed models, fast. SageMaker = full control of training, hosting, and MLOps. Many teams use both: Bedrock for the app, SageMaker for the custom model behind it.

SageMaker Unified Studio

The single workspace over data, analytics, and AI - GA in 2026.

Official documentation ↗

Unified Studio brings EMR, Glue, Athena, Redshift, Bedrock, and SageMaker AI into one IDE on a lakehouse foundation, with Amazon Q Developer embedded for code, troubleshooting, and ETL. It replaces the older SageMaker Studio Classic experience and stitches the data and AI lifecycles together so the same governed data powers both analytics and model building.

LakehouseGlue / EMR / AthenaRedshiftBedrockQ DeveloperGovernance / catalog

Migration

If you run Studio Classic, plan the move to Unified Studio - newer Bedrock and governance features land here first.

Model Customization

Four ways to make a model better at your task, from cheapest to most involved.

Official documentation ↗

Technique	When	Cost/effort
Prompt + RAG	Most tasks - ground the model in your data without changing weights.	Low
Fine-tuning	Consistent style/format or narrow task accuracy from labeled examples.	Medium
Reinforcement Fine-Tuning	Optimize toward a reward signal where correctness is checkable (2026).	Medium-High
Distillation	Teach a small, cheap model from a large one - keep quality, cut cost/latency.	Medium
Continued pre-training	Inject large domain corpora; rarely needed for most enterprises.	High

Order of operations

Exhaust prompt engineering and RAG first. Fine-tune only when you have evidence the base model can't hit your accuracy/format bar. Distill once a fine-tuned large model proves out, to cut run-cost.

Amazon Q

AWS's family of GenAI assistants for developers, businesses, and operations.

Official documentation ↗

Product	What it does
Q Developer	Agentic coding and ops assistant - code generation, transformation/modernization, troubleshooting, and AWS console help. Embedded in IDEs and SageMaker Unified Studio.
Q Business	Enterprise RAG assistant over your apps and documents (40+ connectors), with access controls inherited from the source systems.
Amazon Quick	2026 evolution toward autonomous background agents with specialized expertise; an activity feed across email, messaging, calendar, and tasks.
Q in QuickSight / Connect	Natural-language BI and contact-center assistance embedded in those services.

Build vs buy

For internal knowledge assistants, pilot Q Business before building custom RAG - the connectors and permission inheritance save real engineering. Build custom on Bedrock when you need bespoke UX or logic Q can't express.

Applied AI Services

Task-specific managed APIs - no model selection, just call them.

Official documentation ↗

Service	Task
Rekognition	Image/video analysis: labels, faces, moderation, text-in-image.
Textract	Document extraction: text, forms, tables, queries from PDFs/images.
Comprehend	NLP: entities, sentiment, key phrases, PII, custom classification.
Transcribe	Speech-to-text with diarization, custom vocabulary, call analytics.
Polly	Text-to-speech with neural and generative voices.
Translate	Neural machine translation across many languages.
Lex	Conversational bots (the engine behind many IVR/chat flows).
Kendra	Enterprise search; the GenAI Index feeds RAG with permission-aware retrieval.
Personalize	Real-time recommendations from your interaction data.

Trend

Several classic tasks (doc extraction, classification, summarization) are increasingly done with Bedrock + a multimodal model or Bedrock Data Automation. Use the applied service when it is cheaper, lower-latency, or compliance-certified for that exact task; reach for Bedrock when you need flexibility.

Vectors & Data

Where your embeddings and ground-truth live. AWS gives you several stores - pick by scale, latency, and what you already run.

Official documentation ↗

Store	Best for
S3 Vectors	Cost-optimized vector storage/query at massive scale directly in S3 (2026) - cheapest for large, less latency-sensitive corpora.
OpenSearch Serverless (vector)	Low-latency hybrid (keyword + vector) search; the common Knowledge Base default.
Aurora / RDS PostgreSQL (pgvector)	Vectors next to relational data with transactional consistency.
MemoryDB / DocumentDB / Neptune Analytics	In-memory vectors, document-store vectors, and graph+vector analytics respectively.
Kendra GenAI Index	Managed, permission-aware retrieval index purpose-built for RAG.

Default

Most teams start with a Bedrock Knowledge Base on OpenSearch Serverless. Move to S3 Vectors for cost at scale, or pgvector when vectors must sit beside operational rows.

Chips & GPUs

The silicon under the stack. AWS's custom chips are the cost lever; NVIDIA GPUs are the compatibility/flexibility lever.

Official documentation ↗

Silicon	Role
Trainium2 / Trainium3	AWS training (and increasingly inference) accelerators; Trn2 UltraServers and Project Rainier power large Anthropic/enterprise training at strong price/perf.
Inferentia2	Cost-efficient high-volume inference.
EC2 P5 / P6 (NVIDIA Blackwell)	Top-end GPU training/inference; maximum framework compatibility.
EC2 G7 (RTX PRO Blackwell)	2026 graphics/inference tier for cost-effective serving and visual workloads.
UltraClusters / Capacity Blocks / HyperPod	Network-dense GPU/Trainium fabrics; reserve capacity windows; resilient FM-training clusters.

Architect's lever

For high-volume inference, benchmark Inferentia2/Trainium against GPU instances - the price difference can dominate TCO. Keep GPUs where you need a specific CUDA/framework path.

Architecture Patterns

The handful of shapes most AWS GenAI workloads fall into.

1. Managed RAG assistant

Bedrock + Knowledge Base (OpenSearch/S3 Vectors) + Guardrails, fronted by API Gateway/Lambda or Q Business. The default enterprise knowledge assistant.

2. Production agent

AgentCore Runtime + Memory + Gateway (tools/MCP) + Identity + Observability. Add Web Search for grounding. Human-in-the-loop on high-impact actions.

3. Custom model service

SageMaker fine-tune/host (or import to Bedrock) behind a private endpoint; distill to cut cost once quality is proven.

4. Multimodal pipeline

Bedrock Data Automation extracts from docs/images/audio/video into structured output feeding a Knowledge Base or warehouse.

5. Batch enrichment

Bedrock batch inference over large datasets in S3 for classification, summarization, or embedding generation at lowest cost.

6. Embedded BI/ops assistant

Q in QuickSight/Connect, or Q Developer in the SDLC - buy the assistant rather than build it.

Decision Matrix

Fast answers to the questions that come up in every design review.

Question	Default answer
Consume a model or train one?	Consume via Bedrock. Train/fine-tune in SageMaker only with evidence the base model can't meet the bar.
Which model?	Claude for hardest reasoning/agents; Nova for cost/latency; Llama/Mistral/DeepSeek for open-weight/customization. A/B on the same Bedrock API.
Build RAG or use Knowledge Bases?	Knowledge Bases unless you need bespoke chunking/hybrid/rerank control.
Bedrock Agents or AgentCore?	AgentCore for anything heading to production - managed memory, tools, identity, observability.
Which vector store?	OpenSearch Serverless default; S3 Vectors for cost at scale; pgvector for vectors beside relational data.
Buy an assistant or build?	Q Business/Q Developer first; build on Bedrock when you need custom UX/logic.
GPU or AWS silicon?	Trainium/Inferentia for cost at volume; NVIDIA for specific framework/CUDA needs.

Pricing & Cost Control

Shape, not exact numbers - rates change and vary by model/region. Always confirm in the AWS pricing pages.

Lever	How it bills	Control
Bedrock on-demand	Per input/output token, per model.	Right-size model per task; cache prompts; cap output tokens.
Provisioned Throughput	Reserved model units (hourly).	For steady high volume; commit only after you know the load.
Batch inference	Discounted vs on-demand.	Use for non-interactive enrichment jobs.
Knowledge Bases / OpenSearch	Storage + query + embedding tokens.	Tune chunk size; prune stale docs; pick S3 Vectors for cost.
SageMaker	Training + endpoint instance-hours.	Serverless/async endpoints; autoscale to zero where possible.
Agents	Model tokens x steps + tool calls.	Cap loop length; cheap model for routing, strong model only when needed.

The agent cost trap

Agent loops multiply token cost by the number of steps. A 10-step loop on a flagship model is the most common surprise bill. Budget per-conversation, log token usage, and route to small models for routine steps.

Risks & Gotchas

Read this one. What actually bites teams in production.

Model/region drift

Models and versions change and aren't uniform across regions. Pin model IDs, monitor deprecations, and test before auto-upgrading.

Runaway agent cost & actions

Unbounded loops and tool access cause both cost blowouts and unintended actions. Enforce step caps, budgets, least-privilege Identity, and human approval on high-impact tools. For AgentCore Payments, treat spend as a first-class control.

Data egress & residency

Cross-region inference and external tools (web search, third-party MCP) can move data. Confirm residency; prefer zero-egress options where compliance requires.

Service sprawl & lock-in

Mixing Bedrock, SageMaker, Q, and three vector stores creates operational complexity and AWS-specific coupling. Standardize on a few primitives; keep prompts/eval portable.

Hallucination in RAG

Retrieval quality, not the model, is usually the failure. Use contextual grounding guardrails, rerankers, and evaluation before blaming the LLM.

Quotas

Default account quotas (tokens/min, requests/min, concurrent training) throttle real workloads. Request increases early; design for backoff.

AWS vs OCI vs Azure vs GCP

A practitioner's quick read. Every cloud can do the basics; the differences are in defaults, data gravity, and silicon.

Dimension	AWS	OCI	Azure	GCP
Model breadth (managed)	Widest (Bedrock)	Broad (OCI Gen AI)	OpenAI + catalog	Gemini + Model Garden
Frontier own model	Nova (mid-tier); Claude hosted	None (partners)	OpenAI partnership	Gemini
Agents	AgentCore	Enterprise AI Agents	Foundry Agents	Vertex Agent Builder
Custom silicon	Trainium/Inferentia	GPU (NVIDIA)	Maia (emerging)	TPU
Vectors in source DB	pgvector/OpenSearch/S3	In-DB 26ai	pgvector/AI Search	AlloyDB/Vertex
Best when	You already run on AWS; want model choice + silicon economics	You run Oracle DB/EBS; want in-DB vectors + sovereignty	You're Microsoft-centric; want OpenAI + M365	You want Gemini + BigQuery data gravity

Honest take

The cloud you already run is usually the right one for GenAI - data gravity and IAM beat a marginally better model. Pick by where your data and identity live, then choose the model per task.

Sources

Primary AWS material used for this portal (June 2026). Verify specifics against current docs before committing - this space moves weekly.

Accuracy note

Compiled by Brijesh Gogia for expertoracle.com. Independent and not affiliated with Amazon/AWS. Model names, availability, and pricing change frequently - treat this as orientation, confirm in the AWS console/docs before designing.