AI Services Across Cloud Providers
A neutral deep-dive comparison of AI, ML, GenAI, agents, vector search, RAG, document/vision/speech AI, MLOps, governance, and AI infrastructure across OCI, AWS, Azure, Google Cloud, and other AI platforms - a practical reference for enterprise architects and engineers choosing the right AI platform for real workloads.
This portal does not favor any provider and makes no "best AI cloud" claim. Some AI services are close equivalents, some only partially similar, and some have no direct match - we say which, and we mark anything fast-moving as needing verification. AI is the fastest-changing area in cloud: model availability, regions, pricing, token/context limits, fine-tuning, agent features, and data-handling terms shift constantly. Treat concrete claims here as a starting point to verify, not a guarantee.
How to read this portal
The heart is the AI Service Comparison Matrix (section 1): a searchable, filterable table mapping each AI capability across providers with a match rating, maturity, enterprise-readiness, key difference, and risk. Deep-dive sections (2-11) go further by category; the decision sections (12-14) help you choose by workload and cost; and the reference sections (15-17) cover risk, troubleshooting, and learning paths.
Match types
Maturity levels
Enterprise readiness
Providers compared
OCI: OCI Generative AI + Agents, AI Vector Search in Oracle Database 23ai, Data Science, OCI AI Services.
AWS: Bedrock (multi-model), SageMaker, Kendra/OpenSearch, Textract/Rekognition/Comprehend.
Azure: Azure OpenAI / AI Foundry, Azure AI Search, Azure ML, AI Document Intelligence/Vision/Speech.
Google Cloud: Vertex AI / Gemini, Vertex AI Search, Vector Search, Document AI, BigQuery ML.
Reading the callouts
1. AI Service Comparison Matrix
A searchable, filterable map of AI capabilities across OCI, AWS, Azure, Google Cloud, and other platforms - with match rating, maturity, enterprise readiness, best-fit use, key difference, and risk.
| AI Capability | OCI | AWS | Azure | Google Cloud | IBM / Other | Match |
|---|---|---|---|---|---|---|
Match ratings are conservative and providers are not forced into every row. A "No direct" or "Conceptual" rating means the architecture changes when you move - read the key difference before assuming portability.
2. Generative AI Services Deep Dive
The main managed GenAI platforms compared neutrally - model access, RAG, agents, guardrails, private networking, enterprise controls, and best-fit use.
Every major cloud has a managed GenAI platform: OCI Generative AI + Agents, Amazon Bedrock, Azure OpenAI / AI Foundry, Vertex AI / Gemini, plus IBM watsonx.ai, Databricks Mosaic AI, and Snowflake Cortex. They differ most in model catalog (Bedrock is multi-provider; Azure centers on OpenAI/GPT; Vertex on Gemini + open; OCI is curated) and in how you wire private data, agents, and guardrails. Choose by the models you need, region/quota, enterprise controls, and where your data already lives - not by benchmarks.
Platform comparison
| Provider | GenAI platform | Model access | Agents | RAG | Guardrails | Private networking | Best-fit use | Key gotcha |
|---|---|---|---|---|---|---|---|---|
| OCI | Generative AI + GenAI Agents | Curated (Cohere, Llama, etc.) via API; dedicated clusters | GenAI Agents | Agent knowledge bases + DB Vector Search | Guardrails | Private endpoints | Oracle-data-centric RAG; enterprises on OCI | Verify model catalog + region availability |
| AWS | Amazon Bedrock (+ SageMaker JumpStart) | Many providers (Anthropic, Meta, Cohere, Amazon, etc.) | Bedrock Agents | Bedrock Knowledge Bases | Bedrock Guardrails | PrivateLink / VPC | Model choice + AWS-native governance | Model availability varies by region |
| Azure | Azure OpenAI / AI Foundry | OpenAI GPT family + catalog | AI Agent Service (Foundry) | Azure AI Search + OpenAI | AI Content Safety | Private Endpoints | OpenAI/GPT + Microsoft ecosystem | GPT model/region/quota gating |
| Google Cloud | Vertex AI / Gemini | Gemini + Model Garden (open + partner) | Vertex AI Agent Builder | Vertex AI Search / RAG Engine | Safety filters / Model Armor | Private Service Connect | Gemini + data/BigQuery integration | Feature/region availability |
| IBM/Other | watsonx.ai; Databricks Mosaic; Snowflake Cortex | Granite + open (IBM); open (Databricks); Cortex (Snowflake) | watsonx Orchestrate | watsonx Discovery; Cortex Search; Databricks | watsonx.governance | Varies | Governance focus (IBM); data-platform-native AI | Verify enterprise coverage per platform |
How enterprise data is connected (common to all)
Across every platform, connecting private data safely follows the same shape: ingest → chunk → embed → store vectors → retrieve (entitlement-filtered) → ground the model → audit, all behind a governed serving layer. What differs is the managed convenience (Knowledge Bases, Vertex AI Search, AI Search integration) and where vectors live. See sections 5-6.
3. Foundation Model Comparison
How foundation-model access differs across providers - first-party, third-party, and open models, fine-tuning, and private deployment. Model availability changes constantly, so this is a shape, not a live catalog.
Providers differ in how open their model access is: Bedrock offers the broadest multi-vendor catalog (Anthropic, Meta, Cohere, Amazon, etc.); Azure centers on OpenAI GPT models; Google on Gemini plus Model Garden (open + partner); OCI offers a curated set (Cohere, Llama, etc.); Hugging Face and Databricks/others give access to a large open ecosystem. Abstract the model behind your own serving layer so you can switch as the landscape shifts.
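The advice to abstract the model behind your own serving layer can be sketched as a small routing facade. This is a minimal illustration, not any provider's API: `ModelRouter`, the logical model name `summarizer`, and the two stub backends are all hypothetical, and in practice each backend would wrap a real provider SDK call.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A backend is any callable that turns a prompt into text.
# Real backends would wrap a provider SDK; these are stubs (assumption).
Backend = Callable[[str], str]


@dataclass
class ModelRouter:
    """Hypothetical serving-layer facade: callers name a logical model, not a vendor."""
    backends: Dict[str, Backend]
    routes: Dict[str, str]  # logical model name -> backend key

    def generate(self, logical_model: str, prompt: str) -> str:
        backend_key = self.routes[logical_model]
        return self.backends[backend_key](prompt)

    def reroute(self, logical_model: str, backend_key: str) -> None:
        # Switching providers is a routing change, not an application change.
        if backend_key not in self.backends:
            raise KeyError(f"unknown backend: {backend_key}")
        self.routes[logical_model] = backend_key


def stub_backend_a(prompt: str) -> str:
    return f"[backend-a] {prompt}"


def stub_backend_b(prompt: str) -> str:
    return f"[backend-b] {prompt}"


router = ModelRouter(
    backends={"a": stub_backend_a, "b": stub_backend_b},
    routes={"summarizer": "a"},
)
```

Application code calls `router.generate("summarizer", ...)` throughout; when a model's availability or price changes, only the `reroute` call changes.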
Model access, side by side (verify current state)
| Provider | Model platform | Example families (verify) | First-party | Third-party | Open-source | Fine-tuning | Private deploy | Notes |
|---|---|---|---|---|---|---|---|---|
| OCI | OCI Generative AI | Cohere, Meta Llama (verify) | Curated | Yes (partner) | Some | Some | Dedicated AI clusters | Curated catalog; verify current models |
| AWS | Amazon Bedrock | Anthropic Claude, Meta Llama, Cohere, Amazon Titan/Nova, Mistral (verify) | Titan/Nova | Broad (many vendors) | Yes (incl. via Bedrock/SageMaker) | Yes (varies by model) | VPC/PrivateLink | Broadest multi-vendor catalog |
| Azure | Azure OpenAI + Foundry catalog | OpenAI GPT family; catalog models (verify) | Microsoft/Phi | OpenAI + catalog | Via catalog | Yes (select models) | Private Endpoints | OpenAI-centric; region/quota gated |
| Google Cloud | Vertex AI + Model Garden | Gemini; open + partner models (verify) | Gemini/Gemma | Partner + open | Model Garden / open | Yes (select models) | PSC / private | Gemini + broad Garden |
| IBM / HF | watsonx.ai / Hugging Face | IBM Granite; large open ecosystem (verify) | Granite (IBM) | Some | Extensive (HF) | Yes | Varies | Open-model breadth (HF); governance (IBM) |
4. AI Agents Comparison
Managed agent platforms compared - tool/function calling, knowledge grounding, memory, guardrails, human approval, and the enterprise risks that apply to all of them.
An agent is an LLM that can call tools/functions, retrieve knowledge, keep memory, and take multi-step actions - beyond a chatbot. Every cloud has one (OCI GenAI Agents, Bedrock Agents, Azure AI Agent Service, Vertex AI Agent Builder) plus enterprise-SaaS agents (IBM watsonx Orchestrate, Salesforce Agentforce, ServiceNow). They are evolving fast and share the same risks: over-permissioning, direct database access, and unsafe dynamic SQL. The non-negotiable rule: agents act through governed APIs with least privilege, human approval for consequential actions, and full audit - never raw production databases.
Chatbot vs workflow bot vs autonomous agent
| Type | What it does | Autonomy | Risk |
|---|---|---|---|
| Chatbot | Answers questions (optionally grounded via RAG) | None - responds only | Low (wrong answers) |
| Workflow bot | Follows defined steps, calls known tools | Bounded - fixed flow | Medium (calls real systems) |
| Autonomous agent | Plans, chooses tools, takes multi-step actions | High - decides its own path | High (unpredictable actions) |
Agent platforms, side by side
| Provider | Agent service | Tool calling | Knowledge / RAG | Workflow integration | Human approval | Governance | Best-fit use | Main risk |
|---|---|---|---|---|---|---|---|---|
| OCI | OCI Generative AI Agents | Yes | Knowledge bases + DB Vector Search | OCI services / APIs | Design-dependent | IAM + audit | Grounded assistants over Oracle data | Verify current tool/action coverage |
| AWS | Bedrock Agents | Yes (action groups) | Bedrock Knowledge Bases | Lambda / API | Design-dependent | IAM + Guardrails + CloudTrail | Tool-using assistants on AWS | Over-permissioned action groups |
| Azure | AI Agent Service (Foundry) | Yes | Azure AI Search | Logic Apps / Functions / APIs | Design-dependent | Entra + Content Safety + logging | Microsoft-ecosystem agents | Grounding + identity scoping |
| Google Cloud | Vertex AI Agent Builder / Agentspace | Yes | Vertex AI Search | Cloud Functions / APIs | Design-dependent | IAM + safety + audit | Search-grounded agents on GCP | Data-access scoping |
| Enterprise SaaS | watsonx Orchestrate; Salesforce Agentforce; ServiceNow | Yes | Product knowledge + connectors | Native to the SaaS platform | Often built-in | Platform governance | Agents inside a SaaS (CRM/ITSM/HR) | Scope to the SaaS; verify data flows |
What to evaluate in an agent platform
- Tool / function calling - how tools are defined, scoped, and permissioned (least privilege per tool).
- Knowledge grounding - RAG quality + citations; entitlement-filtered retrieval.
- Memory - session vs long-term; where it is stored and secured.
- Guardrails + human approval - blocking unsafe actions; approval gates for consequential steps.
- Enterprise identity - does the agent act as a scoped identity (not a shared super-user)?
- Audit logging + monitoring - every tool call, retrieval, and action logged and reviewable.
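The criteria above (scoped tools, approval gates, a scoped identity, and full audit) can be illustrated with a small gateway that sits between an agent and the APIs it calls. This is a sketch under stated assumptions: `ToolGateway`, `Tool`, the agent identity `support-agent`, and both example tools are hypothetical and do not correspond to any platform's agent API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set


@dataclass
class Tool:
    name: str
    func: Callable[..., Any]
    consequential: bool = False  # True means a human must approve each call


@dataclass
class ToolGateway:
    """Hypothetical gateway: per-agent allowlist, approval gate, audit trail."""
    tools: Dict[str, Tool]
    allowed: Dict[str, Set[str]]  # agent identity -> tool names it may use
    audit_log: List[dict] = field(default_factory=list)

    def call(self, agent_id: str, tool_name: str, approved_by: str = "", **kwargs: Any) -> Any:
        entry = {"agent": agent_id, "tool": tool_name, "args": kwargs, "outcome": "denied"}
        self.audit_log.append(entry)  # record every attempt, including refused ones
        if tool_name not in self.allowed.get(agent_id, set()):
            raise PermissionError(f"{agent_id} may not call {tool_name}")
        tool = self.tools[tool_name]
        if tool.consequential and not approved_by:
            entry["outcome"] = "needs_approval"
            raise PermissionError(f"{tool_name} requires human approval")
        entry["outcome"] = "error"  # overwritten below if the tool returns normally
        result = tool.func(**kwargs)
        entry["outcome"] = "executed"
        entry["approved_by"] = approved_by
        return result


gateway = ToolGateway(
    tools={
        "lookup_order": Tool("lookup_order", lambda order_id: {"order_id": order_id, "status": "shipped"}),
        "issue_refund": Tool("issue_refund", lambda order_id: f"refunded {order_id}", consequential=True),
    },
    allowed={"support-agent": {"lookup_order", "issue_refund"}},
)
```

A read such as `gateway.call("support-agent", "lookup_order", order_id="A1")` runs immediately, while `issue_refund` raises until a caller supplies `approved_by`. Each attempt lands in `audit_log` either way.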
5. RAG and Knowledge Base Comparison
Retrieval-Augmented Generation - the model retrieves relevant enterprise data first, then answers grounded in it. Compared across providers, with architecture diagrams and the gotchas that decide answer quality.
RAG is the same pipeline everywhere: ingest → chunk → embed → store vectors → retrieve (entitlement-filtered) → rerank → ground the model → cite → audit. Managed options: Bedrock Knowledge Bases, Azure AI Search + OpenAI, Vertex AI Search / RAG Engine, OCI Agent knowledge bases + DB Vector Search, plus Databricks and Snowflake Cortex Search. The hard parts - chunking quality, retrieval relevance, security trimming, and index freshness - are the same on every platform and matter more than the model.
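To make the chunk step of that pipeline concrete, here is a minimal sliding-window chunker. It is a sketch that assumes word-based windows for simplicity; production pipelines usually count tokens and split on document structure (headings, paragraphs), and the function name and default sizes are illustrative only.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into windows of chunk_size words; each shares `overlap` words with the previous one."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of storing and embedding some words twice.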
RAG options, side by side
| Provider | Managed RAG | Vector store options | Reranking | Citations | Access control |
|---|---|---|---|---|---|
| OCI | GenAI Agents knowledge bases | Oracle DB 23ai AI Vector Search; OpenSearch; Object Storage | Design/model-dependent | Supported | DB/IAM entitlements |
| AWS | Bedrock Knowledge Bases | OpenSearch; Aurora pgvector; (others) | Rerank models | Supported | IAM + source ACLs |
| Azure | Azure AI Search (+ OpenAI) | AI Search vector; Cosmos DB; PostgreSQL pgvector | Semantic ranker | Supported | Security trimming in index |
| Google Cloud | Vertex AI Search / RAG Engine | Vertex Vector Search; AlloyDB/Cloud SQL pgvector; BigQuery | Ranking API | Supported | IAM + data-store ACLs |
| Other | Databricks; Snowflake Cortex Search; watsonx Discovery | Platform-native vector | Varies | Varies | Platform governance |
RAG architectures
Vectors live in the operational DB (Oracle 23ai, AlloyDB, pgvector). Retrieval inherits DB IAM, backups, and row-level security - simplest path to entitlement-filtered retrieval. Best when data already lives in that DB.
Docs in object storage, indexed by a search service (AI Search, Vertex AI Search, OpenSearch/Kendra). Best for unstructured document corpora and hybrid (keyword+vector) search with connectors.
RAG gotchas (identical on every platform)
- Bad chunking creates bad answers - chunk size/overlap and document structure dominate quality.
- Retrieval quality matters more than model hype - a great model on poor retrieval still hallucinates.
- Security trimming is hard - and must be applied when candidates are selected, before anything reaches the model, not by filtering the answer afterwards. Enforce entitlements at index/retrieval time.
- Stale indexes create wrong answers - automate re-indexing; track data freshness.
- Vector search cost grows with corpus size and query rate - size indexes and monitor.
- RAG does not eliminate hallucination - it reduces it; still validate outputs and require citations.
- Access control before retrieval, not only after - never rely on the model to withhold data it was given.
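The access-control gotchas above can be sketched as a retrieval function that trims by entitlement before it ranks. This is an in-memory illustration with brute-force cosine similarity. The `Chunk` type, the group names, and the two-dimensional embeddings are hypothetical, and a real system would push the same filter into the vector store's metadata query.

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    text: str
    embedding: Sequence[float]
    allowed_groups: FrozenSet[str]  # ACL metadata stored alongside the vector


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: Sequence[float], user_groups: FrozenSet[str],
             index: List[Chunk], k: int = 3) -> List[Chunk]:
    # Trim first: chunks the user may not see never become candidates,
    # so they can never be placed in the prompt.
    visible = [c for c in index if c.allowed_groups & user_groups]
    return sorted(visible, key=lambda c: cosine(query_vec, c.embedding), reverse=True)[:k]


index = [
    Chunk("hr-1", "salary bands", [1.0, 0.0], frozenset({"hr"})),
    Chunk("pub-1", "holiday calendar", [0.9, 0.1], frozenset({"all-staff"})),
    Chunk("pub-2", "parking rules", [0.0, 1.0], frozenset({"all-staff", "hr"})),
]
```

With the query `[1.0, 0.0]`, a user in `all-staff` gets `pub-1` then `pub-2`. The closest match, `hr-1`, is never considered for that user, so no downstream prompt or answer filter has to catch it.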
6. Vector Search and AI Database Comparison
Where to store and search embeddings - in an operational database, a dedicated vector service, or a specialist vector DB - compared neutrally, with a decision guide by data location and scale.
Two broad choices: vectors in an operational database (Oracle DB 23ai, pgvector on Aurora/AlloyDB/Cloud SQL/Azure PostgreSQL, Cosmos DB) - which inherit existing IAM, backups, and row-level security - or a dedicated vector/ANN service (Azure AI Search, Vertex Vector Search, OpenSearch, or specialist DBs like Pinecone/Weaviate/Milvus/Qdrant) for large-scale, low-latency semantic search. Keep vectors near the governed data when you can; go dedicated when scale/latency demands it.
Vector search options, side by side
| Option | Where vectors live | Managed? | Hybrid search | Metadata filter | Security model | Best-fit |
|---|---|---|---|---|---|---|
| Oracle DB 23ai AI Vector Search | In Oracle Database | Yes (managed DB) | Yes (+ SQL) | Yes (SQL WHERE) | DB IAM + row/label security | Vectors next to Oracle relational data |
| Aurora/RDS pgvector | In PostgreSQL | Yes (managed DB) | Via SQL/extensions | Yes (SQL) | DB IAM | Existing Postgres on AWS |
| OpenSearch / Kendra | Dedicated index | Yes | Yes | Yes | Fine-grained + source ACLs | AWS-native large-scale search |
| Azure AI Search | Dedicated index | Yes | Yes (+ semantic) | Yes | Security trimming | Azure OpenAI RAG default |
| Cosmos DB / PG pgvector | In the database | Yes | Partial | Yes | DB RBAC | Vectors with operational data |
| Vertex Vector Search | Dedicated ANN | Yes | Filter-based | Yes | IAM | Very large-scale, low-latency |
| AlloyDB / Cloud SQL pgvector; BigQuery | In DB / warehouse | Yes | Via SQL | Yes | DB/warehouse security | Vectors with GCP data |
| Databricks / Snowflake Cortex Search | In the data platform | Yes | Yes | Yes | Platform governance | Vectors next to lakehouse/warehouse data |
| Pinecone / Weaviate / Milvus / Qdrant | Dedicated vector DB | Managed or self | Varies | Yes | Own model | Cloud-neutral / specialist scale |
Decision guide (neutral)
| Situation | Neutral guidance |
|---|---|
| Data is in Oracle Database | Oracle DB 23ai AI Vector Search - vectors + relational data + governance in one place. |
| AWS-native RAG | OpenSearch or Aurora pgvector with Bedrock Knowledge Bases. |
| Azure OpenAI pattern | Azure AI Search (hybrid + semantic) is the common default. |
| GCP data + AI | Vertex Vector Search for scale; AlloyDB/BigQuery vectors to stay near the data. |
| Large-scale semantic search | Dedicated ANN (Vertex Vector Search, OpenSearch, or specialist DBs). |
| Existing PostgreSQL users | pgvector (any cloud) - lowest friction; verify performance at scale. |
| Data-warehouse-integrated search | BigQuery vector search, Snowflake Cortex Search, or Databricks Vector Search. |
| Cloud-neutral / specialist | Pinecone, Weaviate, Milvus, or Qdrant - portable across clouds. |
7. Machine Learning Platform Comparison
Full ML/MLOps platforms compared - training, deployment, pipelines, registry, monitoring, and governance - plus neutral guidance on when a managed platform is worth it.
The end-to-end ML platforms - SageMaker, Vertex AI, Azure ML, OCI Data Science, plus Databricks Mosaic AI and IBM watsonx.ai - cover notebooks, training, deployment, pipelines, registry, and monitoring. SageMaker and Vertex are generally the broadest; Databricks is a common cross-cloud choice. Use a managed platform when you need reproducible pipelines, governance, and scale; a plain notebook or in-database ML may be enough for smaller work.
ML platforms, side by side
| Provider | Platform | Best strength | Training | Deployment | MLOps | Governance | Best-fit users | Main limitation |
|---|---|---|---|---|---|---|---|---|
| OCI | Data Science | Oracle-data integration; AI Quick Actions | Jobs | Model Deployment | Pipelines | IAM + audit | Oracle-centric teams | Smaller ecosystem than AWS/GCP |
| AWS | SageMaker | Breadth + ecosystem | Managed/distributed | Endpoints (real-time/batch/serverless) | Pipelines + Registry + Monitor | Clarify + IAM | Broad ML teams | Complexity/choice overload |
| Azure | Azure Machine Learning | Microsoft + Responsible AI tooling | Managed/distributed | Managed endpoints | Pipelines + Registry | Responsible AI dashboard | Microsoft-centric teams | Learning curve; naming |
| Google Cloud | Vertex AI | Data + AI integration; TPUs | Managed/distributed (TPU) | Endpoints | Pipelines + Registry + Monitoring | Explainable AI | Data/AI-led teams | Enterprise familiarity varies |
| Databricks / IBM | Mosaic AI / watsonx.ai | Lakehouse-native (Databricks); governance (IBM) | Yes | Model Serving | MLflow / Workflows | Unity Catalog / watsonx.governance | Lakehouse or governance-led teams | Verify multi-cloud coverage |
When to use what (neutral)
- Use a managed ML platform when you need reproducible pipelines, a model registry with approvals, managed endpoints, monitoring, and governance at team/enterprise scale.
- A simple notebook is enough for exploration, one-off analysis, or a single small model with light serving needs.
- Kubernetes-based ML (Kubeflow, KServe) fits teams who want portability and already run Kubernetes - at the cost of more ops.
- Data-warehouse ML (BigQuery ML, Redshift ML, Oracle in-DB, Snowflake) fits when the data lives in the warehouse and SQL-based ML is sufficient - minimal data movement.
- Do not build ML infrastructure at all when a prebuilt AI service (Document AI, Language, Vision) or a foundation model already solves the task - many tasks that once needed a custom model are now API calls.
8. AI Infrastructure and Accelerators
GPUs, custom accelerators (TPU, Trainium/Inferentia), Kubernetes for AI, and the infrastructure gotchas - quota, region availability, and idle-GPU cost - that decide whether an AI project ships.
NVIDIA GPUs are available on every cloud (portable); the differentiators are custom silicon - TPU (Google) and Trainium/Inferentia (AWS) - and bare-metal GPU breadth (OCI). All offer managed training/inference and Kubernetes for AI. The infrastructure realities that actually block projects are the same everywhere: GPU quota, region availability, data-pipeline bottlenecks, and idle-GPU cost. Verify accelerator availability in your region early.
AI infrastructure, side by side
| Provider | Accelerator options | Managed training | Managed inference | Kubernetes for AI | Bare metal GPU | Best-fit workload | Main constraint |
|---|---|---|---|---|---|---|---|
| OCI | NVIDIA GPU shapes + bare metal; cluster networking (RDMA) | Data Science | Model Deployment | OKE | Broad | Large training on bare-metal GPU clusters | Verify GPU SKU by region |
| AWS | NVIDIA (P/G) + Trainium + Inferentia | SageMaker | SageMaker Endpoints | EKS | .metal (narrower) | Cost/perf at scale with custom silicon | Quota + custom-silicon lock-in |
| Azure | NVIDIA N-series (+ Maia) | Azure ML | Azure ML endpoints | AKS | Specialized | Microsoft-ecosystem AI + big GPU | Region/SKU availability |
| Google Cloud | NVIDIA GPU + TPU | Vertex Training | Vertex Endpoints | GKE | (Bare Metal Solution) | Large-scale training (TPU) + inference | TPU ties the serving stack |
| Other | NVIDIA (CoreWeave, Lambda, etc.) | Databricks | Model Serving | Kubernetes | Varies | GPU-focused / neutral | Verify integration + support |
AI infrastructure gotchas (universal)
- GPU quota can block projects - default quotas are low; request increases early on every cloud.
- Region availability matters - the accelerator you want may not exist in your region; verify before designing.
- The data pipeline can bottleneck GPUs - slow data loading starves expensive accelerators; size storage/network for the training I/O.
- Network and storage can dominate performance - for distributed training, interconnect (RDMA/InfiniBand) and storage throughput often matter more than raw GPU count.
- Idle GPUs are expensive - the most common AI-infra waste; auto-stop, schedule, or use spot for interruptible work.
- Inference cost can exceed training cost over time - an always-on endpoint runs 24x7; batch or scale-to-zero where latency allows.
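The idle-GPU and always-on-endpoint points above come down to simple arithmetic. The sketch below uses a hypothetical hourly rate and usage pattern, not real prices, and it ignores cold-start latency and minimum billing increments, which affect whether scale-to-zero is practical.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8760 / 12)


def monthly_cost(hourly_rate: float, busy_hours: float, always_on: bool) -> float:
    """Cost of one accelerator instance for a month."""
    billed_hours = HOURS_PER_MONTH if always_on else busy_hours
    return hourly_rate * billed_hours


def utilization(busy_hours: float) -> float:
    """Fraction of the month the accelerator does useful work."""
    return busy_hours / HOURS_PER_MONTH


rate = 4.00    # hypothetical USD per GPU-hour, not a real price
busy = 8 * 22  # 8 busy hours on 22 working days = 176 hours

always_on = monthly_cost(rate, busy, always_on=True)    # 4.00 * 730 = 2920.0
on_demand = monthly_cost(rate, busy, always_on=False)   # 4.00 * 176 = 704.0
idle_waste = always_on - on_demand                      # 2216.0
```

Under these assumed numbers the instance is busy about 24% of the month, so roughly three quarters of an always-on bill pays for idle time.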
9. Document, Vision, Speech, and Language AI
Prebuilt AI services for documents, images, audio, and text - close equivalents across clouds, but with differences in custom-model support, human review, language coverage, and privacy constraints.
These prebuilt AI services are largely close equivalents across OCI/AWS/Azure/GCP - the same tasks with different names, differing in custom-model support, human-in-the-loop review, language coverage, and privacy handling. Increasingly, general-purpose LLMs overlap with some of these (extraction, classification, summarization) - choose the prebuilt service for accuracy/cost on well-defined tasks, and an LLM when flexibility matters. Verify language and feature coverage for your specific use.
Document AI (OCR, forms, tables)
| Capability | OCI | AWS | Azure | Google Cloud |
|---|---|---|---|---|
| Service | Document Understanding | Textract | AI Document Intelligence | Document AI |
| OCR / tables / forms | Yes | Yes | Yes | Yes |
| Custom models | Yes | Yes (custom queries/adapters) | Yes (custom extraction) | Yes (custom processors) |
| Human review | Design-dependent | A2I | Design-dependent | Human-in-the-loop |
| Best-fit | Oracle-integrated doc pipelines | AWS doc pipelines, invoices | Microsoft-ecosystem forms | High-volume document extraction |
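Whichever service does the extraction, the human-review row usually reduces to a confidence threshold: accept fields the service is sure about and queue the rest for a person. This sketch assumes the service reports a per-field confidence between 0 and 1. The `ExtractedField` type, the 0.90 threshold, and the sample invoice fields are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by the extraction service (assumption)


def route_fields(fields: List[ExtractedField], threshold: float = 0.90
                 ) -> Tuple[Dict[str, str], List[ExtractedField]]:
    """Split fields into auto-accepted values and a queue for human review."""
    accepted: Dict[str, str] = {}
    review_queue: List[ExtractedField] = []
    for f in fields:
        if f.confidence >= threshold:
            accepted[f.name] = f.value
        else:
            review_queue.append(f)
    return accepted, review_queue


invoice = [
    ExtractedField("invoice_number", "INV-1042", 0.99),
    ExtractedField("total", "1,250.00", 0.97),
    ExtractedField("due_date", "2O24-03-01", 0.62),  # OCR confused 0 and O
]
accepted, review_queue = route_fields(invoice)
```

Here `invoice_number` and `total` are accepted and `due_date` goes to review. The threshold is a tuning decision to calibrate against labeled samples for each document type.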
Vision AI
| Capability | OCI | AWS | Azure | Google Cloud |
|---|---|---|---|---|
| Service | OCI Vision | Rekognition | Azure AI Vision | Vision AI |
| Classification / detection | Yes | Yes | Yes | Yes |
| Custom vision | Yes | Custom Labels | Custom Vision | AutoML Vision |
| Face / moderation | Limited | Yes | Yes (gated) | Yes (gated) |
Speech AI
| Capability | OCI | AWS | Azure | Google Cloud |
|---|---|---|---|---|
| Speech-to-text | OCI Speech | Transcribe | Azure AI Speech | Speech-to-Text |
| Text-to-speech | (Speech) | Polly | Neural TTS | Text-to-Speech |
| Real-time / batch | Both | Both | Both | Both |
| Diarization | Yes | Yes | Yes | Yes |
Language AI
| Capability | OCI | AWS | Azure | Google Cloud |
|---|---|---|---|---|
| Service | OCI Language | Comprehend | Azure AI Language | Natural Language AI |
| Sentiment / entities / PII | Yes | Yes | Yes | Yes |
| Classification | Yes (custom) | Yes (custom) | Yes (custom) | Yes (AutoML) |
| Translation | Yes | Translate (separate) | Translator (separate) | Cloud Translation (separate) |
10. AI and Enterprise Search
AI-powered enterprise search - keyword, semantic, and hybrid - with connectors, security trimming, and RAG integration, compared across providers.
Enterprise search now blends keyword + vector (semantic) + hybrid ranking, with connectors to enterprise systems and (critically) security trimming so users only see what they are entitled to. Turnkey options: Kendra/OpenSearch, Azure AI Search, Vertex AI Search / Agentspace, plus OCI Search / AI Vector Search and Elastic/Glean. These are also the retrieval layer for RAG. The hard part - source-level access control preserved in the index - is the same everywhere.
AI search, side by side
| Provider | Search service | Keyword | Vector | Hybrid | Connectors | RAG integration | Access control | Best use | Gotcha |
|---|---|---|---|---|---|---|---|---|---|
| OCI | OCI Search (OpenSearch) / AI Vector Search | Yes | Yes | Yes | Custom | Via GenAI Agents | DB/IAM | Oracle-data + open-source search | Assemble connectors |
| AWS | Amazon Kendra / OpenSearch | Yes | Yes | Yes | Many (Kendra) | Bedrock KB | Token-based ACLs | Turnkey enterprise search (Kendra) | Kendra cost at scale |
| Azure | Azure AI Search | Yes | Yes | Yes (+ semantic ranker) | Indexers | Azure OpenAI | Security trimming | Azure OpenAI RAG default | Design security trimming carefully |
| Google Cloud | Vertex AI Search / Agentspace | Yes | Yes | Yes | Connectors | Native RAG | IAM + ACLs | Turnkey grounded search | Verify connector coverage |
| Other | Elastic; Glean; OpenSearch managed | Yes | Yes | Yes | Broad | Varies | Own model | Cloud-neutral / SaaS-wide search | Verify governance model |
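The Hybrid column means merging a keyword ranking with a vector ranking into one result list. One common, simple way to do that is reciprocal rank fusion (RRF), sketched below. This is an illustrative technique chosen for the example; this comparison does not state which fusion method each service uses, and the document IDs and the constant `k = 60` (a conventional default) are assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked ID lists: a document scores sum(1 / (k + rank)) over the lists it appears in."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


keyword_hits = ["doc-a", "doc-b", "doc-c"]  # hypothetical keyword (BM25-style) results
vector_hits = ["doc-c", "doc-a", "doc-d"]   # hypothetical semantic results
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Documents found by both retrievers (`doc-a`, `doc-c`) rise above those found by only one. RRF uses ranks only, so it needs no score normalization between the two retrievers.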
11. AI Governance, Security, and Responsible AI
The enterprise controls that make AI safe to run on real data - privacy, private networking, access control, logging, guardrails, responsible AI, and a production governance checklist that applies across providers.
AI governance is mostly your configuration, not a product. The controls are the same across clouds - private endpoints, native IAM, prompt/output logging, guardrails/content safety, CMK encryption, data-retention terms, human approval, and responsible-AI checks - implemented with each cloud's tools (Bedrock Guardrails, Azure AI Content Safety + Entra + Private Link, Google Model Armor + VPC-SC, OCI IAM/Vault, IBM watsonx.governance). Treat AI as a new attack surface (prompt injection, data exfiltration) and govern it deliberately from day one.
Governance controls, side by side
| Control | OCI | AWS | Azure | Google Cloud |
|---|---|---|---|---|
| Identity / access | IAM | IAM | Entra ID + RBAC | Cloud IAM |
| Private networking | Private endpoints | PrivateLink / VPC | Private Link / Private Endpoints | Private Service Connect |
| Guardrails / content safety | Guardrails | Bedrock Guardrails | AI Content Safety | Model Armor / safety filters |
| Key management | Vault (CMK) | KMS | Key Vault | Cloud KMS |
| Secrets | Vault | Secrets Manager | Key Vault | Secret Manager |
| Prompt / output logging | Audit + Logging | CloudTrail + Bedrock logs | Activity Log + Foundry logs | Audit Logs + Vertex logs |
| Data-exfil perimeter | (network + IAM) | (SCP + endpoints) | (Private Link + Policy) | VPC Service Controls |
| Sensitive-data discovery | Data Safe | Macie | Purview | Sensitive Data Protection |
| Responsible AI tooling | (guidance) | SageMaker Clarify | Responsible AI dashboard | Explainable AI |
| Model + use-case governance | (policy + inventory) | (SageMaker + Config) | (Purview + RAI) | (Registry + policy) |
Production AI governance checklist (portable)
- Approved use case - documented, with a business owner and a risk assessment.
- Data classification - what data the AI touches, and its sensitivity level.
- Model selection - approved model(s), with data-handling/retention terms verified.
- Prompt logging policy - prompts + retrieved context IDs logged (per privacy rules).
- Output review policy - outputs logged; human review for consequential answers.
- Access control - least-privilege scoped identity; security-trimmed retrieval before generation.
- Private networking - private endpoints for model, retrieval, and data services.
- Data retention terms - confirmed the provider does not retain/train on your data (or terms accepted).
- Human approval requirement - for any write/action or high-impact output.
- Abuse / injection monitoring - content safety on input+output; prompt-injection detection.
- Cost monitoring - token/GPU/vector spend tracked with budgets/alerts.
- Incident response - a plan for a leaked prompt, bad answer, or compromised agent.
- Legal / compliance review - completed for the use case and data.
- Vendor documentation verified - model, region, features, and terms confirmed current.
12. Enterprise AI Architecture Patterns
Common enterprise AI patterns, each mapped to OCI/AWS/Azure/GCP - the pattern shape is portable; the services differ. Every pattern shares the same governed-serving-layer backbone.
Pattern catalog
| Pattern | OCI | AWS | Azure | Google Cloud | Key risk |
|---|---|---|---|---|---|
| Chat with documents | GenAI Agents + Vector Search | Bedrock KB + OpenSearch | Azure OpenAI + AI Search | Gemini + Vertex AI Search | Chunking/freshness; security trimming |
| Chat with database | Select AI / DB 23ai | Bedrock + curated views | OpenAI + curated views | Gemini + curated views | Never raw prod OLTP; use serving layer |
| Natural language to SQL | Select AI (Autonomous DB) | Bedrock + QuickSight Q | Fabric/Copilot + OpenAI | BigQuery + Gemini | Validate/parameterize; read-only |
| AI assistant for IT ops | Ops Insights + GenAI | DevOps Guru + Bedrock | Azure Monitor + Copilot | Active Assist + Gemini | Human approval before actions |
| AI assistant for business users | GenAI Agents + curated data | Bedrock + governed data | Copilot + governed data | Agentspace + governed data | Answer only from curated data |
| Customer support chatbot | Digital Assistant + GenAI | Lex + Bedrock | Bot Service + OpenAI | Dialogflow + Gemini | Grounding + human escalation |
| Contact center transcription/summary | Speech + GenAI | Connect + Contact Lens + Bedrock | Speech + OpenAI | CCAI + Gemini | Recording compliance; real-time |
| Invoice / document processing | Document Understanding | Textract + Bedrock | Document Intelligence + OpenAI | Document AI + Gemini | Human review of low-confidence |
| Enterprise knowledge search | OCI Search / Vector Search | Kendra / OpenSearch | Azure AI Search | Vertex AI Search / Agentspace | Source-level security trimming |
| RAG over object storage | Object Storage + Vector Search | S3 + Bedrock KB | Blob + AI Search | Cloud Storage + Vertex Search | Index freshness + ACLs |
| RAG over database | DB 23ai Vector Search | Aurora pgvector | Cosmos / PG pgvector | AlloyDB pgvector | Row-level entitlements |
| RAG over data warehouse | ADW + Select AI | Redshift ML + Bedrock | Fabric + OpenAI | BigQuery vector + Gemini | Query cost + column security |
| AI over Oracle EBS / ERP data | Read-only reporting layer + GenAI | Extract to lake + Bedrock | Extract + OpenAI | Extract + Gemini | Never live ERP; performance + governance |
| AI over CRM data | Governed API + GenAI | Bedrock + governed API | OpenAI + Dataverse/API | Gemini + governed API | PII handling; entitlements |
| AI code assistant | (GenAI + code models) | Amazon Q Developer | GitHub Copilot / Foundry | Gemini Code Assist | IP / secret leakage in prompts |
| MLOps train/deploy pipeline | Data Science pipelines | SageMaker Pipelines | Azure ML pipelines | Vertex Pipelines | Reproducibility; drift monitoring |
| Real-time recommendations | ML + serving | SageMaker + feature store | Azure ML + serving | Vertex + feature store | Latency; feature skew |
| Forecasting / anomaly detection | Anomaly Detection / ML | SageMaker / Lookout | Azure ML | BigQuery ML / Vertex | Verify current managed service |
| AI governance & audit | IAM + Audit + inventory | Guardrails + CloudTrail | Content Safety + Purview | Model Armor + audit | Shadow AI; missing audit trail |
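For the natural-language-to-SQL and chat-with-database rows, the "validate/parameterize; read-only" risk control can be sketched with the standard-library `sqlite3` module. This is a deliberately crude illustration: the `orders` table is hypothetical, the statement check is a simple prefix test rather than a SQL parser, and `PRAGMA query_only` is SQLite-specific. On a production database the stronger control is a read-only role over curated views.

```python
import sqlite3


def run_readonly_query(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> list:
    """Run one SELECT with bound parameters on a connection that refuses writes."""
    statement = sql.strip().rstrip(";")
    if ";" in statement:
        # Crude: also rejects semicolons inside string literals.
        raise ValueError("only a single statement is allowed")
    if not statement.lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    conn.execute("PRAGMA query_only = ON")  # SQLite-specific: engine rejects writes
    # User-supplied values travel as bound parameters, never via string formatting.
    return conn.execute(statement, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "emea", 120.0), (2, "apac", 80.0), (3, "emea", 40.0)],
)
conn.commit()

rows = run_readonly_query(conn, "SELECT SUM(total) FROM orders WHERE region = ?", ("emea",))
```

The two checks work in layers: the text check rejects obvious non-SELECT statements early, and the engine-level read-only setting still blocks writes if something slips past it.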
Featured pattern: governed enterprise RAG
| Aspect | Detail |
|---|---|
| Business use | Employees ask questions and get grounded, cited answers from internal documents they are entitled to see. |
| Data flow | Ingest docs → chunk + embed → store vectors with ACL metadata → at query time: authN/authZ → entitlement-filtered retrieval + rerank → grounded generation → cited, audited answer. |
| Identity | User authenticates to the serving layer (native IdP); the app carries the user's entitlements into retrieval. |
| Security | Private endpoints for model + retrieval + storage; security-trimmed retrieval; content safety on input/output; secrets in the vault; CMK. |
| Monitoring | Log prompts, retrieved context IDs, and outputs; track answer quality/groundedness and token cost. |
| Cost drivers | Tokens (context size), embeddings, vector storage/queries, and model choice. |
| Best-fit provider conditions | Follow the data: Oracle data → OCI; AWS lake → AWS; Microsoft/M365 → Azure; BigQuery/GCP data → Google. Verify models/region. |
| Common mistakes | Retrieval not entitlement-filtered; stale index; sending whole documents (cost); no citations/audit; connecting the model to raw production data. |
- Connecting the model/agent to raw production OLTP instead of a governed serving layer.
- Retrieval not security-trimmed - data leaks across users.
- Sending entire documents to the model (cost + context dilution) instead of retrieved chunks.
- No citations, no audit trail - answers can't be explained or defended.
- Stale indexes; poor chunking - wrong answers regardless of model.
- Designing around one model that later changes availability/price.
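The entitlement-filtered retrieval step in the data flow above can be sketched as follows. This is a minimal illustration, not any provider's API: the `Chunk` type, the `retrieve` function, the group names, and the two-dimensional vectors are all hypothetical, and a real system would use a managed vector store and a real embedding model.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    text: str
    vector: tuple[float, ...]
    allowed_groups: frozenset[str]  # ACL metadata stored next to the vector


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vector, user_groups, index, top_k=3):
    """Filter by entitlement first, then rank only what the user may see."""
    visible = [c for c in index if c.allowed_groups & user_groups]
    ranked = sorted(visible, key=lambda c: cosine(query_vector, c.vector), reverse=True)
    return ranked[:top_k]


# Hypothetical toy index; vectors are hand-written stand-ins for embeddings.
INDEX = [
    Chunk("hr-1", "Leave policy", (1.0, 0.0), frozenset({"hr", "all-staff"})),
    Chunk("fin-1", "Payroll bands", (0.9, 0.1), frozenset({"finance"})),
    Chunk("eng-1", "On-call guide", (0.0, 1.0), frozenset({"engineering"})),
]

hits = retrieve((1.0, 0.0), frozenset({"all-staff", "engineering"}), INDEX)
cited_ids = [c.chunk_id for c in hits]  # IDs to cite and to write to the audit log
```

The design point is ordering: the ACL filter runs before ranking, so a chunk the user is not entitled to see can never reach the model's context, however similar it is to the query.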
13. AI Workload Decision Matrix
For each AI workload, which providers are strong candidates and why, in deliberately balanced wording. A "strong candidate" reflects natural advantages under common conditions, not a verdict; any provider can be valid depending on your data, ecosystem, and skills.
| Workload | Strong candidate(s) | Why they fit | Services to evaluate | Main trade-off / risk |
|---|---|---|---|---|
| Enterprise RAG | All (follow the data) | Every cloud has managed RAG; fit follows where docs/data live | Bedrock KB / AI Search / Vertex Search / OCI Agents | Retrieval quality + security trimming, not model |
| Chatbot over internal docs | All | Standard RAG pattern everywhere | Managed RAG + vector store | Chunking/freshness |
| Chat with relational database | Follow the DB | In-DB AI where data lives (Oracle 23ai, BigQuery, AlloyDB) | Select AI / pgvector / BigQuery ML | Never raw prod OLTP; serving layer |
| Natural language to SQL | OCI (Select AI), GCP (BigQuery+Gemini) | Native NL-to-SQL where the DB/warehouse is | Select AI, BigQuery+Gemini, Databricks Genie | Uncontrolled dynamic SQL |
| AI over Oracle Database | OCI (others via extract) | Oracle DB 23ai AI Vector Search + Select AI in the DB | OCI GenAI + DB 23ai | Others need data extraction |
| AI over Microsoft ecosystem | Azure | M365/Entra/Copilot + Azure OpenAI integration | Azure OpenAI, AI Search, Copilot | Ecosystem lock-in |
| AI over Google data ecosystem | GCP | BigQuery ML + Vertex + Gemini integration | BigQuery ML, Vertex AI | Verify enterprise familiarity |
| AI over AWS data lake | AWS | S3 + Bedrock + SageMaker + native governance | Bedrock, SageMaker, OpenSearch | Complexity |
| Document extraction / OCR at scale | All (close equivalents) | Mature document AI on all four | Textract / Doc Intelligence / Document AI / OCI DU | Human review of low-confidence |
| Contact center AI | AWS (Connect), GCP (CCAI) | Turnkey contact-center platforms | Connect+Contact Lens, CCAI | Recording compliance |
| Code assistant | All (verify) | Amazon Q, GitHub Copilot, Gemini Code Assist | Q Developer, Copilot, Gemini | IP/secret leakage in prompts |
| ML model training / deployment | AWS, GCP (all valid) | SageMaker + Vertex breadth; Databricks cross-cloud | SageMaker, Vertex, Azure ML, Databricks | GPU quota; lock-in |
| Time-series forecasting / anomaly | All (verify managed service) | ML platforms + BigQuery ML; some standalone services deprecated | BigQuery ML, SageMaker, OCI Anomaly | Verify current managed path |
| Image / video analysis | All (Video: AWS/GCP stronger) | Vision on all; video coverage differs | Rekognition, Vision AI, AI Vision, OCI Vision | Privacy for face/biometric |
| Speech transcription | All (close equivalents) | Mature speech on all four | Transcribe, Speech, Speech-to-Text, OCI Speech | Accent/domain accuracy |
| Semantic / enterprise search | All | Kendra/Vertex Search turnkey; AI Search; OCI | Kendra, Vertex Search, AI Search | Security trimming across sources |
| Real-time AI inference | All | Managed endpoints everywhere; custom silicon differs | Managed endpoints; GPU/TPU/Inferentia | Idle-endpoint + latency cost |
| Regulated AI workload | All (verify) | Private endpoints + governance on all; compliance varies | Private networking + guardrails + audit | Verify certifications/data terms |
| On-prem / hybrid AI | Varies (verify) | watsonx (portable), Azure Arc, some on-prem model options | watsonx, Arc, OSS models on-prem | Verify on-prem model support |
| Multicloud AI platform | Neutral platforms | Databricks, Snowflake Cortex, Hugging Face, OSS run across clouds | Databricks, Snowflake, HF, OTel | Trade deep native features for portability |
14. Cost Comparison for AI Services
The cost drivers that dominate AI spend, how they map across providers, and neutral optimization guidance. No provider is cheapest overall - it depends on the workload.
AI cost is driven by tokens (input + output), embeddings, fine-tuning, model hosting / endpoint uptime, GPU hours, vector storage + queries, and applied-AI usage (OCR pages, speech minutes) - plus the usual data transfer, logging, and private-networking costs. The biggest surprises are context size (tokens), idle endpoints/GPUs, and vector-search at scale. There is no cheapest AI cloud; the answer depends on volume, model choice, and architecture. Optimize the architecture (retrieval quality, context size, caching) before shopping list prices.
AI cost drivers, side by side
| AI workload | Main cost driver | Cost consideration (all clouds) | Cost control | Gotcha |
|---|---|---|---|---|
| LLM chat / RAG | Input + output tokens | Context size dominates; output tokens often priced higher | Retrieve minimal context; smaller models; cache | Sending whole documents blows up token cost |
| Embeddings | Tokens embedded | One-time (ingest) + per-query | Batch + cache; re-embed only changed content | Re-embedding everything on each run |
| Fine-tuning | Training tokens/hours | Upfront cost; may not beat good RAG | Try RAG/prompting first | Fine-tuning when RAG would suffice |
| Model hosting / endpoints | Endpoint uptime | 24x7 endpoints cost even when idle | Scale-to-zero / batch / serverless | Idle endpoints are silent spend |
| Provisioned throughput | Reserved capacity | Predictable cost + throughput vs on-demand | Match to sustained volume | Over-provisioning for spiky load |
| GPU / training | Accelerator hours | Dominant for training; custom silicon may cut cost | Spot + reservations; right-size | Idle GPUs; data-pipeline starvation |
| Vector search | Storage + queries | Grows with corpus + query rate | Right-size indexes; filter early | Unbounded index growth |
| Document AI / OCR | Pages processed | Per-page pricing | Pre-filter; process only needed pages | Reprocessing whole archives |
| Speech | Audio minutes | Per-minute; real-time may cost more | Batch where latency allows | Transcribing everything |
| Logging / monitoring | Ingestion volume | Prompt/output logging adds up | Sample; set retention | Logging full payloads unbounded |
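To make the "context size dominates" row concrete, here is a small arithmetic sketch. The per-1,000-token prices and the token counts are hypothetical placeholders, not any provider's list prices; substitute the current figures for your model and region.

```python
def chat_cost(input_tokens, output_tokens, price_in_per_1k, price_out_per_1k):
    """Cost of one request given per-1,000-token input and output prices."""
    return (input_tokens / 1000) * price_in_per_1k + (output_tokens / 1000) * price_out_per_1k


# Hypothetical prices (currency units per 1,000 tokens); verify real pricing.
PRICE_IN = 0.003
PRICE_OUT = 0.015

# Same question, same answer length; only the context sent to the model differs.
whole_document = chat_cost(40_000, 500, PRICE_IN, PRICE_OUT)
retrieved_chunks = chat_cost(2_000, 500, PRICE_IN, PRICE_OUT)
ratio = whole_document / retrieved_chunks
```

Under these made-up numbers the whole-document request costs roughly nine times as much as the retrieved-chunks request, and the gap scales with request volume.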
AI cost optimization (portable)
- Use smaller models where they suffice - many tasks don't need the largest model.
- Cache common responses and embeddings; don't recompute.
- Reduce context size - retrieve the smallest sufficient chunks; don't send whole documents.
- Improve retrieval quality - better retrieval means fewer tokens and better answers.
- Prefer prebuilt AI services (Document AI, NLP) over an LLM for well-defined tasks - cheaper and more predictable.
- Batch where real-time isn't required; shut down idle endpoints/GPUs/notebooks.
- Right-size vector indexes and control logging volume.
- Track cost per user, per document, per workflow, and per business process - not just per service - so you can see which use cases are worth it.
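The "cache embeddings; don't recompute" advice can be sketched with a content-hash cache, so only new or changed text is sent to the embedding model. The `EmbeddingCache` class and the `fake_embed` function are hypothetical; in practice `embed_fn` would call your provider's embedding endpoint and the store would be persistent.

```python
import hashlib


class EmbeddingCache:
    """Cache embeddings by content hash so unchanged text is never re-embedded."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn
        self._store = {}
        self.calls = 0  # number of billable embedding calls actually made

    def get(self, text):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self._store[key] = self._embed_fn(text)
            self.calls += 1
        return self._store[key]


def fake_embed(text):
    # Stand-in for a real embedding model call (assumption for illustration).
    return [float(len(text)), float(text.count(" "))]


cache = EmbeddingCache(fake_embed)
first_run = ["policy v1", "faq v1"]
second_run = ["policy v1", "faq v2"]  # only the FAQ changed
for doc in first_run + second_run:
    cache.get(doc)
```

Here the second ingest run triggers one embedding call instead of two, because the unchanged policy text hashes to a key already in the cache.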
15. AI Risk and Architecture Warnings
The specific ways enterprise AI goes wrong - what can happen, which patterns are affected, how to reduce it, what to monitor, and whether it is production-ready. These risks apply across all providers.
Enterprise AI risk is dominated by a handful of failure modes: hallucination, prompt injection, data leakage, over-permissioned agents, direct production-database access, uncontrolled dynamic SQL, poor/stale retrieval, and missing auditability. None are provider-specific - they are architecture and governance problems. The mitigations are consistent: governed serving layer, least privilege, security-trimmed retrieval, human approval for actions, content safety, and full logging. Build these in before going to production.
| Risk | What can go wrong | Affected patterns | How to reduce | Monitor | Production-ready? |
|---|---|---|---|---|---|
| Hallucination | Confident but wrong answers | All GenAI | RAG grounding + citations; validate output; human review for decisions | Groundedness; user feedback | Yes, with validation + citations |
| Prompt injection | Untrusted content hijacks instructions | RAG, agents, doc processing | Content safety / prompt shields; isolate + sanitize retrieved/user content; least privilege | Injection attempts; anomalies | Requires strong governance |
| Data leakage | Model reveals context it shouldn't | RAG, agents | Security-trimmed retrieval before generation; output filtering; DLP | Access anomalies; output scans | Requires strong governance |
| Over-permissioned agents | Agent does more than intended | Agents | Least-privilege scoped identity; per-tool permissions; approval gates | Tool calls; actions | Requires strong governance |
| Direct production DB access | Agent/LLM queries live OLTP | Chat-with-DB, NL-to-SQL, agents | Never direct; use governed API / curated views / read replica | DB access source | Not recommended without review |
| Uncontrolled dynamic SQL | Free-form SQL against production | NL-to-SQL | Validated, parameterized, read-only SQL on curated schema only | Queries executed | Not recommended without review |
| Poor retrieval quality | Irrelevant context → wrong answers | RAG, search | Better chunking; hybrid + rerank; evaluate retrieval | Retrieval relevance metrics | Yes, with evaluation |
| Stale data | Answers from outdated index | RAG | Automate re-indexing; track freshness | Index age | Yes, with refresh |
| Lack of auditability | Can't explain/defend an answer | All | Log prompts + context IDs + outputs | Log completeness | Requires strong governance |
| No human approval | AI acts without oversight | Agents, ops AI | Approval gates for writes/actions | Actions taken | Requires strong governance |
| Inconsistent answers | Same question, different answers | All GenAI | Lower temperature; deterministic checks; caching | Answer variance | Good for experimentation |
| Model version changes | Behavior shifts on model update | All | Pin/version models; re-evaluate on change; abstraction layer | Model version; eval scores | Yes, with eval on change |
| Region availability changes | Model/service unavailable in region | All | Verify + have fallback; abstraction layer | Availability | Yes, with fallback plan |
| Vendor lock-in | Hard to switch platform/model | All | Abstraction layer; open formats/standards | Coupling | Manageable with design |
| Hidden cost growth | Token/GPU/vector spend creeps up | All | Budgets/alerts; per-workflow cost tracking; optimization | Cost per user/workflow | Yes, with monitoring |
| Compliance gaps | Data/PII handled improperly | All | Data classification; retention terms; legal review | Data flows | Requires review |
| Shadow AI | Ungoverned AI touching real data | Org-wide | AI use-case inventory + approval; guardrails by policy | New AI usage | Requires strong governance |
| Weak monitoring | Problems unseen until users complain | All | Quality + safety + cost monitoring from day one | Quality/safety/cost | Yes, with observability |
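The "validated, parameterized, read-only SQL on curated schema only" mitigation can be sketched with SQLite's authorizer hook from the Python standard library. This is one possible enforcement point under stated assumptions: the table names and the `guard_read_only` / `run_generated_sql` helpers are hypothetical, and other databases would use roles, views, or a read replica to get the same effect.

```python
import sqlite3


def guard_read_only(conn, allowed_tables):
    """Install an authorizer that permits only SELECTs over allow-listed tables."""
    def authorizer(action, arg1, arg2, db_name, source):
        if action == sqlite3.SQLITE_SELECT:
            return sqlite3.SQLITE_OK
        if action == sqlite3.SQLITE_READ and arg1 in allowed_tables:
            return sqlite3.SQLITE_OK  # arg1 is the table being read
        return sqlite3.SQLITE_DENY    # writes, DDL, functions, other tables
    conn.set_authorizer(authorizer)


def run_generated_sql(conn, sql, params=()):
    """Run model-generated SQL; any denial or SQL error surfaces as PermissionError."""
    try:
        return conn.execute(sql, params).fetchall()
    except (sqlite3.Error, sqlite3.Warning) as exc:
        raise PermissionError(f"query rejected: {exc}") from exc


# Hypothetical curated table plus a sensitive table the model must never read.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_curated (id INTEGER, total REAL)")
conn.execute("CREATE TABLE customers_pii (id INTEGER, email TEXT)")
conn.execute("INSERT INTO orders_curated VALUES (1, 120.0), (2, 15.5)")
guard_read_only(conn, {"orders_curated"})

rows = run_generated_sql(
    conn, "SELECT id, total FROM orders_curated WHERE total > ?", (100,)
)
```

The check is enforced by the database engine rather than by inspecting the SQL string, so a cleverly phrased `DELETE` or a read of `customers_pii` is denied no matter how the model wrote it. This deliberately strict sketch also denies SQL functions such as `count()`; widen the allow-list only after review.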
16. Troubleshooting AI Workloads
Runbooks for the failures AI workloads actually hit - symptoms, likely causes, cloud-specific checks, fixes, and prevention. The method is portable; the tools differ by provider.
Causes: model not enabled/available in the region; quota/throughput exceeded; endpoint cold-start or under-provisioned; oversized context; network/private-endpoint issue. Checks: model availability in your region; quota/throughput limits; endpoint metrics/logs (Bedrock/Azure OpenAI/Vertex/OCI); token count of the request. Fix: enable the model / request region access; raise quota or use provisioned throughput; reduce context; scale/warm the endpoint. Prevention: verify region + quota early; add retries with backoff; cap context size.
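The "retries with backoff" prevention step can be sketched as below. `TransientError` is a hypothetical stand-in for whatever throttling or timeout exception your provider's SDK raises, and the delay values are illustrative defaults, not recommendations.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a throttling or timeout error from a model API."""


def call_with_backoff(fn, max_attempts=5, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call fn, retrying transient failures with capped exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))  # jitter spreads out retry storms


# Demo: a call that is throttled twice, then succeeds. Sleeps are recorded, not waited.
attempts = {"count": 0}
recorded_sleeps = []


def flaky_model_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("throttled")
    return "ok"


result = call_with_backoff(flaky_model_call, sleep=recorded_sleeps.append)
```

Only errors known to be transient are retried; a quota or permission failure that will never succeed should propagate immediately rather than burn attempts.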
Causes: context (retrieved chunks + history + prompt) exceeds the model's context window. Fix: retrieve fewer/smaller chunks; truncate history; summarize; use a larger-context model if justified. Prevention: budget tokens; measure context size; rerank to fewer, higher-quality chunks.
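Budgeting tokens before the call can be sketched as a simple packing step over already-ranked chunks. The whitespace word count in `approx_tokens` is a crude assumption for illustration; use the model's real tokenizer when one is available, and treat the limits below as hypothetical.

```python
def approx_tokens(text):
    """Crude token estimate (whitespace words); swap in the model's tokenizer."""
    return len(text.split())


def fit_to_budget(ranked_chunks, max_context_tokens, reserved_tokens):
    """Keep the highest-ranked chunks that fit after reserving room for
    the prompt, chat history, and the model's answer."""
    budget = max_context_tokens - reserved_tokens
    selected, used = [], 0
    for chunk in ranked_chunks:
        cost = approx_tokens(chunk)
        if used + cost > budget:
            break  # stop at the first chunk that does not fit, preserving rank order
        selected.append(chunk)
        used += cost
    return selected, used


# Hypothetical numbers: a 10-token window with 4 tokens reserved leaves 6 for context.
chunks = ["alpha beta gamma", "delta epsilon zeta eta", "theta iota"]
selected, used = fit_to_budget(chunks, max_context_tokens=10, reserved_tokens=4)
```

Measuring `used` per request also gives you the context-size metric that the prevention step asks for.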
Causes: bad chunking; wrong embedding model; no reranking; pure-vector missing keyword matches; stale index; retrieving too few/many chunks. Checks: inspect retrieved chunks for the query; evaluate retrieval relevance separately from generation; index freshness. Fix: improve chunking; add hybrid (keyword+vector) + reranking; refresh the index; tune top-k. Prevention: evaluate retrieval as its own metric; automate re-indexing. (Retrieval quality usually matters more than the model.)
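One common way to combine keyword and vector results is reciprocal rank fusion (RRF). It is shown here as an illustrative technique, not as the method any particular managed service uses. The document IDs are hypothetical, and `k=60` is a conventional default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked ID lists: each list adds 1 / (k + rank) to a document's score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for the same query from two retrievers.
keyword_hits = ["doc-7", "doc-2", "doc-9"]
vector_hits = ["doc-2", "doc-4", "doc-7"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Documents that both retrievers rank highly rise to the top, which is the property hybrid search is after; a reranker can then reorder the fused shortlist before generation.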
Causes: unsupported file type/size; permission to read source; embedding model quota; malformed content; timeout. Checks: ingestion/pipeline logs; source access (IAM); quota. Fix: convert/split files; grant read access; batch and retry; raise quota. Prevention: validate inputs; batch large corpora; monitor the pipeline.
Causes: ambiguous tool descriptions; over-broad tool permissions; no human approval; direct DB access. Checks: agent trace (which tool, which input); the tool's identity/permissions. Fix: sharpen tool descriptions; scope each tool with least privilege; add approval gates; route DB access through a governed read-only API. Prevention: never give agents raw DB access; require approval for writes; test tool selection.
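The fix described above (least-privilege scoping per tool plus approval gates for writes) can be sketched as a thin wrapper around tool invocation. The `Tool` type, the scope strings, and the `invoke` function are hypothetical; managed agent frameworks expose their own equivalents, which should be verified per provider.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Tool:
    name: str
    run: Callable[[dict], str]
    required_scope: str   # the one permission this tool needs
    writes: bool = False  # write/action tools require human approval


def invoke(tool: Tool, args: dict, agent_scopes: set, approved_by: Optional[str] = None) -> str:
    """Run a tool only if the agent holds its scope and, for writes, a human approved."""
    if tool.required_scope not in agent_scopes:
        raise PermissionError(f"{tool.name}: agent lacks scope {tool.required_scope!r}")
    if tool.writes and approved_by is None:
        raise PermissionError(f"{tool.name}: write action needs human approval")
    return tool.run(args)


# Hypothetical tools: one read-only lookup, one write action.
lookup = Tool("lookup_order", lambda a: f"order {a['id']}: shipped", "orders:read")
refund = Tool("issue_refund", lambda a: f"refunded order {a['id']}", "orders:refund", writes=True)

agent_scopes = {"orders:read", "orders:refund"}
status = invoke(lookup, {"id": 42}, agent_scopes)
receipt = invoke(refund, {"id": 42}, agent_scopes, approved_by="supervisor@example.com")
```

Keeping the checks outside the model means a confused or manipulated agent can request a refund but cannot execute one without both the scope and a named approver, and each call leaves a clear trace of which tool ran with which input.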
Causes: over-broad content-safety rule (false positive); or a genuine injection attempt in retrieved/user content. Checks: guardrail/content-safety logs; the blocked content. Fix: tune the rule / add an exception (false positive); or confirm and block the injection (isolate/sanitize retrieved content). Prevention: run content safety in report mode first; sanitize untrusted content; monitor injection patterns.
- Private endpoint: DNS not resolving to the private endpoint; missing route/firewall; verify per cloud (PrivateLink / Private Endpoint+Private DNS / PSC / OCI private endpoint).
- IAM denied: the workload/agent identity lacks the model/data role; check native IAM (OCI IAM / AWS IAM / Entra RBAC / Google IAM).
- Quota exceeded: request an increase; use provisioned throughput.
- Region: the model/service isn't in your region - request access or choose a supported region.
- Cost spike: check token/GPU/endpoint/vector usage in cost tools; find idle endpoints or oversized context.
- Logs missing: prompt/output logging not enabled (often off by default for privacy/cost); wrong log destination; retention expired - enable Bedrock/Azure OpenAI/Vertex/OCI logging and route to a central store.
- Hallucination reported: check whether the answer was grounded (retrieved context) and cited; improve retrieval; add citations + a "not found" path; require human review for high-stakes answers; log the incident for evaluation.
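When investigating missing logs or a reported hallucination, it helps to have one structured record per answer. The sketch below shows one possible shape; the field names, the `audit_record` helper, and the sample values are hypothetical, and prompt/output logging may need redaction depending on your privacy terms.

```python
import json
from datetime import datetime, timezone


def audit_record(user_id, model_id, prompt, context_ids, output):
    """Build one JSON log line tying an answer to its prompt and retrieved context."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "model_id": model_id,
        "prompt": prompt,
        "context_ids": list(context_ids),      # which chunks grounded the answer
        "output": output,
        "has_context": len(context_ids) > 0,   # False flags an ungrounded answer
    }
    return json.dumps(record, sort_keys=True)


# Hypothetical example; route the line to your central log store.
line = audit_record(
    "u-123", "example-model-v1", "What is the leave policy?", ["hr-1"], "See hr-1."
)
```

With the context IDs recorded, a reviewer can tell whether a disputed answer came from retrieved documents or from the model alone.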
17. Learning Paths for AI Across Clouds
Learn AI services fastest by building on what you know. For each persona: what transfers, what does not, where to start, hands-on labs, common mistakes, and the outcome.
Cloud architect learning AI across clouds
(OCI, AWS, Azure, or GCP architect learning the others' AI stacks.)
- Already know: the cloud foundation, IAM, networking, and where data lives.
- Transfers: RAG/agent architecture, private-endpoint patterns, governance principles - identical shapes.
- Doesn't transfer: the specific GenAI platform, model catalog + region availability, managed RAG/agent tooling, and vector store.
- Start with: the Matrix (1) → GenAI (2) → RAG (5) → Vector (6) → Governance (11). Use the matrix as your translation table.
Hands-on lab: build the same governed RAG app (serving layer + vector store + model + audit) on a second cloud. You will hit exactly the differences that do not transfer: model access, managed-RAG tooling, and the vector store.
Mistakes to avoid: assuming direct model equivalence; designing around one model's limits; skipping security-trimmed retrieval. Outcome: you can design a governed AI architecture on any of the four clouds and know what to verify.
DBA / Data engineer learning GenAI and AI platforms
- Already know (DBA): databases, SQL, access control, backup/DR - which directly enable in-database vector search and NL-to-SQL governance.
- Already know (data engineer): pipelines, SQL, Spark, data governance - which map to embeddings pipelines, retrieval, and warehouse-integrated AI.
- Transfers: vector search as a DB feature (Oracle 23ai, pgvector, BigQuery); entitlement-filtered retrieval as row/label security; in-DB/in-warehouse ML.
- Doesn't transfer: chunking/embedding quality's effect on answers; the governed-serving-layer requirement; agent/NL-to-SQL safety; model catalogs.
Hands-on labs: build a RAG app with pgvector or Oracle 23ai over data you control, with entitlement-filtered retrieval; build a governed NL-to-SQL interface over a curated read-only schema. Outcome: you can add AI to a database safely.
Security engineer / DevOps engineer learning AI
- Transfers (security engineer): least privilege, private networking, key/secret management, audit logging, data-exfiltration control - all apply directly to AI.
- Doesn't transfer (security engineer): prompt injection, data leakage via models, agent permissioning, content safety, and AI-specific audit (prompts/outputs).
- Transfers (DevOps engineer): CI/CD, containers, IaC, and observability (OpenTelemetry) map to MLOps pipelines and model deployment.
- Doesn't transfer (DevOps engineer): model registry + versioning, drift/quality monitoring, prompt/eval management, and GPU/endpoint cost control.
Hands-on labs: Security - configure guardrails + private endpoints + prompt logging for a GenAI app; test prompt injection. DevOps - build an MLOps pipeline with a model registry and drift monitoring. Outcome: you can secure and operate AI in production.
AI engineer, enterprise architect, or business analyst
- AI engineer. Know: models, prompting, embeddings. Learn: enterprise governance, security-trimmed retrieval, cost control, and each cloud's managed tooling. Start: RAG (5), Agents (4), Governance (11).
- Enterprise architect. Learn: the workload decision matrix (13), governance (11), cost drivers (14), and risk (15) - to choose platforms by workload, not hype. Start: Matrix (1) + Workloads (13).
- Business analyst. Learn: what each pattern (12) can realistically do, the risks (15), and where value is real vs hype. Start: Home, Patterns (12), Risk (15).
Follow the data, insist on governance and auditability, validate outputs, and verify model/region/cost before committing.
Outcome: you can evaluate, choose, govern, and cost an enterprise AI use case across providers - and tell real value from hype.