Latest publications

AWS showed semantic video search on Amazon Bedrock with Nova Multimodal Embeddings
AWS published a reference architecture for video search on Amazon Bedrock: the service splits videos into scenes, indexes image, audio, and speech content separately, and combines the results with metadata.

Amazon Bedrock gets detailed inference cost attribution across users and applications
AWS has added detailed cost attribution to Amazon Bedrock: companies can now see exactly who is spending the inference budget across users, roles, services, and tenants.

AWS cut marketing page production from hours to minutes with agentic AI
AWS and Gradial deployed agentic AI on Amazon Bedrock: building and checking marketing web pages now takes about 10 minutes instead of four hours.

AWS makes multimodal BioFMs available for drug development and clinical medicine
Amazon Web Services described how multimodal biological AI models accelerate the search for new drugs, patient stratification in trials, and treatment personalization.

Amazon Quick for marketing: a personal knowledge graph from disparate data
Amazon Quick connects to a marketer’s tools and data, builds a personal knowledge graph, and turns information scattered across different systems into strategic decisions.

AWS showed how to fine-tune NVIDIA Nemotron Speech for accurate ASR in niche scenarios
AWS outlined an end-to-end approach to fine-tuning the NVIDIA Nemotron speech model on Amazon EC2: with synthetic audio data, ASR can be adapted more precisely to medicine, support, and other narrow domains.

Amazon demonstrated natural-language search across large video archives with Nova
AWS described the architecture of scalable multimodal video search: Nova generates audio and image embeddings, OpenSearch indexes them, and queries run in milliseconds.

Amazon Bedrock AgentCore gets Policy for AI agent access control
Amazon explained how the new Policy layer in Bedrock AgentCore checks every agent request to tools and data against Cedar rules, without relying on the model's own logic.

AWS explained how to fine-tune Amazon Nova with an LLM judge for complex enterprise tasks
AWS showed a reinforcement fine-tuning setup for Amazon Nova in which a separate LLM evaluates the model's responses; in a contract review use case, Nova 2 Lite outperformed larger solutions.

AWS and vLLM integrated P-EAGLE to speed up large LLM inference by up to 1.69x
AWS showed how P-EAGLE in vLLM removes the bottleneck in speculative decoding, generates multiple tokens in a single pass, and delivers up to a 1.69x speedup.

AWS showed how to build an AI engine for A/B tests on Amazon Bedrock and DynamoDB
AWS published an architecture for an AI-driven A/B testing system: Bedrock analyzes user context and helps assign variants based on user behavior and experiment conditions rather than at random.

AWS shows how to fine-tune Amazon Nova via Nova Forge SDK and SageMaker Jobs
AWS released a detailed guide on Nova Forge SDK: from baseline evaluation of Amazon Nova to SFT, RFT, and deployment in SageMaker, raising exact match from 13% to 78.8% and quasi-EM to 80.6%.

AWS launches Nova Forge SDK for fine-tuning Nova models in enterprise AI
AWS has introduced Nova Forge SDK, a toolkit that simplifies customization of Nova models for enterprise teams and removes some of the routine infrastructure work.

AWS showed how Amazon Bedrock AgentCore Gateway connects to private APIs and services
AWS showed how Bedrock AgentCore Gateway, via Resource Gateway, gives AI agents access to private APIs and services inside a VPC in managed and self-managed modes.

Amazon unveiled an agentic analytics architecture based on SageMaker, Athena, and Quick
AWS described an architecture in which Amazon Quick, with agentic AI on top of SageMaker, Athena, and S3, lets business users ask questions of a lakehouse in natural language.

How Sun Finance and AWS accelerated document verification and reduced fraud risk
Sun Finance built an AI pipeline on AWS for identity verification: data extraction accuracy rose to 90.8%, verification time fell to under 5 seconds, and processing costs dropped by 91%.

AWS introduced a system for migrating and upgrading LLMs in production with prompt optimization
AWS introduced the Generative AI Model Agility Solution, an approach for migrating and upgrading LLMs in production with prompt conversion, optimization, and quality control.

AWS and Artificial Genius demonstrated a way to reduce LLM hallucinations in finance and medicine
AWS and Artificial Genius described a hybrid approach where Amazon Nova understands the request, and a deterministic layer restricts answers only to what can be verified against input data.

AWS explained the launch of reinforcement fine-tuning in Amazon Bedrock via OpenAI-compatible APIs
AWS released a step-by-step breakdown of reinforcement fine-tuning in Amazon Bedrock, covering OpenAI-compatible API configuration, a Lambda grader, model training, and inference without separate hosting.

AWS explains how to accelerate fine-tuning of Llama 3.2 Vision on S3 data
AWS demonstrated a practical scenario where SageMaker Unified Studio, Catalog, and S3 enable faster fine-tuning of Llama 3.2 11B Vision Instruct on unstructured data for VQA.

AWS launches Amazon Bedrock in New Zealand with Claude and cross-region inference
Amazon Bedrock is now available in the Asia Pacific (New Zealand) region: companies can invoke Claude and Nova from Auckland, with load distributed between New Zealand and Australia.

AWS showed how to search for solar flares in SageMaker AI using ESA STIX instrument data
AWS published a breakdown showing how to train and deploy an LSTM model in SageMaker AI for detecting solar flares using ESA STIX instrument data.

AWS explained how to scale AI agent memory with namespace patterns in AgentCore Memory
AWS published a guide on namespace hierarchies, retrieval patterns, and IAM access control for AgentCore Memory, a long-term memory service for AI agents.

Amazon Bedrock AgentCore Runtime now supports serverless MCP proxies
AWS demonstrated how to deploy serverless MCP proxies in Amazon Bedrock AgentCore Runtime: a programmable layer with security policies, auditing, and observability for AI agents.

Vanguard built a Virtual Analyst on AWS following eight AI-ready data principles
Vanguard, one of the world's largest asset managers, published a case study on how it built a Virtual Analyst on AWS following eight AI-ready data principles and achieved measurable business results.

PwC and AWS demonstrate an AI system for contract analysis that cuts manual verification by up to 90%
PwC unveiled the AIDA system on AWS, which uses Amazon Bedrock to extract terms from contracts and answer questions about them and, according to the company, reduces manual verification by up to 90%.

NVIDIA brings Nemotron 3 Nano Omni to Amazon SageMaker JumpStart on release day
NVIDIA added Nemotron 3 Nano Omni to Amazon SageMaker JumpStart on release day so that companies can quickly deploy multimodal AI scenarios for text, images, audio, and video.

AWS explained how to convert a text-based AI agent into a voice assistant on Nova 2 Sonic
AWS broke down the transition from a text agent to a voice assistant on Amazon Nova 2 Sonic: what changes in architecture, prompts, tools, and user experience.

AWS shows how Amazon Nova Act automates competitor price monitoring
AWS described a system based on Amazon Nova Act that concurrently visits competitor websites, collects prices and promotions in structured form, and helps make pricing decisions faster.

Rocket Close accelerated mortgage document processing by 15x with AWS
Rocket Close, together with AWS, accelerated mortgage document processing by 15x, combining Amazon Textract for OCR and Amazon Bedrock for segmentation, classification, and field extraction with approximately 90% accuracy.