Publisher · verified by editors

Hugging Face Blog

AI news source. Articles are auto-selected and adapted by Hamidun News editors.

65 articles in Hamidun · Latest: July 23 · Active · huggingface.co

Latest publications

IBM released Granite Embedding R2 — a multilingual model for semantic search

IBM introduced Granite Embedding R2, an open multilingual model for semantic search with 32K context support and best-in-class performance among sub-100M models.

May 16, 2026·3 min

LLM · Hugging Face Blog

H Company released Holotron-12B — a model for agents with a 2x speed increase

H Company published Holotron-12B on Hugging Face: the multimodal model for AI agents delivers more than a 2x throughput gain in interface-use tasks on a single H100.

May 2, 2026·3 min

LLM · Hugging Face Blog

NVIDIA introduced SPEED-Bench — a unified benchmark for speculative decoding

NVIDIA published SPEED-Bench, a dataset and measurement framework that compares speculative decoding across real-world workloads, long contexts, and different inference engines.

May 2, 2026·3 min

LLM · Hugging Face Blog

IBM released Mellea 0.4.0 and Granite Libraries for verifiable AI pipelines

IBM Research updated the open-source Mellea framework to version 0.4.0 and released three Granite Libraries for structured, verifiable, and safe AI workflows.

May 2, 2026·3 min

LLM · Hugging Face Blog

NVIDIA showed how to fine-tune an embedding model for a specific domain in a day

NVIDIA and Hugging Face published a step-by-step recipe that turns a base embedding model into specialized search over internal documents in a few hours.

May 2, 2026·3 min

LLM · Hugging Face Blog

ServiceNow introduced EVA — a new framework for evaluating voice AI agents

ServiceNow released EVA — a system that evaluates voice AI agents not only by task success, but also by dialogue quality, from response brevity to turn timing.

May 2, 2026·3 min

LLM · Hugging Face Blog

IBM releases Granite 4.0 3B Vision for extracting data from documents and charts

IBM introduced Granite 4.0 3B Vision, a compact multimodal model for extracting tables, charts, and key fields from documents that can be integrated into enterprise pipelines with Docling.

May 2, 2026·2 min

LLM · Hugging Face Blog

H Company introduces Holo3 — an AI agent for computer use with a record score on OSWorld-Verified

H Company has released Holo3, a model for computer use that scored 78.85% on OSWorld-Verified and was trained on synthetic enterprise scenarios.

May 2, 2026·3 min

LLM · Hugging Face Blog

Google released Gemma 4 on Hugging Face: multimodal models for local inference

Google DeepMind has opened the Gemma 4 family on Hugging Face: four multimodal models under the Apache 2.0 license, with up to 256K context and deployment ranging from phones to workstations.

May 2, 2026·3 min

LLM · Hugging Face Blog

Hugging Face added gradio.Server: custom frontends can now connect to a Gradio backend

Hugging Face’s new gradio.Server turns Gradio into a backend layer for React, Svelte, and plain HTML/JS, while preserving request queues, ZeroGPU, and compatibility with Spaces.

May 2, 2026·3 min

LLM · Hugging Face Blog

Hugging Face transfers Safetensors to the PyTorch Foundation for neutral governance of the format

Hugging Face announced that Safetensors has become a PyTorch Foundation project: there are no breaking changes for users, while the format's development moves to a neutral governance model.

May 2, 2026·3 min

LLM · Hugging Face Blog

Overworld released Waypoint-1.5: 720p interactive worlds for consumer GPUs

Overworld released Waypoint-1.5, a world model for running locally on consumer GPUs: up to 720p and 60 FPS, plus a lighter 360p version for a wider range of PCs and laptops.

May 2, 2026·3 min

LLM · Hugging Face Blog

Hugging Face released a Skill for quickly porting Transformers models to MLX

Hugging Face introduced a Skill and a separate test harness to port new models from Transformers to mlx-lm on MLX almost immediately, without a stream of raw AI-generated PRs.

May 2, 2026·3 min

LLM · Hugging Face Blog

IBM Research analyzed where AI agents break down on APIs, documents, and rules in VAKRA

IBM Research's analysis of VAKRA shows that even strong models lose reliability when they have to combine APIs, documents, multi-step reasoning, and tool constraints.

May 2, 2026·3 min

LLM · Hugging Face Blog

Hugging Face published Ecom-RLVE, a training environment for e-commerce AI agents

Hugging Face published Ecom-RLVE, an open-source environment where AI agents learn to handle purchase conversations, use tools, and earn a verifiable reward for the cart they actually assemble.

May 2, 2026·3 min

LLM · Hugging Face Blog

QIMMA Arabic LLM Leaderboard: Qwen3.5 Leads, Karnak Second

TII launched QIMMA, an Arabic LLM leaderboard that first checks the benchmarks themselves and only then compares models on 52,000 examples from seven domains.

May 2, 2026·3 min

LLM · Hugging Face Blog

NVIDIA introduced Nemotron OCR v2: multilingual OCR trained on 12.2 million synthetic documents

NVIDIA showed how it built Nemotron OCR v2: the model was trained on 12.2 million synthetic documents to recognize multiple languages with a single engine and process up to 34.7 pages per second.

May 2, 2026·3 min

LLM · Hugging Face Blog

NVIDIA showed how Gemma 4 with voice and a webcam runs on Jetson Orin Nano Super

NVIDIA published a demo in which Gemma 4 decides on its own when to activate the webcam and responds by voice — all locally on Jetson Orin Nano Super with 8 GB of memory.

May 1, 2026·3 min

LLM · Hugging Face Blog

NVIDIA introduces NeMo Retriever — agentic search for complex enterprise data

NVIDIA showcased an agentic pipeline in NeMo Retriever: the system goes beyond semantic search by planning steps and refining queries, and it has already taken first place on ViDoRe v3.

Apr 30, 2026·3 min

LLM · Hugging Face Blog

NVIDIA unveiled the first open dataset and foundation AI models for medical robots

NVIDIA and its partners released on Hugging Face the first large open dataset for medical robots and two foundation models for surgery, simulation, and future autonomy.

Apr 30, 2026·3 min

LLM · Hugging Face Blog

NVIDIA released Nemotron 3 Nano 4B — a compact hybrid model for on-device deployment

NVIDIA released the 4B Nemotron 3 Nano model, built on a hybrid Mamba-Transformer architecture: the lowest VRAM usage in its class, 18 tokens/s on Jetson Orin Nano, and open weights.

Apr 30, 2026·2 min

LLM · Hugging Face Blog

Hugging Face: Chinese open-source models overtake the US in AI ecosystem downloads

Hugging Face showed that open-source AI nearly doubled in scale over the past year, while Chinese models already account for 41% of downloads and set the pace in releases, adaptation, and local deployment.

Apr 30, 2026·3 min

LLM · Hugging Face Blog

AI Model Evaluation Now Costs More Than Training — A New Barrier for Researchers

The EvalEval Coalition analyzed the cost of AI benchmarks: a single agentic test costs $40,000 or more, and academic groups can no longer afford independent evaluation.

Apr 30, 2026·2 min

LLM · Hugging Face Blog

IBM reveals how it built Granite 4.1: 15 trillion tokens, a 512K context window, and a focus on quality

IBM detailed its approach to training Granite 4.1: five pretraining stages, 15 trillion tokens, a context window of up to 512K, and separate SFT and RL pipelines for quality improvement.

Apr 30, 2026·3 min

LLM · Hugging Face Blog

Hugging Face adds DeepInfra to Inference Providers for a unified model API

Hugging Face connected DeepInfra to Inference Providers: DeepSeek, Kimi, and GLM models can now be run from Hub pages, via the SDK, and through the unified router without a separate integration.

Apr 30, 2026·3 min

LLM · Hugging Face Blog

NVIDIA Introduced Nemotron 3 Nano Omni for Long Documents, Audio, Video, and AI Agents

NVIDIA introduced Nemotron 3 Nano Omni, an open multimodal model for long documents, audio, video, and GUI scenarios, with an emphasis on speed and context.

Apr 28, 2026·3 min

LLM · Hugging Face Blog

Hugging Face Explains Fine-tuning of Multimodal Embeddings and Reranker Models

Hugging Face released a practical guide on training multimodal embedding and reranker models in Sentence Transformers and demonstrated how domain-specific fine-tuning improves document retrieval.

Apr 28, 2026·3 min

LLM · Hugging Face Blog

How Hugging Face Builds Scalable Web Apps with OpenAI Privacy Filter

Hugging Face demonstrated three scenarios for OpenAI Privacy Filter: document reading with PII highlighting, image anonymization, and a secure pastebin with public and private versions.

Apr 27, 2026·3 min

LLM · Hugging Face Blog

Hugging Face: open-source AI gives defenders the same capabilities as attackers

Hugging Face explains why open models and tools are a structural advantage in cybersecurity, not a threat.

Apr 22, 2026·2 min

LLM · Hugging Face Blog

Hugging Face trained an image generation model in 24 hours

The third part of Hugging Face's PRX project shows that a full-fledged text-to-image model can be trained in just 24 hours. This changes perceptions of the accessibility of generative AI.

Mar 3, 2026·2 min