Publisher · verified by editors

Hugging Face Blog

AI news source. Articles are auto-selected and adapted by Hamidun News editors.

53 articles in Hamidun · Latest: June 15 · Active · huggingface.co

Latest publications

AllenAI Releases olmo-eval — A Platform for Evaluating LLMs During Training
LLM · Hugging Face Blog

AllenAI released olmo-eval, an open toolkit for continuous evaluation of language models throughout the training cycle — checkpoint by checkpoint.

Jun 15, 2026·2 min
Cohere Presents North Mini Code — Model for Developers and AI Agents
LLM · Hugging Face Blog

Cohere released North Mini Code, a 30-billion-parameter model trained specifically for programming and AI-agent tasks. The model is free and available to everyone.

Jun 11, 2026·3 min
Voice agents not ready for bilingual customers. ServiceNow-AI research
LLM · Hugging Face Blog

Voice agents perform poorly with bilingual clients. This was shown by research from the ServiceNow-AI team, which tested seven popular speech recognition systems on examples of code-switching — when…

Jun 11, 2026·3 min
How to Speed Up PyTorch Models: A Practical Guide to torch.profiler
LLM · Hugging Face Blog

Hugging Face explained torch.profiler, a built-in PyTorch tool for analyzing performance. It helps identify bottlenecks in model training and inference.

May 29, 2026·3 min
Hugging Face Enables TRL to Deliver Trillion Parameters Through Delta Weights
LLM · Hugging Face Blog

Hugging Face added Delta Weight Sync to TRL, a technique that sends only weight changes instead of full files, cutting the volume of data transferred by a factor of hundreds when training giant models.

May 29, 2026·2 min
Reachy Mini Learns to Speak Locally Without the Cloud
LLM · Hugging Face Blog

The Reachy Mini humanoid robot can now run a full speech recognition stack locally without cloud or API, thanks to open models from Hugging Face.

May 29, 2026·3 min
IBM and Artificial Analysis create benchmark: AI agents fail at IT tasks
LLM · Hugging Face Blog

Large language models scored below 50% on the new ITBench-AA benchmark, which assesses AI agents' ability to solve corporate IT tasks. The result suggests that full automation of IT work remains a distant prospect.

May 29, 2026·3 min
NVIDIA Nemotron: Diffusion Models Generate Text 6x Faster
LLM · Hugging Face Blog

NVIDIA introduced Nemotron-Labs Diffusion — the first language models that generate text in parallel instead of sequentially. In speculative mode, they run 6× faster than conventional models thanks to the diffusion appro

May 25, 2026·3 min
How a Small Model Beat GPT-5 and Claude Opus at Portuguese OCR
LLM · Hugging Face Blog

Dharma AI trained a specialized 3-billion-parameter model that outperformed all commercial frontier models in text recognition tests while costing 52 times less.

May 25, 2026·3 min
Hugging Face launched Open Agent Leaderboard to evaluate AI agents
LLM · Hugging Face Blog

Hugging Face introduced an open benchmark for comparing full AI agent systems. It found that agent architecture matters more than the model chosen.

May 21, 2026·3 min
PaddleOCR 3.5 Gains Support for Hugging Face Transformers
LLM · Hugging Face Blog

PaddleOCR has been updated with full support for Hugging Face Transformers as an inference backend. Text recognition and document parsing now work in a PyTorch environment.

May 21, 2026·2 min
NVIDIA Shows Efficient Method to Train Cosmos for Robot Video Generation Using LoRA
LLM · Hugging Face Blog

NVIDIA released a guide for fine-tuning Cosmos Predict 2.5 using LoRA/DoRA—a parameter-efficient adaptation method that enables robot video generation training in 17 hours on a single GPU.

May 21, 2026·2 min
Ettin Reranker from Hugging Face: 6 Models for Precise Search Reranking
LLM · Hugging Face Blog

Hugging Face released 6 Ettin rerankers based on ModernBERT with state-of-the-art accuracy and speed thanks to Flash Attention 2 and sequence optimization.

May 21, 2026·3 min
OlmoEarth v1.1: Allen AI Releases Satellite Models 3 Times Cheaper
LLM · Hugging Face Blog

Allen AI presented a more efficient version of its models for analyzing satellite imagery, cutting computational costs threefold while maintaining quality.

May 21, 2026·2 min
How Allen AI's model learned to discover expert specialization on its own
LLM · Hugging Face Blog

Allen AI introduced EMO, a mixture-of-experts model that naturally develops domain specialization (health, politics, film) without explicit training on those categories.

May 17, 2026·3 min
CyberSecQwen-4B: how a small model became a vulnerability expert
LLM · Hugging Face Blog

The specialized 4-billion-parameter cybersecurity model outperforms general-purpose competitors in vulnerability analysis and runs locally on personal hardware without cloud services.

May 17, 2026·3 min
OncoAgent: AI system for early cancer detection based on private patient data
LLM · Hugging Face Blog

How a machine learning algorithm helps doctors make cancer diagnosis decisions without compromising patient confidentiality.

May 17, 2026·3 min
Hugging Face sped up LLM inference by 22% with asynchronous batching
LLM · Hugging Face Blog

Running CPU and GPU work in parallel rather than sequentially cut GPU idle time by 24% and sped up token generation by nearly a quarter without changing the model.

May 17, 2026·2 min
IBM released Granite Embedding R2 — a multilingual model for semantic search
LLM · Hugging Face Blog

IBM introduced Granite Embedding R2, an open multilingual model for semantic search with 32K context support and best-in-class performance among sub-100M models.

May 16, 2026·3 min
H Company released Holotron-12B — a model for agents with a 2x speed increase
LLM · Hugging Face Blog

H Company published Holotron-12B on Hugging Face: the multimodal model for AI agents delivers more than a 2x throughput gain in interface-use tasks on a single H100.

May 2, 2026·3 min
NVIDIA introduced SPEED-Bench — a unified benchmark for speculative decoding
LLM · Hugging Face Blog

NVIDIA published SPEED-Bench, a dataset and measurement framework that compares speculative decoding across real-world workloads, long contexts, and different inference engines.

May 2, 2026·3 min
IBM released Mellea 0.4.0 and Granite Libraries for verifiable AI pipelines
LLM · Hugging Face Blog

IBM Research updated the open-source Mellea framework to version 0.4.0 and released three Granite Libraries for structured, verifiable, and safe AI workflows.

May 2, 2026·3 min
NVIDIA showed how to fine-tune an embedding model for a specific domain in a day
LLM · Hugging Face Blog

NVIDIA and Hugging Face published a step-by-step recipe that turns a base embedding model into specialized search over internal documents in a few hours.

May 2, 2026·3 min
ServiceNow introduced EVA — a new framework for evaluating voice AI agents
LLM · Hugging Face Blog

ServiceNow released EVA — a system that evaluates voice AI agents not only by task success, but also by dialogue quality, from response brevity to turn timing.

May 2, 2026·3 min
IBM releases Granite 4.0 3B Vision for extracting data from documents and charts
LLM · Hugging Face Blog

IBM introduced Granite 4.0 3B Vision, a compact multimodal model for extracting tables, charts, and key fields from documents that can be integrated into enterprise pipelines with Docling.

May 2, 2026·2 min
H Company introduces Holo3 — an AI agent for computer use with a record score on OSWorld-Verified
LLM · Hugging Face Blog

H Company has released Holo3, a model for computer use that scored 78.85% on OSWorld-Verified and was trained on synthetic enterprise scenarios.

May 2, 2026·3 min
Google released Gemma 4 on Hugging Face: multimodal models for local inference
LLM · Hugging Face Blog

Google DeepMind has opened the Gemma 4 family on Hugging Face: four multimodal models under the Apache 2.0 license, with up to 256K context and deployment ranging from phones to workstations.

May 2, 2026·3 min
Hugging Face added gradio.Server: custom frontends can now connect to a Gradio backend
LLM · Hugging Face Blog

Hugging Face’s new gradio.Server turns Gradio into a backend layer for React, Svelte, and plain HTML/JS, while preserving request queues, ZeroGPU, and compatibility with Spaces.

May 2, 2026·3 min
Hugging Face transfers Safetensors to the PyTorch Foundation for neutral governance of the format
LLM · Hugging Face Blog

Hugging Face announced that Safetensors has become a PyTorch Foundation project: there are no breaking changes for users, while the format's development moves to a neutral governance model.

May 2, 2026·3 min
Overworld released Waypoint-1.5: 720p interactive worlds for consumer GPUs
LLM · Hugging Face Blog

Overworld released Waypoint-1.5, a world model for running locally on consumer GPUs: up to 720p and 60 FPS, plus a lighter 360p version for a wider range of PCs and laptops.

May 2, 2026·3 min