Publisher · verified by editors

Hugging Face Blog

AI news source. Articles are auto-selected and adapted by Hamidun News editors.

65 articles in Hamidun · Latest: July 23 · Active · huggingface.co

Latest publications

NVIDIA opens Nemotron datasets: 10 trillion tokens for training AI agents

NVIDIA released the Nemotron datasets with 10+ trillion tokens and 2.4 billion synthetic personas to train AI agents on real-world scenarios and tool failures.

Jul 9, 2026·3 min

LLM · Hugging Face Blog

Hugging Face: transformers backend in vLLM now runs at native speed

Hugging Face announced performance parity: the `--model-impl transformers` flag delivers native vLLM speed across 450+ architectures without rewriting code.

Jul 8, 2026·2 min

LLM · Hugging Face Blog

SkyPilot and Hugging Face launched storage with no egress fees

The SkyPilot and Hugging Face Storage integration makes it possible to train models on any cloud while storing weights and datasets at $12-18/TB/month with no egress fees.

Jul 8, 2026·2 min

LLM · Hugging Face Blog

Microsoft launched Foundry Managed Compute: thousands of Hugging Face models with one click

Microsoft combined Azure Foundry with the Hugging Face catalog: thousands of open models now deploy in the cloud with one click, with enterprise security and unified billing.

Jul 8, 2026·2 min

LLM · Hugging Face Blog

LeRobot v0.6.0 from Hugging Face: robots learn to predict the future and evaluate themselves

Hugging Face released LeRobot v0.6.0 with world-model policies, reward models for task self-assessment, and six new benchmarks — completing the full robot learning cycle.

Jul 6, 2026·3 min

LLM · Hugging Face Blog

Hugging Face Updates Kernels: Trusted Publishers, Code Signing, and Agentic Development

On July 6, 2026, Hugging Face released a major Kernels update: kernels became a separate repository type on the Hub, and trusted publishers and code signing via Sigstore were added.

Jul 6, 2026·3 min

LLM · Hugging Face Blog

Hugging Face and Cerebras launch Gemma 4 for real-time voice AI

Hugging Face and Cerebras unveiled an open speech-to-speech pipeline on Gemma 4 with predictable latencies — the system is already used in more than 9,000 Reachy Mini robots.

Jul 4, 2026·2 min

LLM · Hugging Face Blog

Hugging Face compares all LoRA alternatives: who wins at fine-tuning LLMs

The Hugging Face team tested five PEFT methods for fine-tuning LLMs — from DoRA to GaLore — and found out when LoRA can be beaten, and exactly what it costs.

Jun 29, 2026·2 min

LLM · Hugging Face Blog
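For context on the cost side of the LoRA comparison above, here is a minimal sketch of the standard LoRA parameter arithmetic. It is not taken from the Hugging Face article: the layer size (4096 × 4096) and rank (8) are illustrative assumptions, and the function names are hypothetical. LoRA freezes a weight matrix of shape d_out × d_in and trains two low-rank factors instead, so the trainable count drops from d_in · d_out to rank · (d_in + d_out).

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_in + d_out)


# Illustrative values only; they are assumptions, not figures from the article.
d_in = d_out = 4096
rank = 8

full = full_finetune_params(d_in, d_out)
lora = lora_params(d_in, d_out, rank)
ratio = lora / full

print(f"full fine-tuning: {full} trainable parameters")
print(f"LoRA (rank {rank}): {lora} trainable parameters")
print(f"LoRA trains {ratio:.4%} of the layer's parameters")
```

Under these assumed sizes, LoRA trains 1/256 of the layer's parameters. Other PEFT methods shift this trade-off in different ways, and this sketch does not model them.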

ServiceNow: AI agent leaks corporate secrets through a chain of search queries

ServiceNow researchers have shown that a deep-research agent unintentionally exposes corporate secrets — each query is harmless on its own, but the whole chain forms a mosaic.

Jun 29, 2026·3 min

LLM · Hugging Face Blog

PaddleOCR releases PP-OCRv6: text recognition in 50 languages from 1.5 to 34.5 million parameters

PP-OCRv6 from Baidu PaddlePaddle: a universal OCR system for 50 languages in three configurations ranging from 1.5 to 34.5 million parameters, with an accuracy gain of +4.6-5.1 percentage points over the previous server

Jun 29, 2026·2 min

LLM · Hugging Face Blog

Hybrid models predict content words better than transformers — Allen AI study

Allen AI compared the OLMo 3 and OLMo Hybrid architectures: hybrid models predict nouns, verbs, and adjectives more accurately, but lag behind transformers on repetitive text spans.

Jun 28, 2026·2 min

LLM · Hugging Face Blog

Hugging Face: launch a vLLM server on HF Jobs with a single command

Hugging Face has added vLLM support to the HF Jobs platform: a production-ready inference server for any model from the Hub can now be deployed with a single CLI command.

Jun 28, 2026·2 min

LLM · Hugging Face Blog

AllenAI Releases olmo-eval — A Platform for Evaluating LLMs During Training

AllenAI released olmo-eval, an open toolkit for continuous evaluation of language models throughout the training cycle — checkpoint by checkpoint.

Jun 15, 2026·2 min

LLM · Hugging Face Blog

Cohere Presents North Mini Code — Model for Developers and AI Agents

Cohere released North Mini Code, a 30-billion-parameter model trained specifically for programming and AI agent work. The model is free and available to everyone.

Jun 11, 2026·3 min

LLM · Hugging Face Blog

Voice agents are not ready for bilingual customers: ServiceNow-AI research

Voice agents perform poorly with bilingual customers, according to research from the ServiceNow-AI team, which tested seven popular speech recognition systems on examples of code-switching, when…

Jun 11, 2026·3 min

LLM · Hugging Face Blog

How to Speed Up PyTorch Models: A Practical Guide to torch.profiler

Hugging Face explained torch.profiler, a built-in PyTorch tool for analyzing performance. It helps identify bottlenecks in model training and inference.

May 29, 2026·3 min

LLM · Hugging Face Blog

Hugging Face Enables TRL to Handle Trillion-Parameter Models Through Delta Weights

Hugging Face added Delta Weight Sync to TRL, a technique that sends only weight changes instead of full files, cutting the volume of data transferred by a factor of hundreds when training giant models.

May 29, 2026·2 min

LLM · Hugging Face Blog
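The idea of sending only weight changes, described in the TRL item above, can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not TRL's implementation: `compute_delta` and `apply_delta` are hypothetical helpers, weights are modeled as flat lists of floats, and the delta is assumed to be a sparse list of changed positions.

```python
from typing import Dict, List, Tuple

Weights = Dict[str, List[float]]
Delta = Dict[str, List[Tuple[int, float]]]


def compute_delta(old: Weights, new: Weights) -> Delta:
    """Collect, per tensor, only the (index, new_value) pairs that changed.

    Assumes both dicts hold the same tensor names with equal lengths.
    """
    delta: Delta = {}
    for name, new_values in new.items():
        changed = [
            (i, value)
            for i, (prev, value) in enumerate(zip(old[name], new_values))
            if prev != value
        ]
        if changed:  # unchanged tensors are not sent at all
            delta[name] = changed
    return delta


def apply_delta(weights: Weights, delta: Delta) -> Weights:
    """Apply a delta to a copy of the receiver's stale weights."""
    updated = {name: list(values) for name, values in weights.items()}
    for name, changes in delta.items():
        for index, value in changes:
            updated[name][index] = value
    return updated


# Toy weights before and after one training step (hypothetical values).
trainer_old = {"layer.weight": [0.1, 0.2, 0.3, 0.4], "layer.bias": [0.0, 0.0]}
trainer_new = {"layer.weight": [0.1, 0.25, 0.3, 0.4], "layer.bias": [0.0, 0.0]}

delta = compute_delta(trainer_old, trainer_new)
synced = apply_delta(trainer_old, delta)

values_sent = sum(len(changes) for changes in delta.values())
values_total = sum(len(values) for values in trainer_new.values())
print(f"sent {values_sent} of {values_total} values")
```

The receiver ends up with the same weights as the trainer while only the changed entries cross the wire. How a real system encodes and compresses the delta is outside the scope of this sketch.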

Reachy Mini Learns to Speak Locally Without the Cloud

The Reachy Mini humanoid robot can now run a full speech recognition stack locally, with no cloud service or API, thanks to open models from Hugging Face.

May 29, 2026·3 min

LLM · Hugging Face Blog

IBM and Artificial Analysis create benchmark: AI agents fail at IT tasks

Large language models scored less than 50% on the new ITBench-AA benchmark, which assesses AI agents' ability to solve corporate IT tasks. The result suggests that full automation of IT work remains a distant prospect.

May 29, 2026·3 min

LLM · Hugging Face Blog

NVIDIA Nemotron: Diffusion Models Generate Text 6x Faster

NVIDIA introduced Nemotron-Labs Diffusion, the first language models that generate text in parallel instead of sequentially. In speculative mode, they run 6× faster than conventional models thanks to the diffusion approach.

May 25, 2026·3 min

LLM · Hugging Face Blog

How a Small Model Beat GPT-5 and Claude Opus at Portuguese OCR

Dharma AI trained a specialized 3-billion-parameter model that outperformed all commercial frontier models in text recognition tests and was 52 times cheaper to run.

May 25, 2026·3 min

LLM · Hugging Face Blog

Hugging Face launched Open Agent Leaderboard to evaluate AI agents

Hugging Face introduced an open benchmark for comparing full AI agent systems. It found that agent architecture matters more than the model chosen.

May 21, 2026·3 min

LLM · Hugging Face Blog

PaddleOCR 3.5 Gains Support for Hugging Face Transformers

PaddleOCR has been updated with full support for Hugging Face Transformers as an inference backend. Text recognition and document parsing now work in a PyTorch environment.

May 21, 2026·2 min

LLM · Hugging Face Blog

NVIDIA Shows Efficient Method to Train Cosmos for Robot Video Generation Using LoRA

NVIDIA released a guide for fine-tuning Cosmos Predict 2.5 using LoRA/DoRA, a parameter-efficient adaptation method that enables training for robot video generation in 17 hours on a single GPU.

May 21, 2026·2 min

LLM · Hugging Face Blog

Ettin Reranker from Hugging Face: 6 Models for Precise Search Reranking

Hugging Face released 6 Ettin rerankers based on ModernBERT with state-of-the-art accuracy and speed thanks to Flash Attention 2 and sequence optimization.

May 21, 2026·3 min

LLM · Hugging Face Blog

OlmoEarth v1.1: Allen AI Releases Satellite Models That Are 3 Times Cheaper to Run

Allen AI presented a more efficient version of its models for analyzing satellite imagery, cutting computational costs threefold while maintaining quality.

May 21, 2026·2 min

LLM · Hugging Face Blog

How Allen AI's model learned to discover expert specialization on its own

Allen AI introduced EMO, a mixture-of-experts model that naturally develops domain specialization (health, politics, film) without explicit training on those categories.

May 17, 2026·3 min

LLM · Hugging Face Blog

CyberSecQwen-4B: how a small model became a vulnerability expert

The specialized 4-billion-parameter cybersecurity model outperforms general-purpose competitors in vulnerability analysis and runs locally on personal hardware without cloud services.

May 17, 2026·3 min

LLM · Hugging Face Blog

OncoAgent: AI system for early cancer detection based on private patient data

How a machine learning algorithm helps doctors make cancer diagnosis decisions without compromising patient confidentiality.

May 17, 2026·3 min

LLM · Hugging Face Blog

Hugging Face sped up LLM inference by 22% with asynchronous batching

Parallel CPU and GPU processing instead of sequential processing cut GPU idle time by 24% and sped up token generation by nearly a quarter without changing the model.

May 17, 2026·2 min