
PaddleOCR 3.5 Gains Support for Hugging Face Transformers


AI-processed from Hugging Face Blog; edited by Hamidun News
Source: Hugging Face Blog. Collage: Hamidun News.

PaddleOCR has been updated to version 3.5 and can now run on Hugging Face Transformers. Instead of depending on Paddle's own inference engine, teams can use a familiar PyTorch environment for text recognition and document parsing.

Transformers Instead of Paddle

Before version 3.5, PaddleOCR was tied to the Paddle inference engine, the runtime of Baidu's PaddlePaddle framework. To use the library, teams had to install the Paddle stack even if the rest of their project ran on PyTorch. Version 3.5 adds backend selection through the `engine` parameter: if Transformers is installed, set `engine="transformers"` and the OCR models run on PyTorch. For teams already using PyTorch and Transformers elsewhere, this removes the need to maintain two separate runtimes or to switch tools when moving from exploration to production.
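The selection logic can be sketched in a few lines of Python. This is a minimal illustration, not the library's documented behavior: the article confirms only the `engine` parameter and the value `"transformers"`. The helper names `pick_engine`, `ocr_kwargs`, and `run_ocr` are hypothetical, and falling back to the library default when PyTorch or Transformers is missing is this sketch's own choice.

```python
from importlib.util import find_spec
from typing import Callable, Optional


def _installed(module: str) -> bool:
    """Return True if the top-level `module` can be found in this environment."""
    return find_spec(module) is not None


def pick_engine(is_installed: Callable[[str], bool] = _installed) -> Optional[str]:
    """Prefer the Transformers backend when torch and transformers are both present.

    Returning None means "pass no engine argument" and keep the library default.
    """
    if is_installed("torch") and is_installed("transformers"):
        return "transformers"
    return None


def ocr_kwargs(engine: Optional[str]) -> dict:
    """Build constructor keyword arguments, omitting `engine` when it is None."""
    return {} if engine is None else {"engine": engine}


def run_ocr(image_path: str):
    """Run OCR on one image with the selected backend.

    Assumption: the PaddleOCR constructor accepts `engine`, and predict()
    takes an image path. Requires paddleocr to be installed.
    """
    from paddleocr import PaddleOCR  # imported lazily so the helpers work without it

    ocr = PaddleOCR(**ocr_kwargs(pick_engine()))
    return ocr.predict(image_path)
```

Passing the installation check in as a function keeps the decision testable without installing either runtime.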

What's Supported

The Transformers backend works with two model families:

  • PP-OCRv5: text recognition on images and documents, including multilingual OCR
  • PaddleOCR-VL 1.5: document parsing with visual understanding of page layout and structure

Both can be tuned through `engine_config`, which covers the data type (float32 or bfloat16), device placement, and the attention implementation (sdpa, PyTorch's scaled dot-product attention).
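As a rough sketch, the `engine_config` options could be assembled as a plain dictionary. The article names the options but not their keys, so `dtype`, `device`, and `attn_implementation` below are assumptions borrowed from Transformers naming conventions. Restricting `dtype` to the two values the article lists is likewise this sketch's choice, and `make_engine_config` and `make_parser` are hypothetical helpers.

```python
# The two data types the article lists for the Transformers backend.
ALLOWED_DTYPES = {"float32", "bfloat16"}


def make_engine_config(
    dtype: str = "float32",
    device: str = "cpu",
    attention: str = "sdpa",
) -> dict:
    """Assemble an engine_config dictionary.

    Assumption: the key names are guesses modeled on Transformers
    conventions; check the PaddleOCR documentation for the real ones.
    """
    if dtype not in ALLOWED_DTYPES:
        raise ValueError(f"unsupported dtype {dtype!r}; expected one of {sorted(ALLOWED_DTYPES)}")
    return {"dtype": dtype, "device": device, "attn_implementation": attention}


def make_parser(config: dict):
    """Create a document parser on the Transformers backend.

    Assumption: PaddleOCRVL accepts `engine` and `engine_config` keyword
    arguments. Requires paddleocr to be installed.
    """
    from paddleocr import PaddleOCRVL  # imported lazily so the config helper works without it

    return PaddleOCRVL(engine="transformers", engine_config=config)
```

For example, `make_engine_config(dtype="bfloat16", device="cuda:0")` would describe a half-precision run on the first GPU, provided the assumed key names match the real API.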

Basic usage from the command line is unchanged: `paddleocr ocr -i image.png`. To run the same command on the Transformers backend, add one flag: `paddleocr ocr -i image.png --engine transformers`. The Python API allows more detailed configuration, with the data type and attention implementation set via `engine_config`.
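For scripted batch runs, the two commands above can be generated from Python. The sketch below uses only the subcommand and flags quoted in the article (`ocr`, `-i`, `--engine`). The function names `build_command` and `run` are hypothetical, and actually running the command assumes the `paddleocr` executable is on the PATH.

```python
import shlex
import subprocess
from typing import List, Optional


def build_command(image: str, engine: Optional[str] = None) -> List[str]:
    """Return the paddleocr CLI invocation as an argument list.

    With engine=None the command matches the default usage; otherwise
    the --engine flag is appended.
    """
    cmd = ["paddleocr", "ocr", "-i", image]
    if engine is not None:
        cmd += ["--engine", engine]
    return cmd


def run(image: str, engine: Optional[str] = None) -> subprocess.CompletedProcess:
    """Execute the command and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_command(image, engine), check=True)


# Printing the command is a cheap way to inspect it before running anything.
print(shlex.join(build_command("image.png", engine="transformers")))
```

Building an argument list rather than a shell string avoids quoting problems when file names contain spaces.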

Who Benefits from This

The Transformers backend suits several scenarios. First, teams already working with a PyTorch stack do not need to learn a new tool. Second, RAG (retrieval-augmented generation) applications often need to parse PDFs and extract structured data for indexing. Third, Document AI projects need to automate the processing of large document volumes. The standard Paddle backend remains the better choice when processing speed is critical and maximum throughput is needed: it is somewhat faster thanks to optimizations specific to the Paddle runtime.

What This Means

PaddleOCR is gradually moving from an isolated tool to one option within the broader Transformers ecosystem. For RAG and Document AI applications, this simplifies the pipeline: a single PyTorch stack can serve embeddings, language models, and document parsing. That reduces the complexity of production deployment and makes it easier to keep one consistent set of dependency versions.

Hamidun News
AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.
