PaddlePaddle/PaddleOCR

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

Stars: 75.2k · Forks: 10.2k · Growth: +73 stars/wk

GitHub HuggingFace PyPI 3-source

ai4science chineseocr document-parsing document-translation kie ocr paddleocr-vl pdf-extractor-rag pdf-parser pdf2markdown pp-ocr pp-structure

Trend 17

[Chart: Star & Fork Trend, 43 data points; series: Stars, Forks]

Multi-Source Signals

Weekly downloads: 357.1k

Growth Velocity

PaddlePaddle/PaddleOCR gained 73 stars this period, with cross-source activity across 3 platforms (GitHub, HuggingFace, PyPI). 7-day velocity: 0.3%.

Metric	PaddleOCR	gpt4all	llm-course	gpt_academic
Stars	75.2k	77.3k	78.0k	70.4k
Forks	10.2k	8.3k	9.1k	8.4k
Weekly Growth	+73	-3	+41	+8
Language	Python	C++	N/A	Python
Sources	3	3	2	1
License	Apache-2.0	MIT	Apache-2.0	GPL-3.0

[Chart: Capability Radar, PaddleOCR vs gpt4all]

Maintenance Activity 100

Last code push 2 days ago.

Community Engagement 68

Fork-to-star ratio: 13.6%. Active community forking and contributing.
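
The ratio above can be reproduced from the rounded counts shown on this page (10.2k forks, 75.2k stars). A minimal sketch, assuming the ratio is simply forks divided by stars; the function name `fork_to_star_ratio` is illustrative and not part of any tool referenced here:

```python
def fork_to_star_ratio(forks: float, stars: float) -> float:
    """Return forks as a percentage of stars (assumed definition)."""
    if stars <= 0:
        raise ValueError("stars must be positive")
    return 100.0 * forks / stars


# Rounded counts taken from the page: 10.2k forks, 75.2k stars.
ratio = fork_to_star_ratio(forks=10_200, stars=75_200)
print(f"{ratio:.1f}%")  # prints 13.6%
```

Because the inputs are rounded to the nearest hundred, the result is only accurate to about one decimal place.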

Issue Burden 70

Issue data not yet available.

Growth Momentum 46

+73 stars this period — 0.10% growth rate.
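
The growth rate appears to follow from the period's star gain divided by the total star count. A minimal sketch under that assumption (the page does not state the exact denominator); `growth_rate_percent` is a hypothetical helper:

```python
def growth_rate_percent(new_stars: int, total_stars: int) -> float:
    """Return the period's star gain as a percentage of total stars.

    Assumption: the denominator is the current total star count.
    """
    if total_stars <= 0:
        raise ValueError("total_stars must be positive")
    return 100.0 * new_stars / total_stars


# Figures from the page: +73 stars this period, 75.2k stars in total.
rate = growth_rate_percent(new_stars=73, total_stars=75_200)
print(f"{rate:.2f}%")  # prints 0.10%
```

Using the star count at the start of the period (75,200 - 73) as the denominator gives the same value at two decimal places.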

License Clarity 95

Licensed under Apache-2.0. Permissive — safe for commercial use.

Risk scores are computed from real-time repository data. Higher scores indicate healthier metrics.