CatchTheTornado/text-extract-api

API for extracting and parsing documents (PDF, Word, PPTX, ...) using state-of-the-art OCR engines and Ollama-supported models. Anonymizes documents and removes PII. Converts any document or picture to structured JSON or Markdown.

Stars: 3.1k, Forks: 268, Weekly growth: -1/wk

Topics: anonymization, api, extract, json, llm, ocr, ocr-python, pdf, pii

Trend 0

[Chart: Star & Fork Trend (17 data points; series: Stars, Forks)]

Multi-Source Signals

GitHub

stars 3.1k

forks 268

Growth Velocity

CatchTheTornado/text-extract-api lost 1 star this period. 7-day velocity: -0.0%.
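
The page does not define "7-day velocity". As a rough sketch, assuming it is the weekly star change divided by the total star count, the snippet below shows how a one-star loss on about 3,100 stars could display as "-0.0%". The function name `weekly_velocity_pct` and the exact count of 3,100 are illustrative assumptions, not values taken from the page.

```python
def weekly_velocity_pct(star_delta: int, total_stars: int) -> float:
    """Weekly star change as a percentage of total stars.

    Assumed definition: the source page does not say how velocity is computed.
    """
    if total_stars <= 0:
        raise ValueError("total_stars must be positive")
    return 100.0 * star_delta / total_stars


# 3100 is an assumed exact count; the page only shows the rounded "3.1k".
velocity = weekly_velocity_pct(-1, 3100)

# One-decimal formatting keeps the sign of a tiny negative value.
label = f"{velocity:.1f}%"
print(label)
```

Under this assumption, the value is about -0.03%, which rounds to zero at one decimal place while the minus sign is kept.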

Metric	text-extract-api	LlamaIndexTS	claude-code-ultimate-guide	openvino_notebooks
Stars	3.1k	3.1k	3.1k	3.1k
Forks	268	513	431	1.0k
Weekly Growth	-1	+1	+38	+3
Language	Python	TypeScript	TypeScript	Jupyter Notebook
Sources	1	1	1	1
License	MIT	MIT	CC-BY-SA-4.0	Apache-2.0

[Chart: Capability Radar, text-extract-api vs LlamaIndexTS]

Maintenance Activity 34

Last code push 121 days ago.

Community Engagement 43

Fork-to-star ratio: 8.7%. Lower fork ratio may indicate passive usage.
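
As a minimal sketch, the fork-to-star ratio can be computed as forks divided by stars, expressed as a percentage. The function name `fork_to_star_ratio` is hypothetical, and the star count of 3,100 is an assumption based on the rounded "3.1k" shown on the page.

```python
def fork_to_star_ratio(forks: int, stars: int) -> float:
    """Return forks as a percentage of stars."""
    if stars <= 0:
        raise ValueError("stars must be positive")
    return 100.0 * forks / stars


# 268 forks is from the page; 3100 stars is an assumed exact value for "3.1k".
ratio = fork_to_star_ratio(268, 3100)
print(f"{ratio:.1f}%")
```

With the assumed 3,100 stars this gives roughly 8.6%, so the 8.7% reported above suggests the exact star count is slightly below 3,100.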

Issue Burden 70

Issue data not yet available.

Growth Momentum 30

No measurable growth in the current period (first-day cold start expected).

License Clarity 95

Licensed under MIT, a permissive license that allows commercial use.

Risk scores are computed from real-time repository data. Higher scores indicate healthier metrics.