Our long-term goal is to build efficient and reliable 2.5B diffusion-based decoding for document OCR. MinerU-Diffusion reframes document OCR as an inverse rendering problem and replaces slow, ...
Python OCR pipeline for scanned company-list PDFs or page-level images. It processes JPG/JPEG/PNG files directly as one page each, renders PDF pages when PDFs are supplied, splits company entries by ...
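The input handling described above (image files treated as one page each, PDFs expanded into their pages) might look roughly like the sketch below. It is only an illustration under stated assumptions: `plan_pages`, `IMAGE_SUFFIXES`, and the `count_pdf_pages` hook are hypothetical names that do not come from the project, and the actual PDF rendering and entry-splitting steps are deliberately left out.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Tuple

# Extensions handled directly as single-page images (per the description above).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}

# A page is (source file, zero-based page index within that file).
Page = Tuple[Path, int]


def plan_pages(
    inputs: Iterable[str],
    count_pdf_pages: Callable[[Path], int],
) -> List[Page]:
    """Expand input files into a flat list of pages to OCR.

    `count_pdf_pages` is a hypothetical hook (an assumption of this sketch);
    a real pipeline would back it with whatever PDF renderer it uses.
    """
    pages: List[Page] = []
    for raw in inputs:
        path = Path(raw)
        suffix = path.suffix.lower()
        if suffix in IMAGE_SUFFIXES:
            # An image file is exactly one page.
            pages.append((path, 0))
        elif suffix == ".pdf":
            # A PDF contributes one entry per page it contains.
            pages.extend((path, i) for i in range(count_pdf_pages(path)))
        else:
            raise ValueError(f"unsupported input type: {path}")
    return pages
```

Passing the page counter in as a callable keeps the sketch free of any particular PDF library; a caller would supply a function backed by its own renderer.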