Curious how it does on multi-page scanned PDFs vs. single screenshots? The ORT vision/decoder split is the part that usually makes or breaks CPU VLM OCR...
krunck 6 hours ago [-]
I had to extract the page images from the PDF for it to work, then run it on each extracted image.
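For anyone who wants to script that workaround, here is a rough sketch of the extract-then-OCR loop. It assumes poppler's `pdftoppm` is installed and on PATH. `run_ocr` is a hypothetical stand-in for however you invoke the OCR tool on a single image, not the project's actual API, and the helper names are made up for illustration.

```python
import re
import subprocess
from pathlib import Path
from typing import Callable, List


def pdftoppm_command(pdf_path: str, out_prefix: str, dpi: int = 200) -> List[str]:
    """Build the pdftoppm call that renders every page to <out_prefix>-<n>.png."""
    return ["pdftoppm", "-r", str(dpi), "-png", pdf_path, out_prefix]


def page_number(image_path: Path) -> int:
    """Read the page number pdftoppm appends to the file name (e.g. page-03.png)."""
    match = re.search(r"-(\d+)\.png$", image_path.name)
    return int(match.group(1)) if match else 0


def ocr_pdf(
    pdf_path: str,
    work_dir: str,
    run_ocr: Callable[[Path], str],  # hypothetical: one image in, recognized text out
    dpi: int = 200,
) -> List[str]:
    """Render the PDF to page images, then OCR each image in page order."""
    out_dir = Path(work_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Raises CalledProcessError if pdftoppm fails, FileNotFoundError if it is missing.
    subprocess.run(pdftoppm_command(pdf_path, str(out_dir / "page"), dpi), check=True)
    # Sort numerically so page 10 does not land before page 2.
    images = sorted(out_dir.glob("page-*.png"), key=page_number)
    return [run_ocr(image) for image in images]
```

Calling `ocr_pdf("scan.pdf", "pages", run_ocr=my_ocr_function)` would return one string per page. The 200 DPI default is a guess. Scanned documents with small print may need a higher value.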
monosma 4 hours ago [-]
What was the reason for adopting PaddleOCR?
Can other OCR models be used as well?
mrkn1 4 hours ago [-]
No reason other than their Q4 model working reasonably well and fast on my CPU laptop. It should work with any ONNX VLM model.
kouru225 4 hours ago [-]
Roman alphabet only or does this work with other alphabets?
mrkn1 4 hours ago [-]
109 languages, including other alphabets.
garrett2558 8 hours ago [-]
Very cool, I'm building my own local-first product as well