Open Source
ScanTailor is open-source software: it is free to use, and anyone can contribute to its development, which keeps the project transparent and lets it keep improving.
Image Processing
It offers various image processing steps such as cleaning, dewarping, and cropping, which enhance the quality of scanned documents.
Batch Processing
ScanTailor supports batch processing of images, making it efficient for users dealing with large volumes of scanned documents.
Customizable Output
Users can define the output parameters and settings to suit specific needs, giving greater control over the final document appearance.
Cross-platform Support
The software is available for multiple operating systems including Windows, macOS, and Linux, increasing its accessibility to a wide range of users.
There's also https://scantailor.org/ (and a maintained fork at https://github.com/4lex4/scantailor-advanced), which semi-automates unwarping and other corrective tasks in scanned books. - Source: Hacker News / about 2 years ago
Scantailor (https://scantailor.org) is the tool for self-scanned books that exist as images (png, jpg, etc.). However, I usually use IrfanView with the PDF plugin (https://irfanview.com - download both IrfanView and the Plugins from this home page). I have shown elsewhere in r/PDF how you can do batch splitting of two-page scans and clean up muddy pages (yellowed or browned). In the Reddit search box, search for... Source: about 2 years ago
Scantailor https://scantailor.org/ might be useful. Source: about 2 years ago
Scantailor is a good open source option that has a lot of features centered towards this process. Source: over 2 years ago
I use OCRmyPDF on a regular basis to OCR journal articles my library sends me. I've found it works great on English but (with appropriate language packs installed) works poorly on Greek and Hebrew. It also makes no effort to understand the layout of pages (e.g., tables). The project is fantastic, though. I've often considered building a web frontend that cleans up PDFs and then OCRs them using OCRmyPDF. For... - Source: Hacker News / almost 3 years ago
- Load the images into a program called scantailor. It's an old program but very solid, free and open source. It loads all the TIFs for post-processing; it can detect and separate pages, rotate and deskew them, detect the content of the page, and clean up the scan very nicely. It even detects which parts of the content are text and which are images, meaning your images will still be shown in RGB, whereas text... Source: about 3 years ago
You can try using a program like ScanTailor. You will have to import all the images into the program and let it run on default settings. My only gripe with this option is that the output images are huge. I have not used it in a while, so the program may have improved since. Source: over 3 years ago
You typically need to pre-process the images. I'd recommend https://scantailor.org/ for this (OSS, but. - Source: Hacker News / almost 4 years ago
After this, make a new project in Scan Tailor. This lets you fix up images and allows OCR apps to read the text much better. Source: almost 4 years ago
This is an informative page about ScanTailor. You can review and discuss the product here. The primary details have not been verified within the last quarter, and they might be outdated. If you think we are missing something, please use the means on this page to comment or suggest changes. All reviews and comments are highly encouraged and appreciated, as they help everyone in the community make an informed choice. Please always be kind and objective when evaluating a product and sharing your opinion.