
OCRmyPDF runner

A very simple tool that watches a directory for new files and runs OCRmyPDF on them.

This is needed because paperless(-ngx) always creates a copy of the document when its built-in cleanup and OCR feature is used. External pre-consumption scripts do not help either, as they run on all new documents, not just on files from the consumption directory. The solution is this watchdog/runner, which pre-processes only scanned documents and leaves everything else untouched.

The idea is to point it at the directory your scanner writes to; the runner then writes the pre-processed document into a directory that paperless watches.

Usage

  1. Install the project somewhere.
  2. Edit main.go to use the correct paths to your scanner and paperless consumption directories.
  3. Copy ocrmypdf-runner.service into your paperless systemd user services directory ($HOME/.config/systemd/user/ocrmypdf-runner.service).
  4. systemctl --user daemon-reload
  5. systemctl --user enable ocrmypdf-runner.service
  6. systemctl --user start ocrmypdf-runner.service