Lightning-Fast OCR for Ruby on Rails
Zero dependencies. Built with Rust. No external services required.
Extract text from images and PDFs entirely on your own server.
Why activestorage-ocr?
Two OCR Engines
Choose between ocrs (pure Rust) and Tesseract. Compare results side by side to find what works best for your documents.
Zero Dependencies
No system installations required. No cloud API keys needed. Everything is statically linked into a single binary.
Native Rails Integration
Works seamlessly with Active Storage. Simple API, familiar patterns. Automatic blob analysis and metadata storage.
Simple to Use
Extract text from any attachment
# Get OCR results from an Active Storage attachment
result = ActiveStorage::Ocr.extract_text(document.file)
result.text # => "Extracted text from image..."
result.confidence # => 0.95
result.processing_time_ms # => 142
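Because each result carries a confidence score, a caller may want to discard low-confidence text before using it. The following is a minimal sketch and not part of the gem: OcrResult is a stand-in Struct that mirrors the three fields shown above, and the usable_text helper and its 0.8 threshold are hypothetical.

```ruby
# Stand-in for the result object shown above; the real class comes from the gem.
OcrResult = Struct.new(:text, :confidence, :processing_time_ms, keyword_init: true)

# Hypothetical helper: return stripped text only when the confidence
# meets the threshold, otherwise nil.
def usable_text(result, min_confidence: 0.8)
  return nil if result.nil? || result.confidence.nil?
  return nil if result.confidence < min_confidence

  text = result.text.to_s.strip
  text.empty? ? nil : text
end

good = OcrResult.new(text: " Invoice #1042 ", confidence: 0.95, processing_time_ms: 142)
poor = OcrResult.new(text: "lnv0ice", confidence: 0.41, processing_time_ms: 97)

p usable_text(good)
p usable_text(poor)
```

The right threshold depends on your documents; comparing both engines on a sample set is one way to pick it.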
Or use automatic blob metadata
# OCR runs automatically during blob analysis
document.file.analyze
# Results stored in blob metadata
document.file.metadata["ocr_text"]
document.file.metadata["ocr_confidence"]
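A blob that has not been analyzed yet will not have these keys, so reads should tolerate their absence. The sketch below assumes only the two keys shown above and works on a plain Hash standing in for blob metadata; the ocr_summary helper is hypothetical.

```ruby
# Hypothetical helper: summarize OCR metadata from a Hash.
# "ocr_text" and "ocr_confidence" are the keys shown above.
def ocr_summary(metadata)
  text = metadata["ocr_text"]
  return { available: false, text: nil, confidence: nil } if text.nil?

  { available: true, text: text, confidence: metadata["ocr_confidence"]&.to_f }
end

analyzed = { "ocr_text" => "Total due: $120.00", "ocr_confidence" => 0.85 }
pending  = { "identified" => true } # no OCR keys yet

p ocr_summary(analyzed)
p ocr_summary(pending)
```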
Try It Now
Upload an image or PDF and compare both OCR engines side by side. Supported formats: PNG, JPEG, GIF, BMP, WebP, TIFF, and PDF.
Getting Started
Add to your Gemfile
gem "activestorage-ocr"
Install the OCR server binary
bundle install
bin/rails activestorage_ocr:install
Add the OCR server to Procfile.dev
web: bin/rails server
ocr: activestorage-ocr-server --host 127.0.0.1 --port 9292
Now bin/dev starts Rails and the OCR server together.
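If OCR results do not appear, a quick first check is whether anything is listening on the host and port from the Procfile entry. This is a minimal sketch using only Ruby's standard library; the ocr_server_listening? helper is hypothetical, and it only tests the TCP port, not any endpoint of the OCR server itself.

```ruby
require "socket"

# Hypothetical helper: true if something accepts TCP connections on host:port.
# Any connection failure (refused, timeout, bad host) counts as "not listening".
def ocr_server_listening?(host: "127.0.0.1", port: 9292, timeout: 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue StandardError
  false
end

if ocr_server_listening?
  puts "OCR server port is open"
else
  puts "Nothing is listening on 127.0.0.1:9292"
end
```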
That's it! OCR runs automatically
# OCR metadata is added automatically to Active Storage blobs
document.file.metadata["ocr_text"] # => "Extracted text..."
document.file.metadata["ocr_confidence"] # => 0.85