Lightning-Fast OCR for Ruby on Rails

Zero dependencies. Built with Rust. No external services required.
Extract text from images and PDFs entirely on your own server.

2 engine options
0 dependencies
7+ formats supported

Why activestorage-ocr?

Two OCR Engines

Choose between ocrs (pure Rust) and Tesseract. Compare results side-by-side to find what works best for your documents.

Zero Dependencies

No system installations required. No cloud API keys needed. Everything is statically linked into a single binary.

Native Rails Integration

Works with Active Storage through a simple API and familiar patterns. OCR runs during blob analysis, and the results are stored in blob metadata.

Simple to Use

Extract text from any attachment

ruby
# Get OCR results from an Active Storage attachment
result = ActiveStorage::Ocr.extract_text(document.file)

result.text                # => "Extracted text from image..."
result.confidence          # => 0.95
result.processing_time_ms  # => 142
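
In practice you may want to act on the confidence score before trusting the text. The sketch below is one way to do that, not part of the gem: `OcrResult` is a stand-in Struct that mimics the three readers shown above, and `usable_text` and its `min_confidence` threshold are hypothetical names chosen for illustration.

```ruby
# Stand-in for the object returned by ActiveStorage::Ocr.extract_text.
# Only the three readers shown above are assumed; the real class may differ.
OcrResult = Struct.new(:text, :confidence, :processing_time_ms, keyword_init: true)

# Hypothetical helper: return the stripped text when confidence meets the
# threshold, otherwise nil so the caller can fall back to manual review.
def usable_text(result, min_confidence: 0.8)
  return nil if result.confidence.nil? || result.confidence < min_confidence

  text = result.text.to_s.strip
  text.empty? ? nil : text
end

sample = OcrResult.new(text: "  Invoice #1042  ", confidence: 0.95, processing_time_ms: 142)
usable_text(sample)                        # => "Invoice #1042"
usable_text(sample, min_confidence: 0.99)  # => nil
```

A threshold of 0.8 is an arbitrary starting point; tune it against your own documents.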

Or use automatic blob metadata

ruby
# OCR runs automatically during blob analysis
document.file.analyze

# Results stored in blob metadata
document.file.metadata["ocr_text"]
document.file.metadata["ocr_confidence"]
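
Because analysis may not have finished when you read the blob, it can help to treat missing OCR keys as "not ready yet". The following is a minimal sketch that works on a plain Hash with the two metadata keys shown above; `ocr_summary`, `preview_length`, and the `:pending`/`:done` statuses are hypothetical names, not part of the gem.

```ruby
# Hypothetical helper: summarize the OCR keys of a blob's metadata hash.
# "ocr_text" and "ocr_confidence" are the keys shown above; the rest is illustrative.
def ocr_summary(metadata, preview_length: 40)
  text = metadata["ocr_text"]
  return { status: :pending } if text.nil? # no OCR output stored yet

  {
    status: :done,
    preview: text.strip[0, preview_length],
    confidence: metadata["ocr_confidence"]
  }
end

ocr_summary({})
# => { status: :pending }
ocr_summary({ "ocr_text" => "Hello world", "ocr_confidence" => 0.85 })
# => { status: :done, preview: "Hello world", confidence: 0.85 }
```

In a Rails app you would pass `document.file.metadata` as the argument.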

Try It Now

Upload an image or PDF and compare both OCR engines side-by-side. Supports PNG, JPEG, GIF, BMP, WebP, TIFF, and PDF.

Getting Started

1. Add to your Gemfile

ruby
gem "activestorage-ocr"

2. Install the OCR server binary

bash
bundle install
bin/rails activestorage_ocr:install

3. Add OCR server to Procfile.dev

procfile
web: bin/rails server
ocr: activestorage-ocr-server --host 127.0.0.1 --port 9292

Now bin/dev starts both Rails and the OCR server together.
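
If OCR results never appear, a quick first check is whether anything is listening on the address from the Procfile entry. This sketch uses only Ruby's standard `socket` library and tests for an open TCP port; it makes no assumptions about the server's protocol or endpoints. The helper name `ocr_server_listening?` is hypothetical.

```ruby
require "socket"

# Hypothetical helper: true when something accepts TCP connections on host:port.
# Defaults mirror the --host and --port values from the Procfile entry above.
def ocr_server_listening?(host: "127.0.0.1", port: 9292, timeout: 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  # Connection refused, timed out, or the host could not be resolved.
  false
end

puts ocr_server_listening? ? "OCR server is up" : "OCR server is not reachable"
```

An open port only shows that a process is bound there, not that it is the OCR server or that it is healthy.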

4. That's it! OCR runs automatically

ruby
# OCR metadata is added automatically to Active Storage blobs
document.file.metadata["ocr_text"]       # => "Extracted text..."
document.file.metadata["ocr_confidence"] # => 0.85

Supported Formats

PNG
JPEG
GIF
BMP
WebP
TIFF
PDF
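
If you call OCR yourself, you may want to skip attachments that fall outside this list. The sketch below maps the formats above to their standard MIME types. How the gem itself detects formats is not stated here, so treat the mapping and the helper name `ocr_candidate?` as assumptions.

```ruby
# Assumed mapping: standard MIME types for the formats listed above.
OCR_CONTENT_TYPES = %w[
  image/png image/jpeg image/gif image/bmp image/webp image/tiff application/pdf
].freeze

# Hypothetical helper: true when a content type belongs to a listed format.
# Parameters such as "; charset=binary" are ignored.
def ocr_candidate?(content_type)
  base = content_type.to_s.split(";").first.to_s.strip.downcase
  OCR_CONTENT_TYPES.include?(base)
end

ocr_candidate?("image/png")   # => true
ocr_candidate?("text/plain")  # => false
```

In a Rails app the argument would typically be `document.file.content_type`.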