OCR Analysis Project
A FastAPI project that extracts text from images with the Tesseract OCR engine, analyzes that text for sensitive data, and caches the results in Redis. Images can be supplied either as URLs or as uploaded files.
Features
- Analyzes text extracted from images for sensitive data.
- Supports both image URLs and file uploads.
- Caches analysis results in Redis so repeated requests do not rerun the analysis.
- Uses ngrok for tunneling to expose your local server to the internet.
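The analysis and caching features can be pictured with a small sketch. Everything in it is illustrative: the regular expressions in `PATTERNS`, the `ocr:` key prefix, the choice to key the cache on a hash of the extracted text, and the `InMemoryCache` stand-in are assumptions made for the example, not the project's actual rules or code.

```python
import hashlib
import json
import re

# Illustrative patterns only; the project's real detection rules may differ.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "credit_card": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


class InMemoryCache:
    """Hypothetical stand-in offering get/set, so the sketch runs without Redis."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value


def find_sensitive_data(text):
    """Return every match of each pattern, grouped by pattern name."""
    return {name: pattern.findall(text) for name, pattern in PATTERNS.items()}


def analyze_with_cache(text, cache):
    """Cache-aside lookup: return (findings, cache_hit)."""
    # Keying on a hash of the text is an assumption made for this sketch.
    key = "ocr:" + hashlib.sha256(text.encode("utf-8")).hexdigest()
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached), True
    findings = find_sensitive_data(text)
    cache.set(key, json.dumps(findings))
    return findings, False
```

A `redis.Redis` client from the redis-py package also provides `get` and `set`, so it could be passed in place of `InMemoryCache` when a Redis server is available.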
Prerequisites
- Docker
- ngrok (for tunneling to your local FastAPI server)
- Poetry (for installing dependencies when running without Docker)
Installation
Clone this repository to your local machine:
git clone https://gitlab.com/c4pt-mqs/opcharist.git
cd opcharist
Install the project dependencies using Poetry:
poetry install
Run the API server locally:
uvicorn main:app --reload
Docker
Build the images and start the containers in the background:
docker-compose up --build -d
This will start the FastAPI app and a Redis container.
Your FastAPI app should now be running at http://localhost:8000.
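Because the containers start in the background, the app may need a moment before it answers. A minimal sketch of a readiness check follows; it assumes the status endpoint at `/` returns HTTP 200 once the server is up, and the helper names `is_up` and `wait_until_ready` are hypothetical.

```python
import time
import urllib.error
import urllib.request


def is_up(url):
    """Return True if the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(url="http://localhost:8000/", attempts=10, delay=1.0, probe=is_up):
    """Poll the URL until the probe succeeds or the attempts run out."""
    for _ in range(attempts):
        if probe(url):
            return True
        time.sleep(delay)
    return False
```

The `probe` parameter exists only so the polling loop can be exercised without a running server.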
Endpoints
GET /:
Server status. Returns a simple message indicating that the server is running.
POST /analyze/:
Analyze text extracted from an image URL.
POST /upload/:
Upload an image file and analyze the extracted text.
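The three endpoints could be called from Python roughly as sketched here, using only the standard library. The request shapes are assumptions: the JSON field `url` for /analyze/ and the form field `file` for /upload/ are guesses, so check the interactive documentation that FastAPI serves at /docs by default for the real schema. The helper names are hypothetical.

```python
import json
import uuid
import urllib.request

BASE_URL = "http://localhost:8000"  # local address of the running app


def status_request(base_url=BASE_URL):
    """Build the GET / request (server status)."""
    return urllib.request.Request(f"{base_url}/", method="GET")


def analyze_request(image_url, base_url=BASE_URL):
    """Build the POST /analyze/ request. The field name "url" is an assumption."""
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/analyze/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def upload_request(filename, content, base_url=BASE_URL):
    """Build the POST /upload/ request. The field name "file" is an assumption."""
    boundary = uuid.uuid4().hex
    # Hand-built multipart/form-data body with a single file part.
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        content,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        f"{base_url}/upload/",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def send(request):
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

With the server running, `send(status_request())` returns the status message, and `send(analyze_request("https://example.com/image.png"))` submits an image URL for analysis (the URL here is a placeholder).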
Using ngrok
To expose your local FastAPI server to the internet, you can use ngrok. After running your FastAPI app with Docker Compose, open a new terminal window and run ngrok:
ngrok http 8000
ngrok prints a temporary public URL that forwards to your local FastAPI app, so the app can be reached from outside your machine for as long as the tunnel is open.
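The public URL can also be read programmatically. The sketch below assumes the ngrok agent's local API is reachable at its default address, `http://127.0.0.1:4040/api/tunnels`, and that the response lists tunnels with a `public_url` field; if your ngrok version or configuration differs, adjust accordingly. The helper names are hypothetical.

```python
import json
import urllib.request

NGROK_API = "http://127.0.0.1:4040/api/tunnels"  # assumed default agent address


def pick_public_url(payload):
    """Prefer an https tunnel; fall back to the first tunnel, or None."""
    tunnels = payload.get("tunnels", [])
    for tunnel in tunnels:
        url = tunnel.get("public_url", "")
        if url.startswith("https://"):
            return url
    return tunnels[0].get("public_url") if tunnels else None


def fetch_public_url(api_url=NGROK_API):
    """Query the local ngrok agent and return the tunnel's public URL."""
    with urllib.request.urlopen(api_url, timeout=5) as response:
        return pick_public_url(json.load(response))
```

Calling `fetch_public_url()` only works while `ngrok http 8000` is running in another terminal.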
License
This project is licensed under the GNU Affero General Public License (AGPL).