docs: update README and requirements for optional OCR dependencies
All checks were successful
Python tests / tests (push) Successful in 1m27s
@@ -15,12 +15,14 @@ Monitor the Berlin Perso/Passport portal, crack the audio CAPTCHA with Whisper,
 - Firefox + `geckodriver` in `$PATH` for Selenium
 - `ffmpeg` (needed by `openai-whisper`)
 - Optional: Tesseract OCR if you experiment with the image-based approach in `ocr/`
+- Optional: Python packages from `requirements-ocr.txt` when working on the OCR experiments
 
 ## Setup
 1. Clone the repo and create a virtual environment: `python -m venv .venv && source .venv/bin/activate`
 2. Install runtime dependencies: `pip install -r requirements.txt`
-3. (Optional) Add tooling such as pytest: `pip install -r dev-requirements.txt`
-4. Provide credentials:
+3. (Optional) Install OCR extras: `pip install -r requirements-ocr.txt`
+4. (Optional) Add tooling such as pytest: `pip install -r dev-requirements.txt`
+5. Provide credentials:
 - Copy `settings.example.py` to `settings.py`
 - Set `DOCUMENT_ID` (the identifier embedded in the Berlin status URL)
 - Set `WEBHOOK_URL` pointing to the service that should receive status payloads
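The credentials step in the README hunk above names two settings, `DOCUMENT_ID` and `WEBHOOK_URL`. A minimal sketch of a sanity check for them might look like the following. The function name `validate_settings` and the specific checks are assumptions for illustration; the actual layout of `settings.example.py` is not shown in this commit.

```python
from urllib.parse import urlparse


def validate_settings(document_id, webhook_url):
    """Return a list of problems; an empty list means both values look usable.

    Hypothetical helper: the real project may validate its settings differently.
    """
    problems = []

    # DOCUMENT_ID should be a non-empty string (its exact format is not specified here).
    if not isinstance(document_id, str) or not document_id.strip():
        problems.append("DOCUMENT_ID is empty")

    # WEBHOOK_URL should at least parse as an http(s) URL with a host part.
    parsed = urlparse(webhook_url if isinstance(webhook_url, str) else "")
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("WEBHOOK_URL is not an http(s) URL")

    return problems
```

Calling `validate_settings(settings.DOCUMENT_ID, settings.WEBHOOK_URL)` at startup would surface a forgotten placeholder before any browser automation begins.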
14
requirements-ocr.txt
Normal file
@@ -0,0 +1,14 @@
+# Optional OCR dependencies
+
+# traditional OCR
+pillow
+pytesseract
+opencv-python
+tqdm
+streamlit
+python-Levenshtein
+
+# ocr with keras/tensorflow
+tensorflow
+keras
+matplotlib
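Because the packages in `requirements-ocr.txt` are now optional, code under `ocr/` can no longer assume they are installed. One way to report what is missing without importing anything heavy is sketched below. The helper name `missing_packages` and the mapping constant are assumptions for illustration, not part of this commit; the import names are the usual top-level modules for each distribution.

```python
from importlib.util import find_spec

# Distribution name (as listed in requirements-ocr.txt) -> top-level import name.
OCR_IMPORT_NAMES = {
    "pillow": "PIL",
    "pytesseract": "pytesseract",
    "opencv-python": "cv2",
    "tqdm": "tqdm",
    "streamlit": "streamlit",
    "python-Levenshtein": "Levenshtein",
    "tensorflow": "tensorflow",
    "keras": "keras",
    "matplotlib": "matplotlib",
}


def missing_packages(import_names=OCR_IMPORT_NAMES):
    """Return the sorted distribution names whose module cannot be located.

    find_spec() only looks the module up; it does not import it, so the
    check stays cheap even for large packages such as tensorflow.
    """
    return sorted(
        dist for dist, module in import_names.items() if find_spec(module) is None
    )
```

If `missing_packages()` returns a non-empty list, an OCR experiment could print it together with the hint `pip install -r requirements-ocr.txt` and exit early.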
@@ -5,15 +5,5 @@ beautifulsoup4
 # audio processing
 openai-whisper
 
-# traditional OCR
-pillow
-pytesseract
-opencv-python
-tqdm
-streamlit
-python-Levenshtein
-
-# ocr with keras/tensorflow
-tensorflow
-keras
-matplotlib
+# for sending results to webhook
+requests