# data acquisition and web scraping
selenium
beautifulsoup4

# audio processing
openai-whisper

# traditional OCR
pillow
pytesseract
opencv-python
tqdm
streamlit
python-Levenshtein

# ocr with keras/tensorflow
tensorflow
keras
matplotlib