@HexChap HexChap commented Aug 31, 2025

Lets the user choose their preferred languages for the current operation. This improves the OCR results and is probably faster.

coderabbitai bot commented Aug 31, 2025

Caution

Review failed

Failed to post review comments.

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9e816db and c48f3d6.

📒 Files selected for processing (1)
  • OCR4Linux.py (8 hunks)
📝 Walkthrough

Summary by CodeRabbit

  • New Features
    • Select OCR languages via a new CLI argument, including an “All” option.
    • Interactive multi-selection of languages in the shell workflow.
  • Improvements
    • Validates chosen languages against available Tesseract options with clear errors.
    • Enhanced error handling with non-zero exit codes on failures.
    • Timestamped logging with optional log persistence.
    • Preflight checks ensure required files and writable directories before running.
  • Documentation
    • Updated usage, descriptions, and examples to include the languages argument.

Walkthrough

Adds CLI-selectable OCR languages: Python now accepts a langs argument and validates it against pytesseract; shell script adds interactive language selection (rofi), logging, preflight file/dir checks, and passes chosen languages to the Python OCR step. Minor formatting and error-handling updates included.
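
As a rough illustration of the Python side of this walkthrough, the sketch below shows a CLI entry point that takes three arguments (image, output, languages) and turns failures into a non-zero exit code. The names `main` and `run_ocr` and the usage string are hypothetical and not taken from OCR4Linux.py; the OCR step is stubbed out so the example runs without Tesseract.

```python
import sys
from typing import Callable, Sequence


def run_ocr(image_path: str, output_path: str, langs: str) -> None:
    """Hypothetical stand-in for the real OCR step; it only validates its input."""
    if not langs:
        raise ValueError("no OCR languages given")


def main(argv: Sequence[str], ocr: Callable[[str, str, str], None] = run_ocr) -> int:
    # argv[0] is the script name, so three user arguments means len(argv) == 4.
    if len(argv) != 4:
        script = argv[0] if argv else "OCR4Linux.py"
        print(f"Usage: {script} <image> <output> <langs>", file=sys.stderr)
        return 1
    _, image_path, output_path, langs = argv
    try:
        ocr(image_path, output_path, langs)
    except Exception as exc:
        # Report the failure and signal it to the calling shell script.
        print(f"Error: {exc}", file=sys.stderr)
        return 1
    return 0
```

A script built this way would typically end with `sys.exit(main(sys.argv))`, so the shell wrapper can check the exit status before touching the clipboard.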

Changes

| Cohort / File(s) | Summary of changes |
| --- | --- |
| Python: OCR language handling & CLI (OCR4Linux.py) | Added langs parameter to TesseractConfig.__init__; new static process_langs(preferred_langs: str) -> str that queries pytesseract.get_languages(), supports all, intersects preferred with available, and exits on no match; extract_text_with_lines uses resolved self.langs; added try/except around processing to print errors and return exit code; updated CLI arg count/help to 4. |
| Shell: interactive selection, logging & checks (OCR4Linux.sh) | Added langs array and choose_lang() to list tesseract languages and multi-select via rofi (includes ALL); integrated selected langs into Python invocation (joined with +); added log_message() for timestamped logs and optional persistence; added check_if_files_exist() to ensure screenshot dir writable and presence of Python script; startup flow now invokes choose_lang() and uses languages in OCR flow. |
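
The language resolution summarized above (support for "all", intersection of preferred with available, failure on no match) could look roughly like the following. This is a minimal sketch, not the PR's process_langs: the function name `resolve_langs` is hypothetical, the available languages are passed in as a parameter instead of being read from pytesseract.get_languages(), the "all" keyword is assumed to be case-insensitive, and a ValueError is raised where the PR is described as exiting.

```python
from typing import Iterable


def resolve_langs(preferred: str, available: Iterable[str]) -> str:
    """Return a '+'-joined language string suitable for Tesseract's -l option.

    `available` would come from pytesseract.get_languages() in a real script;
    it is a parameter here so the sketch runs without Tesseract installed.
    """
    available_list = list(available)
    requested = [lang.strip() for lang in preferred.split("+") if lang.strip()]

    # Assumption: "all" in any letter case selects every installed language.
    if any(lang.lower() == "all" for lang in requested):
        resolved = available_list
    else:
        # Keep the user's order, drop duplicates, drop anything not installed.
        resolved = [lang for lang in dict.fromkeys(requested) if lang in available_list]

    if not resolved:
        raise ValueError(f"none of the requested languages are available: {preferred!r}")
    return "+".join(resolved)
```

Keeping the lookup of installed languages outside the function makes the intersection logic easy to test without an OCR engine present.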

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as User
  participant SH as OCR4Linux.sh
  participant TS as tesseract
  participant PY as OCR4Linux.py
  participant PT as pytesseract

  U->>SH: run script
  SH->>TS: tesseract --list-langs
  TS-->>SH: available langs
  SH->>U: rofi multi-select (includes ALL)
  U-->>SH: selected langs
  SH->>SH: check_if_files_exist()
  SH->>PY: python OCR4Linux.py <img> <out> <langs>
  PY->>PT: get_languages()
  PT-->>PY: recognized langs
  PY->>PY: process_langs(preferred)
  alt resolved langs empty/error
    PY-->>SH: exit 1 (error)
  else resolved langs valid
    PY->>TS: tesseract -l <resolved langs> ...
    TS-->>PY: OCR result
    PY-->>SH: output produced
    SH->>U: clipboard/output updated
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

A rabbit taps keys and winks with glee,
“Now Tesseract listens to every tongue and tree.
Rofi gathers languages, one, two, all—hop!
Python checks, validates, then won’t stop.
Logs and carrots gleam; OCR hops on top.” 🥕

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Ruff (0.12.2)
OCR4Linux.py

ruff failed
Cause: Failed to load configuration /ruff.toml
Cause: Failed to parse /ruff.toml
Cause: TOML parse error at line 26, column 3
|
26 | "RSE100", # Use of assert detected
| ^^^^^^^^
Unknown rule selector: RSE100

Copilot AI left a comment

Pull Request Overview

This PR adds language selection functionality to the OCR4Linux tool, allowing users to choose their preferred languages for OCR operations through a rofi multiselect dialog. This enhancement aims to improve OCR accuracy and performance by using targeted language models.

  • Adds a choose_lang() function that presents available Tesseract languages via rofi multiselect
  • Modifies the Python OCR script to accept and process language parameters
  • Updates argument handling to support the new language parameter

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| OCR4Linux.sh | Adds language selection UI and passes selected languages to Python script |
| OCR4Linux.py | Updates to accept language parameter and validates selected languages against available Tesseract languages |

@moheladwy
Owner

I tried the code in this PR and it didn't work at all.
I even tried to find the problem and fix it, but everything seemed to get messed up even more.
So I wrote the feature myself with the help of some of the code you wrote.
Thank you for the time and effort you put into this feature ❤️

@moheladwy moheladwy closed this Sep 5, 2025
@moheladwy
Owner

You are more than welcome to work on some of the open issues in the repo, btw.

Labels

enhancement (New feature or request), new feature

2 participants