@moheladwy (Owner) commented Sep 5, 2025

This pull request introduces interactive and flexible language selection for OCR processing in OCR4Linux, in both the Bash (OCR4Linux.sh) and Python (OCR4Linux.py) scripts. Thanks to @HexChap for the inspiration. Users can now specify languages directly via command-line options or pick them from an interactive menu powered by rofi. The documentation (README.md) is updated to reflect these new features and usage patterns.

Language Selection Enhancements

  • Added --lang option to OCR4Linux.sh for specifying OCR languages directly (e.g., --lang eng+fra or --lang all), bypassing the interactive menu. If not provided, a rofi-based menu allows users to interactively choose one or more languages. ([[1]](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-01d7fb14176bf75976a4ed0a47fed53e49c2ffa1dd5c2844afaee5ba98a25389R43-R46), [[2]](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-01d7fb14176bf75976a4ed0a47fed53e49c2ffa1dd5c2844afaee5ba98a25389R55-R94), [[3]](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-01d7fb14176bf75976a4ed0a47fed53e49c2ffa1dd5c2844afaee5ba98a25389R146-R189), [[4]](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-01d7fb14176bf75976a4ed0a47fed53e49c2ffa1dd5c2844afaee5ba98a25389R266-R273))
  • Implemented validation and processing for specified languages, including checks for language availability and support for multi-selection in the menu. ([OCR4Linux.shR146-R189](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-01d7fb14176bf75976a4ed0a47fed53e49c2ffa1dd5c2844afaee5ba98a25389R146-R189))
  • Updated the workflow so that language selection occurs before screenshot and OCR processing. ([OCR4Linux.shR266-R273](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-01d7fb14176bf75976a4ed0a47fed53e49c2ffa1dd5c2844afaee5ba98a25389R266-R273))
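
To illustrate the validation step described above, here is a minimal Python sketch of how a value such as `eng+fra` or `all` might be checked against the installed languages. The real check lives in OCR4Linux.sh and is not reproduced here; the function name `resolve_langs` and the error messages are hypothetical.

```python
def resolve_langs(spec: str, available: list[str]) -> str:
    """Turn a --lang value ("eng+fra" or "all") into a Tesseract language string.

    Hypothetical helper: the name and error messages are assumptions,
    not taken from OCR4Linux.sh.
    """
    spec = spec.strip()
    if not spec:
        raise ValueError("empty language specification")

    # "all" expands to every installed language, joined with "+".
    if spec == "all":
        if not available:
            raise ValueError("no OCR languages are installed")
        return "+".join(available)

    requested = [part.strip() for part in spec.split("+")]
    if any(not part for part in requested):
        raise ValueError(f"malformed language specification: {spec!r}")

    # Reject anything Tesseract does not have data for.
    missing = [lang for lang in requested if lang not in available]
    if missing:
        raise ValueError(f"unavailable languages: {', '.join(missing)}")

    # Drop duplicates while keeping the order the user typed.
    return "+".join(dict.fromkeys(requested))
```

Failing early on an unknown code keeps the error in front of the user before a screenshot is taken, which matches the reordered workflow in the last bullet.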

Python Script Improvements

  • Updated OCR4Linux.py to accept an optional --langs argument, supporting both --langs=eng+fra and --langs eng+fra forms, and defaulting to all available languages if not specified. ([[1]](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-8d235df76c33e3dcb73754335872fc339873557f11c8c557aa2669c242c6ebcfL63-R93), [[2]](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-8d235df76c33e3dcb73754335872fc339873557f11c8c557aa2669c242c6ebcfL202-R230), [[3]](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-8d235df76c33e3dcb73754335872fc339873557f11c8c557aa2669c242c6ebcfR287-R295))
  • Fixed typos in the TesseractConfig class, renaming ouput_encoding to output_encoding throughout. ([[1]](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-8d235df76c33e3dcb73754335872fc339873557f11c8c557aa2669c242c6ebcfL49-R49), [[2]](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-8d235df76c33e3dcb73754335872fc339873557f11c8c557aa2669c242c6ebcfL63-R93), [[3]](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-8d235df76c33e3dcb73754335872fc339873557f11c8c557aa2669c242c6ebcfL118-R127))
  • Updated help messages, argument parsing, and usage examples to reflect the new language selection options. ([[1]](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-8d235df76c33e3dcb73754335872fc339873557f11c8c557aa2669c242c6ebcfL152-R175), [[2]](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-8d235df76c33e3dcb73754335872fc339873557f11c8c557aa2669c242c6ebcfL255-R271))

Documentation Updates

  • Expanded the README.md to document the new interactive language selection menu, the --lang option, and updated usage examples for both scripts. Emphasized the requirement for rofi and detailed the workflow for language selection and OCR processing. ([[1]](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L21-R23), [[2]](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R33), [[3]](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R71-R72), [[4]](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L94-R137), [[5]](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L103-R211))

Dependency and Usability Improvements

  • Added rofi as a required dependency for the interactive menu and included checks to ensure it is installed before running the script. ([[1]](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-01d7fb14176bf75976a4ed0a47fed53e49c2ffa1dd5c2844afaee5ba98a25389R26), [[2]](https://github.com/moheladwy/OCR4Linux/pull/19/files#diff-01d7fb14176bf75976a4ed0a47fed53e49c2ffa1dd5c2844afaee5ba98a25389R117-R122))
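
A dependency check of this kind can be sketched in a few lines. The version below is Python rather than the Bash used in OCR4Linux.sh, and `missing_dependencies` is a hypothetical name; the `which` parameter exists only so the lookup can be replaced in tests.

```python
import shutil


def missing_dependencies(commands, which=shutil.which):
    """Return the commands that cannot be found on PATH.

    Hypothetical helper, not code from the PR.
    """
    return [cmd for cmd in commands if which(cmd) is None]
```

A caller would typically run `missing_dependencies(["rofi", "tesseract"])` at startup and exit with an error message if the returned list is not empty.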

These changes make multi-language OCR more flexible: languages can be chosen per run, either on the command line or from the rofi menu, and the full workflow is documented in the README.

…inux

- Implemented interactive language selection using rofi in the shell script.
- Added support for specifying languages via command-line arguments.
- Updated README.md to reflect new features and usage instructions.
@moheladwy requested a review from Copilot September 5, 2025 22:26
@moheladwy self-assigned this Sep 5, 2025
@moheladwy added the enhancement and new feature labels Sep 5, 2025
coderabbitai bot commented Sep 5, 2025

Caution

Review failed

Failed to post review comments.

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 021b230 and f24a245.

📒 Files selected for processing (3)
  • OCR4Linux.py (7 hunks)
  • OCR4Linux.sh (6 hunks)
  • README.md (6 hunks)
🧰 Additional context used
🪛 LanguageTool
README.md

[grammar] ~22-~22: There might be a mistake here.
Context: ...upport with custom language combinations - Automatic language detection fallback ...

(QB_NEW_EN)


[grammar] ~23-~23: There might be a mistake here.
Context: ... Automatic language detection fallback - Image preprocessing for better accuracy ...

(QB_NEW_EN)


[grammar] ~33-~33: There might be a mistake here.
Context: ... - Interactive language selection menu - Optional screenshot retention - Co...

(QB_NEW_EN)


[grammar] ~71-~71: Use correct spacing
Context: ... interactive language selection feature. ## Installation 1. Clone the repository: ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~98-~98: There might be a problem here.
Context: ...CR4Linux.sh ``` 2. The script will: - With --lang option: Use specified langua...

(QB_NEW_EN_MERGED_MATCH)


[grammar] ~99-~99: There might be a mistake here.
Context: ...on**: Use specified languages directly (bypasses rofi menu) - **Without --lang opt...

(QB_NEW_EN)


[grammar] ~99-~99: There might be a problem here.
Context: ... languages directly (bypasses rofi menu) - Without --lang option: Display an interacti...

(QB_NEW_EN_MERGED_MATCH)


[grammar] ~100-~100: Insert the missing word
Context: ...interactive language selection menu via rofi - Allow you to select one or multiple lang...

(QB_NEW_EN_OTHER_ERROR_IDS_32)


[grammar] ~104-~104: There might be a mistake here.
Context: ...es - Copy the extracted text to the clipboard### Language Selection You have two opt...

(QB_NEW_EN_OTHER)


[grammar] ~104-~104: Use correct spacing
Context: ...t to the clipboard### Language Selection You have two options for language select...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~106-~106: Use correct spacing
Context: ...have two options for language selection: #### Option 1: Command Line (Direct) Specify...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~108-~108: Use correct spacing
Context: ...n: #### Option 1: Command Line (Direct) Specify languages directly using the `--...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~110-~110: Use correct spacing
Context: ...ages directly using the --lang option: - --lang all - Use all available languages - `--la...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~112-~112: Use hyphens correctly
Context: ...-lang all- Use all available languages - --lang eng- Use English only - --lang eng+ara+f...

(QB_NEW_EN_OTHER_ERROR_IDS_29)


[grammar] ~114-~114: There might be a mistake here.
Context: ...a+fra` - Use multiple specific languages #### Option 2: Interactive Menu (Rofi) When ...

(QB_NEW_EN_OTHER)


[grammar] ~116-~116: Use correct spacing
Context: ... #### Option 2: Interactive Menu (Rofi) When you run the script without --lang...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~118-~118: Use correct spacing
Context: ...--lang, a rofi` menu will appear with: - ALL: Select all available languages - ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~120-~120: There might be a mistake here.
Context: ... ALL: Select all available languages - Individual languages: Choose specific ...

(QB_NEW_EN_OTHER)


[grammar] ~122-~122: There might be a mistake here.
Context: ...` and click to select multiple languages The selected languages will be used by T...

(QB_NEW_EN_OTHER)


[grammar] ~124-~124: Use correct spacing
Context: ...recognition in multi-language documents. ## Workflow The complete OCR4Linux workflo...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~126-~126: Use correct spacing
Context: ...n multi-language documents. ## Workflow The complete OCR4Linux workflow: 1. **L...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~128-~128: Use correct spacing
Context: ...rkflow The complete OCR4Linux workflow: 1. Language Selection: - Command-line...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~130-~130: There might be a mistake here.
Context: ...ux workflow: 1. Language Selection: - Command-line specified languages (with `...

(QB_NEW_EN)


[grammar] ~131-~131: There might be a mistake here.
Context: ...e specified languages (with --lang) OR - Interactive rofi menu displays available...

(QB_NEW_EN)


[grammar] ~132-~132: There might be a mistake here.
Context: ...s available languages (without --lang) 2. Language Processing: Selected language...

(QB_NEW_EN)


[grammar] ~133-~133: There might be a mistake here.
Context: ...ed languages are validated and formatted 3. Screenshot Capture: Area selection and...

(QB_NEW_EN_OTHER)


[grammar] ~134-~134: There might be a mistake here.
Context: ...ture**: Area selection and image capture 4. OCR Processing: Text extraction using ...

(QB_NEW_EN)


[grammar] ~135-~135: There might be a mistake here.
Context: ...Text extraction using selected languages 5. Clipboard Integration: Extracted text ...

(QB_NEW_EN)


[grammar] ~136-~136: There might be a mistake here.
Context: ...xtracted text copied to system clipboard 6. Cleanup: Optional screenshot removal a...

(QB_NEW_EN)


[grammar] ~137-~137: Use correct spacing
Context: ... Optional screenshot removal and logging ### Command Line Arguments --- #### OCR4Li...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~139-~139: Use correct spacing
Context: ... and logging ### Command Line Arguments --- #### OCR4Linux.sh | Option | Des...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~143-~143: Use correct spacing
Context: ...d Line Arguments --- #### OCR4Linux.sh | Option | Description ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~145-~145: There might be a mistake here.
Context: ... | Default | | ------------------ | -----------------...

(QB_NEW_EN)


[grammar] ~146-~146: There might be a mistake here.
Context: ...------- | ---------------------------- | | -r | Remove screenshot...

(QB_NEW_EN)


[grammar] ~147-~147: There might be a mistake here.
Context: ...sing | false | | -d DIR | Set screenshot di...

(QB_NEW_EN)


[grammar] ~148-~148: There might be a mistake here.
Context: ... | $HOME/Pictures/screenshots | | -l | Keep logs ...

(QB_NEW_EN)


[grammar] ~149-~149: There might be a mistake here.
Context: ... | false | | --lang LANGUAGES | Specify OCR langu...

(QB_NEW_EN)


[grammar] ~150-~150: There might be a mistake here.
Context: ...s rofi) | Interactive selection | | -h | Show help message...

(QB_NEW_EN)


[grammar] ~151-~151: Use correct spacing
Context: ... | - | Language Format for --lang: - Use...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~153-~153: Use correct spacing
Context: ... | Language Format for --lang: - Use all for all available languages - ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~155-~155: There might be a mistake here.
Context: ... Use all for all available languages - Use + to separate multiple languages (...

(QB_NEW_EN_OTHER)


[grammar] ~157-~157: Use correct spacing
Context: ...gle languages: eng, ara, fra, etc. #### OCR4Linux.py | Option | ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~159-~159: Use correct spacing
Context: ..., ara, fra`, etc. #### OCR4Linux.py | Option | Description ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~161-~161: There might be a mistake here.
Context: ...escription | Required | | --------------------- | --------------...

(QB_NEW_EN)


[grammar] ~162-~162: There might be a mistake here.
Context: ...--------------------------- | -------- | | image_path | Path to input ...

(QB_NEW_EN)


[grammar] ~163-~163: There might be a mistake here.
Context: ...ath to input image | Yes | | output_path | Path to save e...

(QB_NEW_EN)


[grammar] ~164-~164: There might be a mistake here.
Context: ...ath to save extracted text | Yes | | --langs <languages> | Specify langua...

(QB_NEW_EN)


[grammar] ~165-~165: There might be a mistake here.
Context: ...pecify languages for OCR | No | | -l, --list-langs | List available...

(QB_NEW_EN)


[grammar] ~166-~166: There might be a mistake here.
Context: ...ist available OCR languages | No | | -h, --help | Show help mess...

(QB_NEW_EN)


[grammar] ~167-~167: Use correct spacing
Context: ...how help message | No | Language Format: Use + to separate m...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~169-~169: There might be a mistake here.
Context: ...multiple languages (e.g., eng+ara+fra) ### Examples --- #### Using OCR4Linux.sh ...

(QB_NEW_EN_OTHER)


[grammar] ~200-~200: Use correct spacing
Context: ...Linux.sh -h #### Using OCR4Linux.py sh # Basic usage (uses all available languages) python OCR4Linux.py input.png output.txt # Specify single language python OCR4Linux.py input.png output.txt --langs eng # Specify multiple languages python OCR4Linux.py input.png output.txt --langs eng+ara+fra # List available languages python OCR4Linux.py --list-langs # Show help python OCR4Linux.py --help ``` ## Tips - Language Selection Options...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~221-~221: Use correct spacing
Context: ...ips - Language Selection Options: - Command Line: Use --lang for automat...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~223-~223: There might be a problem here.
Context: ...se --lang for automated/scripted usage - --lang all for maximum compatibility - `-...

(QB_NEW_EN_MERGED_MATCH)


[grammar] ~225-~225: There might be a mistake here.
Context: ... --lang all for maximum compatibility - --lang eng for English-only documents - `...

(QB_NEW_EN_OTHER)


[grammar] ~227-~227: Use correct spacing
Context: ...--lang eng+ara for bilingual documents - Interactive Menu: Run without --lang...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~229-~229: There might be a mistake here.
Context: ...un without --lang for manual selection - Select "ALL" to use all available langua...

(QB_NEW_EN_OTHER)


[grammar] ~233-~233: There might be a mistake here.
Context: ... Press Escape to cancel the operation - Performance Optimization: - Use...

(QB_NEW_EN_OTHER)


[grammar] ~235-~235: Use correct spacing
Context: ...ation - Performance Optimization: - Use fewer specific languages for faster ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~237-~237: There might be a mistake here.
Context: ...specific languages for faster processing - Use --lang all only when document lang...

(QB_NEW_EN_OTHER)


[grammar] ~238-~238: There might be a mistake here.
Context: ...rocessing - Use --lang all only when document language is unknown - Co...

(QB_NEW_EN)


[grammar] ~238-~238: There might be a mistake here.
Context: ...` only when document language is unknown - Command-line specification is faster tha...

(QB_NEW_EN_OTHER)


[grammar] ~239-~239: There might be a mistake here.
Context: ...ion is faster than interactive selection - Keyboard Shortcuts: You can create a k...

(QB_NEW_EN_OTHER)


[grammar] ~241-~241: Use correct spacing
Context: ...rtcut to run the script for easy access. ### Example for Hyprland users: - p...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~243-~243: Use correct spacing
Context: .... ### Example for Hyprland users: - put the following lines in your `hyprlan...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~245-~245: Use correct spacing
Context: ...wing lines in your hyprland.conf file: conf $OCR4Linux = ~/.config/OCR4Linux/OCR4Linux.sh $OCR4Linux_ENG = ~/.config/OCR4Linux/OCR4Linux.sh --lang eng bind = $mainMod SHIFT, E, exec, $OCR4Linux # OCR4Linux with interactive selection bind = $mainMod SHIFT, T, exec, $OCR4Linux_ENG # OCR4Linux with English only ### Example for dwm users: - put th...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~255-~255: Use correct spacing
Context: ... ``` ### Example for dwm users: - put the following lines in your `config....

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~257-~257: Use correct spacing
Context: ...following lines in your config.h file: c static const char *ocr4linux[] = { "sh", "-c", "~/.config/OCR4Linux/OCR4Linux.sh", NULL }; static const char *ocr4linux_eng[] = { "sh", "-c", "~/.config/OCR4Linux/OCR4Linux.sh --lang eng", NULL }; { MODKEY | ShiftMask, XK_e, spawn, {.v = ocr4linux } }, // OCR4Linux interactive { MODKEY | ShiftMask, XK_t, spawn, {.v = ocr4linux_eng } }, // OCR4Linux English only - Language Optimization: For best result...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~267-~267: There might be a mistake here.
Context: ...nguage Optimization**: For best results: - Select only the languages present in you...

(QB_NEW_EN)


[grammar] ~268-~268: There might be a mistake here.
Context: ...y the languages present in your document - Use fewer languages for better performan...

(QB_NEW_EN_OTHER)


[grammar] ~270-~270: There might be a mistake here.
Context: ...ional Tesseract language packs as needed ## Files - [OCR4Linux.py](https://github...

(QB_NEW_EN_OTHER)


[grammar] ~272-~272: Use correct spacing
Context: ...eract language packs as needed ## Files - [OCR4Linux.py](https://github.com/mohelad...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~275-~275: There might be a mistake here.
Context: ...ed text, and copies it to the clipboard. - [setup.sh](https://github.com/moheladwy/O...

(QB_NEW_EN)

🪛 markdownlint-cli2 (0.17.2)
README.md

21-21: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


22-22: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


23-23: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


33-33: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


108-108: Heading levels should only increment by one level at a time
Expected: h3; Actual: h4

(MD001, heading-increment)


223-223: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


225-225: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


226-226: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


227-227: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


229-229: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


230-230: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


231-231: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


232-232: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


233-233: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


237-237: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


238-238: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


239-239: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


243-243: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


245-245: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


255-255: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


257-257: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


268-268: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


269-269: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


270-270: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Added multi-language OCR selection via new CLI flags: --lang (shell) and --langs (Python).
    • Introduced an interactive rofi-based menu for choosing one or more languages when no flag is provided.
    • Supports selecting all languages or a + separated list (e.g., eng+ara).
    • Improved argument parsing, validation, and user-facing error messages; default language selection applied when none provided.
  • Documentation

    • Updated README with dual-mode language selection (CLI and interactive), usage examples, language format rules, and revised help/usage text.

Walkthrough

Adds configurable OCR language selection across the Python and shell scripts: new CLI flags (--langs/--lang), optional interactive selection via rofi, updated argument parsing and help, propagation of language settings to Tesseract, a small encoding attribute rename, and README updates documenting both direct and interactive flows.
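
For readers unfamiliar with how a language string reaches Tesseract, the sketch below shows one way to pass it through the `tesseract` command line, where `-l` takes a `+`-joined language list and `stdout` as the output base prints the text instead of writing a file. Whether OCR4Linux.py shells out like this or goes through a Python binding is not stated in this conversation, so treat `tesseract_command` and `ocr_image` as hypothetical.

```python
import subprocess


def tesseract_command(image_path: str, langs: str) -> list[str]:
    # "stdout" as the output base makes tesseract print the recognized text.
    return ["tesseract", image_path, "stdout", "-l", langs]


def ocr_image(image_path: str, langs: str, run=subprocess.run) -> str:
    """Run Tesseract on one image and return the extracted text.

    Hypothetical helper; `run` is injectable so the call can be faked in tests.
    """
    result = run(
        tesseract_command(image_path, langs),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return result.stdout
```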

Changes

| Cohort / File(s) | Summary of Changes |
| --- | --- |
| Language selection capability: OCR4Linux.py, OCR4Linux.sh | Introduces language selection: Python accepts optional langs in TesseractConfig and CLI (--langs); shell adds --lang, interactive rofi menu, parsing/validation, and passes languages to Python. |
| CLI parsing, flow, and diagnostics: OCR4Linux.sh | Reworks arg parsing (while/case), adds logging, dependency checks, language processing functions, and branches flow based on specified vs. interactive selection. |
| Refactors and naming: OCR4Linux.py | Renames attribute ouput_encoding to output_encoding and updates usages; adds stderr diagnostics for chosen languages. |
| Documentation updates: README.md | Documents dual-mode language selection (CLI and rofi), new flags, format rules (+-delimited), updated workflows, examples, requirements, and tips. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor U as User
  participant SH as OCR4Linux.sh
  participant R as rofi (interactive)
  participant PY as OCR4Linux.py
  participant TES as Tesseract
  participant CB as Clipboard

  U->>SH: Run script [-r/-d/-l/--lang LANGS]
  alt --lang provided
    SH->>SH: process_specified_langs()
  else interactive
    SH->>R: choose_lang() prompt
    R-->>SH: Selected languages
  end
  SH->>SH: Take screenshot to file
  SH->>PY: Invoke with --langs (if set) and paths
  PY->>PY: Configure TesseractConfig(langs)
  PY->>TES: OCR image with selected langs
  TES-->>PY: Extracted text
  PY-->>SH: Output text (UTF-8)
  SH->>CB: Copy to clipboard
  SH-->>U: Done (logs emitted)
  opt Error paths
    SH->>U: Log error (missing rofi, invalid langs, etc.)
  end
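
The interactive branch of the diagram above, where choose_lang() prompts through rofi, could look roughly like the following. This is a Python sketch of the idea, not the shell function from the PR: `choose_langs_with_rofi` and the prompt text are hypothetical, and the `ALL` entry follows the README excerpt quoted earlier in this review. The rofi flags used (`-dmenu`, `-multi-select`, `-p`) are standard rofi options.

```python
import subprocess


def choose_langs_with_rofi(available, run=subprocess.run):
    """Show a multi-select rofi menu; return "eng+fra"-style text or None.

    Hypothetical helper; `run` is injectable so rofi can be faked in tests.
    """
    menu = "\n".join(["ALL", *available])
    result = run(
        ["rofi", "-dmenu", "-multi-select", "-p", "OCR languages"],
        input=menu,
        capture_output=True,
        text=True,
    )
    # rofi exits with a non-zero status when the user cancels (e.g. Escape).
    if result.returncode != 0:
        return None

    picked = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    if not picked:
        return None
    # Choosing ALL overrides any individual picks.
    if "ALL" in picked:
        return "+".join(available)
    return "+".join(picked)
```

Returning `None` on cancel lets the caller stop before the screenshot step instead of running OCR with an empty language list.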
sequenceDiagram
  autonumber
  participant CLI as CLI (args)
  participant PY as OCR4Linux.py
  participant CFG as TesseractConfig
  participant TES as Tesseract

  CLI->>PY: Parse args [image, output, --langs?]
  PY->>CFG: __init__(image_path, output_path, langs)
  note right of CFG: If langs empty: use all available or 'eng'
  CFG->>TES: Run OCR with lang string
  TES-->>PY: Text
  PY-->>CLI: Write output (output_encoding UTF-8)
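
The note in the diagram above ("If langs empty: use all available or 'eng'") describes a two-step fallback. A minimal sketch of that rule follows, together with a parser for the output of `tesseract --list-langs`, which prints a header line followed by one language code per line. Both function names are hypothetical, and the actual TesseractConfig may discover languages differently.

```python
def parse_list_langs(output: str) -> list[str]:
    """Extract language codes from `tesseract --list-langs` output.

    Hypothetical helper: skips the "List of available languages ..." header
    and blank lines, and keeps everything else as a language code.
    """
    langs = []
    for line in output.splitlines():
        line = line.strip()
        if not line or line.startswith("List of"):
            continue
        langs.append(line)
    return langs


def default_langs(langs, available):
    """Apply the fallback from the diagram note.

    An explicit value wins; otherwise use every available language;
    if none can be found, fall back to 'eng'.
    """
    if langs:
        return langs
    if available:
        return "+".join(available)
    return "eng"
```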

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

A hop, a skip, I pick a tongue,
rofi sings a menu song.
Plus signs link the langs along,
Tesseract hums steady-strong.
Clipboard brims with letters sprung—
carrots up, the OCR’s done! 🥕✨

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Ruff (0.12.2)
OCR4Linux.py

ruff failed
Cause: Failed to load configuration /ruff.toml
Cause: Failed to parse /ruff.toml
Cause: TOML parse error at line 26, column 3
|
26 | "RSE100", # Use of assert detected
| ^^^^^^^^
Unknown rule selector: RSE100

Copilot AI left a comment

Pull Request Overview

This pull request adds interactive language selection capabilities to OCR4Linux through a rofi-based menu system and direct command-line language specification. Users can now either specify languages directly via the --lang option or select them interactively from a menu, improving flexibility for multi-language OCR processing.

  • Interactive language selection menu using rofi with multi-select support
  • Direct command-line language specification via --lang option in the shell script and --langs in the Python script
  • Fixed typos in the Python TesseractConfig class and updated documentation

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| README.md | Comprehensive documentation updates covering new language selection features, usage examples, and workflow explanations |
| OCR4Linux.sh | Added interactive rofi-based language selection, command-line language specification, and updated argument parsing |
| OCR4Linux.py | Added language parameter support, fixed typos in TesseractConfig class, and updated argument handling |

@moheladwy merged commit dc93857 into main Sep 5, 2025
5 of 6 checks passed
@moheladwy deleted the add-multi-lang-selection branch January 14, 2026 14:48