A modern web application that converts handwritten and printed text from images into digital text using the Tesseract.js OCR engine.
- 📝 Convert handwritten and printed text from images to digital text
- 🖼️ Support for multiple image formats (PNG, JPEG, GIF)
- ⚡ Real-time processing with live preview
- 🔧 Advanced image preprocessing options
- 🌍 Multiple language support
- 📱 Responsive design for all devices
- 🎯 Drag and drop file upload
- React 18
- TypeScript
- Tesseract.js
- Tailwind CSS
- Vite
- Lucide React Icons
- Node.js 16.x or higher
- npm or yarn
- Clone the repository:
  git clone https://github.com/codegallery-me/Handwritten-Text-Recognition.git
  cd Handwritten-Text-Recognition
- Install dependencies:
  npm install
- Start the development server:
  npm run dev
- Build for production:
  npm run build
- Open the application in your browser
- Upload an image by either:
- Dragging and dropping an image file
- Clicking "Browse Files" to select an image
- Wait for the OCR processing to complete
- View the extracted text in the results panel
Language options:
- eng: Standard English recognition
- eng_best: High-accuracy English recognition (slower)
- osd: Auto-detect orientation and script
Preprocessing options:
- none: No preprocessing
- bw: Black & White mode (best for handwriting)
- sharpen: Sharpened mode (best for printed text)
interface Settings {
  language: 'eng' | 'eng_best' | 'osd';
  preprocessing: 'none' | 'bw' | 'sharpen';
}
interface AppProps {}
interface AppState {
  image: string | null;
  text: string;
  loading: boolean;
  error: string | null;
  settings: Settings;
}
Processes the uploaded file and initiates OCR.
Parameters:
file: The image file to process
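Before OCR starts, the uploaded file can be checked against the supported formats and the recommended size limit. The sketch below is one possible way to do that; `validateImageFile`, `FileLike`, and the two constants are hypothetical names, and treating the 10 MB recommendation as a hard limit is an assumption rather than the app's confirmed behavior.

```typescript
// Hypothetical helper, not taken from the project source.
// MIME types for the formats listed under supported image formats.
const ACCEPTED_TYPES: string[] = ["image/png", "image/jpeg", "image/gif"];

// Assumption: the recommended 10 MB maximum is enforced as a hard limit.
const MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024;

// Minimal shape shared with the browser's File object,
// so the check can also run outside a browser.
interface FileLike {
  type: string;
  size: number;
}

// Returns null when the file looks acceptable, otherwise an error message.
function validateImageFile(file: FileLike): string | null {
  if (ACCEPTED_TYPES.indexOf(file.type) === -1) {
    return `Unsupported file type: ${file.type || "unknown"}`;
  }
  if (file.size > MAX_FILE_SIZE_BYTES) {
    return "File is larger than the recommended 10 MB";
  }
  return null;
}
```

A browser `File` satisfies `FileLike`, so the result of `validateImageFile(file)` could be stored in the `error` field of `AppState` when it is not null.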
Applies preprocessing filters to the image before OCR.
Parameters:
imageData: Base64 encoded image string
Returns:
- Promise resolving to processed image data URL
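The project's actual filters are not shown here, so the following is only a sketch of what the two preprocessing modes might do at the pixel level. It assumes a fixed luminance threshold for "bw" and a common 3x3 sharpening kernel for "sharpen"; the function names are hypothetical. It works on raw RGBA pixels (the layout a canvas `getImageData` call returns) and leaves out the conversion to and from a data URL.

```typescript
type Preprocessing = "none" | "bw" | "sharpen";

// Assumed "bw" behavior: map each pixel to pure black or white by luminance.
function toBlackAndWhite(pixels: Uint8ClampedArray, threshold = 128): Uint8ClampedArray {
  const out = new Uint8ClampedArray(pixels.length);
  for (let i = 0; i < pixels.length; i += 4) {
    const luminance = 0.299 * pixels[i] + 0.587 * pixels[i + 1] + 0.114 * pixels[i + 2];
    const value = luminance >= threshold ? 255 : 0;
    out[i] = out[i + 1] = out[i + 2] = value;
    out[i + 3] = pixels[i + 3]; // keep the original alpha
  }
  return out;
}

// Assumed "sharpen" behavior: convolve with a 3x3 sharpening kernel.
// Border pixels are copied unchanged to keep the sketch short.
function sharpenPixels(pixels: Uint8ClampedArray, width: number, height: number): Uint8ClampedArray {
  const kernel = [0, -1, 0, -1, 5, -1, 0, -1, 0];
  const out = new Uint8ClampedArray(pixels); // copy of the input
  for (let y = 1; y < height - 1; y++) {
    for (let x = 1; x < width - 1; x++) {
      for (let c = 0; c < 3; c++) {
        let sum = 0;
        for (let ky = -1; ky <= 1; ky++) {
          for (let kx = -1; kx <= 1; kx++) {
            const source = ((y + ky) * width + (x + kx)) * 4 + c;
            sum += pixels[source] * kernel[(ky + 1) * 3 + (kx + 1)];
          }
        }
        out[(y * width + x) * 4 + c] = sum; // Uint8ClampedArray clamps to 0..255
      }
    }
  }
  return out;
}

// Dispatches on the preprocessing setting; "none" returns an unmodified copy.
function applyPreprocessing(
  pixels: Uint8ClampedArray,
  width: number,
  height: number,
  mode: Preprocessing
): Uint8ClampedArray {
  switch (mode) {
    case "bw":
      return toBlackAndWhite(pixels);
    case "sharpen":
      return sharpenPixels(pixels, width, height);
    default:
      return new Uint8ClampedArray(pixels);
  }
}
```

In a browser, the pixel array would typically come from drawing the image onto a canvas, and the processed pixels would be written back before exporting a data URL.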
Performs OCR on the image using Tesseract.js.
Parameters:
imageData: Base64 encoded image string
{
  tessedit_char_whitelist: 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789.,!?-_\'"\n ',
  tessedit_pageseg_mode: '6',
  tessjs_create_pdf: '0',
  tessjs_create_hocr: '0',
  tessjs_create_tsv: '0'
}
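To make the effect of these parameters easier to see, here is a small annotated sketch of the same configuration. `buildOcrParameters` and `unsupportedCharacters` are hypothetical helpers, not part of the app or of Tesseract.js; the comments reflect the usual meaning of each parameter (page segmentation mode 6 treats the image as a single uniform block of text).

```typescript
// Same character set as the whitelist above, split up for readability.
const CHAR_WHITELIST =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZ" +
  "abcdefghijklmnopqrstuvwxyz" +
  "0123456789" +
  ".,!?-_'\"\n ";

// Hypothetical helper that returns the parameter object shown above.
function buildOcrParameters(): Record<string, string> {
  return {
    tessedit_char_whitelist: CHAR_WHITELIST, // restrict recognition to these characters
    tessedit_pageseg_mode: "6", // assume a single uniform block of text
    tessjs_create_pdf: "0", // no PDF output
    tessjs_create_hocr: "0", // no hOCR output
    tessjs_create_tsv: "0", // no TSV output
  };
}

// Lists the distinct characters in `text` that fall outside the whitelist,
// i.e. characters that should not be expected in the OCR output.
function unsupportedCharacters(text: string): string[] {
  const missing: string[] = [];
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (CHAR_WHITELIST.indexOf(ch) === -1 && missing.indexOf(ch) === -1) {
      missing.push(ch);
    }
  }
  return missing;
}
```

For example, `unsupportedCharacters("a@b(c)")` returns `["@", "(", ")"]`, which suggests that text containing e-mail addresses or parentheses may need a wider whitelist.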
- Image Quality
  - Use clear, well-lit images
  - Ensure good contrast between text and background
  - Avoid blurry or distorted images
- Preprocessing Selection
  - For handwritten text: Use "Black & White" mode
  - For printed text: Use "Sharpen" mode
  - For unclear results: Try different preprocessing options
- Language Selection
  - Use "English (Best)" for highest accuracy
  - Use "Auto Detect" for unknown text orientation
  - Use standard "English" for faster processing
- Image size: Larger images take longer to process
- Language mode: "English (Best)" is more accurate but slower
- Browser resources: Processing occurs client-side
- Maximum file size: 10MB recommended
The application handles various error cases:
- Invalid file types
- Processing failures
- Network issues
- Browser compatibility
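One way these error cases could surface in the UI is through the `loading` and `error` fields of `AppState`. The reducer below is a minimal sketch of that idea, not the app's confirmed implementation; `OcrEvent`, `applyOcrEvent`, `toMessage`, and the fallback message are assumptions made for illustration.

```typescript
interface Settings {
  language: "eng" | "eng_best" | "osd";
  preprocessing: "none" | "bw" | "sharpen";
}

interface AppState {
  image: string | null;
  text: string;
  loading: boolean;
  error: string | null;
  settings: Settings;
}

// Hypothetical events describing the lifecycle of one OCR run.
type OcrEvent =
  | { type: "started"; image: string }
  | { type: "succeeded"; text: string }
  | { type: "failed"; reason: unknown };

// Turns any thrown value into a message suitable for display.
function toMessage(reason: unknown): string {
  return reason instanceof Error ? reason.message : "Failed to process image";
}

// Pure state transition: returns a new state and never mutates the input.
function applyOcrEvent(state: AppState, event: OcrEvent): AppState {
  switch (event.type) {
    case "started":
      // A new run clears any previous text and error.
      return { ...state, image: event.image, text: "", loading: true, error: null };
    case "succeeded":
      return { ...state, text: event.text, loading: false };
    case "failed":
      return { ...state, loading: false, error: toMessage(event.reason) };
  }
}
```

Keeping the transition pure makes each error path easy to test without running OCR: dispatch `failed` with an `Error` and check that `loading` is false and `error` holds its message.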
- Chrome (latest)
- Firefox (latest)
- Safari (latest)
- Edge (latest)
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Tesseract.js for OCR functionality
- Tailwind CSS for styling
- Lucide React for icons