ScribeAI - AI-Powered Meeting Transcription

Real-time audio transcription and AI-powered meeting summaries using Next.js, Socket.io, and Google Gemini

Demo Video


Overview

ScribeAI transforms live audio into searchable, summarized transcripts in real-time. Capture meeting audio from your microphone or browser tabs (Google Meet, Zoom, YouTube) and receive instant AI-powered transcriptions with automatic summaries.

Built for: AttackCapital Technical Assignment
Timeline: 4 days
Status: ✅ Production-ready prototype


Features

Core Functionality

  • Real-time Transcription - Streams audio in 5-second chunks to the Gemini API while recording
  • Dual Input Sources
      • Microphone recording
      • Browser tab audio capture (Google Meet, Zoom, Spotify, YouTube)
  • Pause/Resume - Control recording flow with state preservation
  • AI-Powered Summaries - Automatic generation of key points, action items, and decisions
  • Session Management - Complete history with search, filter, and export capabilities
  • Live UI Updates - Real-time transcript display via Socket.io
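
The pause/resume behaviour listed above can be pictured as a small state machine. The following is only a minimal sketch under stated assumptions: the names RecordingState, RecordingEvent, and transition are hypothetical and are not taken from the ScribeAI source, which keeps its state in a Zustand store.

```typescript
// Hypothetical sketch: none of these names come from the ScribeAI codebase.
type RecordingStatus = "idle" | "recording" | "paused" | "stopped";

interface RecordingState {
  status: RecordingStatus;
  chunksSent: number; // audio chunks streamed so far
  transcript: string[]; // transcript segments received so far
}

type RecordingEvent =
  | { type: "start" }
  | { type: "pause" }
  | { type: "resume" }
  | { type: "stop" }
  | { type: "chunkSent" }
  | { type: "segment"; text: string };

const initialState: RecordingState = {
  status: "idle",
  chunksSent: 0,
  transcript: [],
};

// Pure transition function: invalid events leave the state unchanged,
// and pausing keeps the chunk count and transcript intact.
function transition(state: RecordingState, event: RecordingEvent): RecordingState {
  switch (event.type) {
    case "start":
      return state.status === "idle" ? { ...initialState, status: "recording" } : state;
    case "pause":
      return state.status === "recording" ? { ...state, status: "paused" } : state;
    case "resume":
      return state.status === "paused" ? { ...state, status: "recording" } : state;
    case "stop":
      return state.status === "recording" || state.status === "paused"
        ? { ...state, status: "stopped" }
        : state;
    case "chunkSent":
      // Chunks are only counted while actively recording.
      return state.status === "recording"
        ? { ...state, chunksSent: state.chunksSent + 1 }
        : state;
    case "segment":
      // A transcript segment may arrive shortly after a pause, so accept it unless idle.
      return state.status === "idle"
        ? state
        : { ...state, transcript: [...state.transcript, event.text] };
  }
}
```

Keeping the transition logic pure makes it easy to reuse inside a store or a hook, and to test without a browser or a socket connection.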

Tech Stack

Frontend

  • Next.js 16.0.1
  • Tailwind CSS + shadcn/ui
  • Zustand (State Management)
  • Socket.io Client

Backend

  • Node.js 20+ with TypeScript
  • Socket.io Server
  • Express
  • Prisma ORM

AI & APIs

  • Google Gemini 2.5 Flash (Transcription)
  • Gemini via Vercel AI SDK (Structured Summaries)
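
As an illustration of what a structured summary might look like, here is a minimal sketch that validates a JSON payload containing the three summary parts named under Features (key points, action items, decisions). The MeetingSummary interface, its field names, and parseSummary are assumptions made for this example; the project's actual schema and its Vercel AI SDK calls are not shown here.

```typescript
// Hypothetical shape: field names are assumptions, not the project's schema.
interface MeetingSummary {
  keyPoints: string[];
  actionItems: string[];
  decisions: string[];
}

// Type guard for arrays that contain only strings.
function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Parses a raw JSON string and returns a summary, or null when the
// payload is not valid JSON or does not match the expected shape.
function parseSummary(raw: string): MeetingSummary | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) {
    return null;
  }
  const { keyPoints, actionItems, decisions } = data as Record<string, unknown>;
  if (!isStringArray(keyPoints) || !isStringArray(actionItems) || !isStringArray(decisions)) {
    return null;
  }
  return { keyPoints, actionItems, decisions };
}
```

Validating the model output before saving it means a malformed response can be retried or surfaced as an error instead of being stored as a broken summary.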

Database

  • PostgreSQL

Authentication

  • Better Auth

DevOps

  • pnpm (Package Manager)
  • Turbopack (Next.js Dev Server)

Installation

Prerequisites

  • Node.js 20+
  • PostgreSQL 15+ (local or cloud)
  • pnpm installed (npm install -g pnpm)
  • Google Gemini API Key (Get one free)

Setup Instructions

  1. Clone Repository

git clone https://github.com/RitamPal26/ScribeAI.git
cd ScribeAI

  2. Install Dependencies

pnpm install

  3. Set Up Database

    a. Create Supabase Project:

    • Go to Supabase Dashboard
    • Click "New Project"
    • Choose a name and set a secure database password
    • Wait for project initialization (~2 minutes)

    b. Get Database URL:

    • Go to Project Settings → Database
    • Scroll to the Connection String section
    • Copy the Connection pooling URI (recommended for production)
    • Format: postgresql://postgres.[project-ref]:[password]@aws-0-[region].pooler.supabase.com:6543/postgres

  4. Configure Environment Variables

cp .env.example .env

Then fill in all the required variables in .env.

  5. Start Development Server

pnpm dev
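
A quick way to catch a missing or malformed environment value before starting the server is a small startup check. This is a minimal sketch under stated assumptions: the variable names DATABASE_URL and GEMINI_API_KEY are guesses for illustration (the authoritative list is in .env.example), and missingVars and describeDatabaseUrl are hypothetical helpers, not part of the repository.

```typescript
// Hypothetical variable names: check .env.example for the real list.
const REQUIRED_VARS = ["DATABASE_URL", "GEMINI_API_KEY"] as const;

// Returns the names of required variables that are unset or blank.
function missingVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Splits a PostgreSQL connection string into the parts worth logging.
// The password is deliberately left out of the result.
function describeDatabaseUrl(connectionString: string): {
  host: string;
  port: string;
  database: string;
} {
  const url = new URL(connectionString);
  return {
    host: url.hostname,
    port: url.port,
    database: url.pathname.replace(/^\//, ""),
  };
}
```

In a Node.js entry point this could be called as missingVars(process.env), exiting with a clear message if the returned list is not empty.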

Project Structure

ScribeAI/
├── src/
│   ├── app/                      # Next.js App Router
│   │   ├── (dashboard)/
│   │   │   └── dashboard/
│   │   │       ├── page.tsx      # Dashboard home
│   │   │       ├── record/       # Recording interface
│   │   │       └── sessions/     # Session history
│   │   ├── api/                  # API routes
│   │   ├── login/                # Auth pages
│   │   └── signup/
│   │
│   ├── components/
│   │   ├── auth/                 # Auth forms
│   │   ├── dashboard/            # Dashboard components
│   │   ├── recording/            # Recording UI
│   │   ├── sessions/             # Session cards
│   │   └── ui/                   # shadcn/ui components
│   │
│   ├── hooks/
│   │   ├── useRecording.ts       # Core recording logic
│   │   └── useSocket.ts          # Socket.io client hook
│   │
│   ├── lib/
│   │   ├── auth.ts               # Better Auth config
│   │   ├── prisma.ts             # Database client
│   │   └── socket-client.ts      # Socket.io setup
│   │
│   └── stores/
│       └── recordingStore.ts     # Zustand state
│
├── server/
│   ├── services/
│   │   ├── audioProcessor.ts     # Gemini integration
│   │   └── sessionService.ts     # Database operations
│   │
│   ├── socket/
│   │   └── recording.ts          # Socket.io handlers
│   │
│   └── index.ts                  # Server entry point
│
├── prisma/
│   └── schema.prisma             # Database schema
│
└── server.js                     # Socket.io server

Screenshots

Session Details

View complete transcript and download options

AI Summary

Automatic summary with key points and action items

Session History

Browse and manage all recorded sessions

Live Transcription

Real-time text appears as you speak

Dashboard Overview

Main dashboard showing session statistics and recent recordings

Landing Page

Simple landing page

Architecture & Design

For complete system architecture, data flow diagrams, and technical decisions:

👉 View Architecture Documentation


Demo Video

📹 Watch Full Walkthrough

Demonstration includes:

  • ✅ Microphone recording with live transcription
  • ✅ Tab audio capture from YouTube video
  • ✅ Pause/Resume functionality
  • ✅ Stop recording & AI summary generation
  • ✅ Session management & transcript export

Testing & Quality

Tested Scenarios

  • ✅ 5-minute continuous microphone recording
  • ✅ Tab audio from Google Meet call
  • ✅ Pause/Resume mid-recording
  • ✅ Network interruption recovery
  • ✅ 1-hour marathon session
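
One common way to survive a short network interruption while streaming chunks is to queue them locally and flush the queue on reconnect. The sketch below illustrates that general technique only; it is not a description of how ScribeAI handles recovery, and ChunkBuffer and its methods are hypothetical names.

```typescript
// Hypothetical helper: illustrates buffering during a disconnect,
// not the recovery logic used in the ScribeAI source.
class ChunkBuffer<T> {
  private pending: T[] = [];
  private connected = true;
  private readonly send: (chunk: T) => void;

  constructor(send: (chunk: T) => void) {
    this.send = send;
  }

  // Number of chunks waiting for the connection to come back.
  get queued(): number {
    return this.pending.length;
  }

  // Sends immediately when online, otherwise queues the chunk in order.
  push(chunk: T): void {
    if (this.connected) {
      this.send(chunk);
    } else {
      this.pending.push(chunk);
    }
  }

  // Call from the connection's connect/disconnect handlers.
  setConnected(connected: boolean): void {
    this.connected = connected;
    if (connected) {
      this.flush();
    }
  }

  // Sends queued chunks in their original order and clears the queue.
  private flush(): void {
    const toSend = this.pending;
    this.pending = [];
    for (const chunk of toSend) {
      this.send(chunk);
    }
  }
}
```

With a Socket.io client, the send callback would typically wrap an emit call, and setConnected would be driven by the socket's connect and disconnect events.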

Code Quality

  • TypeScript strict mode enabled
  • ESLint + Prettier formatting
  • JSDoc comments on key functions
  • Prisma type safety throughout

License

MIT License - See LICENSE file


Author

Ritam Pal

Built as part of AttackCapital technical assignment (November 2025)


Acknowledgments

  • Google Gemini API for powerful audio transcription
  • Vercel AI SDK for structured output generation
  • Better Auth for authentication
  • shadcn/ui for beautiful components
  • AttackCapital for the project idea
