Description
ScribeAR NodeServer currently supports only a single audio stream and a single continuous transcription stream. To support our goal of deploying ScribeAR into many classrooms while making efficient use of compute resources, ScribeAR NodeServer needs to support multi-tenancy: a single server instance must be able to serve many independent transcription streams concurrently. Furthermore, transcriptions should be divided into sessions that can be scheduled and started automatically, so that each course has its own independent transcription stream.
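As a rough illustration of the session model described above, the sketch below keeps each scheduled session's listeners in a separate entry so that text published to one session never reaches another. All names here (`Session`, `SessionRegistry`, `schedule`, `subscribe`, `publish`) are hypothetical and not part of the existing ScribeAR codebase; the in-memory `Map` and the caller-supplied clock value are simplifying assumptions.

```typescript
// Hypothetical sketch: none of these names exist in ScribeAR NodeServer today.
type Listener = (text: string) => void;

interface Session {
  id: string;
  startsAt: number; // scheduled start, epoch milliseconds
  endsAt: number; // scheduled end, epoch milliseconds
  listeners: Set<Listener>;
}

class SessionRegistry {
  private sessions = new Map<string, Session>();

  // Register a session that becomes active automatically at startsAt.
  schedule(id: string, startsAt: number, endsAt: number): void {
    if (this.sessions.has(id)) throw new Error(`session ${id} already exists`);
    if (endsAt <= startsAt) throw new Error("session must end after it starts");
    this.sessions.set(id, { id, startsAt, endsAt, listeners: new Set() });
  }

  // A session is active only inside its scheduled window [startsAt, endsAt).
  isActive(id: string, now: number): boolean {
    const session = this.sessions.get(id);
    return session !== undefined && now >= session.startsAt && now < session.endsAt;
  }

  // Attach a viewer to exactly one session; returns an unsubscribe function.
  subscribe(id: string, listener: Listener): () => void {
    const session = this.sessions.get(id);
    if (session === undefined) throw new Error(`unknown session ${id}`);
    session.listeners.add(listener);
    return () => {
      session.listeners.delete(listener);
    };
  }

  // Deliver text only to listeners of the named session, and only while it is active.
  publish(id: string, text: string, now: number): boolean {
    const session = this.sessions.get(id);
    if (session === undefined || !this.isActive(id, now)) return false;
    session.listeners.forEach((listener) => listener(text));
    return true;
  }
}
```

Passing `now` explicitly keeps the scheduling logic easy to test. A real deployment would likely read the clock itself and back the registry with shared storage rather than process memory.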
Below is a formal list of features we are targeting. Future enhancements should be kept in mind when designing the system.
In-Scope Features
- Core Functionality
  - Provide accurate, low-latency live transcription from one or more microphone streams
- Multi-Tenancy
  - Transcription streams are divided into independent sessions
  - Transcriptions from one session cannot be viewed from a different session
  - Support multiple concurrent sessions
  - Sessions can be started on a schedule
- Authentication
  - Authentication is required to view a transcription session
  - Users can scan a rotating QR code or enter a join code to authenticate to a session
- User Experience
  - Users can view live transcriptions on a kiosk device
  - Users can view live transcriptions on their personal devices
  - Users can join sessions via a publicly accessible landing page
  - System should require minimal or no user interaction to begin or end sessions
  - System should provide actionable error prompts when things go wrong
- Scalability & Performance
  - Transcription services should scale to many concurrent sessions
  - Services should make efficient use of GPU resources
- Reliability
  - System should tolerate temporary network interruptions or service failures and automatically resume the transcription stream when possible
- Monitoring & Analytics
  - System should log health and performance metrics for troubleshooting and debugging
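One common way to meet the reliability goal listed above (automatic resumption after a temporary interruption) is to retry the connection with capped exponential backoff. The sketch below is only an assumption about how that could look: `backoffDelayMs`, `retryWithBackoff`, and the default values of 500 ms and 30 s are hypothetical, and jitter is omitted for brevity.

```typescript
// Hypothetical helpers; names and default values are illustrative assumptions.

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  if (!Number.isInteger(attempt) || attempt < 0) {
    throw new RangeError("attempt must be a non-negative integer");
  }
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Call `connect` until it succeeds or maxAttempts is exhausted,
// sleeping for the backoff delay between failed attempts.
async function retryWithBackoff<T>(
  connect: () => Promise<T>,
  maxAttempts: number,
  sleep: (ms: number) => Promise<void>,
): Promise<T> {
  if (maxAttempts < 1) throw new RangeError("maxAttempts must be at least 1");
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      // No need to wait after the final failed attempt.
      if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}
```

Injecting `sleep` keeps the retry loop independent of any particular timer, which makes it straightforward to exercise in tests without real delays.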
Out-of-Scope (Future Enhancements)
- Transcriptions
  - Speaker diarization support
  - Multi-language support
  - Custom vocabulary support
  - Accurate transcription timestamps
- Transcription History
  - Transcripts should be saved and made viewable and downloadable by authorized users
- Authentication
  - Users can authenticate using an external identity provider (e.g. NetID)
- Authorization
  - Fine-grained access controls for features (downloads, session viewing, session admin, session scheduling, etc.)
- Admin Dashboard
  - Web dashboard for authorized users to schedule and manage sessions
  - Admins can modify session authentication requirements
  - Admins can remove actively connected users from a session
  - Admins can edit a session's transcription configuration (diarization, language, vocabulary, etc.)
- Scalability & Performance
  - Support for automated horizontal scaling and load balancing