Rapid Integration and Visualization for Enhanced Research (RIVER) is an integrated ecosystem for data and computing, built as a monolith with a Python backend (BlackSheep) and a JavaScript frontend (Vite). While the current structure is monolithic, it is architected for potential refactoring into microservices. For scientific applications, RIVER aims to stay lightweight and act as a system controller connecting data, software, and users. For a quick recorded video demo, see: https://www.youtube.com/watch?v=boabEFNIkNA
RIVER consists of the following components:
- Backend: Asynchronous web server built with BlackSheep, a high-performance Python framework. Uses PostgreSQL as the database, Redis for caching, and Celery for background job processing and monitoring.
- Frontend: React.js application powered by Vite and Material UI (MUI) for the user interface.
- Traefik: Modern reverse proxy used for routing and load balancing.
The RIVER platform for my research group is deployed at https://platform.riverxdata.com. You can sign in with your own credentials to test the platform; however, use credentials that can be easily revoked after testing.
Currently, the RiverXData platform supports three main components:
- Version control: designed around a base class that can be extended to support additional version control types. Currently supports GitHub via a token.
- Storage: designed around a base class that can be extended to support additional storage types. Currently supports S3-compatible storage via IAM keys and an optional ARN.
- Computing: designed around a base class that can be extended to support additional computing types. Currently supports a plain Linux server or a SLURM scheduler via SSH, as sketched below.
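For intuition, here is a minimal sketch of how a job might be dispatched over SSH, which is the mechanism the computing component uses; the hostnames, user, and script path below are hypothetical placeholders, not the platform's actual configuration:

```bash
# Hypothetical example: submit a tool's main.sh to a remote SLURM cluster over SSH.
ssh river@slurm.example.org 'sbatch --job-name=river-demo --wrap "bash /data/tools/demo/river/main.sh"'

# For a plain Linux server, the same script could simply be executed directly:
ssh river@compute.example.org 'bash /data/tools/demo/river/main.sh'
```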
Tutorial on how to develop and add tools to the platform: https://www.youtube.com/watch?v=boabEFNIkNA
There are two types of tools that can be added to the platform: non-UI tools and web-based tools.
Once you add your GitHub credential, the platform retrieves the pipeline parameters from the repository and lets you add or modify them through the UI.
Each tool is required to have a river folder with a main.sh file as the script executor, as sketched below. Besides that, all nf-core pipelines are supported; their parameters are configured via nextflow_schema.json and the profiles in the conf folder.
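A minimal sketch of the expected layout and executor script, based on the description above (the directory and parameter names are placeholders; the template tool listed below shows the real convention):

```bash
#!/usr/bin/env bash
# river/main.sh — sketch of a "Hello world" executor (placeholder logic).
# Assumed tool layout, per the description above:
#   my-tool/
#   ├── river/
#   │   └── main.sh          # script executor, the platform's entry point
#   ├── conf/                # profiles (for nf-core pipelines)
#   └── nextflow_schema.json # parameter definitions (for nf-core pipelines)
set -euo pipefail
echo "Hello world"
# Echo every parameter the platform passes in (placeholder behavior).
for param in "$@"; do
    echo "param: $param"
done
```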
Non-UI tools:
- template: A "Hello world" template that prints the values of all input parameters.
- sarek: Variant calling pipeline for germline and somatic variants from whole genome, exome, or targeted sequencing data. Supports tumor/normal analyses.
- rnaseq: RNA-seq processing pipeline that includes quality control, alignment or pseudo-alignment, quantification, and generation of gene expression matrices.
- ampliseq: Amplicon sequencing pipeline for microbial community profiling, such as 16S rRNA gene sequencing.
- quantms: Quantitative proteomics pipeline for label-free and isobaric labeling analyses using both DDA and DIA data.
- taxprofiler: Taxonomic profiling pipeline for shotgun metagenomics, supporting multiple tools and producing standardized outputs.
- methylseq: DNA methylation analysis pipeline using bisulfite-treated sequencing data. Supports multiple aligners and provides comprehensive QC.
- circrna: Pipeline for detecting and quantifying circular RNAs (circRNAs) from RNA-seq data, including miRNA target prediction.
- mag: Metagenomic pipeline for assembling, binning, and annotating metagenome-assembled genomes (MAGs) from short or long reads.
- atacseq: ATAC-seq pipeline to identify open chromatin regions, perform peak calling, and assess data quality with various QC metrics.
- rnafusion: Fusion detection pipeline using RNA-seq data, combining results from multiple fusion detection tools into reports and visualizations.
Web-based tools:
- template: A Streamlit app that simulates gene expression data for BRCA1 and BRCA2 across two groups: cancer vs. normal.
- tf-finder: A wrapper for TFinder, an easy-to-use Python web tool for identifying Transcription Factor Binding Sites (TFBS) and Individual Motifs (IM).
- CARTAR: A wrapper for the CARTAR web server, designed to assist scientists in the in silico identification and validation of immunotherapeutic cell-surface targets for attacking tumor cells.
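As an illustration, a web-based tool's executor might simply launch the app server. This is a sketch only; the port variable and file names are assumptions, not the platform's confirmed contract:

```bash
#!/usr/bin/env bash
# river/main.sh — hypothetical executor for a Streamlit-based web tool.
set -euo pipefail
pip install -r requirements.txt                       # install the tool's dependencies
streamlit run app.py --server.port "${PORT:-8501}"    # PORT is a placeholder variable
```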
You should deploy your own platform using the tutorial below.
A .env file is required for deployment. For a quick setup during validation, the committed .env can be used; its Google Client ID is safe to use. The platform can be deployed with Docker Compose on a cloud server. To set up your own Google Client ID, follow here. For users to test after deployment, follow here.
## Developer
To simulate the appropriate services for testing purposes: by default, the committed .env targets staging only and supports the localhost setup. For binding a domain to a VPS, adjust the domain name in VITE_BACKEND_URL, FRONTEND_URL, and URL (see the example after the table).
Adjust your setup in the .env file. For a detailed explanation of the variables, see below:
| Variable Name | Description | Example Value |
|---|---|---|
| LETSENCRYPT_EMAIL | Email address for Let's Encrypt SSL certificate registration. | nttg8100@gmail.com |
| VITE_BACKEND_URL | Backend API URL for the frontend to connect to. | http://localhost |
| FRONTEND_URL | URL where the frontend is served. | http://localhost |
| VITE_APP_GOOGLE_CLIENT_ID | Google OAuth client ID for authentication. | 212676895890-3ad1thuq1kmenn32noc0kut7rl9lelk9.apps.googleusercontent.com |
| CACHE_DB_HOST | Hostname for the Redis cache used by Celery. | river-redis |
| BASE_API_HOST | Hostname for the backend API server. | river-backend |
| POSTGRES_DATABASE | Name of the PostgreSQL database. | river |
| POSTGRES_USER | PostgreSQL database username. | river |
| POSTGRES_PORT | PostgreSQL database port. | 5432 |
| POSTGRES_PASSWORD | PostgreSQL database password. | password |
| POSTGRES_HOST | Hostname for the PostgreSQL database server. | river-db |
| APP_ENV | Application environment (e.g., prod, dev). | prod |
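For example, the domain-related overrides mentioned above might look like this when binding the platform to a VPS. This is an illustrative sketch only: example.com is a placeholder, and the exact format of URL is an assumption since it is not listed in the table.

```bash
# .env overrides for a VPS deployment (illustrative values only).
# Backend API URL used by the frontend:
VITE_BACKEND_URL=https://example.com
# Where the frontend is served:
FRONTEND_URL=https://example.com
# Base domain (format assumed; not documented in the table above):
URL=example.com
```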
NOTE: For network communication in the "dev" environment, add the line `127.0.0.1 river-localstack` to /etc/hosts so the S3 storage is accessible everywhere.
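One way to add that entry from a shell:

```bash
# Append the LocalStack alias to /etc/hosts (requires sudo).
echo "127.0.0.1 river-localstack" | sudo tee -a /etc/hosts
```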
Only the backend has tests, written with pytest. For the credential tests, obtain the GitHub token here.
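A minimal sketch of supplying the token before running the credential tests; the GITHUB_TOKEN variable name is an assumption, not confirmed by this README, so check the test fixtures for the exact name expected:

```bash
# Provide a GitHub personal access token for the credential tests.
export GITHUB_TOKEN="<your-token>"   # variable name assumed, not confirmed by this README
make test-cred                       # credential test target (listed below)
```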
Use the provided Makefile to automate environment setup and service management.
- Frontend (Node.js 20.17.0): `make dev-frontend`
- Backend (Python 3.12.11): `make dev-backend`
- Traefik (v3.5.0): `make dev-traefik`
- SLURM (builds local SLURM Docker image): `make dev-slurm`

To set up all at once: `make dev`

- Start SLURM and Redis: `make start-dev-infra`
- Start local PostgreSQL DB and initialize/migrate: `make start-dev-db`
- Backend (dev mode): `make start-backend`
- Frontend: `make start-frontend`
- Traefik: `make start-traefik`
- Celery worker: `make start-celery`
- Start test infrastructure (LocalStack, Redis, test DB): `make start-test-infra`
- Run backend tests:
  - Auth: `make test-auth`
  - Organization: `make test-org`
  - Credential: `make test-cred`
  - Project: `make test-pro`
  - Storage: `make test-storage`
  - Public analysis: `make test-public-analysis`
  - Job: `make test-job`
  - All: `make test-all`
- Start SLURM: `make start-slurm`
- Start LocalStack (S3 simulation): `make start-localstack`
- Remove development DB volume: `make clean-dev-db`
- Deploy production stack: `make production`
Refer to the Makefile for additional targets and details.
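As an illustration, a typical local development sequence chaining the targets above might look like this; the ordering follows the target descriptions and is an assumption, not prescribed by the Makefile:

```bash
# One-time environment setup for all components.
make dev

# Bring up local infrastructure, then each service (typically in separate terminals).
make start-dev-infra   # SLURM and Redis
make start-dev-db      # local PostgreSQL + init/migrations
make start-backend     # backend in dev mode
make start-celery      # Celery worker
make start-frontend    # Vite dev server
make start-traefik     # reverse proxy
```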
This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.
