tzuwei93/nbe-user-code

Nearby Beverage Explorer

A data pipeline for exploring and analyzing nearby beverage places using dbt, Dagster, and Spark.

Prerequisites

Before getting started, you'll need:

  1. Local Spark Thrift Server

    • This project requires a local Spark Thrift Server to run the dbt models
    • The Spark SQL endpoint should be available at jdbc:hive2://localhost:10000
    • Make sure your Spark Thrift Server is properly configured with Hudi support
  2. Environment Variables

    • GOOGLE_PLACES_API_KEY: Your Google Places API key
    • Other environment variables as specified in the project
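Both prerequisites can be checked before starting the pipeline. The following is a minimal sketch, not part of the repository: the function names `thrift_server_reachable` and `missing_env_vars` are hypothetical, and the host and port are taken from the endpoint listed above. A successful TCP connection only shows that something is listening on the port; it does not confirm that the server is a Spark Thrift Server or that Hudi support is configured.

```python
import os
import socket


def thrift_server_reachable(host: str = "localhost", port: int = 10000,
                            timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


def missing_env_vars(required=("GOOGLE_PLACES_API_KEY",), environ=None):
    """Return the names in `required` that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    print("Spark Thrift Server reachable:", thrift_server_reachable())
    print("Missing environment variables:", missing_env_vars())
```

Any additional variables the project needs can be passed through the `required` argument.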

Project Structure

.
├── dbt_main/           # dbt project files
│   └── models/         # dbt models
├── dag/                # Dagster pipeline definitions
└── docker-compose.yml  # Docker configuration

[Image: Dagster UI preview]

Demo

Quick Start

  1. Start your local Spark Thrift Server
  2. Set up environment variables in a .env file
  3. Run the Dagster pipeline:
    dagster dev
  4. Access the Dagster UI at http://localhost:3000
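For step 2, the .env file is assumed here to hold plain `KEY=VALUE` lines; the README does not specify its exact format. The sketch below shows one way to read such a file with the standard library only. The helper names `parse_env_text` and `load_env_file` are hypothetical and are not part of this project; it is not meant to replace whatever loading mechanism the project already relies on.

```python
from pathlib import Path


def parse_env_text(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip()
        # Strip one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key] = value
    return values


def load_env_file(path: str = ".env") -> dict:
    """Return the variables defined in `path`, or an empty dict if it is absent."""
    env_path = Path(path)
    if not env_path.exists():
        return {}
    return parse_env_text(env_path.read_text(encoding="utf-8"))
```

A quick check such as `"GOOGLE_PLACES_API_KEY" in load_env_file()` can confirm that the key is defined before running `dagster dev`.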

Running dbt Models

To run dbt models directly:

cd dbt_main
dbt build

Scheduling

The pipeline is scheduled to run every Friday at 6 AM by default. You can modify this in dag/definitions.py.
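To make the default cadence concrete, the sketch below computes the next Friday 6 AM run from a given moment. It is an illustration only: the README does not show how dag/definitions.py defines the schedule, so the function `next_friday_6am` and the cron string `0 6 * * 5` (the conventional cron form of "Friday at 06:00") are assumptions, and the time zone is left unspecified because the source does not state one.

```python
from datetime import datetime, timedelta

FRIDAY = 4  # datetime.weekday(): Monday is 0, so Friday is 4
CRON_EQUIVALENT = "0 6 * * 5"  # cron day-of-week: Sunday is 0, so Friday is 5


def next_friday_6am(now: datetime) -> datetime:
    """Return the first Friday 06:00 strictly after `now`."""
    days_ahead = (FRIDAY - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=6, minute=0, second=0, microsecond=0
    )
    # If it is already Friday at or past 06:00, roll over to next week.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


if __name__ == "__main__":
    print("Next scheduled run:", next_friday_6am(datetime.now()))
```

Changing the day or hour in dag/definitions.py would correspond to changing the matching fields of the cron expression.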

Notes

  • This project uses Apache Hudi for incremental data processing
  • Check the Dagster UI for pipeline execution details and logs

About

An evolved version of https://github.com/tzuwei93/nearby_beverage_explorer, rebuilt with Dagster and dbt.
