-
Notifications
You must be signed in to change notification settings - Fork 3
feat: add --sample-database flag for initializing embedded emulator with official samples #472
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ith official samples - Add --sample-database flag to load Google's official sample databases - Add --list-samples flag to show available sample databases - Support 5 official samples: banking, finance, finance-graph, finance-pg, gaming - Implement URI-based resource loading (gs://, file://, http://, https://) - Integrate with spanemuboost for DDL/DML initialization - Use memefish for proper SQL statement splitting - Add cloud.google.com/go/storage dependency for GCS support Fixes #470 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @apstndb, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances the spanner-mycli tool by enabling the embedded Cloud Spanner emulator to be initialized with pre-populated official sample databases. This feature streamlines the setup process for development and testing environments, allowing users to quickly get a functional database instance with relevant data without manual configuration. The implementation supports loading data from diverse sources and automatically handles dialect differences.
Highlights
- New Feature: Sample Database Initialization: Introduces a new
--sample-databaseflag to initialize the embedded Cloud Spanner emulator with official sample databases (banking, finance, finance-graph, finance-pg, gaming). - New Feature: List Samples: Adds a
--list-samplesflag to display all available sample databases and their descriptions. - Flexible URI-based Resource Loading: Enables loading schema and data from various sources including Google Cloud Storage (
gs://), local file system (file://), and HTTP/HTTPS URLs (http://,https://). - Enhanced Emulator Integration: Leverages
spanemuboostfor DDL/DML initialization andmemefishfor robust SQL statement splitting, including automatic dialect detection for PostgreSQL samples.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
Code Metrics Report📊 View detailed coverage report (available for 7 days)
Details | | main (889ab33) | #472 (6209b57) | +/- |
|---------------------|----------------|----------------|-------|
- | Coverage | 70.2% | 69.9% | -0.3% |
| Files | 51 | 53 | +2 |
| Lines | 5534 | 5688 | +154 |
+ | Covered | 3885 | 3977 | +92 |
- | Code to Test Ratio | 1:1.2 | 1:1.2 | -0.1 |
| Code | 12765 | 13067 | +302 |
+ | Test | 15975 | 16312 | +337 |
- | Test Execution Time | 33s | 35s | +2s |Code coverage of files in pull request scope (63.2% → 62.6%)
Reported by octocov |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a valuable feature for initializing the embedded emulator with official sample databases. The implementation is well-structured, with the core logic for handling sample databases and URI-based resource loading cleanly separated into a new sample_databases.go file. The test coverage for the new functionality is also comprehensive.
My review includes a few high-severity suggestions focused on improving performance, robustness, and resource safety, particularly around network and file operations, in alignment with the project's style guide.
- Add --sample-database and --list-samples flags to usage section - Add "Sample Databases" subsection under Embedded Cloud Spanner Emulator - Include examples showing how to list and use sample databases - Fix column name in COUNT example (functions need AS alias for column name) 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
…ations - Implement loadMultipleFromURIs() for efficient batch loading of mixed sources - Add parallel processing using goroutines with sync.WaitGroup.Go() (Go 1.25 feature) - Batch all GCS operations to use a single client for efficiency - Process file:// and HTTP(S) URIs in parallel for better performance - Add HTTP timeout using context.WithTimeout (30 seconds) - Simplify main.go to use the unified batch loading approach This optimization significantly improves loading performance when initializing sample databases with multiple resources from different sources.
- Extract common goroutine pattern into processURIs helper function - Reduce code duplication between file:// and HTTP(S) processing - Maintain same functionality with cleaner, more maintainable code - Use WaitGroup.Go() consistently through helper function
- Remove unnecessary pre-grouping into fileJobs, httpJobs, etc. - Process non-GCS URIs directly in the main loop using WaitGroup.Go() - Only collect GCS URIs separately for batch processing efficiency - Remove unnecessary mutex (each goroutine writes to unique index) - Reduce code by ~20 lines while maintaining same functionality This makes the code simpler and more direct - we process URIs as we encounter them instead of categorizing first then processing.
- Replace loadBatchFromGCS with loadFromGCSWithClient that uses a passed client - Create GCS client once and share it across all GCS operations - Treat all URI types uniformly in the same processing loop - Further reduce code by ~30 lines while maintaining functionality This completes the simplification - all URI types are now processed identically with just the loader function being different.
…DML processing - Change loadMultipleFromURIs to return map[string][]byte for cleaner access - Eliminate error-prone manual index tracking with resultIdx - Extract common DDL/DML processing into addStatements helper function - Add mutex protection for concurrent map writes (required for maps) - Simplify error messages to be more consistent This refactoring reduces code duplication and makes the implementation more robust by eliminating manual index management.
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a valuable feature for initializing the embedded emulator with sample databases. The implementation is generally well-structured, utilizing URI-based loading for flexibility. However, I've identified a critical compilation error in the concurrency logic, a missing security check that violates the project's style guide, and an incorrect test case that will fail. Addressing these points will significantly improve the robustness and correctness of the new functionality.
- Add maxFileSize constant set to 10MB for sample databases - Check file size with os.Stat before reading file:// URIs - Check object size with Attrs() before reading GCS objects - Check Content-Length header and use io.LimitReader for HTTP responses - Prevents accidental OOM from loading unexpectedly large files Addresses review feedback about file size safety limits.
- Remove unnecessary loop variable copy (Go 1.25 has fixed semantics) - Check reader.Close() error with defer func pattern - Fix formatting with gofumpt - Use separate variable name for os.Stat error to avoid shadowing
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a valuable feature for initializing the embedded emulator with sample databases. The implementation is well-structured, with logic for handling various URI schemes encapsulated in a new sample_databases.go file. I've found one critical issue that will prevent the code from compiling, which I've detailed in a specific comment.
- Add validation to prevent reading from special files (devices, pipes, sockets) - Document WaitGroup.Go() as a Go 1.25 feature in styleguide - Improves safety when loading file:// URIs Addresses review feedback about file safety validation.
Document the observed behavior that memefish.SplitRawStatements: - Includes leading line comments with their statements - Does not include block comments between statements
Reference the upstream PR that will fix block comment handling
- Created file_safety.go with ValidateFileSafety and SafeReadFile functions - Unified file safety validation logic between cli.go and sample_databases.go - Added comprehensive tests for file safety functions - Added platform-specific FIFO tests (Unix-only with runtime.GOOS check) - Updated CLAUDE.md to specify copyright header convention for new files - Fixed cli_test.go to match new error messages from SafeReadFile This improves security by centralizing file validation logic and makes it easier to maintain consistent safety checks across the codebase.
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a valuable feature for initializing the embedded emulator with sample databases. The implementation is well-structured, with good separation of concerns into new files for sample database logic and file safety checks. The URI-based loading is a flexible approach. My review focuses on a critical compilation error in the concurrent download logic, improving error reporting for multiple failures, and ensuring consistency with the project's newly defined copyright guidelines for new files.
- Fixed copyright headers for new files (sample_databases.go, sample_databases_test.go) to use 'Copyright 2025 apstndb' - Improved error handling in loadMultipleFromURIs to report all errors when multiple failures occur - Updated CLAUDE.md to clarify copyright policy for files created after fork vs existing files - Errors are now sorted for deterministic output
|
/gemini review |
Fixed trailing whitespace on line 166 of sample_databases.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a valuable feature for initializing the embedded emulator with sample databases, along with a convenient --list-samples flag. The implementation is well-structured, with URI-based resource loading and a new centralized file safety module. The test coverage for the new functionality is also comprehensive. I've found one critical issue related to concurrency that will prevent the code from compiling, which I've detailed in a specific comment.
Development Insights from ImplementationThis PR implements the Architecture Decisions
Key Learnings
Testing Coverage
Future ConsiderationsThe URI loading infrastructure created here could be extended to other features like the |
|
/gemini summary |
Summary of ChangesThis pull request significantly enhances the Highlights
Changelog
Activity
|
Summary
This PR implements the
--sample-databaseflag for initializing the embedded emulator with Google's official Spanner sample databases, as specified in #470.Changes
New Features
--sample-database=<name>flag to initialize emulator with official samples--list-samplesflag to list available sample databasesbanking,finance,finance-graph,finance-pg,gamingImplementation Details
gs://- Google Cloud Storage (for official samples)file://- Local file systemhttp:///https://- HTTP(S) downloadsfinance-pg)Code Organization
sample_databases.go- Core implementation with registry and URI loadingsample_databases_test.go- Comprehensive test coverageTest Plan
--list-samplesmake checkpasses (all tests and lint)Usage Examples
Future Work
\.meta commandFixes #470
🤖 Generated with Claude Code