@sou-cheng-choi
Collaborator

  • Corrects a broken pip install in env.yml by updating the qmctoolscl pip pin to version 1.1.5 and the qmcpy pin to 2.0.
  • Fixes the broken demo notebook.
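
One way to confirm that an environment actually picked up these pins is sketched below. The package names and versions come from the description above; the helper names `matches_pin` and `check_pins`, and the rule that a pin of 2.0 also accepts 2.0.x, are illustrative assumptions and not part of this PR.

```python
from importlib import metadata

# Pins taken from the PR description. Treating "2.0" as also matching
# "2.0.x" is an assumption made for this sketch.
PINS = {"qmcpy": "2.0", "qmctoolscl": "1.1.5"}


def matches_pin(installed, pin):
    """Return True if `installed` equals the pin or extends it with more components."""
    return installed == pin or installed.startswith(pin + ".")


def check_pins(pins=PINS, get_version=metadata.version):
    """Return {package: problem} for every package that is missing or off-pin."""
    problems = {}
    for name, pin in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if not matches_pin(installed, pin):
            problems[name] = f"installed {installed}, expected {pin}"
    return problems
```

Calling `check_pins()` inside the activated environment returns an empty dict when both packages match their pins, and a short description of each mismatch otherwise.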

@sou-cheng-choi sou-cheng-choi changed the title from "Update environment pin" to "Update environment pins and fix demo notebook" on Nov 20, 2025
Collaborator

@alegresor alegresor left a comment

Please do not put line breaks in the README. There should be an option in your editor to toggle word wrap so that long lines are easy to view. Hard line breaks are difficult to maintain when the text is edited.

Copilot AI left a comment

Pull Request Overview

This PR updates environment dependencies and adds infrastructure for uploading the LDData repository to Hugging Face Datasets Hub. The main changes support transitioning from a standalone GitHub repository to a publicly accessible dataset on Hugging Face, making low-discrepancy point set parameters more discoverable and easier to use in QMC research.

Key changes:

  • Updates qmcpy to version 2.0 and qmctoolscl to version 1.1.5 to fix broken pip installations
  • Adds comprehensive upload tooling (upload.py, git_lfs_upload.sh) and GitHub Actions workflow for automated synchronization to Hugging Face
  • Reorganizes documentation: transforms README.md into a Hugging Face dataset card and moves technical specifications to LD_DATA.md

Reviewed Changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| env.yml | Updates qmcpy and qmctoolscl versions, adds huggingface_hub dependency for upload functionality |
| upload.py | New Python script for uploading repository to Hugging Face Datasets Hub with retry logic and fallback mechanisms |
| scripts/git_lfs_upload.sh | New bash script for git-based uploads using git-lfs for large files |
| README.md | Transformed into a Hugging Face dataset card with usage examples, citations, and dataset structure documentation |
| LD_DATA.md | New file containing the original technical specification for low-discrepancy data formats (moved from old README) |
| LICENSE.txt | Adds Apache 2.0 license file |
| .gitignore | Adds patterns for Python cache files, VS Code settings, and script directories |
| .github/workflows/sync-to-huggingface.yml | New GitHub Actions workflow for automated synchronization to Hugging Face on push |

sou-cheng-choi and others added 7 commits November 20, 2025 10:49
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@sou-cheng-choi
Collaborator Author

@alegresor @copilot @zitterbewegung

The CI tests pass now.

Copilot AI commented Nov 20, 2025

@sou-cheng-choi I've opened a new pull request, #8, to work on those changes. Once the pull request is ready, I'll request review from you.

@sou-cheng-choi sou-cheng-choi changed the title from "Update environment pins and fix demo notebook" to "Update environment pins; fix demo notebook; and sync to Hugging Face" on Nov 20, 2025
@zitterbewegung

We should not create a wrapper on top of huggingface_hub if we are only going to create or update a dataset. See https://huggingface.co/docs/datasets/en/upload_dataset
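
To illustrate the point, here is a minimal sketch of what calling huggingface_hub directly could look like. `HfApi.create_repo` and `HfApi.upload_folder` are existing huggingface_hub calls; the helper names `build_upload_kwargs` and `sync_dataset`, the ignore patterns, the commit message, and the `HF_TOKEN` environment variable are assumptions chosen for illustration and are not taken from this PR.

```python
import os


def build_upload_kwargs(repo_id, folder_path=".", commit_message="Sync from GitHub"):
    """Collect the arguments passed to HfApi.upload_folder (hypothetical helper)."""
    return {
        "repo_id": repo_id,
        "folder_path": folder_path,
        "repo_type": "dataset",  # upload to a dataset repo, not a model repo
        "commit_message": commit_message,
        # Illustrative patterns only; adjust to the repository layout.
        "ignore_patterns": [".github/*", "__pycache__/*"],
    }


def sync_dataset(repo_id, token=None, **overrides):
    """Create the dataset repo if it is missing, then upload the folder in one commit."""
    # Imported lazily so the helper above can be used without the package installed.
    from huggingface_hub import HfApi

    api = HfApi(token=token or os.environ.get("HF_TOKEN"))
    api.create_repo(repo_id, repo_type="dataset", exist_ok=True)
    return api.upload_folder(**build_upload_kwargs(repo_id, **overrides))
```

A call such as `sync_dataset("your-org/your-dataset")`, where the repo id is a placeholder, would need network access and a valid write token.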

Collaborator

@alegresor alegresor left a comment

Thank you for working on this @sou-cheng-choi, it looks like you have done a lot!

May I suggest you break this into much smaller PRs that would be easier to review and faster to get merged? I would suggest the following:

  1. A PR which adds the LICENSE
  2. A PR which removes env.yml in favor of pyproject.toml
  3. A PR which adds your enhancements to README.md and adds LD_DATA.md
  4. A PR which adds the Hugging Face action

For adding the HF action, I must admit I do not understand what many of your files are doing. The repo ruslanmv/How-to-Sync-Hugging-Face-Spaces-with-a-GitHub-Repository gives an MWE of how to sync data from a repo into HF. Based on their MWE, I would expect the suggested PR 4 to add only a single file to .github/workflows/ which automatically uploads the dataset to HF whenever something is pushed to a branch.

@sou-cheng-choi
Collaborator Author

I will close this PR and break it into multiple PRs following your suggestions.

@sou-cheng-choi
Collaborator Author

Reopening so that I don't forget to open the sub-PRs.
