-
Notifications
You must be signed in to change notification settings - Fork 5
Feature/SOF-7781 -clean Feat: add NB to create and run Band Gap Job #258
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
📝 WalkthroughPre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: defaults Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
🧰 Additional context used🪛 Ruff (0.14.10)utils/auth.py11-11: Possible hardcoded password assigned to: "REFRESH_TOKEN_ENV_VAR" (S105) 102-102: Create your own exception (TRY002) 102-102: Avoid specifying long messages outside the exception class (TRY003) 🔇 Additional comments (6)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
utils/generic.py
Outdated
| from types import SimpleNamespace | ||
| from typing import List, Union | ||
|
|
||
| from IPython.display import HTML, display |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to update the imports to use mat3ra.api_client everywhere - maybe a seprate PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (3)
utils/generic.py (1)
230-237: Consider adding type hints for consistency.The function works correctly, but adding type hints would improve maintainability and align with the typing patterns used elsewhere in this file.
🔎 Proposed enhancement
-def dict_to_namespace(obj): +def dict_to_namespace(obj: Any) -> Any: + """ + Recursively convert dictionaries to SimpleNamespace objects for dot notation access. + + Args: + obj: Input object (dict, list, or primitive value) + + Returns: + SimpleNamespace for dicts, list of converted items for lists, or the original value + """ if isinstance(obj, dict): return SimpleNamespace(**{k: dict_to_namespace(v) for k, v in obj.items()}) elif isinstance(obj, list): return [dict_to_namespace(item) for item in obj] else: return objutils/auth.py (2)
67-82: Improve error handling in token polling loop.The polling loop only checks for status 200 and sleeps on all other responses. Per the device flow spec, specific error codes have different meanings:
authorization_pending- continue pollingslow_down- increase intervalexpired_token,access_denied- stop polling with errorAlso, consider using a custom exception class instead of generic
Exception.🔎 Proposed improvement
+class AuthenticationTimeoutError(Exception): + """Raised when device flow authentication times out.""" + pass + +class AuthenticationError(Exception): + """Raised when device flow authentication fails.""" + pass + async def _poll_for_token_data( ... ) -> dict: deadline_seconds = time.time() + expires_in_seconds while time.time() < deadline_seconds: token_response = requests.post( ... ) if token_response.status_code == 200: return token_response.json() + + # Handle specific OAuth2 device flow errors + if token_response.status_code == 400: + error_data = token_response.json() + error_code = error_data.get("error", "") + if error_code == "authorization_pending": + pass # Continue polling + elif error_code == "slow_down": + polling_interval_seconds += 5 + elif error_code in ("expired_token", "access_denied"): + raise AuthenticationError(f"Authentication failed: {error_code}") + await asyncio.sleep(polling_interval_seconds) - raise Exception("Timeout waiting for authorization.") + raise AuthenticationTimeoutError("Timeout waiting for authorization.")
105-115: Document the magicdata_from_hostglobal variable.The use of
data_from_hostas an undefined global (with# noqa: F821) relies on JupyterLite injecting this variable at runtime. Consider adding a docstring explaining this pattern for maintainability.🔎 Proposed documentation
async def authenticate_jupyterlite(): + """ + Authenticate using config provided by JupyterLite host environment. + + Note: This function expects `data_from_host` to be injected into globals + by the JupyterLite data_bridge extension before this function is called. + """ apiConfig = data_from_host.get("apiConfig") # type: ignore # noqa: F821
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
README.mdconfig.ymlother/materials_designer/workflows/band_gap.ipynbpyproject.tomlutils/auth.pyutils/generic.pyutils/jupyterlite.pyutils/visualize.py
🧰 Additional context used
🧬 Code graph analysis (1)
utils/visualize.py (1)
utils/generic.py (1)
display_JSON(189-227)
🪛 GitHub Actions: Continuous Testing and Docs Publication
utils/generic.py
[error] 1-1: isort hook failed. Files were modified by this hook (utils/generic.py). Run 'pre-commit run --all-files' or commit again to apply the updated imports.
🪛 Ruff (0.14.10)
other/materials_designer/workflows/band_gap.ipynb
35-35: await statement outside of a function
(F704)
36-36: await statement outside of a function
(F704)
39-39: await statement outside of a function
(F704)
45-45: await statement outside of a function
(F704)
86-86: Found useless expression. Either assign it to a variable or remove it.
(B018)
utils/auth.py
12-12: Possible hardcoded password assigned to: "CLIENT_SECRET"
(S105)
16-16: Possible hardcoded password assigned to: "ACCESS_TOKEN_ENV_VAR"
(S105)
17-17: Possible hardcoded password assigned to: "REFRESH_TOKEN_ENV_VAR"
(S105)
82-82: Create your own exception
(TRY002)
82-82: Avoid specifying long messages outside the exception class
(TRY003)
utils/jupyterlite.py
250-250: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (11)
config.yml (1)
65-69: LGTM!The addition of
requeststo theapi_examplespackages is appropriate to support the new OIDC authentication flow inutils/auth.py.README.md (1)
107-120: LGTM!Clear and helpful documentation for local development setup. The example configuration snippet provides the essential variables needed for local API development.
pyproject.toml (1)
17-17: Verify the git-based dependency approach.Using a git URL with a commit hash ensures reproducibility, but git-based dependencies have limitations:
- They won't work when installing from PyPI
- They require git to be available during installation
- Some dependency resolution tools may have issues
Is
mat3ra-wodeexpected to be published to PyPI soon, or is this intended as a permanent installation method?utils/jupyterlite.py (1)
254-266: LGTM!The fallback logic to search by material name when filename match fails is a sensible approach. The case-insensitive matching improves usability.
utils/visualize.py (1)
262-274: LGTM!Clean implementation that leverages the existing
display_JSONutility. The docstring clearly documents the expected interface (to_dict()method) for the workflow parameter.utils/auth.py (2)
10-13: Default configuration looks appropriate for development.These are clearly placeholder defaults for local development. The static analysis warnings about "hardcoded passwords" are false positives - these are configuration defaults, not actual secrets.
118-123: Verify globals() scope in async context.The check
"data_from_host" in globals()is evaluated in the module's global scope, not the caller's. This works ifdata_from_hostis injected into this module's globals, but verify this matches the JupyterLite injection behavior.other/materials_designer/workflows/band_gap.ipynb (4)
535-539: Verifywait_for_jobs_to_finishsignature compatibility.The function is called with
client.jobsas the endpoint argument. The existing implementation inutils/generic.pyexpects aJobEndpointstype fromexabyte_api_client. Verify thatclient.jobsprovides a compatible interface with the requiredlist()method.
37-64: Clear parameter organization.Well-structured configuration section with logical grouping of material, workflow, model, k-grid, and job parameters. The use of a timestamp in the job name helps with identification.
72-83: Pyodide environment setup is correct.The static analysis warnings about
awaitoutside function are false positives - Jupyter notebooks support top-level await natively.
555-565: LGTM - Results retrieval logic.The bandgap retrieval correctly uses the NSCF unit's flowchart ID from the reconstructed workflow to fetch both direct and indirect bandgap values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
utils/auth.py (1)
106-116: Add defensive checks for JupyterLite configuration.The function assumes
data_from_hostis well-formed and contains the expected structure. Consider adding validation to handle missing or malformed configuration gracefully.🔎 Suggested defensive implementation
async def authenticate_jupyterlite(): + if "data_from_host" not in globals(): + raise RuntimeError("authenticate_jupyterlite called outside JupyterLite context") + apiConfig = data_from_host.get("apiConfig") # type: ignore # noqa: F821 + if not apiConfig: + raise ValueError("Missing apiConfig in data_from_host") + os.environ.update(data_from_host.get("environ", {})) # noqa: F821 os.environ.update( dict( - ACCOUNT_ID=apiConfig.get("accountId"), - AUTH_TOKEN=apiConfig.get("authToken"), + ACCOUNT_ID=apiConfig.get("accountId", ""), + AUTH_TOKEN=apiConfig.get("authToken", ""), ORGANIZATION_ID=apiConfig.get("organizationId", ""), CLUSTERS=json.dumps(apiConfig.get("clusters", [])), ) )
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
pyproject.tomlutils/auth.pyutils/generic.py
🚧 Files skipped from review as they are similar to previous changes (2)
- pyproject.toml
- utils/generic.py
🧰 Additional context used
🪛 Ruff (0.14.10)
utils/auth.py
11-11: Possible hardcoded password assigned to: "REFRESH_TOKEN_ENV_VAR"
(S105)
81-81: Create your own exception
(TRY002)
81-81: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (8)
utils/auth.py (8)
1-9: LGTM!Imports are appropriate for the OIDC device flow implementation.
11-11: LGTM!The static analysis warning about a hardcoded password is a false positive—this is simply an environment variable name constant, not a password value.
14-16: LGTM!Clean helper that correctly delegates to the API client's environment configuration.
19-34: LGTM!Correctly implements the OAuth 2.0 Device Authorization Grant initiation step with proper error handling, timeouts, and sensible defaults.
37-48: LGTM!Provides clear user feedback in Jupyter environments with styled HTML and automatic browser window opening.
51-54: LGTM!Correctly stores required access token and optional refresh token in environment variables.
84-103: LGTM!Correctly orchestrates the OAuth 2.0 device flow sequence with proper parameter handling and token storage.
119-124: LGTM!Clean routing logic that correctly handles both JupyterLite and standard environments, with a
forceparameter for re-authentication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (1)
utils/auth.py (1)
79-82: Implement RFC 8628 error handling and use a custom exception.The polling loop doesn't handle OAuth 2.0 Device Authorization Grant error responses per RFC 8628. Non-200 responses with
authorization_pending,slow_down,access_denied, orexpired_tokenare treated identically—the function just continues polling. Additionally, line 82 raises a genericException.Per RFC 8628:
slow_down: MUST increase polling interval by 5 secondsaccess_denied/expired_token: MUST stop polling immediately
🧹 Nitpick comments (1)
utils/auth.py (1)
37-48: Consider HTML-escapinguser_codefor defense in depth.The
user_codeis embedded directly into HTML without escaping. While it originates from your OIDC server (typically alphanumeric), escaping prevents issues if the server is compromised or returns unexpected characters.🔎 Suggested improvement
def show_device_flow_popup(verification_uri_complete: str, user_code: str) -> None: - from IPython.display import HTML + from html import escape + from IPython.display import HTML + + safe_user_code = escape(user_code) display( HTML( f"<div style='padding: 15px; background: #e3f2fd; border-left: 4px solid #2196f3; margin: 10px 0;'>" f"<strong>Authentication Required</strong><br/>" - f"Enter this code: <strong style='font-size: 1.2em; color: #1976d2;'>{user_code}</strong>" + f"Enter this code: <strong style='font-size: 1.2em; color: #1976d2;'>{safe_user_code}</strong>" f"</div>" ) )
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
utils/auth.py
🧰 Additional context used
🪛 Ruff (0.14.10)
utils/auth.py
11-11: Possible hardcoded password assigned to: "REFRESH_TOKEN_ENV_VAR"
(S105)
82-82: Create your own exception
(TRY002)
82-82: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (5)
utils/auth.py (5)
1-12: LGTM!Imports and constants are appropriate. The static analysis warning S105 on line 11 is a false positive—
REFRESH_TOKEN_ENV_VARis an environment variable name, not an actual secret value.
14-16: LGTM!Clean delegation to the library for environment-based URL construction.
19-34: LGTM!Good HTTP error handling with
raise_for_status(), sensible timeout, and appropriate defaults for polling interval and expiration.
51-54: LGTM!Simple and correct token storage with appropriate conditional handling for optional refresh token.
85-102: LGTM!Clean orchestration of the complete device flow: request state → show popup → poll → store tokens.
| async def _poll_for_token_data( | ||
| oidc_base_url: str, | ||
| client_id: str, | ||
| device_code: str, | ||
| polling_interval_seconds: int, | ||
| expires_in_seconds: int, | ||
| ) -> dict: | ||
| deadline_seconds = time.time() + expires_in_seconds | ||
| while time.time() < deadline_seconds: | ||
| token_response = requests.post( | ||
| f"{oidc_base_url}/token", | ||
| data={ | ||
| "grant_type": "urn:ietf:params:oauth:grant-type:device_code", | ||
| "device_code": device_code, | ||
| "client_id": client_id, | ||
| "redirect_uris": [], | ||
| "response_types": [], | ||
| "token_endpoint_auth_method": "none", | ||
| }, | ||
| headers={"Content-Type": "application/x-www-form-urlencoded"}, | ||
| timeout=10, | ||
| ) | ||
| if token_response.status_code == 200: | ||
| return token_response.json() | ||
| await asyncio.sleep(polling_interval_seconds) | ||
| raise Exception("Timeout waiting for authorization.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Blocking I/O in async function.
Using synchronous requests.post() inside an async function blocks the event loop during HTTP calls. This defeats the purpose of async and can cause issues in environments with other concurrent async tasks.
🔎 Suggested fix using run_in_executor
+import functools
+
async def _poll_for_token_data(
oidc_base_url: str,
client_id: str,
device_code: str,
polling_interval_seconds: int,
expires_in_seconds: int,
) -> dict:
deadline_seconds = time.time() + expires_in_seconds
+ loop = asyncio.get_event_loop()
while time.time() < deadline_seconds:
- token_response = requests.post(
+ token_response = await loop.run_in_executor(
+ None,
+ functools.partial(
+ requests.post,
- f"{oidc_base_url}/token",
+ f"{oidc_base_url}/token",
- data={
+ data={
"grant_type": "urn:ietf:params:oauth:grant-type:device_code",
"device_code": device_code,
"client_id": client_id,
"redirect_uris": [],
"response_types": [],
"token_endpoint_auth_method": "none",
},
- headers={"Content-Type": "application/x-www-form-urlencoded"},
- timeout=10,
+ headers={"Content-Type": "application/x-www-form-urlencoded"},
+ timeout=10,
+ ),
)
if token_response.status_code == 200:
return token_response.json()
await asyncio.sleep(polling_interval_seconds)
raise Exception("Timeout waiting for authorization.")Alternatively, consider using aiohttp for native async HTTP.
🧰 Tools
🪛 Ruff (0.14.10)
82-82: Create your own exception
(TRY002)
82-82: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In utils/auth.py around lines 57 to 82, the async function _poll_for_token_data
is making a blocking synchronous requests.post call which blocks the event loop;
replace the blocking call with an async HTTP call (preferably using
aiohttp.ClientSession.post with appropriate timeout and headers) or run the
requests.post inside asyncio.get_running_loop().run_in_executor to offload it to
a thread; ensure you properly await the async response, check response.status,
parse JSON asynchronously (or from the executor result), handle
timeouts/exceptions, and keep the existing polling/sleep and return/exception
behavior unchanged.
| async def authenticate_jupyterlite(): | ||
| apiConfig = data_from_host.get("apiConfig") # type: ignore # noqa: F821 | ||
| os.environ.update(data_from_host.get("environ", {})) # noqa: F821 | ||
| os.environ.update( | ||
| dict( | ||
| ACCOUNT_ID=apiConfig.get("accountId"), | ||
| AUTH_TOKEN=apiConfig.get("authToken"), | ||
| ORGANIZATION_ID=apiConfig.get("organizationId", ""), | ||
| CLUSTERS=json.dumps(apiConfig.get("clusters", [])), | ||
| ) | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add null check for apiConfig.
If data_from_host.get("apiConfig") returns None, line 110 will raise AttributeError: 'NoneType' object has no attribute 'get'.
🔎 Suggested fix
async def authenticate_jupyterlite():
apiConfig = data_from_host.get("apiConfig") # type: ignore # noqa: F821
+ if apiConfig is None:
+ raise ValueError("apiConfig not found in data_from_host")
os.environ.update(data_from_host.get("environ", {})) # noqa: F821
os.environ.update(
dict(
ACCOUNT_ID=apiConfig.get("accountId"),
AUTH_TOKEN=apiConfig.get("authToken"),
ORGANIZATION_ID=apiConfig.get("organizationId", ""),
CLUSTERS=json.dumps(apiConfig.get("clusters", [])),
)
)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| async def authenticate_jupyterlite(): | |
| apiConfig = data_from_host.get("apiConfig") # type: ignore # noqa: F821 | |
| os.environ.update(data_from_host.get("environ", {})) # noqa: F821 | |
| os.environ.update( | |
| dict( | |
| ACCOUNT_ID=apiConfig.get("accountId"), | |
| AUTH_TOKEN=apiConfig.get("authToken"), | |
| ORGANIZATION_ID=apiConfig.get("organizationId", ""), | |
| CLUSTERS=json.dumps(apiConfig.get("clusters", [])), | |
| ) | |
| ) | |
| async def authenticate_jupyterlite(): | |
| apiConfig = data_from_host.get("apiConfig") # type: ignore # noqa: F821 | |
| if apiConfig is None: | |
| raise ValueError("apiConfig not found in data_from_host") | |
| os.environ.update(data_from_host.get("environ", {})) # noqa: F821 | |
| os.environ.update( | |
| dict( | |
| ACCOUNT_ID=apiConfig.get("accountId"), | |
| AUTH_TOKEN=apiConfig.get("authToken"), | |
| ORGANIZATION_ID=apiConfig.get("organizationId", ""), | |
| CLUSTERS=json.dumps(apiConfig.get("clusters", [])), | |
| ) | |
| ) |
🤖 Prompt for AI Agents
In utils/auth.py around lines 105 to 115, add a null check for apiConfig (result
of data_from_host.get("apiConfig")) before accessing its .get methods; if
apiConfig is None, either raise a clear exception (e.g., ValueError with
context) or set apiConfig to an empty dict/defaults so subsequent
os.environ.update calls don't raise AttributeError, then proceed to populate
ACCOUNT_ID, AUTH_TOKEN, ORGANIZATION_ID, and CLUSTERS using safe .get calls or
defaults.
| async def authenticate(force=False): | ||
| if "data_from_host" in globals(): | ||
| await authenticate_jupyterlite() | ||
| else: | ||
| if ACCESS_TOKEN_ENV_VAR not in os.environ or force: | ||
| await authenticate_oidc() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing return statements cause inconsistent behavior.
authenticate_oidc() returns token data, but authenticate() doesn't propagate it. This means callers get None regardless of which path executes.
🔎 Suggested fix
-async def authenticate(force=False):
+async def authenticate(force: bool = False) -> Optional[dict]:
if "data_from_host" in globals():
- await authenticate_jupyterlite()
+ await authenticate_jupyterlite()
+ return None
else:
if ACCESS_TOKEN_ENV_VAR not in os.environ or force:
- await authenticate_oidc()
+ return await authenticate_oidc()
+ return None🤖 Prompt for AI Agents
In utils/auth.py around lines 118 to 123, authenticate() currently calls
authenticate_jupyterlite() or authenticate_oidc() but does not return their
results, causing callers to always receive None; modify the function so it
returns the value from the called authentication function (e.g., return await
authenticate_jupyterlite() and return await authenticate_oidc()) and ensure any
code path that should produce token data returns it (and explicitly return None
only when appropriate).
| return build_oidc_base_url(env.host, env.port, env.secure) | ||
|
|
||
|
|
||
| def request_device_flow_state(oidc_base_url: str, client_id: str, scope: str) -> dict: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs a docstring
Summary by CodeRabbit
New Features
Documentation
Chores
✏️ Tip: You can customize this high-level summary in your review settings.