@geetu040
Contributor

Towards #1575

This PR sets up the core folder and file structure along with base scaffolding for the API v1 → v2 migration.

It includes:

  • Skeleton for the HTTP client, backend, and API context
  • Abstract resource interfaces and versioned stubs (*V1, *V2)
  • Minimal wiring to allow future version switching and fallback support

No functional endpoints are migrated yet. This PR establishes a stable foundation for subsequent migration and refactor work.
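
As a rough illustration of the structure described above, an abstract resource interface with versioned stubs might look like the sketch below. The names `TasksAPI`, `TasksV1`, `TasksV2`, and `build_tasks_api` are hypothetical and chosen only for this example; they are not taken from the PR's actual code, and the return values are placeholders.

```python
from abc import ABC, abstractmethod


class TasksAPI(ABC):
    """Hypothetical abstract resource interface shared by all API versions."""

    @abstractmethod
    def get(self, task_id: int) -> dict:
        """Return a task description for ``task_id``."""


class TasksV1(TasksAPI):
    def get(self, task_id: int) -> dict:
        # A real implementation would call the v1 server; this stub only
        # echoes its input so the wiring can be exercised.
        return {"id": task_id, "api_version": "v1"}


class TasksV2(TasksAPI):
    def get(self, task_id: int) -> dict:
        # Stub only: mirrors "no functional endpoints are migrated yet".
        raise NotImplementedError("tasks.get is not migrated to v2 yet")


def build_tasks_api(version: str) -> TasksAPI:
    """Pick the stub for the requested version (illustrative wiring only)."""
    registry = {"v1": TasksV1, "v2": TasksV2}
    return registry[version]()
```

Callers depend only on the abstract interface, so a later PR can fill in a `*V2` stub without touching call sites.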

@geetu040 geetu040 mentioned this pull request Dec 30, 2025
25 tasks
@codecov-commenter

codecov-commenter commented Dec 31, 2025

Codecov Report

❌ Patch coverage is 84.18605% with 34 lines in your changes missing coverage. Please review.
✅ Project coverage is 53.42%. Comparing base (c5f68bf) to head (74ab366).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| openml/_api/http/client.py | 82.60% | 12 Missing ⚠️ |
| openml/_api/resources/tasks.py | 87.23% | 6 Missing ⚠️ |
| openml/_api/runtime/fallback.py | 0.00% | 6 Missing ⚠️ |
| openml/_api/runtime/core.py | 81.48% | 5 Missing ⚠️ |
| openml/_api/resources/datasets.py | 77.77% | 2 Missing ⚠️ |
| openml/_api/__init__.py | 75.00% | 1 Missing ⚠️ |
| openml/_api/config.py | 96.87% | 1 Missing ⚠️ |
| openml/tasks/functions.py | 87.50% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1576      +/-   ##
==========================================
+ Coverage   53.02%   53.42%   +0.39%     
==========================================
  Files          36       46      +10     
  Lines        4326     4537     +211     
==========================================
+ Hits         2294     2424     +130     
- Misses       2032     2113      +81     

cache: CacheConfig


settings = Settings(
Collaborator

I would move the settings to the individual classes. I think this design couples the classes too tightly to this file. You cannot move the classes around, or add a new API version, without making non-extensible changes to this file, because APISettings would need a constructor change and would have to accept the new classes.

Instead, a better design is to apply the strategy pattern cleanly to the different API definitions - v1 and v2 - and move the config either to their __init__, or a set_config (or similar) method.
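
A minimal sketch of what this suggestion could look like, assuming each versioned backend receives its own config through `__init__`. `APIConfig` with `server` and `key` fields follows the snippet under review; `APIBackendV1`, `APIBackendV2`, `APIContext`, the `task_url` method, the URL paths, and the `example.invalid` servers are all hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class APIConfig:
    server: str
    key: str


class APIBackendV1:
    """Each strategy owns its config instead of reading a shared settings file."""

    def __init__(self, config: APIConfig) -> None:
        self.config = config

    def task_url(self, task_id: int) -> str:
        # Hypothetical v1 path layout, for illustration only.
        return f"{self.config.server}task/{task_id}"


class APIBackendV2:
    def __init__(self, config: APIConfig) -> None:
        self.config = config

    def task_url(self, task_id: int) -> str:
        # Hypothetical v2 path layout, for illustration only.
        return f"{self.config.server}tasks/{task_id}"


class APIContext:
    """Holds the active strategy and delegates to it."""

    def __init__(self, backend) -> None:
        self._backend = backend

    def set_backend(self, backend) -> None:
        self._backend = backend

    def task_url(self, task_id: int) -> str:
        return self._backend.task_url(task_id)
```

With this shape, adding a third API version means adding one class with its own config; no central settings constructor has to change.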

Collaborator

@fkiraly fkiraly left a comment

Overall really great, I have a design suggestion related to the configs.

The config.py file, and the coupling to it, breaks an otherwise nice strategy pattern.

I recommend following the strategy pattern cleanly instead and moving the configs into the class instances, see above.

This will make the backend API much more extensible and cohesive.

        key="...",
    ),
    v2=APIConfig(
        server="http://127.0.0.1:8001/",
Collaborator

Should this be hard-coded? I guess this is just for your local development.

Contributor Author

Yes, this is hard-coded. These are the default values, though the local endpoints will be replaced by the remote server once it is deployed, hopefully before this is merged into main.


if strict:
    return v2

Collaborator

In a previous commit the 'FallbackProxy' was used here. Do we still need this class?

Contributor Author

I removed this because of the ruff errors. I'll put it back and fix the pre-commit checks once the class is implemented.
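
For context, one way a fallback proxy of this kind could be sketched is shown below. This is not the removed `FallbackProxy` implementation, which is not visible in this thread; the constructor arguments, the choice of `NotImplementedError` as the fallback trigger, and the `select_backend` helper are assumptions made for illustration.

```python
class FallbackProxy:
    """Try the primary backend first; fall back if it is not implemented."""

    def __init__(self, primary, fallback):
        self._primary = primary
        self._fallback = fallback

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself.
        primary_attr = getattr(self._primary, name)
        if not callable(primary_attr):
            return primary_attr

        def call(*args, **kwargs):
            try:
                return primary_attr(*args, **kwargs)
            except NotImplementedError:
                # Assumption: unmigrated endpoints raise NotImplementedError.
                return getattr(self._fallback, name)(*args, **kwargs)

        return call


def select_backend(v1, v2, strict: bool):
    """Return v2 alone in strict mode, else v2 with v1 as fallback."""
    if strict:
        return v2
    return FallbackProxy(v2, v1)
```

In strict mode callers see v2 errors directly, which is useful for checking migration progress; in non-strict mode unmigrated calls are served by v1.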

if use_cache:
    try:
        return self._get_cache_response(cache_dir)
    # TODO: handle ttl expired error
Collaborator

@SimonBlanke SimonBlanke Jan 9, 2026

The PR is out of draft, but this caching is not implemented. I guess this is out of scope for this PR.

Contributor Author

@geetu040 geetu040 Jan 9, 2026

Yes, the PR is currently a draft; should I mark it as a draft as well? There are a number of work items that I'll split out if they can be worked on without affecting the derived PRs; otherwise I'll implement them myself. Caching specifically I plan to implement myself, since otherwise stacking is going to be challenging.
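
To make the open TODO concrete, here is a small sketch of how a TTL check could sit in front of a cache read. It is not the PR's caching code: `CacheExpiredError`, `read_cached`, `get_with_cache`, and the use of the file's modification time as the entry's age are all assumptions for illustration.

```python
import time
from pathlib import Path


class CacheExpiredError(Exception):
    """Hypothetical error raised when a cached entry is older than its TTL."""


def read_cached(path: Path, ttl_seconds: float) -> str:
    """Return the cached body, or raise if it is missing or expired."""
    if not path.exists():
        raise FileNotFoundError(path)
    # Assumption: the file's mtime marks when the entry was cached.
    age = time.time() - path.stat().st_mtime
    if age > ttl_seconds:
        raise CacheExpiredError(f"{path} is {age:.0f}s old")
    return path.read_text()


def get_with_cache(path: Path, ttl_seconds: float, fetch) -> str:
    """Serve from cache when fresh; otherwise call ``fetch`` and store the result."""
    try:
        return read_cached(path, ttl_seconds)
    except (FileNotFoundError, CacheExpiredError):
        body = fetch()
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(body)
        return body
```

Treating an expired entry the same way as a missing one keeps the caller's control flow simple: both cases fall through to a fresh request.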


        return task

    def _create_task_from_xml(self, xml: str) -> OpenMLTask:
Collaborator

This method already exists here: https://github.com/openml/openml-python/pull/1576/files/74ab3662b6be04d001b1e8dade3f695ca80bcfad#diff-fdaf60448460bf4c7af496380c2f8967b0cabe577a9153256b8397f9f80e0eccR460

Is it really needed in both locations, or can we remove one of them? Removing one would avoid duplicate code.

Contributor Author

Yes, you are right. This resource implementation was only meant as an example. It will be removed anyway, and the duplication will be taken care of in the derived PR specifically for tasks.

Contributor

@SimonBlanke, see the thread on Discord discussing this, in case you want to weigh in.

from openml._api.resources.base import DatasetsAPI

if TYPE_CHECKING:
    from responses import Response
Collaborator

In production this would be requests, right? You used responses for the mocking here during development.

Contributor Author

Yes, this should be requests. I'll fix it.
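
A sketch of the corrected import, assuming the type is only needed for annotations. `requests.Response` and its `status_code` attribute are part of the `requests` library; the `parse_status` function is a hypothetical example and not part of the PR.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by type checkers, so `requests` is not imported at runtime here.
    from requests import Response


def parse_status(response: "Response") -> bool:
    """Return True when the response carries a 2xx status code."""
    return 200 <= response.status_code < 300
```

The quoted annotation keeps the module importable even though `Response` is never bound at runtime.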

@geetu040 changed the title from "[ENH] Migration: set up core/base structure" to "[ENH] V1 → V2 API Migration - core structure" on Jan 9, 2026
5 participants