@hutch3232

I noticed there is a 1.x.x release (June 2025) of this dependency, but this upper bound prevents its use. However, it is already being tested against (the latest.txt requirements are uncapped).

@kyleknap
Collaborator

kyleknap commented Jan 5, 2026

Hi @hutch3232. Thanks for the PR! I'm hesitant to remove the upper bound cap completely as it protects adlfs against regressions if azure-datalake-store were to introduce backwards incompatible changes in a major version bump (e.g., a 2.x.x) that negatively impacted adlfs. If we were to update the range ceiling, I'd prefer we match the pattern with azure-core and set the ceiling to <2.
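
To make the effect of a ceiling concrete, here is a minimal sketch of how an exclusive upper bound filters candidate releases. It is only an illustration, not how pip or adlfs resolves versions: the helper names are hypothetical, the old ceiling of `<1` and the `2.0.0` release are assumptions made for the example, and only plain `X.Y.Z` version strings are handled.

```python
import re
from typing import Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Parse a plain 'X.Y.Z' release string into a tuple of ints.

    Simplification: pre-release, post-release, and local version
    segments are rejected rather than parsed.
    """
    if not re.fullmatch(r"\d+(\.\d+)*", version):
        raise ValueError(f"unsupported version string: {version!r}")
    return tuple(int(part) for part in version.split("."))


def is_below_ceiling(version: str, ceiling: str) -> bool:
    """Return True if `version` satisfies the exclusive bound `<ceiling`."""
    v, c = parse_release(version), parse_release(ceiling)
    # Pad with zeros so that "2" and "2.0.0" compare as equal.
    width = max(len(v), len(c))
    v += (0,) * (width - len(v))
    c += (0,) * (width - len(c))
    return v < c


# 0.0.53 and 1.0.1 appear in this thread; 2.0.0 is a hypothetical future release.
candidates = ["0.0.53", "1.0.1", "2.0.0"]

# Assumed old ceiling (<1): the 1.x line is excluded.
allowed_old = [v for v in candidates if is_below_ceiling(v, "1")]

# Proposed ceiling (<2): 1.x is allowed, a future 2.x is still excluded.
allowed_new = [v for v in candidates if is_below_ceiling(v, "2")]
```

Under these assumptions, raising the ceiling to `<2` admits the 1.x line while still excluding a hypothetical 2.x release.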

That being said, Azure Data Lake Storage Gen 1 has been retired since February 2024, and adlfs does not have tests for the ADLS Gen 1 filesystem. So, we'd like to better understand what changed when azure-datalake-store went 1.0 before increasing the ceiling.

@hutch3232 could you also elaborate more on why you are trying to use azure-datalake-store 1.x? Or is this more to resolve dependency conflicts when using adlfs with other libraries?

@hutch3232
Author

Thanks, @kyleknap. To be honest, I don't have an immediate need to use the new version of the dependency. I'm new to Azure and was poking around to try to understand how all these packages fit together and I was surprised when my resolver wasn't getting me the latest version I had seen on GitHub. Found that adlfs was causing the limitation.

@potiuk

potiuk commented Jan 6, 2026

We'd love this to be merged and released, @kyleknap: it holds us back in Airflow from upgrading azure-datalake-store. I think for any library, the general approach is that if the library depends on some third-party library, it's a bad idea to upper-bind it blindly without knowing whether a future version will actually break anything. Without the cap, the users of your library have much better flexibility:

  • if it works, they will just upgrade
  • if it does not work, they might open an issue for you to fix, but they can upper-bind the 3rd-party library themselves
  • if it does not work, but they do not actually use it in the context of the library, they can still upgrade to a higher version if other libraries or their own code use it.

This is far better than the current situation, where "they are limited by you not to upgrade it even if they need it for something else". This is precisely the issue we have in Airflow: even if our users do not use Azure with fsspec, Airflow itself (and particularly the microsoft-azure provider) uses the Azure library for other things. The sheer fact that we use fsspec prevents us from upgrading it.

So my recommendation is to accept the "no upper-binding" approach for those libraries, and also likely to add the missing tests for adlfs. It's not your users' fault that there are no tests, and as maintainers of the library you chose to depend on it (as a required dependency) and expose Azure functionality through adlfs.

And it's not a secret what's changed:

https://github.com/Azure/azure-data-lake-store-python/releases


But how it impacts your code is an assessment that the maintainers of fsspec/adlfs are best placed to make.

This is the code comparison: Azure/azure-data-lake-store-python@v0.0.53...v1.0.1. It does not seem like a lot, so I guess that, knowing the integration points, it should be easy to assess.

@kyleknap
Collaborator

kyleknap commented Jan 6, 2026

@hutch3232 @potiuk Thanks for the feedback here. For the short term, I'd still prefer just raising the ceiling to the next major version (<2); it still gives the flexibility to upgrade within 1.x and unblocks the current dependency issues. However, we will still need to do some validation to make sure 1.x actually works with adlfs, especially since there are unfortunately no tests.

Long term, I'd actually prefer azure-datalake-store removed as a required dependency (either outright or move to an optional dependency if the Azure Data Lake Gen 1 file system is still being used). The azure-datalake-store dependency only supports Azure Data Lake Storage Gen 1, which is retired. This change will take more time to think through and coordinate though.
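
One common way to make a dependency optional is to import it lazily and fail with an actionable message only when the feature is used. The sketch below illustrates that general pattern and is not adlfs's actual code: the helper `import_optional` and the extra name `gen1` are hypothetical, and `azure.datalake.store` is the import name of the azure-datalake-store package.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, extra: str) -> ModuleType:
    """Import a module that is only needed for an optional feature.

    `extra` is the (hypothetical) name of the optional-dependency group
    that a user would install to get this module.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with an installation hint instead of a bare import failure.
        raise ImportError(
            f"{module_name!r} is required for this feature. "
            f"Install it with: pip install 'adlfs[{extra}]'"
        ) from exc


def open_gen1_filesystem():
    """Hypothetical entry point that needs the Gen 1 SDK only when called."""
    store = import_optional("azure.datalake.store", "gen1")
    return store
```

With this shape, users who never touch the Gen 1 filesystem would not need azure-datalake-store installed at all, so its version would no longer constrain their environment.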

@potiuk

potiuk commented Jan 6, 2026

> Long term, I'd actually prefer azure-datalake-store removed as a required dependency (either outright or move to an optional dependency if the Azure Data Lake Gen 1 file system is still being used). The azure-datalake-store dependency only supports Azure Data Lake Storage Gen 1, which is retired. This change will take more time to think through and coordinate though.

Sounds good. I was about to propose that as an option as well, but I did not know how strongly it is tied to adlfs.

@hutch3232
Author

As discussed, I've added back the cap but bumped it to 2.0.0. Agreed it'd be great to remove this dependency entirely once adls gen 1 support is dropped.


3 participants