@winglian

When saving QLoRA checkpoints with the latest bitsandbytes (bnb), I get the following error:

stderr: [rank0]:   File "/home/wing/.venvs/axolotl/lib/python3.11/site-packages/transformers/trainer.py", line 3325, in _save_checkpoint
stderr: [rank0]:     self.save_model(output_dir, _internal_call=True)
stderr: [rank0]:   File "/home/wing/.venvs/axolotl/lib/python3.11/site-packages/transformers/trainer.py", line 4207, in save_model
stderr: [rank0]:     state_dict = self.accelerator.get_state_dict(self.model)
stderr: [rank0]:                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
stderr: [rank0]:   File "/home/wing/.venvs/axolotl/lib/python3.11/site-packages/accelerate/accelerator.py", line 4057, in get_state_dict
stderr: [rank0]:     state_dict = model.state_dict()                                                                          
stderr: [rank0]:                  ^^^^^^^^^^^^^^^^^^
stderr: [rank0]:   File "/home/wing/.venvs/axolotl/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2260, in state_dict
stderr: [rank0]:     module.state_dict(  
stderr: [rank0]:   File "/home/wing/.venvs/axolotl/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2260, in state_dict
stderr: [rank0]:     module.state_dict(
stderr: [rank0]:   File "/home/wing/.venvs/axolotl/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2260, in state_dict
stderr: [rank0]:     module.state_dict(
stderr: [rank0]:   [Previous line repeated 7 more times]                                                                      
stderr: [rank0]:   File "/home/wing/.venvs/axolotl/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2257, in state_dict
stderr: [rank0]:     self._save_to_state_dict(destination, prefix, keep_vars)
stderr: [rank0]:   File "/home/wing/.venvs/axolotl/lib/python3.11/site-packages/bitsandbytes/nn/modules.py", line 518, in _save_to_state_dict
stderr: [rank0]:     if getattr(self.weight.quant_state, "packing_format_for_cpu", False):
stderr: [rank0]:                ^^^^^^^^^^^^^^^^^^^^^^^
stderr: [rank0]: AttributeError: 'Parameter' object has no attribute 'quant_state'

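This is a regression from #1804: the check it added in _save_to_state_dict reads self.weight.quant_state without first verifying that the attribute exists, so it raises AttributeError when the weight is a plain Parameter. A few lines below, the existing code already checks for that attribute before using it: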
https://github.com/bitsandbytes-foundation/bitsandbytes/blob/main/bitsandbytes/nn/modules.py#L523
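
For illustration, here is a minimal sketch of the guarded lookup, not the actual patch. FakeQuantState, FakeParam and uses_cpu_packing are hypothetical names that stand in for the bitsandbytes quant state, a plain Parameter, and the condition on line 518; only the attribute names quant_state and packing_format_for_cpu come from the traceback.

```python
class FakeQuantState:
    """Hypothetical stand-in for the quant state attached to a quantized weight."""

    def __init__(self, packing_format_for_cpu=False):
        self.packing_format_for_cpu = packing_format_for_cpu


class FakeParam:
    """Hypothetical stand-in for a plain Parameter with no quant_state attribute."""


def uses_cpu_packing(weight):
    # Look up quant_state defensively: a plain Parameter does not have it.
    quant_state = getattr(weight, "quant_state", None)
    # getattr on None falls back to the default, so no AttributeError is raised.
    return bool(getattr(quant_state, "packing_format_for_cpu", False))


plain = FakeParam()
quantized = FakeParam()
quantized.quant_state = FakeQuantState(packing_format_for_cpu=True)

print(uses_cpu_packing(plain))
print(uses_cpu_packing(quantized))
```

With this shape of check, a weight that has no quant_state is treated the same as one whose packing_format_for_cpu flag is unset, so saving the state dict can proceed instead of failing.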
