Fix _tied_weights_keys mapping for Transformers v5

#10

This PR fixes the _tied_weights_keys compatibility issue with Transformers v5.0.0+.

Problem

  • _tied_weights_keys was a list, but Transformers v5+ expects a dict-like mapping
  • This caused AttributeError: 'list' object has no attribute 'keys'

Solution

  • Changed _tied_weights_keys from list to dict format
  • Maps lm_head.weight to transformer.wte.weight
    (for models used with Transformers v5.0.0 and later)
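A minimal sketch of the change as it would sit in the modeling file (the class name here is illustrative, not the actual one; only the attribute value is from this PR):

```python
class ExaoneForCausalLM:  # class name is a placeholder for the real modeling class
    # Before (Transformers v4): a plain list of tied parameter names.
    # _tied_weights_keys = ["lm_head.weight"]

    # After (Transformers v5): a mapping from each tied parameter to the
    # parameter whose storage it shares.
    _tied_weights_keys = {"lm_head.weight": "transformer.wte.weight"}
```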

Related

  • Discussion #9

If backward compatibility with v4 is needed, I can add a version check.
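Such a check could look roughly like this (a sketch only; the helper name and the major-version comparison are mine, not part of the PR):

```python
def tied_weights_keys_for(transformers_version: str):
    """Return the v4 list form or the v5 dict form of _tied_weights_keys.

    Comparing only the major version is enough to distinguish the two
    behaviors described above; in the real modeling file this would key
    off transformers.__version__.
    """
    major = int(transformers_version.split(".")[0])
    if major >= 5:
        return {"lm_head.weight": "transformer.wte.weight"}
    return ["lm_head.weight"]
```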

LG AI Research org

Thank you for your contribution! Could you show a simple demonstration of this change?

Here's the demonstration:

The change:

  • Before: _tied_weights_keys = ["lm_head.weight"]
  • After: _tied_weights_keys = {"lm_head.weight": "transformer.wte.weight"}
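The failure mode is easy to reproduce without loading a model at all: v5 iterates the attribute via .keys(), which the old list form does not provide.

```python
before = ["lm_head.weight"]                           # v4-style list
after = {"lm_head.weight": "transformer.wte.weight"}  # v5-style mapping

try:
    before.keys()  # what Transformers v5 effectively does
except AttributeError as exc:
    print(exc)  # 'list' object has no attribute 'keys'

# The dict form supports the lookup v5 expects.
assert list(after.keys()) == ["lm_head.weight"]
assert after["lm_head.weight"] == "transformer.wte.weight"
```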

Test result:
Model loading works with Transformers v5.0.0 βœ…


To clarify: this PR fixes the initial Transformers v5+ loading crash caused by _tied_weights_keys being a list (v5 expects a dict-like mapping, i.e., one with .keys()).

With this change, loading proceeds past the tied-weights stage / weight materialization on Transformers v5+.

The DynamicCache.from_legacy_cache error shown in the log is a separate Transformers v5 API change and is not related to _tied_weights_keys. I can open a follow-up PR for the DynamicCache update if desired.

https://github.com/LG-AI-EXAONE/EXAONE-3.5/pull/7

Update: Also fixed the DynamicCache compatibility issue.

All changes in this PR:

  1. _tied_weights_keys: list β†’ dict
  2. DynamicCache.from_legacy_cache() β†’ DynamicCache()
  3. Removed to_legacy_cache() call
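A hedged sketch of items 2 and 3 above; the helper function is mine, and the cache class is passed in so the snippet stays self-contained without importing transformers:

```python
def prepare_cache(past_key_values, cache_cls):
    """Replace the removed DynamicCache.from_legacy_cache() call.

    In Transformers v5 the cache is constructed directly (item 2), and the
    cache object is returned as-is rather than converted back with
    to_legacy_cache() (item 3). cache_cls stands in for
    transformers.DynamicCache here.
    """
    if past_key_values is None:
        past_key_values = cache_cls()
    return past_key_values
```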

Updated: https://github.com/LG-AI-EXAONE/EXAONE-3.5/pull/7

Tested with Transformers 5.0.0: the model loads and runs inference successfully βœ…

LG AI Research org

It seems the generation output is broken. This might be another issue caused by an incompatibility.

I think it would be better to use the example from our quickstart.

Also, you’ll need to update this PR to apply the changes rather than opening a new PR on GitHub,
since we don’t manage the modeling code or related scripts in our official GitHub repository.

This documentation may be helpful:
https://huggingface.co/docs/hub/repositories-pull-requests-discussions#pull-requests-advanced-usage

LG AI Research org

We've started integrating the modeling code for Transformers v5 using the modular Transformers framework.
The overall model structure will remain the same, even if some class names change.

Given this, no further work is needed for now, so it would be best to merge this PR and open a new one for the integration.
Thank you for your effort and contribution to EXAONE 3.5 πŸ˜€

nuxlear changed pull request status to merged

Thank you for merging! πŸŽ‰

If you need any help with the Transformers v5 integration in the future, feel free to let me know.
