-
Notifications
You must be signed in to change notification settings - Fork 43
add chat template #42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
5ae9f2f to
60c7c38
Compare
60c7c38 to
a360e42
Compare
tokenizer.go
Outdated
| // defer C.free(unsafe.Pointer(cErr)) | ||
|
|
||
| cChatTemplate := C.new_chat_template(cTemplate, cBosToken, cEosToken) | ||
| // defer C.free_chat_template(cChatTemplate) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uncomment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a Close() method so users can free the resource when needed. We shouldn’t free cChatTemplate here as it’s returned and may be used afterwards.
release/Dockerfile
Outdated
| # syntax=docker/dockerfile:1.4 | ||
|
|
||
| FROM rust:1.87 as builder-rust | ||
| FROM swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/rust:1.87 as builder-rust |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
revert
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, this should not be pushed.
| [dependencies] | ||
| libc = "0.2.162" | ||
| tokenizers = {version = "0.20.2" } | ||
| tokenizers = { version = "0.20.0", features = ["http"] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason for the downgrade in version? is something broken in 20.2?
| // Construct the model URL | ||
| modelURL := fmt.Sprintf("%s/%s/resolve/main", baseURL, modelID) | ||
| modelURL := "" | ||
| hfEndpoint, exist := os.LookupEnv("HF_ENDPOINT") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we gonna make this public config, lets spell it out HUGGINGFACE_ENDPOINT
| if err != nil { | ||
| t.Fatalf("Failed to apply chat template: %v", err) | ||
| } | ||
| fmt.Println(result) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets assert expected output, instead of printing it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same in the other test
| // TODO: replace with better solution | ||
| // hack to adjust gemma3 template for debug | ||
| // replace 'messages[0]['content'][0]['text']' with 'messages[0]['content']' | ||
| let mutated_template = template.replace( | ||
| "messages[0]['content'][0]['text']", | ||
| "messages[0]['content']", | ||
| ); | ||
| // Hack to fix Qwen3 templating. | ||
| // It uses python notation to reverse lists, which do not exist in minijinja | ||
| // so we're using the reverse filter instead. | ||
| let mutated_template = mutated_template.replace("[::-1]", "|reverse"); | ||
| // TODO: replace with a better solution | ||
| // Hack to remove the {% generation %} and {% endgeneration %} statements from | ||
| // the Jinja2 chat templates if there, since those are only using for assistant | ||
| // masking during training, and should be ignored during inference |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should these hacks live in higher layer, like in go, so the library doesnt have to inherit them? Or maybe some kind of config preprocessing script to correct them?
No description provided.