Enivex (Contributor) commented Jun 15, 2025

Closes #18

Hopefully I did it correctly.

Enivex (Contributor, Author) commented Jun 21, 2025

For clarification:

The patterns are taken straight from https://github.com/hyphenation/tex-hyphen/tree/master/hyph-utf8/tex/generic/hyph-utf8/patterns/tex

I then ran `cargo test --tests generate` to update the binary files.

Andrew15-5 commented Jun 21, 2025

Shouldn't the list at https://github.com/typst/hypher#languages also be updated?

Enivex (Contributor, Author) commented Jun 21, 2025

> Shouldn't the list at https://github.com/typst/hypher#languages also be updated?

That depends on whether the list is supposed to reflect the current release, I guess.

Edit: I updated them.

Andrew15-5 commented

Apparently a lot of the entries are incorrect or imprecise, so the whole list needs updating. I'm surprised there is no script for this in the repo. Even better would be one that updates the list automatically in a PR.

Enivex (Contributor, Author) commented Jun 22, 2025

> Apparently a lot of the entries are incorrect or imprecise, so the whole list needs updating. I'm surprised there is no script for this in the repo. Even better would be one that updates the list automatically in a PR.

That sounds out of scope for this PR.

In any case, is the list even needed? It's just the file sizes of the `.bin` files.
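
For illustration, a size list like this could in principle be regenerated with a small helper. The following is only a minimal sketch and not something that exists in the repo: the names `bin_sizes` and `format_kib`, the KiB display format, and whatever directory gets passed in are all assumptions.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Collects `(name, size in bytes)` for every `.bin` file directly inside `dir`,
/// sorted by name. The name is the file name without its extension.
/// (Hypothetical helper; not part of hypher.)
fn bin_sizes(dir: &Path) -> io::Result<Vec<(String, u64)>> {
    let mut sizes = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();
        let metadata = entry.metadata()?;
        // Skip directories and anything that is not a `.bin` file.
        if !metadata.is_file() || path.extension().and_then(|e| e.to_str()) != Some("bin") {
            continue;
        }
        let name = path
            .file_stem()
            .and_then(|s| s.to_str())
            .unwrap_or_default()
            .to_string();
        sizes.push((name, metadata.len()));
    }
    sizes.sort();
    Ok(sizes)
}

/// Formats a byte count as KiB with one decimal place (an assumed display format).
fn format_kib(bytes: u64) -> String {
    format!("{:.1} KiB", bytes as f64 / 1024.0)
}
```

Calling `bin_sizes` on the directory that holds the generated `.bin` files and printing each entry with `format_kib` would give the raw material for the list; how it is written back into the README is left open here.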

Andrew15-5 commented

I was wondering the same thing.

laurmaedje (Member) commented Jun 23, 2025

The list is there because a goal of hypher was to be smaller than the binary overhead of the `hyphenation` crate while also having zero startup cost, and the list demonstrates that. I'd like to keep it.

laurmaedje (Member) commented

The files differ slightly from the upstream ones. For example, in https://github.com/hyphenation/tex-hyphen/blob/f44dcc0196939896f40c80e4dc888be0af8822e7/hyph-utf8/tex/generic/hyph-utf8/patterns/tex/hyph-de-1996.tex the patterns inside the `\patterns{..}` macro are not indented, but in this PR they are. Did you do any kind of post-processing?

Enivex (Contributor, Author) commented Jun 24, 2025

> The files differ slightly from the upstream ones. For example, in https://github.com/hyphenation/tex-hyphen/blob/f44dcc0196939896f40c80e4dc888be0af8822e7/hyph-utf8/tex/generic/hyph-utf8/patterns/tex/hyph-de-1996.tex the patterns inside the `\patterns{..}` macro are not indented, but in this PR they are. Did you do any kind of post-processing?

Not intentionally, but the files may have been indented automatically by texfmt. I'll fix it.
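
A check along the following lines could flag this kind of accidental reformatting before committing. It is a minimal sketch under assumptions: the function names are hypothetical, nothing like it is claimed to exist in the repo, and it only compares two strings that the caller has already read from the upstream and local files.

```rust
/// Returns true if both texts have the same lines once leading and trailing
/// whitespace is stripped from each line. (Hypothetical helper.)
fn same_ignoring_indentation(upstream: &str, local: &str) -> bool {
    upstream.lines().map(str::trim).eq(local.lines().map(str::trim))
}

/// Returns the 1-based numbers of lines whose content matches after trimming
/// but whose raw text differs, i.e. lines that were only re-indented.
/// Lines are paired by position; extra lines in the longer text are ignored.
fn reindented_lines(upstream: &str, local: &str) -> Vec<usize> {
    upstream
        .lines()
        .zip(local.lines())
        .enumerate()
        .filter(|(_, (a, b))| a != b && a.trim() == b.trim())
        .map(|(i, _)| i + 1)
        .collect()
}
```

If `same_ignoring_indentation` returns true while `reindented_lines` is non-empty, the local file differs from upstream only in whitespace, which is the situation described above.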

Enivex (Contributor, Author) commented Jun 25, 2025

I've updated them, and made sure the formatter didn't run.

Sorry for the confusion.

laurmaedje merged commit 5864b75 into typst:main on Jun 26, 2025
1 check passed

laurmaedje (Member) commented

Thanks!

Enivex deleted the update-patterns branch on June 26, 2025, 12:39
Linked issue: Update hyphenation patterns for German, Portuguese, and Albanian

3 participants