document, validate encoding names #28

It's hard to find out which encoding names are accepted, and the parsing seems a bit too loose.

The list is at
https://hackage.haskell.org/package/encoding-0.9/docs/src/Data.Encoding.html#encodingFromStringExplicit .

I think each commented group lists alternate names for a single encoding, with the first entry being the preferred/canonical form.

Case is ignored.
Wherever an underscore appears, any sequence of non-alphanumeric characters is accepted and ignored. (Perhaps accepting only a single space or hyphen would be enough?)
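
To make the matching rule above concrete, here is a minimal sketch of that kind of normalization. `normalizeName` is a hypothetical helper written for this issue, not a function from the encoding package, and the library's real implementation may differ in details (for example, this sketch also drops leading and trailing punctuation).

```haskell
import Data.Char (isAlphaNum, toLower)
import Data.List (intercalate)

-- | Hypothetical helper, not part of the encoding package.
-- Lowercases the name and collapses every run of non-alphanumeric
-- characters into a single underscore. Leading and trailing
-- punctuation is dropped, which is an assumption of this sketch.
normalizeName :: String -> String
normalizeName = intercalate "_" . chunks . map toLower
  where
    -- Split the input into its maximal alphanumeric runs.
    chunks s = case dropWhile (not . isAlphaNum) s of
      "" -> []
      s' -> let (w, rest) = span isAlphaNum s' in w : chunks rest
```

Under this sketch `normalizeName "UTF-8"` and `normalizeName "utf  8"` both give `"utf_8"`, while `normalizeName "utf8"` gives `"utf8"`, so unhyphenated spellings would still need to be listed as separate aliases.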

There are some inconsistencies. E.g.:

- `utf8` and `utf16` are accepted but not `utf32`.
- Mac OS Roman seems to be the common name for that encoding but only `macintosh` is accepted.

Here's a list of lowercase "canonical" spellings I came up with; I believe all of these are accepted:

```
ascii
utf-8
utf-16
utf-32
iso-8859-1
iso-8859-2
iso-8859-3
iso-8859-4
iso-8859-5
iso-8859-6
iso-8859-7
iso-8859-8
iso-8859-9
iso-8859-10
iso-8859-11
iso-8859-13
iso-8859-14
iso-8859-15
iso-8859-16
cp1250
cp1251
cp1252
cp1253
cp1254
cp1255
cp1256
cp1257
cp1258
koi8-r
koi8-u
gb18030
macintosh
jis-x-0201
jis-x-0208
iso-2022-jp
shift-jis
cp437
cp737
cp775
cp850
cp852
cp855
cp857
cp860
cp861
cp862
cp863
cp864
cp865
cp866
cp869
cp874
cp932
```

I'm not sure it's worthwhile supporting other punctuation, unless it's systematic/comprehensive/documented.
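
As an illustration of the stricter alternative, here is a rough sketch of a validator that ignores case but accepts only the hyphenated spellings from the list above. `canonicalNames` and `validateName` are hypothetical names made up for this issue; nothing here is the package's API, and the list is transcribed from my list rather than from the library source.

```haskell
import Data.Char (toLower)

-- | Canonical lowercase spellings, transcribed from the list in this issue.
canonicalNames :: [String]
canonicalNames =
  ["ascii", "utf-8", "utf-16", "utf-32"]
    ++ ["iso-8859-" ++ show n | n <- [1 .. 11] ++ [13 .. 16 :: Int]]
    ++ ["cp" ++ show n | n <- [1250 .. 1258 :: Int]]
    ++ ["koi8-r", "koi8-u", "gb18030", "macintosh"]
    ++ ["jis-x-0201", "jis-x-0208", "iso-2022-jp", "shift-jis"]
    ++ ["cp" ++ show n | n <- otherPages]
  where
    otherPages =
      [437, 737, 775, 850, 852, 855, 857] ++ [860 .. 866] ++ [869, 874, 932 :: Int]

-- | Strict check: case is ignored, but the punctuation must match exactly.
-- Returns the canonical spelling, or an error message naming the input.
validateName :: String -> Either String String
validateName raw
  | name `elem` canonicalNames = Right name
  | otherwise = Left ("unknown encoding name: " ++ raw)
  where
    name = map toLower raw
```

With this, `validateName "UTF-8"` yields `Right "utf-8"`, while `validateName "utf_8"` is rejected with a `Left` message. The same list could double as the documented set of accepted names.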


