The detect function can't always guess the right encoding, so the decompress methods fall back to 'utf-8' and ignore the decoding errors. As a result, Latin characters that are not valid UTF-8 get silently discarded.
There should be an optional parameter to force the encoding that is used when decoding the data.
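
A minimal sketch of what this could look like, not the library's actual code. It assumes the compressed data is zlib-compressed; `guess_encoding` is a hypothetical stand-in for the detect function, and the `decompress` signature and `encoding` parameter name are only suggestions:

```python
import zlib
from typing import Optional


def guess_encoding(raw: bytes) -> Optional[str]:
    """Hypothetical stand-in for the detect function: returns None when unsure."""
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return None


def decompress(data: bytes, encoding: Optional[str] = None) -> str:
    """Decompress and decode; `encoding` (proposed, optional) skips detection."""
    raw = zlib.decompress(data)
    if encoding is not None:
        # Caller forced the encoding: decode strictly so mistakes are visible.
        return raw.decode(encoding)
    # Current behaviour as described above: guess, else 'utf-8' ignoring errors.
    guessed = guess_encoding(raw)
    return raw.decode(guessed or "utf-8", errors="ignore")


payload = zlib.compress("café".encode("latin-1"))
print(decompress(payload))                      # the 'é' byte is dropped
print(decompress(payload, encoding="latin-1"))  # the 'é' is preserved
```

Leaving the parameter at `None` keeps the existing behaviour, so current callers would not be affected.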