Last year I learned about the Byte Order Mark (BOM), and that CSV files generated on Windows should be decoded as “utf-8-sig” instead of “utf-8” before they go into Python's `csv.reader()`.
That lesson saved me a lot of time today.
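
For anyone who hasn't run into this yet, here is a minimal sketch of the difference. The sample bytes and the `read_header` helper are made up for illustration; the only real ingredients are the two codec names and `csv.reader()`.

```python
import csv
import io

# Hypothetical sample: UTF-8 bytes with a leading BOM (EF BB BF),
# as some Windows tools write them.
raw = b"\xef\xbb\xbfname,city\r\nAda,London\r\n"


def read_header(data: bytes, encoding: str) -> list[str]:
    """Decode the bytes with the given codec and return the first CSV row."""
    text = io.StringIO(data.decode(encoding), newline="")
    return next(csv.reader(text))


# With plain "utf-8" the BOM survives as "\ufeff" glued to the first field.
header_plain = read_header(raw, "utf-8")

# With "utf-8-sig" the BOM is stripped while decoding.
header_sig = read_header(raw, "utf-8-sig")

print(header_plain)
print(header_sig)
```

The stray `\ufeff` is invisible in most terminals, which is why a lookup on the first column name can fail for no obvious reason. “utf-8-sig” also decodes files that have no BOM, so it is a reasonable choice when you can't be sure which kind you will get.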
Extra background info courtesy of @jscholes (James Scholes):
> “The wider issue is that many Python programs are written with no explicit `encoding` argument in `open` calls, implicitly expecting UTF-8 because that's often the default on Unix systems. But it usually is not the default on Windows.”
> “As I understand it, that's set to change in Python 3.15:”
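
To make the point about implicit defaults concrete, here is a rough sketch of passing `encoding` explicitly. The file name and contents are invented for the example, and `locale.getpreferredencoding(False)` is only used to show what your platform would otherwise fall back to.

```python
import locale
import tempfile
from pathlib import Path

# What open() falls back to when no encoding is given; this varies by platform.
default_encoding = locale.getpreferredencoding(False)
print(f"Platform default encoding: {default_encoding}")

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file with non-ASCII content, written as UTF-8 bytes.
    path = Path(tmp) / "cities.csv"
    path.write_bytes("name,city\r\nZoë,Kraków\r\n".encode("utf-8"))

    # Explicit encoding: the result no longer depends on the platform default.
    # newline="" leaves line endings untouched, as the csv module recommends.
    with open(path, encoding="utf-8", newline="") as f:
        explicit = f.read()

print(explicit)
```

If you want to hunt down `open` calls that rely on the default, Python 3.10 and later can emit an `EncodingWarning` for them when started with `-X warn_default_encoding`.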