New blog; alternate title: "in which the author attempts to both empathize with and critique the data-packrat impulse that seems to be driving a lot of toxic behavior in the software industry right now"
@glyph It's interesting to contrast the attitude you're talking about with a blog post I saw a few years ago - I can't find it now, but the gist was:
Some company wanted to extract some overall metrics from their customer database, to help them make management decisions. And they got two quotes:
One, a "big data" company, suggested a massive setup with multiple servers dedicated to constantly churning through their records.
Two, the blog author, applied a bit of statistics and concluded that a random sample of a few thousand records would be plenty for the accuracy they needed. Each update would take just a few minutes on one server. And if I remember correctly, their quote came in $x00,000 below the other company's.
Now, obviously, that isn't always possible, but it's surprising how often people reach for big-data tools when they aren't really necessary.
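To give a rough idea of why "a few thousand records" can be enough: I don't know what calculation the blog author actually did, but the textbook version for estimating a proportion (say, the fraction of customers who did some thing) from a simple random sample looks something like the sketch below. The function names, the 95% confidence level and the ±2% margin are my own assumptions for illustration, not details from the post.

```python
import math
from statistics import NormalDist


def required_sample_size(margin, confidence=0.95, p=0.5):
    """Records needed to estimate a proportion to within +/- margin.

    Uses the normal approximation for a simple random sample.
    p=0.5 is the worst case (largest variance), so it is a safe default.
    """
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z * z * p * (1 - p) / (margin * margin))


def margin_of_error(n, confidence=0.95, p=0.5):
    """Margin of error achieved by a simple random sample of n records."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    # Hypothetical target: +/- 2 percentage points at 95% confidence.
    print(required_sample_size(0.02))  # 2401
```

The thing to notice is that the size of the customer database never appears in the formula: as long as the sample is truly random and small relative to the whole table, the same couple of thousand records gives roughly the same accuracy whether there are a hundred thousand customers or a hundred million.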