And the vast majority of the English-language corpus available will reflect Western, imperial core liberal politics.
Oh, I’m sure that isn’t going to be a problem for their goals. They can always overrepresent training data from 2016-2020 and 2024-2028 to add some balance to the model’s political compass. /s
It’s possible that Lemmy uses fixed-size buffers for the username and unhashed password. It would be pretty bad to give an unauthenticated user the power to allocate hundreds of megabytes in a shared process.
I haven’t read the source code, so I can’t say for sure, but it’s common practice to reduce the opportunity for denial-of-service attacks by limiting the size of user input.
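For anyone curious what that kind of limit looks like, here is a rough sketch in Python. To be clear, this is not Lemmy's code (I haven't looked at it). The limits and the names `MAX_USERNAME_BYTES`, `MAX_PASSWORD_BYTES`, `check_size`, and `validate_login` are all made up for illustration. The idea is just to reject oversized fields before doing anything expensive with them, like hashing.

```python
# Hypothetical limits for illustration only; not Lemmy's actual values.
MAX_USERNAME_BYTES = 64
MAX_PASSWORD_BYTES = 128


class InputTooLarge(ValueError):
    """Raised when a credential field exceeds its configured byte limit."""


def check_size(field: str, value: str, limit: int) -> bytes:
    """Return the UTF-8 encoding of `value`, or raise if it exceeds `limit` bytes."""
    # A UTF-8 string never has fewer bytes than characters, so this cheap
    # check rejects very large inputs before any encoding work is done.
    if len(value) > limit:
        raise InputTooLarge(f"{field} exceeds {limit} bytes")
    encoded = value.encode("utf-8")
    # Multi-byte characters can still push the encoded form over the limit.
    if len(encoded) > limit:
        raise InputTooLarge(f"{field} exceeds {limit} bytes")
    return encoded


def validate_login(username: str, password: str) -> tuple[bytes, bytes]:
    """Size-check both fields before they reach hashing or storage."""
    return (
        check_size("username", username, MAX_USERNAME_BYTES),
        check_size("password", password, MAX_PASSWORD_BYTES),
    )
```

In a real server you would probably also cap the request body size at the HTTP layer, since by the time a check like this runs the oversized string is already in memory.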