This reminds me of when I shadowed a librarian in high school and they talked to me about how people got really upset with them throwing away books that had multiple reprintings and were in awful condition.
Because people as a whole lack the capacity for nuance, I guess.
Bad focus on the news article.
> people got really upset with them throwing away books that had multiple reprintings and were in awful condition.
That is not what is going on here, though. They bought millions of dollars' worth of new books in order to train AI and used destructive scanning instead of non-destructive methods. It is a huge waste of resources. They could have used a non-destructive method and then donated the books. But like everything involved in current AI, they chose the most wasteful method.
The books were purchased and destroyed to digitize them. There is nothing wrong with digitizing a work. The books were destroyed because duplicating a work without permission is illegal, but destroying the original means that there is only one copy in the end still.
The LLM training is the problem. This is not.
> The books were destroyed because duplicating a work without permission is illegal
It is not illegal if you don’t distribute; the judge ruled that this was fair use. They destroyed the books as part of the digitizing project because it is likely faster and cheaper than non-destructive methods.
> but destroying the original means that there is only one copy in the end still.
That is not how this works at all. As long as you aren’t distributing, you are well within your rights to make copies of a book you purchase.
These books were purchased by them before being destroyed in the scanning process. I fail to see the issue with this specific case. Lots of artists buy stuff and irreversibly modify it. Are we going to be angry now at people who glue their puzzles or use parts of books for scrapbooking? If these were unique works there would be an issue, but I don’t think that truly unique pieces would be in their target group, as the destructive scanning is all about cost cutting and unique works cost a lot of money that they wouldn’t just destroy.
The fact that they use it for model training and later sell access to that model’s work is the shady part that has a severe whiff of plagiarism to it.
Paper is a natural resource, and this literally just wasted a fuck ton. There are non-destructive scanning methods.
I think it’s a waste tbh. Like, it’s one of those capitalist things of “well, it’s not profitable to sell, so let’s destroy them”, when anything made for the good of the people would’ve seen a massive opportunity to distribute books to people for free!
Copyright law doesn’t allow them to sell the books. It’s almost certainly a violation to scan books for their content and then sell them.
There is something horribly symbolic in all of that 🤮👿📚
It seems (a little) akin to burning books: sure, maybe you can legally get away with doing whatever you wish to a printed copy that you purchase, but that doesn’t mean that we (the bystanders) should rush to enjoy using the final product of the endeavor.
At least they paid for it. As for destroying them, it depends heavily on the books in question. One less Harry Potter book won’t hurt anyone.
I gotta reread Vinge’s Rainbows End