
Unexpected underdog argument. What is happening in reality is all companies are racing to (a) scrape, buy and collect as much as they can from others, both individuals and companies while (b) locking down their own data against everyone else who isn’t directly making them money (eg through viewing their ads).

Part of me thinks that the open web has a paradox of tolerance issue, leading to a race to the bottom/tragedy of the commons. Perhaps it needs basic terms of use. Like if you run this kind of business, you can build it on top of proprietary tech like apps and leave the rest of us alone.



We need to wake up and understand that all the information already uploaded is more or less free web material once seen through the lens of ML-somethings, with all the second- and third-order effects that follow, such as the fact that this perhaps completely changes the whole motivation for, and consequences of, open source.

It is also only a matter of time before scrapers once again get through the walls put up by Twitter, Reddit and the like. This is, after all, information everyone produced without being aware that it was now considered not theirs anymore.


Reddit sold their data already. Twitter made their own AI.


Precisely my point, and there is little if any evidence that anyone among the big players puts people's rights before everything else by respecting licensing agreements before scraping for training.

Indeed, Reddit sold their data the other day GPT2 was announced, and it was very apparent why everyone closed their APIs in 2021-2023. Wonder what Aaron would've said about it.

Now we have walled gardens of information where people are allowed to plant, but never own the blossom.



