Hacker News

Ok, that wasn't clear before since you just kept saying how you expressed your consent rather than why your consent should be taken into account.

Licensing is much, much more limited than you seem to think. For instance, you said explicitly that you want a way to control your ideas. The only thing this can mean is a way to control who gets to use your ideas, or what they get to use them for. So if I express a political idea in a novel way, or tell a funny joke or something, I should be able to dictate who gets to repeat it, or in this case with LLMs, who gets to summarise and describe it.

This kind of control is antithetical to the spirit of the internet and would be frankly evil if people were actually able to assert it. Luckily, in most cases it's impossible: nobody can actually stop me from describing a movie to my friends or from reposting a meme. Just copying and reposting what you wrote verbatim is something we can probably agree is wrong, but that isn't what's in question here. The idea I was actually replying to in the first place was that you can decide somebody can't read your ideas - even if they're public - just because you don't like them or you don't like what they will do with them. It is hard to think of a more egregious kind of 1984-style censorship, really.

There is a place for regulation of LLM companies; they are doing a lot of harm that I wish governments would effectively rein in. It would not be hard if the political will existed. But this idea of saying I should be able to "control my ideas" is way, way worse.



LLMs are not "someone", LLMs are something, and they don't "read content", they by definition acquire and reuse that content (for example, by summarizing it), as part of their product.

So here the consent is indeed about what can be done with the data.

In general, it's absolutely the norm that public websites (i.e., unauthenticated) restrict even who can access the data. The simplest example that comes to mind is geoblocking. I have every right to say that my website is not made available to anybody in the US, for example. Would you still call that website "public"? Would bypassing the block via a VPN be a violation of my consent? This is mostly a moral discussion, I suppose.

But anyway, it's not what's happening here. LLMs access content for the sole purpose of doing something with that content, either training or providing the service to their customers. They are not humans, they are not consumers, they don't simply fetch the content and present it to the users (a much more neutral action, like curl or the browser does). In the case of LLMs, it's impossible to distinguish the act of accessing from the act of using, so the distinction you make doesn't apply, in my opinion.


LLMs are indeed not "someone". They are programs, like web browsers, acting on user instruction. The user is a person. I am only talking about people - I never said that an LLM does anything of its own volition.

> The simplest example that comes to mind is geoblocking.

Do you think it is alright to geoblock people, for arbitrary reasons? It is one thing when GDPR imposes a legal obligation on you for serving content in a particular way. Note that that actually doesn't prevent you from seeing the content, it just prevents you from being served by that server. The distinction is important - circumventing a geoblock is something I think should be legally protected.

> They are not humans, they are not consumers, they don't simply fetch the content and present it to the users

They simply fetch the content, run it through software, and present it to the user. As far as you, the service owner, are concerned, they are simply fetching the content for the user. It is none of your business what the user and the AI company go on to do with "your content".


> like web browsers, acting on user instruction.

No, they are not like browsers. The browser accesses my content in a transparent way. An LLM reuses the information and acts as an opaque intermediary which - maybe - will at most add a reference to my content.

> I never said that an LLM does anything of its own volition

It doesn't matter why it does what it does, it matters what it does. Your previous comment stressed the idea that it's possible to regulate _what can be done_ with my intellectual property (licensing), but not who can access it once it is made public. What I am saying is that this is exactly the case for LLMs, which _use_ my intellectual property; they are not a tool to _access_ it (like a browser).

> Do you think it is alright to geoblock people, for arbitrary reasons?

Yes. Why wouldn't it be? And if you believe it's not, where do you draw the line? Once you share a picture with your partner, does everyone have the right to see it? Or if you share it with your group of friends? Or if you share it on a private social media profile (where you have acquaintances)? When does the audience turn from "a restricted group" to "everyone"? Or why would it be different with my blog? If I want my blog accessible only from my country, I can absolutely do that and there is nothing wrong with it at all. Nobody is entitled to my intellectual property. Obviously I am playing devil's advocate, but this was to say that the fact that something is public doesn't mean it's unrestricted. And don't get me started on "the spirit of the internet". I can't imagine something breaking that spirit more than LLMs acting as an interface between people and the other people on the internet. That spirit is gone, and belongs to a time when the internet was tiny. When OpenAI and company respect the "spirit of the internet", maybe I will think about doing the same.

> As far as you, the service owner, are concerned, they are simply fetching the content for the user. It is none of your business what the user and the AI company go on to do with "your content".

No, as far as I am concerned the program can take my information, summarize, change, distort, or misinterpret it, and then present it back to its user. This can happen with or without the user ever knowing that the information came from me. Considering this equal to the user accessing the information is something I simply will not concede, and it is a fundamental disagreement between us, from which many other disagreements stem.



