PabloDiscobar (@PabloDiscobar@kbin.social)

For Reddit and Twitter it's also driven by the threat of AI. Twitter and Reddit host a lot of content: organized, sorted, coherent. It's invaluable for training an AI, and these companies don't want to give it away for free. They want control over it, so they are making it very hard for AI companies to harvest their content. The reason it's happening now is that AI companies are probably rushing to copy as much data as possible before laws are passed to limit them.

It will be the same for the fediverse: our content will be scraped by AIs. Our content is freely visible, organized, sorted, and scored. We should be careful about that. If you are not a professional publisher or a public figure, then you should probably think about rotating your username as often as possible.

edit: But also, with the rise of TikTok, a lot of countries are now suspicious of the soft power of these apps and are ready to legislate against them. The EU already did: it has voted fines against them and is regularly getting money out of them. The taboo is gone; you can attack these companies, and it works. They were supposed to be out of reach, but they are not.

Also, there is no genius at Twitter: as far as I know, they hold no patent on anything. If someone manages to become more popular than them on the same principle, then Twitter is done. Gravity will do the rest, and users will move to a different platform. People are using it because people are using it. So the model is fragile and the value is questionable.

fearout (@fearout@kbin.social)

What’s so bad about giving AI models something to learn on? Add LLM-tier accounts to your social media company and have at it. And fix the data/traffic issues by giving users the ability to use their own tokens/API keys/whatever to limit bandwidth, without affecting end users as severely as the current decisions did.

That way you could detect and address rogue scrapers while still working with LLM creators who are open to an honest training integration. And if your company can’t really tell the difference between users and LLM crawlers after implementing something like this, well, then those crawlers don’t really affect the company as much as the CEOs would like to pretend.
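(For the curious: the per-key throttling described above is usually implemented as a token bucket keyed by API key. This is a minimal sketch for illustration, not any platform's actual implementation; the class and parameter names are invented.)

```python
import time
from collections import defaultdict


class KeyedTokenBucket:
    """Token-bucket rate limiter with one independent bucket per API key."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        # Each key starts with a full bucket the first time it is seen.
        self.buckets = defaultdict(
            lambda: {"tokens": capacity, "last": time.monotonic()}
        )

    def allow(self, api_key: str, cost: float = 1.0) -> bool:
        """Return True if this request fits in the key's budget, else False."""
        bucket = self.buckets[api_key]
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        elapsed = now - bucket["last"]
        bucket["tokens"] = min(self.capacity, bucket["tokens"] + elapsed * self.rate)
        bucket["last"] = now
        if bucket["tokens"] >= cost:
            bucket["tokens"] -= cost
            return True
        return False
```

A key that bursts past its capacity gets refused until its bucket refills, while other keys are unaffected; sustained over-limit traffic from one key is exactly the "rogue scraper" signal the comment describes.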

ExistentialOverloadMonkey

The fuckwits at Reddit and Twitter HQs think they own that data. Data they didn't create or even contribute to. They imagine that by providing server space, they somehow own the content. As if the government owned the cars that use its roads, or an airline owned the travelers' baggage. Greedy bastards without shame.

PabloDiscobar (@PabloDiscobar@kbin.social)

What’s so bad about giving AI models something to learn on?

From a user's point of view? A lot. So far, AI has made itself the champion of the fake: fake news, fake pictures, fake videos, fake history, fake identities. Do you think AI will be used for your own good? Do you think your private data is farmed for your own good? I don't.

I posted an example about fake identities and fake posters on Twitter. This is the end goal. This is where the money generated by the AI will come from.

That way you could detect and address rogue scrapers while still working with LLM creators who are open to an honest training integration. And if your company can’t really tell the difference between users and LLM crawlers after implementing something like this, well, then those crawlers don’t really affect the company as much as the CEOs would like to pretend.

Twitter and Reddit probably want to be their own LLM creators; they don't want to leave this market to someone else's LLM. Also, it doesn't take many API calls to generate the content that will astroturf your product.

Anyway, the cat is out of the bag and this data will be harvested. Brands will astroturf their products using AI. People are not stupid and will realize the trick being played on them. We are probably heading toward platforms requiring fully authenticated access.

FaceDeer (@FaceDeer@kbin.social)

From a user's point of view? A lot. So far, AI has made itself the champion of the fake: fake news, fake pictures, fake videos, fake history, fake identities. Do you think AI will be used for your own good? Do you think your private data is farmed for your own good? I don't.

That's addressing whether the mere existence of LLMs is "good" or not. That won't be affected by whether someone changes their username every couple of months or whether some particular social media site makes its content annoying to scrape. LLMs exist now, and they're only getting better; attempting to staunch the flow of genies out of the bottle at this stage is futile.

Personally, I'm actually rather pleased that my comments on Reddit over the years are factoring into how LLMs "think." All that ranting about the quality of Disney's Star Wars movies not only convinced everyone who read it (I assume) but will now also convince our future AI overlords.
