Darkoshi ([personal profile] darkoshi) wrote 2023-05-04 12:39 am

ChatGPT and trust

One concern I have about ChatGPT and similar things is how they will affect us being able to find reliable and useful information online.

In the last few years, I've noticed more and more webpages which appear to consist of data scraped from other places on the web. These pages contain a series of questions and answers on a particular topic. One can tell a human didn't write or compose the page, because the questions are repetitive and include many variations of the same question. The answers in one part of the page sometimes contradict other answers on the same page.

Presumably these pages have ads on them, and the people who create them do so to get money from ad traffic. I don't usually see the ads because of my ad blockers.

Now with ChatGPT, I imagine that rather than generating pages like that with data scraped from the web, people will generate pages with questions answered by ChatGPT. The text on these pages will look much more convincing than the ones out there now. The information will probably be less reliable. The reliable information sources will be greatly outnumbered by the unreliable ones, and it will be difficult to distinguish them.

Maybe it has already happened. For example, this page, "Are American Toads Poisonous to Humans, Dogs, or Cats?", doesn't seem like one of those generated/scraped pages, but it repeats itself multiple times, which makes me suspicious. I can't tell from reading it whether I should trust it or not. I know that I probably shouldn't trust anything, especially not a random website I've come across, but for non-critical questions, it's nice to do a search and find a plausible answer that satisfies my curiosity or need-to-know, and then get on with my life. It's nicer if the plausible answers are true or at least based on what someone believes to be true, as opposed to some made-up answer.

I worry that there will be fake accounts on sites like Dreamwidth, posting content generated by tools like ChatGPT. I worry that someday I won't be able to tell which accounts are real people and which not, even when I interact with them. Will that affect my desire to interact with other people online?

I worry about art and music and poems, that I won't be able to tell if a human had a large part in making them or not. And that I won't feel as enthusiastic about them, for not knowing.

sousveillance

[personal profile] mellowtigger 2023-05-04 01:41 pm (UTC)
It's a real issue ahead of us. I rather like David Brin's solution in Existence, where any online feed passes through algorithms to determine authenticity. Due to 1) deepfake photo, audio, and video, and 2) security breaches that compromise publishing keys, it will require "triangulation" of multiple trust systems to determine if a source is real, even for live broadcasting. That's why I dislike the privacy laws currently being implemented. We need everyone recording everywhere, because that's the only way we'll be able to verify that the apparent cellphone recording made at 34th and Georgetown Avenue at 1:14pm on Tuesday in Los Angeles is authentic, by matching perspectives and timecodes for events from multiple people with multiple devices not under the same authority.

[personal profile] lhexan 2023-05-05 12:26 am (UTC)
Looking at that specific site, it might be generated, or it might be an author hastily copy/pasting information into a template. (On that same site, there are many "Is X poisonous?" articles with identical overall structures.) Either way, the legitimacy of this particular site is easy to challenge. The "About Us" page claims the site is an individual's passion project, but the "Contact Us" page is blank, and deactivating my ad blocker revealed a half-dozen autoplaying videos.

I remember first seeing pages of this kind in the late aughts. Previously, search strings such as "games similar to X" would mainly yield forum discussions. After that, searches would mainly show sites clearly generated from a template, with zero-effort pages on "games similar to X" for every halfway popular game, boosted by SEO. The situation is slightly better now, but boilerplate sites still rank too high in the results.

Tangentially, I didn't realize until I wrote this reply that Startpage is just a frontend for Google searches. Sigh. Maybe I'll give Mojeek a try.

I do think that chatbot text generation is going to be a big problem going forward, though. And then the search chatbots will be using other chatbots as part of their training data. It's going to be strange.