Chatbots provided incorrect, conflicting medical advice, researchers found: “Despite all the hype, AI just isn’t ready to take on the role of the physician.”

“In an extreme case, two users sent very similar messages describing symptoms of a subarachnoid hemorrhage but were given opposite advice,” the study’s authors wrote. “One user was told to lie down in a dark room, and the other user was given the correct recommendation to seek emergency care.”

    • rudyharrelson@lemmy.radio
      22 days ago

      People always say this on stories about “obvious” findings, but it’s important to have verifiable studies to cite in arguments for policy, law, etc. It’s kinda sad that it’s needed, but formal investigations are a big step up from just saying, “I’m pretty sure this technology is bullshit.”

      I don’t need a formal study to tell me that drinking 12 cans of soda a day is bad for my health. But a study that’s been replicated by multiple independent groups makes it way easier to argue to a committee.

  • softwarist@programming.dev
    21 days ago

    As neither a chatbot nor a doctor, I have to assume that subarachnoid hemorrhage has something to do with bleeding a lot of spiders.

  • BeigeAgenda@lemmy.ca
    22 days ago

    Anyone who has knowledge of a specific subject says the same: LLMs are constantly incorrect and hallucinate.

    Everyone else thinks it looks right.

    • tyler@programming.dev
      22 days ago

      That’s not what the study showed though. The LLMs were right over 98% of the time… when given the full situation by a “doctor”. The problem was normal people trying to self-diagnose without knowing which details were important.

      This is why studies are incredibly important. Even with the text of the study right in front of you, you assumed a conclusion that the study did not reach.

  • cub Gucci@lemmy.today
    21 days ago

    “but have they tried Opus 4.6/ChatGPT 5.3? No? Then disregard the research, we’re on the exponential curve, nothing is relevant”

    Sorry, I’ve opened Reddit this week.