• sanguine_artichoke@midwest.social
    English
    arrow-up
    33
    ·
    11 months ago

    This is what I wondered about a few months ago when people were saying that ChatGPT was a ‘google killer’. So we just have ‘AI’ read websites and sum them up, vs. visiting websites? Why would anyone bother putting information on a website at that point?

    • dantheclamman@lemmy.worldOP
      English
      arrow-up
      34
      ·
      11 months ago

      We are barreling towards this issue. Stack Overflow, for example, has crashing viewer numbers. But an AI isn’t going to help users navigate and figure out a new Python library, for example, without data to train on. I’ve already had AIs straight up hallucinate functions in R that don’t exist. It seems to happen primarily with the newer libraries, probably because there are fewer posts on Stack Exchange about them.

      • GenderNeutralBro@lemmy.sdf.org
        English
        arrow-up
        12
        arrow-down
        3
        ·
        11 months ago

        AI isn’t going to help users navigate and figure out a new Python library for example

        Current AI will not. Future AI should be able to as long as there is accurate documentation. This is the natural direction for advancement. The only way it doesn’t happen is if we’ve truly hit the plateau already, and that seems very unlikely. GPT-4 is going to look like a cheap toy in a few years, most likely.

        And if the AI researchers can’t crack that nut fast enough, then API developers will write more machine-friendly documentation and training functions. It could be as ubiquitous as unit testing.

        • FaceDeer@kbin.social
          arrow-up
          5
          arrow-down
          2
          ·
          11 months ago

          Current AI can already “read” documentation that isn’t part of its training set, actually. Bing Chat, for example, does web searches and bases its answers in part on the text of the pages it finds. I’ve got a local AI, GPT4All, that you can point at a directory full of documents and tell “include that in your context when answering questions.” So we’re already getting there.
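
A minimal sketch of the "include that in your context" idea, assuming the simplest possible approach: read the text files in a folder and paste as much of them as fits in front of the question. The names `load_directory` and `build_prompt`, the `.txt` filter, and the character budget are all hypothetical choices for illustration; this is not how Bing Chat or GPT4All actually do it.

```python
from pathlib import Path


def load_directory(folder: str) -> dict[str, str]:
    """Read every .txt file in a folder into a {file name: text} mapping."""
    return {
        path.name: path.read_text(encoding="utf-8")
        for path in sorted(Path(folder).glob("*.txt"))
    }


def build_prompt(question: str, documents: dict[str, str], budget: int = 4000) -> str:
    """Put document text in front of the question, up to a character budget.

    The budget stands in for the model's context limit (an assumption;
    real limits are counted in tokens, not characters).
    """
    parts = []
    remaining = budget
    for name, text in documents.items():
        if remaining <= 0:
            break  # context is full, later documents are dropped entirely
        excerpt = text[:remaining]
        parts.append(f"[{name}]\n{excerpt}")
        remaining -= len(excerpt)
    context = "\n\n".join(parts)
    return (
        "Use the following documents to answer.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

Whatever does not fit in the budget is simply cut off, so anything the model would need from the dropped text never reaches it.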

          • GenderNeutralBro@lemmy.sdf.org
            English
            arrow-up
            5
            ·
            11 months ago

            Getting there, but I can say from experience that it’s mostly useless with the current offerings. I’ve tried using GPT-4 and Claude 2 to give me answers for less-popular command line tools and Python modules by pointing them to complete docs, and I was not able to get meaningful answers. :(

            Perhaps you could automate a more exhaustive fine-tuning of an LLM based on such material. I have not tried that, and I am not well-versed in the process.

            • FaceDeer@kbin.social
              arrow-up
              2
              ·
              11 months ago

              I’m thinking a potentially useful middle ground might be to have the AI digest the documentation into an easier-to-understand form first, and then have it query that digest for context later when you’re asking it questions about stuff. GPT4All already does something a little similar in that it needs to build a search index for the data before it can make use of it.

              • GenderNeutralBro@lemmy.sdf.org
                English
                arrow-up
                1
                ·
                11 months ago

                That’s a good idea. I have not specifically tried loading the documentation into GPT4All’s LocalDocs index. I will give this a try when I have some time.

                • FaceDeer@kbin.social
                  arrow-up
                  3
                  ·
                  11 months ago

                  I’ve only been fiddling around with it for a few days, but it seems to me that the default settings weren’t very good: by default it’ll load four 256-character-long snippets into the AI’s context from the search results, which is pretty hit and miss on being informative in my experience. I think I may finally have found a good use for those models with really large contexts: I can crank up the size and number of snippets it loads, and that seems to help. But it still doesn’t give “global” understanding. For example, if I put a novel into LocalDocs and then ask the AI about general themes or large-scale “what’s this character like” stuff, it still only has a few isolated bits of the novel to work from.
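
A rough sketch of why a handful of short snippets is hit and miss, assuming fixed-size character windows ranked by plain keyword overlap. Only the defaults of four snippets and 256 characters come from the comment above; the scoring and the function names are stand-ins for illustration, not a description of how LocalDocs actually builds or searches its index.

```python
import re
from collections import Counter


def split_into_snippets(text: str, size: int = 256) -> list[str]:
    """Cut the text into consecutive windows of `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def tokenize(text: str) -> list[str]:
    """Lowercase words, keeping underscores so identifiers stay whole."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def top_snippets(question: str, text: str, size: int = 256, count: int = 4) -> list[str]:
    """Return the `count` snippets that share the most words with the question."""
    query_words = set(tokenize(question))

    def score(snippet: str) -> int:
        words = Counter(tokenize(snippet))
        return sum(words[word] for word in query_words)

    snippets = split_into_snippets(text, size)
    # sorted() is stable, so ties keep their original document order.
    return sorted(snippets, key=score, reverse=True)[:count]


def context_budget(size: int = 256, count: int = 4) -> int:
    """Upper bound on how many characters of the source reach the model."""
    return size * count
```

With the defaults, the model sees at most 4 × 256 = 1024 characters of the source, however long the source is. Raising `size` and `count` widens that window, but the selection is still a set of isolated fragments.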

                  What I’m imagining is that the AI could sit on its own for a while loading up chunks of the source document and writing “notes” for its future self to read. That would let it accumulate information from across the whole corpus and cross-reference disparate stuff more easily.
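
The "notes for its future self" idea could be sketched roughly like this, assuming a `summarize(notes, chunk)` callable that would wrap a real model call. Everything here is hypothetical: `build_notes`, the chunk size, and especially `first_sentence_summarizer`, which is a trivial placeholder so the example runs without a model.

```python
from typing import Callable


def chunk_text(text: str, size: int = 2000) -> list[str]:
    """Cut the corpus into consecutive windows of `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def build_notes(
    text: str,
    summarize: Callable[[str, str], str],
    chunk_size: int = 2000,
) -> str:
    """Walk the corpus chunk by chunk, carrying the notes forward.

    On every step the summarizer sees the notes so far plus one new chunk
    and returns the updated notes.
    """
    notes = ""
    for chunk in chunk_text(text, chunk_size):
        notes = summarize(notes, chunk)
    return notes


def first_sentence_summarizer(notes: str, chunk: str) -> str:
    """Placeholder for a model call: append the first sentence of the chunk."""
    sentence = chunk.split(".")[0].strip()
    if not sentence:
        return notes
    return (notes + "\n- " + sentence).strip()
```

Because the notes are passed back in on every step, information from early chunks is still available when late chunks are read, which is the cross-referencing a few isolated snippets cannot provide. The final notes could then be given to the model as context in place of raw snippets.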

            • sanguine_artichoke@midwest.social
              English
              arrow-up
              1
              ·
              10 months ago

              What about GitHub Copilot? It has tons of material available for training. Of course, it’s not necessarily all bug-free or well written.

  • corsicanguppy@lemmy.ca
    English
    arrow-up
    17
    ·
    11 months ago

    “How does it help creators? Without them there is no web…” After all, if a web browser sucked out all information from web pages without users needing to actually visit them, why would anyone bother making websites in the first place?

    This reminds me of when Mozilla was 0.9 and the web was just taking the baton from Gopher.

    When Ben suggests there would be no web without monetization, he seems to forget WHEN HE WAS THERE before the sellout.

  • TheRealCharlesEames@lemm.ee
    English
    arrow-up
    18
    arrow-down
    1
    ·
    11 months ago

    Thank you to Arc for reminding me how much I enjoy browsing the internet and its many unique pages — these soulless generated results are the opposite of what I want.

    • fidodo@lemmy.world
      English
      arrow-up
      11
      arrow-down
      1
      ·
      11 months ago

      More and more of the Internet is being AI-generated, so you’ll get to choose between a soulless summary and soulless SEO spam.

      • 1984@lemmy.today
        English
        arrow-up
        3
        ·
        11 months ago

        There will be alternatives, I think. Maybe the web turns into even more trash, but then there will be alternatives for people who know where to look.

        I guess it’s the fate of the web to become cable TV. But that doesn’t mean we have to watch that content.

    • dantheclamman@lemmy.worldOP
      English
      arrow-up
      16
      ·
      11 months ago

      Well, people who make content are already suffering from a collapse in ad prices. News sites are shutting down left and right. Not everything is about money, but they need revenue or external support to continue operating.

      • GenderNeutralBro@lemmy.sdf.org
        English
        arrow-up
        15
        ·
        11 months ago

        I see the advent of AI browsers much like ad blockers; the web has become increasingly user-hostile and users are pushing back. Advertising was never sustainable, and that has only become more apparent over the past decade. This is a long-overdue comeuppance. The cost of the advertising economy is extraordinary and cannot be measured in mere dollars.

        I miss the internet from the 90s, when sites were information-dense and operated mostly as a public service by enthusiasts, usually for free. Of course, that was not sustainable as the Internet became more popular, because the cost of serving a thousand people was, like, couch-cushion money, but the cost of serving billions of people…well, I don’t have millions of couch cushions to plunder.

        But also, the cost of web site operation today is artificially high, largely because of advertising and the incentives that an ad-driven market creates. What was once a few KB of text is now many MB of ads, scripts, layouts, and graphics, or even GB of videos, all for the sake of manipulating users into viewing more ads. Commercial sites do not compete on the quality of information; they compete over ad impressions. This was not born out of need, but out of economic incentives that are misaligned with the needs of society, individuals, and, yes, even content producers.

        This isn’t new, of course. I remember the same conversations back in the 90s and early 2000s. First with Sherlock, then later with Google.

      • 1984@lemmy.today
        English
        arrow-up
        4
        ·
        11 months ago

        Not everyone creates content to make money. This discussion and this thread are not about making any money, for example.

        So why do we post when there is no monetary gain? Because we enjoy it.

        • dantheclamman@lemmy.worldOP
          English
          arrow-up
          2
          ·
          11 months ago

          I have a long-running blog for fun, so you’re preaching to the choir. But some things can’t replace a dedicated journalist, particularly at the local level, sitting in city council meetings, chasing leads, and interviewing people.

      • FaceDeer@kbin.social
        arrow-up
        6
        arrow-down
        4
        ·
        11 months ago

        People who make content for money are suffering from a collapse in ad prices. There are people who make content because they enjoy making and sharing content.

        • Corkyskog@sh.itjust.works
          English
          arrow-up
          8
          ·
          11 months ago

          That’s not what we’re talking about… we’re talking about news. Real news, with investigative journalism, costs money. You need to pay for people to be on the ground, travel expenses, etc.

        • willya@lemmyf.uk
          English
          arrow-up
          3
          arrow-down
          2
          ·
          11 months ago

          This thought that everything you consume online should be completely free is insane. If everything we consumed online was just someone’s hobby there’d be even more trash.

          • sin_free_for_00_days@sopuli.xyz
            English
            arrow-up
            2
            arrow-down
            1
            ·
            11 months ago

            There sure was a lot less trash when the web first came to the internet and there weren’t any paid sites. Of course there was a lot less everything.

            • willya@lemmyf.uk
              English
              arrow-up
              2
              ·
              11 months ago

              When it first started there were more smart people using it over dumbasses. What was there that would have even been worth paying for?

  • smileyhead@discuss.tchncs.de
    English
    arrow-up
    5
    arrow-down
    2
    ·
    11 months ago
    • We create WWW, where everyone can freely put things on and discuss anything.

    • Oh no! But what about the profits?

    • We create this summarization tool to quickly get knowledge without always needing to peek deeper into text.

    • Oh no! But what about the profits?

    • dantheclamman@lemmy.worldOP
      English
      arrow-up
      6
      arrow-down
      1
      ·
      11 months ago

      It’s not about profit. It’s about being compensated for one’s hard work, which was appropriated without permission by giant corporations.

      • TheMurphy@lemmy.world
        English
        arrow-up
        1
        arrow-down
        1
        ·
        11 months ago

        Lesson #1 on the Internet.

        Put something on it, expect it to be there forever. You never own whatever you put out there, whether text, pictures, or video.

        Maybe companies should realise this too.