Cloudflare, along with a majority of the world’s leading publishers and AI companies, is changing the default to block AI crawlers unless they pay creators for content.

  • ter_maxima@jlai.lu
    English
    arrow-up
    44
    ·
    1 day ago

    If they could stop blocking real users at the same time they block AI crawlers, that would be nice.

    • AceFuzzLord@lemmy.zip
      English
      arrow-up
      15
      ·
      1 day ago

      I absolutely hate Cloudflare because they always block me whenever I visit a site while connected to any Proton VPN server, regardless of country. Even when I’m not connected to a VPN, they sometimes decide my connection is suspicious despite my not having anything that should trigger it.

      • fmstrat@lemmy.nowsci.com
        English
        arrow-up
        2
        ·
        edit-2
        16 hours ago

        Make a dummy Google account and log into it when on the VPN. Having an ad history usually avoids the blocks. (Note: only do this if your browsing is not activist-related, etc.)

        Also, if it’s image captchas that never end, switch to the accessibility option for the captcha.

  • Bell@lemmy.world
    English
    arrow-up
    21
    ·
    2 days ago

    A nice explanation of what’s wrong with web-based AI. I hope other content providers follow suit.

  • Dr. Moose@lemmy.world
    English
    arrow-up
    12
    arrow-down
    3
    ·
    edit-2
    1 day ago

    This has been tried before many times. The problem is that this exchange can never satisfy all of the parties.

    Would a site host take $0.01 for a page? If it’s Walmart.com, they’d happily give up even $0.10 or more per page if it meant competitors couldn’t analyze their products or cause other perceived IP damage.

    For example, let’s assume they do the business math and conclude that anything below $5 is a no-deal. What scraper would pay $5 for a single product page scrape? Maybe OpenAI can pay that, but is this what we want, where public scraping is only accessible to billionaires? What if you’re just a user who wants to track Walmart prices to build your own budgeting script? Are you paying $5 on every request?

    Now, for creative content like blogs, it could actually work, and micropayments have been the holy grail here forever. But what’s more likely to happen is that free content will outcompete paid: when an LLM asks “do you want to read this for free or pay $2 for this other source?”, 99% of users will pick free, because some unknown source has zero authority in the end user’s eyes to justify the risk.

    What I suspect will happen is similar to what happened with the rise of SEO spam, but it’ll be a bit better because LLMs are harder to game than Google. Most content will be free but carry injected biases, shilling, or other promotions or agendas to subsidize the costs. On the other hand, a lot of content will remain high quality and free as a legitimate source of authority signaling within relevant industries, which is already a big thing.

    Anyway, thanks for coming to my TED talk.

  • 3dcadmin@lemmy.relayeasy.com
    English
    arrow-up
    8
    arrow-down
    1
    ·
    2 days ago

    Cloudflare aren’t perfect, but I still use them because, for a free account, the benefits outweigh negatives like this… However, to say the world’s leading publishers and AI companies are on board is simply not true…

  • InternetCitizen2@lemmy.world
    English
    arrow-up
    6
    ·
    2 days ago

    Kind of reminds me of that time CF bait-and-switched that gambling website. Everyone was wondering which of the two they found more predatory and distasteful.

    • sucoiri@lemmy.world
      English
      arrow-up
      8
      ·
      1 day ago

      IIRC, it came out that the gambling site was intentionally cycling through Cloudflare’s IP range to avoid IP blocking, causing legitimate websites to be flagged as unsafe. Cloudflare said they’d still host them; they just needed their own IPs and not Cloudflare’s. Horribly communicated to the customer, though.