Almost every website and service is getting scraped at an alarming rate. Are Lemmy servers facing this issue?

Please share mitigations you’ve seen applied to this.

  • SSUPII@sopuli.xyz
    4 months ago

    Those are truly useless against bad actors and are instead only annoying for the humans who read them. And good actors with proper licenses won’t be scraping Lemmy, Reddit or Twitter.

    You just cannot prevent it on Lemmy, because if one instance deploys a filter like Anubis, another will not, and it is not feasible to require every instance to do so. Also, this is an open platform by nature, and there is no group or company that can mandate rules of access. By limiting non-humans, you might also be limiting real users with unusual configurations or heavy privacy middleware.

    • Captain Beyond@linkage.ds8.zone
      4 months ago

      The point (as I see it) is not so much to stop scraping as it is to prevent bots from effectively DDoS-ing web services. As others have said, ActivityPub content is public, and there are ways to get it without slamming instances with scraper bots.