• @Strawberry@lemmy.blahaj.zone
      23 hours ago

      The bots scrape costly endpoints like the entire edit histories of every page on a wiki. You can’t always just cache every possible generated page at the same time.
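      A rough sketch of why that blows up, with made-up numbers purely to show the combinatorics: every pair of revisions on a page is its own diff URL, so the space of generated history pages dwarfs the article count.

```python
# Back-of-the-envelope count of distinct "generated" URLs a
# MediaWiki-style history endpoint exposes (all numbers assumed).
pages = 100_000          # assumed article count
revisions_per_page = 50  # assumed average edit-history length

# One history entry per revision across the whole wiki
history_entries = pages * revisions_per_page

# Any pair of revisions on a page can be requested as a diff
diffs_per_page = revisions_per_page * (revisions_per_page - 1) // 2
diff_pages = pages * diffs_per_page

print(f"history entries:     {history_entries:,}")  # 5,000,000
print(f"possible diff pages: {diff_pages:,}")       # 122,500,000
```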

      • @jagged_circle@feddit.nl
        edited 3 hours ago

        Of course you can. This is why people use CDNs.

        Put the entire site on a CDN with a cache of 24 hours for unauthenticated users.
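        A minimal sketch of what that could look like at the origin, assuming a Flask app and a login cookie literally named "session" (both stand-ins, not anything from the thread): responses with a public max-age can sit at the CDN edge for 24 hours, while logged-in responses stay uncached.

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/wiki/<path:page>")
def wiki_page(page):
    # Stand-in for the real page renderer
    return f"<h1>{page}</h1>"

@app.after_request
def set_cache_headers(resp):
    # Cookie name is an assumption; use whatever marks a logged-in session.
    if "session" in request.cookies:
        # Logged-in users: never serve them a shared cached copy
        resp.headers["Cache-Control"] = "private, no-store"
    else:
        # Anonymous users: the CDN may reuse this response for 24 hours
        resp.headers["Cache-Control"] = "public, max-age=86400"
    return resp
```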

    • @nutomic@lemmy.ml
      1 day ago

      Cache size is limited, so it usually only holds the most recently viewed pages. But these bots go through every single page on the website, even old ones that are never viewed by users. As they only send one request per page, caching doesn’t really help.
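      A toy illustration of that, assuming a 1,000-entry LRU cache in front of a 100,000-page site (numbers invented): repeat visits to a small set of hot pages hit the cache almost every time, while a crawler that requests each page exactly once never gets a single hit.

```python
import random
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache that just tracks hit/miss counts."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()
        self.hits = self.misses = 0

    def get(self, key):
        if key in self.store:
            self.store.move_to_end(key)         # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            self.store[key] = True
            if len(self.store) > self.capacity:
                self.store.popitem(last=False)  # evict least recently used

PAGES, CACHE_SIZE, USER_REQUESTS = 100_000, 1_000, 50_000

# Regular users: traffic concentrated on ~500 popular pages
users = LRUCache(CACHE_SIZE)
for _ in range(USER_REQUESTS):
    users.get(random.randint(0, 499))

# Crawler: touches every page exactly once, so nothing is ever re-requested
crawler = LRUCache(CACHE_SIZE)
for page_id in range(PAGES):
    crawler.get(f"page-{page_id}")

print(f"user traffic hit rate:    {users.hits / USER_REQUESTS:.0%}")  # ~99%
print(f"crawler traffic hit rate: {crawler.hits / PAGES:.0%}")        # 0%
```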

    • LiveLM
      22 days ago

      I’m sure that if it was that simple, people would be doing it already…