How to combat large amounts of AI scrapers
Drunk & Root@sh.itjust.works to Selfhosted@lemmy.world · English · 3 months ago
Every time I check my nginx logs, there are more scrapers than I can count, and I could not find any good open-source solutions.
daniskarma@lemmy.dbzer0.com · 3 months ago (edited)
How do you know it’s “AI” scrapers? I’ve had my server up since before AI was a thing. It’s totally normal to get thousands of bot hits and to get scraped. I use crowdsec to mitigate it, but you will always get bot hits.
Sheldan@lemmy.world · 3 months ago
Some of them are at least honest and have it as a user agent.

krakenfury@lemmy.sdf.org · 3 months ago
Is ignoring robots.txt considered “honest”?

Sheldan@lemmy.world · 3 months ago
That’s not what I was talking about.
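For the crawlers that do announce themselves in the user agent, here is a rough sketch of how one might count them in an nginx access log. It assumes the default "combined" log format, where the user agent is the last quoted field. The token list is illustrative and far from complete, and `count_ai_agents` is just a made-up helper name. It says nothing about bots that fake a browser user agent.

```python
import re
from collections import Counter

# Illustrative, non-exhaustive tokens that some AI crawlers include in
# their User-Agent header. Extend this from what you see in your own logs.
AI_AGENT_TOKENS = ("GPTBot", "ClaudeBot", "CCBot", "Bytespider", "PerplexityBot")

# Assumption: nginx "combined" format, so the User-Agent is the last
# double-quoted field on each line.
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')


def count_ai_agents(lines):
    """Count log lines whose User-Agent contains a known crawler token."""
    counts = Counter()
    for line in lines:
        match = USER_AGENT_RE.search(line)
        if not match:
            continue  # line does not look like the combined format
        agent = match.group(1).lower()
        for token in AI_AGENT_TOKENS:
            if token.lower() in agent:
                counts[token] += 1
                break  # count each request once
    return counts
```

You could feed it an open log file, since it only needs an iterable of lines, and then decide from the counts whether blocking by user agent is even worth the effort.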