• 0 Posts
  • 18 Comments
Joined 5 months ago
Cake day: March 31st, 2025


  • Yeah, you’re absolutely right and I agree. So do we have to resign ourselves to an eternal back-and-forth of developing random new challenges every time the scrapers adapt to them? Like new antibiotics against resistant bacteria? Maybe that is the way it is, and honestly that’s what I suspect. But Anubis feels so clever and so close to something that would work. The concept of making it about a cost that adds up, so that it intrinsically only affects massive operations significantly, is really smart - it’s not about coming up with a challenge a computer can’t complete, just a challenge that makes it economically not worth completing. But it’s disappointing to see that, at least with the current wait times, it doesn’t seem to cost enough to dissuade scrapers. And worse, the cost is so low that making it significant to the scrapers would apparently require really insufferable wait times for users.
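    To put rough numbers on that intuition, here’s a quick back-of-envelope sketch. Every figure in it is a made-up placeholder (not from the article, and not from Anubis itself), just to show how the per-request cost scales:

    ```typescript
    // Hypothetical back-of-envelope math for a proof-of-work wall like Anubis.
    // All three inputs are invented placeholders, chosen only for illustration.
    const secondsPerChallenge = 1;       // assumed CPU time to solve one challenge
    const requestsScraped = 10_000_000;  // assumed size of a big scraping run
    const dollarsPerCpuHour = 0.05;      // assumed cloud compute price

    const cpuHours = (secondsPerChallenge * requestsScraped) / 3600;
    const totalCost = cpuHours * dollarsPerCpuHour;

    console.log(`${cpuHours.toFixed(0)} CPU-hours, ~$${totalCost.toFixed(2)}`);
    // With these placeholders: ~2778 CPU-hours, ~$139 for ten million pages.
    // Pocket change for a large scraping operation, while every human visitor
    // still pays the full wait on every challenge.
    ```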


  • By negligence, I meant that the cost is negligible to the companies running scrapers, not that the solution itself is negligent. I should have said “negligibility” of Anubis, sorry - that was poor clarity on my part.

    But I do think that the cost of it is indeed negligible, as the article shows. It doesn’t really matter whether the author is biased; their analysis of the costs seems reasonable. I would need a counter-argument against that to think they were wrong. Bias alone isn’t enough to discount the quantification they attempted to bring to the debate.

    Also, I don’t think there’s any hypocrisy in me saying I’ve only thought about other solutions here and there - I’m not maintaining an anti-scraping library. And there have already been indications that scrapers are just accepting the cost of Anubis on Codeberg, right? So I’m not trying to say I’m some sort of tech genius who has the right idea here, but from what Codeberg was saying, and from the numbers in this article, it sure looks like Anubis isn’t the right idea. I am indeed only having fun with my suggestions, not making whole libraries out of them and pronouncing them to be solutions. I personally haven’t seen evidence that Anubis is so clearly working. As the author points out, it seems to be working right now only because of how new it is; if scrapers want to get through it, they easily can, which puts us in the same antibiotic-and-resistant-bacteria war of attrition. And of course that is the case with many things in computing as well. So I guess my open wondering is just whether there’s ever any way to develop a countermeasure that the scrapers won’t find “worth it” to force through?

    Edit for tone clarity: I don’t want to be antagonistic, rude, or hurtful in any way; I’m just trying to have a discussion and understand this situation. Perhaps I came across as arrogant, and if so I apologize - that was not my intent, fwiw. Also, thanks for helping me understand why I was getting downvoted. I intended my post to be constructive spitballing about what I see as the eventual, inevitable weakness in Anubis. I think it’s a great project, it’s great that people are getting use out of it even temporarily, and of course the devs deserve lots of respect for making the thing. But as much as I wish I could like it and believe it will solve the problem, I still don’t think it will.




  • Yeah, well-written stuff. I think Anubis will come and go. This beautifully demonstrates and, best of all, quantifies the ~~negligence~~ negligible cost to scrapers of Anubis.

    It’s very interesting to try to think of what would work, even conceptually. Some sort of purely client-side captcha type of thing perhaps. I keep thinking about it in half-assed ways for minutes at a time.

    Maybe something that scrambles the characters of the site according to some random “offset” of some sort, e.g. randomly selecting a modulus size and an offset to cycle them, or even just a good ol’ cipher. And the “captcha” consists of a slider that adjusts the offset. You as the viewer know it’s solved when the text becomes something sensible, so there’s no need for the client code to store a readable key that could be used to auto-undo the scrambling. You could maybe even have some decoy slider positions that also produce English-looking text, in case the scrapers got smart enough to check for legibility (not sure how to hide which slider positions are the red herring ones, though), which might be enough to trick a scraper into picking up junk text sometimes.
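    To make that concrete, here’s a rough sketch of the slider-offset idea. The names and the alphabet are my own invention, purely illustrative, and this obviously isn’t hardened anti-scraper code:

    ```typescript
    // Hypothetical sketch: cycle each letter by a secret offset within a fixed
    // alphabet, ship only the scrambled text, and let a slider apply a trial
    // offset until a human reader sees that the page has become legible.
    const ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    function cycle(text: string, offset: number): string {
      const len = ALPHABET.length;
      return [...text]
        .map((ch) => {
          const i = ALPHABET.indexOf(ch);
          if (i === -1) return ch; // leave spaces and punctuation untouched
          const j = (((i + offset) % len) + len) % len;
          return ALPHABET[j];
        })
        .join("");
    }

    // "Server" side: pick a random offset and serve cycle(original, offset).
    // Client side: render cycle(scrambled, -sliderValue) as the slider moves.
    // Only a human can tell when the output reads as real text, so the page
    // never has to ship a key that would let a bot undo the scrambling.
    const scrambled = cycle("The hidden article text", 17);
    console.log(cycle(scrambled, -17)); // "The hidden article text"
    ```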


  • I know it’s popular to call conservatives dumb, and while I don’t like to beat dead horses, I really think the explanation for this is that they’re dumb. The illiterate form of dumb, to be precise. The caps are a way of adding emphasis, which is something that can also be done with phrasing, word choice, sentence structure, and so on. But those techniques would require a beyond-4th-grade level of writing and reading ability, so they do not succeed in the conservative communication marketplace.

    I will add a disclaimer: my beloved Nietzsche also uses all-caps words sometimes, but I believe that since this sits alongside his impressive eloquence, it is clearly not a sign of stupidity in that context. Likewise, it is not a sign of stupidity in many other contexts. That is to say:

    Using that style of communication does not always make it a safer bet that someone is stupid, but being stupid does make it a safer bet that they use that style of communication.





  • I wasn’t being totally serious, but also, I do think that while accessibility concerns come from a good place, there is some practical limitation that must be accepted when building fringe and counter-cultural things. Like, my hidden rebel base can’t have a wheelchair-accessible ramp at the entrance, because then my base isn’t hidden anymore. It sucks that some solutions can’t work for everyone, but if we just throw them out because they won’t work for 5% of people, we end up with nothing. I’d rather have a solution that works for 95% of people than no solution at all. I’m not saying that people who use screen readers are second-class citizens. If crawlers were vision-based, then I might suggest matching text to background colors so that only screen readers could understand the site, because something that works for 5% of people is also better than no solution at all. We need to tolerate having imperfect first attempts and understand that more sophisticated infrastructure comes later.

    But yes, my image map idea is pretty much a joke anyway.




  • I’m trying to take a progress-over-perfection approach to these things. My number one priority was to get off of Chrome, and Firefox is pretty rough on mobile. I tried a few things and Brave was the one with the best experience, especially because of the ad blocking without needing to mess around with a bunch of plugins. I figure I can go deeper into that iceberg over time. What do you use?



  • I see what you’re saying. You’re not talking about “making sense” in an ethical or social well-being sense, you mean it’s literally confusing why the technology wouldn’t be used for all kinds of crimes, given that it already exists - irrespective of whether the technology should be used. Is that right? I think you’re getting downvoted because it kinda sounds like you’re saying this is all a good idea when you say it “makes sense”. Unfortunate English ambiguities. But you’re saying, like, sure it’s dystopian and creepy and wrong, but why wouldn’t the creepy dystopia use the tech for all cases then rather than just some? That’s a good question. I think because there is legitimately some understanding of the dangers of using these powerful tools willy-nilly. While people aren’t perfect angels, they also aren’t perfect devils either. Another factor is that there is some pressure to appear not to be overly heavy-handed with these tools - as we see in those chats, they knew it made them look bad for this to get out.

    And the final, most pessimistic factor is that this Flock company almost certainly charges per seat, so giving direct usernames and logins to every officer, or even every department, is probably absurdly expensive. Companies (in this case the police) will often try to limit their license seats to as few people as possible and then funnel as many different people’s work through that one person’s license as they can.


  • It does make sense. Police are not perfect saint-like beings, and the government is not composed of perfect beings either. I’m not sure what kind of person you are, but I’m sure there are some things you enjoy and partake in which some other social group really despises. If you’re religious, it may be militant atheists who despise you going to church. If you’re not religious, it may be militant theists who despise you not going to church. The point is, there are probably some social groups out there that hate you for the things that you love. Those people may not be in charge right now, but they might be one day. Those people can end up in police departments, as developers for these camera companies, as administrators for the database that collects information on where you drive and when. Those people, being imperfect as they are, may not always resist the temptation to use this system to track down and identify people like you for doing whatever it is that you love and they hate. Now you end up on a list for that.

    There’s no denying that sophisticated surveillance technology does make it easier to catch criminals and does legitimately protect against the threats those criminals pose. But surveillance technology, by its very nature, cannot surveil only the criminals - it has to surveil everyone to find the criminals. And the notion of what is criminal may change. If your favorite hobby becomes criminalized, or if the government criminalizes your identity itself, these beautifully effective tools are suddenly turned against you.

    There is a happy medium to be found between giving your society tools to enforce the will of its constituents and giving it tools that can be too easily abused. Given that this tool is already being abused, it probably isn’t worth the benefits.