Links are almost always base64 encoded now and the online url decoders always produce garbage. I was wondering if there is a project out there that would allow me to self-host this type of tool?
I’d probably network this container through gluetun because, yanno, privacy.
Edit to add: Doesn’t have to be specifically base64 focused. Any link decoder that I can use in a privacy respecting way, would be welcome.
Edit 2: See if your solution will decode this link (the one in the image): https://link.sfchronicle.com/external/41488169.38548/aHR0cHM6Ly93d3cuaG90ZG9nYmlsbHMuY29tL2hhbWJ1cmdlci1tb2xkcy9idXJnZXItZG9nLW1vbGQ_c2lkPTY4MTNkMTljYzM0ZWJjZTE4NDA1ZGVjYSZzcz1QJnN0X3JpZD1udWxsJnV0bV9zb3VyY2U9bmV3c2xldHRlciZ1dG1fbWVkaXVtPWVtYWlsJnV0bV90ZXJtPWJyaWVmaW5nJnV0bV9jYW1wYWlnbj1zZmNfYml0ZWN1cmlvdXM/6813d19cc34ebce18405decaB7ef84e41 (it should decode to this page: https://www.hotdogbills.com/hamburger-molds)
Not that you should vibe code, but you could vibe code this so easily. Have it output a static website. Give the source code a scan if you’re paranoid. Check the network tab if you’re really really paranoid. But literally you could have it output this as a static index.html file that you drop into your browser of choice.
This is the only type of coding LLMs should ever be used for imo. A small, very clearly defined task that is very easy to verify if it works. And code that won’t infect a larger project.
Edit: as others pointed out, that url isn’t base64 encoded. You would have to clearly define what you are trying to do if you want this to work. For example, do all urls follow the same format as the above?
There is no such thing as a base64 encoded URL. Part of a URL might hold base64 encoded data, but never the URL itself.
These online tools aren’t working because you’re using them wrong.
Fun fact: Base64url is not quite the same as base64. Its alphabet is slightly different from base64 so its characters can be used in more places (URLs, filenames, etc.).
I suppose the tool’s name is more clear for those who are aware of those differences, but very unclear for others.
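For anyone curious what the alphabet difference looks like in practice, here is a minimal Python sketch. The three input bytes are picked purely for illustration so that the output contains only the two characters that differ between the alphabets.

```python
import base64

# Bytes chosen so every 6-bit group maps to index 62 or 63,
# the only two positions where the alphabets differ.
raw = bytes([0xFB, 0xFF, 0xBF])

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")

print(standard)  # +/+/
print(url_safe)  # -_-_
```

So a decoder that only knows the standard alphabet will trip over any `-` or `_` in a base64url string.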
RAHHHHH this is embarrassing. You’re totally right. I’m wrong.
Ada Lovelace, forgive me!
That url isn’t base64 encoded. You can tell by the fact that it’s still a URL, and doesn’t decode……
…that one little detail everyone missed
Just take the base64 bit of the url. The whole url isn’t a base64, so it decoded to garbage.
The base64 bit decodes just fine.
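A rough Python sketch of "just take the base64 bit", using the sample link from the post. Two assumptions here that are mine, not confirmed by anything else: the payload is the longest path segment, and it uses the URL-safe alphabet with the padding stripped.

```python
import base64
from urllib.parse import urlsplit

url = (
    "https://link.sfchronicle.com/external/41488169.38548/"
    "aHR0cHM6Ly93d3cuaG90ZG9nYmlsbHMuY29tL2hhbWJ1cmdlci1tb2xkcy9idXJnZXItZG9nLW1vbGQ_"
    "c2lkPTY4MTNkMTljYzM0ZWJjZTE4NDA1ZGVjYSZzcz1QJnN0X3JpZD1udWxsJnV0bV9zb3VyY2U9"
    "bmV3c2xldHRlciZ1dG1fbWVkaXVtPWVtYWlsJnV0bV90ZXJtPWJyaWVmaW5nJnV0bV9jYW1wYWlnbj1z"
    "ZmNfYml0ZWN1cmlvdXM/6813d19cc34ebce18405decaB7ef84e41"
)

# Assumption: the encoded payload is the longest path segment.
segments = urlsplit(url).path.split("/")
encoded = max(segments, key=len)

# Restore the "=" padding that was stripped from the link.
padded = encoded + "=" * (-len(encoded) % 4)

# urlsafe_b64decode understands "-" and "_" from the base64url alphabet.
decoded = base64.urlsafe_b64decode(padded).decode("utf-8")
print(decoded)
```

This should print the hotdogbills.com URL including its query string, tracking parameters and all.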
IT-Tools is kind of fun: a web page full of common tools, converters, references, cheat sheets, etc.
Came here to type ittools but @mike_wooskey@lemmy.thewooskeys.com was here first
Came to post this, glad to see it’s already here.
Nice little utility tool box that does a ton of helpful stuff in a small package. Super easy to self host and container images easily available.
This is amazing, thank you!
I mean… It’s decoding into garbage because you’re feeding it more than just the base64 section. I suppose if you’re already running nginx or something you could easily make a page that uses javascript to break the link down (possibly using /, ?, = as separators) and decode sections that look like base64. If you make it javascript and client side there’s not really any privacy concerns.
EDIT: Oops. My Lemmy client didn’t load the other replies at first, I didn’t realize you already had plenty of other options.
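The splitting idea can be sketched roughly like this. It is written in Python rather than JavaScript to match the script elsewhere in the thread, and the "looks like base64" test (16 or more URL-safe characters that decode to printable text) is an arbitrary heuristic of mine, not a standard.

```python
import base64
import binascii
import re

SEPARATORS = re.compile(r"[/?=&]")
# Heuristic only: 16+ characters from the URL-safe base64 alphabet.
LOOKS_LIKE_B64 = re.compile(r"^[A-Za-z0-9_-]{16,}$")


def decode_sections(url):
    """Return the decoded text of every URL section that looks like base64."""
    results = []
    for section in SEPARATORS.split(url):
        if not LOOKS_LIKE_B64.match(section):
            continue
        # Re-add the padding that links usually strip.
        padded = section + "=" * (-len(section) % 4)
        try:
            text = base64.urlsafe_b64decode(padded).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid base64, or not text
        if text.isprintable():
            results.append(text)
    return results
```

Sections that happen to match the pattern but decode to binary junk are simply skipped, so a false positive costs nothing.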
Don’t include the non-encoded part of the data or it will corrupt the decoding. The decoder can’t tell the difference between data that’s not encoded and data that is encoded since it’s all text.
I’ve been working on a URL cleaning tool for almost 2 years now and just committed support for that type of URL. I’ll release it to crates.io shortly after Rust 1.90 on the 18th.
https://github.com/Scripter17/url-cleaner
It has 3 frontends right now: a CLI, an HTTP server and userscript to clean every URL on every webpage you visit, and a discord bot. If you want any other integration let me know and I’ll see what I can do.
Also, amusingly, you decoded the base64 wrong. You forgot to change the _ to / and thus missed the /burger-dog-mold and tracking parameter garbage at the end. I made sure to remove the tracking parameters.
Edit: Published on crates.io and github under AGPL. Sadly the discord frontend couldn’t be published to crates.io because to work around something (I forget exactly what) I changed a dependency from the one on crates.io to a more up-to-date version of it on github. Crates.io correctly rejects that kind of stuff. If you want to use the discord frontend, git clone the repository then run:
cargo build -r -p url-cleaner-discord-app
The offer to write extra frontends stands, btw. If you want a slack bot I’ll make one.
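For a feel of what removing tracking parameters involves, here is a small Python sketch. To be clear, this is not how url-cleaner does it. The parameter names below are a hypothetical list based only on the sample link, and real cleaners ship far larger rule sets.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical rules for illustration, taken from the sample link only.
TRACKING_PREFIXES = ("utm_",)
TRACKING_NAMES = {"sid", "ss", "st_rid"}


def strip_tracking(url):
    """Drop query parameters that match the tracking rules above."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in TRACKING_NAMES and not name.startswith(TRACKING_PREFIXES)
    ]
    # Rebuild the URL with only the surviving parameters.
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

If every parameter is removed, the result has no trailing `?` at all.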
It’s 3 lines of code in basically every programming language, no need for selfhosting, just open the terminal?
You know, it would be a really neat browser plug-in. Mouse over a URL and get the encoded bit decoded?
Got an example in BASH?
Edit: someone else has a link
You may want to use -n to skip the newline at the end.
You may also want to single quote the text to prevent shell expansion when doing the opposite and encoding the text.
echo -n 'my text' | base64
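To see why the trailing newline matters, here is a rough Python equivalent of the two cases (with and without -n). The stray newline changes the encoded output, so a round trip will not match what you expect.

```python
import base64

# Same text, with and without the newline that plain `echo` appends.
without_newline = base64.b64encode(b"my text").decode("ascii")
with_newline = base64.b64encode(b"my text\n").decode("ascii")

print(without_newline)  # bXkgdGV4dA==
print(with_newline)     # bXkgdGV4dAo=
```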
There’s something else going on there besides base64 encoding of the URL – possibly they have some binary tracking data or other crap that only makes sense to the creator of the link.
It’s not hard to write a small Python script that gets what you want out of a URL like that though. Here’s one that works with your sample link:
```python
#!/usr/bin/env python3
import base64
import binascii
import itertools
import string
import sys

input_url = sys.argv[1]
parts = input_url.split("/")

# Rebuild the URL right-to-left, one "/"-separated part at a time.
for chunk in itertools.accumulate(reversed(parts), lambda b, a: "/".join([a, b])):
    try:
        text = base64.b64decode(chunk).decode("ascii", errors="ignore")
        # Keep only the leading run of printable characters.
        clean = "".join(itertools.takewhile(lambda x: x in string.printable, text))
        print(clean)
    except binascii.Error:
        continue
```
Save that to a file like decode.py and then you can run it on the command line like:
python3 ./decode.py 'YOUR-LINK-HERE'
e.g.
```
$ python3 ./decode.py 'https://link.sfchronicle.com/external/41488169.38548/aHR0cHM6Ly93d3cuaG90ZG9nYmlsbHMuY29tL2hhbWJ1cmdlci1tb2xkcy9idXJnZXItZG9nLW1vbGQ_c2lkPTY4MTNkMTljYzM0ZWJjZTE4NDA1ZGVjYSZzcz1QJnN0X3JpZD1udWxsJnV0bV9zb3VyY2U9bmV3c2xldHRlciZ1dG1fbWVkaXVtPWVtYWlsJnV0bV90ZXJtPWJyaWVmaW5nJnV0bV9jYW1wYWlnbj1zZmNfYml0ZWN1cmlvdXM/6813d19cc34ebce18405decaB7ef84e41'
https://www.hotdogbills.com/hamburger-molds/burger-dog-mold
```
This script works by splitting the URL at ‘/’ characters and then recombining the parts (right-to-left) and checking if that chunk of text can be base64 decoded successfully. If it can, it then takes any printable ASCII characters at the start of the string and outputs them (to clean up the garbage characters at the end). If there’s more than one possible valid interpretation as base64 it will print them all as it finds them.
Wow, this is really helpful. Thank you!!
The encoding format of URLs is URL encoding, also known as percent-encoding. Content in the URL may first be encoded in some other format, like JSON or base64, and then additionally encoded using percent-encoding.
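A small Python sketch of that layering, assuming a made-up tracker that puts a base64 payload in a `u` query parameter. The host names and the parameter name are hypothetical. The point is only that decoding has to undo the layers in reverse order: percent-decoding first, then base64.

```python
import base64
from urllib.parse import quote, unquote, urlsplit

inner = "https://example.com/a b?x=1"

# Layer 1: base64. Layer 2: percent-encoding, so "+", "/" and "=" survive in a URL.
b64 = base64.b64encode(inner.encode("utf-8")).decode("ascii")
wrapped = "https://tracker.example/r?u=" + quote(b64, safe="")

# Undo the layers in reverse: percent-decode first, then base64-decode.
query = urlsplit(wrapped).query
recovered_b64 = unquote(query[len("u="):])
recovered = base64.b64decode(recovered_b64).decode("utf-8")

print(recovered)  # https://example.com/a b?x=1
```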
While there is a standard way to decode percent-encoding, websites are free to use base64 or JSON in URLs however they wish, so there’s not a one-size-fits-all way to decode them all. For example, the “/” character is a valid path separator in a URL and also part of the standard base64 alphabet, so to know whether it belongs to a base64-encoded blob, you might end up trying to decode several parts of the URL as base64 and checking whether the result looks like a URL: essentially brute force.
A smarter way to do this might be to maintain a mapping between your favorite sites that you want to decode and what methods they use to encode links. Then a tool could efficiently directly decode the URLs embedded in these click trackers.
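Such a mapping could be sketched roughly as below. The single entry is an assumption based only on the one sample link in this thread (payload in the fourth path segment, URL-safe base64, padding stripped), and all function names are made up for illustration.

```python
import base64
from urllib.parse import urlsplit


def decode_path_segment(index):
    """Build a decoder for sites that put base64url data in one path segment."""
    def decoder(url):
        segment = urlsplit(url).path.split("/")[index]
        padded = segment + "=" * (-len(segment) % 4)
        return base64.urlsafe_b64decode(padded).decode("utf-8")
    return decoder


# Host -> decoding method. Assumed from a single sample link, not verified.
DECODERS = {
    "link.sfchronicle.com": decode_path_segment(3),
}


def decode_tracked_link(url):
    """Return the embedded destination, or None for hosts with no known rule."""
    decoder = DECODERS.get(urlsplit(url).hostname)
    return decoder(url) if decoder else None
```

Adding support for another tracker then means adding one dictionary entry instead of guessing at every link.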
I have nothing to add except the appreciation for everyone who helped and amazement at the vastly differing ways people produced working results.
I wrote this little webapp thing some time ago. It’s not exactly what you asked for but is a good example.
All it does is base64 encode a link and adds the server url in front of it. When someone visits that link it will redirect them to the destination. The intent is to bypass simple link tracking / blocking in discord and other platforms.
There are also checks for known bad domains and an attempt to remove known tracking query parameters.
https://git.tsps-express.xyz/liliumstar/redir
Edit: I forgot to add it also blocks known crawlers (at least at time of writing) so that they can’t just follow the 302 and figure out where it goes.
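The core wrap/unwrap idea can be sketched in a few lines of Python. This is not the code from the repository above, just an illustration of the description. The server URL is a placeholder, and the domain checks, parameter stripping and crawler blocking are left out.

```python
import base64

SERVER = "https://redir.example/"  # hypothetical server URL


def wrap(destination):
    """Base64-encode the destination and put the server URL in front of it."""
    token = base64.urlsafe_b64encode(destination.encode("utf-8")).decode("ascii")
    return SERVER + token


def unwrap(link):
    """Recover the destination the server would redirect to."""
    if not link.startswith(SERVER):
        raise ValueError("not a link for this server")
    token = link[len(SERVER):]
    return base64.urlsafe_b64decode(token).decode("utf-8")
```

The URL-safe alphabet is used here so the token never contains a `/` that could be mistaken for a path separator.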
CorentinTh/it-tools does that and a lot more
https://addons.mozilla.org/en-US/firefox/addon/redirect-bypasser-webextension/ in desktop Firefox seems to work for your link. For mobile there might be apps that you share the link to and they dissect it, but a very quick search didn’t turn up anything.