Is there a self-hosted project that does url decoding in a privacy respecting fashion?

ReedReads@lemmy.zip · edit-2 4 months ago

Is there a self-hosted project that does url decoding in a privacy respecting fashion?

hendrik@palaver.p3x.de · 4 months ago

There’s base64 -d on the command line.

ReedReads@lemmy.zip · 4 months ago

base64 -d

Right but the / in the url trips it up and I’d like to just copy/paste the full url and have it spit out the proper, decoded link.

ExFed@programming.dev · edit-2 4 months ago

The / character isn’t a part of the base64 encoding. In fact, only one part of the URL looks like base64. No plain base64 tool (whether via CLI, self-hosted, or otherwise) will be able to decode an entire URL like that. You’ll first need to parse the URL to isolate the base64 part. This is literally solved with a single line of bash:

echo "https://link.sfchronicle.com/external/41488169.38548/aHR0cHM6Ly93d3cuaG90ZG9nYmlsbHMuY29tL2hhbWJ1cmdlci1tb2xkcy9idXJnZXItZG9nLW1vbGQ_c2lkPTY4MTNkMTljYzM0ZWJjZTE4NDA1ZGVjYSZzcz1QJnN0X3JpZD1udWxsJnV0bV9zb3VyY2U9bmV3c2xldHRlciZ1dG1fbWVkaXVtPWVtYWlsJnV0bV90ZXJtPWJyaWVmaW5nJnV0bV9jYW1wYWlnbj1zZmNfYml0ZWN1cmlvdXM/6813d19cc34ebce18405decaB7ef84e41" | cut -d/ -f6 | base64 -d

See TIO for example.

edit: add TIO link

ReedReads@lemmy.zip · 4 months ago

Thank you for this
You know more than I do re: bash. Where can I learn what | cut -d/ -f6 | means? I assume the cut is the parsing? But maybe that is wrong? Would love to learn how to learn this.

Lucy :3@feddit.org · edit-2 4 months ago

cut --help and man cut can teach you more than anyone here.

But: “|” takes the output of the former command, and uses it as input for the latter. So it’s like copying the output of “echo […]”, executing “cut -d ‘/’ -f 6”, and pasting it into that. Then copy the output of “cut”, execute “base64 -d” and paste it there. Except the pipe (“|”) automates that on one line.

And yes, cut takes a string (so a list of characters, for example the URL) and splits it at what -d specifies (e.g. cut -d ‘/’ splits at “/”), so it now internally has a list of strings: “https:”, “”, “link.sfchronicle.com”, “external”, “41488169.38548”, “aHR0cHM6Ly93d3cuaG90ZG9nYmlsbHMuY29tL2hhbWJ1cmdlci1tb2xkcy9idXJnZXItZG9nLW1vbGQ_c2lkPTY4MTNkMTljYzM0ZWJjZTE4NDA1ZGVjYSZzcz1QJnN0X3JpZD1udWxsJnV0bV9zb3VyY2U9bmV3c2xldHRlciZ1dG1fbWVkaXVtPWVtYWlsJnV0bV90ZXJtPWJyaWVmaW5nJnV0bV9jYW1wYWlnbj1zZmNfYml0ZWN1cmlvdXM” and “6813d19cc34ebce18405decaB7ef84e41”. From that list it outputs whatever is specified by -f: -f 6 means the 6th of those strings, -f 2-3 means the 2nd to 3rd strings, -f -5 means everything up to and including the fifth, and -f 3- means the third and everything after it.

But all of that is explained better in the manpage (man cut). And the best way to learn is to just fuck around. So echo "t es t str i n g, 1" | cut ... and try various arguments.
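For anyone who reads Python more easily than shell, the field splitting described above can be approximated like this. It is only a rough sketch: the `cut` helper below is a made-up stand-in, not the real tool, and it ignores details such as how the real cut treats lines that do not contain the delimiter.

```python
def cut(text, delimiter, first, last=None):
    """Rough stand-in for `cut -d<delimiter> -f<first>[-<last>]` on one line.

    Field numbers are 1-based, as in cut.
    """
    fields = text.split(delimiter)
    if last is None:
        last = first
    return delimiter.join(fields[first - 1:last])


sample = "t es t str i n g, 1"
print(cut(sample, " ", 4))     # the 4th field: str
print(cut(sample, " ", 2, 3))  # fields 2 through 3: es t
print(cut("https://link.sfchronicle.com/external", "/", 3))  # link.sfchronicle.com
```

Note that the empty string between the two slashes of “https://” counts as a field, which is why the host name is field 3 and not field 2.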

krnl386@lemmy.ca · edit-2 4 months ago

Try explainshell.com - you can paste in any oneliner and the site will parse it and explain each part.

Here’s the link

Enoril@jlai.lu · 4 months ago

Really nice! Thanks for sharing this

feedorimid@lemmynsfw.com · edit-2 4 months ago

Cut into fields based on the delimiter (/ in this case). The “-f6” selects which field you want.

ccryx [he/him]@discuss.tchncs.de · edit-2 4 months ago

You can use man <command> (in this case man cut) to read a program’s manual page. Appending --help (without any other arguments) will often produce at least a short description of the program and list the available options.

hendrik@palaver.p3x.de · edit-2 4 months ago

~~Well, the URL is a bit weird.~~

echo "aHR0cHM6Ly93d3cuaG90ZG9nYmlsbHMuY29tL2hhbWJ1cmdlci1tb2xkcy9idXJnZXItZG9nLW1vbGQ" | base64 -d

gives me “https://www.hotdogbills.com/hamburger-molds/burger-dog-mold”. (Without the ‘s’.) And then there are about 176 characters left. I suppose the underscore is some delimiter. The rest is:

echo "c2lkPTY4MTNkMTljYzM0ZWJjZTE4NDA1ZGVjYSZzcz1QJnN0X3JpZD1udWxsJnV0bV9zb3VyY2U9bmV3c2xldHRlciZ1dG1fbWVkaXVtPWVtYWlsJnV0bV90ZXJtPWJyaWVmaW5nJnV0bV9jYW1wYWlnbj1zZmNfYml0ZWN1cmlvdXM" | base64 -d

“sid=6813d19cc34ebce18405deca&ss=P&st_rid=null&utm_source=newsletter&utm_medium=email&utm_term=briefing&utm_campaign=sfc_bitecurious”

And I suppose the stuff after the last slash is there for some other reason, tracking or some hash or whatever. But the things before that are the URL and the parameters.

But the question remains whether we have some kind of tool to do this automatically and make it a bit easier…

ReedReads@lemmy.zip · 4 months ago

I really appreciate all of the time and effort you spent on this url. You’re right, the url is weird, which is why I thought it was a good example.

But the question remains whether we have some kind of tool to do this automatically and make it a bit easier…

But you nailed it with this last sentence. Especially when one is on mobile.

Thanks for replying again.

hendrik@palaver.p3x.de · edit-2 4 months ago

I know. Guess I mainly wanted to say your given solution isn’t the entire story and the potential tool should decode the parameters as well; they might or might not be important. I’m often at the computer and I regularly do one-off tasks this way… But I’m aware it might not be a one-off task to you and you might not have a Linux terminal open 24/7 either 😉 Hope some of the other people have what you need. And btw… since I clicked on a few of the suggestions: I think the thing called URL encoding is something different. That’s the one with all the percent signs, not base64 like here.
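To make that last distinction concrete, here is a small Python sketch contrasting percent-encoding (“URL encoding”) with base64. The example.com address is a placeholder, not something from this thread.

```python
import base64
from urllib.parse import quote, unquote

original = "https://example.com/some page?q=burger dog"  # placeholder URL

# URL (percent) encoding: reserved or unsafe characters become %XX escapes.
percent_encoded = quote(original, safe="")

# base64: the whole byte string is rewritten in a 64-character alphabet.
base64_encoded = base64.b64encode(original.encode("utf-8")).decode("ascii")

print(percent_encoded)
print(base64_encoded)

# Each encoding has its own decoder; one does not undo the other.
print(unquote(percent_encoded) == original)
print(base64.b64decode(base64_encoded).decode("utf-8") == original)
```

A tool that advertises “URL decoding” usually means the first kind, which would explain why it does nothing useful with the link from the post.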

Snot Flickerman@lemmy.blahaj.zone · edit-2 4 months ago

This is the internet, maybe build it yourself instead of demanding others do the work for you?

You could also just as easily only paste in the encoded part and put the decoded bit back into the link yourself.

ReedReads@lemmy.zip · 4 months ago

No one is demanding anything. I’m simply stating my preferred solution, which would work on both mobile and desktop, and asking if anyone knows if that solution or something similar already exists.

Nothing suggested so far will properly decode the link that I’ve included above.

But there is no reason to build something duplicative if a solution is already out there. Hence, the post.

GreenKnight23@lemmy.world · 4 months ago

you strike me as a competent developer but you lack the experience with Linux.

install xclip, then copy the base64 part of your URL and use the following command.

xclip -o -selection clipboard | base64 -d

there’s probably a better way but I’m just remembering off the top of my head.

could probably pipe it into something that would spit it out with each param on new lines but you’ll need to google that.

carl_dungeon@lemmy.world · 4 months ago

Just put it in quotes?

FreedomAdvocate@lemmy.net.au · edit-2 4 months ago

That url isn’t base64 encoded. You can tell by the fact that it’s still a URL, and doesn’t decode……

Possibly linux@lemmy.zip · 4 months ago

…that one little detail everyone missed

ick@infosec.pub · 4 months ago

https://wikipedia.org/wiki/XY_problem

Hawk@lemmy.dbzer0.com · 4 months ago

There is no such thing as a base64 encoded URL. Part of a URL might hold base64 encoded data, but never the URL itself.

These online tools aren’t working because you’re using them wrong.

JackbyDev@programming.dev · 4 months ago

deleted by creator

percent@infosec.pub · 4 months ago

Fun fact: Base64url is not quite the same as base64. Its alphabet is slightly different from base64 so its characters can be used in more places (URLs, filenames, etc.).

I suppose the tool’s name is more clear for those who are aware of those differences, but very unclear for others.

https://datatracker.ietf.org/doc/html/rfc4648#section-5
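A quick way to see the difference, sketched in Python with only the standard library. The three bytes are chosen purely because they hit the two characters that differ between the alphabets, and `bGQ_` is a four-character slice taken from the link in the original post.

```python
import base64

raw = bytes([0xFB, 0xFF, 0xFE])

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")

print(standard)  # +//+
print(url_safe)  # -__-

# In base64url, '_' plays the role that '/' plays in standard base64.
quad = "bGQ_"
print(base64.urlsafe_b64decode(quad))  # b'ld?'
```

So the underscore in that link is not a separator: it is part of the encoded data and helps produce the “?” that starts the query string.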

JackbyDev@programming.dev · edit-2 4 months ago

RAHHHHH this is embarrassing. You’re totally right. I’m wrong.

Ada Lovelace, forgive me!

countzukula@lemmy.world · 4 months ago

Cyberchef!

hypna@lemmy.world · 4 months ago

Cyberchef does this and so so much more https://github.com/gchq/CyberChef

qwerty@discuss.tchncs.de · 4 months ago

I was about to install it on my server until I found out that it’s developed by the UK government. Now I won’t trust it even though it’s open source.

lepinkainen@lemmy.world · 4 months ago

This is the only correct answer

Mike Wooskey@lemmy.thewooskeys.com · 4 months ago

IT-Tools is kind of fun: a web page full of common tools, converters, references, cheat sheets, etc.

oeLLph [ɛlf]@feddit.org · 4 months ago

Came here to type ittools but @mike_wooskey@lemmy.thewooskeys.com was here first

borth@sh.itjust.works · 4 months ago

This is amazing, thank you!

clif@lemmy.world · 4 months ago

Came to post this, glad to see it’s already here.

Nice little utility tool box that does a ton of helpful stuff in a small package. Super easy to self host and container images easily available.

e0qdk@reddthat.com · 4 months ago

There’s something else going on there besides base64 encoding of the URL – possibly they have some binary tracking data or other crap that only makes sense to the creator of the link.

It’s not hard to write a small Python script that gets what you want out of a URL like that though. Here’s one that works with your sample link:

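#!/usr/bin/env python3

import base64
import binascii
import itertools
import string
import sys

input_url = sys.argv[1]
parts = input_url.split("/")

# Build candidate chunks from the right: the last part, then the last two
# parts joined by "/", then the last three, and so on.
for chunk in itertools.accumulate(reversed(parts), lambda b, a: "/".join([a, b])):
    try:
        # By default b64decode skips characters outside the standard
        # base64 alphabet (such as "_") before decoding.
        text = base64.b64decode(chunk).decode("ascii", errors="ignore")
        # Keep only the leading run of printable characters.
        clean = "".join(itertools.takewhile(lambda x: x in string.printable, text))
        print(clean)
    except binascii.Error:
        # Not decodable as base64; try the next, longer chunk.
        continue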

Save that to a file like decode.py and then you can run it on the command line like python3 ./decode.py 'YOUR-LINK-HERE'

e.g.

$ python3 ./decode.py 'https://link.sfchronicle.com/external/41488169.38548/aHR0cHM6Ly93d3cuaG90ZG9nYmlsbHMuY29tL2hhbWJ1cmdlci1tb2xkcy9idXJnZXItZG9nLW1vbGQ_c2lkPTY4MTNkMTljYzM0ZWJjZTE4NDA1ZGVjYSZzcz1QJnN0X3JpZD1udWxsJnV0bV9zb3VyY2U9bmV3c2xldHRlciZ1dG1fbWVkaXVtPWVtYWlsJnV0bV90ZXJtPWJyaWVmaW5nJnV0bV9jYW1wYWlnbj1zZmNfYml0ZWN1cmlvdXM/6813d19cc34ebce18405decaB7ef84e41'
https://www.hotdogbills.com/hamburger-molds/burger-dog-mold

This script works by splitting the URL at ‘/’ characters and then recombining the parts (right-to-left) and checking if that chunk of text can be base64 decoded successfully. If it can, it then takes any printable ASCII characters at the start of the string and outputs them (to clean up the garbage characters at the end). If there’s more than one possible valid interpretation as base64 it will print them all as it finds them.
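Since the encoded segment uses the URL-safe alphabet (`_` standing in for `/`), a variant that decodes each path segment with `urlsafe_b64decode` recovers the query string too. This is only a sketch, not a replacement for the script above: `decode_base64url` and `find_embedded_urls` are names invented for illustration, and keeping a result only if it starts with http:// or https:// is an assumed heuristic.

```python
import base64


def decode_base64url(token):
    """Decode unpadded base64url text into a UTF-8 string."""
    padded = token + "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8")


def find_embedded_urls(url):
    """Return every path segment that decodes to something URL-like."""
    found = []
    for segment in url.split("/"):
        try:
            text = decode_base64url(segment)
        except ValueError:
            # Covers binascii.Error and UnicodeDecodeError alike.
            continue
        if text.startswith(("http://", "https://")):
            found.append(text)
    return found


url = (
    "https://link.sfchronicle.com/external/41488169.38548/"
    "aHR0cHM6Ly93d3cuaG90ZG9nYmlsbHMuY29tL2hhbWJ1cmdlci1tb2xkcy9idXJnZXItZG9nLW1vbGQ_"
    "c2lkPTY4MTNkMTljYzM0ZWJjZTE4NDA1ZGVjYSZzcz1QJnN0X3JpZD1udWxsJnV0bV9zb3VyY2U9bmV3c2xldHRlciZ1dG1fbWVkaXVtPWVtYWlsJnV0bV90ZXJtPWJyaWVmaW5nJnV0bV9jYW1wYWlnbj1zZmNfYml0ZWN1cmlvdXM"
    "/6813d19cc34ebce18405decaB7ef84e41"
)
found = find_embedded_urls(url)
for item in found:
    print(item)
```

Decoding segment by segment avoids feeding the trailing identifier into the decoder, at the cost of missing payloads that themselves contain a `/`.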

ReedReads@lemmy.zip · 4 months ago

Wow, this is really helpful. Thank you!!

irotsoma@lemmy.blahaj.zone · 4 months ago

Don’t include the non-encoded part of the data or it will corrupt the decoding. The decoder can’t tell the difference between data that’s not encoded and data that is encoded since it’s all text.

𝕸𝖔𝖘𝖘@infosec.pub · 4 months ago

Just take the base64 bit of the url. The whole url isn’t a base64, so it decoded to garbage.

The base64 bit decodes just fine.

Finadil@lemmy.world · edit-2 4 months ago

I mean… It’s decoding into garbage because you’re feeding it more than just the base64 section. I suppose if you’re already running nginx or something you could easily make a page that uses javascript to break the link down (possibly using /, ?, = as separators) and decode sections that look like base64. If you make it javascript and client side there’s not really any privacy concerns.

EDIT: Oops. My Lemmy client didn’t load the other replies at first, I didn’t realize you already had plenty of other options.

amzd@lemmy.world · 4 months ago

It’s 3 lines of code in basically every programming language, no need for selfhosting, just open the terminal?

rumba@lemmy.zip · 4 months ago

You know, it would be a really neat browser plug-in. Mouse over a URL and get the encoded bit decoded?

SayCyberOnceMore@feddit.uk · edit-2 4 months ago

~~Got an example in BASH?~~

Edit: someone else has a link

Blemish5236@lemmy.world · 4 months ago

deleted by creator

bitwolf@sh.itjust.works · 4 months ago

You may want to use -n to skip the newline at the end.

You may also want to single quote the text to negate expansion when doing the opposite and encoding the text.

echo -n 'my text' | base64
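The effect of the trailing newline is easy to check. A small Python sketch of what the pipeline ends up encoding with and without -n:

```python
import base64

# What `echo -n 'my text' | base64` encodes.
without_newline = base64.b64encode(b"my text").decode("ascii")

# What plain `echo 'my text' | base64` encodes: the text plus "\n".
with_newline = base64.b64encode(b"my text\n").decode("ascii")

print(without_newline)  # bXkgdGV4dA==
print(with_newline)     # bXkgdGV4dAo=
```

The two results differ only near the end, which makes the stray newline easy to overlook.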

Onno (VK6FLAB)@lemmy.radio · 4 months ago

Use a bash command line:

https://stackoverflow.com/questions/6250698/how-to-decode-url-encoded-string-in-shell

Scripter17@lemmy.world · edit-2 4 months ago

I’ve been working on a URL cleaning tool for almost 2 years now and just committed support for that type of URL. I’ll release it to crates.io shortly after Rust 1.90 on the 18th.

https://github.com/Scripter17/url-cleaner

It has 3 frontends right now: a CLI, an HTTP server and userscript to clean every URL on every webpage you visit, and a discord bot. If you want any other integration let me know and I’ll see what I can do.

Also, amusingly, you decoded the base64 wrong. You forgot to change the _ to / and thus missed the /burger-dog-mold and tracking parameter garbage at the end. I made sure to remove the tracking parameters.

Edit: Published on crates.io and github under AGPL. Sadly the discord frontend couldn’t be published to crates.io because to work around something (I forget exactly what) I changed a dependency from the one on crates.io to a more up-to-date version of it on github. Crates.io correctly rejects that kind of stuff. If you want to use the discord frontend, git clone the repository then run cargo build -r -p url-cleaner-discord-app.

The offer to write extra frontends stands, btw. If you want a slack bot I’ll make one.
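On the `_` to `/` point: once the segment is decoded as base64url, the tracking parameters can be dropped with the standard library. This is a rough Python sketch and not a description of how url-cleaner works; treating every `utm_`-prefixed parameter as tracking is an assumption made here, and `strip_tracking` is a made-up name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: which parameter names count as tracking.
TRACKING_PREFIXES = ("utm_",)


def strip_tracking(url):
    """Return the URL without query parameters that match TRACKING_PREFIXES."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not name.startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# The fully decoded link from the original post.
decoded = (
    "https://www.hotdogbills.com/hamburger-molds/burger-dog-mold"
    "?sid=6813d19cc34ebce18405deca&ss=P&st_rid=null"
    "&utm_source=newsletter&utm_medium=email"
    "&utm_term=briefing&utm_campaign=sfc_bitecurious"
)
cleaned = strip_tracking(decoded)
print(cleaned)
```

The remaining sid, ss and st_rid parameters may well be tracking too; a stricter prefix list, or dropping the query string entirely, would remove them.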

masterofn001@lemmy.ca · 4 months ago

I have nothing to add except the appreciation for everyone who helped and amazement at the vastly differing ways people produced working results.

liliumstar@lemmy.dbzer0.com · edit-2 4 months ago

I wrote this little webapp thing some time ago. It’s not exactly what you asked for but is a good example.

All it does is base64-encode a link and add the server URL in front of it. When someone visits that link it will redirect them to the destination. The intent is to bypass simple link tracking / blocking in discord and other platforms.

There are also checks for known bad domains and an attempt to remove known tracking query parameters.

https://git.tsps-express.xyz/liliumstar/redir

Edit: I forgot to add it also blocks known crawlers (at least at time of writing) so that they can’t just follow the 302 and figure out where it goes.
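The encode-and-prefix idea can be sketched in a few lines of Python. This is not the code from the repository above: the server address is a placeholder, the function names are invented, and using the URL-safe alphabet without padding is an assumption.

```python
import base64

SERVER = "https://redir.example/"  # placeholder address


def wrap(link):
    """Encode a link and put the server URL in front of it."""
    token = base64.urlsafe_b64encode(link.encode("utf-8")).decode("ascii")
    return SERVER + token.rstrip("=")


def unwrap(wrapped):
    """Recover the destination a redirect server would send visitors to."""
    token = wrapped[len(SERVER):]
    padded = token + "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8")


link = "https://www.hotdogbills.com/hamburger-molds/burger-dog-mold"
wrapped = wrap(link)
print(wrapped)
print(unwrap(wrapped))
```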

sneakyninjapants@sh.itjust.works · 4 months ago

CorentinTh/it-tools does that and a lot more