No problem. I think this is a great “final boss” question for learning sed, because it turns out to be deceptively hard!! You have to understand a lot not only about regex, but also about sed to get it right. I learned a lot about sed just by tackling this problem!
I really do not want to mess around with your regex
It is very delicate for sure, but one part you can definitely change is the `# Add hyphens` part. In the regex you can see `(%20|\.)`. This is a list of “characters” that get converted to hyphens. For example, you could modify it to `(%20|\.|\+)` and it will convert `+`s to `-`s as well!
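As a rough sketch of that tweak, the hyphen-adding loop (`:h` ... `th`) can be tried on its own against the tail of a single link. The sample input and variable names are made up for illustration, and this assumes GNU sed.

```shell
# Hypothetical sample: the tail of a link, as the hyphen loop sees it.
sample='readme.md#Other+Page.Name%20Here)'

# Same hyphen loop as in the script, with \+ added to the alternation.
# Only characters after the first # and before the closing ) are touched.
hyphenated=$(printf '%s\n' "$sample" |
  sed -E ':h;s/^([^#]*#[^)]*)(%20|\.|\+)([^)]*\))/\1-\3/;th')

printf '%s\n' "$hyphenated"
# prints: readme.md#Other-Page-Name-Here)
```

Note that the `.` in `readme.md` survives, because the pattern only looks past the first `#`.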
Still, it is not perfect. It can be tripped up by input such as
\\\\\[LINK](#LINK)
or
[
]\\\\](
But for a sed-only solution, this is about as good as it will get, I’m afraid.
Overall I’m very happy with it. Someday I would like to make a video that goes into depth about sed, since it is tricky to learn just from the docs.
I did it!! It also handles the case where an external link and internal link are on the same line :D
sed -E ':l;s/(\[[^]]*\]\()([^)#]*#[^)]*\))/\1\n\2/;Te;H;g;s/\n//;s/\n.*//;x;s/.*\n//;/^https?:/!{:h;s/^([^#]*#[^)]*)(%20|\.)([^)]*\))/\1-\3/;th;s/(#[^)]*\))/\L\1/;};tl;:e;H;z;x;s/\n//;'
Here is my annotated file
# Begin loop
:l;
# Bisect first link in pattern space into pattern space and append to hold space
# Example: `text [label](file#fragment)'
# Pattern space: `file#fragment)'
# Hold space: `text [label]('
# Steps:
# 1. Strategically insert \n
# 1a. If this fails, branch out
# 2. Append to hold space (this creates two \n's. It feels weird for the
# first iteration, but that's ok)
# 3. Copy hold space to pattern space, remove first \n, then trim off
# everything past the second \n
# 4. Swap pattern/hold, and trim off everything up to and incl the last \n
s/(\[[^]]*\]\()([^)#]*#[^)]*\))/\1\n\2/;
Te;
H;
g; s/\n//; s/\n.*//;
x; s/.*\n//;
# Modify only if it is an internal link
/^https?:/! {
# Add hyphens
:h;
s/^([^#]*#[^)]*)(%20|\.)([^)]*\))/\1-\3/;
th;
# Make lowercase
s/(#[^)]*\))/\L\1/;
};
# "conditional" branch so it checks the next conditional again
tl;
# Exit: join pattern space to hold space, then move to pattern space.
# Since the loop uses H instead of h, have to make sure hold space is empty
:e;
H;
z;
x; s/\n//;
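To try it out, here is a small sketch that wraps the one-liner in a shell function and feeds it a line containing both an external and an internal link. The function name `fix_links` and the sample line are made up for illustration, and this assumes GNU sed (the script relies on `T`, `z`, and `\L`).

```shell
# fix_links is a hypothetical wrapper name; the sed program is the one-liner above.
fix_links() {
  sed -E ':l;s/(\[[^]]*\]\()([^)#]*#[^)]*\))/\1\n\2/;Te;H;g;s/\n//;s/\n.*//;x;s/.*\n//;/^https?:/!{:h;s/^([^#]*#[^)]*)(%20|\.)([^)]*\))/\1-\3/;th;s/(#[^)]*\))/\L\1/;};tl;:e;H;z;x;s/\n//;'
}

# Made-up input: external link first, internal link second, same line.
mixed=$(printf '%s\n' '[x](https://example.com/#Some%20Link) and [y](readme.md#Other%20Page)' | fix_links)

printf '%s\n' "$mixed"
# prints: [x](https://example.com/#Some%20Link) and [y](readme.md#other-page)
```

The external link is left alone, while only the fragment of the internal link is hyphenated and lowercased.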
I’ll give another go at it :)
Why you assume there’s only one link in the line?
They did not want external (http) links to be modified as that would break it:
[Example](https://example.com/#Some%20Link)
[Example](https://example.com/#some-link)
I compromised by thinking that it might be unlikely enough to have an external http link AND an internal link within the same line. You could probably still do it; my first thought was `[^h][^t][^t][^p]`, but that would cause issues for `#ttp` and `#A`, so I just gave up. Instead I think you’d want a different approach, like breaking each link onto its own line, doing the same external/internal check before the substitution, and joining the lines afterward.
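A quick sketch of why the negated-class idea falls over, using `grep -E` to count matching lines. The sample links and variable names are made up for illustration.

```shell
# Naive "not http" test: four negated character classes right after "(".
naive='\([^h][^t][^t][^p]'

# Internal link whose fragment starts with "ttp": rejected, because the
# second character is a "t".
count_ttp=$(printf '%s\n' '[a](#ttp-notes)' | grep -cE "$naive" || true)

# Very short internal link: rejected, because fewer than four characters
# follow the "(".
count_short=$(printf '%s\n' '[b](#A)' | grep -cE "$naive" || true)

# An ordinary internal link is accepted, which is what makes the idea tempting.
count_plain=$(printf '%s\n' '[d](#Links.md)' | grep -cE "$naive" || true)

printf '%s %s %s\n' "$count_ttp" "$count_short" "$count_plain"
# prints: 0 0 1
```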
Also, you perform substitutions in the whole URL instead of the fragment component
That requirement i missed. I just assumed the filename would be replaced the same way too Lol. Not too hard to fix tho :)
Annotated, it works like this:
# Use a loop to iteratively replace %20 with -, since a plain s/%20/-/g would replace too much. Loop until no more substitutions can be made.
# Label for looping
:loop;
# Capture each part of the link and join the parts with -.
# The address skips this substitution if the line contains an http link in markdown format.
/\[[^]]*\](http/! s/\(\[[^]]*\]\)\(([^)]*\)%20\([^)]*)\)/\1\2-\3/g;
# If the substitution made a change, loop again; otherwise fall through
t loop;
# Convert the whole link target to lowercase, again only if the line doesn't contain an http link.
# This is outside the loop rather than part of the s command above, because a link with no %20 at all would otherwise never be lowercased.
/\[[^]]*\](http/! s/\(\[[^]]*\]\)\(([^)]*)\)/\1\L\2/g
This is very close
sed ':loop;/\[[^]]*\](http/! s/\(\[[^]]*\]\)\(([^)]*\)%20\([^)]*)\)/\1\2-\3/g;t loop;/\[[^]]*\](http/! s/\(\[[^]]*\]\)\(([^)]*)\)/\1\L\2/g'
example file
[Some text](#Header%20Linking%20MARKDOWN.md)
(#Should%20stay%20as%20is.md)
Text surrounding [a link](readme.md#Other%20Page). Cool
Multiple [links](#Links.md) in (%20) [a](#An%20A.md) SINGLE [line](#Lines.md)
Do [NOT](https://example.com/URL%20Should%20Be%20Untouched.html) CHANGE%20 [hyperlinks](http://example.com/No%20Touchy.html)
but it doesn’t work if you have an http link and a markdown link in the same line, and it doesn’t work with `[escaped \] square brackets](#and-escaped-\)-parenthesis)` in the link
but!! it was fun!
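The same-line limitation can be reproduced with a small sketch. The wrapper name `old_fix_links` and the second sample line are made up for illustration; this assumes GNU sed.

```shell
# old_fix_links is a hypothetical wrapper name around the one-liner above.
old_fix_links() {
  sed ':loop;/\[[^]]*\](http/! s/\(\[[^]]*\]\)\(([^)]*\)%20\([^)]*)\)/\1\2-\3/g;t loop;/\[[^]]*\](http/! s/\(\[[^]]*\]\)\(([^)]*)\)/\1\L\2/g'
}

# A line with only an internal link is rewritten (filename included).
internal_only=$(printf '%s\n' 'Text surrounding [a link](readme.md#Other%20Page). Cool' | old_fix_links)

# Made-up line mixing an http link and an internal link: the address
# matches, so both substitutions are skipped and nothing changes.
mixed_line=$(printf '%s\n' '[x](https://example.com/A%20B) and [y](#An%20A.md)' | old_fix_links)

printf '%s\n' "$internal_only" "$mixed_line"
# prints:
# Text surrounding [a link](readme.md#other-page). Cool
# [x](https://example.com/A%20B) and [y](#An%20A.md)
```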
a different comment was saying ricing has a sense of being overdone. So with this I was thinking of “overtuning.” I think it fits more as a hobbyist term than a pragmatic one.
Arch is the only person who has been in my house for the last week and i have no clue how he is going about it and he has no clue how it is affecting him or how he feels and how it is affected me
Mine also starts off the exact same way?? I’m pressing the middle option
Women are not allowed in this world anymore because of their own personal preferences or the way their body and body is designed and made and made and they have no choice to make decisions
but right here it takes a different path:
that make it a choice to do it and that makes them a bad person to do so they have no right of way of life or the choice that is not their right of way and that they are entitled and have to choose their choice to choose what to choose to choose to live with that choice is a right that is theirs and it’s a choice and not yours
My reasons were more hardware related. When I was a bit younger my parents gave me a netbook which had 32 GB of storage, and Windows used almost all of it. I wanted to do creative projects in my free time, but I couldn’t install programs or save any of my work. I would often restart to clear log files and gain a bit more working storage, which was extremely annoying because it took like 5 mins for the computer to finally settle down and be usable.
I eventually got a 32 GB flash drive which helped a lot, but it was not enough. With 4 GB of RAM I could only have about 3 browser tabs open, and not all the programs I wanted could be run off the flash drive. It was still resource management hell.
Somehow, some way, I learned about Linux. I got a 128GB microSD, put Mint on it. It truly set me free. I could install the software I wanted, I could make the things I wanted to make, I could open more programs at once, and I could do it all without unbearable lag. I never looked back since.
If you’d like to learn how to speedrun a niche puzzle game, check this one out :)
I haven’t written all the tutorial posts I’ve wanted to yet, so stay tuned.
There’s some unexplored territory I haven’t explained for myself, like the connection to graph theory (i dont have any foundational knowledge for graph theory so maybe someone smarter than me can help ;) i figure it would help formalize some proofs)
Feel free to share your progress!
fish. I think it has most things i want out of the box, so it should be simpler and snappier than my zsh setup. it’s just that zsh hasnt bothered me enough to try it yet.
also nushell, im interested in the idea of manipulating structured data instead of unstructured text
And make sure the time is synced to the cloud so they need internet connection, and so the player can’t be sneaky and reload the game to reset the timer if they pressed x too many times
This reminds me of my ex gf 😅 not only does she enjoy “kid” shows and movies, but HER NAME IS ANDY TOO. That image would definitely have dealt some damage. For us though we broke off on good terms. Right person, wrong time, wrong place :(
If they aren’t equal, there should be a number in between that separates them. Between 0.1 and 0.2 i can come up with 0.15. Between 0.1 and 0.15 is 0.125. You can keep going, but if the numbers are equal, there is nothing in between. There’s no gap between 0.1 and 0.1, so they are equal.
What number comes between 0.999… and 1?
(I used to think it was imprecise representations too, but this is how it made sense to me :)
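The “nothing in between” idea can be written down roughly like this. It is a sketch rather than a rigorous proof, and the symbols d (the supposed gap) and n (the number of nines in a truncation) are introduced here just for illustration.

```latex
\[
1 - \underbrace{0.99\ldots9}_{n\ \text{nines}} = 10^{-n}
\quad\Longrightarrow\quad
0 \le d = 1 - 0.999\ldots \le 10^{-n}\ \text{for every } n
\quad\Longrightarrow\quad
d = 0 .
\]
```

Any positive gap would have to be smaller than every power of one tenth, and no positive number is.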
Imagine they have an internal tool to check if the hash exists in their database, something like
"SELECT user FROM downloads WHERE hash = '" + hash + "';"
You set the pdf hash to be 1'; DROP TABLE books;--
they scan it, and it effectively deletes their entire business lmfaoo.
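To see why that would work against such a tool, here is a sketch that only builds and prints the query string. Nothing is executed against a database, and the variable names are made up for illustration.

```shell
# The malicious "hash" value from the comment above.
pdf_hash="1'; DROP TABLE books;--"

# Naive string concatenation, mirroring the imagined internal tool.
query="SELECT user FROM downloads WHERE hash = '${pdf_hash}';"

printf '%s\n' "$query"
# prints: SELECT user FROM downloads WHERE hash = '1'; DROP TABLE books;--';
```

The quote in the hash closes the string early, so the rest is read as a second statement. Parameterized queries avoid this, because the value is never spliced into the SQL text.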
Another idea might be to duplicate the PDF many times and insert bogus metadata for each. Then submit requests saying that you found an illegal distribution of the PDF. If their process isn’t automated it would waste a lot of time on their part to find the culprit Lol
I think it’s more interesting to think of how to weaponize their own hash rather than deleting it
this might not be what you meant, but the word “tar” made me think of tar.gz. Don’t most sites compress the HTTP response body with gzip? What’s to stop you from sending a zip bomb over the network?