Need to let loose a primal scream without collecting footnotes first? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid: Welcome to the Stubsack, your first port of call for learning fresh Awful youāll near-instantly regret.
Any awful.systems sub may be subsneered in this subthread, techtakes or no.
If your sneer seems higher quality than you thought, feel free to cutānāpaste it into its own post ā thereās no quota for posting and the bar really isnāt that high.
The post Xitter web has spawned soo many āesotericā right wing freaks, but thereās no appropriate sneer-space for them. Iām talking redscare-ish, reality challenged āculture criticsā who write about everything but understand nothing. Iām talking about reply-guys who make the same 6 tweets about the same 3 subjects. Theyāre inescapable at this point, yet I donāt see them mocked (as much as they should be)
Like, there was one dude a while back who insisted that women couldnāt be surgeons because they didnāt believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I canāt escape them, I would love to sneer at them.
(Credit and/or blame to David Gerard for starting this.)
New article from Axos: Publishers facing existential threat from AI, Cloudflare CEO says
Baldur Bjarnason has given his commentary:
Anyways, personal sidenote/prediction: I suspect the Internet Archiveās gonna have a much harder time archiving blogs/websites going forward.
Up until this point, the Archive enjoyed easy access to large swathes of the 'Net - site owners had no real incentive to block new crawlers by default, but the prospect of getting onto search results gave them a strong incentive to actively welcome search engine robots, safe in the knowledge that theyād respect robots.txt and keep their server load to a minimum.
Thanks to the AI bubble and the AI crawlers its unleashed upon the 'Net, that has changed significantly.
Now, allowing crawlers by default risks AI scraper bots descending upon your website and stealing everything that isnāt nailed down, overloading your servers and attacking FOSS work in the process. And you can forget about reigning them in with robots.txt - theyāll just ignore it and steal anyways, theyāll lie about who they are, theyāll spam new scrapers when you block the old ones, theyāll threaten to exclude you from search results, theyāll try every dirty trick they can because these fucks feel entitled to steal your work and fundamentally do not respect you as a person.
Add in the fact that the main upside of allowing crawlers (turning up in search results) has been completely undermined by those very same AI corps, as āAI summariesā (like Googleās) steal your traffic through stealing your work, and blocking all robots by default becomes the rational decision to make.
This all kinda goes without saying, but this change in Internet culture all-but guarantees the Archive gets caught in the crossfire, crippling its efforts to preserve the web as site owners and bloggers alike treat any and all scrapers as guilty (of AI fuckery) until proven innocent, and the web becomes less open as a whole as people protect themselves from the AI robber barons.
On a wider front, I expect this will cripple any future attempts at making new search engines, too. In addition to AI making it piss-easy to spam search systems with SEO slop, any new start-ups in web search will struggle with quality websites blocking their crawlers by default, whilst slop and garbage will actively welcome their crawlers, leading to your search results inevitably being dogshit and nobody wanting to use your search engine.
I donāt like that itās not open source, and there are opt-in AI features, but I can highly, highly recommend Kagi from a pure search result standpoint, and one of the only alternatives with their own search index.
(Give it a try, theyāve apparently just opened up their search for users without an account to try it out.)
Almost all the slop websites arenāt even shown (or put in a āListiclesā section where they can be accessed, but are not intrusive and do not look like proper results, and you can prioritize/deprioritize sites (for example, I have gituib/reddit/stackoverflow to always show on top, quora and pinterest to never show at all).
Oh, and they have a fediverse ālensā which actually manages to reliably search Lemmy.
This doesnāt really address the future of crawling, just the āGoogle has gone to shitā part š
FWIW, due to recent developments, Iāve found myself increasingly turning to non-search engine sources for reliable web links, such as Wikipedia source lists, blog posts, podcast notes or even Reddit. This almost feels like a return to the early days of the internet, just in reverse and - sadly - with little hope for improvement in the future.
Searching Reddit has really become standard practice for me, a testament to how inhuman the web as a whole has gotten. What a shame.