Seeking feedback: how should lemm.ee move forward with external images? (related to frequent broken images)

sunaurus@lemm.ee · 5 months ago

homesnatch@lemm.ee · 4 months ago

Storing permanently locally doesn’t sound like a good solution…

If you could adjust the length of time to keep cached/proxy’d images locally and increase it significantly, I’d think that would be the preferred solution.

Admiral Patrick@dubvee.org · edit-2 5 months ago

Not a lemm.ee user, but here’s my thoughts on #2 since it affects me via federation:

I am not a fan of how Lemmy chose to implement image proxying. Specifically, federating the proxied URL.

That frequently prevents my instance from fetching a thumbnail locally (option 1 above). Which, ironically, increases the load on your server as my instance has to fetch it from your proxy every time instead of just once to generate a local copy here.

From a UI development standpoint, the proxied thumbnail URLs also make it harder to detect the image type (gif, static image, video) to handle rendering. It also complicates other proxying/caching methods I have in place. Ultimately, in the UI I develop, I’ve had to resort to passing thumbnail images through a function to un-proxy them so they can be handled sanely.
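
A minimal sketch of the kind of un-proxy step described above. It assumes the proxied URL has a path ending in `image_proxy` and carries the original address, percent-encoded, in a `url` query parameter; that layout, the function names `unproxy` and `guess_media_kind`, and the extension table are illustrative assumptions, not the code used in any particular UI.

```python
import posixpath
from urllib.parse import parse_qs, urlsplit


def unproxy(url: str) -> str:
    """Return the original image URL if `url` looks proxied, else `url` unchanged.

    Assumption: proxied URLs look like
    https://<instance>/.../image_proxy?url=<percent-encoded original>
    """
    parts = urlsplit(url)
    if parts.path.rstrip("/").endswith("/image_proxy"):
        original = parse_qs(parts.query).get("url")
        if original:
            return original[0]  # parse_qs has already percent-decoded the value
    return url


def guess_media_kind(url: str) -> str:
    """Guess gif / image / video from the file extension of the un-proxied URL."""
    path = urlsplit(unproxy(url)).path.lower()
    ext = posixpath.splitext(path)[1]
    if ext == ".gif":
        return "gif"
    if ext in {".mp4", ".webm"}:
        return "video"
    if ext in {".jpg", ".jpeg", ".png", ".webp"}:
        return "image"
    return "unknown"
```

With the proxy wrapper removed, the extension of the original URL is visible again, so the renderer can choose between a static image, a gif, and a video element without an extra request.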

So I generally wish that admins avoid Lemmy’s proxying until it no longer federates the proxied URL and does something sane like just return that for the local API calls.

4 months ago

Wow, didn't know they federated the proxied image, kinda stupid ngl.

We really need some sort of distributed content hosting for images that gives everything a single unique address servable by anyone. Perhaps a BitTorrent swarm that holds all federated media. The address of the media could still be a URL on the local instance, so as not to break frontends, but backends could recognise it as a universal BitTorrent resource and fetch it in a distributed manner.

It would also mean clients could implement their own retrieval so as not to rely on the server, but that wouldn't be required.

I suppose you could also put website content into the same system as a sort of archive. It would make the fediverse more p2p and distribute load across more, smaller nodes, improving resiliency.

Anyone know how PeerTube has done their BitTorrent implementation?

Petter1@lemm.ee · 4 months ago

Like, the usenet?

xavier666@lemm.ee · 4 months ago

I would prefer the option which allows lemm.ee to run in the most sustainable manner

Blaze (he/him)@feddit.org · 5 months ago

No opinion for the moment, but thank you for the very detailed post

TachyonTele@lemm.ee · 5 months ago

I’m on a lot, and I scroll from All/Hot. I rarely see broken images. They do pop up, but not enough to ever bother me. The only option I’d avoid is method 1, because of that image debacle a few months ago. Regarding methods 2 and 3, they both seemed to work fine. I leave it to smarter minds than myself.

Good work on next, btw. Enjoy your weekend and thank you for everything you do!

shackled@lemm.ee · 4 months ago

Option 3 is the only one that seems sustainable long term. Donations will NEVER keep up with user growth, thus storage costs will balloon out of control.

Completely avoiding any chance of illegal content touching the servers should immediately have everyone agreeing on this option. I doubt anyone here is willing to foot legal bills and as such even minor legal actions would be the end of this instance.

Privacy is nice, but IP logging is the simplest thing to “protect” against, even with a free VPN. If those claiming privacy concerns here aren’t already using a VPN and are depending purely on lemm.ee’s proxy, then their internet hygiene needs an update.

As for usability, an image being deleted from the external provider presents the same issue to the user under options 2 and 3. The cache from option 2 will eventually get cleared, and it’ll fail to pull a fresh copy if the image has been deleted from the external host.

thefartographer@lemm.ee · 4 months ago

I say option 3. Sure, it’s annoying, but that’s our problem. Your problem is keeping the server operational, safe, and low-cost.

Considering the vote:content ratio, it looks like most users spend most of their time spectating. The spectators feed our egos and determine the trends in content by singular positive or negative votes. The spectator experience seems far more important to me and it should be the onus of the contributors to ensure their own privacy, just like they do in avoiding doxxing themselves via text.

Option 2 certainly follows in the vein of improved performance, but if real-world implementation is proving too unstable or creating too much overhead, then I say “fuck it, option 3 sounds great.”

simple@lemm.ee · edit-2 4 months ago

Thanks for your hard work as always.

I’m in favor of moving away from proxying. Too many images break, and proxying in general is very wasteful; having to constantly download images from potentially small servers would definitely get you rate-limited.

Passing through external images is OK. Many people already post external links to sites like imgur and catbox because of the file size limits anyway.

I think the end goal would always be to store images locally though - or at least cache them for extended periods of time. Don’t large instances like Lemmy World and huge Mastodon instances work this way? How do they manage the risk?

Thassodar@lemm.ee · 5 months ago

I would pick image pass through because I’m not necessarily concerned about the image hoster logging my IP. I have been more frustrated with broken links, so I am very much opposed to the current method.

barsoap@lemm.ee · 4 months ago

2+3: Try to fetch image, if you get it proxy it, if your storage gets full use LRU eviction, once evicted or some amount of time has passed delete it and don’t fetch again, ever. Fall back to pure 3 if there’s ever any issues with anything, including you not particularly feeling like implementing smart caching: Our referrer privacy is not your responsibility.
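
The 2+3 scheme above can be sketched roughly as follows. This is an illustration under stated assumptions, not lemm.ee's or Lemmy's implementation: the class name `ProxyCache`, the injected `fetch` and `clock` callables, and the byte-based capacity are all hypothetical. Returning `None` stands for "fall back to option 3 and hand the client the external URL".

```python
import time
from collections import OrderedDict
from typing import Callable, Optional


class ProxyCache:
    """LRU image cache that never re-fetches a URL once it was evicted or expired."""

    def __init__(self, capacity_bytes: int, max_age_seconds: float,
                 fetch: Callable[[str], Optional[bytes]],
                 clock: Callable[[], float] = time.monotonic):
        self.capacity_bytes = capacity_bytes
        self.max_age_seconds = max_age_seconds
        self.fetch = fetch          # hypothetical downloader: bytes, or None on failure
        self.clock = clock
        self.items = OrderedDict()  # url -> (data, stored_at), least recently used first
        self.used = 0
        self.gone = set()           # URLs that must never be fetched again

    def _drop(self, url: str) -> None:
        data, _ = self.items.pop(url)
        self.used -= len(data)
        self.gone.add(url)

    def get(self, url: str) -> Optional[bytes]:
        if url in self.items:
            data, stored_at = self.items[url]
            if self.clock() - stored_at > self.max_age_seconds:
                self._drop(url)          # too old: delete and pass through from now on
                return None
            self.items.move_to_end(url)  # mark as most recently used
            return data
        if url in self.gone:
            return None                  # evicted or expired earlier: pure option 3
        try:
            data = self.fetch(url)
        except Exception:
            data = None
        if data is None or len(data) > self.capacity_bytes:
            return None                  # could not proxy it: pass the URL through
        self.items[url] = (data, self.clock())
        self.used += len(data)
        while self.used > self.capacity_bytes:
            oldest = next(iter(self.items))  # least recently used entry
            self._drop(oldest)
        return data
```

One choice made here that the comment leaves open: a failed fetch is not recorded in `gone`, so a later request may try again. Treating failures as permanent would be the stricter reading.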

MyOpinion@lemm.ee · 4 months ago

Option 3.

flashgnash@lemm.ee · edit-2 4 months ago

I believe just passing through external images is the way to go, it’s always the one I opt for if I can

Hosting those images is gonna get expensive, and that kinda sucks for a donation-run platform when that money could be far better spent elsewhere

I also think using external image sources is more in line with the idea of decentralisation. Lemmy isn’t an image host, it’s a link aggregator and forum, and I believe most image hosting sites will be far better at loading images quickly than Lemmy’s implementation could ever be

ditty@lemm.ee · 4 months ago

I think it’s important for us to be mindful of content retention for posterity’s sake if we want Lemmy to compete with Reddit long-term. If possible, I’d hope we can avoid dead image links like we see with old forums and photobucket pics, for example.

perishthethought@lemm.ee · edit-2 4 months ago

Definitely OK with either option 2 or 3. And I trust @sunaurus to choose what’s best for this instance.

fossphi@lemm.ee · 5 months ago

Images have been a bit problematic for me lately, for sure. If storing them locally is not a solid option, the question I would have is how many of the other requests are proxied? As in, what other stuff apart from images/media is not being proxied? If the clients are leaking IPs anyway, maybe it’s okay to have them download the images, too. But if the server is proxying everything else, then having some sort of a cache might not be that bad an idea.

Hey folks!

Storing images locally

Proxying images

Passing through external images

Current situation