-
arcanicanis
Hopefully I'm not intruding, but I'll go straight to a few thoughts that have crossed my mind over the past year: For HTTP File Upload (XEP-0363), perhaps there could be an optional feature where the server's HTTP response to the upload includes a header carrying a URL with an expiring token (maybe 5-10 minutes of validity?) that can be used to retract the upload. The use case: a user accidentally uploaded the wrong file, which contained PII, and asked if I could manually nuke it from the server for them (honestly, the client should be able to do that itself, beyond just retracting the message). Has anyone previously proposed a XEP for an authenticated HTTP mediaproxy, for proxied loading of files shared via XEP-0363? It could even be an HTTP endpoint that's separate from the server software entirely, using some cross-protocol authentication such as XEP-0070 to avoid abuse.
-
moparisthebest
I think both have been at least thought of, feel free to write XEPs and/or code to do so though, folks here can help
-
moparisthebest
the simplest way to "solve" the first is just only upload encrypted files then you don't need to delete them
-
arcanicanis
Absolutely, I'd be glad to, just wanted to preflight-check it first, to make sure I'm not conflicting with something someone's already done, or if there were any design considerations that might conflict or have to be considered
-
arcanicanis
> the simplest way to "solve" the first is just only upload encrypted files then you don't need to delete them
I agree, although in some contexts that might not be practical, such as with public groups
-
arcanicanis
and speaking of encryption, have there been any further developments regarding integrating MLS within XMPP, or has the burden of that been predominantly on one (or two) people that might have gotten busy with life, or other demise?
-
arcanicanis
and also, just as a pet project that I've had in the back of my mind: has anyone written any test suites for OMEMO implementations?
-
moparisthebest
if I recall correctly the prosody/snikket folks got pretty far with the http mediaproxy thing but the main problem is there is really no way to stop it from just being a full open http proxy for your users
-
arcanicanis
Correct; I think one way I would address it would be a limit on cache misses; if someone's just abusing it for a semi-open HTTP proxy, then there's going to be a lot of cache misses, or alternatively, a very weird wide spread of domains and such; there could be ways to set limits to impede abuse
-
arcanicanis
Alternatively, endpoints that host user uploaded media could have some HTTP header to affirm "hey, this content is user-uploaded media for XMPP, not just random website content", and if the mediaproxy doesn't see that in the response, then it'll refuse to forward the content
-
moparisthebest
"a very weird wide spread of domains" is exactly what http upload is though?
-
arcanicanis
it would require existing XEP-0363 implementations to add that HTTP header, but at the same time, a user _could_ always fall back to not using the mediaproxy, for those endpoints that don't add that header
-
arcanicanis
> "a very weird wide spread of domains" is exactly what http upload is though?
I mean if someone were just generically using it as a proxy for their web browser, there's going to be a scatter of miscellaneous domains, rather than what would be routine in a MUC of like 300 people sharing media
-
moparisthebest
header is interesting... X-Allow-363-Mediaproxy: true ? (:
-
arcanicanis
X-XMPP-USG: ?1 maybe ("?1" is supposed to be a "true" boolean, in HTTP vocab, or at least in some of the newer specs)
-
arcanicanis
or X-Mediaproxy-Allowed: ?1 if it were to be named something protocol neutral
-
singpolyma
>> the simplest way to "solve" the first is just only upload encrypted files then you don't need to delete them
> I agree, although in some contexts that might not be practical, such as with public groups
nothing prevents sharing an encrypted file with a public group. the key to decrypt is included in the message. it's semi "useless" in this case, but there are some reasons one might want to
-
arcanicanis
Valid point
-
singpolyma
for example it makes it so that if the url leaks it won't "just work". yes, the key is right there, but you can't just plug it into an img tag and go, for example
-
arcanicanis
Nonetheless, what I'm addressing is more that the file itself can be retracted, rather than just the reference to it; as is the intention for said "retraction URL"
-
arcanicanis
That the uploader has a means to delete, within some fair time window, the file they mistakenly uploaded
-
singpolyma
they honestly should always be able to delete, regardless of time window
-
moparisthebest
arcanicanis: search found https://issues.prosody.im/845
-
stratself
my question would be: if a user deletes a message referencing some media, could the server also expunge the media immediately (at least in non-encrypted MUCs)? disclaimer: i'm just a curious passerby
-
arcanicanis
> they honestly should always be able to delete, regardless of time window
I certainly don't disagree at all, it's just a precautionary idea to have a time window (such as with signed tokens, or whatever authentication mechanism is part of the URL)
-
arcanicanis
It would be two separate interactions: deletion of the media, and deletion of its reference; the two could be on two different servers entirely. Meanwhile, a client could do both for you, without you having to know it's doing both
-
stratself
> It would be two separate interactions: deletion of the media, and deletion of its reference; the two could be on two different servers entirely. Meanwhile, a client could do both for you, without you having to know it's doing both
👍
-
moparisthebest
The prosody issue has a good hint, 363 already allows different URLs for PUT and GET, and nothing stops a server from also supporting DELETE on the PUT URL
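A minimal sketch of what "DELETE on the PUT URL" would look like from the client side. It assumes a server that chooses to accept DELETE on the URL it issued for PUT, which XEP-0363 itself does not specify; the URL and the helper name `build_delete_request` are hypothetical. The request is only built, not sent.

```python
import urllib.request


def build_delete_request(put_url):
    # Assumption: the upload server accepts DELETE on the same URL it
    # handed out for PUT. XEP-0363 does not require this.
    return urllib.request.Request(put_url, method="DELETE")


req = build_delete_request("https://upload.example.com/abc/bean-burrito.jpg")
print(req.get_method(), req.full_url)
# Sending it would be: urllib.request.urlopen(req)
```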
-
arcanicanis
also, as an aside, with the mediaproxy thingy, it could also make some interaction between Tor and non-Tor users a little easier too, maybe (or I2P, or whichever other overlay net people come up with)
-
stratself
> Hopefully I'm not intruding, but I'll go straight to a few thoughts that have crossed my mind over the past year:
>
> For HTTP File Upload (XEP-0363), perhaps there could be an optional feature where the server's HTTP response to the upload includes a header carrying a URL with an expiring token (maybe 5-10 minutes of validity?) that can be used to retract the upload. The use case: a user accidentally uploaded the wrong file, which contained PII, and asked if I could manually nuke it from the server for them (honestly, the client should be able to do that itself, beyond just retracting the message).
>
> Has anyone previously proposed a XEP for an authenticated HTTP mediaproxy, for proxied loading of files shared via XEP-0363? It could even be an HTTP endpoint that's separate from the server software entirely, using some cross-protocol authentication such as XEP-0070 to avoid abuse.
I know it's more complex, but maybe a way to request DELETE tokens when needed would be better?
-
arcanicanis
I'm just saying something like: X-Retraction-URL: https://example.com/upload.php?file=bean-burrito.jpg&action=delete&token=7907d53159e79c9c6d3426b3984a5b47
-
arcanicanis
whether it's just a GET, or POST, or if it's a DELETE; whereas DELETE can add some complications, as it's just used/abused as repurposing from WebDAV in a different context, if I'm not mistaken
-
arcanicanis
where the server defines a URL, specific to that file, for requesting deletion; I just don't know definitely what would be the ideal HTTP method
-
arcanicanis
which could be cryptographically signed, could be a result of a hash function (e.g. sha256(filename+pepper)), or whatever; invariably, the endpoint decides the URL that's meaningful to itself to authenticate a deletion request
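One way such a self-authenticating retraction URL could be built, as a sketch only. It swaps the plain sha256(filename+pepper) idea for HMAC-SHA256 and folds an expiry timestamp into the signed data so the time window is enforced too. The `expires` query parameter, the pepper value, the 600-second window, and both function names are assumptions for illustration.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

PEPPER = b"server-side-secret"  # hypothetical secret, known only to the upload endpoint
TOKEN_TTL = 600                 # seconds; an assumed 10 minute retraction window


def _sign(filename, expires):
    # HMAC over filename and expiry, so neither can be altered by the client.
    msg = f"{filename}\n{expires}".encode()
    return hmac.new(PEPPER, msg, hashlib.sha256).hexdigest()


def make_retraction_url(base_url, filename, now=None):
    issued = int(time.time() if now is None else now)
    expires = issued + TOKEN_TTL
    query = urlencode({
        "file": filename,
        "action": "delete",
        "expires": expires,
        "token": _sign(filename, expires),
    })
    return f"{base_url}?{query}"


def verify_retraction_url(url, now=None):
    params = parse_qs(urlparse(url).query)
    try:
        filename = params["file"][0]
        expires = int(params["expires"][0])
        token = params["token"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False  # window has closed
    expected = _sign(filename, expires)
    # Constant-time comparison to avoid leaking the token via timing.
    return hmac.compare_digest(token.encode(), expected.encode())
```

The endpoint would return `make_retraction_url(...)` in the upload response and call `verify_retraction_url(...)` when the deletion request arrives; no per-file state is needed beyond the secret.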
-
stratself
I mean, if one could also send this URL as a `<delete>` stanza, extending the XEP, that'd be nice. But I get you're accounting for cases where the upload server is different from the xmpp
-
arcanicanis
as it stands, HTTP File Upload generally is an out-of-band protocol, just the XMPP server facilitates providing the client with a self-authenticated URL to upload to
-
stratself
> I'm just saying something like:
> X-Retraction-URL: https://example.com/upload.php?file=bean-burrito.jpg&action=delete&token=7907d53159e79c9c6d3426b3984a5b47
hmm, you're saying this should be included in the 200 response after the upload, right? that makes sense
-
arcanicanis
Yes, correct
-
stratself
> Alternatively, endpoints that host user uploaded media could have some HTTP header to affirm "hey, this content is user-uploaded media for XMPP, not just random website content", and if the mediaproxy doesn't see that in the response, then it'll refuse to forward the content
this would still require the clients to use the mediaproxy in the first place, right?
-
arcanicanis
Noap, these are two completely separate proposals
-
stratself
yea ik, I was switching topics
-
arcanicanis
If a response header is added, it doesn't require that the client understands it
-
arcanicanis
as I was just suggesting adding a _response_ header that the mediaproxy would look for, versus something in the _request_ that the mediaproxy sends (but that could be another approach)
-
stratself
sorry i'm not up to speed. As I understand the mediaproxy idea serves 2 purposes: 1. not reveal your client IP, and 2. authenticated access to media
-
stratself
are you talking about 2.?
-
arcanicanis
Both, but I'm not addressing the authentication side on this specific aspect. I'm saying that if the HTTP response the mediaproxy gets from the remote HTTP service doesn't contain a header of "X-XMPP-USG: ?1", then it refuses to forward any of the response to the requesting client, just returning an empty "403 Forbidden" response or similar error
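The check described here can be sketched as a small filter in the proxy. This is a sketch under assumptions: the header name `X-XMPP-USG` is only the placeholder floated in this discussion, headers are passed in as a plain dict, and `may_forward` / `proxy_response` are hypothetical names.

```python
def may_forward(upstream_headers, marker="X-XMPP-USG"):
    """Return True only if the upstream response carries the marker set to ?1."""
    for name, value in upstream_headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == marker.lower():
            return value.strip() == "?1"
    return False


def proxy_response(upstream_status, upstream_headers, upstream_body):
    """Decide what the mediaproxy hands back to the requesting client."""
    if not may_forward(upstream_headers):
        # Drop everything from upstream and answer with an empty 403.
        return 403, {}, b""
    return upstream_status, dict(upstream_headers), upstream_body
```

Authentication of the requesting client (XEP-0070 or otherwise) is a separate step and is not shown.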
-
stratself
yeah, that'd make sense then. it'd be trivial to bypass with curl though, so I suppose you'll include the access token in the media URL body part that's sent to the MUC too?
-
arcanicanis
The authentication part would be some cross-protocol mechanism, such as XEP-0070, to authenticate the request and keep it from being abused as some open proxy (by non-users of the XMPP service)
-
stratself
again, this is me using the proxy service of mydomain.tld, right? then yeah seems like 0070 would be on point
-
arcanicanis
Correct, a mediaproxy would only be for users of the same service, typically for loading media by users of other services
-
stratself
maybe cache misses can be used as a secondary control mechanism yeah
-
arcanicanis
And additionally: it could make access between Tor and non-Tor users easy. e.g. say you're on a Tor-enabled server, but just accessing your XMPP server on the surface web: a Tor user could post a file in a MUC, identified under a .onion address; meanwhile, a non-Tor user's mediaproxy service (if it is Tor-enabled) could conveniently retrieve that file for them, without their client having _any_ awareness of handling Tor connections or .onion URLs
-
arcanicanis
(and the same with I2P)
-
stratself
are you saying clearnet clients can connect to onion JIDs, as long as their server supports it? The only thing left is the media proxy
-
arcanicanis
I'm saying clearnet clients could access HTTP uploads that Tor clients had uploaded to an HTTP File Upload service on a .onion, without knowing nor directly handling anything special about a .onion address, if it's through a mediaproxy that has Tor support
-
stratself
i'm still not exactly sure why a generic header would be needed in the case of proxy --> uploadserver, or even client --> uploadserver. If it's some kind of compliance mechanism then sure, but if one looks for some sort of real 403 for remote media access then I don't think it holds value
-
arcanicanis
It's to keep it from being used for proxying content that isn't from an HTTP File Upload service
-
arcanicanis
as it's a simple hint, to tell the mediaproxy that the requested media is from an HTTP File Upload service, and not just some generic website
-
arcanicanis
(and sure, maybe someone could 'weaponize' their personal website, to bypass that check, by putting that header on all the responses, as some witty sidestep of that measure, but even then that's still significantly limited)
-
arcanicanis
and again, other measures can be piled on top: cache misses or other means to ratelimit or restrict it
-
arcanicanis
as it stands, I doubt the average client is going to be doing like 5+ reqs/sec or something, consistently across a minute, or other heuristics that could be used to catch such things
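A rough sketch of that kind of heuristic: flag a user whose requests average 5 or more per second across the last minute. The thresholds come from the numbers mentioned here; the class name `RateWatch` and the per-user sliding window are assumptions, and a real deployment would likely combine this with cache-miss counts.

```python
from collections import deque


class RateWatch:
    """Sliding-window request counter per user (hypothetical helper)."""

    def __init__(self, max_per_sec=5, window=60):
        self.window = window
        self.limit = max_per_sec * window  # e.g. 300 requests per 60 s
        self.hits = {}                     # user -> deque of timestamps

    def record(self, user, now):
        """Record one request at time `now`; return True if suspicious."""
        q = self.hits.setdefault(user, deque())
        q.append(now)
        # Forget requests that fell out of the window.
        while q and q[0] <= now - self.window:
            q.popleft()
        return len(q) >= self.limit
```

Usage would be one `record(user, time.time())` call per proxied request, with the proxy throttling or refusing once it returns True.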
-
stratself
well, i'm sure that's a first layer of defense nevertheless. Though I'd say that since the client still has to auth to the mediaproxy to access the cached media, their proxying of bad stuff doesn't have that big of a blast radius
-
stratself
> and again, other measures can be piled on top: cache misses or other means to ratelimit or restrict it
I guess this is server-admin controlled territory then
-
arcanicanis
> well, i'm sure that's a first layer of defense nevertheless. Though I'd say that since the client still has to auth to the mediaproxy to access the cached media, their proxying of bad stuff doesn't have that big of a blast radius
yea, I've had a little fun with that before on other protocols (and more): https://arcanican.is/excerpts/cve-2024-23832/
-
arcanicanis
(responsible disclosure of course, and not any wilful malicious acts, but that was certainly a find; compounded by the fact that any server could be tainted as a non-local user)
-
stratself
interesting, thanks for the discovery and writeup. I see your point with local cache poisoning. And I guess it scales with the server's userbase in xmpp's case
-
stratself
what was the mitigation though?
-
arcanicanis
This situation wasn't with media itself so much, but rather mistrusting a random file as an authentic ActivityPub object, without even checking if the domain matches up, with what it claims to have been from
-
arcanicanis
meanwhile, as ascertained above, if XEP-0070 auth is required, then it can't be weaponized as a generic malicious file host
-
arcanicanis
meanwhile, compare with Discord: they have gaps in their reporting; a Discord user could upload a file to themself, take the URL, hotlink it from a website, and use Discord as a platform for basically hosting malware, and other users have no mechanism to complain about the upload (a file can only be reported if it was sent directly to you). Meanwhile, if they run a scam where they link to the website instead, it's completely off Discord's radar for abuse reporting (and there are no forms to report abusive uploads either).
-
arcanicanis
(though, alternatively, the same could apply to an HTTP File Upload service, but at least mediaproxies couldn't be abused in the same way, since resources behind a mediaproxy couldn't be hotlinked, as they'd require XEP-0070 auth)
-
stratself
i'm all for protecting media proxies, but if the upload server can still be open access, it's still a cdn
-
arcanicanis
Just as the situation currently is, there's nothing in the idea that changes that threat, meanwhile the mediaproxy proposal isn't abusable in that way
-
arcanicanis
also, instead of XEP-0070, perhaps XEP-0101 or some other mechanism makes more sense, as I see a request for explicit ACL authorization with XEP-0070, unless I'm skim-reading too fast
-
arcanicanis
Either way, there is invariably a way to make an HTTP mediaproxy work, and other features that come from it (non-Tor/non-I2P users being able to pull media from Tor or I2P endpoints, if respectively enabled for that)
-
stratself
huh, xep-101 is interesting
-
stratself
> Just as the situation currently is, there's nothing in the idea that changes that threat, meanwhile the mediaproxy proposal isn't abusable in that way
as it stands, one should still be able to use it to proxy _any_ media from anywhere tho. So the headers idea you said is kinda opt-in imo
-
MattJ
arcanicanis, https://mail.jabber.org/hyperkitty/list/standards@xmpp.org/message/OQ43VZXLADEJTW55KCBYCFKFXGFB7F7X/ is a write-up of my initial attempt. I have an implementation of the alternative approach mentioned at the end of my post, but I haven't published it yet (I can't remember why, it probably needs some polish)
-
arcanicanis
Hadn't considered CONNECT, but I'm sure there's complications in that; in other platforms, I've generally just seen GET requests used; URL concatenation is actually the same as how DID resolvers work with DID URLs, last I remember
-
MattJ
I was hoping with CONNECT we would be able to use off-the-shelf proxies and libraries, rather than cook up our own custom pseudo-proxy protocol
-
MattJ
But yeah, it turned out to not be an option
-
arcanicanis
There isn't a whole lot to really reinvent: just as long as it's authenticated in some way, a simple GET, in some structured way, works enough
-
MattJ
In general I am very keen on implementing self-service deletion, and fixing the forms of abuse you mentioned. One of the tricky problems is the diversity of clients and deployments.
-
MattJ
For example, having the upload server set a custom header isn't necessarily trivial, if your server is S3, for example
-
MattJ
or one of the various self-hosted S3 alternatives
-
arcanicanis
Yes, it won't work in all situations, but it's something that can be addressable in some contexts, which is enough of a gain over doing nothing at all
-
MattJ
The S3 API (which has been implemented by more than AWS by now) does support signed upload URLs, which works great with the current XEP
-
MattJ
I'd love to address the "some contexts" if possible, but I'm not sure it's that easy
-
MattJ
If we deploy the proxy and require the header, people will immediately run into issues with receiving files
-
MattJ
Deletion, yeah, we can say that's an optional feature and it's supported if it's supported
-
MattJ
There are a couple of options for deletion
-
arcanicanis
> If we deploy the proxy and require the header, people will immediately run into issues with receiving files
except they can fall back to just not using the proxy, as people are currently living without one right now
-
MattJ
One is that we simply allow DELETE on the PUT URL. However PUT URLs are often signed with a ttl, so they don't permit uploading forever. Otherwise you could create an account, create 10000 upload slots, and start uploading bad content. If your XMPP account gets terminated, you still have 10000 URLs you are free to upload to whenever you want.
-
MattJ
Another option is to return a dedicated <delete> URL in the slot response. This has more flexibility, so more likely to work with external upload services (and the <delete> URL can still be the same as the <put> URL if that's what the deployment wants)
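To make the idea concrete, here is a sketch of a client reading a slot response that carries such a `<delete>` element. The `<put>` and `<get>` elements and the `urn:xmpp:http:upload:0` namespace are from XEP-0363; the `<delete url='…'/>` child is hypothetical, modelled on the other two, and the URLs are made up.

```python
import xml.etree.ElementTree as ET

NS = "urn:xmpp:http:upload:0"

# Hypothetical slot response: <delete> is NOT part of XEP-0363 today.
SLOT = """
<slot xmlns='urn:xmpp:http:upload:0'>
  <put url='https://upload.example.com/abc/bean-burrito.jpg'/>
  <get url='https://download.example.com/abc/bean-burrito.jpg'/>
  <delete url='https://upload.example.com/abc/bean-burrito.jpg'/>
</slot>
"""


def parse_slot(xml_text):
    """Return the put/get/delete URLs; delete is None if the server omits it."""
    slot = ET.fromstring(xml_text.strip())
    urls = {}
    for kind in ("put", "get", "delete"):
        el = slot.find(f"{{{NS}}}{kind}")
        urls[kind] = el.get("url") if el is not None else None
    return urls


print(parse_slot(SLOT))
```

A client that finds no `<delete>` simply treats deletion as unsupported, which keeps the feature optional for deployments that cannot offer it.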
-
MattJ
The downside of this is that only the uploading client gets it, so the user wouldn't be able to delete the file from a different client on their account
-
MattJ
(this also applies to the `DELETE <put url>` approach)
-
MattJ
So the next option is to request the deletion via the XMPP server, on demand
-
MattJ
This requires some smartness in the XMPP server, but I think it's the best approach because it's ultimately the most flexible and allows deletion from any client (which is what people want and expect, quite rightly)
-
arcanicanis
I forget some of the history or purposes for some of the past ideas, but could that be shared with other clients via Message Carbons?
-
MattJ
It could be shared somewhere, yes
-
arcanicanis
Or the client just messages their bare JID, with that metadata of the upload?
-
MattJ
Either that, or in PEP, or the server just stores it somewhere that clients can query
-
MattJ
But at that point you might as well just ask the server to give you a DELETE URL when you want it
-
stratself
might just have a delete url and a delete token shared separately?
-
stratself
nvm I forgot that the delete url can't be shared across different clients either, unless stored in-band