-
larma
singpolyma, you haven't specified which hash algorithm to use. md5 collisions are totally a thing.
-
larma
So whenever you start to add hashes, you will also need to add hash agility in the design so that you can deal with such things
-
larma
hash agility means you have to introduce hash algorithm negotiation
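
A minimal sketch of what such negotiation could look like, assuming each side simply advertises a list of algorithm names. The function name `negotiate_hash` and the example lists are hypothetical and not taken from any XEP; the names follow Python's `hashlib` spelling.

```python
from typing import Optional, Sequence


def negotiate_hash(client_prefs: Sequence[str],
                   server_supported: Sequence[str]) -> Optional[str]:
    """Pick the first algorithm, in the client's preference order,
    that the server also supports. Returns None if there is no overlap."""
    supported = set(server_supported)
    for algo in client_prefs:
        if algo in supported:
            return algo
    return None


# Hypothetical preference lists, for illustration only.
client = ["blake2b", "sha3_256", "sha256"]
server = ["sha256", "sha3_256"]
chosen = negotiate_hash(client, server)  # "sha3_256"
```

Even this toy version shows where the complexity comes from: both sides need a shared naming scheme, a fallback when nothing overlaps, and a way to retire an algorithm later.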
-
larma
all of this adds a ton of complexity. And the URL already acts as a unique identifier for each file upload. The only advantage of your approach is that even if the client does not know about a previous upload, it would not need to upload the same file again. Which may be nice, but also really isn't that big of a deal. Further, we can just introduce means for the client to know about the previous upload (e.g. making it easy to discover all previous uploads and get live notifications via pubsub or similar when a new file is uploaded).
-
jonas’
does md5 still count as a hash, and not just as a checksum? ;P
-
pulkomandy
Identifying files by hash without ever comparing them seems a bit dangerous. I'd say this only makes sense if either the client or the server knows for sure it is the same file. For the client this means the user is sending the same file to several places, this seems possible to check. For the server it could be de-duplicating after uploading (useful in a group chat where people reuse the same 3 gif images over and over?). But if you have one file on the server and another on the client, I see no way to answer "is this the same file?" with a definitive yes unless you upload or download it
-
larma
The server can already deduplicate identical files on the server storage, nobody is stopping them from doing that 🙂
-
Link Mauve
pulkomandy, that’s the purpose of a cryptographical hash isn’t it?
-
Link Mauve
To make sure we both have the same file without having to share it.
-
pulkomandy
To make it hard to generate collisions on purpose. But they can still happen by accident, even if it's very unlikely
-
Link Mauve
You’re way more likely to get hit by a meteor than to find a collision with a good hashing mechanism.
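
A rough way to put a number on "very unlikely" is the birthday-bound approximation. The sketch below assumes the hash behaves like a uniformly random function (which says nothing about deliberately crafted collisions), and the function name is made up for illustration.

```python
import math


def accidental_collision_probability(num_files: int, digest_bits: int) -> float:
    """Birthday-bound approximation for at least one accidental collision
    among `num_files` distinct inputs:
        p ~= 1 - exp(-n * (n - 1) / 2^(bits + 1))
    Assumes digests are uniformly distributed."""
    x = num_files * (num_files - 1) / 2 ** (digest_bits + 1)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-x)


# Example: a trillion distinct files with a 256-bit digest.
p = accidental_collision_probability(10**12, 256)
```

With a 256-bit digest the result stays vanishingly small for any realistic number of files; with a 128-bit digest it only becomes noticeable once the file count approaches 2^64.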
-
pulkomandy
So you can be sure that you don't have the same file (if the hashes don't match). If the hashes match, you are not 100% sure it's the same file, it depends how sure you want to be, it may be good enough
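
The asymmetry described here can be sketched as follows. This is only an illustration using Python's `hashlib`; the helper names and the choice of SHA-256 as the default are assumptions, not anything specified in the discussion.

```python
import hashlib
import hmac


def file_digest(path: str, algo: str = "sha256", chunk_size: int = 65536) -> bytes:
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.digest()


def probably_same(local_digest: bytes, remote_digest: bytes) -> bool:
    """False means the files definitely differ.
    True means they match only up to the collision resistance of the hash."""
    return hmac.compare_digest(local_digest, remote_digest)
```

A mismatch is conclusive, while a match is only as trustworthy as the algorithm that produced the digests, which is why the choice of algorithm matters.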
-
singpolyma
> singpolyma, you haven't specified which hash algorithm to use. md5 collisions are totally a thing.
We have hash agility in the xep for a reason 🙂
-
singpolyma
> pulkomandy, that’s the purpose of a cryptographical hash isn’t it?
👍
-
Cynthia
nobody uses MD5 for secure file hashing anymore
- Cynthia slowly pushes BLAKE2b/BLAKE2s and BLAKE3 to the room and leaves
-
singpolyma
Maybe. Realistically in a standard protocol sha3 is more likely
-
Cynthia
true
-
Link Mauve
singpolyma, why?
-
Cynthia
blake2 is only an RFC, but both hash algorithms are faster than the SHA counterparts
-
Cynthia
and simpler
-
Link Mauve
Last I checked Blake3 was hugely faster than SHA3 when not hardware accelerated.
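
Speed claims like this depend heavily on the hardware, so it is easy to measure locally. The sketch below uses only Python's standard `hashlib`, which ships BLAKE2 and SHA-3 but not BLAKE3 (that would need a third-party package), so BLAKE2b stands in here; the function name and payload size are arbitrary choices for illustration.

```python
import hashlib
import time


def time_hash(algo: str, data: bytes, repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds to hash `data` once with `algo`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hashlib.new(algo, data).digest()
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    payload = bytes(16 * 1024 * 1024)  # 16 MiB of zero bytes
    for algo in ("blake2b", "sha3_256", "sha256"):
        print(f"{algo}: {time_hash(algo, payload):.4f} s")
```

The numbers will vary with the CPU and with whether the underlying library uses hardware acceleration for a given algorithm.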
-
Cynthia
(also has been trusted for key derivation in libsodium)
-
Cynthia
Link Mauve: that's the thing
-
Link Mauve
And SHA3 hardware support is still spotty, I don’t think I own a single computer with that.
-
Cynthia
don't really trust most people to have hardware with modern cryptography HW acceleration
-
singpolyma
> singpolyma, why?
Because sha3 is a standard algorithm?
-
Cynthia
singpolyma: define standard btw?
-
Cynthia
is RFC considered a standard?
-
singpolyma
Usually I use https://singpolyma.net/2009/01/beasts-of-the-standards-world/ but here I meant it in the way that xmpp uses it, so yes, an accepted ietf RFC would be a standard by (usually) both measures
-
Cynthia
okay
-
Cynthia
then BLAKE2 would fit
-
Link Mauve
singpolyma, XEP-0414 specifies both equally.
-
Link Mauve
It’s been four and a half years since the last yearly review by the Council though.
-
Cynthia
blake2 has an RFC standard
-
Link Mauve
“These recommendations ought to be reviewed yearly by the XMPP Council [14].”
-
singpolyma
That xep is a neat idea, but you'll note it's deferred
-
singpolyma
It hasn't been reviewed because it was never approved
-
Link Mauve
Ah oops.
-
erik
Link Mauve, it was last updated 4.5 years ago, but the copyright is 2024; so maybe it *was* reviewed?
-
singpolyma
Never trust copyright dates, heh
-
erik
well, the document states it was last *changed* 4+ years ago, so that's no testament to its review either.