jdev - 2024-12-26


  1. larma

    singpolyma, you haven't specified which hash algorithm to use. md5 collisions are totally a thing.

  2. larma

    So whenever you start to add hashes, you will also need to add hash agility in the design so that you can deal with such things

  3. larma

    hash agility means you have to introduce hash algorithm negotiation
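
A minimal sketch of what such a negotiation could look like, assuming each side advertises the algorithm names it supports and the sender picks the first mutually supported entry from its own preference list. The algorithm names, the preference order, and the `negotiate` / `digest` helpers are illustrative assumptions, not taken from any XEP.

```python
import hashlib

# Hypothetical preference order, most preferred first.
PREFERRED = ["sha3-256", "blake2b-256", "sha-256"]

# Map the illustrative wire names to hashlib constructors.
CONSTRUCTORS = {
    "sha-256": hashlib.sha256,
    "sha3-256": hashlib.sha3_256,
    "blake2b-256": lambda: hashlib.blake2b(digest_size=32),
}


def negotiate(peer_supported):
    """Return the first locally preferred algorithm the peer also supports."""
    for name in PREFERRED:
        if name in peer_supported:
            return name
    return None  # no common algorithm: fall back to a plain upload


def digest(name, data):
    """Hash data with the negotiated algorithm and return a hex digest."""
    h = CONSTRUCTORS[name]()
    h.update(data)
    return h.hexdigest()
```

Weak algorithms such as md5 are simply absent from the preference list, so a peer that only offers md5 gets no match.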

  4. larma

    all of this adds a ton of complexity. And the URL already acts as a unique identifier for each file upload. The only advantage of your approach is that even if the client does not know about a previous upload, it would not need to upload the same file again. Which may be nice, but also really isn't that big of a deal. Further, we can just introduce means for the client to learn about previous uploads (e.g. making it easy to discover all previous uploads and get live notifications via pubsub or similar when a new file is uploaded).

  5. jonas’

    does md5 still count as a hash, and not just as a checksum? ;P

  6. pulkomandy

    Identifying files by hash without ever comparing them seems a bit dangerous. I'd say this only makes sense if either the client or the server knows for sure it is the same file. For the client this means the user is sending the same file to several places, which seems possible to check. For the server it could be de-duplicating after uploading (useful in a group chat where people reuse the same 3 gif images over and over?). But if you have one file on the server and another on the client, I see no way to answer "is this the same file?" with a definitive yes unless you upload or download it

  7. larma

    The server can already deduplicate identical files on the server storage, nobody is stopping them from doing that 🙂

    👍 1
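
A rough sketch of server-side deduplication as described above, assuming an in-memory store keyed by SHA-256 digest. The `DedupStore` class and its methods are hypothetical. Because the server holds both copies, it can also compare the bytes on a digest match, which addresses the concern about never comparing the files.

```python
import hashlib


class DedupStore:
    """Hypothetical in-memory upload store that keeps one copy per content."""

    def __init__(self):
        self._blobs = {}  # hex digest -> file bytes
        self._names = {}  # upload name (e.g. URL path) -> hex digest

    def put(self, name, data):
        key = hashlib.sha256(data).hexdigest()
        existing = self._blobs.get(key)
        if existing is None:
            self._blobs[key] = data
        elif existing != data:
            # Same digest, different bytes: refuse rather than alias the files.
            raise RuntimeError("digest collision for " + name)
        self._names[name] = key
        return key

    def get(self, name):
        return self._blobs[self._names[name]]

    def blob_count(self):
        return len(self._blobs)
```

Each upload keeps its own name, so the URL still identifies the upload while the storage holds a single copy.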
  9. Link Mauve

    pulkomandy, that’s the purpose of a cryptographical hash isn’t it?

    👍 1
  10. Link Mauve

    To make sure we both have the same file without having to share it.

  11. pulkomandy

    To make it hard to generate collisions on purpose. But they can still happen by accident, even if it's very unlikely

  12. Link Mauve

    You’re way more likely to get hit by a meteor than to find a collision with a good hashing mechanism.

  13. pulkomandy

    So you can be sure that you don't have the same file (if the hashes don't match). If the hashes match, you are not 100% sure it's the same file; it depends how sure you want to be, and it may be good enough
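
To put a number on "how sure", here is a small sketch of the standard birthday (union) bound on an accidental collision among n random files with a b-bit digest: at most n(n-1)/2 pairs, each colliding with probability 2^-b. The function names are made up for illustration, and the bound only covers accidents with an ideal hash, not deliberate attacks such as those known against md5.

```python
import math


def collision_probability_bound(num_files, digest_bits):
    """Upper bound on the chance that any two of num_files digests collide."""
    pairs = num_files * (num_files - 1) / 2
    return pairs / 2 ** digest_bits


def collision_bound_log2(num_files, digest_bits):
    """Same bound expressed as a power of two, which is easier to read."""
    return math.log2(collision_probability_bound(num_files, digest_bits))
```

For example, `collision_bound_log2(2**40, 256)` comes out at roughly -177, i.e. a bound of about 2^-177 for a trillion files with a 256-bit digest.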

  14. singpolyma

    > singpolyma, you haven't specified which hash algorithm to use. md5 collisions are totally a thing.

    We have hash agility in the xep for a reason 🙂

  15. singpolyma

    > pulkomandy, that’s the purpose of a cryptographical hash isn’t it?

    👍

  16. Cynthia

    nobody uses MD5 for secure file hashing anymore

  17. Cynthia slowly pushes BLAKE2b/BLAKE2s and BLAKE3 to the room and leaves

  18. singpolyma

    Maybe. Realistically in a standard protocol sha3 is more likely

  19. Cynthia

    true

  20. Link Mauve

    singpolyma, why?

  22. Cynthia

    blake2 is only an RFC, but both hash algorithms are faster than the SHA counterparts

  23. Cynthia

    and simpler

  24. Link Mauve

    Last I checked Blake3 was hugely faster than SHA3 when not hardware accelerated.
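
Speed claims like these are easy to check locally. Below is a rough timing sketch using only Python's `hashlib`, which ships SHA-256, SHA3-256 and BLAKE2b but not BLAKE3, so BLAKE2b stands in here. The `time_hash` and `compare` helpers are illustrative, and the results depend heavily on the CPU and on how the local Python build implements each algorithm.

```python
import hashlib
import time


def time_hash(constructor, data, repeats=5):
    """Return the best wall-clock time in seconds to hash data once."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        constructor(data).digest()
        best = min(best, time.perf_counter() - start)
    return best


def compare(size=1 << 20):
    """Time each candidate on a zero-filled buffer of the given size."""
    data = bytes(size)
    candidates = {
        "sha256": hashlib.sha256,
        "sha3_256": hashlib.sha3_256,
        "blake2b": hashlib.blake2b,
    }
    return {name: time_hash(ctor, data) for name, ctor in candidates.items()}


if __name__ == "__main__":
    for name, seconds in sorted(compare().items(), key=lambda kv: kv[1]):
        print(f"{name:10s} {seconds * 1e3:.3f} ms per MiB")
```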

  25. Cynthia

    (also has been trusted for key derivation in libsodium)

  26. Cynthia

    Link Mauve: that's the thing

  27. Link Mauve

    And SHA3 hardware support is still spotty, I don’t think I own a single computer with that.

  28. Cynthia

    don't really trust most people to have hardware with modern cryptography HW acceleration

  29. singpolyma

    > singpolyma, why?

    Because sha3 is a standard algorithm?

  30. Cynthia

    singpolyma: define standard btw?

  31. Cynthia

    is RFC considered a standard?

  32. singpolyma

    Usually I use https://singpolyma.net/2009/01/beasts-of-the-standards-world/ but here I meant it in the way that xmpp uses it, so yes, an accepted ietf RFC would be a standard by (usually) both measures

  33. Cynthia

    okay

  34. Cynthia

    then BLAKE2 would fit

  35. Link Mauve

    singpolyma, XEP-0414 specifies both equally.

  36. Link Mauve

    It’s been four and a half years since the last yearly review by the Council though.

  37. Cynthia

    blake2 has an RFC standard

  38. Link Mauve

    “These recommendations ought to be reviewed yearly by the XMPP Council [14].”

  39. singpolyma

    That xep is a neat idea, but you'll note it's deferred

  40. singpolyma

    It hasn't been reviewed because it was never approved

  41. Link Mauve

    Ah oops.

  42. erik

    Link Mauve, it was last updated 4.5 years ago, but the copyright is 2024; so maybe it *was* reviewed?

  43. singpolyma

    Never trust copyright dates, heh

  44. erik

    well, the document states it was last *changed* 4+ years ago, so that's no testament to its review either.