HolgerThere seems to be a consensus that predictable resource strings introduce a privacy issue. Do we have any text in any XEP (or on the wiki or the mailing list) explaining the issue?
HolgerMattJ: Thanks. This says you "might be able to determine if the client [...] is online" if you know the resource string. There's no explanation how you would determine that, right?
MattJNot anything explicit that I know of
HolgerMattJ: I'm aware of a few ways to do that, at least with ejabberd users, but each of those is a spec or implementation issue which IMO should be fixed either way (rather than only hidden by making the resource string unpredictable).
Ge0rGHolger: +1
MattJBut this is implicitly assumed by various XEPs
Ge0rGHolger: the only issue I'm aware of is periodic sending of IQs to a full JID, to probe the client availability and network latency
HolgerMattJ: Unpredictable resource strings are assumed?
Ge0rGIt's also a really poor idea to use UUIDs for randomness if we have the full power of unicode (or at least base32 / base64)
HolgerIndeed.
MattJI'd wager that yes, numerous XEPs rely on unpredictable resource strings for security/privacy purposes
Ge0rGYouTube, the world's most used video platform, manages to identify videos by fewer than a dozen characters. Why do we need to use 36 characters to identify two clients on an account?
KevIf there's a standard for how to do something equivalent to UUID (global uniqueness without fingerprinting) in fewer bytes, using that instead of UUID seems entirely sensible.
Ge0rGAre we protecting from enumeration attacks? From birthday attacks? What is the actual amount of randomness required in a resource to prevent those?
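One rough way to put numbers on these questions is the sketch below. It assumes resources are drawn uniformly at random from a space of 2^bits values. The function names and the sample figures (10 clients per account, one million guesses) are illustrative assumptions, not something from the discussion. The 122-bit row corresponds to the random bits in a version 4 UUID.

```python
import math


def birthday_collision_probability(bits: int, n: int) -> float:
    """Approximate chance that at least two of n uniformly random
    `bits`-bit values collide (standard birthday approximation)."""
    space = 2.0 ** bits
    return -math.expm1(-n * (n - 1) / (2.0 * space))


def guess_success_probability(bits: int, guesses: int) -> float:
    """Chance that an attacker hits one specific `bits`-bit value
    within `guesses` distinct attempts (enumeration)."""
    return min(1.0, guesses / 2.0 ** bits)


if __name__ == "__main__":
    clients = 10           # assumed number of clients on one account
    guesses = 1_000_000    # assumed attacker budget
    for bits in (32, 64, 122):
        print(
            bits,
            birthday_collision_probability(bits, clients),
            guess_success_probability(bits, guesses),
        )
```

Under these assumptions, accidental collisions among a handful of clients are negligible even at 32 bits, so the required size is driven mainly by how many guesses an attacker can realistically make.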
HolgerKev: Why do we need a standard for that? I mean there's no interop requirement?
jonasww
jonasw^
KevHolger: Well, we'd at least want it to be consistently implemented across clients, else you can fingerprint (which probably isn't the end of the world, but seems like something we'd want to prevent).
Ge0rGKev: sticking to default implementations is what causes user-visible URLs like https://xmpp.yaxim.org:5281/upload/54f59abf-de9b-4fb9-a1e4-f0b5b78a0a9d/a5f532df-0fef-47c4-81a9-2aff690419fc.jpg
intosiHolger: libraries have the benefit of people not writing their own random identifier generator.
intosi* standard methods present in libraries, I mean.
jonaswgetentropy(nbits=32) | base64 | strip('=') should be possible with every standard library.
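A minimal Python sketch of that pipeline, using only the standard library. `secrets.token_bytes` stands in for `getentropy`, the function name `random_resource` is hypothetical, and the choice of the URL-safe base64 alphabet is an assumption made here, not something the pipeline above specifies.

```python
import base64
import secrets


def random_resource(nbits: int = 32) -> str:
    """Return `nbits` of randomness (rounded up to whole bytes) as
    unpadded URL-safe base64 text."""
    nbytes = (nbits + 7) // 8
    raw = secrets.token_bytes(nbytes)            # OS-provided CSPRNG
    encoded = base64.urlsafe_b64encode(raw)      # ASCII-only output
    return encoded.decode("ascii").rstrip("=")   # drop '=' padding


if __name__ == "__main__":
    print(random_resource(32))   # 6 characters
    print(random_resource(96))   # 16 characters
```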
jonaswwe are requiring PRECIS which has *much less* support than that
intosiWhich base64?
intosiBut yeah, another well-defined method could work equally well.
intosiWhen unicode comes into play, I'm less inclined to believe that every client gets it right, though. Unicode is hard.
jonaswbase64 is luckily only ascii
Ge0rGwhat we are looking for is a number of random bits packed into as few characters as possible for a given element
KevWell, not quite.
jonaswbase-emoji
ZashBase 85
KevWe're also looking for it to be typable by server admins grepping for logs, etc.
KevAnd visibly distinguishable.
KevSimply picking random bits forced into UTF-8 clearly wouldn't work, for example.
Ge0rGKev: uuids are not very visibly distinguishable.
KevIf you give me two UUIDs, I'm fairly sure I can tell you if they're the same.
jonaswKev: agreed (with random bits in utf-8), not only because of distinguishability, but because it also needs to pass resourceprep and/or precis unmodified. that’s nontrivial.
Ge0rGKev: what if you have a complex scenario with N clients in a MUC, a set of M reflected messages with rewritten UUID ids? For which values of N and M can you keep N+2*M UUIDs in your short-term memory?
ZashDisco identity ?
jonaswwhat about dictionary-based strings?
jonaswjust sample from /usr/share/dict/british-english-insane (it’s a thing!)
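A sketch of what dictionary-based sampling could look like. The tiny inline word list is a hypothetical stand-in for a real dictionary file, and the function names are made up for illustration. Plain lowercase ASCII words are assumed, to sidestep the resourceprep/PRECIS concern raised earlier.

```python
import math
import secrets

# Hypothetical 16-word list; a real dictionary would be far larger.
WORDS = [
    "amber", "birch", "cobalt", "delta", "ember", "fjord", "garnet", "harbor",
    "indigo", "juniper", "kelp", "lumen", "maple", "nickel", "onyx", "pewter",
]


def wordlist_resource(words, count, sep="-"):
    """Join `count` words picked uniformly at random (with repetition)."""
    return sep.join(secrets.choice(words) for _ in range(count))


def wordlist_entropy_bits(words, count):
    """Entropy in bits of `count` uniform picks from a list of unique words."""
    return count * math.log2(len(set(words)))


if __name__ == "__main__":
    print(wordlist_resource(WORDS, 4))
    print(wordlist_entropy_bits(WORDS, 4))   # 4 bits per word with 16 words
```

The trade-off is length: each word contributes only log2(list size) bits, so reaching the entropy of a short base64 token takes several words.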
HolgerZash: There's no disco identity when staring at XEP examples ...
Ge0rGbase64-strings have better visible distinguishability due to more uppercase/lowercase.
KevGe0rG: In what scenario are you imagining someone debugging such a thing and using the resources as the mental identifiers?
KevAs I said earlier, if we can more tightly pack than UUID, while still maintaining human readability, great.
Kev(And the other desirable properties of UUIDs)
intosiBasically still gibberish. I'm not convinced it would make a difference, I would suck at remembering many of either UUID, base64, or picks-from-a-wordlist-with-sufficient-entropy.
intosiFor all of them, I would just use the first four or five chars anyway.
Ge0rGKev: which identifiers would you use if all you have at hand are debug logs?
HolgerKev: I'm still quite baffled you'd list 'readability' as one of the properties of UUIDs :-)
KevNicks in a MUC, in the usual case of them being pretty static.
Holger(Then again these XMPP people are used to reading XML ...)
KevHolger: They're not *nice*, but it is possible to straightforwardly read them.
HolgerKev: But what makes them any more readable than e.g. Base64?
Ge0rGKev: but I want it *nice*.
KevCompared with random UTF-8, which is impossible to read because you've no idea which of the many matching characters you're looking at.
KevHolger: Nothing.
HolgerOh. Well yeah there's always something worse.
jonaswKev: nobody was seriously talking about using random utf-8
jonasw(I hope.)
Kevjonasw: You might not be taking Ge0rG seriously, but I was still trying to address his point that we should be using UTF.
jonaswKev: I’m pretty sure Ge0rG was only making a point that using only hex is a waste of bytes and we should use base64 or something.
Ge0rGhttps://gist.github.com/windytan/7910910/
Kevjonasw: Given he explicitly said we should use the power of UTF-8, I don't think so :)
jonasw"the power of unicode", right…
KevYou're right, he said unicode, not UTF-8.
Ge0rGKev: actually, I should have written "a number of random bits packed into as few bytes as possible" - that should rule out non-ascii
KevWhy?
KevYou get denser packing with UTF-8 than with ASCII encoded as UTF-8.
intosiWell, that emoji-encoder uses four bytes per byte ;)
jonaswKev: I’m not sure about that. with ascii, you have a constant 7 bits / byte, with UTF-8 you lose bits for codepoints above 127 for signalling
Ge0rGintuitively, I'm with jonasw here.
KevI might be wrong, conceivably. It doesn't match my mental mapping of UTF-8, but I know that's a poor mapping.
jonaswin any case, that’s not the point of the discussion.
KevI think you get to use the extra bit as encoding for at least the first byte.
KevSure.
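The density question can be checked numerically. The sketch below counts Unicode scalar values by UTF-8 encoded length and computes (a) bits per byte when every character has the same encoded length and (b) the theoretical capacity when lengths are mixed freely, using Shannon's formula for symbols of unequal length. It assumes every scalar value is usable, which overstates what survives resourceprep/PRECIS, so it is an upper bound rather than a practical figure. All names are illustrative.

```python
import math

# Unicode scalar values grouped by UTF-8 encoded length in bytes.
SCALARS_BY_LENGTH = {
    1: 0x80,                      # U+0000..U+007F
    2: 0x800 - 0x80,              # U+0080..U+07FF
    3: 0x10000 - 0x800 - 0x800,   # U+0800..U+FFFF minus surrogates
    4: 0x110000 - 0x10000,        # U+10000..U+10FFFF
}


def bits_per_byte_fixed_length(length: int) -> float:
    """Density when only characters of one encoded length are used."""
    return math.log2(SCALARS_BY_LENGTH[length]) / length


def utf8_capacity_bits_per_byte() -> float:
    """Capacity with mixed lengths: log2 of the root x of
    sum(n_L * x**-L) == 1, found by bisection."""
    def f(x: float) -> float:
        return sum(n * x ** -L for L, n in SCALARS_BY_LENGTH.items()) - 1.0

    lo, hi = 128.0, 256.0          # f(lo) > 0 > f(hi), f is decreasing
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.log2(lo)


if __name__ == "__main__":
    for length in SCALARS_BY_LENGTH:
        print(length, bits_per_byte_fixed_length(length))
    print("mixed:", utf8_capacity_bits_per_byte())
```

Under these assumptions both positions hold in part: any single multi-byte length is less dense than ASCII's 7 bits per byte, while an ideal mix of lengths comes out a little above 7.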
Ge0rGnobody has said yet what attack we are trying to prevent, and how many bits of randomness are required to guard against it
jonasw(using any fancy unicode would be very hard, as I said, with the mappings done by resourceprep et al anyways)
KevThe important thing isn't that these are UUIDs, it's that they have the properties of UUIDs and are consistently implemented.
Ge0rGin a world where clients leak their presence like crazy, having 256 bits of randomness in the resource might be a solution to the wrong problem.
Kevgoes back to work.
ZashXEP-0198 doesn't define what should happen when a session times out, or am I missing something?
ZashGe0rG: Btw, for maximum entropy per byte with base64, get a multiple of 3 bytes. You get a multiple of 4 bytes out and no padding.
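This is easy to verify with a short sketch. The helper name `b64_length_and_padding` is made up for illustration, and standard (not URL-safe) base64 is assumed.

```python
import base64
import secrets


def b64_length_and_padding(nbytes: int):
    """Return (encoded length, number of '=' padding characters) for
    `nbytes` random bytes in standard base64."""
    encoded = base64.b64encode(secrets.token_bytes(nbytes))
    return len(encoded), encoded.count(b"=")


if __name__ == "__main__":
    for nbytes in (3, 4, 5, 6, 12):
        print(nbytes, b64_length_and_padding(nbytes))
```

Input sizes that are not a multiple of 3 still produce output in blocks of 4 characters, with one or two of them spent on padding or partially filled.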