flowralphm, if I am not mistaken, the current rules of rfc7622 disallow unassigned code points in resourceparts, domainparts and probably also localparts
flowi'd say the spec is sound and as sensible as possible, it is the implementations that do not follow the rules and so, once in a while, an invalid jid slips through. That's the main motivation for creating the jid/xmpp strings testframework and the valid/invalid jid corpus
jonas’flow, except that RFC 7622 does not pin the unicode version
jonas’so one entity running on Unicode 10 could consider something as legitimate which an entity on Unicode 9 would not
flowjonas’, right, but as I said earlier, I would consider this to be very rare. But I could be wrong. And I don't think there is a better solution, happy to be proven wrong though
flowThat is, I think the tradeoff of not pinning the unicode version is justified
flowAt least the troubles we had so far are not caused by not pinning the unicode version, as far as i can tell
Ge0rGNo, but they have the same symptoms
ralphmIsn't my example a sign of why this is a problem? Emoji are all Symbols (So), I believe, and as such valid in parts of JIDs. Differing Unicode versions have different ideas on newer codepoints, so also on validity of JIDs?
Ge0rGIf we don't want to break the experience for everybody when somebody employs new unicode, we need to accept unassigned as valid from remote entities
ralphmThe problem with that, though, is unassigneds that become prohibited.
ralphmSince the foremost expert on this is Peter, I suggest someone write an email about this to standards@. He's busy, but it's more likely he can respond there.
Ge0rGI'm not sure he'll be able to solve that problem either ;)
ralphmNo, but he can at least confirm we have this problem and/or know about strategies.
jonas’07:07:12 Ge0rG> If we don't want to break the experience for everybody when somebody employs new unicode, we need to accept unassigned as valid from remote entities
jonas’that’s only a partial solution
jonas’codepoints may change categories and stuff between unicode versions
jonas’and an unassigned codepoint in one version may well be a RTL-codepoint in another version
jonas’so by accepting unassigned input, you may accept something which someone else will consider invalid.
flowwell, since the problem is mostly in resourceparts (localparts and domainparts forbid emojis), we should probably establish a pattern that resourceparts are neither user-configurable nor user-visible. Shame on you xep45! I wonder what the state in MIX is
flowAnd we should probably add a note to xep45 that the use of certain unicode categories is discouraged
flowBut I don't want to be the person to discourage emojis in muc usernames…
jonas’flow, passwords and such are also affected.
flowjonas’, how's that?
Ge0rGlocalparts can be Emoji as well.
jonas’flow, passwords are also passed through stringprep/precis
flowGe0rG, localparts are the UsernameCaseMapped profile of the IdentifierClass, and that class forbids symbols, under which emojis fall, no?
flowMaybe not all emojis, haven't checked them all
ralphmIn MIX, nicks are an attribute of a participant, not part of their identity. However, it also says you have to follow https://tools.ietf.org/html/rfc7700
ralphmWhich in turn depends on Precis FreeformClass, and thus has the same issues as resources.
flowguess users just want emojis in their nickname
flowmaybe there is a reserved for future emojis unicode range?
jonas’there’s still the problem that you can’t do proper normalisation if you don’t know the codepoints
flowwell if the reserved range also states the properties of the eventually assigned codepoints?
jonas’that won’t work
jonas’then they could just be assigned
jonas’stuff like how they combine with fitzpatrick modifiers
flowNo, because you don't know yet what they are assigned to
flowbut if this codepoint is assigned to, then it has the following properties
flowbtw, there is an excellent post about this topic at https://hsivonen.fi/string-length/
Ge0rGflow: I have a user ♥@ツ.op-co.de
flowGe0rG, I am not surprised that you do, if that's the question
ralphmflow: no, what we think of as emoji is all over the place in several Unicode blocks.
flowAnyhow, yes the situation is not perfect, and I am happy if we could improve it. I just don't know how, and I can probably live with the status quo
ralphmI like the one on chess symbols: https://www.unicode.org/charts/PDF/Unicode-12.0/U120-1FA00.pdf
ralphmActually https://tools.ietf.org/html/rfc7564#section-12.3 spells out the issue quite clearly:
“Strings that conform to the FreeformClass and many profiles thereof
can include virtually any Unicode character. This makes the
FreeformClass quite expressive, but also problematic from the
perspective of possible user confusion. Protocol designers are
hereby warned that the FreeformClass contains code points they might
not understand, and are encouraged to profile the IdentifierClass
wherever feasible; however, if an application protocol requires more
code points than are allowed by the IdentifierClass, protocol
designers are encouraged to define a profile of the FreeformClass
that restricts the allowable code points as tightly as possible.”
ralphm(there's a similar remark in the interop section, 13.)
flowsad that the emoji which could express my feelings right now is only coming in unicode 13: Smiling Face With Tear
flowBut is the situation really that bad? Implementations could get the latest unicode standard over some sort of data network once in a while. You don't even have to update the involved libraries etc.
jonas’flow, is that true?
jonas’I think that highly depends on the libraries
jonas’I’m not sure how to update python unicodedata for example without updating python
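For reference, CPython reports the version of the Unicode Character Database it was built with, which is the table jonas’ is referring to. A minimal sketch using only the standard library; the variable names are illustrative. Third-party packages such as unicodedata2 ship newer tables separately, but the built-in module only changes when the interpreter does.

```python
import sys
import unicodedata

# Version of the Unicode Character Database compiled into this interpreter.
# The standard-library table is fixed when Python is built.
ucd_version = unicodedata.unidata_version

# The module also carries a frozen Unicode 3.2 table (the one stringprep is based on).
legacy_version = unicodedata.ucd_3_2_0.unidata_version

print(f"Python {sys.version_info.major}.{sys.version_info.minor}: "
      f"Unicode {ucd_version} (legacy table: {legacy_version})")
```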
ralphmThere are libraries that still do just resourceprep instead of Precis, simply because RFC 6122 is directly linked from RFC 6120, even though it is obsoleted by RFC 7622.
ralphmOne example is Twisted, which I am author of.
ralphmOne could argue that with resourceprep being more restrictive, just having that is at least a bit clearer as an interop goal.
ralphmTo be honest, I don't know what the best course of action is in this regard.
jonas’stay with unicode 3.2 forever
Ge0rGralphm: be liberal in what you accept and strict in what you emit
Zashs/emit/allow users to send/
jonas’would a MUC service be strict or liberal, regarding nicknames for example? :)
ralphmGe0rG: my argument here is that this means that something like U+061C causes problems.
Ge0rGZash: yes, I implied that
ralphmIt was unassigned before (so not valid), then assigned (but still invalid).
ralphmBut 🥓 was unassigned before (so not valid), and now assigned (but valid)
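ralphm's two examples can be checked against two Unicode versions from one Python process, because the standard library keeps a frozen Unicode 3.2 table next to the current one. A small sketch, assuming Python 3.6 or later (so that the current table is at least Unicode 9.0); the helper name `category_by_version` is made up for illustration. It only shows general categories (Cn = unassigned, Cf = format, So = other symbol), not a full PRECIS or resourceprep decision.

```python
import unicodedata

def category_by_version(ch):
    """Return (Unicode 3.2 category, current category) for one character."""
    return unicodedata.ucd_3_2_0.category(ch), unicodedata.category(ch)

ALM = "\u061c"        # ARABIC LETTER MARK, added in Unicode 6.3
BACON = "\U0001f953"  # BACON, added in Unicode 9.0

for name, ch in (("U+061C", ALM), ("U+1F953", BACON)):
    old, new = category_by_version(ch)
    print(f"{name}: Unicode 3.2 -> {old}, "
          f"Unicode {unicodedata.unidata_version} -> {new}")
```

Both code points are unassigned (Cn) in the 3.2 table; in a current table one becomes a format character and the other a symbol, which is why "accept unassigned" cannot tell the two futures apart.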
Ge0rGralphm: yes, but if the MUC service accepts it, other servers or clients receiving it from the MUC shouldn't freak out
Ge0rGi.e. a MUC service can strictly police the nickname, but not the resourcepart of the users' real JID.
jonas’ralphm, it’s not invalid, it’s only invalid if used with LTR characters :)
ralphmA MUC service is not something magical. It is just another server that connects to other servers over s2s and uses JIDs in addressing of stanzas.
ralphmjonas’: it is invalid as it is a control character.
Ge0rGralphm: a regular server should police the resourcepart of local users, but not of remote users.
ralphmjonas’: (for FreeformClass)
jonas’ralphm, ah, fun
ralphmGe0rG: well, that might be a sensible approach, indeed. I'm not sure how well that works with mapping on new code points, and what kind of normalization issues arise from that, but ok.
ralphmIn any case it deserves some wider attention. Maybe even to the XMPPWG mailing list.
Ge0rGralphm: framed differently: you shouldn't police any JIDs that you don't have the authority over, except when they are illegal in a breaking way, i.e. contain " or '
ralphmdoes that include localpart?
ralphmGe0rG: should a server do precis processing on localparts of a remote JID?
ralphmGe0rG: also, for resourcepart, should it a) use incoming JIDs as is (no processing), b) allow unassigneds, but still do Precis, c) something else.
Ge0rGralphm: I'm not sure yet where the point of no return between a and b is, for either localpart or resourcepart
Ge0rGIf you do a, that probably opens up some very interesting ways to break your clients
jonas’I think it boils down to: treat JIDs as opaque if you don’t have authority over them
ralphmYep, things like IV and Ⅳ.
jonas’don’t do normalisation on them, or any processing at all, just treat them as opaque sequences of codepoints
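A rough sketch of what jonas’ proposes here, assuming a server that knows which domains it is authoritative for. The names `split_jid`, `accept_jid` and `no_unassigned` are hypothetical, and `no_unassigned` is only a stand-in for real PRECIS enforcement. Domain comparison is done on raw strings, which is a simplification. This illustrates one position in the discussion, not an agreed policy.

```python
import unicodedata

def split_jid(jid):
    """Split a JID into (localpart, domainpart, resourcepart) without normalizing."""
    # The resourcepart starts after the first '/'; it may itself contain '@' or '/'.
    bare, slash, resource = jid.partition("/")
    local, at, domain = bare.partition("@")
    if not at:
        local, domain = None, bare
    return local, domain, (resource if slash else None)

def no_unassigned(jid):
    """Hypothetical stand-in for PRECIS checks: reject unassigned code points."""
    return all(unicodedata.category(ch) != "Cn" for ch in jid)

def accept_jid(jid, local_domains, validate=no_unassigned):
    """Police only JIDs whose domainpart we are authoritative for."""
    _, domain, _ = split_jid(jid)
    if domain in local_domains:
        return validate(jid)
    return True  # remote JID: treated as an opaque sequence of code points

ours = {"example.com"}
print(accept_jid("nick\u0378@example.com", ours))     # local, contains an unassigned code point
print(accept_jid("nick\u0378@remote.example", ours))  # remote, passed through untouched
```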
ralphm(I followed by V, vs. ROMAN NUMERAL FOUR)
Ge0rGralphm: I don't think _that_ would break things
jonas’it is the domain authority's responsibility to ensure that stuff is valid and comparable when it is emitted from there
flowjonas’, I think so. You sure could build a Python library that does so
ralphmbut you can then have different people with arguably the same nick
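The IV versus Ⅳ case can be reproduced with the standard library. A minimal sketch: the two nicknames differ as raw code point sequences and under NFC, and only collapse under compatibility normalization (NFKC), which, as far as I recall, is the form the RFC 7700 nickname profile applies. The helper `same_after` is made up for illustration.

```python
import unicodedata

ascii_nick = "IV"        # LATIN CAPITAL LETTER I + LATIN CAPITAL LETTER V
numeral_nick = "\u2163"  # ROMAN NUMERAL FOUR

def same_after(form, a, b):
    """Compare two strings after applying the given Unicode normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

print(ascii_nick == numeral_nick)                    # False: opaque comparison
print(same_after("NFC", ascii_nick, numeral_nick))   # False: canonical forms differ
print(same_after("NFKC", ascii_nick, numeral_nick))  # True: compatibility mapping
```

So an entity that treats nicknames as opaque code points will see two distinct occupants where an entity applying NFKC sees a collision.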
Ge0rGralphm: this is something the MUC has authority over.
Ge0rGralphm: if you try to enforce that on your user's server, your user will get kicked
flow> jonas’> ralphm, it’s not invalid, it’s only invalid if used with LTR characters :)
I think it is invalid regardless of the context with rfc7622
ralphmBut I definitely don't want to be so lenient for localpart
Ge0rGralphm: just tear down s2s and blacklist the remote server as incompliant.
ralphmflow: it is invalid in resourceprep because it is unassigned in Unicode 3.2, and invalid in Precis FreeformClass because it is in a prohibited class
Ge0rGConveniently, it also prevents you from contacting the server admin
ralphmjonas’: because (bare) JIDs are identity
jonas’ralphm, from whose perspective are you currently arguing?
ralphmjonas’: I don't want to accept incoming stanzas that fail precis processing on localpart
jonas’as a client? as a MUC service? as a server? as anyone?
ralphmall, I guess
flow> jonas’> don’t do normalisation on them, or any processing at all, just treat them as opaque sequences of codepoints
That would probably open up another box of issues
flowSince Unicode does us so much good, I'd like to suggest that the XSF adopts a character (for as little as 100$, but maybe we could go for silver) before matrix does it: https://www.unicode.org/consortium/adopted-characters.html
jonas’flow, send this to board
flowon my way
jonas’and find a good character thing to sponsor
flowbut I am open for suggestions
Ge0rGI propose U+1F926
Ge0rG💡 U+1F4A1 would be too obvious, right?
SeveWould be nice to have the logo as a character :D
Danielso any client that shows it potentially has f'uped carbon parsing?
moparisthebestyep missing from my Conversations though, neat
moparisthebestI love that mysterious bug finder
DanielGe0rG, do you just dump all the xml?
Ge0rGDaniel: that's from poezio debug log file
Ge0rGEverything old is new again. https://www.cvedetails.com/cve/CVE-2017-5589/
Danielsadly i think dino even existed back then
GuusIt's interesting to ponder on how this can be utilized to have covert discussions en plein public
Ge0rGmoparisthebest: Guus: can you open bug reports?
moparisthebestDaniel, but you said it *didn't* display in your dino? but it did in mine... what version do you have?
ZashGuus, MUC PMs seems simpler
Danielbut maybe it wasn’t stored in muc history
GuusZash: where's the fun in that though
Danielso don’t count on that
moparisthebestAH that makes more sense
moparisthebestmine is built from git HEAD too, but trying to figure out exactly when...
Ge0rGAlso I need to talk to our content manager because the advisory url is 404
ZashMine is whatever Debian package from OBS, and I saw it.
Danielconverse showed it as well?
Ge0rGFunny how the month changed... https://rt-solutions.de/en/2017/01/cve-2017-5589_xmpp_carbons/
Ge0rGConverse was affected back then.
Ge0rGbalu_der_baer: are you a pentester or is your client broken?
Danielthat does not look like a broken client
Daniel(on the sending end)
Ge0rGDaniel: something like delayed delivery gone very much wrong?
Ge0rGNext up: unrequested MAM impersonation
moparisthebestthe `i_am_groot` seems like a dead giveaway for deliberate test
moparisthebestotherwise that'd be an insanely odd client bug
Danielthere is so much low-hanging fruit to pick in the xmpp world
Ge0rGIt's good that somebody does the testing. And this place is actually well suited
ZashSo what's next, shall we try the MEGALOL-attack?
GuusIt would have been nice to share findings though.
GuusI found out by accident.
moparisthebestisn't that what that was? :D
Danieli mean i was wondering why Zash found the p2 story so interesting…
pep.Daniel, same :D
Ge0rG"complain loudly if you can read this"
moparisthebestso you can probably impersonate actual people that are in the MUC right?
Danieldepending on how fucked it is not just muc
Ge0rGmoparisthebest: most probably you can impersonate anyone, even outside of the MUC
Ge0rGmoparisthebest: read the CVE
moparisthebestyea I just meant the XML groot just sent was MUC only, and implied you could impersonate anyone
moparisthebestI'd seen the old general carbons CVE before though
Ge0rGIt's not really new
Ge0rGWe should have a test suite for clients.
Danieli wouldn’t be shocked if dino was vulnerable to CVE-2015-8688
lovetoxso is this covered by this line in the XEP
Danielsomeone should try; probably...
lovetoxAny forwarded copies received by a Carbons-enabled client MUST be from that user's bare JID
lovetoxsomeone cant fake a message from a bare muc jid
GuusUff, this was hard on mobile. https://github.com/conversejs/converse.js/issues/1704
GuusPlease augment if needed
Daniellovetox, it's not just any bare jid. only the user's own bare jid is allowed
Danielthere shouldn’t be carbons in mucs
lovetoxyeah but the server is responsible that there are none
lovetoxat least that says the xep
Danielyour carbons parsing code needs to be wrapped in an if from == null || from == my_account_jid
lovetoxah i get it
lovetoxyes must be from my account bare jid
lovetoxnot a "user"
Danielwhich excludes the shit balu sent
lovetox # Carbon must be from our bare jid
if not stanza.getFrom() == own_jid.getBare():
    raise InvalidFrom('Invalid from: %s' % stanza.getAttr('from'))
lovetoxwas scared i fucked up :) but seems i did this right
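The rule Daniel and lovetox arrive at can be written as a standalone check. This is a simplified sketch, not Gajim's or any other client's actual code: the function names are made up, JIDs are plain strings, and the comparison is done without normalization, which a real client would want to apply first.

```python
from typing import Optional

def bare_jid(jid: str) -> str:
    """Strip the resourcepart: everything from the first '/' onward."""
    return jid.split("/", 1)[0]

def carbon_sender_is_trusted(stanza_from: Optional[str], own_jid: str) -> bool:
    """Accept a carbon wrapper only without 'from' or from our own bare JID."""
    if stanza_from is None:
        return True  # no 'from' attribute: the stanza came from our own server
    return stanza_from == bare_jid(own_jid)

me = "juliet@example.com/phone"
print(carbon_sender_is_trusted(None, me))                             # True
print(carbon_sender_is_trusted("juliet@example.com", me))             # True
print(carbon_sender_is_trusted("room@muc.example.com/juliet", me))    # False
```

The last case is the one from this room: a carbon wrapper relayed through a MUC has the occupant JID as sender and must be dropped before the forwarded payload is looked at.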
pep.That's not a new bug, gajim would have probably been tested at that time :)
Ge0rGI've added a section to the test cases
Ge0rGStill looking for somebody who can implement them
Ge0rGWould probably have to be a component for the MUC parts
Ge0rGOTOH, a bot could fake being a MUC, right?
lovetoxyes pep. but as of course i think i can do everything better i reimplement much code, also carbon parsing
ZashThis carbons thing could be done by a bot
Ge0rGIt was a huge strain to my eyes, my fingers and my patience to add those three lines to the wiki from my android phone.
lovetoxthough it's much harder with MAM
lovetoxi only accept mam messages with query ids that i'm actually waiting for
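A sketch of the bookkeeping lovetox describes, with hypothetical names (`MamQueryTracker` and its methods are not from any client). Checking the sender of the result against the archive that was queried is an extra assumption added here; lovetox only mentions the query ID.

```python
import uuid

class MamQueryTracker:
    """Remember outstanding MAM queries and reject results nobody asked for."""

    def __init__(self):
        self._pending = {}  # query ID -> JID of the archive we queried

    def start_query(self, archive_jid):
        """Register a new query and return the unguessable ID to send with it."""
        query_id = uuid.uuid4().hex
        self._pending[query_id] = archive_jid
        return query_id

    def accept_result(self, query_id, stanza_from):
        """True only if the ID is pending and the result comes from that archive."""
        expected = self._pending.get(query_id)
        return expected is not None and stanza_from == expected

    def finish_query(self, query_id):
        """Forget a query once the archive signals completion."""
        self._pending.pop(query_id, None)

tracker = MamQueryTracker()
qid = tracker.start_query("juliet@example.com")
print(tracker.accept_result(qid, "juliet@example.com"))       # True
print(tracker.accept_result("guessed", "juliet@example.com"))  # False
print(tracker.accept_result(qid, "room@muc.example.com"))      # False
```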
Danielwell you do…
Danieland yes can confirm that dino is vuln to https://gultsch.de/gajim_roster_push_and_message_interception.html
Danielwhy does this shit keep happening
Danielso question is do i fix it now?
Ge0rGDaniel: can you do a roster push through a MUC?
DanielGe0rG: looking at the code I'm relatively certain you could
DanielHaven't tested that one tho
DanielYou have to get lucky to get your iq routed I guess. Lol
Ge0rGDaniel: only with MSN
moparisthebestis there a generic bot/component someplace that can just try all of these things against a JID
pep.Which is probably the default in this MUC
pep.So not a correct target
moparisthebestso it can be used across projects
Ge0rGmoparisthebest: write one please! https://wiki.xmpp.org/web/Client_Test_Cases#Staying_inside
moparisthebestit would probably be hard to write it with most existing libraries, they tend to try to insist on you sending proper things
DanielGlad the spammers haven't found out how to put themselves right into your roster
DanielThe cool thing about that CVE is due to roster version it also won't go away
moparisthebestI'd gladly accept spam from such a smart spammer though
DanielSo my Dino will be stuck with that test jid I injected
moparisthebestmight even buy what he's selling
Ge0rGmoparisthebest: it would get propagated into the spam sending tools and used by dozens of spammers within some weeks
DanielSo who is going to collect the CVE for mam injection in multiple clients?
Ge0rGDaniel: let's wait half a year until there is a significant deployed base
Ge0rGOther than that, I'll gladly volunteer. I need some more CVEs on my CV
ZashCVEs go on your CV?
lovetoxthats why they start with CV..
Ge0rGCurriculum Vitae Extension.
Ge0rGDo we have an up to date entity caps database?
balu_der_baerCan you see me?
pep.Only the hash? Or all features? If it's just hashes, movim probably has a few up to some point in the past(?) https://nl.movim.eu/?about#caps_widget_tab, otherwise I'm sure you can gather some by running code on prosody
ZashA wild haxxor appears
Ge0rGpep.: all the features. Looking for clients with MAM