XSF Discussion - 2018-11-02

  flow

    jonas’, which server side impl does the full flush?

  flow

    ahh it is "bytes saved", that is why full flush percentages are lower…

  148. Str4tocaster has left

  161. mimi89999 has left

  177. lovetox has left

  jonas'

  Ge0rG

  Ge0rG

  pep.

  202. alexis has joined

  Ge0rG

    pep.: yes, that's about what I know. But that link isn't there, so now I'm lost.

  pep.


  ralphm

  Ge0rG

  flow

    jonas’, good job. that sounds like it is possible to make prosody announce a zlib-with-full-flush-or-whatever-you-wanna-call-it compression method

  dwd

    jonas’, Real percentages are much higher, or at least they used to be. I did a load of work compressing real traffic captures - but performing a sync flush after multiple stanzas helps a lot, and using CSI really drives it up.

  Ge0rG

    dwd: CSI doesn't absolve you from sync-flushing after each stanza, right?

  dwd

    Ge0rG, I don't think anything mandates you sync-flush after every stanza - just after every buffer flush.

  flow

    Ge0rG, you only need sync flush if there is no more data

  dwd

    flow, That's more or less what I was typing - you only flush once all the inbound traffic has been processed, at least on C2S.

  flow

    dwd, sorry, I just didn't get what you meant with "after every buffer flush" and wanted to clarify it a bit for Ge0rG

  dwd

    flow, Yeah, I was clarifying it the same way as you, but you beat me to it. :-)

  flow

    dwd, I'm also confused why you wrote "inbound" traffic, I'd say it is the "outbound" traffic where an entity controls the zlib behavior. For inbound traffic it is just consuming whatever bytes have been sent to it

  dwd

    flow, Ah. So if a client sends you N stanzas, you only need to flush after processing all N.
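The trade-off discussed above (a sync flush after every stanza versus one sync flush once the pending batch has been written) can be sketched with Python's standard `zlib` module. The stanzas and function names are made up for illustration; this is not taken from any server or client implementation mentioned in the log.

```python
import zlib

# Hypothetical batch of stanzas queued for the same stream.
stanzas = [
    b"<presence from='romeo@example.net/orchard' to='juliet@example.com'/>",
    b"<message from='romeo@example.net/orchard' to='juliet@example.com'><body>hi</body></message>",
    b"<message from='romeo@example.net/orchard' to='juliet@example.com'><body>there</body></message>",
    b"<iq from='romeo@example.net/orchard' to='example.com' type='get' id='p1'><ping xmlns='urn:xmpp:ping'/></iq>",
]

def flush_per_stanza(batch):
    """Sync-flush after every stanza, so each one is decodable on its own arrival."""
    comp = zlib.compressobj()
    out = b""
    for stanza in batch:
        out += comp.compress(stanza) + comp.flush(zlib.Z_SYNC_FLUSH)
    return out

def flush_per_batch(batch):
    """Feed the whole batch, then sync-flush once when there is no more data."""
    comp = zlib.compressobj()
    out = b"".join(comp.compress(stanza) for stanza in batch)
    return out + comp.flush(zlib.Z_SYNC_FLUSH)

per_stanza = flush_per_stanza(stanzas)
per_batch = flush_per_batch(stanzas)

# Each sync flush costs a few bytes, so fewer flushes means fewer bytes on the wire.
print(len(per_stanza), len(per_batch))
```

Both variants end on a flush boundary, so the receiver can decode everything sent so far; the batch variant simply pays the flush overhead once.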

  Ge0rG

    dwd: the issue we are working around is that compression provides a plaintext size oracle for attackers, right?

  flow

    dwd, that again sounds like the receiving entity would flush

  flow

    which is kind of new to me

  dwd

    Ge0rG, Sure, if you think that's a realistic security problem, then you have to compress only traffic that can be influenced by one entity at a time. Which basically means compressing each stanza individually.

  flow

    dwd, or "do a full flush on every channel change"
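A rough sketch of what "full flush on every channel change" buys, again with Python's `zlib`. The assumption here, based on the zlib manual's description of `Z_FULL_FLUSH`, is that a full flush resets the compressor state so that later data cannot back-reference earlier data. The stanza and the helper name are hypothetical.

```python
import zlib

stanza = (
    b"<message from='romeo@example.net/orchard' to='juliet@example.com'>"
    b"<body>the secret token is Zq7Lx2Vb9Kp4</body></message>"
)

def size_of_repeat(flush_mode):
    """Compress the stanza, flush with flush_mode, then measure a second copy."""
    comp = zlib.compressobj()
    comp.compress(stanza)
    comp.flush(flush_mode)
    second = comp.compress(stanza) + comp.flush(zlib.Z_SYNC_FLUSH)
    return len(second)

# After a sync flush the second copy can point back at the first one.
after_sync = size_of_repeat(zlib.Z_SYNC_FLUSH)
# After a full flush the history is no longer used for matching.
after_full = size_of_repeat(zlib.Z_FULL_FLUSH)

print(after_sync, after_full)
```

The second copy is tiny after a sync flush and roughly full-size after a full flush, which is the property that keeps one channel's content from influencing the compressed size of the next.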

  Ge0rG

    dwd: all security problems tend to become realistic sooner or later.

  dwd

    flow, It is exactly that. But it only makes a difference on C2S, I hasten to add.

  flow

    dwd: hmm, well as long as the from/to pair does not change on s2s…

  dwd

    Ge0rG, Sure. But the compression oracle in HTTP was significant because it allowed access to password data, for example.
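For readers unfamiliar with the compression oracle being discussed, here is a toy illustration using Python's `zlib`. It assumes an attacker who can inject a stanza into the same compressed stream as a victim's stanza and observe the resulting size; the JIDs, the token, and the function name are invented for the example.

```python
import zlib

# Victim stanza containing a secret the attacker wants to learn.
SECRET_STANZA = (
    b"<message from='juliet@example.com'>"
    b"<body>token=Zq7Lx2Vb9Kp4</body></message>"
)

def observed_size(guess: bytes) -> int:
    """Size on the wire when an attacker stanza shares the stream with the secret."""
    comp = zlib.compressobj()
    injected = (
        b"<message from='mallory@example.org'><body>token="
        + guess
        + b"</body></message>"
    )
    out = comp.compress(SECRET_STANZA)
    out += comp.compress(injected)
    out += comp.flush(zlib.Z_SYNC_FLUSH)
    return len(out)

right = observed_size(b"Zq7Lx2Vb9Kp4")  # matches the secret
wrong = observed_size(b"Mw3Ht8Rc5Jn1")  # same length, no match

# A correct guess is replaced by a back-reference, so the stream is shorter.
print(right, wrong)
```

A real attack would refine the guess a few characters at a time; the sketch only shows that the size difference exists when both stanzas share one compression context.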

  flow

    you don't need to drop the dictionary, I think

  Ge0rG

    dwd: are you saying that s2s is not affected by the oracle vulnerability, or that the channel stays always the same between the two server domains?

  flow

    Ge0rG, I think he meant that the channel changes with every stanza

  flow

    but I'd argue that the channel stays stable until the from/to pair changes

  flow

    whereas in c2s, one of from/to is always fixed

  flow


  251. Ge0rG

    Ge0rG

  252. dwd


  flow

    I don't follow how this is implied by that logic

  dwd

    I think that you can run a compression-oracle attack on S2S more easily - I think it's easier to inject traffic, and possibly easier to witness the transport channel as well - but you'd find it harder to get anything useful once you had the attack in place.

  Ge0rG

    dwd: the compression oracle in HTTP made it comparatively easy to extract credentials, yes. But it does apply to content as well, just that it's rather hard for an attacker to control data injected after the typical body of a web site.

  256. Ge0rG

    with XMPP, the game is vastly different

  flow

    Slightly unrelated: I also wonder how widespread s2s compression is

  dwd

    flow, Not very. Early versions of Openfire did it, but we disabled it (because it stopped working).

  Ge0rG

    I don't know, but I'd argue that s2s compression is largely irrelevant in typical federated deployments

  dwd

  264. dwd has left

  flow

    Ge0rG, I'm not sure about the "irrelevant" part

  Ge0rG

    flow: irrelevant in the sense that you are not gaining much from it

  flow

    Not everyone hosts their XMPP server in a well-connected datacenter

  flow

    Ge0rG, I figured as much, but I still believe that this may not be true in every case.

  Ge0rG

    flow: if you run your XMPP server in your basement on a crappy ADSL line, you are probably not going to use IBB transfers much

  flow

    Ge0rG, i was thinking more about third world countries

  dwd

    flow, Well, tactical military deployments are all S2S over long/thin links, but usually with heavy compression on the links themselves, so I'm not sure '138 would be needed.

  dwd

    Still, all this is rather irrelevant. If we posit that content ends up fully encrypted under OMEMO/MLS/OX/eSessions/PGP then it's incompressible (one hopes). The remaining traffic is best compressed by EXI.

  flow

    True, but then again, we are far from the point where content is fully, or even mostly, encrypted. It may take years until we reach that.

  274. dedekin has joined

  flow

    So I am again not sure about the "irrelevant" part :)

  Ge0rG

    dwd: if the s2s connection is encrypted, you can't compress it much on the underlying link layer

  dwd

    Ge0rG, Which is why they don't in those circumstances.

  jonas'

    dwd, right, so both the prosody and aioxmpp implementations do a sync (or full) flush after each stanza

  jonas'

    so by implementing some CORKing, that could be made better I suppose

  dwd


  flow

    jonas’, CORKing?

  jonas'

    flow, like TCP CORK, where you wait for more data for a short period of time before sending it out

  flow

    ahh, Nagle's algorithm, right?

  MattJ

    Similar, but manual

  dwd

    jonas’, Yeah, that's Nagle not CORK. CORK is holding the transmission until you manually release it.
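The distinction drawn here between Nagle's algorithm (automatic) and corking (manual hold and release) maps onto two socket options. The following is a small sketch in Python; `TCP_CORK` is Linux-specific, so the helper checks for it, and both function names are made up for illustration.

```python
import socket

def disable_nagle(sock: socket.socket) -> None:
    """Turn off Nagle's algorithm so small writes are sent without waiting."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

def set_cork(sock: socket.socket, corked: bool) -> bool:
    """Hold (True) or release (False) output manually; returns False if unsupported."""
    cork = getattr(socket, "TCP_CORK", None)  # only defined on Linux
    if cork is None:
        return False
    sock.setsockopt(socket.IPPROTO_TCP, cork, 1 if corked else 0)
    return True
```

A sender that wants to batch would cork the socket, write several stanzas, then uncork. Batching at the zlib layer, as discussed in this log, happens before the bytes ever reach the socket and does not depend on either option.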

  jonas'

    dwd, isn’t nagle that thing which reduces the data rate when stuff gets lost?

  jonas'

    I am lost in the TCP terminology, sorry for the confusion.

  flow

    jonas’, I don't think so

  jonas'

    you’re probably both right :)

  jonas'

    nevermind me, you know what I mean (now) though :)

  dwd

    jonas’, No, that's backoff, which might well have been developed by John Nagle, but doesn't bear his name at least.

  flow

    Nagle just defers the write to widen the window for more data from the application

  MattJ

    jonas’, Nagle's is basically automatic corking at the beginning of a connection

  flow

    jonas'

  305. jonas’


  flow

    now that I read up on TCP_CORK I can imagine that it isn't heavily used because it appears to be error prone

  dwd

    jonas’, And then there's the reverse - lose stuff when the data rate drops - which is best done with RED, which is Sally Floyd's design as I recall. But I don't think that makes sense in XMPP.

  jonas'

    this was more about the concept anyways

  dwd

    flow, Very platform specific too, and irrelevant to us because we need to compress as we go, I think.

  flow

    MattJ, "at the beginning of a connection"? Isn't nagle used over the whole lifetime of a connection (if enabled)?

  flow

    jonas’, if the idea is to wait for more outbound stanzas before giving the network layer the green light to send them, then I'm fully with you. I'd also like to note that Smack has allowed that for many years. Even though I implemented it to reduce the powered-up time of the radio, it also helps the compression ratio

  jonas'

    flow, no, the idea is to wait for more stanzas before performing the full/sync flush in zlib

  jonas'

    instead of flushing after each stanza

  flow

    MattJ, no worries, just wanted to make sure that I'm not missing something

  jonas'

    (of course taking into account the "(to, from) pair must match to be secure" criterion)

  flow

    jonas’, I think we are talking about the same mechanism

  jonas'


  325. flow

    flow

  jonas'


  Ge0rG

    what about having a zlib dictionary per JID?

  jonas'

    memory cost

  jonas'

    and I think both parties need to agree on the dictionary beforehand

  jonas'

    so you’d have to transfer that dictionary every time you switch?

  jonas'

    or if you had multiple compression streams, you’d have to have an out-of-band way to signal to the peer which one the next bytes belong to

  Ge0rG


  dwd

    You could build state-switching into the compression framing, of course, but yeah - memory cost would be scary-huge.

  jonas'

    regarding the use of compression and e2ee: zlib seems to be rather good at reversing the base64-bloat, so that’s at least something.
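The remark about zlib undoing base64 bloat can be checked with a quick experiment. This sketch uses seeded pseudo-random bytes as a stand-in for an end-to-end encrypted payload, which is an assumption made purely for illustration; real stanzas carry XML around the payload as well.

```python
import base64
import random
import zlib

rng = random.Random(2018)  # fixed seed so the experiment is repeatable
# Stand-in for incompressible ciphertext.
ciphertext = bytes(rng.getrandbits(8) for _ in range(6000))

encoded = base64.b64encode(ciphertext)  # 4 output bytes per 3 input bytes
compressed = zlib.compress(encoded)

# Base64 only uses 64 symbols, so entropy coding can win back most of the 33% overhead.
print(len(ciphertext), len(encoded), len(compressed))
```

The compressed size should land close to the size of the raw ciphertext: the gain comes from the restricted alphabet, not from any structure in the encrypted content itself.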

  Ge0rG

    We need a way to embed raw bytestreams into XML.

  Ge0rG

    Or just replace XML with... protobufs? ASN.1?

  jonas'

    using base92 would probably go a long way already

  Zash


  jonas'

    (or was it 96?)

  jonas'

    anything above that would give diminishing returns due to UTF-8 encoding anyways

  351. alexis has joined

  Ge0rG

    jonas’: base-91

  Ge0rG

    yeah, UTF-8 is not an efficient encoding.

  dwd

    Well. Not in terms of bits, anyway.

  jonas'


  jonas'

    base91 uses < and >

  jonas'

    and &

  jonas'

    while not-using -, \ and '

  Ge0rG

    Does anybody still remember https://en.wikipedia.org/wiki/YEnc ?

  jonas'

  jonas'

  dwd

  jonas'

    don’t shut the door on namespaced attributes completely.

  Ge0rG

    XML is really a horrible encoding protocol for machines.

  flow

    what jonas’ said

  Zash

    It's fine, don't worry too much

  dwd

    jonas’, Well, it wouldn't matter if they were considered an outlier and not encoded very efficiently, at least.

  jonas'

    dwd, that’s true

  379. Guus has left

  dwd

    Ge0rG, I quite like many of the properties of XML for our purposes. Certainly the alternatives would make a bunch of things much more painful - and I always have a nagging feeling that a construct like JSON imposes a data structure that is hard to break away from.

  Zash

    Do something like header compression in h2?

  Ge0rG

    dwd: JSON shares most of the disadvantages of XML

  Zash


  Ge0rG

    I liked the MIDI format, where all numbers are dynamic-width.

  dwd

    Ge0rG, Or BER, where they can be?

  jonas'


  jonas'


  Ge0rG

    dwd: I'd go with DER for lesser ambiguity

  dwd

    Ge0rG, CER?

  Ge0rG

    Also whoever made it possible to encode U+0000 as an arbitrarily long UTF-8 sequence deserves the highest punishment.

  jonas'

    tell me more

  jonas'

    can’t you encode all things as arbitrarily long utf-8 sequence though?

  dwd

    jonas’, Only by ignoring the standard.

  jonas'

    but that’s not true for U+0000?

  Zash

    JSON Encoding Rules

  Ge0rG

    jonas’: I'm only bitching because U+0000 has special meaning in C.

  Zash

    Is a thing

  Ge0rG

    jonas’: https://en.wikipedia.org/wiki/UTF-8#Description - UTF-8 just stuffs the data bits after the header. A sane encoding would be to automatically add 0x80 to the bits in a two-byte encoded codepoint, because you can represent the first 0x80 values in one byte, etc.

  Ge0rG

  Link Mauve

    “Possibly even ignore namespaced attributes, since we never (?) use them.”, we do, @xml:lang for instance.

  Link Mauve

    dwd, ↑

  dwd

    Ah, true. But known ones like that we'd handle differently anyway.

  Link Mauve

    “12:00:46 Ge0rG> Also whoever made it possible to encode U+0000 as an arbitrarily long UTF-8 sequence deserves the highest punishment.”, you’re expected to reject it though.

  Link Mauve

    Same as any other overly-long sequence.
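To make the overlong-sequence point concrete, here is a short Python sketch. The naive encoder is a deliberately incomplete helper written for this example: it stuffs the payload bits into the two-byte pattern without the range check, which is how an overlong U+0000 comes about. Python's built-in strict decoder then rejects it.

```python
def naive_two_byte_encode(codepoint: int) -> bytes:
    """Fill the 110xxxxx 10xxxxxx pattern with payload bits, with no range check."""
    return bytes([0xC0 | (codepoint >> 6), 0x80 | (codepoint & 0x3F)])

def is_valid_utf8(data: bytes) -> bool:
    """True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# U+0000 fits in one byte, so the two-byte form is overlong and must be rejected.
overlong_nul = naive_two_byte_encode(0x00)
print(overlong_nul, is_valid_utf8(overlong_nul), is_valid_utf8(b"\x00"))
```

For code points that really need two bytes, such as U+00E9, the same bit-stuffing yields the normal, valid encoding.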

  425. vaulor has joined

  426. alexis has joined

  427. Guus has left

  428. Guus has joined

  429. Guus has left

  430. genofire has left

  431. Holger has left

  432. alexis has left

  433. ThibG has joined

  434. genofire has left

  435. krauq has left

  436. krauq has joined

  dwd

    Oh. I found an actual bug in MUC.

  MattJ

    I'm all ears

  Ge0rG

    No way!

  jonas'

    Just one?

  dwd

    Well, sorta, anyway. When a client drops, it sends unavailable to the MUC automatically because Magic(tm) on the server.

  dwd

    But if the MUC switches nickname on join (210 code stuff), then the directed presence recorded on the server is wrong, and the user never leaves.

  jonas'


  MattJ

    Oh, that one

  jonas'

    that’s a known issue

  jonas'

    servers need to track nickname changes for that :)

  dwd

    I'd seen it with nickname changes, but it didn't occur to me (for some reason) it'd happen with nickname enforcing.
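The bug and its fix ("servers need to track nickname changes") can be sketched as a small bookkeeping structure. Everything below is hypothetical: the class, method names, and JIDs are invented, and only the MUC status codes 110 (self-presence), 210 (nickname assigned or modified by the service), and 303 (nickname change) come from XEP-0045.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

@dataclass
class DirectedPresenceTracker:
    """Hypothetical per-session record of where directed presence was sent."""
    # bare JID of the room -> occupant JID the server should send unavailable to
    occupants: Dict[str, str] = field(default_factory=dict)

    def on_outbound_presence(self, to_jid: str, presence_type: str = "available") -> None:
        room = to_jid.partition("/")[0]
        if presence_type == "unavailable":
            self.occupants.pop(room, None)
        else:
            self.occupants[room] = to_jid

    def on_inbound_presence(self, from_jid: str, status_codes: Set[int],
                            presence_type: str = "available") -> None:
        room = from_jid.partition("/")[0]
        if 110 not in status_codes or room not in self.occupants:
            return  # not our own presence in a tracked room
        if presence_type == "unavailable":
            if 303 not in status_codes:
                self.occupants.pop(room, None)  # we really left
            return  # with 303, presence from the new nick follows
        # Self-presence: trust the nick the room actually uses (covers 210 too).
        self.occupants[room] = from_jid

    def on_session_lost(self) -> List[str]:
        """Addresses that should receive an automatic unavailable presence."""
        return sorted(self.occupants.values())
```

Without the update in `on_inbound_presence`, the tracker would keep the nickname the client asked for rather than the one the room enforced, and the automatic unavailable presence would go to the wrong occupant JID.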

  Ge0rG

    Why can't we just implement MUC proxies on the server.

  Ge0rG

    That really would solve 99% of MUC's problems, in a backward compatible manner

  Ge0rG

    Zash even wrote a POC already.

  Ge0rG

    It's got some minor drawbacks, like you can't ever leave a MUC.

  fippo

    ge0rg: i think one of the dmuc proposals took that approach

  jonas'

    which is fun, by the way, because it means that the user’s server needs to support MUC for it to work properly :-)

  jonas'

    which reminds me of MIX

  jonas'

    except that with MUC, this requirement is hidden and not spelt out and you can join a MUC without that requirement fulfilled and have it work to a certain extent and then run in weird edge cases :)

  Ge0rG

    jonas’: you mean the weird edge cases we cope with every day now?

  Ge0rG

    Like never leaving a MUC if you changed your nickname?

  jonas'


  Ge0rG

    The awesome thing about MUC Proxy would be that it's 100% transparent to the clients and can be rolled out in an instant as an upgrade to fix most of the issues.

  Ge0rG

    Also could include offline notifications and other nice things.

  jonas'


  477. Zash has joined

  jonas'

    it would be somewhat like biboumi but for xmpp

  jonas'

  Ge0rG

    jonas’: the quirks are there because the biboumi developers violently refuse to accept what's good design and practice.

  jonas'

    hm, where?

  Ge0rG

    jonas’: like where they send you individual messages to all of your resources with Carbons disabled?

  jonas'

    what would be a better way?

  491. Alex has joined

  flow

    I don't see a problem with that either, but I believe it should be the responsibility of the receiving entity that the messages arrive on all devices (if it wishes so), not of the sending one

  493. matlag has left

  494. Valerian has joined

  Ge0rG

    flow: the problem is that if you go offline, your messages get rerouted to a different resource, which ends up with two, three or four copies

  500. guusdk has left

  flow

    Ge0rG, ahh, ok I see the issue now.

  jonas'

    Ge0rG, but on the other hand, relying on carbons would mean that resources which are not interested in those messages (read: not joined in any IRC) get them.

  flow

    but wait,

  jonas'

    there’s no good solution here

  jonas'

    and we’ll have the same issues with MUC proxies.

  flow

    you have to go offline while biboumi is sending, otherwise biboumi won't know of the resource

  flow

    Ge0rG, do you experience that a lot?

  Ge0rG

    flow: there used to be a long discussion on the biboumi tracker

  flow

    with many people reporting to hit that issue of duplicate messages?

  Ge0rG

    jonas’: that's the same problem as with MUCs you join from one client only and the PM Carbons.

  511. Ge0rG

    Ge0rG

  Ge0rG


  513. jonas’

    jonas'

  Ge0rG

    > Opened 1 year ago by Jonas Schäfer

  jonas'

    > Closed

  Ge0rG

    Also https://lab.louiz.org/louiz/biboumi/issues/3304

  jonas'

    also Closed

  Ge0rG

    jonas’: took some months to convince them.

  519. guusdk has joined

  520. Guus has joined

  jonas'

    not for #3277

  522. Zash has left

  Ge0rG

    jonas’: I can't find a way to search for comments by me, but I'm sure most of those would be bitching about how the developers don't understand XMPP.

  jonas'

    I wouldn’t accuse them of that.

  jonas'

    also, they’re still doing great work. I’m fine with the community ironing out the rough edges by filing issues.

  Ge0rG

    jonas’: oh, yes they are.

  Ge0rG

    biboumi is the best cross-protocol gateway I've ever seen.

  jonas'


  Ge0rG

    jonas’: the other thing being https://lab.louiz.org/louiz/biboumi/issues/3283

  530. Zash has left

  531. lorddavidiii has joined

  532. efrit has left

  533. l has joined

  534. l has joined

  535. l has joined

  jonas'

    Ge0rG, that might be fixed during the refactor mentioned in #3382

  Ge0rG

    jonas’: it's not about things being fixed, it's about how hard it is to convince the developers that they _need_ to be fixed.

  jonas'

    edge-cases all abound

  jonas'

    lots of edge-cases not only means lots of code to write, it also means lots of hard-to-reproduce stuff which will be tricky to nail down and prove.

  jonas'

    and we’ll have exactly the same issues with a MUC proxy

  Ge0rG

    I'm a certified MUC Corner Case Debugging Engineer.

  Zash

    If that's the case, where's your diploma?

  543. fffo881 has joined

  544. fffo881 has left

  Ge0rG


  546. lumi has left

  547. Ge0rG has left

  Seve

    Good job Ge0rG! You deserve it!

  jonas'

    well done

  Seve claps

  jonas'

    put it on your council application

  jonas’ wonders about the significance of that date

  Ge0rG

    jonas’: @horazont horazont merged commit b017284 into xsf:master on Mar 8

  jonas'

    ah, #stable_id

  edhelas

    Ge0rG don't fix too much MUC, we'll not have reasons to work on MIX anymore

  Ge0rG

    jonas’: good idea!

  Ge0rG

    edhelas: now you uncovered my evil secret plan!

  edhelas

    Make MUC Great Again

  Zash

    MUC was never great

  570. Guus has left

  571. guusdk has left

  572. j.r has joined

  573. Guus has joined

  574. guusdk has joined

  575. Guus has left

  576. guusdk has left

  577. Guus has joined

  578. guusdk has joined

  579. daniel has left

  580. daniel has joined

  581. Alex has left

  pep.

    Who can modify the xsf calendar? To add 35C3

  583. guusdk has joined

  584. guusdk has left

  585. guusdk has joined

  586. alacer has joined

  587. dwd has joined

  588. tux has joined

  pep.

    I still have one last voucher btw, if people are interested. Grab it now or it will expire

  590. labdsf has left

  591. dwd has left

  592. alacer has left

  593. lskdjf has joined

  594. SamWhited has joined

  595. !xsf_martin has joined

  596. lovetox has joined

  597. muppeth has left

  598. muppeth has joined

  599. labdsf has joined

  600. lskdjf has left

  601. waqas has joined

  602. waqas has left

  603. ThibG has left

  604. ThibG has joined

  605. waqas has joined

  606. lskdjf has joined

  607. lskdjf has left

  608. grumpy has joined

  609. lskdjf has left

  610. lskdjf has joined

  611. Alex has joined

  612. lskdjf has left

  613. lskdjf has joined

  614. tux has joined

  615. ThibG has left

  616. ThibG has joined

  629. krauq has joined

  edhelas

    In 0060 the <configure/> tag is defined this way:
      <xs:element name='configure'>
        <xs:complexType>
          <xs:choice minOccurs='0' xmlns:xdata='jabber:x:data'>
            <xs:element ref='xdata:x'/>
          </xs:choice>
        </xs:complexType>
      </xs:element>

  edhelas

    However I see some <configure node='princely_musings'> in the examples

  edhelas

    Shouldn't we add <xs:attribute name='node' type='xs:string' use='required'/> ?

  647. Andrew Nenakhov has joined

  663. Zash has left

  Ge0rG

    Our wiki also has a horrible mobile experience. Pinging I-team

  690. dwd has joined

  715. Steve Kille has joined

  716. Tobias has joined

  732. Guus has left

  733. Guus has joined

  734. Guus has left

  ralphm

    edhelas: well, not required. If using collections, you also want to be able to configure the root node, which is basically leaving off the node attribute.

  750. valo has joined

  751. Alex has left

  ralphm

    Also, you're looking at the wrong namespace. Try pubsub#owner

  ralphm

    The one in the regular pubsub node goes together with <create/> where you already have the node reference.

  ralphm

    eh, pubsub namespace

  edhelas

    ralphm thanks for the clarification

  edhelas

    my bad

  ralphm

    So example 137 vs 140
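The two `<configure/>` elements being contrasted can be built side by side with Python's `xml.etree.ElementTree`. The function names are invented for this sketch, and the data form that would normally sit inside `<configure/>` is omitted.

```python
import xml.etree.ElementTree as ET

PUBSUB = "http://jabber.org/protocol/pubsub"
OWNER = "http://jabber.org/protocol/pubsub#owner"

def create_and_configure(node: str) -> ET.Element:
    """pubsub namespace: <create/> carries the node, <configure/> has no node attribute."""
    pubsub = ET.Element(f"{{{PUBSUB}}}pubsub")
    ET.SubElement(pubsub, f"{{{PUBSUB}}}create", {"node": node})
    ET.SubElement(pubsub, f"{{{PUBSUB}}}configure")
    return pubsub

def owner_configure(node=None) -> ET.Element:
    """pubsub#owner namespace: <configure/> names the node, or omits it for the root node."""
    pubsub = ET.Element(f"{{{OWNER}}}pubsub")
    attrs = {} if node is None else {"node": node}
    ET.SubElement(pubsub, f"{{{OWNER}}}configure", attrs)
    return pubsub

print(ET.tostring(create_and_configure("princely_musings"), encoding="unicode"))
print(ET.tostring(owner_configure("princely_musings"), encoding="unicode"))
```

This mirrors ralphm's explanation: the `node` attribute belongs on the owner-namespace `<configure/>`, and it stays optional there so the root collection node can be configured.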

  ralphm

    no worries

  ralphm

    I still regret we used multiple namespaces

  Zash

    Having the verb another level in is weird too

  783. Andrew Nenakhov has joined

