-
flow
jonas’, which server side impl does the full flush?
-
flow
ahh it is "bytes saved", that is why full flush percentages are lower…
-
jonas’
flow, I hacked the prosody impl to do a full flush
-
Ge0rG
ain't it called a "straight flush"? 😁
-
Ge0rG
how do you add images to the xmpp wiki?
-
pep.
There's a special page for uploading files on MediaWiki? It's usually accessible from the panel on the left when editing a page
-
Ge0rG
pep.: yes, that's about what I know. But that link isn't there, so now I'm lost.
-
pep.
/Special:Upload
-
pep.
If not maybe that's disabled?
-
ralphm
It is disabled
-
Ge0rG
And I presume you can't inline-link externally hosted files?
-
flow
jonas’, good job. that sounds like it is possible to make prosody announce a zlib-with-full-flush-or-whatever-you-wanna-call-it compression method
-
dwd
jonas’, Real percentages are much higher, or at least they used to be. I did a load of work compressing real traffic captures - but performing a sync flush after multiple stanzas helps a lot, and using CSI really drives it up.
-
Ge0rG
dwd: CSI doesn't absolve you from sync-flushing after each stanza, right?
-
dwd
Ge0rG, I don't think anything mandates you sync-flush after every stanza - just after every buffer flush.
-
flow
Ge0rG, you only need sync flush if there is no more data
-
dwd
flow, That's more or less what I was typing - you only flush once all the inbound traffic has been processed, at least on C2S.
-
flow
dwd, sorry, I just didn't get what you meant with "after every buffer flush" and wanted to clarify it a bit for Ge0rG
-
dwd
flow, Yeah, I was clarifying it the same way as you, but you beat me to it. :-)
-
flow
dwd, I'm also confused why you wrote "inbound" traffic, I'd say it is the "outbound" traffic where an entity controls the zlib behavior. For inbound traffic it is just consuming whatever bytes have been sent to it
-
dwd
flow, Ah. So if a client sends you N stanzas, you only need to flush after processing all N.
-
Ge0rG
dwd: the issue we are working around is that compression provides a plaintext size oracle for attackers, right?
-
flow
dwd, that again sounds like the receiving entity would flush
-
flow
which is kind of new to me
-
dwd
Ge0rG, Sure, if you think that's a realistic security problem, then you have to compress only traffic that can be influenced by one entity at a time. Which basically means compressing each stanza individually.
-
flow
dwd, or "do a full flush on every channel change"
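
To make the difference between the two flush modes concrete, here is a minimal sketch using Python's standard zlib module. It assumes a single compression stream per connection and sends the same made-up stanza twice; the stanza text and the helper name chunk_sizes are illustrative and not taken from any server implementation. With Z_SYNC_FLUSH the second copy can be encoded as a back-reference to the first, so its size depends on what was sent before. With Z_FULL_FLUSH the compressor state is reset and the second copy costs roughly as much as the first.

```python
import zlib


def chunk_sizes(stanzas, flush_mode):
    """Compress stanzas on one zlib stream, flushing after each one.

    Returns the number of bytes emitted for each stanza.
    """
    comp = zlib.compressobj()
    sizes = []
    for stanza in stanzas:
        out = comp.compress(stanza) + comp.flush(flush_mode)
        sizes.append(len(out))
    return sizes


# Made-up example stanza, sent twice on the same stream.
stanza = (
    b"<message to='juliet@example.com' from='romeo@example.net/orchard' "
    b"type='chat'><body>Wherefore art thou?</body></message>"
)

sync_sizes = chunk_sizes([stanza, stanza], zlib.Z_SYNC_FLUSH)
full_sizes = chunk_sizes([stanza, stanza], zlib.Z_FULL_FLUSH)
print("sync flush:", sync_sizes)
print("full flush:", full_sizes)
```

The size of the second chunk under sync flush is what an observer could use as an oracle; the full flush removes that dependency at the price of a worse ratio.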
-
Ge0rG
dwd: all security problems tend to become realistic sooner or later.
-
dwd
flow, It is exactly that. But it only makes a difference on C2S, I hasten to add.
-
flow
dwd: hmm, well as long as the from/to pair does not change on s2s…
-
dwd
Ge0rG, Sure. But the compression oracle in HTTP was significant because it allowed access to password data, for example.
-
flow
you don't need to drop the dictionary, I think
-
Ge0rG
dwd: are you saying that s2s is not affected by the oracle vulnerability, or that the channel stays always the same between the two server domains?
-
flow
Ge0rG, I think he meant that the channel changes with every stanza
-
flow
but I'd argue that the channel stays stable until the from/to pair changes
-
flow
whereas in c2s, on of from/to is always fixed
-
flow
*one
-
Ge0rG
By that logic, with CSI we should reorder messages so that same-channel messages are sent consecutively
-
dwd
*sigh*
-
flow
I don't follow how this is implied by that logic
-
dwd
I think that you can run a compression-oracle attack on S2S more easily - I think it's easier to inject traffic, and possibly easier to witness the transport channel as well - but you'd find it harder to get anything useful once you had the attack in place.
-
Ge0rG
dwd: the compression oracle in HTTP made it comparatively easy to extract credentials, yes. But it does apply to content as well, just that it's rather hard for an attacker to control data injected after the typical body of a web site.
-
Ge0rG
with XMPP, the game is vastly different
-
flow
Slightly unrelated: I also wonder how widespread s2s compression is
-
dwd
flow, Not very. Early versions of Openfire did it, but we disabled it (because it stopped working).
-
Ge0rG
I don't know, but I'd argue that s2s compression is largely irrelevant in typical federated deployments
-
dwd
It wouldn't surprise me if M-Link did it, given its use-cases, but I don't know (and it's a looong time since I knew that kind of thing).
-
flow
Ge0rG, I'm not sure about the "irrelevant" part
-
Ge0rG
flow: irrelevant in the sense that you are not gaining much from it
-
flow
Not everyone hosts their XMPP server in a well-connected datacenter
-
flow
Ge0rG, I figured so far, but I still believe that this may not be true in every case.
-
Ge0rG
flow: if you run your XMPP server in your basement on a crappy ADSL line, you are probably not going to use IBB transfers much
-
flow
Ge0rG, i was thinking more about third world countries
-
dwd
flow, Well, tactical military deployments are all S2S over long/thin links, but usually with heavy compression on the links themselves, so I'm not sure '138 would be needed.
-
dwd
Still, all this is rather irrelevant. If we posit that content ends up fully encrypted under OMEMO/MLS/OX/eSessions/PGP then it's incompressible (one hopes). The remaining traffic is best compressed by EXI.
-
flow
True, but then again, we are far from the point where content is fully, or even mostly, encrypted. It may take years until we reach that.
-
flow
So I am again not sure about the "irrelevant" part :)
-
Ge0rG
dwd: if the s2s connection is encrypted, you can't compress it much on the underlying link layer
-
dwd
Ge0rG, Which is why they don't in those circumstances.
-
jonas’
dwd, right, so both the prosody and aioxmpp implementations do a sync (or full) flush after each stanza
-
jonas’
so by implementing some CORKing, that could be made better I suppose
-
fffo881
F
-
dwd
G
-
flow
jonas’, CORKing?
-
jonas’
flow, like TCP CORK, where you wait for more data for a short period of time before sending it out
-
flow
ahh, Nagle's algorithm, right?
-
MattJ
Similar, but manual
-
dwd
jonas’, Yeah, that's Nagle not CORK. CORK is holding the transmission until you manually release it.
-
jonas’
dwd, isn’t nagle that thing which reduces the data rate when stuff gets lost?
-
jonas’
I am lost in the TCP terminology, sorry for the confusion.
-
flow
jonas’, I don't think so
-
jonas’
you’re probably both right :)
-
jonas’
nevermind me, you know what I mean (now) though :)
-
dwd
jonas’, No, that's backoff, which might well have been developed by John Nagle, but doesn't bear his name at least.
-
flow
Nagle just defers the write to widen the window for more data from the application
-
MattJ
jonas’, Nagle's is basically automatic corking at the beginning of a connection
-
flow
batch/bundle and defer
-
jonas’
ok
-
flow
now that I read up on TCP_CORK I can imagine that it isn't heavily used because it appears to be error prone
-
dwd
jonas’, And then there's the reverse - lose stuff when the data rate drops - which is best done with RED, which is Sally Floyd's design as I recall. But I don't think that makes sense in XMPP.
-
jonas’
this was more about the concept anyways
-
dwd
flow, Very platform specific too, and irrelevant to us because we need to compress as we go, I think.
-
flow
MattJ, "at the beginning of a connection"? Isn't nagle used over the whole lifetime of a connection (if enabled)?
-
MattJ
Mmm, yeah, sorry
-
flow
jonas’, if the idea is to wait for more outbound stanzas until you give the network layer the green light to send them, then I'm fully with you. And I'd like to note that Smack has allowed for that for many years. Even though I implemented it to reduce the powered-up time of the radio, it will also help with the compression ratio
-
jonas’
flow, no, the idea is to wait for more stanzas before performing the full/sync flush in zlib
-
jonas’
instead of flushing after each stanza
-
flow
MattJ, no worries, just wanted to make sure that I'm not missing something
-
jonas’
(of course taking into account the "(to, from) pair must match to be secure" criterion)
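
A minimal sketch of that batching idea, using Python's standard zlib module. Everything here is an assumption for illustration: the helper names compress_batch and compress_each, the representation of a channel as a (from, to) tuple, and the policy itself (full flush when the channel changes, one sync flush once the outbound queue is drained). It is not how Prosody, aioxmpp or Smack implement it.

```python
import zlib


def compress_batch(comp, batch, last_channel=None):
    """Compress a batch of (channel, stanza_bytes) pairs on one stream.

    Hypothetical policy: Z_FULL_FLUSH whenever the channel changes,
    then a single Z_SYNC_FLUSH after the whole batch so the peer can
    decode everything sent so far. Returns (wire_bytes, last_channel).
    """
    out = bytearray()
    for channel, stanza in batch:
        if last_channel is not None and channel != last_channel:
            out += comp.flush(zlib.Z_FULL_FLUSH)
        out += comp.compress(stanza)
        last_channel = channel
    out += comp.flush(zlib.Z_SYNC_FLUSH)
    return bytes(out), last_channel


def compress_each(comp, batch):
    """Baseline: full flush after every single stanza."""
    out = bytearray()
    for _channel, stanza in batch:
        out += comp.compress(stanza) + comp.flush(zlib.Z_FULL_FLUSH)
    return bytes(out)


alice = ("alice@example.net", "bob@example.com")
carol = ("carol@example.org", "bob@example.com")
batch = [
    (alice, b"<message from='alice@example.net'><body>hi</body></message>"),
    (alice, b"<message from='alice@example.net'><body>you there?</body></message>"),
    (alice, b"<message from='alice@example.net'><body>ping</body></message>"),
    (carol, b"<message from='carol@example.org'><body>hello</body></message>"),
    (carol, b"<message from='carol@example.org'><body>lunch?</body></message>"),
]

batched, last = compress_batch(zlib.compressobj(), batch)
per_stanza = compress_each(zlib.compressobj(), batch)
print(len(batched), len(per_stanza))
```

Stanzas that share a channel also share compressor history inside the batch, which is where most of the gain comes from; the flush markers saved are a smaller, second effect.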
-
flow
jonas’, I think we are talking about the same mechanism
-
jonas’
good
-
flow
I just want to point out that it also increases efficiency in other areas
-
jonas’
true
-
Ge0rG
what about having a zlib dictionary per JID?
-
jonas’
memory cost
-
jonas’
and I think both parties need to agree on the dictionary beforehand
-
jonas’
so you’d have to transfer that dictionary every time you switch?
-
jonas’
or if you had multiple compression streams, you’d have to have an out-of-band way to signal to the peer which one the next bytes belong to
-
Ge0rG
Yay.
-
dwd
You could build state-switching into the compression framing, of course, but yeah - memory cost would be scary-huge.
-
jonas’
regarding the use of compression and e2ee: zlib seems to be rather good at reversing the base64-bloat, so that’s at least something.
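
A quick way to check that claim, as a sketch with Python's standard library. It assumes random bytes are a fair stand-in for ciphertext; the 4096-byte payload size and the fixed seed are arbitrary choices made only so that runs are repeatable.

```python
import base64
import random
import zlib

rng = random.Random(0)  # fixed seed, only for repeatability
payload = bytes(rng.getrandbits(8) for _ in range(4096))  # stand-in for ciphertext

encoded = base64.b64encode(payload)      # what ends up inside the XML
compressed = zlib.compress(encoded, 9)   # what goes on the wire

print("raw bytes:       ", len(payload))
print("base64 bytes:    ", len(encoded))
print("zlib(base64):    ", len(compressed))
```

Base64 uses only 64 distinct symbols, so the Huffman stage of deflate can code each character in roughly 6 bits instead of 8, which undoes most of the 4/3 expansion. It cannot get below the size of the underlying random data.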
-
Ge0rG
We need a way to embed raw bytestreams into XML.
-
Ge0rG
Or just replace XML with... protobufs? ASN.1?
-
jonas’
using base92 would probably go a long way already
-
Zash
XER
-
jonas’
(or was it 96?)
-
jonas’
anything above that would give diminishing returns due to UTF-8 encoding anyways
-
Ge0rG
jonas’: base-91
-
Ge0rG
yeah, UTF-8 is not an efficient encoding.
-
dwd
Well. Not in terms of bits, anyway.
-
jonas’
meh
-
jonas’
base91 uses < and >
-
jonas’
and &
-
jonas’
while not-using -, \ and '
-
Ge0rG
Anybody still remember https://en.wikipedia.org/wiki/YEnc ?
-
jonas’
base85 seems to be the highest thing which is specified somewhere sane
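
For a rough sense of the sizes involved, here is a small sketch comparing base64 with the base85 variant in Python's standard library. Note the assumption: base64.b85encode uses one particular base85 alphabet, which is not necessarily the variant meant here, and that alphabet also contains <, > and &, so it would still need escaping inside XML.

```python
import base64

# 240 is divisible by both 3 and 4, so neither encoding needs padding.
raw = bytes(range(240))

b64 = base64.b64encode(raw)  # 4 characters per 3 bytes
b85 = base64.b85encode(raw)  # 5 characters per 4 bytes

# Characters that would need escaping in XML character data.
xml_unsafe = sum(1 for ch in b85 if ch in b"<>&")

print("base64:", len(b64), "chars")
print("base85:", len(b85), "chars,", xml_unsafe, "of them XML-unsafe")
```

The overhead drops from about 33% to 25%, which is a modest gain compared with embedding raw bytes.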
-
dwd
I do occasionally muse over whether a dedicated XMLStream compression could outperform EXI in practical ways, though. Easy to have binary blobs instead of base64, for example, and we could accrue symbols and store dictionaries of XML symbols between sessions and things. We could also ignore the problems of comments, PIs, etc. Possibly even ignore namespaced attributes, since we never (?) use them.
-
jonas’
don’t shut the door on namespaced attributes completely.
-
Ge0rG
XML is really a horrible encoding protocol for machines.
-
flow
what jonas’ said
-
Zash
It's fine, don't worry too much
-
dwd
jonas’, Well, it wouldn't matter if they were considered an outlier and not encoded very efficiently, at least.
-
jonas’
dwd, that’s true
-
dwd
Ge0rG, I quite like many of the properties of XML for our purposes. Certainly the alternatives would make a bunch of things much more painful - and I always have a nagging feeling that a construct like JSON imposes a data structure that is hard to break away from.
-
Zash
Do something like header compression in h2?
-
Ge0rG
dwd: JSON shares most of the disadvantages of XML
-
Zash
CBOR!
-
Ge0rG
I liked the MIDI format, where all numbers are dynamic-width.
-
dwd
Ge0rG, Or BER, where they can be?
-
jonas’
matroshka!
-
Ge0rG
dwd: I'd go with DER for lesser ambiguity
-
Zash
PER?
-
dwd
Ge0rG, CER?
-
Ge0rG
Also whoever made it possible to encode U+0000 as an arbitrarily long UTF-8 sequence deserves the highest punishment.
-
jonas’
tell me more
-
jonas’
can’t you encode all things as arbitrarily long UTF-8 sequences though?
-
dwd
jonas’, Only by ignoring the standard.
-
jonas’
but that’s not true for U+0000?
-
Zash
JSON Encoding Rules
-
Ge0rG
jonas’: I'm only bitching because U+0000 has special meaning in C.
-
Zash
Is a thing
-
Ge0rG
jonas’: https://en.wikipedia.org/wiki/UTF-8#Description - UTF-8 just stuffs the data bits after the header. A sane encoding would be to automatically add 0x80 to the bits in a two-byte encoded codepoint, because you can represent the first 0x80 values in one byte, etc.
-
jonas’
yeah
-
Ge0rG
it would also reduce the required number of bytes.
-
Link Mauve
“Possibly even ignore namespaced attributes, since we never (?) use them.”, we do, @xml:lang for instance.
-
Link Mauve
dwd, ↑
-
dwd
Ah, true. But known ones like that we'd handle differently anyway.
-
Link Mauve
“12:00:46 Ge0rG> Also whoever made it possible to encode U+0000 as an arbitrarily long UTF-8 sequence deserves the highest punishment.”, you’re expected to reject it though.
-
Link Mauve
Same as any other overly-long sequence.
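
As a small illustration of both points, the sketch below builds the over-long two-byte form by hand and feeds it to Python's strict UTF-8 decoder. The helper names overlong_two_byte and is_valid_utf8 are made up for this example.

```python
def overlong_two_byte(codepoint):
    """Build the invalid two-byte form 110xxxxx 10xxxxxx for a codepoint < 0x80.

    The data bits are simply stuffed after the header bits, which is why
    values that already fit in one byte can be written in two.
    """
    if not 0 <= codepoint < 0x80:
        raise ValueError("only codepoints below 0x80 are over-long in two bytes")
    return bytes([0xC0 | (codepoint >> 6), 0x80 | (codepoint & 0x3F)])


def is_valid_utf8(data):
    """Return True if the strict decoder accepts the byte string."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


nul_overlong = overlong_two_byte(0x00)
print(nul_overlong, is_valid_utf8(nul_overlong))
print(b"\x00", is_valid_utf8(b"\x00"))
```

A decoder that accepted the two-byte form would let a NUL (or a slash, or a quote) slip past any check that only looks for the one-byte form, which is why conforming decoders must treat it as an error.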
-
dwd
Oh. I found an actual bug in MUC.
-
MattJ
I'm all ears
-
Ge0rG
No way!
-
jonas’
Just one?
-
dwd
Well, sorta, anyway. When a client drops, it sends unavailable to the MUC automatically because Magic(tm) on the server.
-
dwd
But if the MUC switches nickname on join (210 code stuff), then the directed presence recorded on the server is wrong, and the user never leaves.
-
jonas’
yes
-
MattJ
Oh, that one
-
jonas’
that’s a known issue
-
jonas’
servers need to track nickname changes for that :)
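
What such tracking could look like, as a purely hypothetical sketch: the class and method names below are invented for illustration and do not correspond to any real server's API. It assumes the server remembers where each session sent directed presence and rewrites that record whenever the room reflects the user's own presence (status code 110), which covers both a nickname assigned on join (210) and later nickname changes.

```python
class DirectedPresenceTracker:
    """Hypothetical per-session record of directed presence targets."""

    def __init__(self):
        # bare room JID -> full occupant JID we believe we occupy
        self._targets = {}

    def sent_presence(self, to_jid):
        """Record directed presence sent by the client."""
        room = to_jid.split("/", 1)[0]
        self._targets[room] = to_jid

    def received_self_presence(self, from_jid, status_codes):
        """Update the record when the room reflects our own presence.

        Status 110 marks self-presence, so the from address is the
        occupant JID the room actually gave us.
        """
        room = from_jid.split("/", 1)[0]
        if room in self._targets and 110 in status_codes:
            self._targets[room] = from_jid

    def unavailable_targets(self):
        """JIDs to send unavailable presence to when the session drops."""
        return sorted(self._targets.values())


tracker = DirectedPresenceTracker()
tracker.sent_presence("coven@chat.example/thirdwitch")
# The room enforced a different nickname on join.
tracker.received_self_presence("coven@chat.example/witch3", {110, 210})
print(tracker.unavailable_targets())
```

Without the update step, the unavailable presence would go to the nickname the client asked for rather than the one the room assigned, and the room would have no occupant to remove.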
-
dwd
I'd seen it with nickname changes, but it didn't occur to me (for some reason) it'd happen with nickname enforcing.
-
Ge0rG
Why can't we just implement MUC proxies on the server.
-
Ge0rG
That really would solve 99% of MUC's problems, in a backward compatible manner
-
Ge0rG
Zash even wrote a POC already.
-
Ge0rG
It's got some minor drawbacks, like you can't ever leave a MUC.
-
fippo
ge0rg: i think one of the dmuc proposals took that approach
-
jonas’
which is fun, by the way, because it means that the user’s server needs to support MUC for it to work properly :-)
-
jonas’
which reminds me of MIX
-
jonas’
except that with MUC, this requirement is hidden and not spelt out and you can join a MUC without that requirement fulfilled and have it work to a certain extent and then run in weird edge cases :)
-
Ge0rG
jonas’: you mean the weird edge cases we cope with every day now?
-
Ge0rG
Like never leaving a MUC if you changed your nickname?
-
jonas’
yes
-
Ge0rG
The awesome thing about MUC Proxy would be that it's 100% transparent to the clients and can be rolled out in an instant as an upgrade to fix most of the issues.
-
Ge0rG
Also could include offline notifications and other nice things.
-
jonas’
mh
-
jonas’
it would be somewhat like biboumi but for xmpp
-
jonas’
and looking at the quirks which still are there with persistency and biboumi, I’m not sure it’s as easy as you make it out to be
-
Ge0rG
jonas’: the quirks are there because the biboumi developers violently refuse to accept what's good design and practice.
-
jonas’
hm, where?
-
Ge0rG
jonas’: like where they send you individual messages to all of your resources with Carbons disabled?
-
jonas’
what would be a better way?
-
flow
I don't see a problem with that either, but I believe it should be the responsibility of the receiving entity that the messages arrive on all devices (if it wishes so), not of the sending one
-
Ge0rG
flow: the problem is that if you go offline, your messages get rerouted to a different resource, which ends up with two, three or four copies
-
flow
Ge0rG, ahh, ok I see the issue now.
-
jonas’
Ge0rG, but on the other hand, relying on carbons would mean that resources which are not interested in those messages (read: not joined in any IRC) get them.
-
flow
but wait,
-
jonas’
there’s no good solution here
-
jonas’
and we’ll have the same issues with MUC proxies.
-
flow
you have to go offline while biboumi is sending, otherwise biboumi won't know of the resource
-
flow
Ge0rG, do you experience that a lot?
-
Ge0rG
flow: there used to be a long discussion on the biboumi tracker
-
flow
with many people reporting to hit that issue of duplicate messages?
-
Ge0rG
jonas’: that's the same problem as with MUCs you join from one client only and the PM Carbons.
-
Ge0rG
flow: yeah
-
Ge0rG
https://lab.louiz.org/louiz/biboumi/issues/3277
-
jonas’
Ge0rG, yes
-
Ge0rG
> Opened 1 year ago by Jonas Schäfer
-
jonas’
> Closed
-
Ge0rG
Also https://lab.louiz.org/louiz/biboumi/issues/3304
-
jonas’
also Closed
-
Ge0rG
jonas’: took some months to convince them.
-
jonas’
not for #3277
-
Ge0rG
jonas’: I can't find a way to search for comments by me, but I'm sure most of those would be bitching about how the developers don't understand XMPP.
-
jonas’
I wouldn’t accuse them of that.
-
jonas’
also, they’re still doing great work. I’m fine with the community ironing out the rough edges by filing issues.
-
Ge0rG
jonas’: oh, yes they are.
-
Ge0rG
biboumi is the best cross-protocol gateway I've ever seen.
-
jonas’
indeed.
-
Ge0rG
jonas’: the other thing being https://lab.louiz.org/louiz/biboumi/issues/3283
-
jonas’
Ge0rG, that might be fixed during the refactor mentioned in #3382
-
Ge0rG
jonas’: it's not about things being fixed, it's about how hard it is to convince the developers that they _need_ to be fixed.
-
jonas’
edge-cases all abound
-
jonas’
lots of edge-cases not only means lots of code to write, it also means lots of hard-to-reproduce stuff which will be tricky to nail down and prove.
-
jonas’
and we’ll have exactly the same issues with a MUC proxy
-
Ge0rG
I'm a certified MUC Corner Case Debugging Engineer.
-
Zash
If that's the case, where's your diploma?
-
Ge0rG
https://op-co.de/tmp/MUC-CCDE.jpg
-
Seve
Good job Ge0rG! You deserve it!
-
jonas’
well done
- Seve claps
-
jonas’
put it on your council application
- jonas’ wonders about the significance of that date
-
Ge0rG
jonas’: @horazont horazont merged commit b017284 into xsf:master on Mar 8
-
jonas’
ah, #stable_id
-
edhelas
Ge0rG don't fix too much MUC, we'll not have reasons to work on MIX anymore
-
Ge0rG
jonas’: good idea!
-
Ge0rG
edhelas: now you uncovered my evil secret plan!
-
edhelas
Make MUC Great Again
-
Zash
MUC was never great
-
pep.
Who can modify the xsf calendar? To add 35C3
-
pep.
I still have one last voucher btw, if people are interested. Grab it now or it will expire
-
edhelas
In 0060 the <configure/> tag is defined this way:
<xs:element name='configure'>
  <xs:complexType>
    <xs:choice minOccurs='0' xmlns:xdata='jabber:x:data'>
      <xs:element ref='xdata:x'/>
    </xs:choice>
  </xs:complexType>
</xs:element>
-
edhelas
However I see some <configure node='princely_musings'> in the examples
-
edhelas
Shouldn't we add <xs:attribute name='node' type='xs:string' use='required'/> ?
-
Ge0rG
Our wiki also has a horrible mobile experience. Pinging I-team
-
ralphm
edhelas: well, not required. If using collections, you also want to be able to configure the root node, which is basically leaving off the node attribute.
-
ralphm
Also, you're looking at the wrong namespace. Try pubsub#owner
-
ralphm
The one in the regular pubsub node goes together with <create/> where you already have the node reference.
-
ralphm
eh, pubsub namespace
-
edhelas
ralphm thanks for the precision
-
edhelas
my bad
-
ralphm
So example 137 vs 140
-
ralphm
no worries
-
ralphm
I still regret we used multiple namespaces
-
Zash
The verb being another level in is weird too