-
Ge0rG
If I'm ever going to write a server-side PARS [XEP-0379], I'll call it SPARSE: Server-side Pre-Authenticated Roster Subscription Emission.
-
jonasw
is the board meeting *still* going on? *glances at topic*
- Guus has a Summit 21 flashback: https://imgflip.com/i/1jqylr
-
jonasw
:D
-
Ge0rG
Guus: ha, so awesome!
-
Ge0rG
And so I added 'MUC subject' to https://wiki.xmpp.org/web/Usability/Glossary
-
Ge0rG
should section1 and section2 headings in XEPs be capitalized?
-
Ge0rG
actually any headings.
-
Flow
I have started working on ISR-SASL2. In case you want to have a look at the first alpha quality draft and provide early feedback: http://geekplace.eu/xeps/xep-isr-sasl2/xep-isr-sasl2.html
-
dwd
Flow, Just submit it.
-
jonasw
Flow: nitpick > It MUST not contain whitespace characters. that should probably be MUST NOT
-
dwd
Also I still dislike the term "Nonza".
-
nyco
dwd, "wtf-za" is better
-
nyco
nyco, or "stream first-born child", maybe... or not
-
dwd
nyco, "Element" is suitable in almost every case, and a term of art much more widely known.
-
nyco
dwd, nah too simple
-
Ge0rG
dwd: aren't elements the things you can stuff into your messages?
-
dwd
nyco, I just dislike inventing jargon for the sake of it.
-
jonasw
it would have to be stream-level elements to be precise, and that’s long
-
Ge0rG
let's abbreviate Stream-Level Element as SLE and call it a SLEnza.
-
dwd
jonasw, Only in cases where the distinction is needed.
-
dwd
jonasw, I mean, we use "session" quite happily in multiple senses, and we use "node" pretty much everywhere. I don't understand why we needed to have a made-up word for "non-routable stream-level element".
-
nyco
Stream Level Anti Stanza Hop (SLASH)
-
Ge0rG
dwd: because ambiguous protocol references are ambiguous.
-
dwd
Ge0rG, Sure. But what's ambiguous about, for example "followed by a <authenticate/> element", as opposed to using a made-up word?
-
Ge0rG
I don't like the specific word very much either, but at least it is a well-defined term for a well-defined thing in XMPP, which makes it much better than much of our other terminology.
-
Kev
I dislike nonza fairly intensely :)
-
Kev
For much the same reason, it's inventing a word that doesn't need inventing.
-
Ge0rG
If you read the intro to https://tools.ietf.org/html/rfc6120#section-8 then you'll realize that nonzas are stanzas, too.
-
jonasw
huh, I thought that stanzas are limited to the jabber:{client,server} namespaces?
-
jonasw
and also {iq,message,presence} possibly
-
dwd
jonasw, And XEP-0114. Two namespaces defined there.
-
Ge0rG
jonasw: "either party can send XML stanzas. Three kinds of XML stanza are defined for the 'jabber:client' and 'jabber:server' namespaces: <message/>, <presence/>, and <iq/>." - the spec doesn't claim this to be an exhaustive list of stanzas.
-
jonasw
well, okay, but that doesn’t mean that any stream-level element is a stanza, does it, Ge0rG?
-
dwd
Ge0rG, Right. And XEP-0360 simply says everything not a stanza is a nonza.
-
Ge0rG
oh, §4.1 is the important one: Definition of XML Stanza: An XML stanza is the basic unit of meaning in XMPP. A stanza is a first-level element (at depth=1 of the stream) whose element name is "message", "presence", or "iq" and whose qualifying namespace is 'jabber:client' or 'jabber:server'.
-
dwd
Ge0rG, Except that's not true.
-
Zash
What about jabber:component:something
-
Ge0rG
By contrast, a first-level element qualified by any other namespace is not an XML stanza (stream errors, stream features, TLS-related elements, SASL-related elements, etc.), nor is a <message/>, <presence/>, or <iq/> element that is qualified by the 'jabber:client' or 'jabber:server' namespace but that occurs at a depth other than one (e.g., a <message/> element contained within an extension element (Section 8.4) for reporting purposes), nor is a <message/>, <presence/>, or <iq/> element that is qualified by a namespace other than 'jabber:client' or 'jabber:server'. An XML stanza typically contains one or more child elements (with accompanying attributes, elements, and XML character data) as necessary in order to convey the desired information, which MAY be qualified by any XML namespace (see [XML-NAMES] as well as Section 8.4 in this specification).
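
Read literally, the RFC 6120 §4.1 definition quoted here can be sketched as a small predicate. This is only an illustration: `is_stanza` and its parameters are hypothetical names, and the sketch deliberately follows the quoted text, so it does not account for the XEP-0114 component namespaces raised by dwd and Zash.

```python
# Sketch of the stanza definition quoted from RFC 6120 section 4.1.
# is_stanza() and its parameters are hypothetical names for illustration.

STANZA_NAMES = {"message", "presence", "iq"}
STANZA_NAMESPACES = {"jabber:client", "jabber:server"}


def is_stanza(local_name: str, namespace: str, depth: int) -> bool:
    """Return True if a stream element matches the quoted definition."""
    return (
        depth == 1
        and local_name in STANZA_NAMES
        and namespace in STANZA_NAMESPACES
    )


print(is_stanza("message", "jabber:client", 1))       # True: a stanza
print(is_stanza("enable", "urn:xmpp:sm:3", 1))         # False: stream-level, but not a stanza
print(is_stanza("message", "jabber:client", 2))       # False: nested inside another element
print(is_stanza("iq", "jabber:component:accept", 1))  # False under the literal reading
```

The last call is the case the disagreement is about: under the literal wording, an `<iq/>` on a component stream would not count as a stanza.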
-
Zash
For XMPP 2.0, can we just use a single namespace?
-
Ge0rG
it goes on!
-
Ge0rG
dwd: what's not true?
-
dwd
Ge0rG, As Zash says, XEP-0114 carries stanzas in other namespaces.
-
intosi
dwd: no ends of fun because of it.
-
Ge0rG
dwd: 0114 is historical. It doesn't even count. And even if it did, it would be wrong, because 6120§4.1
-
Kev
I don't think jabber:(client|server|component) adds anything useful to the protocol that I can see, over a single one.
-
dwd
Ge0rG, A better definition would be that stanzas are stream-level XML elements, with local names "iq", "message", or "presence", within the content namespace of the stream, which may be routed without additional negotiation over other connections (and namespaces).
-
dwd
Ge0rG, But much of that is reversing the definition of a Nonza in XEP-0360.
-
dwd
Kev, No, I agree. It's a distinction that proves more annoying than useful.
-
Ge0rG
from 114: "Once authenticated, the component can send stanzas through the server and receive stanzas from the server. All stanzas sent to the server MUST possess a 'from' attribute and a 'to' attribute, as in the 'jabber:server' namespace." - I don't even see how this is violating the Stanza definition
-
Zash
The Default Namespace
-
intosi
Kev: +1
-
Ge0rG
dwd: having different terms for routable, standardized stream elements vs. unroutable negotiated ones is very useful. Please rephrase your critique in a way that still allows for this distinction, without writing out "non-routable top-level stream-elements" every time.
-
dwd
Ge0rG, I'll do so when you can answer the question about when such a precise distinction is warranted above.
-
dwd
Ge0rG, Even when we need to talk about stream-level elements other than stanzas, we can do so clearly (as I just have).
-
dwd
Ge0rG, That, incidentally, requires no reference or document to support it. Whereas even someone well-versed in the RFCs, who has read and implemented a number of XEPs, will not know what "Nonza" means without further reading.
-
Ge0rG
From 0198: "To enable use of stream management, the client sends an <enable/> command to the server." - it's using "command", which is ambiguous and makes me think of ad-hoc commands. There is merely a single mention that 0198 is using not-stanzas at the root level.
-
Kev
Sure, it should say 'element' instead of 'command'.
-
Kev
It doesn't need to say nonza.
-
Ge0rG
dwd: The benefit of "nonza" is actually that the word itself, being a portmanteau of not-stanza, is easy to remember and even to guess from context.
-
Ge0rG
0198 obviously pre-dates the term. My point is that our specs are ambiguous, and that the term helps reduce the ambiguity
-
Zash
It's kinda like abstracting some code into a function
-
Zash
Silly term tho
-
Ge0rG
Kev: it should say 'top-level element', or 'root-level element', so it won't be confused with a <message> or <iq> element.
-
Kev
If anyone reads "an <enable/> element" and thinks "I think it means a <message/> stanza", then I doubt their ability to understand any other part of the stack they need to implement.
-
dwd
Ge0rG, You're right. I'm convinced by all those people who implemented '198 as an ad-hoc command.
-
Ge0rG
well then, looks like we are done here.
-
Ge0rG
Now, what term do I need to search for if I want to get a list of all XEPs that define new non-routable root-level stream elements?
-
Ge0rG
Okay, enough time spent on bikeshedding today. Another XEPs PR is waiting to be completed.
-
Ge0rG
SamWhited: would you still like to have a revision block added to #413?
-
Zash
Howabout #413
-
Bunneh
Zash: XEP-280: Improve readability #413 https://github.com/xsf/xeps/pull/413
-
Ge0rG
Bunneh doesn't like me.
-
Zash
Wasn't enabled for this room
-
Ge0rG
Zash: That's what you say. I tell you: robot discrimination can go both ways.
-
Flow
What Ge0rg said.
-
Flow
Also you all had enough time to suggest a term for xep360. :-P It's not like I care about the exact name. But I agree with everything Ge0rg said. A definition for non-stanza top level stream elements was truly missing.
-
Ge0rG
Flow: sorry, you missed the discussion. But the XEP-0368 LC on standards@ could use some more bikeshedding.
-
Flow
Ge0rG: I disagree with everything said in that thread, including what I said myself there
-
Ge0rG
Flow: that'd make a great follow-up post.
-
Ge0rG
Also, 0368 needs a pre-Direct-TLS nonza to inform the other party of an imminent connection security upgrade.
-
Flow
jonasw: Thanks for the hint :)
-
Flow
Ge0rG: I do believe SASL2 probably could do that :)
-
Zash
No what we really should do is make SASL a TLS extension!!
-
Ge0rG
Zash: yay! We could use SNI to send our bare-JID and encode the password in ALPN!
-
Zash
YES!
-
Ge0rG
+also
-
Zash
And use session tickets instead of 198
-
jonasw
this doesn’t sound like a bad idea at all!
-
Ge0rG
What is the correct way to attribute XEP changes to external authors? the <initials> element of the <revision> element looks like it's not sufficient.
-
Ge0rG
Flow: wow, that was blazing fast. re #423
-
Bunneh
Flow: XEP-0379: Added "Usability Considerations", removed actual XMPP client, some text editing. #423 https://github.com/xsf/xeps/pull/423
-
Ge0rG
30 minutes between PR and merge. New record :)
-
jonasw
Ge0rG: are you working on 0280 currently?
-
Ge0rG
jonasw: yes
-
jonasw
s/elible/eligible/ in the revision history, if that’s legit
-
Ge0rG
jonasw: thanks, added
-
Ge0rG
that 0.10 block is a good example for what I asked above, btw.
-
Ge0rG
Flow, SamWhited: we need to update the "XMPP Extensions Editor" email template to link to https://xmpp.org/ instead of http://
-
Flow
I first want to know what is missing in xep-README that is preventing your updated version to appear at xmpp.org
-
Ge0rG
Flow: if you want to try another attempt: https://github.com/xsf/xeps/pull/413 is ready to merge now.
-
jonasw
Ge0rG: the inversion of SHOULD to MUST NOT should *probably* be mentioned in the changelog
-
jonasw
ah it is
-
jonasw
nevermind
-
SamWhited
Aww man, all the good rants get taken before I'm awake: Nonza always just seems like needless tribal knowledge to me too.
-
jonasw
I like nonzas.
-
Ge0rG
This is a clear case of majority-vote-needed. Just let us trump it down properly.
-
jonasw
speaking of which, is "yes we can make xmpp great again" a better slogan than simply "make xmpp great again"?
-
dwd
jonasw, Twice as presidential.
-
Ge0rG
so much bikeshedding. So little actual input.
-
dwd
"Make Britain Great Again" was the slogan under which Thatcher first stood as an MP.
-
dwd
(It was Churchill's reelection campaign, actually - he won).
-
Zash
I quite like "Make America Great Britain Again"
-
Ge0rG
Now that #413 is as-good-as-approved, I'm going to push forward with my threats from https://mail.jabber.org/pipermail/standards/2017-January/032048.html re MUC-PMs in 45 and 280.
-
Bunneh
Ge0rG: XEP-280: Improve readability #413 https://github.com/xsf/xeps/pull/413
-
Ge0rG
I'd really like to hear Kev and dwd on that.
-
dwd
It's a huge rewrite.
-
dwd
Which is probably a good thing, but it does mean going over it carefully.
-
Ge0rG
dwd: what is a huge rewrite?
-
dwd
#413 - seems to make a lot of changes to normative language.
-
Bunneh
dwd: XEP-280: Improve readability #413 https://github.com/xsf/xeps/pull/413
-
dwd
Bunneh, Thanks.
-
jonasw
Ge0rG: FWIW, what you wrote regarding MUC-PMs seems reasonable to me, but I haven’t looked in detail yet. also it seems really a server-implementor specific thing, so I cannot really give feedback :/
-
Ge0rG
dwd: I'd like to hear you on https://mail.jabber.org/pipermail/standards/2017-January/032048.html and not on the almost-approved PR, thanks
-
jonasw
is there any … information material which shows why matrix is uncool?
-
moparisthebest
well re-inventing the wheel is uncool
-
dwd
jonasw, They spent a huge amount of time and money telling the world that XMPP was terrible.
-
MattJ
moparisthebest, that's why XMPP should have just been IRC :)
-
dwd
jonasw, They've stopped that now, mostly. But they still have lots of time and money to spend on publicity. We really need to catch up.
-
goffi
the protocol itself and the work done are interesting; my main gripe with them is their attitude towards the XMPP community (it is still visible in the F.A.Q.), and also I'm naturally suspicious of corporate stuff.
-
dwd
goffi, Right, it's still a single company, with a lot of money, pushing their proprietary solution. It's more open than, say, WhatsApp, but not much.
-
goffi
and if we have to compare, I think its extensibility is a good thing, as opposed to what they say (that monolithic is better)
-
moparisthebest
MattJ, well IRC didn't federate and had other problems
-
goffi
dwd: I don't say it's not open, I'm just suspicious and I'm more confident with the XMPP workflow (which is not perfect either), where I know I can have my say
-
moparisthebest
matrix is just xmpp re-written in a less extensible manner with json and webservices meh
-
goffi
it's impressive to see the number of clients/libraries available they already have after 2 years and something
-
goffi
-available
-
dwd
moparisthebest, No, there are some fundamental differences. Mostly that domains are not autonomous.
-
Tobias
moparisthebest, and a different data model, not? doesn't it use a tree structure to distribute messages instead of just routing XML pieces like XMPP?
-
dwd
goffi, Yes. Mindshare is important, as is being able to devote some cash to good examples.
-
Tobias
dwd, domains not being autonomous? what do you mean by that?
-
jonasw
oh great. people complaining about XMPP because it loses messages on mobile. you ask what client they use, they say xabber.
-
dwd
jonasw, Yeah. Matrix don't have this problem because the pool is very small, so far.
-
dwd
Tobias, Chatrooms don't live on one domain exclusively, so aren't under single control.
-
Tobias
ahh...so they have federated MUC built in
-
goffi
dwd: that can be a good or bad thing
-
dwd
Tobias, Sort of. FMUC and friends operate on the principle that there is a source of truth, or else that a sort of semi-independence can be achieved.
-
goffi
single control may be needed in some case (enforcing policy for instance), avoiding single point of failure is nice for popular public room
-
dwd
Tobias, Matrix operates on the notion that there is no single source of truth,
-
goffi
how does message deletion/modification work on Matrix?
-
dwd
goffi, Well, XMPP has supported clustered services for years, so there's no *single* point of failure, but the entirety of control resides within a single autonomous domain.
-
dwd
goffi, I have no clue.
-
intosi
git commit --allow-empty -m 'Hello world!'
-
intosi
git push
-
dwd
goffi, I also have never quite worked out how your server knows to stop mirroring the entire chatroom if ever you leave it.
-
intosi
dwd: despite asking them, right?
-
dwd
intosi, I can't decide if that's an ironic statement or a wrong window.
-
Ge0rG
dwd: I'd really like to get your 2¢ on the MUC-PM thing.
-
daniel
goffi: that's probably something they will add an extension - aehm I mean module - to their monolithic spec for later on
-
dwd
daniel, Right, they don't do extensions. They do fork-lift upgrades.
-
goffi
couldn't Matrix be used as a distributed database for XMPP ? Data replication is interesting for directory.
-
dwd
intosi, And yes, I did ask them, once. I can't even recall the answer, though.
-
goffi
or put in other words, would it be possible (and yes it would) to implement a similar thing in XMPP?
-
dwd
goffi, Oh, sure.
-
SamWhited
Ge0rG: ++ ; thanks for the first "Usability Considerations" section; looking forward to reading that (and Flow ++ for getting it merged so quickly!)
-
Ge0rG
SamWhited: half of it was in the XEP already, part under "Business", part under "Security"
-
Ge0rG
SamWhited: Flow performed the magic, but https://xmpp.org/extensions/xep-0379.html wasn't immediately updated. He wondered why.
-
Ge0rG
SamWhited: also you can merge #413 now, we've provided the <revision> block (plus some revision typos)
-
Bunneh
SamWhited: XEP-280: Improve readability #413 https://github.com/xsf/xeps/pull/413
-
Ge0rG
SamWhited: and I'd like to move on with my 0280 rewrite ;)
-
SamWhited
Ge0rG: Thanks, let me go merge that now before I get too deep into the weeds with my day job.
-
Ge0rG
SamWhited: yay!
-
SamWhited
When you say, "wasn't immediately updated", what actually happened?
-
Ge0rG
SamWhited: I have no idea beyond "12:49:16 Flow> I first want to know what is missing in xep-README that is preventing your updated version to appear at xmpp.org"
-
Tobias
i think it requires action from the XEP editor to rerender
-
Ge0rG
SamWhited: btw, I've also encouraged @penguineer to make another PR to xeps/README, describing the contribution process
-
SamWhited
yah, I'm not sure what that means; is the README not up to date? Maybe I just forgot to update it last time I tweaked things.
-
SamWhited
I wouldn't add anything people need to be able to discover easily to xep-README; it's just a bunch of technical details for the editors so that we don't forget how to do things
-
Flow
I found out why I failed, everything is fine now
-
Ge0rG
SamWhited: not sure about your version of the README, but the one I see on github is a bunch of links and two hints about the makefile, nothing about the editorial process
-
SamWhited
Oh, that readme, sorry, different thing.
-
Ge0rG
I have no idea what the editorial process looks like or whether it should be public.
-
SamWhited
Ge0rG: It's public, it's just probably not something we want to point people to; it's not really written to be consumable or easy to follow.
-
SamWhited
Yah, I'm sure the GitHub README could be improved; PRs welcome :)
-
Ge0rG
SamWhited: TBH, I don't care too much about how it looks, as long as onboarding new editors works sufficiently well for them.
-
SamWhited
Maybe we should just merge the two and just have the markdown readme *be* the editor readme.
-
SamWhited
Although I don't think it will generate a nice table of contents for you, so maybe that would be harder.
-
Ge0rG
SamWhited: I think the README.md should be aimed at contributors first, not at editors.
-
SamWhited
Ge0rG: Yah, you're right
-
Ge0rG
SamWhited: it's okay to cover editorial tasks further down, but it's not a prio for me
-
intosi
Editors have xep-README.*, right?
-
Ge0rG
SamWhited: and the email template needs to be https'ed.
-
SamWhited
intosi: Yup: https://xmpp.org/extensions/xep-README.html
-
Ge0rG
wow, that file is impossible to discover :D
-
SamWhited
I mean, it's not supposed to be hidden, it's just not listed or linked anywhere really
-
Ge0rG
which is not a problem probably.
-
SamWhited
and is full of confusing details and incorrect information that I haven't updated yet :)
-
SamWhited
Yah, not a problem as long as contribution details don't go in there (but that was just me getting my READMEs confused)
-
Ge0rG
All of this meta talk reminds me that I still need an XML schema for 0379. And I'd love to get that contributed by somebody more familiar with schemas
- SamWhited is pretty sure it's the editors job to help with that…
- SamWhited runs away and hides.
-
Kev
Editor's job to write schemas for people? No, I don't think so.
-
Kev
Something the authors need to do before Draft.
-
SamWhited
Oh? In that case, nevermind, I'm happy :) thought I read somewhere that if you needed help with the schema the editor was supposed to provide it.
-
SamWhited
XEP-0143:
-
SamWhited
> The XMPP Extensions Editor team can assist you in defining an XML Schema for the protocol you are proposing
-
Kev
Maybe I misremember, it happens sometimes.
- SamWhited hides again.
-
Kev
Oh, help, yes. But not to write the thing.
-
intosi
Or point you to the nearest person who hasn't lost sanity yet?
-
SamWhited
Kev: I did say "editors job to help with that", not "to write that" :)
-
Kev
You did.
-
Ge0rG
I've lost my sanity a long time ago. Help me please.
-
Kev
I got the wrong end of the piece of rope's back.
-
intosi
The frayed end?
-
Ge0rG
https://xmpp.org/extensions/xep-README.html#updating references an announce.py. Is that part of https://github.com/xsf/xmpp.org ?
-
Ge0rG
Kev: I'd also like to hear your opinion on https://mail.jabber.org/pipermail/standards/2017-January/032048.html before I start writing PRs.
-
SamWhited
Ge0rG: https://github.com/xsf/xeps/blob/master/announce.py
-
Ge0rG
SamWhited: thanks
-
Ge0rG
SamWhited: #424 :D
-
Bunneh
SamWhited: gen-scripts: Encrypt all URLs ;-) #424 https://github.com/xsf/xeps/pull/424
-
SamWhited
Ge0rG: LGTM, thanks.
-
Tobias
although it's not encrypting URLs
-
Ge0rG
Tobias: it's not?
-
SamWhited
No, the URLs are still in plain text :)
-
Ge0rG
damn, that's a security vulnerability. Let me pull a CVE ID fast.
-
Ge0rG
SamWhited: what did you do to change the sha1 of my gen-script commit?
-
SamWhited
Ge0rG: Used GitHub to merge it instead of doing it myself (which is always a mistake)
-
Zash
I go for one short walk in the sun and I get back to a bazillion messages?
-
Ge0rG
SamWhited: it wasn't even a merge, it rather looks like a rebase. Generally I like rebase more, but it could have been a fast-forward
-
SamWhited
With GitHub you get your choice of: Add a worthless merge commit with some useless default message, change the hash and get a useless merge commit with a default message, or just change the hash.
-
intosi
We need more options.
-
SamWhited
Ge0rG: Yah, I agree, this is why I normally don't use GitHub's web interface
-
Ge0rG
SamWhited: I don't mind it at all, I just wondered. Thanks.
-
Zash
Re Matrix: If they do what I think they should, based on what I've heard (because their docs are terrible), it's all basically MAM queries all the time.
-
SamWhited
Ge0rG: Good eye though; I'm impressed you noticed :)
-
Ge0rG
SamWhited: Switched to branch 'master' Your branch and 'xsf/master' have diverged, and have 1 and 2 different commits each, respectively.
-
Ge0rG
SamWhited: (actually, I saw it in gitk, but this is just a minor thing)
-
Ge0rG
My other project's git history looks like a map of the London Underground. I appreciate linear histories.
-
Ge0rG
(which now reminds me of that one Linux commit that is octomerging 60 different branches)
-
Zash
Should have stuck with Mercurial
-
Zash
It doesn't even allow more than 2 parents :)
-
Ge0rG
Zash: because a series of 60 merge commits is much cleaner than one octomerge? (usually, projects that do either are seriously broken)
-
Zash
Or if it does, I have no idea how that would work with the internal data structures I've looked at.
-
SamWhited
Yah, I feel like if you think you need to merge 60 things, you have other problems and choosing a different VCS isn't going to help.
-
Ge0rG
choosing a different VCS was one of the main culprits of the NTPsec fork.
-
Zash
SamWhited: Having a system that allows it does sorta encourage it tho
-
Ge0rG
Zash: I think that hg was only invented to troll git users.
-
Zash
Ge0rG: You got it wrong, it was to troll Python 3 users
-
SamWhited
Zash: yah, I agree, octomerge is dumb
-
Ge0rG
Zash: I'm not part of that audience, but maybe it was meant to troll both.
-
Zash
Probably to troll everyone but SVN users
-
Ge0rG
Zash: I have heavily used svn before git, and hg still makes me stumble every time.
-
SamWhited
I heavily used SVN, and then HG, and was absolutely an HG fanboy for a while just because it was my first DVCS. Then I realized that we should have learned from our mistakes, and that literally everything was easier and just worked better in Git and that a few minor foibles about the interface not being very consistent should not be enough to stop me from using it, so I learned it and haven't looked back since.
-
Zash
Maybe it's just the order you learn things in
-
Zash
I used svn first, then git, then hg
-
Zash
I like hg the most
-
Ge0rG
Zash: it's the same order for me, and I really can't stand hg. It's trolling me right into my face: hg: unknown command 'fetch' 'fetch' is provided by the following extension: use "hg help extensions" for information on enabling extensions
-
Flow
MattJ: I assume you saw https://github.com/xsf/xeps/pull/420 ?
-
Zash
Ge0rG: That's your brain on git
-
Ge0rG
Zash: no, that's a program telling me: "I know what you want, but I won't let you do it. Instead, you have to read a dozen pages of my useless manual first"
-
SamWhited
I understand the difference and don't really care between fetch/merge and update/pull, but that specific example aside I do agree that the extension thing always pisses me off.
-
Zash
Ge0rG: I could say the same thing about all the times I've done git pull and wondered why the heck it did a merge
-
Ge0rG
What about a friendly "hg fetch is provided by the 'fetch' extension. Activate? (Y/n)"
-
SamWhited
Oh, I just don't like the idea that it has extensions at all (I mean, in a sense Git does too, but it's not actually a thing you're supposed to make generic extensions against, it's just how commands work internally)
-
SamWhited
Not to say that people can't write their own tools to manipulate stuff, I just don't like that it's built right in and it will actually try to get you to use them.
-
Zash
Ge0rG: Here, it yells that fetch is deprecated
-
Ge0rG
Is there any other Elder whom I can summon to be enlightened about the interaction of MUCs, PMs and Carbons?
-
Zash
Elders predate Carbons
-
Ge0rG
Zash: I seriously hope that the Elders predate any of the XMPP protocols.
-
jonasw
Zash: you replied to Ge0rG's thread that at least two implementations are already tagging outgoing MUC PMs with <x/>. can you tell me which?
-
Zash
jcbrand: Prosody and ejabberd ✎
-
jonasw
you’re not good at tabcompletion today
-
Zash
jonasw: Prosody and ejabberd ✏
-
Zash
I'm not
-
Zash
I blame Kev. Everything is Kevs fault!
-
jonasw
Ge0rG was talking about clients, not servers, I think.
-
jonasw
Zash wrote: > a) Require carbon-enabled clients to tag outgoing MUC-PMs with <x/>, > carbon-copy the 'sent' MUC-PM to all clients, require carbon-enabled > clients to check for <x/> tag and to drop if they are not joined. This > is a 90% solution (it will still display outgoing PMs if you are > joined to the same MUC under different nicknames, as the other client > doesn't know which nickname the 'sent' message came from). I believe at least two implementations do this already.
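
The client-side check in option a) could look roughly like the sketch below. It is an assumption-laden illustration, not what any implementation is known to do: the function name `should_display_sent_carbon` is hypothetical, and the `<x/>` tag is assumed to be qualified by the `http://jabber.org/protocol/muc#user` namespace. The input is the inner message of a 'sent' carbon.

```python
import xml.etree.ElementTree as ET

# Assumption: the <x/> tag on MUC-PMs uses the muc#user namespace.
MUC_USER_NS = "http://jabber.org/protocol/muc#user"


def should_display_sent_carbon(message_xml: str, joined_rooms: set) -> bool:
    """Decide whether a carbon-copied 'sent' message should be shown.

    Untagged messages are always shown. Tagged MUC-PMs are shown only
    if this client has joined the room the PM was sent through.
    """
    msg = ET.fromstring(message_xml)
    tagged = msg.find("{%s}x" % MUC_USER_NS) is not None
    if not tagged:
        return True
    # The PM is addressed to room@service/nick; compare the bare room JID.
    room = msg.get("to", "").split("/", 1)[0]
    return room in joined_rooms


pm = (
    "<message xmlns='jabber:client' type='chat' "
    "to='room@muc.example.org/alice'>"
    "<body>hi</body>"
    "<x xmlns='http://jabber.org/protocol/muc#user'/>"
    "</message>"
)
print(should_display_sent_carbon(pm, {"room@muc.example.org"}))   # True
print(should_display_sent_carbon(pm, {"other@muc.example.org"}))  # False
```

As the quoted text notes, this only compares rooms, so a client joined to the same MUC under a different nickname would still display the PM.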
-
Zash
jonasw: I believe I was talking about servers
-
Ge0rG
Zash: you believe? :P
-
jonasw
ah, oddly quoted then
-
Zash
Ge0rG: bee-hive
-
Tobias
Zash, you mean bhyve? http://bhyve.org/ :P
-
moparisthebest
I wrote a kontalk JID hash to phone number lookup service if anyone is interested https://www.moparisthebest.com/phonehash/
-
moparisthebest
more of a fun learning exercise than anything, but you can look up any 1 of 100 billion phone numbers with it in ~2 seconds
-
jonasw
"kaputt" as we say in germany :)
-
Zash
Phone numbers in what format?
-
moparisthebest
Zash, currently supports 0-000-000-0000 to 9-999-999-9999, which kontalk hashes like '+00000000000'
-
Zash
So basically +%011d then
-
moparisthebest
so I guess it doesn't support 2 digit country codes or strange formats? it'd be easy to generate files with those then
-
moparisthebest
yes all 11 digit phone numbers
-
Zash
Hm, 2TB of storage required for a rainbow table, or is my math wrong?
-
moparisthebest
Zash, I couldn't find really good resources for rainbow tables, so I don't know :)
-
moparisthebest
this only takes 500gb of storage though because I'm not storing any hashes, just the numbers
-
Zash
What
-
Zash
10¹¹ * 24B
-
jonasw
(also you could probably get away with storing only a unique prefix or part of the hash, reducing the storage needed drastically. after all, a phone number has only 36 bits of entropy)
-
moparisthebest
I explain it all here: https://github.com/moparisthebest/phonehash
-
Zash
20 byte sha1 output + 4 byte number
-
moparisthebest
99,999,999,999 won't quite fit in 4 bytes, you need 5, right?
-
jonasw
moparisthebest: nice hack!
-
jonasw
congrats on that idea :)
-
Zash
How many bytes is 64 bits again?
-
jonasw
8, Zash
-
moparisthebest
yea
-
Zash
Well then
-
moparisthebest
actually iirc that number fits in like 38 bits instead of the 40 bits I'm using
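
The bit counts being recalled here are easy to check. A quick sketch (the variable names are just for illustration) suggests the largest 11-digit number needs 37 bits, which still rounds up to 5 whole bytes:

```python
# Largest 11-digit phone number in the range discussed above.
largest = 99_999_999_999

bits = largest.bit_length()        # minimum number of bits needed
bytes_needed = (bits + 7) // 8     # rounded up to whole bytes

print(bits)              # 37
print(bytes_needed)      # 5
print(largest < 2**32)   # False: 4 bytes are not enough
```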
-
Zash
Probably possible to use truncated sha1 hashes, don't think the full output is required to avoid collisions
-
moparisthebest
but unaligned bytes sounded TERRIBLE
-
Zash
Storage is cheap
-
Zash
Maybe I should have done some calculations before attempting to generate this rainbow table in memory
-
moparisthebest
yea but I'm hosting this on my server that only had 800gb of free space hehe
-
moparisthebest
if someone really cared they could get a huge SSD and it'd be faster
-
Zash
Build a big B tree or something
-
moparisthebest
but 90 hours for generation on 2 slow spinners in linux software raid1, and then ~2 seconds per lookup is fine for me
-
moparisthebest
yea I sorted the numbers in the file by hash, but only store the numbers, so I could do a binary search
-
moparisthebest
which for 100 billion numbers is max 26 lookups/sha1 hashes
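
A scaled-down sketch of the lookup described here: only the numbers are stored (5 bytes each), ordered by their SHA-1, and each probe of the binary search recomputes the hash. The helper names and the exact `+%011d` hashing input are assumptions taken from this conversation, not from the phonehash code. For 10^11 entries a plain binary search like this needs at most about 37 probes, since log2(10^11) is roughly 36.5.

```python
import hashlib

RECORD_SIZE = 5  # bytes per stored number


def phone_hash(n: int) -> bytes:
    """SHA-1 of the number formatted as '+' plus 11 digits (assumed format)."""
    return hashlib.sha1(("+%011d" % n).encode("ascii")).digest()


def build_table(count: int) -> bytes:
    """Store only the numbers, sorted by their hash (stands in for the big file)."""
    ordered = sorted(range(count), key=phone_hash)
    return b"".join(n.to_bytes(RECORD_SIZE, "big") for n in ordered)


def lookup(table: bytes, target: bytes):
    """Binary search; returns (number or None, number of probes)."""
    lo, hi = 0, len(table) // RECORD_SIZE
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        record = table[mid * RECORD_SIZE:(mid + 1) * RECORD_SIZE]
        n = int.from_bytes(record, "big")
        h = phone_hash(n)  # recomputed, because hashes are not stored
        probes += 1
        if h == target:
            return n, probes
        if h < target:
            lo = mid + 1
        else:
            hi = mid
    return None, probes


table = build_table(1000)
number, probes = lookup(table, phone_hash(123))
print(number, probes <= 10)  # 123 True
```

On disk, each probe becomes one seek plus one hash, which fits the observation that the disk rather than SHA-1 dominates the lookup time.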
-
moparisthebest
which my machine can apparently do in <2 seconds, probably mostly constrained by disk seek speeds
-
jonasw
yeah, 26 times sha1 should be *very* cheap
-
jonasw
microseconds cheap
-
jonasw
it’s the disk :)
-
moparisthebest
during generation I wrote to 65535 files and it KILLED my disk
-
moparisthebest
I had to put in synchronization code so only one file was written to at any given time
-
moparisthebest
so an SSD with no seek time, I bet generation would go from 90 hours to 20 or less
-
moparisthebest
anyone want to send me a >500gb SSD to find out? :P
-
jonasw
no.
-
moparisthebest
or run it yourself :P
-
jonasw
ENOSPC
-
Zash
Rent some CLOUD
-
moparisthebest
I don't have an SSD with that much free space
-
moparisthebest
or money to burn on a toy idea like this lol
-
moparisthebest
Zash, so how were you calculating how big a rainbow table needed to be?
-
moparisthebest
I couldn't really find good info
-
moparisthebest
the strings being hashed are like +00000000000
-
moparisthebest
so a +, then 11 digits
-
Zash
moparisthebest: But if they are numbers, you can just encode them as digits
-
Zash
err
-
Zash
as computers do
-
moparisthebest
yea but can rainbow tables?
-
Zash
integers
-
moparisthebest
I mean presumably a custom implementation can do whatever it wants
-
Zash
Giant hash table?
-
moparisthebest
but yea all the tools I found only let you specify 'character set' and 'length', which meant it was doing length of 12
-
moparisthebest
and also trying numbers like '00000+000000'
-
moparisthebest
and I had it do that and using 100% cpu for 4 days only had generated a 4gb file so far
-
moparisthebest
so I stopped it hehe
-
Zash
Hu, I ran into the 1GB memory limit of LuaJIT in a few seconds :/
-
moparisthebest
this was the first time java using a 32-bit signed integer as array indices actually affected me
-
Zash
t [ sha1( sprintf("+%011d", i) ) ] = i for i in 0 → 10^11
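
Zash's one-liner translates almost directly into Python. The sketch below uses a much smaller limit than 10^11 so that it actually fits in memory; `key_for` and `build_table` are hypothetical names.

```python
import hashlib


def key_for(i: int) -> bytes:
    """SHA-1 digest of the number formatted as '+%011d'."""
    return hashlib.sha1(("+%011d" % i).encode("ascii")).digest()


def build_table(limit: int) -> dict:
    """In-memory map from hash to number, as in the one-liner above."""
    return {key_for(i): i for i in range(limit)}


table = build_table(100_000)
print(table.get(key_for(12345)))    # 12345
print(table.get(key_for(100_000)))  # None: outside the generated range
```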
-
moparisthebest
yea that's a *lot* of memory
-
jonasw
ha, I know why I have swap turned off by default :-)
-
moparisthebest
because just storing the integers as 5 bytes each is 500gb
-
moparisthebest
storing them with 20 byte sha1 hashes is 2.5tb
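
The storage figures in this exchange follow from simple multiplication. A small check, using decimal gigabytes and terabytes and illustrative variable names:

```python
COUNT = 10**11       # all 11-digit numbers
NUMBER_BYTES = 5     # one number stored as a 5-byte integer
SHA1_BYTES = 20      # one full SHA-1 digest

numbers_only = COUNT * NUMBER_BYTES                 # numbers, no hashes
with_hashes = COUNT * (NUMBER_BYTES + SHA1_BYTES)   # numbers plus hashes
zash_estimate = COUNT * (4 + SHA1_BYTES)            # Zash's earlier 24 B per entry

print(numbers_only / 10**9, "GB")    # 500.0 GB
print(with_hashes / 10**12, "TB")    # 2.5 TB
print(zash_estimate / 10**12, "TB")  # 2.4 TB
```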
-
moparisthebest
if you stored them as the 12 character strings they are actually hashed with, that's a TON of space
-
moparisthebest
depending on character encoding and such of course hehe
-
jonasw
I’m still amazed by your binary search hack
-
Zash
... binary search what
-
Zash
If you sort them into 256 buckets based on the first byte of sha1, then sort each of those buckets into buckets based on the second byte of sha1 output, and so on, you get a tree thing...
-
moparisthebest
Zash: 256 buckets is a bit big, I sorted them into 65535 buckets based on the first 2 bytes of the sha1 hash, then sorted those, concatenated all of them into one big now sorted file
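
The generation step described here can be sketched in memory as follows. Two bytes of prefix give up to 65,536 buckets; in this illustration they are Python lists rather than files, and the function names and the `+%011d` hash input are assumptions based on the conversation.

```python
import hashlib
from collections import defaultdict


def phone_hash(n: int) -> bytes:
    """SHA-1 of the number formatted as '+' plus 11 digits (assumed format)."""
    return hashlib.sha1(("+%011d" % n).encode("ascii")).digest()


def bucket_sort_by_hash(numbers):
    """Bucket by the first two hash bytes, sort each bucket, then concatenate."""
    buckets = defaultdict(list)
    for n in numbers:
        buckets[phone_hash(n)[:2]].append(n)

    ordered = []
    for prefix in sorted(buckets):  # buckets in prefix order
        ordered.extend(sorted(buckets[prefix], key=phone_hash))
    return ordered


ordered = bucket_sort_by_hash(range(2000))
print(ordered == sorted(range(2000), key=phone_hash))  # True
```

Because every hash in one bucket shares its two-byte prefix, sorting the buckets independently and concatenating them in prefix order yields the same result as one global sort, while each individual sort stays small.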
-
SamWhited
Isn't that just the definition of a prefix tree?
-
moparisthebest
And then just do a binary search on it
-
Zash
SamWhited: Being self-taught, I rarely know the names of things.
-
moparisthebest
Idk I forgot most of these terms from school :-)
-
SamWhited
ah, no, this is the phone numbers… sort of a weird mix of prefix tree and binary search.
-
SamWhited
or a trie or whatever
-
moparisthebest
It's like a bucket sort, what I did
-
moparisthebest
That's where I got the idea anyway
-
Zash
I've seen a physical bucket sort. It was cool.
-
Zash
Post sorter machine
-
Zash
Mail sorter machine
-
moparisthebest
there I added a tl;dr to the readme
-
moparisthebest
tl;dr I put all 11 digit phone numbers represented as 5 byte integers in a 500gb file sorted by their sha1 hashes, now I can binary search it fast.
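A rough in-memory sketch of the approach summarized in the tl;dr: bucket the numbers by the first two bytes of their sha1, sort each bucket by full hash, concatenate the 5-byte records, then binary search by rehashing the record at each probe. All names here (`sha1_of`, `build_sorted_blob`, `lookup`) are hypothetical and this is not the code from the linked repository; a real run would work on files rather than a `bytes` object.

```python
import hashlib

RECORD = 5  # bytes per stored number

def sha1_of(n):
    """sha1 of '+' followed by the 11-digit zero-padded number."""
    return hashlib.sha1(b"+%011d" % n).digest()

def build_sorted_blob(limit):
    """Bucket by the first 2 hash bytes, sort each bucket, concatenate."""
    buckets = {}
    for n in range(limit):
        buckets.setdefault(sha1_of(n)[:2], []).append(n)
    out = bytearray()
    for prefix in sorted(buckets):
        for n in sorted(buckets[prefix], key=sha1_of):
            out += n.to_bytes(RECORD, "big")
    return bytes(out)

def lookup(blob, digest):
    """Binary search over fixed-width records, ordered by their sha1."""
    lo, hi = 0, len(blob) // RECORD
    while lo < hi:
        mid = (lo + hi) // 2
        n = int.from_bytes(blob[mid * RECORD:(mid + 1) * RECORD], "big")
        h = sha1_of(n)
        if h == digest:
            return n
        if h < digest:
            lo = mid + 1
        else:
            hi = mid
    return None

blob = build_sorted_blob(5000)          # small range for demonstration
print(lookup(blob, sha1_of(1234)))      # 1234
```

Only the numbers are stored; the hashes are recomputed during the search, which is what keeps the file at 5 bytes per entry.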
-
moparisthebest
https://github.com/moparisthebest/phonehash if I didn't link it already
-
Zash
There are some fun ways to store sets of integers
-
Zash
Like, delta compression
-
Zash
Or a giant bitfield
-
jonasw
not sure if any of these work if you have essentially shuffled integers
-
moparisthebest
yea compressing a random set of integers is impossible of course, but these are sequential, but the order is probably essentially random?
-
moparisthebest
it'd be interesting to look into though
-
Zash
If the numbers are divided into blocks of bit fields, then you do a linear search through the bit field but binary search on the blocks...
-
moparisthebest
I have no idea what you just said, why don't you try to implement it then let me look at your code... :)
-
Zash
I have no idea how large a bit field would need to be
-
Zash
Going to implement food instead
-
intosi
Zash: enjoy the debugging!
-
xnyhps
Instead of the binary search, wouldn't it be faster to jump to index (hash / 2^160) * size and search up or down from there depending on the difference between the hashes? The hashes should be quite uniform.
-
moparisthebest
not entirely positive what you mean xnyhps , sounds interesting to try though, care to explain more?
-
jonasw
moparisthebest: you have the data sorted by hashes, so looking at the first 32bit or something to guess the index in your array is a pretty reasonable thing to do
-
moparisthebest
ah yea, interesting
-
xnyhps
If you were looking for the hash 10000000000000..., you can assume it's close to 1/16th in the list of phone numbers because it's 1/16th of the possible values for the hash.
-
xnyhps
You'd start at 8000000..., then 4000000..., etc.
-
moparisthebest
it's not clear to me whether that would always be 37 comparisons or less though?
-
moparisthebest
I guess it'd depend on exactly how evenly distributed the hashes were?
-
xnyhps
Yeah.
-
jonasw
sha1 should be pretty uniform
-
Zash
So you basically treat it as a hash table?
-
moparisthebest
it is for some definition of "pretty" :)
-
xnyhps
You could also do a binary search, but with a weighted "middle" value.
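A binary search with a weighted "middle" is usually called interpolation search. The sketch below is one possible reading of the idea, assuming the hashes are close to uniformly distributed; the function names are hypothetical, and it rehashes the two boundary entries on every step, which a tuned version would avoid.

```python
import hashlib

def sha1_int(n):
    """sha1 of '+' plus 11 zero-padded digits, as a 160-bit integer."""
    return int.from_bytes(hashlib.sha1(b"+%011d" % n).digest(), "big")

def interpolation_lookup(numbers, target):
    """numbers must be sorted by sha1_int. Returns (number or None, probes)."""
    lo, hi = 0, len(numbers) - 1
    probes = 0
    while lo <= hi:
        h_lo = sha1_int(numbers[lo])
        h_hi = sha1_int(numbers[hi])
        if target < h_lo or target > h_hi:
            return None, probes
        if h_hi == h_lo:
            mid = lo
        else:
            # Guess the position from where target falls between the bounds.
            mid = lo + (target - h_lo) * (hi - lo) // (h_hi - h_lo)
        probes += 1
        h_mid = sha1_int(numbers[mid])
        if h_mid == target:
            return numbers[mid], probes
        if h_mid < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, probes

numbers = sorted(range(5000), key=sha1_int)
found, probes = interpolation_lookup(numbers, sha1_int(4321))
print(found, probes)
```

How many probes this needs in practice depends on how evenly the hashes are spread, so the sketch prints the count rather than assuming one.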
-
moparisthebest
so like when I sorted it into 65536 different files based on the first 2 bytes, if it was *perfectly* distributed the files should have all been like 7.3mb, but they ranged between ~7 and 7.8 or so
-
Zash
moparisthebest: 10^11 is a pretty small sample size tho :)
-
moparisthebest
I should have taken exact byte counts at the time
-
Zash
You could have just kept the files as is
-
moparisthebest
yea could have, more math than just jumping to a place in one file though
-
Zash
Wouldn't it be exactly the same math, just a smaller file?
-
moparisthebest
like I already had the code written to do a binary search in one file and didn't want to bother doing anything else hehe
-
Zash
and you get told the right file to open from the input
-
moparisthebest
and it was the same amount of disk writing: read file into memory, sort file, then writing back to the same file or appending to one file is the same
-
moparisthebest
ah yea that's true, yea that would have been faster
-
moparisthebest
well I'm not redoing it haha
-
Zash
And then jumped to some point based on the 3rd byte of the input and done a linear search from there
-
moparisthebest
it would be faster but it's not faster by a huge margin, log2(100000000000/65536) is about 21 sha1+comparison worst case vs 37 for the whole 100 billion
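Worst-case probe counts for a binary search follow from base-2 logarithms. A quick check, assuming 2^16 buckets of equal size (the constant names are made up for the example):

```python
import math

TOTAL = 10**11      # all 11-digit numbers
BUCKETS = 2**16     # one bucket per 2-byte hash prefix

whole_file = math.ceil(math.log2(TOTAL))
one_bucket = math.ceil(math.log2(TOTAL / BUCKETS))

print(whole_file, one_bucket)  # 37 21
```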
-
moparisthebest
still that would have been better :)
-
jonasw
moparisthebest: it will be quite a bit faster
-
Zash
Don't underestimate the performance of linear searches. CPUs, kernels and the disk can be much smarter
-
jonasw
sequential access is good for spinning disks
-
jonasw
probably you will have only a single access because all of the numbers fit in the same block
-
jonasw
maybe two disk accesses
-
Zash
binary search mucks up all the caches and whatnot
-
jonasw
I guess the disk latency is the most expensive thing here, and that will vanish to O(1) instead of O(log n)
-
moparisthebest
yea it'd be interesting to see how much it improves
-
moparisthebest
it's already <2 seconds though, if I reworked it like that what would it drop to, 1?
-
moparisthebest
hmm
-
jonasw
100ms or something maybe, from a naive calculation
-
moparisthebest
well also I can see how a forward sequential read would be faster
-
moparisthebest
but jumping to a place might put me ahead of it too which would cause a reverse sequential read
-
moparisthebest
which would equally negate caching
-
jonasw
hm, maybe
-
jonasw
this then really depends on the block alignment
-
jonasw
the disk cache is what benefits your application the most, I think, and that is controlled by the block alignment. unless you hit a block boundary you should be fine. and that can happen both ways.
-
jonasw
on a more on-topic note, has there been discussion about introducing {urn:xmpp:mix:0}feature elements in disco query responses? I do not like that idea.
-
Zash
Wha?
-
jonasw
Zash: e.g. example 40 in xep 369
-
Zash
-xep 369 ex 40
-
Bunneh
Zash: http://xmpp.org/extensions/xep-0369.html#example-40
-
jonasw
woah, dem features
-
jonasw
(pun not intended)
-
Zash
Hnnng
-
jonasw
I cannot interpret that
-
Zash
<grunt-of-disapproval/>
-
Zash
No caps?
-
jonasw
not sure if mix channels are supposed to support caps :)
-
jonasw
but the interaction with caps would for sure be interesting
-
moparisthebest
so based on whether a client supports mix or not, could a server allow them into a mix channel if supported or throw them into some type of muc compatibility layer for the mix jid if not?
-
moparisthebest
sounds super hacky and terrible from a server POV but nice and cozy from a client/user POV :)
-
moparisthebest
ie client A supports mix, client B supports muc, both try to join room@example.org, end up joining the same room but B is using their server's muc->mix layer?
-
moparisthebest
since mix requires server support anyway, at least it wouldn't *require* client support this way
-
moparisthebest
plus I'm sure that sounds like loads of fun to implement for Zash
-
moparisthebest
hey it'd work the other way around too, servers could treat remote MUCs as MIXs for their clients that supported MIX ? :)
-
jonasw
good night everyone ☺
-
moparisthebest
alright SamWhited it's up https://github.com/xsf/xeps/pull/426