XSF Discussion - 2017-02-16


  1. Ge0rG

    If I'm ever going to write a server-side PARS [XEP-0379], I'll call it SPARSE: Server-side Pre-Authenticated Roster Subscription Emission.

  2. jonasw

    is the board meeting *still* going on? *glances at topic*

  3. Guus has a Summit 21 flashback: https://imgflip.com/i/1jqylr

  4. jonasw

    :D

  5. Ge0rG

    Guus: ha, so awesome!

  6. Ge0rG

    And so I added 'MUC subject' to https://wiki.xmpp.org/web/Usability/Glossary

  7. Ge0rG

    should section1 and section2 headings in XEPs be capitalized?

  8. Ge0rG

    actually any headings.

  9. Flow

    I have started working on ISR-SASL2. In case you want to have a look at the first alpha quality draft and provide early feedback: http://geekplace.eu/xeps/xep-isr-sasl2/xep-isr-sasl2.html

  10. dwd

    Flow, Just submit it.

  11. jonasw

    Flow: nitpick > It MUST not contain whitespace characters. that should probably be MUST NOT

  12. dwd

    Also I still dislike the term "Nonza".

  13. nyco

    dwd, "wtf-za" is better

  14. nyco

    nyco, or "stream first-born child", maybe... or not

  15. dwd

    nyco, "Element" is suitable in almost every case, and a term of art much more widely known.

  16. nyco

    dwd, nah too simple

  17. Ge0rG

    dwd: aren't elements the things you can stuff into your messages?

  18. dwd

    nyco, I just dislike inventing jargon for the sake of it.

  19. jonasw

    it would have to be stream-level elements to be precise, and that’s long

  20. Ge0rG

    let's abbreviate Stream-Level Element as SLE and call it a SLEnza.

  21. dwd

    jonasw, Only in cases where the distinction is needed.

  22. dwd

    jonasw, I mean, we use "session" quite happily in multiple senses, and we use "node" pretty much everywhere. I don't understand why we needed to have a made-up word for "non-routable stream-level element".

  23. nyco

    Stream Level Anti Stanza Hop (SLASH)

  24. Ge0rG

    dwd: because ambiguous protocol references are ambiguous.

  25. dwd

    Ge0rG, Sure. But what's ambiguous about, for example "followed by a <authenticate/> element", as opposed to using a made-up word?

  26. Ge0rG

    I don't like the specific word very much either, but at least it is a well-defined term for a well-defined thing in XMPP, which makes it much better than much of our other terminology.

  27. Kev

    I dislike nonza fairly intensely :)

  28. Kev

    For much the same reason, it's inventing a word that doesn't need inventing.

  29. Ge0rG

    If you read the intro to https://tools.ietf.org/html/rfc6120#section-8 then you'll realize that nonzas are stanzas, too.

  30. jonasw

    huh, I thought that stanzas are limited to the jabber:{client,server} namespaces?

  31. jonasw

    and also {iq,message,presence} possibly

  32. dwd

    jonasw, And XEP-0114. Two namespaces defined there.

  33. Ge0rG

    jonasw: "either party can send XML stanzas. Three kinds of XML stanza are defined for the 'jabber:client' and 'jabber:server' namespaces: <message/>, <presence/>, and <iq/>." - the spec doesn't claim this to be an exhaustive list of stanzas.

  34. jonasw

    well, okay, but that doesn’t mean that any stream-level element is a stanza, does it, Ge0rG?

  35. dwd

    Ge0rG, Right. And XEP-0360 simply says everything not a stanza is a nonza.

  36. Ge0rG

    oh, §4.1 is the important one: Definition of XML Stanza: An XML stanza is the basic unit of meaning in XMPP. A stanza is a first-level element (at depth=1 of the stream) whose element name is "message", "presence", or "iq" and whose qualifying namespace is 'jabber:client' or 'jabber:server'.

  37. dwd

    Ge0rG, Except that's not true.

  38. Zash

    What about jabber:component:something

  39. Ge0rG

    By contrast, a first-level element qualified by any other namespace is not an XML stanza (stream errors, stream features, TLS-related elements, SASL-related elements, etc.), nor is a <message/>, <presence/>, or <iq/> element that is qualified by the 'jabber:client' or 'jabber:server' namespace but that occurs at a depth other than one (e.g., a <message/> element contained within an extension element (Section 8.4) for reporting purposes), nor is a <message/>, <presence/>, or <iq/> element that is qualified by a namespace other than 'jabber:client' or 'jabber:server'. An XML stanza typically contains one or more child elements (with accompanying attributes, elements, and XML character data) as necessary in order to convey the desired information, which MAY be qualified by any XML namespace (see [XML-NAMES] as well as Section 8.4 in this specification).
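
The definition quoted from RFC 6120 §4.1 can be sketched as a small check. This is only an illustration of the quoted text, not code from any XMPP library; the function name `is_stanza` and the `depth` parameter are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Names and namespaces taken from the RFC 6120 §4.1 text quoted above.
STANZA_NAMES = {"message", "presence", "iq"}
STANZA_NAMESPACES = {"jabber:client", "jabber:server"}


def is_stanza(element, depth=1):
    """Return True if `element` matches the quoted stanza definition.

    `depth` is the element's depth in the stream (1 = direct child of
    <stream:stream/>); the caller is assumed to track it.
    """
    if depth != 1:
        return False
    # ElementTree spells qualified names as "{namespace}localname".
    if not element.tag.startswith("{"):
        return False
    namespace, _, name = element.tag[1:].partition("}")
    return namespace in STANZA_NAMESPACES and name in STANZA_NAMES


message = ET.fromstring("<message xmlns='jabber:client' to='a@example.org'/>")
enable = ET.fromstring("<enable xmlns='urn:xmpp:sm:3'/>")
component = ET.fromstring("<message xmlns='jabber:component:accept'/>")

print(is_stanza(message))           # True
print(is_stanza(enable))            # False: other namespace (a "nonza")
print(is_stanza(component))         # False under this strict reading
print(is_stanza(message, depth=2))  # False: not a first-level element
```

The third case is the one dwd and Zash raise: read literally, the definition excludes the XEP-0114 component namespaces.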

  40. Zash

    For XMPP 2.0, can we just use a single namespace?

  41. Ge0rG

    it goes on!

  42. Ge0rG

    dwd: what's not true?

  43. dwd

    Ge0rG, As Zash says, XEP-0114 carries stanzas in other namespaces.

  44. intosi

    dwd: no ends of fun because of it.

  45. Ge0rG

    dwd: 0114 is historical. It doesn't even count. And even if it did, it would be wrong, because 6120§4.1

  46. Kev

    I don't think jabber:(client|server|component) adds anything useful to the protocol that I can see, over a single one.

  47. dwd

    Ge0rG, A better definition would be that stanzas are stream-level XML elements, with local names "iq", "message", or "presence", within the content namespace of the stream, which may be routed without additional negotiation over other connections (and namespaces).

  48. dwd

    Ge0rG, But much of that is reversing the definition of a Nonza in XEP-0360.

  49. dwd

    Kev, No, I agree. It's a distinction that proves more annoying than useful.

  50. Ge0rG

    from 114: "Once authenticated, the component can send stanzas through the server and receive stanzas from the server. All stanzas sent to the server MUST possess a 'from' attribute and a 'to' attribute, as in the 'jabber:server' namespace." - I don't even see how this is violating the Stanza definition

  51. Zash

    The Default Namespace

  52. intosi

    Kev: +1

  53. Ge0rG

    dwd: having different terms for routable, standardized stream elements vs. unroutable negotiated ones is very useful. Please rephrase your critique in a way that still allows for this distinction, without writing out "non-routable top-level stream-elements" every time.

  54. dwd

    Ge0rG, I'll do so when you can answer the question about when such a precise distinction is warranted above.

  55. dwd

    Ge0rG, Even when we need to talk about stream-level elements other than stanzas, we can do so clearly (as I just have).

  56. dwd

    Ge0rG, That, incidentally, requires no reference or document to support it. Whereas even someone well-versed in the RFCs, who has read and implemented a number of XEPs, will not know what "Nonza" means without further reading.

  57. Ge0rG

    From 0198: "To enable use of stream management, the client sends an <enable/> command to the server." - it's using "command", which is ambiguous and makes me think of ad-hoc commands. There is merely a single mention that 0198 is using not-stanzas at the root level.

  58. Kev

    Sure, it should say 'element' instead of 'command'.

  59. Kev

    It doesn't need to say nonza.

  60. Ge0rG

    dwd: The benefit of "nonza" is actually that the word itself, being a portmanteau of not-stanza, is easy to remember and even to guess from context.

  61. Ge0rG

    0198 obviously pre-dates the term. My point is that our specs are ambiguous, and that the term helps reduce the ambiguity

  62. Zash

    It's kinda like abstracting some code into a function

  63. Zash

    Silly term tho

  64. Ge0rG

    Kev: it should say 'top-level element', or 'root-level element', so it won't be confused with a <message> or <iq> element.

  65. Kev

    If anyone reads "an <enable/> element" and thinks "I think it means a <message/> stanza", then I doubt their ability to understand any other part of the stack they need to implement.

  66. dwd

    Ge0rG, You're right. I'm convinced by all those people who implemented '198 as an ad-hoc command.

  67. Ge0rG

    well then, looks like we are done here.

  68. Ge0rG

    Now, what term do I need to search for if I want to get a list of all XEPs that define new non-routable root-level stream elements?

  69. Ge0rG

    Okay, enough time spent on bikeshedding today. Another XEPs PR is waiting to be completed.

  70. Ge0rG

    SamWhited: would you still like to have a revision block added to #413?

  71. Zash

    Howabout #413

  72. Bunneh

    Zash: XEP-280: Improve readability #413 https://github.com/xsf/xeps/pull/413

  73. Ge0rG

    Bunneh doesn't like me.

  74. Zash

    Wasn't enabled for this room

  75. Ge0rG

    Zash: That's what you say. I tell you: robot discrimination can go both ways.

  76. Flow

    What Ge0rg said.

  77. Flow

    Also you all had enough time to suggest a term for xep360. :-P It's not like I care about the exact name. But I agree with everything Ge0rg said. A definition for non-stanza top level stream elements was truly missing.

  78. Ge0rG

    Flow: sorry, you missed the discussion. But the XEP-0368 LC on standards@ could use some more bikeshedding.

  79. Flow

    Ge0rG: I disagree with everything said in that thread, including what I said myself there

  80. Ge0rG

    Flow: that'd make a great follow-up post.

  81. Ge0rG

    Also, 0368 needs a pre-Direct-TLS nonza to inform the other party of an imminent connection security upgrade.

  82. Flow

    jonasw: Thanks for the hint :)

  83. Flow

    Ge0rG: I do believe SASL2 probably could do that :)

  84. Zash

    No what we really should do is make SASL a TLS extension!!

  85. Ge0rG

    Zash: yay! We could use SNI to send our bare-JID and encode the password in ALPN!

  86. Zash

    YES!

  87. Ge0rG

    +also

  88. Zash

    And use session tickets instead of 198

  89. jonasw

    this doesn’t sound like a bad idea at all!

  90. Ge0rG

    What is the correct way to attribute XEP changes to external authors? the <initials> element of the <revision> element looks like it's not sufficient.

  91. Ge0rG

    Flow: wow, that was blazing fast. re #423

  92. Bunneh

    Flow: XEP-0379: Added "Usability Considerations", removed actual XMPP client, some text editing. #423 https://github.com/xsf/xeps/pull/423

  93. Ge0rG

    30 minutes between PR and merge. New record :)

  94. jonasw

    Ge0rG: are you working on 0280 currently?

  95. Ge0rG

    jonasw: yes

  96. jonasw

    s/elible/eligible/ in the revision history, if that’s legit

  97. Ge0rG

    jonasw: thanks, added

  98. Ge0rG

    that 0.10 block is a good example for what I asked above, btw.

  99. Ge0rG

    Flow, SamWhited: we need to update the "XMPP Extensions Editor" email template to link to https://xmpp.org/ instead of http://

  100. Flow

    I first want to know what is missing in xep-README that is preventing your updated version to appear at xmpp.org

  101. Ge0rG

    Flow: if you want to try another attempt: https://github.com/xsf/xeps/pull/413 is ready to merge now.

  102. jonasw

    Ge0rG: the inversion of SHOULD to MUST NOT should *probably* be mentioned in the changelog

  103. jonasw

    ah it is

  104. jonasw

    nevermind

  105. SamWhited

    Aww man, all the good rants get taken before I'm awake: Nonza always just seems like needless tribal knowledge to me too.

  106. jonasw

    I like nonzas.

  107. Ge0rG

    This is a clear case of majority-vote-needed. Just let us trump it down properly.

  108. jonasw

    speaking of which, is "yes we can make xmpp great again" a better slogan than simply "make xmpp great again"?

  109. dwd

    jonasw, Twice as presidential.

  110. Ge0rG

    so much bikeshedding. So little actual input.

  111. dwd

    "Make Britain Great Again" was the slogan under which Thatcher first stood as an MP.

  112. dwd

    (It was Churchill's reelection campaign, actually - he won).

  113. Zash

    I quite like "Make America Great Britain Again"

  114. Ge0rG

    Now that #413 is as-good-as-approved, I'm going to push forward with my threats from https://mail.jabber.org/pipermail/standards/2017-January/032048.html re MUC-PMs in 45 and 280.

  115. Bunneh

    Ge0rG: XEP-280: Improve readability #413 https://github.com/xsf/xeps/pull/413

  116. Ge0rG

    I'd really like to hear Kev and dwd on that.

  117. dwd

    It's a huge rewrite.

  118. dwd

    Which is probably a good thing, but it does mean going over it carefully.

  119. Ge0rG

    dwd: what is a huge rewrite?

  120. dwd

    #413 - seems to make a lot of changes to normative language.

  121. Bunneh

    dwd: XEP-280: Improve readability #413 https://github.com/xsf/xeps/pull/413

  122. dwd

    Bunneh, Thanks.

  123. jonasw

    Ge0rG: FWIW, what you wrote regarding MUC-PMs seems reasonable to me, but I haven’t looked in detail yet. also it seems really a server-implementor specific thing, so I cannot really give feedback :/

  124. Ge0rG

    dwd: I'd like to hear you on https://mail.jabber.org/pipermail/standards/2017-January/032048.html and not on the almost-approved PR, thanks

  125. jonasw

    is there any … information material which shows why matrix is uncool?

  126. moparisthebest

    well re-inventing the wheel is uncool

  127. dwd

    jonasw, They spent a huge amount of time and money telling the world that XMPP was terrible.

  128. MattJ

    moparisthebest, that's why XMPP should have just been IRC :)

  129. dwd

    jonasw, They've stopped that now, mostly. But they still have lots of time and money to spend on publicity. We really need to catch up.

  130. goffi

    the protocol itself and the work done are interesting, my main gripe with them is their attitude towards the XMPP community (it is still visible in the F.A.Q.), and also I'm naturally suspicious of corporate stuff.

  131. dwd

    goffi, Right, it's still a single company, with a lot of money, pushing their proprietary solution. It's more open than, say, WhatsApp, but not much.

  132. goffi

    and if we have to compare, I think its extensibility is a good thing, in opposition to what they say (that monolithic is better)

  133. moparisthebest

    MattJ, well IRC didn't federate and had other problems

  134. goffi

    dwd: I don't say it's not open, I'm just suspicious and I'm more confident with the XMPP workflow (which is not perfect either), where I know I can have my say

  135. moparisthebest

    matrix is just xmpp re-written in a less extensible manner with json and webservices meh

  136. goffi

    it's impressive to see the number of clients/libraries available they already have after 2 years and something

  137. goffi

    -available

  138. dwd

    moparisthebest, No, there are some fundamental differences. Mostly that domains are not autonomous.

  139. Tobias

    moparisthebest, and a different data model, not? doesn't it use a tree structure to distribute messages instead of just routing XML pieces like XMPP?

  140. dwd

    goffi, Yes. Mindshare is important, as is being able to devote some cash to good examples.

  141. Tobias

    dwd, domains not being autonomous? what do you mean by that?

  142. jonasw

    oh great. people complaining about XMPP because it loses messages on mobile. you ask what client they use, they say xabber.

  143. dwd

    jonasw, Yeah. Matrix don't have this problem because the pool is very small, so far.

  144. dwd

    Tobias, Chatrooms don't live on one domain exclusively, so aren't under single control.

  145. Tobias

    ahh...so they have federated MUC built in

  146. goffi

    dwd: that can be a good or bad thing

  147. dwd

    Tobias, Sort of. FMUC and friends operate on the principle that there is a source of truth, or else that a sort of semi-independence can be achieved.

  148. goffi

    single control may be needed in some case (enforcing policy for instance), avoiding single point of failure is nice for popular public room

  149. dwd

    Tobias, Matrix operates on the notion that there is no single source of truth,

  150. goffi

    how does message deletion/modification work on Matrix?

  151. dwd

    goffi, Well, XMPP has supported clustered services for years, so there's no *single* point of failure, but the entirety of control resides within a single autonomous domain.

  152. dwd

    goffi, I have no clue.

  153. intosi

    git commit --allow-empty -m 'Hello world!'

  154. intosi

    git push

  155. dwd

    goffi, I also have never quite worked out how your server knows to stop mirroring the entire chatroom if ever you leave it.

  156. intosi

    dwd: despite asking them, right?

  157. dwd

    intosi, I can't decide if that's an ironic statement or a wrong window.

  158. Ge0rG

    dwd: I'd really like to get your 2¢ on the MUC-PM thing.

  159. daniel

    goffi: that's probably something they will add an extension - aehm I mean module - to their monolithic spec for later on

  160. dwd

    daniel, Right, they don't do extensions. They do fork-lift upgrades.

  161. goffi

    couldn't Matrix be used as a distributed database for XMPP ? Data replication is interesting for directory.

  162. dwd

    intosi, And yes, I did ask them, once. I can't even recall the answer, though.

  163. goffi

    or put in other words, would it be possible (and yes it would) to implement a similar thing in XMPP?

  164. dwd

    goffi, Oh, sure.

  165. SamWhited

    Ge0rG: ++ ; thanks for the first "Usability Considerations" section; looking forward to reading that (and Flow ++ for getting it merged so quickly!)

  166. Ge0rG

    SamWhited: half of it was in the XEP already, part under "Business", part under "Security"

  167. Ge0rG

    SamWhited: Flow performed the magic, but https://xmpp.org/extensions/xep-0379.html wasn't immediately updated. He wondered why.

  168. Ge0rG

    SamWhited: also you can merge #413 now, we've provided the <revision> block (plus some revision typos)

  169. Bunneh

    SamWhited: XEP-280: Improve readability #413 https://github.com/xsf/xeps/pull/413

  170. Ge0rG

    SamWhited: and I'd like to move on with my 0280 rewrite ;)

  171. SamWhited

    Ge0rG: Thanks, let me go merge that now before I get too deep into the weeds with my day job.

  172. Ge0rG

    SamWhited: yay!

  173. SamWhited

    When you say, "wasn't immediately updated", what actually happened?

  174. Ge0rG

    SamWhited: I have no idea beyond "12:49:16 Flow> I first want to know what is missing in xep-README that is preventing your updated version to appear at xmpp.org"

  175. Tobias

    i think it requires action from the XEP editor to rerender

  176. Ge0rG

    SamWhited: btw, I've also encouraged @penguineer to make another PR to xeps/README, describing the contribution process

  177. SamWhited

    yah, I'm not sure what that means; is the README not up to date? Maybe I just forgot to update it last time I tweaked things.

  178. SamWhited

    I wouldn't add anything people need to be able to discover easily to xep-README; it's just a bunch of technical details for the editors so that we don't forget how to do things

  179. Flow

    I found out why I failed, everything is fine now

  180. Ge0rG

    SamWhited: not sure about your version of the README, but the one I see on github is a bunch of links and two hints about the makefile, nothing about the editorial process

  181. SamWhited

    Oh, that readme, sorry, different thing.

  182. Ge0rG

    I have no idea what the editorial process looks like or whether it should be public.

  183. SamWhited

    Ge0rG: It's public, it's just probably not something we want to point people to; it's not really written to be consumable or easy to follow.

  184. SamWhited

    Yah, I'm sure the GitHub README could be improved; PRs welcome :)

  185. Ge0rG

    SamWhited: TBH, I don't care too much about how it looks, as long as onboarding new editors works sufficiently well for them.

  186. SamWhited

    Maybe we should just merge the two and just have the markdown readme *be* the editor readme.

  187. SamWhited

    Although I don't think it will generate a nice table of contents for you, so maybe that would be harder.

  188. Ge0rG

    SamWhited: I think the README.md should be aimed at contributors first, not at editors.

  189. SamWhited

    Ge0rG: Yah, you're right

  190. Ge0rG

    SamWhited: it's okay to cover editorial tasks further down, but it's not a prio for me

  191. intosi

    Editors have xep-README.*, right?

  192. Ge0rG

    SamWhited: and the email template needs to be https'ed.

  193. SamWhited

    intosi: Yup: https://xmpp.org/extensions/xep-README.html

  194. Ge0rG

    wow, that file is impossible to discover :D

  195. SamWhited

    I mean, it's not supposed to be hidden, it's just not listed or linked anywhere really

  196. Ge0rG

    which is not a problem probably.

  197. SamWhited

    and is full of confusing details and incorrect information that I haven't updated yet :)

  198. SamWhited

    Yah, not a problem as long as contribution details don't go in there (but that was just me getting my READMEs confused)

  199. Ge0rG

    All of this meta talk reminds me that I still need an XML schema for 0379. And I'd love to get that contributed by somebody more familiar with schemas

  200. SamWhited is pretty sure it's the editors job to help with that…

  201. SamWhited runs away and hides.

  202. Kev

    Editor's job to write schemas for people? No, I don't think so.

  203. Kev

    Something the authors need to do before Draft.

  204. SamWhited

    Oh? In that case, nevermind, I'm happy :) thought I read somewhere that if you needed help with the schema the editor was supposed to provide it.

  205. SamWhited

    XEP-0143:

  206. SamWhited

    > The XMPP Extensions Editor team can assist you in defining an XML Schema for the protocol you are proposing

  207. Kev

    Maybe I misremember, it happens sometimes.

  208. SamWhited hides again.

  209. Kev

    Oh, help, yes. But not to write the thing.

  210. intosi

    Or point you to the nearest person who hasn't lost sanity yet?

  211. SamWhited

    Kev: I did say "editors job to help with that", not "to write that" :)

  212. Kev

    You did.

  213. Ge0rG

    I've lost my sanity a long time ago. Help me please.

  214. Kev

    I got the wrong end of the piece of rope's back.

  215. intosi

    The frayed end?

  216. Ge0rG

    https://xmpp.org/extensions/xep-README.html#updating references an announce.py. Is that part of https://github.com/xsf/xmpp.org ?

  217. Ge0rG

    Kev: I'd also like to hear your opinion on https://mail.jabber.org/pipermail/standards/2017-January/032048.html before I start writing PRs.

  218. SamWhited

    Ge0rG: https://github.com/xsf/xeps/blob/master/announce.py

  219. Ge0rG

    SamWhited: thanks

  220. Ge0rG

    SamWhited: #424 :D

  221. Bunneh

    SamWhited: gen-scripts: Encrypt all URLs ;-) #424 https://github.com/xsf/xeps/pull/424

  222. SamWhited

    Ge0rG: LGTM, thanks.

  223. Tobias

    although it's not encrypting URLs

  224. Ge0rG

    Tobias: it's not?

  225. SamWhited

    No, the URLs are still in plain text :)

  226. Ge0rG

    damn, that's a security vulnerability. Let me pull a CVE ID fast.

  227. Ge0rG

    SamWhited: what did you do to change the sha1 of my gen-script commit?

  228. SamWhited

    Ge0rG: Used GitHub to merge it instead of doing it myself (which is always a mistake)

  229. Zash

    I go for one short walk in the sun and I get back to a bazillion messages?

  230. Ge0rG

    SamWhited: it wasn't even a merge, it rather looks like a rebase. Generally I like rebase more, but it could have been a fast-forward

  231. SamWhited

    With GitHub you get your choice of: Add a worthless merge commit with some useless default message, change the hash and get a useless merge commit with a default message, or just change the hash.

  232. intosi

    We need more options.

  233. SamWhited

    Ge0rG: Yah, I agree, this is why I normally don't use GitHub's web interface

  234. Ge0rG

    SamWhited: I don't mind it at all, I just wondered. Thanks.

  235. Zash

    Re Matrix: If they do what I think they should, based on what I've heard (because their docs are terrible), it's all basically MAM queries all the time.

  236. SamWhited

    Ge0rG: Good eye though; I'm impressed you noticed :)

  237. Ge0rG

    SamWhited: Switched to branch 'master' Your branch and 'xsf/master' have diverged, and have 1 and 2 different commits each, respectively.

  238. Ge0rG

    SamWhited: (actually, I saw it in gitk, but this is just a minor thing)

  239. Ge0rG

    My other project's git history looks like a map of the London Underground. I appreciate linear histories.

  240. Ge0rG

    (which now reminds me of that one Linux commit that is octomerging 60 different branches)

  241. Zash

    Should have stuck with Mercurial

  242. Zash

    It doesn't even allow more than 2 parents :)

  243. Ge0rG

    Zash: because a series of 60 merge commits is much cleaner than one octomerge? (usually, projects that do either are seriously broken)

  244. Zash

    Or if it does, I have no idea how that would work with the internal data structures I've looked at.

  245. SamWhited

    Yah, I feel like if you think you need to merge 60 things, you have other problems and choosing a different VCS isn't going to help.

  246. Ge0rG

    choosing a different VCS was one of the main culprits of the NTPsec fork.

  247. Zash

    SamWhited: Having a system that allows it does sorta encourage it tho

  248. Ge0rG

    Zash: I think that hg was only invented to troll git users.

  249. Zash

    Ge0rG: You got it wrong, it was to troll Python 3 users

  250. SamWhited

    Zash: yah, I agree, octomerge is dumb

  251. Ge0rG

    Zash: I'm not part of that audience, but maybe it was meant to troll both.

  252. Zash

    Probably to troll everyone but SVN users

  253. Ge0rG

    Zash: I have heavily used svn before git, and hg still makes me stumble every time.

  254. SamWhited

    I heavily used SVN, and then HG, and was absolutely an HG fanboy for a while just because it was my first DVCS. Then I realized that we should have learned from our mistakes, and that literally everything was easier and just worked better in Git and that a few minor foibles about the interface not being very consistent should not be enough to stop me from using it, so I learned it and haven't looked back since.

  255. Zash

    Maybe it's just the order you learn things in

  256. Zash

    I used svn first, then git, then hg

  257. Zash

    I like hg the most

  258. Ge0rG

    Zash: it's the same order for me, and I really can't stand hg. It's trolling me right into my face: hg: unknown command 'fetch' 'fetch' is provided by the following extension: use "hg help extensions" for information on enabling extensions

  259. Flow

    MattJ: I assume you saw https://github.com/xsf/xeps/pull/420 ?

  260. Zash

    Ge0rG: That's your brain on git

  261. Ge0rG

    Zash: no, that's a program telling me: "I know what you want, but I won't let you do it. Instead, you have to read a dozen pages of my useless manual first"

  262. SamWhited

    I understand the difference and don't really care between fetch/merge and update/pull, but that specific example aside I do agree that the extension thing always pisses me off.

  263. Zash

    Ge0rG: I could say the same thing about all the times I've done git pull and wondered why the heck it did a merge

  264. Ge0rG

    What about a friendly "hg fetch is provided by the 'fetch' extension. Activate? (Y/n)"

  265. SamWhited

    Oh, I just don't like the idea that it has extensions at all (I mean, in a sense Git does too, but it's not actually a thing you're supposed to make generic extensions against, it's just how commands work internally)

  266. SamWhited

    Not to say that people can't write their own tools to manipulate stuff, I just don't like that it's built right in and it will actually try to get you to use them.

  267. Zash

    Ge0rG: Here, it yells that fetch is deprecated

  268. Ge0rG

    Is there any other Elder whom I can summon to be enlightened about the interaction of MUCs, PMs and Carbons?

  269. Zash

    Elders predate Carbons

  270. Ge0rG

    Zash: I seriously hope that the Elders predate any of the XMPP protocols.

  271. jonasw

    Zash: you replied to Ge0rG's thread that at least two implementations are already tagging outgoing MUC PMs with <x/>. can you tell me which?

  272. Zash

    jcbrand: Prosody and ejabberd

  273. jonasw

    you’re not good at tabcompletion today

  274. Zash

    jonasw: Prosody and ejabberd

  275. Zash

    I'm not

  276. Zash

    I blame Kev. Everything is Kev's fault!

  277. jonasw

    Ge0rG was talking about clients, not servers, I think.

  278. jonasw

    Zash wrote: > "a) Require carbon-enabled clients to tag outgoing MUC-PMs with <x/>, carbon-copy the 'sent' MUC-PM to all clients, require carbon-enabled clients to check for <x/> tag and to drop if they are not joined. This is a 90% solution (it will still display outgoing PMs if you are joined to the same MUC under different nicknames, as the other client doesn't know which nickname the 'sent' message came from)." I believe at least two implementations do this already.
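
The receiving-client half of option a) might look roughly like the sketch below. It is not taken from any client or server; the function name `should_display_sent_carbon` and the `joined_rooms` set are hypothetical, and the namespaces are assumed to be the usual ones from XEP-0280 (`urn:xmpp:carbons:2`), XEP-0297 (`urn:xmpp:forward:0`) and XEP-0045 (`http://jabber.org/protocol/muc#user`).

```python
import xml.etree.ElementTree as ET

# Assumed namespaces (XEP-0280, XEP-0297, XEP-0045); not stated in the log.
CARBONS_NS = "urn:xmpp:carbons:2"
FORWARD_NS = "urn:xmpp:forward:0"
MUC_USER_NS = "http://jabber.org/protocol/muc#user"


def should_display_sent_carbon(carbon, joined_rooms):
    """Decide whether a 'sent' carbon should be shown on this client.

    `joined_rooms` is a set of bare room JIDs this client has joined.
    Anything that is not a tagged MUC-PM is displayed unchanged.
    """
    sent = carbon.find(f"{{{CARBONS_NS}}}sent")
    if sent is None:
        return True  # not a 'sent' carbon; nothing to filter here
    inner = sent.find(f"{{{FORWARD_NS}}}forwarded/{{jabber:client}}message")
    if inner is None:
        return True
    if inner.find(f"{{{MUC_USER_NS}}}x") is None:
        return True  # no <x/> tag, so not marked as a MUC-PM
    # Tagged MUC-PM: keep it only if this client is in the same room.
    room = inner.get("to", "").split("/", 1)[0]
    return room in joined_rooms


carbon = ET.fromstring(
    "<message xmlns='jabber:client' to='me@example.org/phone'>"
    "<sent xmlns='urn:xmpp:carbons:2'>"
    "<forwarded xmlns='urn:xmpp:forward:0'>"
    "<message xmlns='jabber:client' to='room@muc.example.org/nick' type='chat'>"
    "<body>hi</body>"
    "<x xmlns='http://jabber.org/protocol/muc#user'/>"
    "</message></forwarded></sent></message>"
)

print(should_display_sent_carbon(carbon, {"room@muc.example.org"}))  # True
print(should_display_sent_carbon(carbon, set()))                     # False
```

As the quoted text notes, this cannot tell two clients in the same room under different nicknames apart, which is why it is only a 90% solution.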

  279. Zash

    jonasw: I believe I was talking about servers

  280. Ge0rG

    Zash: you believe? :P

  281. jonasw

    ah, oddly quoted then

  282. Zash

    Ge0rG: bee-hive

  283. Tobias

    Zash, you mean bhyve? http://bhyve.org/ :P

  284. moparisthebest

    I wrote a kontalk JID hash to phone number lookup service if anyone is interested https://www.moparisthebest.com/phonehash/

  285. moparisthebest

    more of a fun learning exercise than anything, but you can look up any 1 of 100 billion phone numbers with it in ~2 seconds

  286. jonasw

    "kaputt" as we say in germany :)

  287. Zash

    Phone numbers in what format?

  288. moparisthebest

    Zash, currently supports 0-000-000-0000 to 9-999-999-9999, which kontalk hashes like '+00000000000'

  289. Zash

    So basically +%011d then
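
A minimal sketch of the `+%011d` format and the hash over it. It assumes the SHA-1 mentioned later in the discussion is taken over the ASCII string and hex-encoded, which the log does not spell out; `kontalk_style_hash` is a hypothetical name.

```python
import hashlib


def format_number(number: int) -> str:
    """Render an 11-digit number the way Zash describes: '+%011d'."""
    return "+%011d" % number


def kontalk_style_hash(number: int) -> str:
    """SHA-1 over the formatted string (hex encoding is an assumption)."""
    return hashlib.sha1(format_number(number).encode("ascii")).hexdigest()


print(format_number(0))            # +00000000000
print(format_number(15551234567))  # +15551234567
print(len(kontalk_style_hash(0)))  # 40 hex characters = 20 bytes
```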

  290. moparisthebest

    so I guess it doesn't support 2 digit country codes or strange formats? it'd be easy to generate files with those then

  291. moparisthebest

    yes all 11 digit phone numbers

  292. Zash

    Hm, 2TB of storage required for a rainbow table, or is my math wrong?

  293. moparisthebest

    Zash, I couldn't find really good resources for rainbow tables, so I don't know :)

  294. moparisthebest

    this only takes 500gb of storage though because I'm not storing any hashes, just the numbers

  295. Zash

    What

  296. Zash

    10¹¹ * 24B

  297. jonasw

    (also you could probably get away with storing only a unique prefix or part of the hash, reducing the storage needed drastically. after all, a phone number has only 36 bits of entropy)

  298. moparisthebest

    I explain it all here: https://github.com/moparisthebest/phonehash

  299. Zash

    20 byte sha1 output + 4 byte number

  300. moparisthebest

    99,999,999,999 won't quite fit in 4 bytes, you need 5, right?

  301. jonasw

    moparisthebest: nice hack!

  302. jonasw

    congrats on that idea :)

  303. Zash

    How many bytes is 64 bits again?

  304. jonasw

    8, Zash

  305. moparisthebest

    yea

  306. Zash

    Well then

  307. moparisthebest

    actually iirc that number fits in like 38 bits instead of the 40 bits I'm using
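
The storage figures being traded here can be checked with a few lines of arithmetic. The variable names are made up for the example, and sizes are in decimal units (1 GB = 10^9 bytes).

```python
TOTAL_NUMBERS = 10**11        # all 11-digit numbers, 0 .. 99,999,999,999
SHA1_BYTES = 20               # size of one SHA-1 digest

largest = TOTAL_NUMBERS - 1
bits_needed = largest.bit_length()          # bits to hold the largest number
bytes_per_number = (bits_needed + 7) // 8   # rounded up to whole bytes

numbers_only = TOTAL_NUMBERS * bytes_per_number
hash_plus_4 = TOTAL_NUMBERS * (SHA1_BYTES + 4)
hash_plus_number = TOTAL_NUMBERS * (SHA1_BYTES + bytes_per_number)

print(bits_needed)                  # 37
print(bytes_per_number)             # 5
print(numbers_only / 10**9)         # 500.0  (GB, numbers only)
print(hash_plus_4 / 10**12)         # 2.4    (TB, Zash's 10^11 * 24 B)
print(hash_plus_number / 10**12)    # 2.5    (TB, with 5-byte numbers)
```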

  308. Zash

    Probably possible to use truncated sha1 hashes, don't think the full output is required to avoid collisions

  309. moparisthebest

    but unaligned bytes sounded TERRIBLE

  310. Zash

    Storage is cheap

  311. Zash

    Maybe I should have done some calculations before attempting to generate this rainbow table in memory

  312. moparisthebest

    yea but I'm hosting this on my server that only had 800gb of free space hehe

  313. moparisthebest

    if someone really cared they could get a huge SSD and it'd be faster

  314. Zash

    Build a big B tree or something

  315. moparisthebest

    but 90 hours for generation on 2 slow spinners in linux software raid1, and then ~2 seconds per lookup is fine for me

  316. moparisthebest

    yea I sorted the numbers in the file by hash, but only store the numbers, so I could do a binary search

  317. moparisthebest

    which for 100 billion numbers is max 26 lookups/sha1 hashes
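
    The lookup described here can be sketched at small scale. This is not the phonehash code (which is written in Java); it is a minimal Python illustration assuming 5-byte big-endian records held in a `bytes` object instead of a file on disk, with hypothetical helper names. Only the numbers are stored, so each probe re-hashes the number it reads and compares that hash with the target.

```python
import hashlib

RECORD = 5  # bytes per stored number (assumed fixed width)


def phone_hash(n: int) -> bytes:
    """SHA-1 of the number formatted as '+' plus 11 zero-padded digits."""
    return hashlib.sha1(("+%011d" % n).encode("ascii")).digest()


def build_table(numbers) -> bytes:
    """Store only the numbers, ordered by their SHA-1 hash."""
    ordered = sorted(numbers, key=phone_hash)
    return b"".join(n.to_bytes(RECORD, "big") for n in ordered)


def lookup(table: bytes, target: bytes):
    """Binary search: read a number, hash it, compare with the target hash."""
    lo, hi = 0, len(table) // RECORD
    while lo < hi:
        mid = (lo + hi) // 2
        n = int.from_bytes(table[mid * RECORD:(mid + 1) * RECORD], "big")
        h = phone_hash(n)
        if h == target:
            return n
        if h < target:
            lo = mid + 1
        else:
            hi = mid
    return None  # hash not present in the table
```

    With a real file, the slice would become a seek to `mid * RECORD` followed by a 5-byte read, which is where the disk latency mentioned below comes from.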

  318. moparisthebest

    which my machine can apparently do in <2 seconds, probably mostly constrained by disk seek speeds

  319. jonasw

    yeah, 26 times sha1 should be *very* cheap

  320. jonasw

    microseconds cheap

  321. jonasw

    it’s the disk :)

  322. moparisthebest

    during generation I wrote to 65535 files and it KILLED my disk

  323. moparisthebest

    I had to put in synchronization code so only one file was written to at any given time

  324. moparisthebest

    so an SSD with no seek time, I bet generation would go from 90 hours to 20 or less

  325. moparisthebest

    anyone want to send me a >500gb SSD to find out? :P

  326. jonasw

    no.

  327. moparisthebest

    or run it yourself :P

  328. jonasw

    ENOSPC

  329. Zash

    Rent some CLOUD

  330. moparisthebest

    I don't have an SSD with that much free space

  331. moparisthebest

    or money to burn on a toy idea like this lol

  332. moparisthebest

    Zash, so how were you calculating how big a rainbow table needed to be?

  333. moparisthebest

    I couldn't really find good info

  334. moparisthebest

    the strings being hashed are like +00000000000

  335. moparisthebest

    so a +, then 11 digits

  336. Zash

    moparisthebest: But if they are numbers, you can just encode them as digits

  337. Zash

    err

  338. Zash

    as computers do

  339. moparisthebest

    yea but can rainbow tables?

  340. Zash

    integers

  341. moparisthebest

    I mean presumably a custom implementation can do whatever it wants

  342. Zash

    Giant hash table?

  343. moparisthebest

    but yea all the tools I found only let you specify 'character set' and 'length', which meant it was doing length of 12

  344. moparisthebest

    and also trying numbers like '00000+000000'

  345. moparisthebest

    and I had it do that and using 100% cpu for 4 days only had generated a 4gb file so far

  346. moparisthebest

    so I stopped it hehe

  347. Zash

    Huh, I ran into the 1GB memory limit of LuaJIT in a few seconds :/

  348. moparisthebest

    this was the first time Java using a 32-bit signed integer as array indices actually affected me

  349. Zash

    t [ sha1( sprintf("+%011d", i) ) ] = i for i in 0 → 10^11
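
    The one-liner above translates almost directly into Python. A minimal sketch, with `build_lookup` as a hypothetical name and a deliberately tiny range, since the full 10^11 entries would not fit in memory:

```python
import hashlib


def build_lookup(limit: int) -> dict:
    """Map sha1('+' + 11 zero-padded digits) to the number, for 0 <= i < limit."""
    table = {}
    for i in range(limit):
        digest = hashlib.sha1(("+%011d" % i).encode("ascii")).digest()
        table[digest] = i
    return table


table = build_lookup(1000)
print(table[hashlib.sha1(b"+00000000042").digest()])
```

    Every entry keeps the full 20-byte digest plus the number plus the container's own overhead, which is why this approach runs out of memory so quickly.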

  350. moparisthebest

    yea that's a *lot* of memory

  351. jonasw

    ha, I know why I have swap turned off by default :-)

  352. moparisthebest

    because just storing the integers as 5 bytes each is 500gb

  353. moparisthebest

    storing them with 20 byte sha1 hashes is 2.5tb
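
    The storage figures traded in this exchange follow from simple multiplication. A quick check, assuming decimal units (1 GB = 10^9 bytes) and no per-record overhead:

```python
ENTRIES = 10**11      # all 11-digit phone numbers
SHA1_BYTES = 20       # size of a SHA-1 digest
NUMBER_BYTES = 5      # fixed-width number as discussed above

numbers_only = ENTRIES * NUMBER_BYTES                  # just the numbers
with_hashes = ENTRIES * (SHA1_BYTES + NUMBER_BYTES)    # digest + 5-byte number
four_byte_estimate = ENTRIES * (SHA1_BYTES + 4)        # the 24 B per entry estimate


def to_gb(n: int) -> float:
    """Convert bytes to decimal gigabytes."""
    return n / 10**9


for label, size in [("numbers only", numbers_only),
                    ("hash + 5 bytes", with_hashes),
                    ("hash + 4 bytes", four_byte_estimate)]:
    print(f"{label}: {to_gb(size):.0f} GB")
```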

  354. moparisthebest

    if you stored them as the 12 character strings they are actually hashed with, that's a TON of space

  355. moparisthebest

    depending on character encoding and such of course hehe

  356. jonasw

    I’m still amazed by your binary search hack

  357. Zash

    ... binary search what

  358. Zash

    If you sort them into 256 buckets based on the first byte of sha1, then sort each of those buckets into buckets based on the second byte of sha1 output, and so on, you get a tree thing...

  359. moparisthebest

    Zash: 256 buckets is a bit big, I sorted them into 65535 buckets based on the first 2 bytes of the sha1 hash, then sorted those, concatenated all of them into one big now sorted file
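
    The bucket-then-sort step can be illustrated in memory. A sketch only: lists stand in for the on-disk bucket files, the function name is hypothetical, and the real generation had to stream data to disk. Note that two bytes give 65536 possible prefixes. Because the buckets are visited in prefix order and each bucket is sorted by full hash, the concatenation is globally sorted.

```python
import hashlib
from collections import defaultdict


def phone_hash(n: int) -> bytes:
    """SHA-1 of the number formatted as '+' plus 11 zero-padded digits."""
    return hashlib.sha1(("+%011d" % n).encode("ascii")).digest()


def bucket_sort_by_hash(numbers) -> list:
    """Bucket by the first two hash bytes, sort each bucket, then concatenate."""
    buckets = defaultdict(list)  # stands in for one file per 2-byte prefix
    for n in numbers:
        h = phone_hash(n)
        buckets[h[:2]].append((h, n))

    ordered = []
    for prefix in sorted(buckets):              # buckets in prefix order
        for _, n in sorted(buckets[prefix]):    # each bucket sorted by full hash
            ordered.append(n)
    return ordered
```

    The benefit of bucketing is that each bucket is small enough to sort in memory, even when the whole data set is not.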

  360. SamWhited

    Isn't that just the definition of a prefix tree?

  361. moparisthebest

    And then just do a binary search on it

  362. Zash

    SamWhited: Being self-taught, I rarely know the names of things.

  363. moparisthebest

    Idk I forgot most of these terms from school :-)

  364. SamWhited

    ah, no, this is the phone numbers… sort of a weird mix of prefix tree and binary search.

  365. SamWhited

    or a trie or whatever

  366. moparisthebest

    It's like a bucket sort, what I did

  367. moparisthebest

    That's where I got the idea anyway

  368. Zash

    I've seen a physical bucket sort. It was cool.

  369. Zash

    Post sorter machine

  370. Zash

    Mail sorter machine

  371. moparisthebest

    there I added a tl;dr to the readme

  372. moparisthebest

    tl;dr I put all 11 digit phone numbers represented as 5 byte integers in a 500gb file sorted by their sha1 hashes, now I can binary search it fast.

  373. moparisthebest

    https://github.com/moparisthebest/phonehash if I didn't link it already

  374. Zash

    There are some fun ways to store sets of integers

  375. Zash

    Like, delta compression

  376. Zash

    Or a giant bitfield

  377. jonasw

    not sure if any of these work if you have essentially shuffled integers

  378. moparisthebest

    yea compressing a random set of integers is impossible of course, but these are sequential, but the order is probably essentially random?

  379. moparisthebest

    it'd be interesting to look into though

  380. Zash

    If the numbers are divided into blocks of bit fields, then you do a linear search through the bit field but binary search on the blocks...

  381. moparisthebest

    I have no idea what you just said, why don't you try to implement it then let me look at your code... :)

  382. Zash

    I have no idea how large a bit field would need to be

  383. Zash

    Going to implement food instead

  384. intosi

    Zash: enjoy the debugging!

  385. xnyhps

    Instead of the binary search, wouldn't it be faster to jump to index (hash / 2^160) * size and search up or down from there depending on the difference between the hashes? The hashes should be quite uniform.

  386. moparisthebest

    not entirely positive what you mean xnyhps , sounds interesting to try though, care to explain more?

  387. jonasw

    moparisthebest: you have the data sorted by hashes, so looking at the first 32bit or something to guess the index in your array is a pretty reasonable thing to do

  388. moparisthebest

    ah yea, interesting

  389. xnyhps

    If you were looking for the hash 10000000000000..., you can assume it's close to 1/16th in the list of phone numbers because it's 1/16th of the possible values for the hash.
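
    The suggestion amounts to an interpolation-style jump followed by a short linear walk. A minimal sketch, assuming an in-memory list of numbers already sorted by hash; `guess_index` and `lookup` are hypothetical names, not part of phonehash:

```python
import hashlib


def phone_hash(n: int) -> bytes:
    """SHA-1 of the number formatted as '+' plus 11 zero-padded digits."""
    return hashlib.sha1(("+%011d" % n).encode("ascii")).digest()


def guess_index(target: bytes, size: int) -> int:
    """Estimate the position as (hash / 2^160) * size, using integer math."""
    return (int.from_bytes(target, "big") * size) >> 160


def lookup(ordered: list, target: bytes):
    """Jump to the guessed index, then walk up or down until found or passed."""
    size = len(ordered)
    if size == 0:
        return None
    i = min(guess_index(target, size), size - 1)
    # Walk upwards if the guess landed below the target, downwards otherwise.
    step = 1 if phone_hash(ordered[i]) < target else -1
    while 0 <= i < size:
        h = phone_hash(ordered[i])
        if h == target:
            return ordered[i]
        if (h > target) == (step == 1):
            return None  # walked past where the target would have been
        i += step
    return None
```

    How far the walk goes depends on how evenly the hashes are spread, which is the question raised in the following messages.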

  390. xnyhps

    You'd start at 8000000..., then 4000000..., etc.

  391. moparisthebest

    it's not clear to me whether that would always be 26 comparisons or less though?

  392. moparisthebest

    I guess it'd depend on exactly how evenly distributed the hashes were?

  393. xnyhps

    Yeah.

  394. jonasw

    sha1 should be pretty uniform

  395. Zash

    So you basically treat it as a hash table?

  396. moparisthebest

    it is for some definition of "pretty" :)

  397. xnyhps

    You could also do a binary search, but with a weighted "middle" value.

  398. moparisthebest

    so like when I sorted it into 65535 different files based on the first 2 bytes, if it was *perfectly* distributed the files should have all been like 7.3mb, but they ranged between ~7 and 7.8 or so

  399. Zash

    moparisthebest: 10^11 is a pretty small sample size tho :)

  400. moparisthebest

    I should have taken exact byte counts at the time

  401. Zash

    You could have just kept the files as is

  402. moparisthebest

    yea could have, more math than just jumping to a place in one file though

  403. Zash

    Wouldn't it be exactly the same math, just a smaller file?

  404. moparisthebest

    like I already had the code written to do a binary search in one file and didn't want to bother doing anything else hehe

  405. Zash

    and you get told the right file to open from the input

  406. moparisthebest

    and it was the same amount of disk writing, read file into memory, sort file, write file to same file or append to one file is the same

  407. moparisthebest

    ah yea that's true, yea that would have been faster

  408. moparisthebest

    well I'm not redoing it haha

  409. Zash

    And then jumping to some point based on the 3rd byte of the input and done a linear search from there
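
    The per-bucket variant described in the last few messages could look roughly like this. It is a sketch under assumptions: a dict of lists stands in for the separate bucket files, the first two bytes of the input hash pick the bucket, and the third byte gives the starting guess for a linear search. All names are hypothetical.

```python
import hashlib


def phone_hash(n: int) -> bytes:
    """SHA-1 of the number formatted as '+' plus 11 zero-padded digits."""
    return hashlib.sha1(("+%011d" % n).encode("ascii")).digest()


def build_buckets(numbers) -> dict:
    """One hash-sorted list of numbers per 2-byte hash prefix."""
    raw = {}
    for n in numbers:
        h = phone_hash(n)
        raw.setdefault(h[:2], []).append((h, n))
    return {prefix: [n for _, n in sorted(entries)]
            for prefix, entries in raw.items()}


def lookup(buckets: dict, target: bytes):
    """Pick the bucket from bytes 0-1, guess a start from byte 2, then scan."""
    bucket = buckets.get(target[:2], [])
    if not bucket:
        return None
    i = min(target[2] * len(bucket) // 256, len(bucket) - 1)
    step = 1 if phone_hash(bucket[i]) < target else -1
    while 0 <= i < len(bucket):
        h = phone_hash(bucket[i])
        if h == target:
            return bucket[i]
        if (h > target) == (step == 1):
            return None  # passed the position where the target would sit
        i += step
    return None
```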

  410. moparisthebest

    it would be faster but it's not faster by a huge margin log(100000000000/65535) is 15 sha1+comparison worst case vs 26 for the whole 100 billion
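
    The worst-case probe counts can be computed rather than estimated. One caveat worth noting: the figures 26 and 15 quoted in the chat match the natural logarithm, while a binary search halves the range on each step and so follows the base-2 logarithm. A short check, assuming 65536 equally sized buckets:

```python
import math

TOTAL = 10**11     # all 11-digit phone numbers
BUCKETS = 65536    # possible values of a 2-byte prefix

# Worst-case binary search probes follow log base 2.
probes_whole_file = math.ceil(math.log2(TOTAL))
probes_per_bucket = math.ceil(math.log2(TOTAL / BUCKETS))

# The natural logarithm reproduces the numbers quoted in the chat.
ln_whole_file = math.ceil(math.log(TOTAL))
ln_per_bucket = math.ceil(math.log(TOTAL / 65535))

print(probes_whole_file, probes_per_bucket)   # base 2
print(ln_whole_file, ln_per_bucket)           # natural log
```

    Either way the conclusion in the chat holds: bucketing saves 16 halving steps, one per bit of the 2-byte prefix.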

  411. moparisthebest

    still that would have been better :)

  412. jonasw

    moparisthebest: it will be quite a bit faster

  413. Zash

    Don't underestimate the performance of linear searches. CPUs, kernels and the disk can be much smarter

  414. jonasw

    sequential access is good for spinning disks

  415. jonasw

    probably you will have only a single access because all of the numbers fit in the same block

  416. jonasw

    maybe two disk accesses

  417. Zash

    binary search mucks up all the caches and whatnot

  418. jonasw

    I guess the disk latency is the most expensive thing here, and that will vanish to O(1) instead of O(log n)

  419. moparisthebest

    yea it'd be interesting to see how much it improves

  420. moparisthebest

    it's already <2 seconds though, if I reworked it like that what would it drop to, 1?

  421. moparisthebest

    hmm

  422. jonasw

    100ms or something maybe, from a naive calculation

  423. moparisthebest

    well also I can see how a forward sequential read would be faster

  424. moparisthebest

    but jumping to a place might put me ahead of it too which would cause a reverse sequential read

  425. moparisthebest

    which would equally negate caching

  426. jonasw

    hm, maybe

  427. jonasw

    this then really depends on the block alignment

  428. jonasw

    the disk cache is what benefits your application the most, I think, and that is controlled by the block alignment. unless you hit a block boundary you should be fine. and that can happen both ways.

  429. jonasw

    on a more on-topic note, has there been discussion about introducing {urn:xmpp:mix:0}feature elements in disco query responses? I do not like that idea.

  430. Zash

    Wha?

  431. jonasw

    Zash: e.g. example 40 in xep 369

  432. Zash

    -xep 369 ex 40

  433. Bunneh

    Zash: http://xmpp.org/extensions/xep-0369.html#example-40

  434. jonasw

    woah, dem features

  435. jonasw

    (pun not intended)

  436. Zash

    Hnnng

  437. jonasw

    I cannot interpret that

  438. Zash

    <grunt-of-disapproval/>

  439. Zash

    No caps?

  440. jonasw

    not sure if mix channels are supposed to support caps :)

  441. jonasw

    but the interaction with caps would for sure be interesting

  442. moparisthebest

    so based on whether a client supports mix or not, could a server allow them into a mix channel if supported or throw them into some type of muc compatibility layer for the mix jid if not?

  443. moparisthebest

    sounds super hacky and terrible from a server POV but nice and cozy from a client/user POV :)

  444. moparisthebest

    ie client A supports mix, client B supports muc, both try to join room@example.org, end up joining same room but B is using his server's muc->mix layer?

  445. moparisthebest

    since mix requires server support anyway, at least it wouldn't *require* client support this way

  446. moparisthebest

    plus I'm sure that sounds like loads of fun to implement for Zash

  447. moparisthebest

    hey it'd work the other way around too, servers could treat remote MUCs as MIXs for their clients that supported MIX ? :)

  448. jonasw

    good night everyone ☺

  449. moparisthebest

    alright SamWhited it's up https://github.com/xsf/xeps/pull/426