jdev - 2021-10-17


  1. lovetox

    does an element name like <stream:stream> in xml have a special meaning because of the ":", or is it just an element name like any other where ":" has no special meaning?

  2. lovetox

    i tried to find it in the xml spec, but this thing is unreadable for me

  3. flow

    lovetox, IIRC it has no special meaning if there are no XML namespaces

  4. flow

    I think https://www.w3.org/TR/xml-names/ is what gives ":" in xml element names a special semantic

  5. flow

    So as soon as you use namespaces, which is probably true for most practical usages of XML, it has a special meaning

  6. MattJ

    Including in XMPP

  7. flow

    lovetox, but I am curious, as it looks like there is some more backstory to this question. if so, would you mind elaborating a bit? :)

  8. lovetox

    the backstory is i'm trying to rewrite the nbxmpp streamparser and now i have to understand this :)

  9. flow

    lovetox, I'd hoped that python has a good xml pull parser library that you could re-use?

  10. flow

    otherwise I feel like you will hit a few xml trapdoors and probably re-work your implementation 2-3 times :)

  11. flow

    but you will learn a lot about XML in the process, so not sure if this will be a loss, just be prepared

  12. MattJ

    500% use an existing parser

  13. MattJ

    Please please please

  14. flow

    make sure to look at existing APIs. I think Java's StAX is a very good example of an XML stream parser

  15. jonas’

    the standard library parser is good enough for that

  16. jonas’

    it is based on Expat and does SAX

  17. jonas’

    please use that.

  18. MattJ

    expat is a great choice

  19. Zash

    just remember to disable all the security holes

  20. jonas’

    it can be used non-blockingly. If you need proof, look at aioxmpp.

  21. Zash

    at least that's easier than reinventing the security holes yourself

  22. lovetox

    calm down, i use expat

  23. flow

    jonas’, isn't SAX a push parser? when can those not be used non-blockingly?

  24. lovetox

    still need to understand xml namespaces

  25. MattJ

    I think recent expat versions have more sensible defaults, though forbidden XMPP stuff will still need handling

  26. jonas’

    lovetox, do you? the parser should do all that for you.

  27. jonas’

    (and expat does if you ask nicely)

  28. jonas’

    (and by extension, the python standard library xml module does)

  29. flow

    I think you need to understand XML when parsing XML, and even if not, it can't hurt ;)

  30. jonas’

    https://github.com/horazont/aioxmpp/blob/devel/aioxmpp/xml.py#L818-L1111 explicitly waiving the infectiousness of the *GPL for you, lovetox, if you need/use that as inspiration.

  31. lovetox

    jonas’, if i search for nx prefix in that code, i find a lot of matches

  32. lovetox

    thanks flow, that document did help to understand prefixes :)

  33. jonas’

    nx?

  34. lovetox

    *ns

  35. lovetox

    regarding your comment that it's not necessary to understand and that python xml libs do it for you

  36. MattJ

    The summary is that the element called "foo" in one namespace is not the same as the element called "foo" in another namespace. A namespace is identified by a string, usually found in the 'xmlns' attribute. Prefixes are simply a more concise replacement for adding 'xmlns' attributes everywhere (you declare them once and use them as many times as you want)

  37. MattJ

    This looks like a sane explanation: http://tech.usgin.org/content/xml-namespaces-and-prefixes

  38. MattJ

    XMPP is heavily based on XML namespaces, so understanding them is not really optional for an XMPP dev. But a good parser (such as expat) will save you from having to handle all the rules (such as inheritance) yourself, and it will tell you what namespace every element is in (regardless of whether it used a prefix or an xmlns attribute)

  39. lovetox

    yes that would be nice

  40. lovetox

    i just tested expat

  41. lovetox

    and it does not handle it

  42. lovetox

    at least the expat api that is accessible via python

  43. lovetox

    hm, although some examples show it does, maybe i need to investigate this more

  44. lovetox

    Expat can optionally do XML namespace processing for you, enabled by providing a value for namespace_separator
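
A minimal sketch of what that documentation line means in practice, using only the standard library's `xml.parsers.expat`. The helper name `start_element_names` and the choice of a space as separator are illustrative assumptions, not something from the discussion.

```python
import xml.parsers.expat

# The stream header from the discussion, with a child element that relies on
# the default namespace declared on the root.
DOC = (
    b"<stream:stream xmlns:stream='http://etherx.jabber.org/streams' "
    b"xmlns='jabber:client'><message/></stream:stream>"
)


def start_element_names(data, namespace_separator=None):
    """Return the element names Expat reports, in document order."""
    names = []
    # With namespace_separator=None, Expat does no namespace processing and
    # reports names exactly as written, prefix included.
    parser = xml.parsers.expat.ParserCreate(namespace_separator=namespace_separator)
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(data, True)
    return names


raw_names = start_element_names(DOC)
ns_names = start_element_names(DOC, namespace_separator=" ")

print(raw_names)  # prefixed names as they appear in the document
print(ns_names)   # "<namespace URI><separator><local name>"
```

With the separator set, the prefix disappears from the reported name and the namespace URI takes its place, so the handler never has to track prefix declarations itself.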

  45. lovetox

    oh i did specify that :)

  46. jonas’

    lovetox, check that: https://github.com/horazont/aioxmpp/blob/devel/aioxmpp/xml.py#L1106-L1111

  47. lovetox

    jonas’, i currently use xml.parsers.expat.ParserCreate

  48. lovetox

    is there any benefit of using the sax module?

  49. jonas’

    less boiler plate to write I guess

  50. jonas’

    I strongly suggest going with the high-level SAX API, it saves you from worrying about namespace prefixes and stuff.
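
A rough sketch of the high-level SAX route, assuming the standard library's `xml.sax` with namespace processing switched on. The `NameCollector` class and the chunking are hypothetical; the chunks are split at tag boundaries purely to keep the example simple, and this is not how aioxmpp or nbxmpp structure their parsers.

```python
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NameCollector(ContentHandler):
    """Record (event, (namespace URI, local name)) pairs as they arrive."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElementNS(self, name, qname, attrs):
        # `name` is a (namespace URI, local name) tuple; prefixes are resolved.
        self.events.append(("start", name))

    def endElementNS(self, name, qname):
        self.events.append(("end", name))


handler = NameCollector()
parser = xml.sax.make_parser()
parser.setFeature(feature_namespaces, True)
parser.setContentHandler(handler)

# Feed the data piece by piece, as it might arrive from a socket.
chunks = [
    b"<stream:stream xmlns:stream='http://etherx.jabber.org/streams' "
    b"xmlns='jabber:client'>",
    b"<message/>",
    b"</stream:stream>",
]
for chunk in chunks:
    parser.feed(chunk)
parser.close()

for event in handler.events:
    print(event)
```

The handler only ever sees `(namespace, localname)` tuples, which is the "less boilerplate" part: there is no separator string to split and no prefix bookkeeping.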

  51. lovetox

    ok i look into it

  52. reset

    n

  53. lovetox

    and all the prefix stuff is just to save on size? it does not add some new feature to xml? so if a client does not use ns prefixes at all it's perfectly fine?

  54. Zash

    prefix stuff is generally not used in XMPP, apart from for <stream:stream> and Dialback

  55. Zash

    Does it not add the ability to namespace attributes? I.e how else would you say `<e xmlns:foo="urn:example:ns" foo:bar="blah"/>` ?
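
To illustrate the attribute case, here is a small sketch that runs Zash's element through Expat with namespace processing enabled. The extra `plain` attribute and the helper name `parse_attributes` are assumptions added for contrast.

```python
import xml.parsers.expat


def parse_attributes(data):
    """Map each element name to the attribute dict Expat reports for it."""
    found = {}
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")

    def on_start(name, attrs):
        found[name] = dict(attrs)

    parser.StartElementHandler = on_start
    parser.Parse(data, True)
    return found


# `plain` is added here only to contrast with the prefixed attribute.
result = parse_attributes(
    b'<e xmlns:foo="urn:example:ns" foo:bar="blah" plain="x"/>'
)
print(result)
```

The prefixed attribute comes back keyed by its namespace URI plus local name, while the unprefixed one stays in no namespace; an unprefixed attribute does not pick up the element's default namespace.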

  56. Zash

    (Which is also pretty rare in XMPP)

  57. lovetox

    yeah it does, forgot about the attribute stuff

  58. lovetox

    its weird in one example the stream prefix is used

  59. lovetox

    in others not in the rfc

  60. Zash

    Hm?

  61. Zash

    Like `<stream ....` ? Probably just to make the examples more compact

  62. lovetox

    for example here its used

  63. lovetox

    https://datatracker.ietf.org/doc/html/rfc6120#section-9.1.2

  64. lovetox

    https://datatracker.ietf.org/doc/html/rfc6120#section-9.2.1

  65. lovetox

    here it's then suddenly missing from the features

  66. Zash

    🤷️

  67. lovetox

    i think they want to make a point that it can be used

  68. Zash

    But doesn't it also say that you should encode the stream as `<stream:stream ...>` ?

  69. Zash

    https://datatracker.ietf.org/doc/html/rfc6120#section-4.8.5

  70. lovetox

    oh .. wait

  71. lovetox

    it's not optional

  72. lovetox

    if we want to define a default namespace of jabber:client in <stream>

  73. lovetox

    you need a prefix if <stream> should not have jabber:client

  74. lovetox

    is this example not wrong here?

  75. lovetox

    S2: <stream from='example.net' id='hTiXkW+ih9k2SqdGkk/AZi0OJ/Q=' to='im.example.com' version='1.0' xmlns='http://etherx.jabber.org/streams'>

  76. lovetox

    it does not define a default namespace of jabber:client or server

  77. Zash

    `<stream xmlns="http://etherx.jabber.org/streams"><message xmlns="jabber:client"/></stream>` == `<stream:stream xmlns:stream="http://etherx.jabber.org/streams" xmlns="jabber:client"><message/></stream>`

  78. Zash

    `<stream xmlns="http://etherx.jabber.org/streams"><message xmlns="jabber:client"/></stream>` == `<stream:stream xmlns:stream="http://etherx.jabber.org/streams" xmlns="jabber:client"><message/></stream:stream>`
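
The equivalence above can be checked with any namespace-aware parser. A quick sketch using the standard library's `xml.etree.ElementTree`, chosen here only because it makes the comparison short; it parses whole documents, so it is not a suggestion for parsing an XMPP stream. The helper `qualified_tags` is hypothetical.

```python
import xml.etree.ElementTree as ET

UNPREFIXED = (
    '<stream xmlns="http://etherx.jabber.org/streams">'
    '<message xmlns="jabber:client"/></stream>'
)
PREFIXED = (
    '<stream:stream xmlns:stream="http://etherx.jabber.org/streams" '
    'xmlns="jabber:client"><message/></stream:stream>'
)


def qualified_tags(document):
    """Return every element's tag in '{namespace}localname' form."""
    return [element.tag for element in ET.fromstring(document).iter()]


unprefixed_tags = qualified_tags(UNPREFIXED)
prefixed_tags = qualified_tags(PREFIXED)

print(unprefixed_tags)
print(prefixed_tags)
print(unprefixed_tags == prefixed_tags)
```

Both spellings resolve to the same namespace and local name for each element, which is why a namespace-aware parser lets the application ignore which one the peer sent.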

  79. lovetox

    hm yeah

  80. lovetox

    but without defining it on the stream element, this would indeed be very wasteful

  81. Zash

    So if you could be certain that the other end of the stream understands namespaces and prefixes fully, you could cut down on repeated namespace declaration for common things. Like imagine `<sm:r/>` instead of `<r xmlns="urn:xmpp:sm:3"/>`

  82. reset

    test

  83. reset

    hmm

  84. Zash

    This was a triumph! I'm making a note here: HUGE SUCCESS!