-
lovetox
does an element name like <stream:stream> in xml have a special meaning because of the ":", or is it just an element name like any other, where ":" has no special meaning?
-
lovetox
i tried to find it in the xml spec, but this thing is unreadable for me
-
flow
lovetox, IIRC it has no special meaning if there are no XML namespaces
-
flow
I think https://www.w3.org/TR/xml-names/ is what gives ":" in xml element names special semantics
-
flow
So as soon as you use namespaces, which is probably true for most practical usages of XML, it has a special meaning
-
MattJ
Including in XMPP
-
flow
lovetox, but I am curious, as it looks like there is some more backstory to this question. if so, would you mind elaborating a bit? :)
-
lovetox
the backstory is that i'm trying to rewrite the nbxmpp streamparser, and now i have to understand this :)
-
flow
lovetox, I'd hoped that python has a good xml pull parser library that you could re-use?
-
flow
otherwise I feel like you will hit a few xml trapdoors and probably re-work your implementation 2-3 times :)
-
flow
but you will learn a lot about XML in the process, so not sure if this will be a loss, just be prepared
-
MattJ
500% use an existing parser
-
MattJ
Please please please
-
flow
make sure to look at existing APIs. I think Java's StAX is a very good example of an XML stream parser
-
jonas’
the standard library parser is good enough for that
-
jonas’
it is based on Expat and does SAX
-
jonas’
please use that.
-
MattJ
expat is a great choice
-
Zash
just remember to disable all the security holes
-
jonas’
it can be used non-blockingly. If you need proof, look at aioxmpp.
-
Zash
at least that's easier than reinventing the security holes yourself
-
lovetox
calm down, i use expat
-
flow
jonas’, isn't SAX a push parser? when can those not be used non-blockingly?
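A minimal sketch of what "push" means here, assuming the low-level `xml.parsers.expat` module from the Python standard library: the caller hands over whatever bytes or text have arrived, the parser fires callbacks for the tokens that are complete, and the call returns without waiting for more input. The helper name `parse_in_chunks` and the sample chunks are made up for illustration.

```python
import xml.parsers.expat


def parse_in_chunks(chunks):
    """Feed chunks to a push parser and record the events it reports."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    for chunk in chunks:
        # isfinal=False: more data may follow, as on a long-lived stream.
        parser.Parse(chunk, False)
    return events


# The second element is split across two network reads (hypothetical data).
events = parse_in_chunks(["<stream><mess", "age/>"])
```

Since the stream root is never closed during a session, the parser is never told that the input is final; each call simply processes what it can.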
-
lovetox
still need to understand xml namespaces
-
MattJ
I think recent expat versions have more sensible defaults, though forbidden XMPP stuff will still need handling
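One way the "forbidden stuff" could be handled with expat from Python is to install handlers that raise as soon as such a construct shows up. This is only a sketch: the set of rejected constructs (DOCTYPE declarations, processing instructions, comments) is an assumption about what an XMPP stream parser would want to refuse, and `ForbiddenXML`, `make_strict_parser` and `is_rejected` are hypothetical names.

```python
import xml.parsers.expat


class ForbiddenXML(Exception):
    """Hypothetical error type for constructs this sketch refuses to accept."""


def _reject(kind):
    def handler(*args):
        # Raising inside a handler aborts parsing; the exception
        # propagates out of parser.Parse().
        raise ForbiddenXML(kind)
    return handler


def make_strict_parser():
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    parser.StartDoctypeDeclHandler = _reject("doctype declaration")
    parser.ProcessingInstructionHandler = _reject("processing instruction")
    parser.CommentHandler = _reject("comment")
    return parser


def is_rejected(data):
    """Return True if the strict parser refuses the given document."""
    parser = make_strict_parser()
    try:
        parser.Parse(data, True)
    except ForbiddenXML:
        return True
    return False
```

Entity-related restrictions and resource limits are not covered here and would need separate treatment.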
-
jonas’
lovetox, do you? the parser should do all that for you.
-
jonas’
(and expat does if you ask nicely)
-
jonas’
(and by extension, the python standard library xml module does)
-
flow
I think you need to understand XML when parsing XML, and even if not, it can't hurt ;)
-
jonas’
https://github.com/horazont/aioxmpp/blob/devel/aioxmpp/xml.py#L818-L1111 explicitly waiving the infectiousness of the *GPL for you, lovetox, if you need/use that as inspiration.
-
lovetox
jonas’, if i search for nx prefix in that code, i find a lot of matches
-
lovetox
thanks flow, that document did help to understand prefixes :)
-
jonas’
nx?
-
lovetox
*ns
-
lovetox
regarding your comment that it's not necessary to understand and that the python xml libs do it for you
-
MattJ
The summary is that the element called "foo" in one namespace is not the same as the element called "foo" in another namespace. A namespace is identified by a string, usually found in the 'xmlns' attribute. Prefixes are simply a more concise replacement for adding 'xmlns' attributes everywhere (you declare them once and use them as many times as you want)
-
MattJ
This looks like a sane explanation: http://tech.usgin.org/content/xml-namespaces-and-prefixes
-
MattJ
XMPP is heavily based on XML namespaces, so understanding them is not really optional for an XMPP dev. But a good parser (such as expat) will save you from having to handle all the rules (such as inheritance) yourself, and it will tell you what namespace every element is in (regardless of whether it used a prefix or an xmlns attribute)
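To illustrate the point about the parser reporting the namespace regardless of how it was declared, here is a small sketch using `xml.parsers.expat` with namespace processing switched on through `namespace_separator`. The helper `element_names` and the prefix `s` are made up for the example.

```python
import xml.parsers.expat


def element_names(document):
    """Return the element names exactly as expat reports them."""
    names = []
    # A non-empty separator enables namespace processing; names are then
    # reported as "<namespace URI><separator><local name>".
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(document, True)
    return names


with_prefix = element_names('<s:stream xmlns:s="http://etherx.jabber.org/streams"/>')
with_default = element_names('<stream xmlns="http://etherx.jabber.org/streams"/>')
# Both are ['http://etherx.jabber.org/streams stream']

other_ns = element_names('<stream xmlns="urn:example:other"/>')
# ['urn:example:other stream'], a different element despite the same local name
```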
-
lovetox
yes that would be nice
-
lovetox
i just tested expat
-
lovetox
and it does not handle it
-
lovetox
at least the expat api that is accessible via python
-
lovetox
hm, although some examples show it does, maybe i need to investigate this more
-
lovetox
Expat can optionally do XML namespace processing for you, enabled by providing a value for namespace_separator
-
lovetox
oh i did specify that :)
-
jonas’
lovetox, check that: https://github.com/horazont/aioxmpp/blob/devel/aioxmpp/xml.py#L1106-L1111
-
lovetox
jonas’, i currently use xml.parsers.expat.ParserCreate
-
lovetox
is there any benefit of using the sax module?
-
jonas’
less boilerplate to write I guess
-
jonas’
I strongly suggest going with the high-level SAX API, it saves you from worrying about namespace prefixes and stuff.
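A rough sketch of what the high-level route could look like with the standard library `xml.sax` module, assuming namespace processing is enabled via `feature_namespaces` and data is pushed in with `feed()`. The `Collector` class and `collect` helper are hypothetical and not taken from aioxmpp or nbxmpp.

```python
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class Collector(ContentHandler):
    """Record the (namespace URI, local name) pair of every start tag."""

    def __init__(self):
        super().__init__()
        self.started = []

    def startElementNS(self, name, qname, attrs):
        # With feature_namespaces on, name is a (uri, localname) tuple.
        self.started.append(name)


def collect(chunks):
    handler = Collector()
    parser = xml.sax.make_parser()  # Expat-based incremental parser
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)
    return handler.started


started = collect([
    '<stream:stream xmlns:stream="http://etherx.jabber.org/streams" '
    'xmlns="jabber:client"><mess',
    'age/>',
])
```

The handler never sees the `stream:` prefix, only the resolved namespace and the local name.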
-
lovetox
ok i look into it
-
lovetox
and all the prefix stuff is just to save on size? it does not add some new feature to xml? so if a client does not use ns prefixes at all it's perfectly fine?
-
Zash
prefix stuff is generally not used in XMPP, apart from for <stream:stream> and Dialback
-
Zash
Does it not add the ability to namespace attributes? I.e., how else would you say `<e xmlns:foo="urn:example:ns" foo:bar="blah"/>` ?
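As a quick check of how such a namespaced attribute comes out of a namespace-aware parser, here is a small sketch using `xml.etree.ElementTree` from the standard library, which reports qualified names in `{uri}local` form.

```python
import xml.etree.ElementTree as ET

element = ET.fromstring('<e xmlns:foo="urn:example:ns" foo:bar="blah"/>')

tag = element.tag
# 'e': the element itself is in no namespace, since prefixes are opt-in
# and no default namespace is declared.

attributes = dict(element.attrib)
# {'{urn:example:ns}bar': 'blah'}: the attribute carries the namespace.
```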
-
Zash
(Which is also pretty rare in XMPP)
-
lovetox
yeah it does, forgot about the attribute stuff
-
lovetox
it's weird, in one example in the rfc the stream prefix is used
-
lovetox
in others it's not
-
Zash
Hm?
-
Zash
Like `<stream ....` ? Probably just to make the examples more compact
-
lovetox
for example here its used
-
lovetox
https://datatracker.ietf.org/doc/html/rfc6120#section-9.1.2
-
lovetox
https://datatracker.ietf.org/doc/html/rfc6120#section-9.2.1
-
lovetox
and here it's then suddenly missing from the features
-
Zash
🤷️
-
lovetox
i think they want to make a point that it can be used
-
Zash
But doesn't it also say that you should encode the stream as `<stream:stream ...>` ?
-
Zash
https://datatracker.ietf.org/doc/html/rfc6120#section-4.8.5
-
lovetox
oh .. wait
-
lovetox
it's not optional
-
lovetox
if we want to define a default namespace of jabber:client in <stream>
-
lovetox
you need a prefix if <stream> should not have jabber:client
-
lovetox
is this example here not wrong?
-
lovetox
S2: <stream from='example.net' id='hTiXkW+ih9k2SqdGkk/AZi0OJ/Q=' to='im.example.com' version='1.0' xmlns='http://etherx.jabber.org/streams'>
-
lovetox
it does not define a default namespace of jabber:client or jabber:server
-
Zash
`<stream xmlns="http://etherx.jabber.org/streams"><message xmlns="jabber:client"/></stream>` == `<stream:stream xmlns:stream="http://etherx.jabber.org/streams" xmlns="jabber:client"><message/></stream:stream>`
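The equivalence of the two serializations can be checked with any namespace-aware parser. A minimal sketch with `xml.etree.ElementTree`, where the helper name `qualified_tags` is made up:

```python
import xml.etree.ElementTree as ET


def qualified_tags(document):
    """List every element as '{namespace}localname', in document order."""
    return [el.tag for el in ET.fromstring(document).iter()]


unprefixed = qualified_tags(
    '<stream xmlns="http://etherx.jabber.org/streams">'
    '<message xmlns="jabber:client"/></stream>'
)
prefixed = qualified_tags(
    '<stream:stream xmlns:stream="http://etherx.jabber.org/streams" '
    'xmlns="jabber:client"><message/></stream:stream>'
)
# Both are:
# ['{http://etherx.jabber.org/streams}stream', '{jabber:client}message']
```

In the prefixed form the default namespace is declared once on the root and inherited by every child, which is why stanzas inside the stream do not need to repeat it.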
-
lovetox
hm yeah
-
lovetox
but without defining it on the stream element, this would indeed be very wasteful
-
Zash
So if you could be certain that the other end of the stream understands namespaces and prefixes fully, you could cut down on repeated namespace declaration for common things. Like imagine `<sm:r/>` instead of `<r xmlns="urn:xmpp:sm:3"/>`
-
Zash
This was a triumph! I'm making a note here: HUGE SUCCESS!