-
lovetox
so for domainnames normally i use stringprepare
-
lovetox
nameprep
-
lovetox
but in python there is a module available that implements the idna2008 standard
-
jonas’
I don’t like where this is going
-
lovetox
can i just switch to that, or is this going to be problematic?
-
Zash
Me neither
-
jonas’
no, those are different things
-
lovetox
really?
-
jonas’
you need to do nameprep first, then you can IDNA-encode the name before handing it to the DNS server
-
jonas’
IDNA is an encoding (unicode -> DNS-compatible ascii bytes)
-
jonas’
like UTF-8 is an encoding
-
jonas’
problem is: there are two incompatible versions of idna, and nobody knows which one to use
-
lovetox
im talking about validating domainnames
-
jonas’
nameprep
-
lovetox
in python there is an idna standard module
-
jonas’
that’s all validation you need
-
lovetox
it offers a method thats called nameprep
-
jonas’
there is no idna module in the python standard library
-
jonas’
according to https://docs.python.org/3/library/
-
lovetox
its a submodule of stringprepare
-
jonas’
there is no stringprepare module either
-
lovetox
or stringprep dont know
-
jonas’
stringprep doesn’t have submodules AFAIK. most certainly not an `idna` submodule
-
Zash
It hasn't been long enough since some user put http://example.com:5280/ as their XMPP server name and everything was fine with that.
-
jonas’
I’m wondering what you’re talking about. LTIC python only supported IDNA2003
-
lovetox
ok https://github.com/python/cpython/blob/e42b705188271da108de42b55d9344642170aa2b/Lib/encodings/idna.py
-
lovetox
it's an encoding
-
jonas’
as I said, yes
-
jonas’
and it’s IDNA2003
-
jonas’
not 2008
-
jonas’
again, IDNA* are encodings like UTF-8 are encodings. has nothing to do with validation or normalisation like nameprep does
-
lovetox
yes and it has a method called nameprep
-
jonas’
neat
-
jonas’
but it’s not public API
-
jonas’
seems as if nameprep is a precondition for IDNA2003 and they’re doing that for you
-
lovetox
oh ok so thats what confused me
-
jonas’
you can do a call to codecs.encode(some_domain, "idna") and if it doesn’t raise UnicodeEncodeError, then you know that it passes nameprep
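A minimal sketch of that check, assuming the standard-library "idna" codec (IDNA2003, not IDNA2008). The helper name `passes_stdlib_idna` is hypothetical. It catches `UnicodeError`, the parent class of `UnicodeEncodeError`, since which of the two gets raised may depend on the Python version:

```python
import codecs


def passes_stdlib_idna(domain: str) -> bool:
    """Return True if the stdlib "idna" codec (IDNA2003) accepts the name.

    Hypothetical helper for illustration. The codec runs nameprep on
    non-ASCII labels and enforces label length limits.
    """
    try:
        codecs.encode(domain, "idna")
    except UnicodeError:  # also covers UnicodeEncodeError
        return False
    return True
```

A label longer than 63 bytes or an empty label is rejected, but a pure-ASCII string such as "http://foo:5082" still passes, so this is not a full hostname validator.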
-
lovetox
so can i switch to nameprep -> IDNA2008
-
lovetox
?
-
jonas’
no
-
jonas’
it’s idna2003
-
jonas’
not idna2008
-
jonas’
also note that nameprep has been deprecated in XMPP
-
lovetox
you just said nameprep has nothing to do with idna, its just a precondition
-
lovetox
so what is it now
-
jonas’
TIL that nameprep is a precondition to IDNA2003.
-
jonas’
it doesn’t matter tho
-
jonas’
IDNA2003 does more than just nameprep
-
jonas’
if you want nameprep, you should do nameprep and not IDNA2003
-
Zash
You actually want IDNA too, nameprep isn't enough to validate an XMPP hostname.
-
Zash
domain. hostpart. thing.
-
jonas’
Zash, IDNA doesn’t buy you much though:
>>> codecs.encode("http://foo:5082", "idna")
b'http://foo:5082'
-
Zash
See previously mentioned incident with someone having a HTTP URL in their config
-
Zash
wut
-
jonas’
yeah.
-
jonas’
it’s just nameprep plus some mapping of characters
-
jonas’
plus length restrictions
-
Zash
> print(util.encodings.idna.to_ascii("http://foo:123/"))
nil
-
jonas’
is that IDNA2003 or IDNA2008?
-
Zash
Probably 2008
-
jonas’
might be the difference
-
jonas’
python3 only has 2003
-
Zash
haha
-
jonas’
yeah
-
Zash
but doesn't python also have ... whatsitcalled, precis?
-
jonas’
the issue has been open for seven years now: https://bugs.python.org/issue17305
-
jonas’
Zash, not built-in
-
jonas’
there’s a third party module
-
Zash
better than nothing
-
jonas’
true
-
jonas’
there also seems to be a third-party IDNA module which does things
-
jonas’
and then again: I’m not sure XMPP software should make assumptions about what DNS allows.
-
lovetox
: is not allowed in IDNA2008
-
jonas’
let DNS deal with the weird things we put in the domainpart.
-
jonas’
it’ll tell us to f* off. encoding (too many) assumptions about how DNS works and what it allows seems like it can only lead to a world of pain
-
jonas’
normalisation makes sense for comparison and stuff, but beyond that…
-
Zash
`idn -a <<< "http://foo.bar:123/"` spits out its input
-
Zash
and that would be idna2003
-
lovetox
so i still don't see any problem with running a host through idna 2008?
-
jonas’
probably ok
-
jonas’
not sure what that gains you tho
-
lovetox
and if it raises an exception, tell the user it's not a valid domain name
-
jonas’
I hate that type of stuff
-
jonas’
that’s the type of stuff which breaks new things
-
jonas’
you’ll notice that it’s not a valid domain name when you ask the DNS about the name
-
jonas’
(though, you need to do IDNA2008 or IDNA2003 before you ask the DNS)
-
jonas’
(but you can’t know which one is right \o/)
-
Zash
The least painful answer for us: IDNA2008 in IDNA2003 compat mode
-
Zash
IDNA2003 library support status: deprecated.
-
lovetox
so what does the xmpp standard say about domainpart
-
Zash
libidn v1 deprecated, libidn2 doesn't do nameprep & co, can't manage xmpp parts
-
lovetox
no validation at all?
-
lovetox
it does have precis modules for the node and resource parts
-
Zash
lovetox: "the", there are like 3 of them
-
Zash
3 versions
-
lovetox
the latest
-
Zash
idna2003, idna2008, precis
-
lovetox
so wtf, it definitely says a domainpart has to conform to idna2008
-
lovetox
so i certainly run it through idna2008 and be finished with it
-
Zash
Latest is https://tools.ietf.org/html/rfc7622
-
lovetox
yes that's what i am referring to
-
jonas’
I love how we still haven’t figured out how to do unicode release interop
-
Zash
This be the IDNA2008 + PRECIS thing, right?
-
lovetox
yes
-
lovetox
and idna2008 has no nameprep as precondition
-
jonas’
I dropped that in #debian-til and someone got nerdsniped by that and dug out that the unicode releases also don’t really have guidelines on compatibility
-
lovetox
so actually i can throw out all nameprep stuff
-
Zash
jonas’: No way we're doing that while in the middle of a pandemic, can't afford to waste painkillers and anti-fever meds on this horror
-
jonas’
:D
-
jonas’
luckily, the next unicode release will be postponed due to the pandemic, too
-
Zash
Praise Glob
-
flow
lovetox> so i certainly run it through idna2008 and be finished with it
then you potentially disallow ipv4/ipv6 addresses in the domainpart, which are allowed
-
lovetox
after i check if its an ip :D
-
flow
lovetox, bonus points for allowing ipv6 scope IDs
-
lovetox
i do
-
flow
then here are your bonus points
-
lovetox
but only if you include it within []
- flow hands lovetox 13.37 bonus points
-
flow
lovetox, whatever the IP-literal rule of RFC 6874 allows
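The "check if it's an IP first, then run IDNA" order discussed above could look roughly like the sketch below. Everything in it is an assumption for illustration: the function name `classify_domainpart` is hypothetical, and the standard-library "idna" codec (IDNA2003) only stands in for an IDNA2008 implementation, which would need a third-party module. Whether `ipaddress` accepts IPv6 scope IDs depends on the Python version.

```python
import codecs
import ipaddress


def classify_domainpart(domainpart: str) -> str:
    """Classify a domainpart as "ipv4", "ipv6", "dns" or "invalid".

    Hypothetical helper. IPv6 is only accepted inside square brackets,
    as mentioned in the discussion.
    """
    # Bracketed form: must be an IPv6 literal.
    if domainpart.startswith("[") and domainpart.endswith("]"):
        try:
            ipaddress.IPv6Address(domainpart[1:-1])
        except ValueError:
            return "invalid"
        return "ipv6"

    # Plain dotted-quad IPv4 address.
    try:
        ipaddress.IPv4Address(domainpart)
        return "ipv4"
    except ValueError:
        pass

    # Everything else goes through the IDNA codec (IDNA2003 stand-in).
    try:
        codecs.encode(domainpart, "idna")
    except UnicodeError:
        return "invalid"
    return "dns"
```

Checking for IP literals before the IDNA step avoids rejecting addresses that are allowed in the domainpart but are not DNS names.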