pulkomandyhello, I am implementing XHTML-IM in my client and am currently adding hyperlink handling. Is there a recommendation for how to handle phishing attempts like <a href="http://evilwebsite.org">http://totally-legit-looking.org</a>? For example, Thunderbird (on the email side) shows a dialog offering to use either the href URL or the one in the text when this happens. Do XMPP clients have similar checks?
Millesimushas left
jonas’mind that XHTML-IM is officially deprecated because of how easy it is to shoot yourself in the foot with stuff like this
Link MauveIn poezio we always display both.
nephelehas joined
nephelehas left
Link MauveWeb browsers usually display the target URI in the bottom left (or right depending on where the pointer is), I’d assume this design has been put to the test.
SamThat seems like a bad assumption :) (but I also assume people are used to it at this point, at least even if they never actually look at it and click through anyways)
pulkomandyweb browsers used to do this, yes. These days they do everything so that the user never sees a URL :
Link MauveUh really?
Link MauveFirefox still does so.
pep.Yeah I confirm
Link MauveSam, do you know how else to handle that?
pep.jonas’, and that's still the only solution to do rich text formatting without polluting body, that's actually implemented :)
SamLink Mauve: Lower left seems like a good idea to me. In addition if the link text and actual link both appear to be URLs it couldn't hurt to show a big warning as someone suggested.
Link MauveIndeed.
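The warning Sam describes (link text and href both look like URLs but disagree) could be sketched roughly as below. This is only an illustration: the function names `looks_like_url` and `is_suspicious_link` are hypothetical, and the "looks like a URL" test is a crude heuristic that assumes a dot and no spaces are enough to treat the visible text as an address.

```python
from urllib.parse import urlsplit


def _host(url: str):
    """Return the lower-cased host of a URL, or None if it cannot be parsed."""
    try:
        return urlsplit(url).hostname
    except ValueError:
        return None


def _as_url(text: str) -> str:
    """Prefix a scheme so that bare host names such as 'example.org' parse."""
    text = text.strip()
    return text if "://" in text else "http://" + text


def looks_like_url(text: str) -> bool:
    """Heuristic (assumption): no spaces, at least one dot, parsable host."""
    text = text.strip()
    if not text or " " in text or "." not in text:
        return False
    return _host(_as_url(text)) is not None


def is_suspicious_link(href: str, text: str) -> bool:
    """True when the visible text reads like a URL for a different host than href."""
    if not looks_like_url(text):
        return False
    return _host(_as_url(text)) != _host(href)
```

A client could call `is_suspicious_link(href, text)` while rendering each `<a/>` and show both addresses, or a warning, when it returns True. Look-alike domains and redirects are not covered by this check.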
pulkomandyyes, I don't really care if it's deprecated, it's used by various things I need
SamWell, it could hurt because in commercial systems everything is always behind a tracking link, but making that more painful won't make me lose much sleep.
goffihas left
Sam(and no one is using XMPP commercially in that sense anyways that I know of; eg. there's no newsletters or anything over XMPP)
pulkomandyyes, funnily I mainly know of Thunderbird doing this because outlook changes the content of emails to redirect everything through some "safe links" system
machas left
SamWhat things do you need, maybe we can suggest alternatives that don't have such a bad user experience?
pulkomandyand then thunderbird complains that the link doesn't match the text anymore
Millesimushas joined
jalalhas joined
pulkomandywell the 3 things I saw using xhtml-im so far are: biboumi to forward IRC formatting, some matrix bridge using blockquotes for cited messages, and a notification bot using a href to put links to a forum whenever a message is posted there
Link MauveAnd poezio!
SamFor the notification bot I'd start with auto-linking URLs in the plain text body first. That will give you a nice experience on both ends of the connection if users are chatting and I suspect the bot also has a plain text body that will work fine with this
pulkomandyas far as I know, none of the replacements for xhtml-im allow using colors in the text. So they are all worse than IRC...
SamThe other two are harder obviously as they'd need change to the bridges, so maybe we can't solve that problem unfortunately
TheCoffeMakerhas joined
pulkomandyis there a spec for autolinking urls? Or do I need to figure out my own way to detect URLs?
SamNo, they're better than IRC because people don't insist on sending you yellow text that looks great against their dark background but can't be read on your light background :)
SamI'm sure there's a URL detection library out there, but no, there's no documented algorithm for doing so in XEPs at least
SamBut it's a common enough thing that's easy enough to do
nephelehas joined
Link Mauve[citation required]
pulkomandystill can't be as easy as parsing <a href=""></a>
SamMaybe, maybe not. It's pretty easy either way.
SamAnyways, just saying that might go ahead and solve that problem for you and be a useful thing to the users of your client.
SamI think most people just use a regexp copied from the internet. This will never be 100% correct with no false positives or negatives, but it generally does well enough 99% of the time.
Link MauveIn my experience, it’s very annoying when it doesn’t.
Link MauveCounting parentheses is one such infuriating thing a regexp can’t do.
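For what it is worth, a common workaround is to let the regexp over-match and then trim the match in ordinary code, counting parentheses there. The sketch below is only one possible approach, not a documented algorithm from any XEP: the pattern, the `TRAILING` set and the name `find_links` are all assumptions, and only explicit `http://` / `https://` URLs are matched, so bare host names are deliberately ignored.

```python
import re

# Deliberately greedy: grab everything up to whitespace, then trim afterwards.
URL_RE = re.compile(r"https?://[^\s<>\"]+")
# Punctuation that usually belongs to the sentence, not to the URL.
TRAILING = ".,;:!?'\""


def find_links(body: str) -> list[tuple[int, int]]:
    """Return (start, end) offsets of probable URLs in a plain-text body."""
    spans = []
    for match in URL_RE.finditer(body):
        start, end = match.span()
        while end > start:
            last = body[end - 1]
            if last in TRAILING:
                end -= 1
            elif last == ")" and body.count(")", start, end) > body.count("(", start, end):
                # Unbalanced closing parenthesis: it closes something outside the URL.
                end -= 1
            else:
                break
        spans.append((start, end))
    return spans
```

With the body `see (https://en.wikipedia.org/wiki/Foo_(bar)).` this keeps the inner `(bar)` and drops the outer `)` and the final dot. It will still guess wrong on some inputs, which is the point being argued here.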
pep.While we could just tell the receiving client it's meant to be a url so that it gets it 100% of the time. But no
pep.Better to get it 99% of the time
qyPerl grammars could though...
Link Mauve:)
Link MauveReminds me of that time I tried to implement <a/> using poezio’s paste.
SamSure, it's a bit annoying. If you have a nice UI for creating links that you can use definitely add an OOB or something too, but either way for people who just type in mysite.example.com you probably want to autolink that, so you'd likely want to do it either way even if you support XHTML-IM or whatever
Link MauveBut then I hate the timer paste it does, so I fell into the rabbit hole that ncurses doesn’t support the proper bracketed paste…
Link MauveSam, wut, no, you definitely don’t.
Link MauveSome websites try to do so, with hilariously bad results.
nephelehas left
nephelehas joined
SamIf I quickly type, "hey, this video was funny <pastes link>" you don't try to autolink that? Seems like a bad experience. I dunno, Conversations does it and it works pretty well. Not saying it's 100%, sure it's annoying sometimes, but mostly it's a much nicer experience when I can just click on it.
Link MauveFor instance, in French we have many words ending in -s if masculine and -es if feminine, with a dot used to mean either one indeterminately; these systems always think these are links to Spanish websites. ^^'
SamAlthough this is probably more important on Android where you don't have a cursor and can't just copy/paste the text into the address bar
nephelehas left
Link MauveSam, actually in poezio we don’t control what the terminal will autolink (although I’ve seen a proposal for proper HTML-style links recently, but it is not implemented in tmux…).
SamSure, not every possible system can do it.
SamI'm just saying, if you've got a bot sending you links that might be a good first step.
jalalhas left
pep.As a client I'd prefer to tell my terminal what is a link though, because I've got more context than the terminal
Link MauveSam, if you quickly type "hey, this video was funny <pastes link>" and your client creates a proper <a/> link on paste, there is no issue and no need for other clients to guess what is or isn’t a URI.
Link Mauvepep., yup.
SamThat's the same thing, your client just had to guess.
SamInstead of the other side.
Link Mauvepep., https://gist.github.com/egmontkob/eb114294efbcd5adb1944c9f3cb5feda
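The proposal linked above describes the OSC 8 escape sequence, which lets an application tell the terminal exactly which text is a link and where it points. A minimal sketch of emitting it follows; the function name `osc8_link` is hypothetical, and whether the terminal (or a multiplexer such as tmux) honours the sequence is outside the client's control.

```python
def osc8_link(uri: str, text: str) -> str:
    """Wrap text in an OSC 8 hyperlink: ESC ] 8 ; params ; URI ST text ESC ] 8 ; ; ST."""
    # Refuse control characters, which could end the escape sequence early.
    if any(ord(c) < 0x20 or ord(c) == 0x7F for c in uri):
        raise ValueError("control character in URI")
    # ST (string terminator) is written as ESC backslash; params are left empty.
    return f"\x1b]8;;{uri}\x1b\\{text}\x1b]8;;\x1b\\"
```

Printing `osc8_link("https://example.org", "example")` in a supporting terminal shows the word "example" as a clickable link; other terminals are expected to ignore the sequence and show just the text.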
pep.Sam, it wouldn't, if the sender told it.
nephelehas joined
SamSure, but the sender didn't tell it, they just typed in some text.
Link MauveSam, as a sender you can fix it until it is correct.
Link MauveWhile as a recipient, if the markup is lost then you are condemned to guess.
SamI'm not saying not to do that; I agree, linkify it on both ends, it will make for a way better experience.
Link MauveSam, realistically, people very rarely type in URIs in text.
pulkomandyI don't know about your OS, but in mine, the clipboard data has a mimetype so if I copypaste a link into my XMPP client, I can know it's a link, and probably get both the URL and the page title from the clipboard
Link MauveCopy/paste is a much more common feature for that.
SamBut you probably also have to do it on the receiving end for when they have an old client that doesn't understand your XHTML-IM or OOB/references format or whatever you use anyways if you want a fallback.
pulkomandyyes, I will handle legacy clients and OS as best as I can, but that's not a reason to stay locked in the 1990s
pulkomandyotherwise I would be writing an IRC client not an XMPP one
nephelehas left
nephelehas joined
SamI didn't say you should, I said it might be a quick way to get that for a simple bot and might be good enough for now since we don't have a good link format.
Link MauveWe actually do, it’s just that you deprecated it.
Link MauveBut it’s still perfectly usable.
jalalhas joined
pulkomandyyes I'll implement what we have now. I'm happy to replace it with something better if someone comes up with something better, which I don't think the new specs for rich text are
nephelehas left
nephelehas joined
nephelehas left
nephelehas joined
pasdesushihas joined
nepheleI made a specification for formatted messages in matrix, if there is interest I will work on making a similar one for xmpp, if the concept is considered fine :)
moparisthebest> Matrix formatting is currently based on a subset of HTML.
Sounds like most clients are probably vulnerable to what most xhtml-im clients are vulnerable to
nepheleLink mauve: no, that is different
Zashnope, because json protects it!
antranigvhas joined
nephelemoparisthebest: yes... which is why I made this alternative format :)
Zashso something between xhtml-im and https://xmpp.org/extensions/xep-0394.html
nephelehas left
nephelehas joined
SyrupThinkerhas joined
nepheleEh, not that similar either
Zashactually, closer to xhtml-im
Zashmodelled in json
nephelehas left
nephelehas joined
nepheleIt's not html :) that was the main point anyhow
pep.xhtml-im isn't "html" either
pep.It's a strict subset of xhtml
nepheleYes, but you cannot use an HTML parser for this one
ZashIt can be translated to HTML, therefore vulnerable.
MattJWell, anything can be vulnerable
pulkomandyunicode seems more dangerous than html :>
SamI would assume this would be less likely to be vulnerable?
pasdesushihas left
ZashThe vulnerability is the web itself, not the format!
moparisthebestIn practice all clients just drop it into a browser that supports JavaScript etc
MattJI think a custom format that can't be passed to a renderer (i.e. not HTML, XHTML or Markdown) is less likely to cause implementation vulnerabilities
moparisthebestI get that it "can be implemented securely"
pulkomandythat's clearly not the case for the code I'm writing
pep.moparisthebest, web* clients
pulkomandyso "all clients" can't be true :)
moparisthebestNo, all
pep.Also not poezio
Sambut yah, adding the massive web footprint and platform is the real problem
moparisthebestIf you have a spec that all clients implement in an insecure way, it's a bad spec even if it can be secure in theory
pulkomandyyou don't need a web engine for this. I used libcss to parse the css and give me easy to use styling attributes. No HTML parser or DOM or anything crazy like that involved
Zashgood luck finding a decent rendering engine that doesn't come with a javascript engine bolted on
pulkomandyyou don't need a full rendering engine for this, that's why it's a subset of xhtml and not the full thing
MattJIdeal would be a new "safe" format, with reference implementations in multiple languages for translation to HTML and other common markup formats
Zashthere aren't a lot of rendering engine implementations afaik
Link MauveYup, poezio’s rendering engine is decent and is written in about 500 lines of Python.
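To illustrate the "no web engine needed" argument, here is a rough sketch of rendering a small XHTML-IM subset straight to terminal text with only an XML parser. It is not poezio's actual code: the `render` function, the two supported styles and the choice to print the href after the link text are assumptions made for the example, and nested styles are not restored after a reset.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"
# SGR escape codes: 1 = bold, 3 = italic, 0 = reset.
STYLES = {"strong": "\x1b[1m", "em": "\x1b[3m"}
RESET = "\x1b[0m"


def render(elem) -> str:
    """Turn an XHTML-IM element tree into plain text with ANSI styling."""
    tag = elem.tag[len(XHTML):] if elem.tag.startswith(XHTML) else None
    out = elem.text or ""
    for child in elem:
        out += render(child)
        out += child.tail or ""
    if tag == "br":
        return "\n"
    if tag == "a" and elem.get("href"):
        # Always show both the visible text and the real target.
        return f"{out} <{elem.get('href')}>"
    if tag in STYLES:
        return STYLES[tag] + out + RESET
    # Anything unknown (including script) degrades to its text content.
    return out
```

Unknown elements and all attributes other than `href` are simply never looked at, so there is nothing for them to trigger.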
Zashterminal is easier but turning text into pixels is hard
Link MauveFor most toolkits this is a solved problem though.
nephelehas left
nephelehas joined
Zash... because they include HTML+CSS+JS based rendering engines
Link MauveAlthough with resolutions being bigger and bigger, the traditional way is starting to be a bit limited, so newer ways to turn text into pixels (using GPUs this time) are being explored.
Link MauveZash, I’m most familiar with GTK, which only includes CSS out of these three, and pango implements a subset of HTML for its markup.
Link MauveThere is no web engine nor JS available in there, without external libraries like webkit2gtk.
Link MauvePidgin went for the latter, and this has been a massive drag since then.
ZashWasn't half of Gnome written in JS these days?
pulkomandyhere is my 800 lines of code to implement xhtml-im with just libcss and no javascript or html or dom involved: https://github.com/pulkomandy/Renga/blob/master/ui/Xhtml.cpp most of it is callbacks to tell libcss "no we don't need that here"
pasdesushihas joined
Link MauveZash, gnome-shell is written in JS, but that’s the host language, not a language you are forced to embed just because.
nepheleAnyhow, if there is interest let me know and I'd work on a new format for XMPP
ZashAnd back when Swift was more actively developed it was said that there weren't any rendering engines available besides webkit
moparisthebestHow many vulns are in libcss?
pep.nephele, I honestly recommend fastening up your seatbelt really tight if you go that way in the XMPP world. Haters are gonna hate
ZashWe have how many formats already?
homebeachhas left
Matrix Traveler (bot)has left
Matrix Traveler (bot)has joined
homebeachhas joined
pep.2 in use
Zashwe have enough war without another format war
pep.Well they're not even the same thing, that's the worst. One is a wire format missing an input format, the other is an input format missing a wire format
pep.Together they could go very far but for some reason one doesn't like the other. I'll let you guess which
Link Mauvepep., probably just nobody did it so far.
pep.Link Mauve, well the latter mandates input format == wire format, so it's not really possible. That's the trick :p
Link MauveAlthough I’d rather go for something a bit more widespread, such as Markdown, for such an input format.
Link Mauvepep., not really no, does it?
nephelehas left
nephelehas joined
pep.Isn't that the whole point
pep.of 393
SyrupThinkerhas left
Link Mauvepep., it has some examples of it being used in {jabber:client}body, but that’s just examples, not standard text.
Link MauveYou can perfectly well use that as your input format, and transform it before sending it to the recipients.
nephelehas left
nephelehas joined
pep.Link Mauve, I know, see https://lab.louiz.org/poezio/poezio/-/issues/3455#note_7769
homebeachhas left
Matrix Traveler (bot)has left
Matrix Traveler (bot)has joined
homebeachhas joined
Link MauveRight.
J Marinarohas left
pulkomandyI'd rather go with https://xmpp.org/extensions/xep-0394.html than 393 if we really have to remove xhtml-im (but again, no support for colors there, yet?)
pep.Reading 393, I just discovered: « Clients that do not support this specification MUST still be able to receive messages sent by clients using this specification and display them in a human-readable form. »
pep.Is that really a thing? a MUST for non-supporting implementations?
Link Mauvepep., it’s mu. :D
Link MauveA specification can’t force non-implementers to do anything.
SamGood catch; that's just a requirement, that "MUST" should be "must".
nephelehas left
nephelehas joined
SamOh, no, nevermind
SamBut still, it's not a requirement on the clients to do anything, it's a requirement on the spec to do something
pep.Ok
moparisthebestMarkdown also requires a browser which in practice always comes with JavaScript
nephelehas left
SamIt doesn't require a browser, but in a browser all the markdown libraries I looked at appeared to be vulnerable by default to injecting scripts or something executable which is part of the reason I didn't just go with that when writing 0393.
pulkomandyyes I'm a lot more worried about me trying to write a parser for 0393 than about using libcss for 0071
moparisthebestI wouldn't be
nephelehas joined
Link Mauvemoparisthebest, Markdown is a superset of HTML, it doesn’t “require a browser” nor JavaScript.
moparisthebestLink Mauve: in practice it'll always be implemented that way
Link Mauvemoparisthebest, not really no.
SamI would be interested to see a spec that used XML for formatting similar to XHTML-IM but w/o the HTML part and w/o the "tries to link into the plain text body too" part of 0394. I dunno if it would be better or worse, and you end up with the "plaintext/formatted message bodies are entirely different" problem, but I'd like to see it and would be curious what could be done with it.
moparisthebestAgain, I don't care what's theoretically possible, only what happens 99.9% of the time
cdcodehas joined
Sammoparisthebest: I don't think that's true, none of the markdown parsers I've ever used required HTML (unless they were javascript ones). I mean, you're right about the problem, just wrong about that detail I think
pulkomandywell, xhtml-im but we do a rot13 on all the xhtml element names to make sure they are not accidentally sent to an html parser?
moparisthebestSee also: _xmppconnect and XMPP XML being a "strict subset of XML" where all projects just use an XML parser and are vulnerable
Link MauveSam, that would be exactly the same as XHTML-IM imo, clueless webdevs will just make it go through some XSLT or whatever and end up with the exact same vulnerabilities, while you have fragmented the ecosystem with one more wire format.
SamI'm not 100% sure that's true, but you might be right
pulkomandyclueless webdevs don't know about XSLT, they would implement something similar, but slower in javascript
Link MauveRight.
Link MauveSam, clueless webdevs have vulnerabilities in anything where plain text is used in the protocol, built-in the browser under the name innerHTML.
Link MauveOnce the JS converter to HTML has been passed, they’ll put it in the DOM with innerHTML and get the same vulnerability they’ve used for years.
SamYah, actually, you're probably right. The naive case would carry over the attributes and one of those will be javascript:onmouseover or whatever.
Link MauveExactly.
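The failure mode described here (every attribute carried over into the output) is avoided by converting through an allowlist rather than copying markup. The following is a hedged sketch of that idea, not a vetted sanitizer: the element list, the `SAFE_SCHEMES` set and the name `to_safe_html` are assumptions chosen for illustration.

```python
import html
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

XHTML = "{http://www.w3.org/1999/xhtml}"
ALLOWED_ELEMENTS = {"p", "br", "em", "strong", "blockquote", "a"}
SAFE_SCHEMES = {"http", "https", "xmpp", "mailto"}


def _safe_href(value):
    """Return the href if its scheme is allowlisted, else None."""
    try:
        scheme = urlsplit(value).scheme.lower()
    except ValueError:
        return None
    return value if scheme in SAFE_SCHEMES else None


def to_safe_html(elem) -> str:
    """Rebuild HTML from scratch, keeping only allowlisted elements and href."""
    name = elem.tag[len(XHTML):] if elem.tag.startswith(XHTML) else ""
    inner = html.escape(elem.text or "")
    for child in elem:
        inner += to_safe_html(child) + html.escape(child.tail or "")
    if name not in ALLOWED_ELEMENTS:
        return inner  # unknown element: keep its escaped text, drop the markup
    if name == "br":
        return "<br>"
    attrs = ""
    href = _safe_href(elem.get("href", "")) if name == "a" else None
    if href is not None:
        attrs = f' href="{html.escape(href)}"'
    return f"<{name}{attrs}>{inner}</{name}>"
```

Because the output is rebuilt from escaped text and known tag names, an `onmouseover` attribute or a `javascript:` link in the input has no path into the result, even if the caller later hands the string to something like innerHTML.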
moparisthebestYou could say the same about clueless C++ devs who think "I'm sure I can write secure c++ *this* time"
pep.Maybe someday we'll stop betting that clueless webdevs be clueless and limit our specs and we'll start helping/training them instead and write our specs with less worries
atomicwatchhas joined
Link MauveHa, I’m not gonna train a webdev.
pep.:D
Link MauveI’m bad at webdev myself.
Link MauveStuck about ten years ago.
moparisthebestYou need to write specs that can be implemented securely by anyone that can read them without knowing a ton of non obvious stuff
pep.The point is, if you think people are dumb you're not gonna go very far
Link Mauvepep., their very platform is offering them footguns.
pep.Then let's change the platform
Link Mauvemoparisthebest, good luck with that.
moparisthebestLink Mauve: *different footguns
Link MauveThat would be a platform where exactly no wire text is present in the final UI.
emushas left
Link MauveFor a chat system for instance, you wouldn’t go very far.
SamIt's not that we're just assuming web devs aren't intelligent, it's that literally every web client I ever tried that supported XHTML-IM (and I don't think "every" is me being hyperbolic) had trivial vulnerabilities. Sure, I reached out and helped fix a lot of them, but the point is that experience shows us that we handed them a gun pointed at their foot and then just told them "but be careful and don't pull the trigger"
moparisthebestHave you ever used openssl?
pulkomandysadly, yes :(
pep.Sam, you're mistaken on the footgun though
pep.There is one in that story for sure
moparisthebestAll computer stuff is a dumpster fire, pointing out that different trash is burning on the webdev side vs native code doesn't feel helpful
Link MauveSam, our specification might not have carried enough big blinking red warnings, but I’ve found similar vulnerabilities in multiple clients’ handling of MUC nicks, the thing in the resource. :D
pulkomandyalso we have specifically said "clueless webdevs" which is a subset of webdevelopers. There are skilled ones too, and there are clueless C++ developers too
Link MauveIt’s explicitly specified as an opaque string.
pep.pulkomandy, agreed
marchas left
SamSure, there are also other vulnerabilities and common problems; that doesn't mean we shouldn't fix the ones that can be fixed.
Link MauveRemoving the ability to send formatted text was never a fix, even less a good one.
marchas joined
SamNo one removed the ability, we obsoleted the spec which means "the XSF doesn't recommend this particular spec".
Link MauveBut we’ve had pages of emails on that topic, let’s not go over them again. :)
Link MauveSam, right.
pep.Yeah, pages of feedback on that topic which got ignored
nephelehas left
pulkomandywell it seems the result is client devs like me thinking "the XSF is stupid, they don't provide any alternative so I'm going to implement this anyway"
Link Mauvepep., not really, I mean people continue to implement it despite it being obsolete.
SamIt was all discussed multiple times. Just because your way didn't get picked doesn't mean you were ignored.
debaclehas left
Link Mauvepulkomandy, that’s approximately my stance on that too.
emushas joined
SamThe XSF isn't some magical body telling you what to do; the council just said "we don't recommend this one because experience has shown us it's difficult to do right". The XSF is *you*, other alternatives could be proposed (like 0393 and 0394). If one of them got implemented and the other didn't, it's the community that voted with their code, not the XSF. And you could always propose another that includes whatever formatting you think is missing
Link MauveSam, no need for that, 0071 works.
Link MauveAt best what I’d propose would be some bright blinking red warnings about our implementation experience.
pulkomandyyes, what do we do, resubmit 0071 with a new xep number and rename it "totally-not-xhtml-im" ?
SamWell, that's fine, but the council at the time disagreed.
alhas left
SamIn theory the council is experienced people who know a bit about XMPP. That's not to say that every decision will be perfect, and not to say that you can't ignore their warning and go implement it, just that it might be worth considering why they did it and that it wasn't because they ignored you.
qyI like 0394 more but 0393 seems more usable, probably best implement them both
pulkomandywell, there is this spec being used in the wild by at least 4 different xmpp things, there is no replacement (393 and 394 don't implement the two features I need: marking up links so I don't have to guess, and converting IRC styling so that IRC users can smoothly migrate to my client and not lose any features) and I'm not going to spend time writing more specs because I have enough work to do writing code supporting existing stuff. Do whatever you want with that information :)
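As an illustration of the IRC styling conversion mentioned above, a gateway or client could map the usual IRC control codes onto XHTML-IM style attributes roughly as follows. This is a sketch under assumptions, not how biboumi does it: `irc_to_xhtml` is a hypothetical name, only the first five colour codes are listed (exact shades differ between IRC clients), background colours are parsed but ignored, and other control characters are not filtered.

```python
import re
from xml.sax.saxutils import escape

# Partial palette (assumption): the conventional meaning of colour codes 0-4.
COLOURS = {0: "white", 1: "black", 2: "navy", 3: "green", 4: "red"}
# \x03 with optional "fg[,bg]" digits, or bold / italic / underline / reset.
TOKEN = re.compile(r"\x03(?:(\d{1,2})(?:,\d{1,2})?)?|[\x02\x1d\x1f\x0f]")


def irc_to_xhtml(line: str) -> str:
    """Convert IRC formatting codes into XHTML-IM style <span/> markup."""
    state = {"bold": False, "italic": False, "underline": False, "colour": None}
    out, pos = [], 0

    def emit(text: str) -> None:
        if not text:
            return
        css = []
        if state["bold"]:
            css.append("font-weight: bold")
        if state["italic"]:
            css.append("font-style: italic")
        if state["underline"]:
            css.append("text-decoration: underline")
        if state["colour"]:
            css.append("color: " + state["colour"])
        style = "; ".join(css)
        out.append(f"<span style='{style}'>{escape(text)}</span>" if css else escape(text))

    for match in TOKEN.finditer(line):
        emit(line[pos:match.start()])
        pos = match.end()
        code = match.group(0)[0]
        if code == "\x02":
            state["bold"] = not state["bold"]
        elif code == "\x1d":
            state["italic"] = not state["italic"]
        elif code == "\x1f":
            state["underline"] = not state["underline"]
        elif code == "\x0f":
            state.update(bold=False, italic=False, underline=False, colour=None)
        else:  # \x03: set the foreground colour, or clear it when no digits follow
            state["colour"] = COLOURS.get(int(match.group(1))) if match.group(1) else None
    emit(line[pos:])
    return "".join(out)
```

The result is meant to be placed inside the XHTML-IM `<body/>`; the plain-text body would carry the same line with the control codes stripped.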
Sam(FWIW, I think we need a linking spec in particular and would love to see that exist, I've thought about working on one a few times)
qyi feel like oob is fine for linking, just that it has been implemented in such a wacky way
jalalhas left
SamMaybe I should finish my LaTeX-IM spec. It was meant to be published on April 1st last year, but I never got around to finishing/submitting it.
Link Mauve:D
Link MauveReminds me of a Gajim plugin I once wrote, which would render Lilypond markup inline. <3
Link Mauve(0393-style)
pep.Do I need to download a texlive distribution for the LaTeX-IM spec? :P
Samoooh, I would legit use that, not even as a joke. I used to write a lot of music and I *love* lilypond (even if every release breaks my old stuff and it's really confusing markup for anything more advanced than a simple staff)
Link Mauve(Where a client which didn’t support this markup would still show you the { \treble \time 4/4 c8 d e f g2 }, while a client with support would render a lovely score.)
Link MauveThe main issue with that is that Scheme support means you basically own the remote computer.
SamI left the note on codeblocks undefined in 0393, but I keep hoping clients that implement it will do things like that, eg. gajim might let plugins hook into ```note and if it sees ```lilypond it could try to render it, etc.
SamBut yah, that opens a whole other can of worms.
Link MauvePreformatted text (<pre><code/></pre> in HTML) is by no means made to actually render or run the thing.
Link MauveAlthough you could add a Run button in your client, so that for instance a Python snippet can be executed inline.
Link MauveHopefully, only with proper sandboxing in place.
pulkomandya good way to check if there are also clueless python devs :')
Link MauveAre you willing to bet on most clients doing security properly? :)
xnamedhas joined
nephelehas joined
nephelehas left
nephelehas joined
qy> so that for instance a Python snippet can be executed inline.
qy😱️
marchas left
marchas joined
goffihas joined
marchas left
xnamedhas left
nephelehas left
marchas joined
alhas joined
nephelehas joined
moparisthebest> Hopefully, only with proper sandboxing in place.
You just described all of the web
9lakeshas left
nephelehas left
lovetoxhas left
me9has left
nephelehas joined
Link MauveThe web is actually a very good sandbox. :)
jalalhas joined
pulkomandybut that can't protect a website against itself :)
alhas left
nephelehas left
Link MauveActually there are quite a few mechanisms for that, iframe for one, combined with HTTP headers.
moparisthebestCSP?
Link MauveYeah.
Link MauveAnd a few other ones.
moparisthebestIt's just piles upon piles of hacks to try to make it secure