XSF Discussion - 2019-03-24

  147. Guus MattJ / flow : https://github.com/xsf/xeps/pull/771
  149. Guus Would something go horribly, horribly wrong, if a server simply adds a stable/unique stanza ID to any message that it processes?
  150. MattJ It makes client life hard/impossible, sadly
  151. MattJ Clients would no longer know when a message is archived or not
  152. MattJ So they don't know if the id can/should be stored for later querying (e.g. for catch-up)
  153. Guus I do not like the fact that we're deducing that something is archived by merely detecting the presence of something that's supposed to be an opaque identifier.
  154. Guus (also, I don't have a better suggestion)
  155. MattJ Guus: that's why <archived> existed
  156. Guus What's the impossibility for clients, exactly?
  158. Holger Hmm I don't quite see this problem. There's no guarantee archived messages will remain in the archive forever anyway.
  159. Holger And I don't quite see how the info whether a locally stored message is also in the archive helps the client.
  160. Holger In my book it's fine to add a stanza ID to all messages. It may actually help with non-MAM use cases.
  163. Guus It'd make my implementation a lot easier...
  164. Andrew Nenakhov > Would something go horribly, horribly wrong, if a server simply adds a stable/unique stanza ID to any message that it processes? In short that's the basis of our XEP that we use to ensure message delivery. Works well.
  166. Guus Andrew Nenakhov the basis is that something goes wrong if we do (and you found an alternative), or: you do it, and you've seen that nothing goes wrong?
  167. Andrew Nenakhov Client sends stanza with provisional id, server stamps it with 0359 unique and stable id, sends this id to client as a confirmation.
  168. Andrew Nenakhov Guus, define wrong )
  169. Guus Mushroom-clouds on the horizon.
  170. Andrew Nenakhov We centralize everything to work via server archive. If archive breaks, kaput, yes
  179. MattJ Holger, you're right (there is no guarantee archived messages will remain in the archive forever)
  180. MattJ But if it's not in the archive, the client can assume it was purged and it needs to re-fetch the archive
  188. Guus MattJ: how can it fetch a purged archive?
  190. MattJ Guus, I mean in the sense that old messages are purged
  191. Guus Mind you that it is Sunday, I'm an idiot, and did not have enough coffee
  192. Guus I don't understand.
  193. MattJ Messages in the archive are not kept forever on most deployments
  194. MattJ The oldest messages are removed after some expiry time (let's say 30 days)
  196. Guus with you so far.
  197. Guus but why does the client need to re-fetch the archive if that happens?
  198. Guus need/would want to
  201. MattJ So a client that wants to receive all messages works by continuously remembering the id of the last archived message it received
  202. MattJ When it goes offline for a couple of days, it will come back online and request all messages since the last id it saw
  203. Guus if anything, it'd do a massive amount of data transfer only to end up with _less_ local history?
  204. MattJ With me still?
  205. Guus yes
  206. MattJ So now it goes offline for two months
  207. MattJ The last id it saw is no longer in the server's archive
  208. MattJ So it performs the query and gets item-not-found
  209. MattJ so it knows that the message has been expired, and any messages in the archive are messages it has never seen before
  210. MattJ because they are all newer
  211. Guus right. And if we'd slap a stanza-id on every message without archiving, it'd _always_ get an item-not-found, assume its local cache is older than what's on the server, and it'd download all history, every time.
  212. Guus that's what you're saying, right?
  213. MattJ Yes
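The catch-up logic MattJ walks through above can be sketched roughly as follows. This is a minimal illustration, not the XEP's wording or any client's real code: `Archive`, `ItemNotFound`, and `catch_up` are hypothetical names, and the in-memory list stands in for a server-side archive.

```python
from dataclasses import dataclass, field


class ItemNotFound(Exception):
    """Raised when the queried id is not present in the archive."""


@dataclass
class Archive:
    # Hypothetical in-memory stand-in for a server-side archive:
    # an ordered list of (archive_id, body) pairs, oldest first.
    items: list = field(default_factory=list)

    def query_after(self, last_id=None):
        """Return every item newer than last_id (all items if last_id is None)."""
        if last_id is None:
            return list(self.items)
        ids = [item_id for item_id, _ in self.items]
        if last_id not in ids:
            raise ItemNotFound(last_id)
        return self.items[ids.index(last_id) + 1:]


def catch_up(archive, last_seen_id):
    """Fetch what the client missed; returns (new_items, new_last_seen_id)."""
    try:
        new_items = archive.query_after(last_seen_id)
    except ItemNotFound:
        # The remembered id is assumed to have expired, so everything still
        # in the archive is treated as newer than anything the client has seen.
        new_items = archive.query_after(None)
    if new_items:
        last_seen_id = new_items[-1][0]
    return new_items, last_seen_id
```

If the client had remembered the id of a message that was never archived, `catch_up` would take the `ItemNotFound` branch on every reconnect and pull the whole archive each time, which is the failure mode Guus describes.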
  220. waqas I feel like the whole "item-not-found means get full archive" thing is a hack. A server could lose a message for other reasons, e.g., storage failure causing recent stuff to be lost, or deletion of specific message due to gdpr, or some bug, etc.
  221. MattJ waqas, it's not allowed to
  222. pep. > and it'd download all history, every time. It would download up to the date it just requested
  223. Guus waqas I was trying to formulate a similar remark in my head.
  224. MattJ waqas, it can replace with placeholders if it needs to
  225. pep. Which may or may not be the whole history
  226. waqas MattJ: Storage failure isn't something you can forbid someone from having.
  227. MattJ waqas, handling storage failure in defined ways is entirely sensible
  229. Guus sure, but solely depending on 'item-not-found' based on a last-known ID still seems ... hackish...
  230. MattJ Guus, it's defined by the XEP to be this way, it's absolutely not a hack
  231. MattJ I mean, what else would you guys propose??
  232. waqas MattJ: Not really. If you lose a disk and restore from recent'ish backup, you'll have a situation where supposedly every recent message would be item-not-found..
  233. Guus MattJ it's a lot easier to disagree with stuff without having to suggest better alternatives 😉
  235. MattJ waqas, you can't just rewind time like that in most systems without consequences
  236. waqas Yes, and given that laws of physics disallow "messages can't be removed from archive after acked", a protocol shouldn't rely on that.
  238. Guus what if the client asks for the last-known ID archived by the server?
  239. waqas MattJ: To be clear, I think a sane recommendation would be if item-not-found, get archive by some timestamp based setup, but trying to get archive from beginning of time is silly in such a case.
  240. pep. (What I said above?)
  241. Guus (removed bad idea)
  242. waqas Yep, listen to pep.
  243. MattJ Yes, but the server was relocated to a different timezone and the admin forgot to set it to UTC
  244. pep. Dates don't include TZs? :s
  245. waqas Almost all popular dbs people use (mysql, postgres) in their default replica settings, when the master node is lost and another takes over (or a restoration from backup happens) will potentially lose recent writes. If the MAM XEP wants to assume that wouldn't happen, I'd consider it pretty silly.
  250. MattJ waqas, if you want to write your own XEP go ahead
  252. waqas MattJ: Do you see the problem I'm pointing out?
  253. Guus Maybe 'silly' isn't the best classifier here.
  254. Guus > Yes, but the server was relocated to a different timezone and the admin forgot to set it to UTC do we need the XEP to account for this?
  262. MattJ There's a simple fix for this, the XEP already has a flag to tell the client that the results are not necessarily persisted
  264. waqas MattJ: I'm asserting that the vast majority of MAM deployments can't guarantee durability in a disk-lost scenario. Recent writes being lost is a fact of life, you can't spec your way around it without mandating things you have no way to mandate.
  265. MattJ I look forward to your PR
  266. waqas Note that I don't think the MAM XEP has to change, just the assumption that item-not-found always means MAM storage was deleted up to that item is wrong.
  268. MattJ So yet another hidden thing for client devs to think about
  276. Guus MattJ where in the XEP is the what I called 'hack' described?
  277. Guus I was looking to see if the exact wording would make me think of hints for improvement
  278. MattJ Guus, it quite possibly isn't
  279. Guus ah ok.
  283. Guus I'd love to be able to add stanza-id's everywhere, without implying that this means that MAM is available.
  285. Guus but doesn't service discovery sufficiently guard against that?
  286. MattJ Adding stanza-id doesn't imply MAM is available
  287. MattJ But if MAM is available, it implies you can't put stanza-id on every stanza
  297. Guus I'd like to be able to. Is a feasible solution one that allows the client to request the id of the most-recent MAM entry, in order to verify if it has that one in its local archive?
  298. Guus If the XEP doesn't currently define the 'store the id of the last message, assuming that it is the last ID in your server-sided archive', there might be room for a change like that?
  302. MattJ Forget the message purging issue for the moment
  303. MattJ If the client records the id of the last message it received, and then later uses this to query an archive, what would you propose it do if the id it happened to remember wasn't an archived one?
  304. Guus item-not-found
  305. MattJ and then what?
  306. Guus Naively (I'm not a client builder): I'd see up until what date I have a local archive, and retrieve from there.
  307. MattJ So fetch by timestamp?
  308. Guus with some wiggle-room, but yes.
  309. MattJ That way you'll either get duplicates or miss messages
  310. MattJ And that's not hackish?
  312. Guus Duplicates I can de-dupe with the message ID
  313. MattJ We could have just built the whole XEP on timestamps instead of ids if we're happy with that
  314. Guus misses would be bad.
  315. MattJ It's an ugly hack
  316. Guus well, let's not rewrite everything just yet - I'm fairly certain you've put way more thoughts into this than I have 🙂
  317. MattJ This is not something I would accept a rewrite for, for certain
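The timestamp fallback Guus suggests above (fetch from the newest local timestamp with some wiggle-room, then de-dupe by message ID) might look something like this. It is only a sketch under stated assumptions: `resync_by_timestamp`, `fetch_since`, and the five-minute `slack` are made up for illustration and are not taken from any XEP or client.

```python
from datetime import datetime, timedelta, timezone


def resync_by_timestamp(local, fetch_since, slack=timedelta(minutes=5)):
    """Merge server messages fetched from (newest local stamp - slack).

    local: dict mapping message id -> (timestamp, body)
    fetch_since: hypothetical callable(start) returning
                 (id, timestamp, body) tuples at or after `start`
    """
    if local:
        start = max(ts for ts, _ in local.values()) - slack
    else:
        # Nothing stored locally: ask for everything.
        start = datetime.min.replace(tzinfo=timezone.utc)
    added = []
    for msg_id, ts, body in fetch_since(start):
        if msg_id in local:
            continue  # duplicate caused by the overlapping window
        local[msg_id] = (ts, body)
        added.append(msg_id)
    return added
```

The overlap only handles duplicates. A message whose server timestamp falls before the window (for example because of clock skew larger than `slack`) would still be missed, which is the objection MattJ raises.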
  318. MattJ The correct fix is to re-introduce a way for the client to know whether the message is in the server's archive or not
  319. Holger > But if MAM is available, it implies you can't put stanza-id on every stanza Depends on server implementation, no? The server just must be able to respond to the before/after requests.
  320. Guus so, why can't it ask for the last-recorded message id in the archive?
  321. MattJ Guus, how does that help?
  322. Guus what's my last message? do I have this? no: resync everything.
  323. MattJ Guus, that's broken
  324. MattJ Just because you don't have the last message in the archive doesn't mean you don't have the first
  325. Holger E.g. ejabberd uses timestamps as IDs, so it doesn't matter whether the queried ID is archived, before/after still does the right thing.
  326. Guus resync everything from the last one that you have, I mean.
  327. MattJ Guus, you don't know what the last one you have is
  328. MattJ Holger, multiple stanzas with the same timestamp?
  329. Holger Microsecond accuracy, if you hit that in practice then yes it breaks.
  330. MattJ Holger, what about clock drift then?
  331. Guus MattJ how don't you know what your last message is? You can order your local archive chronologically, use the last one?
  332. MattJ I'm not against using timestamps *in* the id, but it's wrong to use them as the id with no extra logic
  333. MattJ Guus, the last what? I don't know which ones the server archived
  334. Holger MattJ: Clock drift across cluster nodes? That would break as well yes.
  335. Guus Hmm, my parents just walked in. Wife is preparing for 'the stare' again.
  338. Guus MattJ, but if archiving is enabled, you can assume that the messages that you have in ... aah, I don't have the time to further discuss this now, sorry.
  339. Guus ('stare')
  340. Guus I'd love to pick this up later.
  341. MattJ Holger, using the system's monotonic clock? or something custom?
  342. Guus got to go now
  343. Holger Erlang has a thing that doesn't jump back, not sure how it's implemented.
  344. Holger Anyway yes this is not the most robust solution against such pathological cases of course (it just has other nice properties). Whatever, I just wanted to say that MAM doesn't imply only archived messages have an ID per se.
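One way to read MattJ's remark about using timestamps *in* the id with extra logic is a generator that never reuses a timestamp and never goes backwards. The sketch below is an assumption-laden illustration (the class name, id layout, and injectable `clock` are invented here); it is not how ejabberd or any other server builds its ids, and it only covers a single node, not clock drift across a cluster.

```python
import time


class StanzaIdGenerator:
    """Timestamp-based ids that stay unique and ordered (illustrative only)."""

    def __init__(self, clock=time.time_ns):
        self._clock = clock      # injectable so the sketch can be tested
        self._last_us = 0
        self._seq = 0

    def next_id(self):
        now_us = self._clock() // 1_000   # nanoseconds -> microseconds
        if now_us <= self._last_us:
            # Same microsecond, or the clock stepped backwards:
            # keep the previous timestamp and bump the counter instead.
            self._seq += 1
        else:
            self._last_us = now_us
            self._seq = 0
        # Fixed-width fields so string order matches generation order.
        return f"{self._last_us:020d}-{self._seq:06d}"
```

Because the ids sort in generation order, a server built this way could in principle answer before/after queries for an id it never archived, which is the property Holger points to.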
  348. MattJ Holger, as discussed, things will break (read: get hard/impossible) for clients if you add stanza-id to non-archived stanzas
  349. MattJ which is not a good situation, and should be fixed
  350. Holger Maybe I misunderstood the breakage vector. I would've thought things will be fine as long as the server is aware how the non-archived IDs are ordered compared to the archived messages.
  355. Holger MattJ, just in case you're interested, this sounds like a custom clock that (attempts to) adjust towards the OS clock by changing frequency (up to 1%) while avoiding jumps: http://erlang.org/doc/apps/erts/time_correction.html#No_Time_Warp_Mode
  356. MattJ Fun
  357. Holger (At the cost of risking incorrect offsets of course, so they warn against doing this.)
  358. MattJ Holger, the server knowing how to interpret the ids is not really relevant... unless you're saying it should not return item-not-found but quietly accept ids that don't actually exist in the archive
  360. MattJ That would cause weirdness with clients that try to fill holes
  361. MattJ and probably other stuff
  362. Holger > unless you're saying it should not return item-not-found but quietly accept ids that don't actually exist in the archive Ah yes that's what I'm saying. IIRC 0059 suggests doing just that (wasn't it even a SHOULD?).
  363. Holger But I'm on my phone and the sun is shining. Gonna shut up now 🙂
  364. MattJ Is it too late to start over with MAM?
  365. MattJ Not using RSM for a start
  366. MattJ Trying to use existing building blocks has just caused confusion and unintended consequences
  368. pep. Well MAM is still experimental :-°
  369. pep. What about another bump?
  370. MattJ Everyone would love that
  372. pep. That's a thing I don't like in general. The XEP is still experimental but in reality it's just as if it was almost Final. If you change anything everybody is going to grump
  374. MattJ It certainly still has open issues, as a spec
  375. pep. Sure. I'm not just talking about MAM, that's how I feel about our specs in general
  376. MattJ Can't have it both ways
  377. MattJ Just this morning it was mentioned that XEP-0313 being Experimental is a reason Pidgin doesn't have support
  378. pep. I'd say that's an issue with developer expectations. If you implement it as experimental, know that it's likely going to change
  379. pep. (And even more, really, draft, even a final spec can be amended with another spec, so..)
  384. waqas Are devs expected to implement experimental xeps?
  385. MattJ If a standard explicitly has a big red warning at the top, and warning or no warning is subject to radical change... if I had a limited amount of free time, would I want to implement it?
  386. waqas "While implementation of an Experimental protocol is encouraged in order to determine the feasibility of the proposed solution, it is not recommended for such implementations to be included in the primary release for a software product (as opposed to an experimental branch)." — https://xmpp.org/extensions/xep-0001.html#states-Experimental
  387. pep. waqas, in the meantime, it's a needed feature
  388. pep. And it's even in the compliance suite..
  389. MattJ That's the real problem (that experimental or not, it's a needed feature)
  390. pep. I'd say both these criteria (needed feature / compliance suite) put even more pressure on the XEP to go to draft/final. I'm not saying I like it
  392. pep. And as you say there are still areas that need to be improved
  394. pep. Maybe there should be a rule that compliance suites can't recommend draft specs. In the hope that people focus/provide feedback on XEPs that are needed
  395. Zash I thought there was
  396. pep. Well if there was, MAM shouldn't be in there
  397. pep. nor carbons? (last call ended but it's still proposed)
  520. flow MattJ, I am not sure if using existing building blocks caused confusion. It appears to me that not clarifying how they are intended and allowed to be used (think for example of whether <before/> and <after/> can be used in the same query) is causing confusion
  527. MattJ flow: they can't, the end :)
  528. flow That is what I would also say, but it is at least underspecified in XEP-RSM
