XSF Discussion - 2023-08-06


  1. lovetox

    jcbrand, wanted to ping you about https://github.com/xsf/xeps/pull/1271 i added some notes after implementing, would be interested in a discussion and to bring this XEP to the finish line.

  2. lovetox

    Also maybe the community can chime in on https://github.com/xsf/xeps/pull/1270, where the change would result in using origin-id and stanza-id in a XEP where both are not necessary at all.

  3. flow

    I fear the only way for me to chime in is, to state that I really believe that we should always state the id *and* the id-assigning entity when referencing other stanzas. but judging from the past experience, that is (sadly) probably controversial

  4. Zash

    Namespaced IDs? I like it.

  5. flow

    we should really try to figure out why we have such a hard time solving the "stanza reference" issue

  6. Zash

    Some sort of (jid, id) tuple would be good yes

  7. flow

    Zash, not namespaced, qualified (by an entity)

  8. Zash

    flow, those concepts are very close in my brain :)

  9. flow

    Zash, how about: <referenced-stanza xmlns='urn:xmpp:sid:0' id='xep359-stanza-id' by='muc.example.org'/>

  10. flow

    Zash, granted, they are similar, and I couldn't define and state the exact difference right now

  11. MSavoritias fae.ve

    I agree completely about the reference. for messages too that would be nice.

  12. Zash

    awkward given many of the existing uses are attributes, fitting a whole tag in there would be weird, but changing the attribute would be awkward too

  13. Zash

    Doesn't <stanza-id> have a @by already tho?

  14. flow

    Zash, it does have a by attribute

  15. flow

    basically <referenced-stanza> is mimicking <stanza-id>

  16. Zash

    ah, right, sensible

  17. flow

    but I wondered if we should design it for multi-references

  18. flow

    <referenced-stanzas xmlns=…><stanza-id …/><stanza-id …/></referenced-stanzas>

  19. flow

    that would allow to re-use your stanza-id parser

  20. Zash

    the wrapper could plausibly live in each XEP that uses such a reference then?

  21. flow

    Zash, and the advantage of doing so would be?

  22. lovetox

    omg, solving problems nobody has

  23. flow

    Zash, fwiw, <referenced-stanzas/> xmlns would be xep359 one

  24. lovetox

    actually ignoring problems that are pressing and lead to hard implementation and less adoption

  25. flow

    lovetox, let's say I would agree: what would be wrong with the approach for the sake of consistency?

  26. Zash

    problem statement?

  27. singpolyma

    The lack of a by on replies, reactions, etc is basically the reason we have to have these rules and use <stanza-id> in muc etc

  28. singpolyma

    But otoh people seem fine with <stanza-id> so whatever. It's not a big deal I expect

  29. lovetox

    nothing flow, absolutely nothing, only that it solves a problem nobody has

  30. lovetox

    or do you really think when developers implement this XEP they care from which tag they parse the id attribute?

  31. flow

    Zash, I assume that lovetox assumes that there is only one message ID in the MUC case, which can not be spoofed

  32. singpolyma

    flow: no, we always use <stanza-id> in muc specifically because of this lack of by attribute / spoofing problem for message@id

  33. larma

    flow: the reason why we have hard times to do the referencing thing is that the referencing thing alone doesn't buy us anything.

  35. flow

    singpolyma, ok, and when referencing a message in MUC you want to simply use the stanza-id@id but without the 'by' attribute?

  37. larma

    And then the referencing thing adds complexity to cases where it might not be needed, or lacks features in other cases

  38. singpolyma

    flow: that is what is already done

  39. lovetox

    the pressing problem for me is currently, that we have an influx of new XEPs that reference something, and people have no guideline which ID they should actually reference and *why*.

  40. flow

    singpolyma, ok, and I see how it works in MUC cases (if you do what I assume you do). But I believe that we should simply always include the 'by' attribute for consistency and symmetry

  41. singpolyma

    Copy what an existing xep does?

  42. larma

    lovetox: we actually have a pretty good rule for that, I don't see the problem

  43. lovetox

    who do you mean with we?

  44. lovetox

    and where is this rule written down

  45. larma

    All the XEPs have it in a footnote

  46. lovetox

    care to elaborate?

  47. flow

    which does not seem to be the best place for a rule that appears highly relevant?

  48. singpolyma

    flow: if we used a by attribute we could get rid of <stanza-id> completely and just use message@id and get rid of the dependency on mam. But I think only I want that anyway so it's probably not worth it

  49. flow

    singpolyma, if you get rid of mam, then what's your id provider?

  50. lovetox

    flow, for many usecases you dont need an id provider, i guess you mean a server

  51. flow

    by id provider, I mean the id assigning entity

  52. flow

    is it the MUC service or the sending client?

  53. flow

    or something else?

  54. larma

    singpolyma: we do have clients out there that have shitty message@id generation, so no, we can't rely on it for features that need to work on messages generated by those clients

  55. singpolyma

    flow: sending client. As with correction xep

  56. singpolyma

    And as with 1:1 for other xeps

  57. flow

    singpolyma, then which entity ensures that the IDs are not spoofable?

  58. larma

    It's fine to say "you can't reply to messages of those shitty clients", but it's definitely not OK if we can't moderate them

  59. singpolyma

    larma: we already do everywhere but MUC.

  60. larma

    singpolyma: we don't moderate outside MUC

  61. lovetox

    flow, not sure about which spoofing you are talking, its only you your server and a contact in many cases

  62. singpolyma

    Oh sure, I agree moderation is different maybe because you're communicating with the MUC about its own archive. It's basically mam dependent even in concept

  64. flow

    lovetox, MUC case: Alice sends message A1, Mallory sends message A1, Bob sends +1 reaction referencing A1, but now it is unclear which message is +1'ed
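
The scenario above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `ArchivedMessage` type, the nicknames, and the `muc-0001` style stanza-id values are hypothetical and not taken from any XEP or implementation. It shows why a reference by client-chosen message@id can be ambiguous in a room, while a reference by the MUC-assigned stanza-id is not.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ArchivedMessage:
    sender_nick: str
    message_id: str  # chosen by the sending client, so it can collide or be copied
    stanza_id: str   # assumed to be assigned by the MUC service, unique per room
    body: str


def find_by_message_id(archive: List[ArchivedMessage], message_id: str) -> List[ArchivedMessage]:
    """Return every message whose client-chosen id matches."""
    return [m for m in archive if m.message_id == message_id]


def find_by_stanza_id(archive: List[ArchivedMessage], stanza_id: str) -> List[ArchivedMessage]:
    """Return every message whose service-assigned id matches."""
    return [m for m in archive if m.stanza_id == stanza_id]


# Hypothetical room history: Mallory reuses Alice's message@id "A1".
ROOM_ARCHIVE = [
    ArchivedMessage("Alice", "A1", "muc-0001", "Lunch at noon?"),
    ArchivedMessage("Mallory", "A1", "muc-0002", "Ignore that"),
]

# Bob's reaction referencing "A1" matches both messages ...
ambiguous = find_by_message_id(ROOM_ARCHIVE, "A1")
# ... while a reference to the room's stanza-id matches exactly one.
unambiguous = find_by_stanza_id(ROOM_ARCHIVE, "muc-0001")
```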

  65. singpolyma

    flow: that's what adding a by attribute to the reaction solves

  67. flow

    singpolyma, and in the anonymous MUC case?

  69. lovetox

    flow, occupant-id

  70. larma

    singpolyma: let's say we have a XEP for requesting a message with all its edits at once, how would the server know which edit to attach to if we relied on message@id and by attribute and the same message@id appears twice?

  71. flow

    singpolyma, but you are right, the by attribute helps here when referencing messages

  73. lovetox

    seems you search for one thing that rules all

  74. lovetox

    but you simply need to categorize use cases

  75. larma

    Mallory could just use Alice's nick while she is offline and thus have the same by

  76. singpolyma

    larma: again, that sounds like specifically a mam extension so probably it would use mam IDs

  77. lovetox

    user retract use case, there does not need to be stanza ids, its unspoofable

  78. larma

    singpolyma: wait, edits are a MAM extension now?

  79. lovetox

    message correction use case, unspoofable

  80. lovetox

    reactions use case, unspoofable

  81. singpolyma

    larma: fetching a list of things you said

  82. larma

    singpolyma: sure, but that needs to be possible with the information the messages contain, we don't want to have a new XEP that extends the edit xep to be compatible with the new mam feature

  84. flow

    larma, yes Mallory could also send while Alice is offline, but isn't that an argument to use MUC room MAM IDs when referencing messages? by using the MUC service as central arbiter for message IDs, all of those problems seem to go away. and modern MUC rooms want to provide MAM anyways, then you get the MAM IDs for free

  85. Zash

    Heh, you could even do <stanza-id> and pretend to have zero retention...

  86. flow

    right

  87. larma

    flow: I'm all in favor to use MUC stanza-id ID in MUC and this is the status quo as far as I can tell

  88. larma

    Or at least should be

  90. flow

    ok, because it appeared to me that we were talking about a MAM-less MUC future for a while. but I probably just misunderstood

  92. larma

    What we don't need to do is to state the by attribute because that's always given in MUCs (because only the MUC's stanza-id is ever referenced)

  93. lovetox

    by attribute is always unnecessary if we talk about XEPs where users send stuff to other users

  94. lovetox

    servers tell us that anyway

  95. larma

    For direct messages, by attribute could be relevant if we consider that a reference might point to messages of any of the two (not true for edits, but for example for reactions)

  96. lovetox

    dont see how it is necessary in that case?

  97. lovetox

    you reference a id, i attach the reaction there

  98. larma

    What if you and me use the same id?

  99. lovetox

    then i attach it to both

  100. lovetox

    why would i care, its a reaction from another user, if he wants sabotage himself

  101. lovetox

    so be it

  102. lovetox

    its like the question, what if the client uses 1 as id for everything and makes then, a correction for message id 1

  103. lovetox

    yeah .. then i correct 100 messages, not really a problem of the receiving client

  104. larma

    I don't want to react with a thumbs up to your message "people are stupid", I want to thumbs-up my previous message "don't be stupid"

  106. lovetox

    user will be annoyed to talk to other user who has a broken client, and will tell him

  107. lovetox

    larma, i see your point, if i think about an environment where all kind of clients use non-stable ids

  108. lovetox

    you are right, in that case namespacing it with by is necessary

  109. lovetox

    but we are talking about 1:1 case

  110. larma

    I agree from that standpoint that intentionally misbehaving clients are not that much of an issue because it only breaks the conversation of those two users and that conversation is already broken if one side breaks rules in any other way, so that's something acceptable to me

  111. lovetox

    its evident that the client is broken, users will sort it out themself

  112. lovetox

    in MUC case stanza id is used, and its definitely unique, so nothing of that sort can happen

  113. larma

    lovetox: what is non-stable id?

  114. lovetox

    ah sorry, meant non-unique

  115. larma

    Ah, yeah, that's why Dino for example won't allow you to reply to 1:1 messages without origin-id, because origin-id explicitly requires uniqueness and suggests fully randomness

  116. lovetox

    yeah, and i feel thats unnecessary

  117. lovetox

    but i respect your motivation to support older clients

  118. lovetox

    i actually would want to have stats which clients are really in the eco system that do this

  119. lovetox

    the main problem for me is currently origin-id/message-id

  120. lovetox

    it actually stops me from implementing more stuff

  121. lovetox

    because its uncertain if some people will change it in the future, and i have to deal with this later

  122. lovetox

    and larma, i bet you allow message corrections without origin-id

  123. lovetox

    so quite inconsistent for me

  124. lovetox

    message correction is a rather old XEP

  125. lovetox

    and it works only really if the client has unique ids

  127. larma

    Why? The ruleset is pretty easy: if origin-id is present on 1:1 messages use it for references. If not use message@id for references. Then maybe decide to not allow message@id values in references that seem not unique/random
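
The ruleset described above could be sketched roughly as follows. This is a minimal sketch under assumptions, not anything a XEP mandates: the function names are hypothetical, and "seems unique/random" is approximated here by "parses as a canonical UUIDv4", which is only one possible heuristic.

```python
import uuid
from typing import Optional


def looks_like_uuid4(value: str) -> bool:
    """Heuristic (an assumption): treat canonical UUIDv4 strings as random enough."""
    try:
        parsed = uuid.UUID(value)
    except ValueError:
        return False
    # Require the canonical hyphenated form so "{...}" or "urn:uuid:..." do not pass.
    return parsed.version == 4 and str(parsed) == value.lower()


def pick_reference_id(
    message_id: Optional[str],
    origin_id: Optional[str],
    require_random: bool = True,
) -> Optional[str]:
    """Pick the id to reference for a 1:1 message, or None if it is not referenceable.

    Rule as stated in the discussion: prefer origin-id when present, otherwise
    fall back to message@id, optionally rejecting ids that do not look random.
    """
    if origin_id:
        return origin_id
    if not message_id:
        return None
    if require_random and not looks_like_uuid4(message_id):
        return None
    return message_id
```

A client that wants to be lenient towards older peers could call `pick_reference_id(message_id, origin_id, require_random=False)` and accept any message@id.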

  128. lovetox

    its not about the rule

  129. lovetox

    its about if people decide to change the rule in 2 years

  130. lovetox

    i want to have a stable environment where i can depend on that design choices by XEP authors / board are thought through and not changed suddenly

  131. larma

    What could be a possible change of the rule that would break with that rule

  132. lovetox

    as i said, there are other people, that dont have your opinion that origin-id is something we need

  133. lovetox

    and it can go away

  134. larma

    I totally agree we can get rid of origin-id eventually (except for MUC reflection usecase)

  135. lovetox

    so you can think of something that breaks that rule

  136. lovetox

    :D

  137. larma

    I also don't want it, but shitty clients are out there and we have to handle them somehow

  138. lovetox

    look i dont care which camp wins

  139. larma

    lovetox: no, if we drop origin-id you use message@id as per the rule

  140. lovetox

    thats a bad rule sorry, there is no XEP in stable that mandates origin-id

  141. lovetox

    is it too much to ask to force a decision now

  142. larma

    The only things origin-id is effectively being used for is:
    - let others know your message@id is unique
    - be able to identify if a message@id was modified

  143. flow

    I hope we see ourselves not as a community divided into camps that need to win, but a community striving for consensus

  144. lovetox

    instead of doing nothing and let authors further publish XEPs until we are in a position where it needs to be supported for all eternity

  145. lovetox

    you know for my implementation its not trivial to simply switch to another id

  146. larma

    You know that origin-id and message@id are the same on literally every implementation that has origin-id at all?

  147. lovetox

    yes, thats why i think about to just ignore it, and simply always reference the message id

  148. lovetox

    but this is a risky implementation

  149. flow

    *every implementation you are aware of

  150. lovetox

    as it seems not clear which way things go with ids

  151. larma

    Sure, let's change the origin-id XEP to require that

  152. larma

    So that we can be sure any future implementation also has this feature that origin-id = message@id

  153. lovetox

    would be a good start

  154. flow

    wouldn't that make origin-id obsolete?

  155. lovetox

    flow, some people want the info if a id is unique

  156. larma

    flow: we still need it for the two things I mentioned above

  157. flow

    (which wouldn't be a bad thing, I suppose)

  158. larma

    > The only things origin-id is effectively being used for is:
    > - let others know your message@id is unique
    > - be able to identify if a message@id was modified

  159. Zash

    What happens if I write a server-side plugin that strips origin-id if == message@id ?

  160. lovetox

    larma, the second case is also non-existent

  161. larma

    lovetox: MUC reflection

  162. lovetox

    if you think there about the MUC case

  163. lovetox

    and the XEP was amended

  164. lovetox

    larma, are you aware of a implementation that is not 20 years old that does this actively

  165. lovetox

    ?

  166. lovetox

    then lets fix it

  167. lovetox

    every new implementation will be compliant with the XEP

  168. larma

    What do you mean has been amended?

  169. larma

    MUC #stable-id is fully optional

  170. lovetox

    of course because MUC is a old stable xep

  171. larma

    Biboumi still does not support it afaik

  172. lovetox

    because it splits message right?

  173. larma

    Yes

  174. lovetox

    damn

  175. lovetox

    ok ignore what i said

  176. lovetox

    and what does it do then?

  177. Zash

    Tangent to that, XEP wishlist: advertising certain message limits

  178. lovetox

    does it invent new ids?

  179. lovetox

    or uses the same

  180. lovetox

    i wonder does biboumi need to do this?

  181. lovetox

    only because it splits the outgoing messages, why does it need to split them for the reflection to xmpp?

  182. lovetox

    but on the other side, none of these XEPs work for IRC anyway

  183. lovetox

    retraction, reaction, moderation, whatever

  184. lovetox

    and stop, you can simply use the muc feature as indication for that

  185. lovetox

    dont need to have this info on every message

  186. lovetox

    so its actually only the 1. case

  187. lovetox

    and there i would argue, its not worth it, in MUC we use anyway stanza id, so origin-id is not necessary

  188. lovetox

    and in 1:1 case its simply not important, people just migrate away from broken clients

  189. lovetox

    also your message reply use case, i dont see it

  190. lovetox

    you only allow a message reply if the other party uses origin-id

  191. lovetox

    but if it does not use origin-id, it will also not have support for message replies

  192. lovetox

    so it will not see it anyway

  193. lovetox

    nothing happens here

  194. lovetox

    your case would only be for clients who dont have unique ids, but support the most new XEPs like message replies

  195. Zash

    in 1:1 it should be more feasible to rely on disco#info too, but it may still be weird given multiple devices

  196. lovetox

    no its simply not important for 1:1

  197. larma

    There might be two messages with the same ID in a 1:1 chat and still multiple Clients that understand replies

  198. larma

    It's not like 1:1 chats only have two clients

  199. lovetox

    yes there can be cases constructed where something breaks, but its not worth it to add something new for that

  200. larma

    It's not something new

  201. lovetox

    larma, we have message corrections since years

  202. lovetox

    i heard no complaints that its totally broken because there are clients with non unique ids

  203. larma

    Message corrections only work on the last message, so the id doesn't really matter anyway

  204. lovetox

    no, thats implementation decision

  205. lovetox

    and we will definitely support this for not only the last message

  206. larma

    What most clients do is to just edit the last message with that id if there are duplicates

  207. larma

    Which is reasonable in the case where the XEP explicitly says it's only to be used for the last message

  208. larma

    But I might want to reply to the older of two messages with the same id

  209. lovetox

    ok so someone uses a old client, and a very new client at the same time

  210. larma

    Like many users

  211. lovetox

    and you need to reply to a message from the old client

  212. lovetox

    its very far fetched

  213. larma

    That have Pidgin on Desktop and Conversations on mobile

  214. lovetox

    and for that you mandate that all clients forever add a origin-id tag

  215. larma

    That's the two most popular clients

  216. larma

    How is that far fetched?

  217. lovetox

    cost/benefit is not there for me sorry

  218. larma

    For the most popular desktop client

  219. lovetox

    Pidgin uses maybe a non-unique id

  220. lovetox

    but that you hit the same id multiple times in a single chat with a contact

  221. lovetox

    is still very low probability

  222. lovetox

    at what point do you consider the cost of the measure?

  223. lovetox

    if somebody can construct a single case where something would break, you add a mitigation not looking at the costs?

  224. lovetox

    even if i would agree, that there can be constructed a case with a used client, that happens to 1 in 1000 messages, i would still not mandate every client in existence forever, to add a origin id tag

  226. larma

    No, we surely should come up with rules as to when message@id is fine even if it's not an origin-id. See my previous message.

  227. larma

    > The ruleset is pretty easy: if origin-id is present on 1:1 messages use it for references. If not use message@id for references. Then maybe decide to not allow message@id values in references that seem not unique/random

  228. larma

    I would propose to accept any uuidv4 message@id the same way as origin-id

  229. MattJ

    Do any of these debates have bearing on what servers will need to index for fastening-like functionality?

  230. larma

    So the only thing we need to do is to mandate that origin-id must match message@id if present. Then it's fine for a client to only rely on message@id but require it to be uuidv4 to do replies/reactions/...

  231. larma

    MattJ: for MUCs it will always be the MUC assigned stanza-id, I think everyone agrees on that

  232. larma

    For 1:1 it's more complicated because your index must be on (local-user's-jid+remote-user's-jid+(origin-id/message@id))
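
A rough sketch of such a conversation-scoped index, assuming an in-memory dictionary and hypothetical names (`ReferenceIndex`, the example JIDs, the internal integer ids). It only illustrates that the id is looked up together with both participants, so an id learned from another chat resolves to nothing.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Index key: (local user's bare JID, remote user's bare JID, origin-id or message@id).
IndexKey = Tuple[str, str, str]


class ReferenceIndex:
    """In-memory sketch of a 1:1 reference index scoped to a conversation."""

    def __init__(self) -> None:
        self._entries: Dict[IndexKey, List[int]] = defaultdict(list)

    def add(self, local_jid: str, remote_jid: str, ref_id: str, internal_id: int) -> None:
        """Record that ref_id names the stored message internal_id in this chat."""
        self._entries[(local_jid, remote_jid, ref_id)].append(internal_id)

    def resolve(self, local_jid: str, remote_jid: str, ref_id: str) -> List[int]:
        """Return the internal ids matching ref_id within this one conversation."""
        return list(self._entries.get((local_jid, remote_jid, ref_id), []))


index = ReferenceIndex()
index.add("lovetox@example.org", "mattj@example.org", "id-123", 1)

# Same id, same conversation: found.
hit = index.resolve("lovetox@example.org", "mattj@example.org", "id-123")
# Same id, but a different remote party: nothing to attach to.
miss = index.resolve("lovetox@example.org", "larma@example.org", "id-123")
```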

  233. larma

    But that's basically status quo

  234. MattJ

    😔

  235. lovetox

    i think its wrong to do all this complexity to support unmaintained client like pidgin

  236. larma

    (Because uniqueness can't be guaranteed on 1:1 IDs and thus must be restricted to the chat)

  238. larma

    What complexity? Requiring that origin-id must match message@id? Or requiring that message@id must be UUIDV4 for references?

  239. MattJ

    I'm in favour

  240. lovetox

    the complexity in code, in my database design, when i join tables on IDs and must consider this

  241. Zash

    UUIDv7 would also be nice

  242. larma

    For me those two rules are to reduce complexity from what we theoretically have right now.

  243. lovetox

    no, what we could have is easy, use always message id in single chat, always stanza id in MUC

  244. larma

    > the complexity in code, in my database design, when i join tables on IDs and must consider this
    Don't implement origin-id. If we have those two rules you're good to go without it.

  245. lovetox

    yes thanks, thats why i want a decision

  246. Zash

    Database design is its own field for a reason.

  247. lovetox

    something that shows me, this is the future i can implement that

  248. lovetox

    not just words :)

  249. larma

    You will still need to take the participants into considerations when doing database joins

  250. MattJ

    Sorry, typing on an annoying phone keyboard. I'm in favour of anything that moves us back away from origin-id and towards a sensible place.

  252. MattJ

    The @id is the origin id, and it's regrettable that the semantics weren't more watertight in the RFCs

  253. larma

    Because I shouldn't be able to react to messages in your chat with Matt just because I discovered the ID of a message in that chat

  254. lovetox

    of course larma :) its just you dont want to have OR in your join clause, message_id = id OR origin_id = id

  255. lovetox

    from what i read its very inefficient and a full table scan

  256. MattJ

    and you're assuming an SQL database anyway

  257. larma

    > UUIDv7 would also be nice
    Agreed ;)

  258. Zash

    lovetox, what if you have your own internal ID, along with a lookup table for message-id → gajim-id ?

  259. Zash

    That's basically what we do in Prosody
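
One way to read the lookup-table idea is sketched below with Python's built-in `sqlite3`. The schema, table names and helper functions are assumptions made for illustration; this is not Prosody's or Gajim's actual storage layout. Every known id of a message (message@id, origin-id) becomes a row pointing at one internal key, so resolving a reference is a single indexed equality lookup instead of an `OR` across two columns.

```python
import sqlite3
from typing import List, Optional


def open_archive() -> sqlite3.Connection:
    """Create an in-memory archive with a separate id lookup table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE message (
            pk INTEGER PRIMARY KEY AUTOINCREMENT,
            conversation TEXT NOT NULL,
            body TEXT NOT NULL
        );
        CREATE TABLE message_ref (
            conversation TEXT NOT NULL,
            ref_id TEXT NOT NULL,
            message_pk INTEGER NOT NULL REFERENCES message(pk)
        );
        CREATE INDEX message_ref_lookup ON message_ref (conversation, ref_id);
        """
    )
    return db


def store(db: sqlite3.Connection, conversation: str, body: str,
          message_id: str, origin_id: Optional[str] = None) -> int:
    """Store a message and register each distinct external id it is known by."""
    cur = db.execute(
        "INSERT INTO message (conversation, body) VALUES (?, ?)",
        (conversation, body),
    )
    pk = cur.lastrowid
    for ref_id in {message_id, origin_id} - {None}:
        db.execute(
            "INSERT INTO message_ref (conversation, ref_id, message_pk) VALUES (?, ?, ?)",
            (conversation, ref_id, pk),
        )
    return pk


def resolve(db: sqlite3.Connection, conversation: str, ref_id: str) -> List[int]:
    """Map an external id to internal keys, scoped to one conversation."""
    rows = db.execute(
        "SELECT message_pk FROM message_ref"
        " WHERE conversation = ? AND ref_id = ? ORDER BY message_pk",
        (conversation, ref_id),
    )
    return [row[0] for row in rows]
```

Reactions, corrections and replies can then be stored against the internal key only, which keeps later joins on a single integer column.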

  260. lovetox

    of course, a third column :D yes in code everything is possible, im not saying this is an unsolvable problem

  261. lovetox

    but having a proper decision, and not needing to consider all possible futures makes it easier to choose the best way

  262. larma

    So we all agree that we have to make flow adjust XEP-0359 to:
    - Ask specifically for UUID v4 or v7 (instead of current just any UUID)
    - Mandate that origin-id, if present, must match message@id if the origin is the sender of the message.
    ?

  263. Zash

    I just wish there was a more compact UUID representation

  264. lovetox

    yes that would be a good start larma

  266. larma

    Zash: isn't that up for the underlying transport/storage engine to handle

  267. lovetox

    although i have no opinion about the specific version of UUID, could it not be enough to say that implementors must choose a fitting UUID themselves

  268. Zash

    larma, I'm thinking for wire protocol

  269. MattJ

    larma, I'd be in favour of that move

  270. larma

    lovetox: UUID v1 doesn't entail any randomness

  271. Zash

    lovetox, since you often speak of using timestamps, surely you want UUIDv7 for everything, being basically a timestamp + some random noise

  272. lovetox

    hmm no, i want a sequential number in a archive

  273. lovetox

    and timestamp generated by uuid7 certainly is somehow dependent on the computer time on the sender

  274. lovetox

    i currently would have no use for this

  275. MattJ

    Sequential counters, the easiest thing in computer science

  276. lovetox

    larma, but would it not be better to say, choose a uuid which has some randomness

  277. lovetox

    instead of mandating a specific version

  278. lovetox

    Zash, or do you mean that the archive uses uuid7 as stanza id

  279. lovetox

    could be interesting yes

  280. lovetox

    the question is, is uuid7 sortable?

  281. lovetox

    Zash are you sure it includes randomness?

  282. lovetox

    i doubt it

  283. lovetox

    > The UUIDv7 format is designed to encode a Unix timestamp with arbitrary sub-second precision.

  284. lovetox

    ok seems they all guarantee global uniqueness

  285. lovetox

    so they need to have some kind of randomness in it, how this still can be sortable is beyond me

  286. lovetox

    ah i get it, they put the timestamp in front

  287. lovetox

    and hope that no entry has the same timestamp

  288. lovetox

    yeah so that would work if a archive simply uses that as stanza-id
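
The "timestamp in front, randomness behind" layout can be sketched by hand in Python. This follows the UUIDv7 bit layout as it was later published in RFC 9562 (48-bit Unix millisecond timestamp, version, 12 random bits, variant, 62 random bits); the function name `uuid7_sketch` is hypothetical, and this is an illustration rather than a vetted generator. Note that two ids created within the same millisecond are ordered only by their random bits.

```python
import os
import time
import uuid
from typing import Optional


def uuid7_sketch(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Build a UUIDv7-style value: millisecond timestamp first, random bits after."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")   # 80 random bits, 74 of them used
    rand_a = rand >> 68                            # top 12 bits
    rand_b = rand & ((1 << 62) - 1)                # low 62 bits

    value = (unix_ms & ((1 << 48) - 1)) << 80      # bits 127..80: timestamp
    value |= 0x7 << 76                             # bits 79..76: version 7
    value |= rand_a << 64                          # bits 75..64: random
    value |= 0b10 << 62                            # bits 63..62: RFC variant
    value |= rand_b                                # bits 61..0: random
    return uuid.UUID(int=value)


earlier = uuid7_sketch(1_691_280_000_000)
later = uuid7_sketch(1_691_280_000_001)
# The hex form is fixed-width, so string order follows the timestamp prefix.
sorted_by_time = str(earlier) < str(later)
```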

  289. lovetox

    its a bit dependent on the correct time on the machine, but we have that anyway, as the timestamp of the message is recorded by the server

  290. lovetox

    so if it is wrong, it will be totally broken anyway

  291. Zash

    Even with unsynced clocks, you get things that are near in time closer in the index, which might be good. Or maybe it doesn't matter, I'm not a DBA 🤷️

  292. Zash

    Unordered things like UUIDv4 seem unfriendly to indexes

  293. Zash

    Tho I assume that proper databases can handle that too

  294. lovetox

    but this uuid7 thing is mainly useful for some kind of distributed databases, where i would need some global assignment of id otherwise

  295. lovetox

    for a single machine storage, simply use a autoincrementing field which all databases have

  296. Zash

    That's what all UUIDs are designed for

  297. lovetox

    no need for complicated uuid7

  298. Zash

    *Universally* Unique IDs

  299. Zash

    Unique forever and throughout the entire universe!

  300. lovetox

    yes but archives dont need globally unique ids

  301. Zash

    Pretty sure this is actually insane overkill

  302. lovetox

    ejabberd does this btw

  303. Zash

    Especially if IDs are scoped per JID

  304. lovetox

    it simply uses a timestamp as ID

  305. Zash

    ( JID, 64-bit number ) or so would probably be fine

  306. theTedd

    Relevant to the above discussion: https://github.com/ulid/spec And https://en.wikipedia.org/wiki/Snowflake_ID is worth a look.