Page MenuHomePhabricator

[keyserver] Clean up fetchMessageInfos using ROW_NUMBER window function
ClosedPublic

Authored by ashoat on Jul 28 2022, 11:20 PM.
Tags
None
Referenced Files
F3550494: D4675.diff
Thu, Dec 26, 6:15 PM
Unknown Object (File)
Sun, Dec 15, 7:27 AM
Unknown Object (File)
Sun, Dec 15, 7:27 AM
Unknown Object (File)
Sun, Dec 15, 7:26 AM
Unknown Object (File)
Sun, Dec 15, 7:22 AM
Unknown Object (File)
Mon, Dec 9, 5:09 AM
Unknown Object (File)
Tue, Dec 3, 9:36 PM
Unknown Object (File)
Nov 18 2024, 3:13 PM

Details

Summary

From reviewing some StackOverflow answers, I learned that MariaDB 10.8 supports ROW_NUMBER, which wasn't previously supported in MySQL 5.7. We can simplify this query while maintaining the same performance.

Depends on D4674

Test Plan

I first tested this stack locally in my environment to make sure the queries worked. Then I patched it onto my production keyserver and confirmed performance was improved

Diff Detail

Repository
rCOMM Comm
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Looks great!

keyserver/src/fetchers/message-fetchers.js
281 ↗(On Diff #15086)

Do we still need to support the old engine? This query contains code specific to MariaDB LIMIT 18446744073709551615 so it is not specific to mysql5.7.

331 ↗(On Diff #15086)

What if two different messages have the same time? We can handle that in two ways: we can either add another order by column, e.g. id, or we can use RANK instead of ROW_NUMBER. The difference between these two is that when ROW_NUMBER will see an equal result, it will assign non deterministically different values, but RANK will assign the same value, but then the next value is increased accordingly. So if we order by a field f the following will happen

fRANKROW_NUMBER
111
222
223
344

So overall, in our use case, it is safer to use RANK. At the same time, we should use ids to sort when the same timestamp is assigned.

This revision is now accepted and ready to land.Jul 29 2022, 4:33 AM
keyserver/src/fetchers/message-fetchers.js
281 ↗(On Diff #15086)

Do we still need to support the old engine?

Yes until all dev environments have been migrated.

This query contains code specific to MariaDB LIMIT 18446744073709551615 so it is not specific to mysql5.7.

That's valid, I could remove that patch. That said, I did see a StackOverflow comment yesterday implying it was generally good practice to apply this LIMIT. Given MariaDB's relationship with MySQL I'm willing to bet MySQL 8 may have the same requirement.

Either way, I expect this code will be deleted soon.

331 ↗(On Diff #15086)
  1. It seems unlikely that two messages have the same time, but it's possible
  2. Seems like it's better to just add id to the ordering than to use RANK, since it maintains the property that we return no more than numberPerThread results
  3. If we want to modify the ORDER BY in that way, it might be best to do it for all message-fetchers.js code in a separate diff, rather than just for fetchMessageInfos
keyserver/src/fetchers/message-fetchers.js
281 ↗(On Diff #15086)

ENG-1394 tracks the deprecation. Hoping to have it complete by Monday, August 8th

331 ↗(On Diff #15086)

Created ENG-1488 to track this