
[services] Backup - Add function to get utf8 string length
Abandoned, Public

Authored by jakub on Jul 12 2022, 1:19 AM.
Tags
None

Details

Summary

Depends on D4504
context: here

Characters encoded in UTF-8 vary in size from 1 to 4 bytes. The standard length method returns different values on different architectures. To avoid that, we need a universal function for counting string bytes.
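For illustration, here is a minimal sketch of a locale-independent way to count UTF-8 characters (code points) as opposed to bytes. It is not the function from this diff: utf8CodePointCount is a hypothetical name, and the sketch assumes the input is already valid UTF-8. It relies on the fact that every continuation byte in UTF-8 has the bit pattern 10xxxxxx, so counting the bytes that do not match that pattern gives the number of code points.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not the function from this diff.
// Counts UTF-8 code points by skipping continuation bytes (10xxxxxx).
// Assumes the input is valid UTF-8; it performs no validation.
std::size_t utf8CodePointCount(const std::string &str) {
  std::size_t count = 0;
  for (unsigned char byte : str) {
    // Any byte that is not a continuation byte starts a new code point.
    if ((byte & 0xC0) != 0x80) {
      ++count;
    }
  }
  return count;
}
```

For comparison, std::string::size() reports the number of bytes, so for a string holding a single two-byte character it returns 2 while the helper above returns 1.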

Test Plan

cd services && yarn run-unit-tests backup

Diff Detail

Repository
rCOMM Comm
Lint
No Lint Coverage
Unit
No Test Coverage

Event Timeline

jakub edited the test plan for this revision.

Surprising that there is no direct library function for this! Did some quick Googling and found this StackOverflow, but couldn't find any single API that would return the size of a UTF-8 string.

Additionally – reminder that you should ALWAYS specify a reviewer for a diff!

karol added 1 blocking reviewer(s): tomek.

Looks OK. Personally, I'd try to do this without pointers, but it probably doesn't matter.

tomek requested changes to this revision. Jul 18 2022, 5:46 AM

Can this function be called from multiple threads?

services/backup/src/Tools.cpp
60

In the example at https://en.cppreference.com/w/cpp/string/multibyte/mblen there's a line std::mblen(nullptr, 0); // reset the conversion state before the while loop. Should we include it here, before this while?

63

What is strlen_mb() in this error message? Could you modify the message so that it is more helpful?
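
One possible way to address the thread-safety and conversion-state questions above is std::mbrlen, which takes an explicit std::mbstate_t instead of the hidden internal state that std::mblen updates on each call. The sketch below is only an assumption about how that could look, not the code under review: countMultibyteChars is a hypothetical name, and the error message is just an example of wording that names the failure. The result still depends on the process-wide C locale (LC_CTYPE), so concurrent calls to std::setlocale would remain a problem.

```cpp
#include <cstddef>
#include <cwchar>
#include <stdexcept>
#include <string>

// Hypothetical sketch, not the code from this diff.
// Counts multibyte characters with std::mbrlen and a caller-owned
// conversion state, so no hidden state is shared between calls.
// Character boundaries are decided by the current C locale (LC_CTYPE).
std::size_t countMultibyteChars(const std::string &str) {
  std::mbstate_t state{}; // zero-initialized: the initial conversion state
  const char *ptr = str.data();
  const char *end = ptr + str.size();
  std::size_t count = 0;
  while (ptr < end) {
    const std::size_t remaining = static_cast<std::size_t>(end - ptr);
    const std::size_t next = std::mbrlen(ptr, remaining, &state);
    if (next == static_cast<std::size_t>(-1) ||
        next == static_cast<std::size_t>(-2)) {
      throw std::invalid_argument(
          "countMultibyteChars: invalid or truncated multibyte sequence");
    }
    // mbrlen returns 0 for a null character, which occupies one byte.
    ptr += (next == 0) ? 1 : next;
    ++count;
  }
  return count;
}
```

With the default "C" locale this is only reliable for ASCII input. Counting UTF-8 characters this way requires selecting a UTF-8 locale first with std::setlocale, and which locale names are available depends on the system.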

This revision now requires changes to proceed. Jul 18 2022, 5:46 AM
ashoat requested changes to this revision. Jul 18 2022, 11:51 AM

It seems to me that this diff should be abandoned since it's replaced in D4544. It's important that you think in terms of diffs going forward... when you fix an issue introduced in an earlier diff, you should always update the existing diff instead of introducing a new one.

Think about it from the perspective of the reviewer. What is the point of reviewing this code if it's going to immediately be thrown away anyways?