This differential adds method that parses logs in search for blob hashes and stores all found hashes to an attachments file.
Details
- Reviewers
michal kamil - Commits
- rCOMMeaeb4f63f0dd: Implement attachments capture
The same as for parent differential but this time ensure that attachments file are created especially if you send multimedia messages. Attachments file is
a list of blob hashes separated by newline.
Diff Detail
- Repository
- rCOMM Comm
- Lint
No Lint Coverage - Unit
No Test Coverage
Event Timeline
native/cpp/CommonCpp/DatabaseManagers/SQLiteConnectionManager.cpp | ||
---|---|---|
102–145 ↗ | (On Diff #36168) | So basically we iterate over each change in a log and for UPDATE and INSERT changes affecting media table we retrieve uri value and if it contains blob hash we return it to be persisted in a file. Checking for blob service hash is exactly the same as in main compaction creation. |
It seems like our approach here is specific to one single place we store blobs: the uri column of the media table.
- Are we considering old-style uploads here, which are not stored in the blob service?
- Do we use the blob service anywhere else in the codebase today? Do we need to add holders for those cases too?
- Are we planning any other uses of the blob service that are worth considering? I wonder if we should design a more generic mechanism here, eg. have a standard way of storing blob IDs across SQLite, or perhaps have a single table in SQLite containing all of them.
- Old-style uploads do not need to be handled in attachments. Attachments are specific to blob-service uploads so that the backup service can read through the list of attachments and create a holder for each blob so that it is not deleted during clean-up before someone fetches them after restore.
- I talked to @bartek and confirmed that uri column of media table is the only place that we need to look for attachments now.
- There is a task for the correct solution: https://linear.app/comm/issue/ENG-6261/introduce-blob-hash-table
Thanks for explaining!
Old-style uploads do not need to be handled in attachments. Attachments are specific to blob-service uploads so that the backup service can read through the list of attachments and create a holder for each blob so that it is not deleted during clean-up before someone fetches them after restore.
I'm aware of that. I'm asking if the uri field might contain something for old-style uploads, that we want to skip. Is that being handled here correctly?
Your concern is correct - see translateMediaToClientDBMediaInfos() - the uri column is used for both kinds of media. They're distinguished by having encryption_key field in the extras column's tringified JSON. But this is about encrypted vs non-encrypted media.
Is that being handled here correctly?
Yes, because the Blob vs non-blob URIs are distinguished by the comm-blob-service:// URI prefix, which is done in L149 in this diff
Yes - I am not adding uri to the attachments list if it doesn't start with "comm-blob-service://" prefix.
native/cpp/CommonCpp/DatabaseManagers/NativeConnectionManager.cpp | ||
---|---|---|
66 | seems more idiomatic | |
89–90 | I think something strange happened here with the formatting | |
102–106 | I've seen so many of those in code already, wondering if adding function like this: void NativeConnectionManager::handleSQLError(int errorCode, const std::string& errorPrefix) { if (errorCode != SQLITE_OK) { throw std::runtime_error( errorPrefix + std::string(sqlite3_errstr(errorCode))); } } isn't beneficial | |
117 | ||
149 | how about using std::string::compare? avoids creating a sub-string and feels more efficient |