Page MenuHomePhabricator

[blob] Delete S3 objects during cleanup
ClosedPublic

Authored by bartek on Oct 4 2023, 4:10 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Jan 11, 7:05 PM
Unknown Object (File)
Sat, Jan 11, 12:33 AM
Unknown Object (File)
Tue, Dec 31, 4:09 AM
Unknown Object (File)
Tue, Dec 31, 4:09 AM
Unknown Object (File)
Tue, Dec 31, 4:09 AM
Unknown Object (File)
Tue, Dec 31, 4:09 AM
Unknown Object (File)
Tue, Dec 31, 4:09 AM
Unknown Object (File)
Tue, Dec 31, 4:07 AM
Subscribers

Details

Summary

Final part of blob cleanup logic - added deletion of S3 objects.

Depends on D9352.

Test Plan

Locally, uploaded a few items to S3 with object names hashA, hashB.
Created blob item rows in DDB: one with 'unchecked' attribute (hashA) and the other without (hashB).
No holders were given.
Ran the cleanup: only hashA file was deleted from S3.

Diff Detail

Repository
rCOMM Comm
Lint
No Lint Coverage
Unit
No Test Coverage

Event Timeline

bartek held this revision as a draft.
bartek published this revision for review.Oct 4 2023, 6:49 AM

If I understand correctly this will try to remove S3 blobs that might not exist (for holders can exist without having a blob uploaded and the logic will run for them too). Is this expected?

services/blob/src/service.rs
304 ↗(On Diff #31789)

orphans could be potentially changed to be a HashMap<String /*blob hash*/, Vec<String /*blob holder*/>>. Or maybe even iterate directly over unchecked_items?

If I understand correctly this will try to remove S3 blobs that might not exist (for holders can exist without having a blob uploaded and the logic will run for them too). Is this expected?

This is expected. It causes no harm. Anyway, this is very unlikely scenario, if not impossible, because a DDB row is created after a successful S3 upload.

services/blob/src/service.rs
304 ↗(On Diff #31789)

Iterating directly over unchecked_items makes sense, we could avoid duplicates for free, thanks!

This revision is now accepted and ready to land.Oct 11 2023, 1:05 AM

Address feedback: Optimize S3 path iteration

This revision was automatically updated to reflect the committed changes.