Page MenuHomePhabricator

[blob] Delete S3 objects during cleanup
ClosedPublic

Authored by bartek on Oct 4 2023, 4:10 AM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Jun 14, 3:50 AM
Unknown Object (File)
Tue, Jun 11, 7:28 PM
Unknown Object (File)
Tue, Jun 11, 7:28 PM
Unknown Object (File)
Wed, Jun 5, 11:34 PM
Unknown Object (File)
Sat, May 25, 7:20 PM
Unknown Object (File)
May 4 2024, 6:24 PM
Unknown Object (File)
May 4 2024, 6:24 PM
Unknown Object (File)
May 4 2024, 6:24 PM
Subscribers

Details

Summary

Final part of blob cleanup logic - added deletion of S3 objects.

Depends on D9352.

Test Plan

Locally, uploaded a few items to S3 with object names hashA, hashB.
Created blob item rows in DDB: one with 'unchecked' attribute (hashA) and the other without (hashB).
No holders were given.
Ran the cleanup: only hashA file was deleted from S3.

Diff Detail

Repository
rCOMM Comm
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

bartek held this revision as a draft.
bartek published this revision for review.Oct 4 2023, 6:49 AM

If I understand correctly this will try to remove S3 blobs that might not exist (for holders can exist without having a blob uploaded and the logic will run for them too). Is this expected?

services/blob/src/service.rs
304 ↗(On Diff #31789)

orphans could be potentially changed to be a HashMap<String /*blob hash*/, Vec<String /*blob holder*/>>. Or maybe even iterate directly over unchecked_items?

If I understand correctly this will try to remove S3 blobs that might not exist (for holders can exist without having a blob uploaded and the logic will run for them too). Is this expected?

This is expected. It causes no harm. Anyway, this is very unlikely scenario, if not impossible, because a DDB row is created after a successful S3 upload.

services/blob/src/service.rs
304 ↗(On Diff #31789)

Iterating directly over unchecked_items makes sense, we could avoid duplicates for free, thanks!

This revision is now accepted and ready to land.Oct 11 2023, 1:05 AM

Address feedback: Optimize S3 path iteration

This revision was automatically updated to reflect the committed changes.