Page MenuHomePhabricator

[blob] Delete S3 objects during cleanup
ClosedPublic

Authored by bartek on Oct 4 2023, 4:10 AM.
Tags
None
Referenced Files
F2104413: D9355.id32020.diff
Tue, Jun 25, 5:01 AM
F2104412: D9355.id31921.diff
Tue, Jun 25, 5:01 AM
F2104411: D9355.id31918.diff
Tue, Jun 25, 5:01 AM
F2104410: D9355.id31628.diff
Tue, Jun 25, 5:01 AM
F2104409: D9355.id31789.diff
Tue, Jun 25, 5:01 AM
F2104373: D9355.id.diff
Tue, Jun 25, 5:00 AM
F2104335: D9355.diff
Tue, Jun 25, 4:54 AM
Unknown Object (File)
Mon, Jun 24, 5:24 PM
Subscribers

Details

Summary

Final part of blob cleanup logic - added deletion of S3 objects.

Depends on D9352.

Test Plan

Locally, uploaded a few items to S3 with object names hashA, hashB.
Created blob item rows in DDB: one with 'unchecked' attribute (hashA) and the other without (hashB).
No holders were given.
Ran the cleanup: only hashA file was deleted from S3.

Diff Detail

Repository
rCOMM Comm
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

bartek held this revision as a draft.
bartek published this revision for review.Oct 4 2023, 6:49 AM

If I understand correctly this will try to remove S3 blobs that might not exist (for holders can exist without having a blob uploaded and the logic will run for them too). Is this expected?

services/blob/src/service.rs
304 ↗(On Diff #31789)

orphans could be potentially changed to be a HashMap<String /*blob hash*/, Vec<String /*blob holder*/>>. Or maybe even iterate directly over unchecked_items?

If I understand correctly this will try to remove S3 blobs that might not exist (for holders can exist without having a blob uploaded and the logic will run for them too). Is this expected?

This is expected. It causes no harm. Anyway, this is very unlikely scenario, if not impossible, because a DDB row is created after a successful S3 upload.

services/blob/src/service.rs
304 ↗(On Diff #31789)

Iterating directly over unchecked_items makes sense, we could avoid duplicates for free, thanks!

This revision is now accepted and ready to land.Oct 11 2023, 1:05 AM

Address feedback: Optimize S3 path iteration

This revision was automatically updated to reflect the committed changes.