Page MenuHomePhabricator

[services] Tests - Backup - Calculate a real hash for backups/logs
ClosedPublic

Authored by karol on Jul 28 2022, 6:35 AM.
Tags
None
Referenced Files
F2194892: D4667.id15211.diff
Fri, Jul 5, 8:29 AM
F2189404: D4667.id.diff
Thu, Jul 4, 10:41 AM
Unknown Object (File)
Wed, Jul 3, 9:50 AM
Unknown Object (File)
Tue, Jul 2, 6:44 PM
Unknown Object (File)
Tue, Jul 2, 10:05 AM
Unknown Object (File)
Sun, Jun 30, 5:45 PM
Unknown Object (File)
Wed, Jun 26, 7:14 AM
Unknown Object (File)
Wed, Jun 26, 7:14 AM

Details

Summary

Depends on D4636

Calculating and sending real sha512 hash instead of some mocked bytes when creating a new backup and sending logs

Test Plan

term 1

cd services
yarn run-blob-service-in-sandbox

term 2

cd services
yarn run-backup-service-in-sandbox

term 3

cd services
yarn run-integration-tests backup

check in the database that you have real hashes instead of something like ABCD

Diff Detail

Repository
rCOMM Comm
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

karol edited the test plan for this revision. (Show Details)
karol added reviewers: varun, tomek.
tomek requested changes to this revision.Jul 29 2022, 1:37 AM
tomek added a reviewer: ashoat.

It would be better to have an abstraction over hash algorithm so that when we decide to change it, we don't need to modify its usages.

Adding @ashoat because there are new dependencies.

services/commtest/tests/backup/send_log.rs
42–55 ↗(On Diff #15060)

I don't understand this logic. We generate some data and compute its hash. Then we generate data again and send it. If we generate data twice, it should be different every time. Shouldn't we compute hash and send the same data?

This revision now requires changes to proceed.Jul 29 2022, 1:37 AM

It would be better to have an abstraction over hash algorithm so that when we decide to change it, we don't need to modify its usages.

Fair point, we can exclude it to an external place.

services/commtest/tests/backup/send_log.rs
42–55 ↗(On Diff #15060)

I admit this logic could be better but the problem is, we're first required to provide a hash and only then the chunks. I didn't want to keep a lot of data in the memory so I figured it's better to generate the data twice.

The function generate_nbytes generates the same output when called multiple times with the same input, so we're going to send the same data as we calculated the hash for.

My guess is that we've already discussed this and decided to first send the hash and then the data for some reason. If you feel like rediscussing it, I'm open and we can get back to this and discuss switching the order.

If you feel like there should be more context for this, I can spend some time and try to figure out the reasons behind the chosen approach - just let me know.

New dependencies seem reasonable; deferring to other reviewers on the rest of the review

services/commtest/tests/backup/send_log.rs
42–55 ↗(On Diff #15060)

Not sure if this is relevant, but if this leads to increased round-trips between server/client (eg. what we were trying to address in ENG-1052) then I would support revisiting this discussion

tomek requested changes to this revision.Aug 1 2022, 4:45 AM
tomek added inline comments.
services/commtest/tests/backup/send_log.rs
42–55 ↗(On Diff #15060)

Sending the hash first and then the data is good enough at this point. The issue I have has to do with just the test. It isn't obvious that generate_nbytes

generates the same output when called multiple times with the same input

I think this can change at some point. If we are sure that it will not change, then we should change the name to e.g. generate_stable_nbytes - to indicate that repeated calls will return the same value.

But the main reason behind introducing this logic is

I didn't want to keep a lot of data in the memory so I figured it's better to generate the data twice.

which doesn't make too much sense to me. Currently, every call to this method would result in generating a couple of kB of data - it's ok to use that much of memory. And even if we decide to generate more, even hundreds of MB should be fine.

This revision now requires changes to proceed.Aug 1 2022, 4:45 AM

renamse function to stable

services/commtest/tests/backup/send_log.rs
42–55 ↗(On Diff #15060)

I just didn't want to go unlimited with this. Ok, I'm going to update the name to generate_stable_nbytes.

Ideally, this rename should be put in a new diff before this one.

This revision is now accepted and ready to land.Aug 2 2022, 7:15 AM