Page MenuHomePhabricator

[services] Tests - Add performance tests base for Blob
ClosedPublic

Authored by karol on Jul 20 2022, 4:15 AM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Dec 19, 2:00 AM
Unknown Object (File)
Mon, Dec 16, 7:19 AM
Unknown Object (File)
Mon, Dec 16, 7:19 AM
Unknown Object (File)
Mon, Dec 16, 7:19 AM
Unknown Object (File)
Mon, Dec 16, 7:19 AM
Unknown Object (File)
Mon, Dec 16, 7:19 AM
Unknown Object (File)
Mon, Dec 16, 7:19 AM
Unknown Object (File)
Mon, Dec 16, 7:19 AM

Details

Summary

Depends on D4580

This is the base code for the performance tests for the Blob service.

I used tokio here as I was having problems with thread::spawn thread::join working with the async closures - @varun have you done anything like this? I'm not saying that tokio's bad here - there's some additional boilerplate but not too much - acceptable I'd say.

The idea here is to simultaneously run service calls (they're async) in multiple threads.

Test Plan
cd services
yarn run-performance-tests blob

It doesn't do anything with the service yet but should compile and pass.

Diff Detail

Repository
rCOMM Comm
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

karol edited the test plan for this revision. (Show Details)
karol added reviewers: tomek, varun.
karol added a subscriber: varun.
services/commtest/tests/blob/blob_utils.rs
8 ↗(On Diff #14680)

This could also be moved to the next diff. It was needed as I wanted to use the blob data collection in the different threads.

services/commtest/tests/blob_performance_test.rs
3–8 ↗(On Diff #14680)

We may want to remove them from here and add them in the following diffs lazily.

27 ↗(On Diff #14680)

We may want to have this value specified from the outside, what do you think? Optionally, this could fall back to a specified default value.

48 ↗(On Diff #14680)

The logic's going to be added in one of the following diffs.

53 ↗(On Diff #14680)

The logic's going to be added in one of the following diffs.

58 ↗(On Diff #14680)

The logic's going to be added in one of the following diffs.

karol edited the summary of this revision. (Show Details)

clean up

tomek requested changes to this revision.Jul 21 2022, 4:32 AM
tomek added inline comments.
services/commtest/tests/blob_performance_test.rs
21 ↗(On Diff #14691)

What do you think about using more functional approach?

let blob_data: Vec<BlobData> = (0..NUMBER_OF_THREADS).map(|i| ...).collect()
22 ↗(On Diff #14691)

Why do we take modulo 10? What is special about this number?

27 ↗(On Diff #14680)

This sounds like a good idea. I think that usually https://crates.io/crates/num_cpus might be a good default, but it depends on where these tests are run - on CI we might want to have something different.

This revision now requires changes to proceed.Jul 21 2022, 4:32 AM
karol added inline comments.
services/commtest/tests/blob_performance_test.rs
21 ↗(On Diff #14691)

We can do this, does that change anything? I think it is more readable the way it is now.

22 ↗(On Diff #14691)

I just didn't want any of the values below (the results of subtractions)(ByteSize::kib(200 + (300 - index * 20)).as_u64() as usize,) to get "flipped" (so they would go negative and the numbers would flip because they're unsigned) no matter how many threads we'd launch.

27 ↗(On Diff #14680)

Sorry but I'm confused what does the number of CPUs have to do with how many threads we want to spawn? I think we may be good with a default value of just 3 or 5 hardcoded.

receive number of threads from the outside

tomek requested changes to this revision.Jul 22 2022, 2:28 AM
tomek added inline comments.
services/commtest/tests/blob_performance_test.rs
21 ↗(On Diff #14691)

We can do this, does that change anything?

It doesn't change the logic but it changes how the code looks like - it's just a refactoring.

I think it is more readable the way it is now.

Ok, fair enough.

22 ↗(On Diff #14691)

Could you explain what is the benefit of having these formulas for sizes at the first place? You explained what is the reason for having this modulo, which makes some sense (I think we could have different ones that use addition instead of subtraction, but that's another thing).

  1. Why do we need to have different sizes for each thread? Is it a kind of sanity check? These tests should check the performance not the correctness, so I'm not sure if this is that important. Also, even when the sizes were equal, what would be the downside?
  2. When different threads have different sizes, our results could be wrong. We could conclude that e.g. the performance per thread grows with each thread, which doesn't make sense, but with chunk being lower with every new thread, would be our conclusion. Can we find a formula that gives at least the same average size for each thread?
27 ↗(On Diff #14680)

Sorry but I'm confused what does the number of CPUs have to do with how many threads we want to spawn? I think we may be good with a default value of just 3 or 5 hardcoded.

This is surprising. When choosing the right number of threads you can expect the speedup to degrade when exceeding the number of cores. It happens because when there are more threads than cores, some threads would need to wait for the physical resources to execute. That's why it is usually a good idea to go with number of cores, less is ok, more is usually not a good idea as it introduces performance bottleneck.

This revision now requires changes to proceed.Jul 22 2022, 2:28 AM
services/commtest/tests/blob_performance_test.rs
22 ↗(On Diff #14691)

I wanted to go with subtraction instead of addition because I wanted to have more data in the first threads spawned so there's more probable for the threads to overlap - if we first spawn threads with fewer data and then increase it with the following threads, then they may not overlap because the first ones will finish before the following ones.

  1. Sorry but I disagree. These tests should check performance and correctness as well. We want to check if there are situations when something goes wrong when we spawn a lot of overlapping threads. If something doesn't upload correctly - we should catch it.

We can have the same sizes but I wanted them to be different so the pattern of thread overlapping would be more complex and, therefore, more cases would be checked.

  1. The size doesn't get smaller with every new thread. Please note that the index variable is calculated with the modulo - % 10 so every 10th thread the pattern (the amount of data) is being reset. That gives a stable average, right?

If you still think that we should go with the constant amount of data for every thread, I think we can do this but I don't really think that having different sizes hurts that much and I think it makes the tests check more cases but I may be wrong.

27 ↗(On Diff #14680)

I see. It's a bit outside of the goal here I think - the goal is to measure the maximum capacity of the system, not to run this program as efficiently as possible but we can do this, it's ok.

address feedback - number of cores

services/commtest/tests/blob_performance_test.rs
17–22 ↗(On Diff #14805)

I tried to use unwrap_or but for some reason, it didn't work :/

let number_of_threads: usize = env::var("COMM_NUMBER_OF_THREADS")
  .unwrap_or(num_cpus::get().to_string())
  .parse::<usize>()
  .unwrap();
tomek requested changes to this revision.Jul 25 2022, 2:46 AM
tomek added a reviewer: ashoat.

Adding @ashoat because there's a new dependency

services/commtest/tests/blob_performance_test.rs
8 ↗(On Diff #14805)

Why do you use extern crate? Can't we use use instead?

19–22 ↗(On Diff #14805)

If we can avoid having mut, we should do that. In this case we can either use if's return value or use match expression.

This revision now requires changes to proceed.Jul 25 2022, 2:46 AM

Dependency seems reasonable, haven't reviewed the rest

services/commtest/tests/blob_performance_test.rs
8 ↗(On Diff #14805)

use works. For me, there's no difference so I will change that. Originally, I was following docs AFAIR.

19–22 ↗(On Diff #14805)

Good idea, thanks!

tomek added inline comments.
services/commtest/tests/blob_performance_test.rs
8 ↗(On Diff #14805)

According to https://doc.rust-lang.org/edition-guide/rust-2018/path-changes.html

extern crate is no longer needed in 99% of circumstances.

It looks like it had to be used before rust-2018 but is no longer required.

services/scripts/run_performance_tests.sh
20 ↗(On Diff #14931)

Maybe mention that the default value is number of cpus?

This revision is now accepted and ready to land.Jul 26 2022, 4:18 AM
services/scripts/run_performance_tests.sh
20 ↗(On Diff #14931)

We can do it, ok.

mention that the default value is the number of CPUs