Page MenuHomePhabricator

Implement data structure and methods to start and continue gRPC multimedia download streaming
AbandonedPublic

Authored by marcin on Sep 5 2022, 11:43 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Nov 9, 4:40 PM
Unknown Object (File)
Sat, Nov 9, 7:40 AM
Unknown Object (File)
Tue, Nov 5, 11:27 AM
Unknown Object (File)
Oct 10 2024, 2:25 PM
Unknown Object (File)
Oct 10 2024, 2:25 PM
Unknown Object (File)
Oct 10 2024, 2:25 PM
Unknown Object (File)
Oct 10 2024, 2:25 PM
Unknown Object (File)
Oct 10 2024, 2:25 PM

Details

Summary

This differential implements data structure that holds multimedia download stream internal state and methods to initialize download stream and continue it unit there are no data to be downloaded. Once download stream has been initialized, caller can download chunks of data one-by-one into provided buffer. Once there are not data to be downloaded blob service resumes stream and pull_chunk method returns false, without adding any data to the buffer.

Test Plan

Start blob service locally, then add the following code at the end of lib.rs file:

#[tokio::test]
async fn upload_download_test() {
  let holder = "test1".to_string();
  let hash = "test1".to_string();
  let upload_data = vec!["test".to_string(), "test".to_string()];

  let mut upload_state = initialize_upload_state().await.unwrap();

  start_upload(&mut upload_state, holder.clone(), hash)
    .await
    .unwrap();
  for chunk in upload_data.iter() {
    upload_chunk(&mut upload_state, &chunk).await.unwrap();
  }
  resume_upload(upload_state).await.unwrap();

  let mut download_data = Vec::new();
  let mut download_state = initialize_download_state(holder).await.unwrap();

  loop {
    if !pull_chunk(&mut download_state, &mut download_data)
      .await
      .unwrap()
    {
      break;
    }
  }
  assert_eq!(upload_data.join(""), download_data.join(""));
}

Test should pass. No additional steps required.

Diff Detail

Repository
rCOMM Comm
Lint
No Lint Coverage
Unit
No Test Coverage

Event Timeline

marcin edited the test plan for this revision. (Show Details)
tomek requested changes to this revision.Sep 6 2022, 6:17 AM
tomek added inline comments.
native/cpp/CommonCpp/grpc/blob_client/rust/src/lib.rs
128–130 ↗(On Diff #16333)

We're not using holder any more, so maybe we can just move it instead of cloning?

137–139 ↗(On Diff #16333)

Can we directly return DownloadState?

158 ↗(On Diff #16333)

If we were calling this function and expecting to pull some chunks, the fact that server closed a stream sounds like an error instead of Ok result

This revision now requires changes to proceed.Sep 6 2022, 6:17 AM
native/cpp/CommonCpp/grpc/blob_client/rust/src/lib.rs
153 ↗(On Diff #16333)

Are we sure that the chunk is encoded using UTF8? The attempt to decode invalid byte sequence might result in error.

native/cpp/CommonCpp/grpc/blob_client/rust/src/lib.rs
128–130 ↗(On Diff #16333)

Correct.

137–139 ↗(On Diff #16333)

As explained in previous differentials, this is expected to be returned to C++ where we cannot return opaque Rust type.

153 ↗(On Diff #16333)

This function pulls data that was send by upload_chunk. upload_chunk takes data in the form of string String and transforms to Vec<u8> before sending. According to rust docs: https://doc.rust-lang.org/stable/std/string/struct.String.html#method.from_utf8 not every Vec<u8> is valid utf8 but every String is valid utf8. So if the data we are pulling were String once they are valid uft8. If we refactor to use &str instead of &Stringthe reasoning above is still valid (https://doc.rust-lang.org/stable/std/str/fn.from_utf8.html).

158 ↗(On Diff #16333)

We do not know how many chunks of data blob service will send, nor does blob service send a special kind of message that informs us it is done sending data. If there are no more data to be sent blob service closes the stream. Here if we try to pull sone data from the closed stream we get None and we inform user about this fact by returning false from function. The user knows that blob service is done sending data. This function is expected to be called inside the loop as long as it returns true (it adds pulled data to the buffer in this case). When it returns false and no more data is added to the buffer user can assume they received all data from the blob service. If any error occurs in the process on either server or client side it will be propagated from this function (Neither true nor false will be returned)

native/cpp/CommonCpp/grpc/blob_client/rust/src/lib.rs
153 ↗(On Diff #16333)

So everything will work as long as C++ passes correct UTF8 strings to this function. I will research the matter in more detail, whether we can safely make such assumption

marcin edited the test plan for this revision. (Show Details)

Do not clone holder

marcin added inline comments.
native/cpp/CommonCpp/grpc/blob_client/rust/src/lib.rs
153 ↗(On Diff #16333)

I have done a quick research - we cannot guarantee that C++ string (std::string, char* etc.) represents correct UTF-8 bytes. I will try using &[u8] and see how it works with C++ via cxx bridge. If it works fine I will update function signatures to use this type instead.

Return bytes directly instead of passing then to user provided buffer

Brind 'Streaming' to scope

tomek added inline comments.
native/cpp/CommonCpp/grpc/blob_client/rust/src/lib.rs
137 ↗(On Diff #16494)

Use a shorthands where possible

This revision is now accepted and ready to land.Sep 9 2022, 7:41 AM

Refactor after varun's changes

Diff from 2022 what I was assigned encrypted blob upload work.