Page MenuHomePhabricator

[services] [3/n] Rust opensearch indexing lambda
ClosedPublic

Authored by will on Nov 20 2023, 6:36 PM.
Tags
None
Referenced Files
F3664928: D9936.id34753.diff
Sun, Jan 5, 7:08 PM
Unknown Object (File)
Sat, Dec 28, 1:43 PM
Unknown Object (File)
Mon, Dec 23, 11:36 AM
Unknown Object (File)
Mon, Dec 23, 11:36 AM
Unknown Object (File)
Mon, Dec 23, 11:36 AM
Unknown Object (File)
Mon, Dec 23, 11:36 AM
Unknown Object (File)
Mon, Dec 23, 11:36 AM
Unknown Object (File)
Mon, Dec 23, 11:36 AM
Subscribers

Details

Summary

A lambda written in Rust that consumes a dynamodb stream for identity-users and indexes users into the identity-search OpenSearch instance.

Depends on D9877

Test Plan

Successfully built using cargo lambda and deployed with terraform. Confirmed manually the lambda is syncing the stream.

Diff Detail

Repository
rCOMM Comm
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

will retitled this revision from [services] Rust opensearch indexing lambda to [services] [4/n] Rust opensearch indexing lambda.Nov 20 2023, 6:38 PM
Harbormaster returned this revision to the author for changes because remote builds failed.Nov 20 2023, 6:53 PM
Harbormaster failed remote builds in B24294: Diff 33446!
will retitled this revision from [services] [4/n] Rust opensearch indexing lambda to [services] [3/n] Rust opensearch indexing lambda.Nov 21 2023, 12:15 AM
will edited the summary of this revision. (Show Details)
will edited the summary of this revision. (Show Details)
will requested review of this revision.Nov 21 2023, 8:09 PM
bartek requested changes to this revision.Nov 23 2023, 3:26 AM
bartek added inline comments.
services/search-index-lambda/Cargo.toml
15 ↗(On Diff #33477)

In other places of our codebase, we use the tracing crate instead of log.
Is it possible to use it in lambda?

17 ↗(On Diff #33477)

leftover?

services/search-index-lambda/src/main.rs
55 ↗(On Diff #33477)

We tend to use expect() instead of unwrap to give some context what went wrong

72–83 ↗(On Diff #33477)

Tip:
It's much more elegant to create a dedicated types to what we want to operate on.

Instead of operating directly on serde_json::Value you can create some types with Deserialize trait

#[derive(Deserialize)]
enum EventName {
 INSERT,
 MODIFY,
 REMOVE
}

#[derive(Deserialize)]
struct Record {
  dynamodb: Map<String, Value>, // you can create dedicated types for this one too
  event_name: EventName,
}

type EventPayload = Vec<Record>;

Then you can try using LambdaEvent<EventPayload> directly, or if that doesn't work, try doing

let (payload, _context) = event.into_parts();
let records: EventPayload = serde_json::from_value(payload).expect("Failed to deserialize payload");

Also, since this looks like a DDB Stream payload (am I right?), aren't there already predefined types for this?
I found this one. The problem is that it doesn't implement the serde::Deserialize trait so you'd need to do something described in this doc

159 ↗(On Diff #33477)

Please avoid committing commented-out code

This revision now requires changes to proceed.Nov 23 2023, 3:26 AM

Use tracing and struct deserialization

I tried my hand at using remote derive to implement the Deserialization trait. However, many of the structs are non-exhaustive, preventing usage of constructors outside of the crate. This means I'm unable to implement structs with the remote attribute. Tried looking but I don't know if there's any solutions to this.

Instead, I just opted to writing my own structs.

Replace unwraps with expect for serializations

  • Replaced all unwraps with expects
  • Removed all prior leftover comments
  • Utilized dedicated types for deserialization of payload
  • Replaced usage of log with tracing
  • Moved constants to their own separate file
bartek added 1 blocking reviewer(s): varun.

Thanks for addressing these, good job! I'll let @varun take a look

In D9936#293865, @wyilio wrote:

I tried my hand at using remote derive to implement the Deserialization trait. However, many of the structs are non-exhaustive, preventing usage of constructors outside of the crate. This means I'm unable to implement structs with the remote attribute. Tried looking but I don't know if there's any solutions to this.

Instead, I just opted to writing my own structs.

That's unfortunate but yeah, writing own structs is a reasonable solution. You can optionally add a comment that they correspond to DynamoDB Streams types.

varun requested changes to this revision.Nov 30 2023, 3:58 PM

what happens if we hit an expect and panic? wondering if we need to handle errors more gracefully

.gitignore
29 ↗(On Diff #34001)

i don't think we want to gitignore the lock file

This revision now requires changes to proceed.Nov 30 2023, 3:58 PM

what happens if we hit an expect and panic? wondering if we need to handle errors more gracefully

Since lambda is "serverless", I assume it's one of two scenarios:

  • This single func call fails, unaffecting other function calls that are running at the same time. That's okay
  • This kills the whole function "instance" (kind of VM for func calls) so other simultaneous calls on this instance gets terminated too. This is bad.

I have a feeling that introducing custom error types is kinda overkill here, the codebase is rather small. Do you think that anyhow context would fit better here? Then we could map anyhow::Error to lambda_runtime::Error in main(). In fact this is just Box<dyn Error>.

services/search-index-lambda/src/main.rs
120 ↗(On Diff #34001)

what happens if we hit an expect and panic? wondering if we need to handle errors more gracefully

Since lambda is "serverless", I assume it's one of two scenarios:

  • This single func call fails, unaffecting other function calls that are running at the same time. That's okay
  • This kills the whole function "instance" (kind of VM for func calls) so other simultaneous calls on this instance gets terminated too. This is bad.

I have a feeling that introducing custom error types is kinda overkill here, the codebase is rather small. Do you think that anyhow context would fit better here? Then we could map anyhow::Error to lambda_runtime::Error in main(). In fact this is just Box<dyn Error>.

A single error from the lambda can block further processing of other entries for up to a day by default.

We can configure this behavior in our terraform configuration. There are a few options:

  1. Setting smaller batches sizes so that errors get isolated so we can quickly move onto new batches
  2. Setting retry limits / maximum age limit

I think we could definitely use anyhow

Update with blob type support and address comments

add anyhow for error handling

varun requested changes to this revision.Dec 18 2023, 11:08 AM

Not sure why we've converted expect()'s to unwrap()'s.

We can configure this behavior in our terraform configuration. There are a few options:
Setting smaller batches sizes so that errors get isolated so we can quickly move onto new batches
Setting retry limits / maximum age limit

is there a task tracking this?

services/search-index-lambda/src/main.rs
281 ↗(On Diff #34754)

confused by this. shouldn't it read "failed to get old image"

This revision now requires changes to proceed.Dec 18 2023, 11:08 AM

Revert unwraps back to expect and old image comment

In D9936#300551, @varun wrote:

We can configure this behavior in our terraform configuration. There are a few options:
Setting smaller batches sizes so that errors get isolated so we can quickly move onto new batches
Setting retry limits / maximum age limit

is there a task tracking this?

When we use anyhow and Result instead of panicking, this doesn't matter. Returning Err from main() doesn't kill the whole instance (batch?), just fails a single invocation.

In D9936#300551, @varun wrote:

Not sure why we've converted expect()'s to unwrap()'s.

We can configure this behavior in our terraform configuration. There are a few options:
Setting smaller batches sizes so that errors get isolated so we can quickly move onto new batches
Setting retry limits / maximum age limit

is there a task tracking this?

I've addressed part of this in [4/n] with:

resource "aws_lambda_function_event_invoke_config" "example" {
  function_name                = aws_lambda_function.search_index_lambda.function_name
  maximum_event_age_in_seconds = 60
  maximum_retry_attempts       = 2
}

When running prior batches, I was encountering problems with it running with too many retry attempts. Here's an issue I modified with the current context:
https://linear.app/comm/issue/ENG-5491/setup-error-handling-with-aws-lambda-function

This revision is now accepted and ready to land.Dec 29 2023, 5:23 AM