
[services] Inner function to write to DynamoDB by batch
ClosedPublic

Authored by max on Jun 27 2022, 6:43 PM.
Tags
None

Details

Summary

This adds an inner function to the services database layer that writes to DynamoDB in batches.

Batch write can be used for batch putting or batch deleting of items (D4220). Batch operations are much more efficient than putting/deleting items one by one, both in terms of network and database latency and in terms of DynamoDB WCU (Write Capacity Units) usage per second on a single partition.

Because DynamoDB limits the number of items per batch operation, this function splits the write requests into chunks sized by the constant from D4372 if needed and sends them to DynamoDB chunk by chunk.
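
To make the shape of this concrete, here is a minimal sketch of the chunking approach using the AWS SDK for C++. The function signature, constant value, and client handling are assumptions for illustration, not the exact code in this diff:

#include <algorithm>
#include <stdexcept>
#include <vector>

#include <aws/dynamodb/DynamoDBClient.h>
#include <aws/dynamodb/model/BatchWriteItemRequest.h>
#include <aws/dynamodb/model/WriteRequest.h>

// DynamoDB allows at most 25 write requests per BatchWriteItem call.
const size_t MAX_DYNAMODB_BATCH_ITEMS = 25;

void innerBatchWriteItem(
    Aws::DynamoDB::DynamoDBClient *client,
    const Aws::String &tableName,
    const std::vector<Aws::DynamoDB::Model::WriteRequest> &writeRequests) {
  for (size_t i = 0; i < writeRequests.size(); i += MAX_DYNAMODB_BATCH_ITEMS) {
    // Index one past the last element of the current chunk; std::min lets
    // the general case also cover the final, possibly shorter, chunk.
    const size_t last =
        std::min(writeRequests.size(), i + MAX_DYNAMODB_BATCH_ITEMS);
    Aws::Vector<Aws::DynamoDB::Model::WriteRequest> chunk(
        writeRequests.begin() + i, writeRequests.begin() + last);
    Aws::DynamoDB::Model::BatchWriteItemRequest request;
    request.AddRequestItems(tableName, chunk);
    auto outcome = client->BatchWriteItem(request);
    if (!outcome.IsSuccess()) {
      throw std::runtime_error(outcome.GetError().GetMessage());
    }
  }
}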

Linear task: ENG-1302

Test Plan

Run yarn test-tunnelbroker-service-dev-mode; all tests pass.

Diff Detail

Repository
rCOMM Comm
Branch
add-innerBatchWriteItem
Lint
No Lint Coverage
Unit
No Test Coverage

Event Timeline

services/lib/src/DatabaseManagerBase.cpp
51–61

Please simplify the code to avoid the duplication. In this case it shouldn't be necessary to handle the case where writeRequests.size() <= MAX_DYNAMODB_BATCH_ITEMS as a special case: the general code should handle that well - just like it can handle the last batch.

67

Why do we use int here?

68–74

This code can be simplified by creating a variable that holds an index of the last element last = std::min(writeRequests.size(), i + MAX_DYNAMODB_BATCH_ITEMS).

This revision now requires changes to proceed. Jun 28 2022, 2:44 AM
max marked 3 inline comments as done.

Updates based on @palys-swm's comments.

services/lib/src/DatabaseManagerBase.cpp
51–61

Please simplify the code to avoid the duplication. In this case it shouldn't be necessary to handle the case where writeRequests.size() <= MAX_DYNAMODB_BATCH_ITEMS as a special case: the general code should handle that well - just like it can handle the last batch.

Maybe this was over-engineering (over-optimization) here... I've removed this special case.

67

Why do we use int here?

Switched to size_t to match the current codebase. Thanks.

68–74

This code can be simplified by creating a variable that holds an index of the last element last = std::min(writeRequests.size(), i + MAX_DYNAMODB_BATCH_ITEMS).

Good catch! Updated with this much simpler approach.

The code looks OK, only some inline comments and an issue from CI.

services/lib/src/DatabaseManagerBase.cpp
53–61 ↗(On Diff #13881)

You can simplify the code by avoiding variable reuse and using auto

65–68 ↗(On Diff #13881)

outcome isn't used outside of this loop, so we can declare it here

This revision is now accepted and ready to land. Jun 28 2022, 4:51 AM

Changed the chunk size to a function parameter, because this is a common library function and the size should be passed in from the point of usage.
Fixed build errors in the blob and backup services.

In D4373#123609, @palys-swm wrote:

The code looks OK, only some inline comments and an issue from CI.

Thanks! CI issues were already fixed.

services/lib/src/DatabaseManagerBase.cpp
53–61 ↗(On Diff #13881)

You can simplify the code by avoiding variable reuse and using auto

I'm not a big fan of using auto where the type can be declared explicitly. Also, reusing the same variables is part of a loop optimization here (we don't create new ones inside the loop; we reuse the already allocated memory).

65–68 ↗(On Diff #13881)

outcome isn't used outside of this loop, so we can declare it here

Same as above. I don't think we should use auto where the type can be written explicitly, and allocating this variable once and reusing it is the more optimized approach from my perspective.

services/lib/src/DatabaseManagerBase.cpp
53–61 ↗(On Diff #13881)

Also, reusing the same variables is part of a loop optimization here

Have you measured the performance difference? I don't think this type of premature optimization is beneficial - if the variable is used only locally, it is easier to reason about it: you don't have to think about its initial value and about possible usages after the loop. Also, having a variable defined inside a loop might enable some optimization (I'm guessing now). Currently, we only reuse the name - new variables are constructed in every iteration.

So overall, we have a lot more important performance issues, and saving a couple of machine instructions wouldn't be noticeable. On the other hand, from a readability point of view, this is a noticeable difference - you are certain that the variable is used only where it is valid.

max edited the summary of this revision. (Show Details)
tomek requested changes to this revision. Jun 29 2022, 5:05 AM
tomek added inline comments.
services/lib/src/DatabaseManagerBase.cpp
66 ↗(On Diff #13885)

We should check what the outcome is: https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchWriteItem.html

If any requested operations fail because the table's provisioned throughput is exceeded or an internal processing failure occurs, the failed operations are returned in the UnprocessedItems response parameter. You can investigate and optionally resend the requests. Typically, you would call BatchWriteItem in a loop. Each iteration would check for unprocessed items and submit a new BatchWriteItem request with those unprocessed items until all items have been processed.
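
A minimal sketch of the loop this documentation describes, using the AWS SDK for C++ (the surrounding function is an assumption; backoff between attempts is discussed further below):

#include <stdexcept>

#include <aws/dynamodb/DynamoDBClient.h>
#include <aws/dynamodb/model/BatchWriteItemRequest.h>

void batchWriteWithRetries(
    Aws::DynamoDB::DynamoDBClient *client,
    Aws::DynamoDB::Model::BatchWriteItemRequest request) {
  while (true) {
    auto outcome = client->BatchWriteItem(request);
    if (!outcome.IsSuccess()) {
      throw std::runtime_error(outcome.GetError().GetMessage());
    }
    // Throttled writes come back in UnprocessedItems; resubmit exactly
    // those items until the map is empty.
    const auto &unprocessed = outcome.GetResult().GetUnprocessedItems();
    if (unprocessed.empty()) {
      break;
    }
    request.SetRequestItems(unprocessed);
  }
}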

This revision now requires changes to proceed. Jun 29 2022, 5:05 AM
max marked 2 inline comments as done.
max marked an inline comment as done.
max edited the summary of this revision. (Show Details)

Additional processing of the throttled items was added.

services/lib/src/DatabaseManagerBase.cpp
66 ↗(On Diff #13885)

We should check what the outcome is: https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchWriteItem.html

If any requested operations fail because the table's provisioned throughput is exceeded or an internal processing failure occurs, the failed operations are returned in the UnprocessedItems response parameter. You can investigate and optionally resend the requests. Typically, you would call BatchWriteItem in a loop. Each iteration would check for unprocessed items and submit a new BatchWriteItem request with those unprocessed items until all items have been processed.

This is a good catch; we should consider this scenario.
Additional processing of the throttled items was added.

tomek requested changes to this revision. Jul 1 2022, 8:32 AM
tomek added inline comments.
services/lib/src/DatabaseManagerBase.cpp
66 ↗(On Diff #13885)

Have you checked the document I linked? There's a note there, highlighted in red and marked as important:

If DynamoDB returns any unprocessed items, you should retry the batch operation on those items. However, we strongly recommend that you use an exponential backoff algorithm. If you retry the batch operation immediately, the underlying read or write requests can still fail due to throttling on the individual tables. If you delay the batch operation using exponential backoff, the individual requests in the batch are much more likely to succeed.

Are we sure that this strong recommendation doesn't apply to us?

This revision now requires changes to proceed. Jul 1 2022, 8:32 AM

Added an exponential backoff delay with jitter to the batchWrite function implementation.

max added inline comments.
services/lib/src/DatabaseManagerBase.cpp
66 ↗(On Diff #13885)

Have you checked the document I linked? There's a note there, highlighted in red and marked as important:

If DynamoDB returns any unprocessed items, you should retry the batch operation on those items. However, we strongly recommend that you use an exponential backoff algorithm. If you retry the batch operation immediately, the underlying read or write requests can still fail due to throttling on the individual tables. If you delay the batch operation using exponential backoff, the individual requests in the batch are much more likely to succeed.

Are we sure that this strong recommendation doesn't apply to us?

This is a good catch. I've dug into this, and it seems there is a chance of losing some items if we don't check for unprocessed items. I've added the check for them, plus an exponential backoff delay with jitter. Btw, the Google Cloud documentation describes such an approach too.

max marked an inline comment as done.

Fixing a space.

karol added inline comments.
services/lib/src/DatabaseManagerBase.cpp
86–87 ↗(On Diff #14181)
  1. Why do we wait for a variable amount of time? Why isn't it constant?
  2. Where does this formula come from? How did you come up with this?
91–93 ↗(On Diff #14181)

Why do we first wait and only then attempt to perform a DB operation? Shouldn't we go in the opposite direction?
What we do:

outcome = BatchWriteItem();
while(!outcome.empty()) {
  sleep(delay);
  outcome = BatchWriteItem();
}

I think we could avoid repeating these lines:

outcome = getDynamoDBClient()->BatchWriteItem(writeBatchRequest);
if (!outcome.IsSuccess()) {
  throw std::runtime_error(outcome.GetError().GetMessage());
}

What I'd do:

outcome = nullptr;
do {
  outcome = BatchWriteItem();
  sleep(delay);
} while(outcome != nullptr && !outcome.empty());

Maybe I missed something, but I think this code structure could be improved.

This revision now requires changes to proceed. Jul 5 2022, 6:30 AM
max marked 2 inline comments as done.
max added inline comments.
services/lib/src/DatabaseManagerBase.cpp
86–87 ↗(On Diff #14181)
  1. Why do we wait for a variable amount of time? Why isn't it constant?
  2. Where does this formula come from? How did you come up with this?
  1. The waiting time should change exponentially, growing with every attempt; that's also why rewriting the current delay variable instead of creating a new one every iteration is more efficient.
  2. This is a common formula; for example, it's used in the Google Cloud docs. We increase the delay exponentially, which raises the probability that the request succeeds.

There are two common ways to implement an exponential backoff delay: with or without jitter. Random jitter is added to avoid waves of simultaneous requests.

91–93 ↗(On Diff #14181)

Why do we first wait and only then attempt to perform a DB operation? Shouldn't we go in the opposite direction?

In this part of the code, we're resending unprocessed items that were left over from the previous write request due to AWS throttling. If we didn't wait first, this follow-up request would be made immediately after the request that was just throttled. Executing a second request right after one that AWS has throttled is useless; we need to wait before it.

What I'd do:

outcome = nullptr;
do {
  outcome = BatchWriteItem();
  sleep(delay);
} while(outcome != nullptr && !outcome.empty());

As I've commented above, we must wait first when processing follow-up requests with the unprocessed items.

tomek requested changes to this revision. Jul 8 2022, 2:13 AM
tomek added inline comments.
services/lib/src/DatabaseManagerBase.cpp
86–87 ↗(On Diff #14181)

This is a part of exponential backoff, which indeed should make the delay grow exponentially with the number of retries, but it isn't implemented correctly. If you think about the units here, you can see what's wrong: backoffStep is expressed in time units, e.g. in ms, and we raise it to a power, which results in ms for the first iteration, ms^2 for the second, and so on. It doesn't make sense to express a delay in time squared, does it?

A correct formula should include exponentiation of a unitless quantity, e.g. a constant factor of 2. Something like:
backoffStep * std::pow(2, delayRetry) + jitterMs

As a side note, we need to check if the algorithm isn't included in the SDK - see D4436
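
A sketch of a delay computed with this corrected formula, including a cap and random jitter; the constant values here are assumptions for illustration:

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <random>

const double BACKOFF_STEP_MS = 100.0;  // assumed base step
const double MAX_BACKOFF_MS = 10000.0; // assumed upper bound
const double MAX_JITTER_MS = 100.0;    // assumed jitter range

double backoffDelayMs(size_t retry) {
  // The unitless factor 2^retry scales the base step, so the result
  // stays in ms instead of ms^retry.
  const double exponential = BACKOFF_STEP_MS * std::pow(2.0, retry);
  // Random jitter spreads retries out so throttled clients don't all
  // hit the table again at the same moment.
  static thread_local std::mt19937 gen{std::random_device{}()};
  std::uniform_real_distribution<double> jitter(0.0, MAX_JITTER_MS);
  return std::min(exponential + jitter(gen), MAX_BACKOFF_MS);
}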

This revision now requires changes to proceed. Jul 8 2022, 2:13 AM
max marked 2 inline comments as done.

The algorithm of the backoff delay is changed. The function parameter is changed.

max added inline comments.
services/lib/src/DatabaseManagerBase.cpp
86–87 ↗(On Diff #14181)

This is a part of exponential backoff, which indeed should make the delay grow exponentially with the number of retries, but it isn't implemented correctly. If you think about the units here, you can see what's wrong: backoffStep is expressed in time units, e.g. in ms, and we raise it to a power, which results in ms for the first iteration, ms^2 for the second, and so on. It doesn't make sense to express a delay in time squared, does it?

A correct formula should include exponentiation of a unitless quantity, e.g. a constant factor of 2. Something like:
backoffStep * std::pow(2, delayRetry) + jitterMs

There are some examples and articles that use the current algorithm, but I don't mind changing it, especially since the AWS documentation suggests using this formula in an example instead of the current one.
It was changed!

As a side note, we need to check if the algorithm isn't included in the SDK - see D4436

Replied to that comment there.

Thanks for digging deep into the problem and for the helpful comments ;)

services/lib/src/DatabaseManagerBase.cpp
86–87 ↗(On Diff #14181)

Good point - have you done any research on whether it is maybe available in a library that we already use?

As for the formula - I'm not sure I understand how the formula you provided is different from the one that's already here.

services/lib/src/DatabaseManagerBase.cpp
91–93 ↗(On Diff #14181)

You missed the point of my comment. My intention was to keep all the BatchWriteItem calls in the loop. But I now see that it would probably not improve the code, as we have to distinguish the initial request from the rest of the requests, which are made for unprocessed items. So I think we're good here.

max added inline comments.
services/lib/src/DatabaseManagerBase.cpp
86–87 ↗(On Diff #14181)

Good point - have you done any research on whether it is maybe available in a library that we already use?

The AWS C++ SDK has an internal RetryStrategy that implements a backoff delay algorithm - the same one @palys-swm suggested using instead of the original. It's used inside the AWS client itself, but for BatchWrite the AWS documentation and SDK examples suggest we implement this delay and loop ourselves.

By the way, it would be a good idea to move this backoff delay algorithm into the shared lib methods (e.g. GlobalTools) and use it for the delays between reconnects in the AMQP/RabbitMQ client in the Tunnelbroker.

I've created a follow-up for this as ENG-1381.
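
For context, a sketch of how the SDK-level strategy mentioned above is wired into a client. It retries whole failed requests internally, but UnprocessedItems arrive in successful responses, so the explicit loop is still needed for BatchWriteItem. The constructor arguments follow the SDK's documented defaults and are assumptions here:

#include <memory>

#include <aws/core/client/ClientConfiguration.h>
#include <aws/core/client/DefaultRetryStrategy.h>
#include <aws/dynamodb/DynamoDBClient.h>

Aws::DynamoDB::DynamoDBClient makeClient() {
  Aws::Client::ClientConfiguration config;
  // The SDK applies exponential backoff to requests that fail outright;
  // it does not resubmit items reported via UnprocessedItems.
  config.retryStrategy =
      std::make_shared<Aws::Client::DefaultRetryStrategy>(
          /* maxRetries */ 10, /* scaleFactor */ 25);
  return Aws::DynamoDB::DynamoDBClient(config);
}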

tomek added inline comments.
services/lib/src/DatabaseManagerBase.cpp
79 ↗(On Diff #14383)

I'm pretty sure you haven't tested this condition, because you're assigning here instead of comparing. And even if == were used, it would still result in an infinite loop - >= should be used instead.

This revision is now accepted and ready to land. Jul 18 2022, 6:39 AM
max added inline comments.
services/lib/src/DatabaseManagerBase.cpp
79 ↗(On Diff #14383)

I'm pretty sure you haven't tested this condition, because you're assigning here instead of comparing.

Oh, thanks a lot! That's a nit. I'll fix it.

And even if == were used, it would still result in an infinite loop - >= should be used instead.

The == comparison is used here because we are using std::min at line 86. That's why delayMs cannot be greater than maxBackoffTime, and that's why I've used == instead of >=: since delayMs can never exceed the maximum, >= doesn't make sense here.
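
A sketch of the shape being discussed, with the nit fixed (names follow the thread; the constant value is an assumption). Because std::min clamps the delay, it can equal maxBackoffTime but never exceed it, so == is a valid boundary check once = is corrected to ==:

#include <algorithm>
#include <cstdint>

const int64_t maxBackoffTime = 10000; // ms, assumed value

bool nextDelayIsCapped(int64_t &delayMs) {
  // The clamp guarantees delayMs <= maxBackoffTime, so equality (==),
  // not assignment (=), is the intended stop-growing check.
  delayMs = std::min(delayMs * 2, maxBackoffTime);
  return delayMs == maxBackoffTime;
}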

Fixing a nit: the comparison at if (delayMs = maxBackoffTime)...:79. Also rebasing on master.