Page MenuHomePhabricator

[services] Move folly from Blob to Backup base
ClosedPublic

Authored by karol on Feb 15 2022, 5:17 AM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Apr 25, 12:40 AM
Unknown Object (File)
Tue, Apr 23, 5:30 PM
Unknown Object (File)
Tue, Apr 23, 5:30 PM
Unknown Object (File)
Tue, Apr 23, 5:30 PM
Unknown Object (File)
Tue, Apr 23, 5:30 PM
Unknown Object (File)
Tue, Apr 23, 5:30 PM
Unknown Object (File)
Tue, Apr 23, 5:30 PM
Unknown Object (File)
Tue, Apr 23, 5:30 PM

Details

Summary

As we'll need to use folly for the dev mode in the backup as well, I decided to move folly installation from the blob base image to the backup base image

Depends on D3179

Test Plan

Remove backup-base, backup-server, blob-base and blob-server as well as all caches and rebuild the images:

cd services
yarn build-backup-base
yarn build-blob-base
yarn run-blob-service
yarn run-backup-service

services should work as they did before this change.

Diff Detail

Repository
rCOMM Comm
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
ashoat requested changes to this revision.Feb 15 2022, 7:43 PM
ashoat added a reviewer: jim.
ashoat added a subscriber: jim.

Can you remind me again, does the blob base build on the backup base? @jimpo, what's your suggestion for how to share the build config for this dependency across multiple services?

Passing back to you since I need a bit more context before I can accept

This revision now requires changes to proceed.Feb 15 2022, 7:43 PM

OK, a related issue here is that I'd prefer folly to be built and installed as a system library instead of pulled into the CMake build of the service. This both helps for Docker layer caching and simplifying the CMakeLists.txt. I don't care as much whether it's in the service image or the base image, but one argument for service image is that we may want to build folly with different features/flags/build parameters for different services.

Can you remind me again, does the blob base build on the backup base?

Yes.

OK, a related issue here is that I'd prefer folly to be built and installed as a system library instead of pulled into the CMake build of the service. This both helps for Docker layer caching and simplifying the CMakeLists.txt. I don't care as much whether it's in the service image or the base image, but one argument for service image is that we may want to build folly with different features/flags/build parameters for different services.

Can you specify what do you mean exactly? I think folly is installed as a system lib and symlinked to the local dir.

We can refactor services but I think it would be good to do this in a separate diff and get this landed.

ashoat requested changes to this revision.Feb 21 2022, 11:27 PM
ashoat added inline comments.
services/backup/docker-server/contents/server/cmake-components/folly.cmake
12–47 ↗(On Diff #9662)

I don't see this list of files anywhere in the before state, and the diff name/description suggest that the diff is just moving the same build config from the backup image to the base image.

Are you actually changing the build config here? If so, can you please move split this diff into two – one that moves the build config, and one that updates it?

Alternately, if this is getting moved/copied from somewhere, can you please indicate the source?

This revision now requires changes to proceed.Feb 21 2022, 11:27 PM
karol added inline comments.
ashoat requested changes to this revision.Feb 22 2022, 7:15 PM
This comment was removed by ashoat.
This revision now requires changes to proceed.Feb 22 2022, 7:15 PM
This revision is now accepted and ready to land.Feb 22 2022, 7:15 PM

Please link a task that tracks fixing this very strange/confusing dependency chart (one service depends on another). It feels like we've let you be a bit too laissez-faire with some of the Docker services config... the idea that the blob service would extend the backup service honestly seems quite ridiculous to me. If I was more of an expert in Docker, I probably would've felt more comfortable requesting changes on this in the past. Hopefully @jimpo will help us catch these sorts of issues in the future

services/backup/docker-server/contents/server/cmake-components/folly.cmake
12–47 ↗(On Diff #9662)

Wait so why aren't we deleting that file, if it's being moved here? Do we need it in both places long-term? Why is that?

karol added inline comments.
services/blob/docker-base/Dockerfile
3 ↗(On Diff #9836)

error

In D3198#87211, @ashoat wrote:

Please link a task that tracks fixing this very strange/confusing dependency chart (one service depends on another). It feels like we've let you be a bit too laissez-faire with some of the Docker services config... the idea that the blob service would extend the backup service honestly seems quite ridiculous to me. If I was more of an expert in Docker, I probably would've felt more comfortable requesting changes on this in the past. Hopefully @jimpo will help us catch these sorts of issues in the future

https://linear.app/comm/issue/ENG-804/fix-services-images

We discussed why we have such a dependency chain a couple of times I think. I see however that it brings more and more confusion, therefore I'm going to push here some changes that get rid of it and an additional diff that follows the DRY rule.

This revision is now accepted and ready to land.Feb 23 2022, 6:37 AM
ashoat added inline comments.
services/blob/docker-base/contents/install_aws_sdk.sh
5–9 ↗(On Diff #9837)

Pattern of semicolon use here is a bit strange. Looking at existing conventions in the codebase, here are some nits. (If this is copied from somewhere, feel free to fix it in both places in a follow-up diff)

7–8 ↗(On Diff #9837)

Please stick with two-char tabs everywhere. (If this is copied from somewhere, feel free to fix it in both places in a follow-up diff)

This revision now requires review to proceed.Mar 2 2022, 4:48 AM
ashoat requested changes to this revision.Mar 2 2022, 10:29 PM
ashoat added inline comments.
services/blob/docker-base/Dockerfile
13 ↗(On Diff #10011)

In this change, you pull out rm -rf /var/lib/apt/lists/* into its own line. My (limited) understanding is that you are breaking build caching by doing this.

If I'm not wrong in my above analysis, it's frankly rather concerning to see you make this change.

When making a change like this, you should (in order):

  1. Ideally you should be following the discussion about this in previous diffs, as it directly pertains to your work. I'm pretty sure you've been on those diffs, so I suspect you probably just skipped reviewing it in the hopes that somebody else would handle it for you.
    • You should be reviewing every diff you are placed on the review list for
    • You should spend time understanding why every change is being made
  2. If you don't seem to recall the context for a pattern in the codebase, then you should at least research and understand what is going on before changing it
    • If you had done sufficient independent research, you would understand that there is a strong reason these should be on the same line
    • That research could be done with git blame and reading the comments in related diffs (recommended)
    • Or, if that fails, you could do independent research to understand why this sort of thing matters
services/blob/docker-base/contents/install_aws_sdk.sh
7–8 ↗(On Diff #10011)

Two-space tabs

This revision now requires changes to proceed.Mar 2 2022, 10:29 PM
services/blob/docker-base/contents/install_aws_sdk.sh
5–9 ↗(On Diff #9837)

Please don't forget to respond to this comment

services/blob/docker-base/Dockerfile
1 ↗(On Diff #10011)

Does this mean the "confusing dependency" no longer exists? What is ENG-804 supposed to be tracking, then? Is the task necessary? I thought it was tracking removing the "confusing dependency"

services/blob/docker-base/Dockerfile
1 ↗(On Diff #10011)

Yes, it means so.

This task tracks two things, details inside.

13 ↗(On Diff #10011)

I'm going to fix this.

services/blob/docker-base/contents/install_aws_sdk.sh
5–9 ↗(On Diff #9837)

I'm going to cover this in upcoming changes

In D3198#86520, @karol-bisztyga wrote:

Can you specify what do you mean exactly? I think folly is installed as a system lib and symlinked to the local dir.

Created a task explaining it https://linear.app/comm/issue/ENG-825/install-folly-as-system-library-in-base-service-image

services/backup/docker-server/contents/server/cmake-components/grpc.cmake
11 ↗(On Diff #10038)

These flags don't do anything here, they need to be set when gRPC is built, which is in install_grpc.sh. Related to https://linear.app/comm/issue/ENG-824/dont-build-redundant-dependencies-for-grpc.

ashoat requested changes to this revision.Mar 3 2022, 10:20 PM

Requesting changes for @jimpo's last comment above. Remember – if you don't request changes / accept when you leave a comment, then the diff stays on your queue instead of going back to the diff author's!

services/blob/docker-base/contents/install_aws_sdk.sh
5–9 ↗(On Diff #10038)

Please fix the tabs and semicolons here, as discussed in D3268

This revision now requires changes to proceed.Mar 3 2022, 10:20 PM

fix all remaining if then in bash

In D3198#89776, @ashoat wrote:

Requesting changes for @jimpo's last comment above. Remember – if you don't request changes / accept when you leave a comment, then the diff stays on your queue instead of going back to the diff author's!

Eh, not necessary to address here was just pointing it out. I made a note in ENG-824 so we can follow up and fix.

This revision is now accepted and ready to land.Mar 7 2022, 10:53 PM
services/backup/docker-server/contents/server/cmake-components/grpc.cmake
11 ↗(On Diff #10038)

Thanks for spotting this. I think we can cover this in the task you linked. The current task is just refactoring/moving stuff. I wouldn't mix it with fixing the grpc build.

services/backup/docker-base/contents/install_aws_sdk.sh
6–7 ↗(On Diff #10321)

Spaces still wrong here

services/base-image/contents/install_grpc.sh
8–9 ↗(On Diff #10321)

Spaces still wrong here

services/blob/docker-base/contents/install_aws_sdk.sh
6–7 ↗(On Diff #10321)

Spaces still wrong here

services/backup/docker-server/contents/server/cmake-components/folly.cmake
12–47 ↗(On Diff #9662)
services/backup/docker-server/contents/server/cmake-components/folly.cmake
12–47 ↗(On Diff #9662)

I don't think that answers the question. I'm asking here very speciifcally about the Folly stuff – the discussion you've linked is general.

It reminds me of this lyric from a band I like:

It's been agreed the whole world stinks so no one's taking showers anymore

  1. Just because we generally have a problem with DRY, does not mean that you can add new copy-pasted code and say it's fine because ENG-804 (effectively just "generic DRY task") will solve it. The best way to avoid making the problem worse is to stop copy-pasting!
  2. Linking this particular task is extra confusing because in its description, ENG-804 seems to indicate it's basically already solved (there are three diffs out), but I can't tell if any of them fixed the copy-pasted Folly dependency issue.

It's possible that there is a diff out already addressing this, but you have simply not given me the context to know that (and as a result, I've spent 5 minutes typing this out). In the future, can you please give more precise answers, and track things precisely? If there's a diff: link it. If all you have is a general task that doesn't talk about this issue specifically, then either create a new task or modify the existing task to clarify that it covers this. At the very least, you could add a comment to the task explaining how it relates.

services/backup/docker-server/contents/server/cmake-components/folly.cmake
12–47 ↗(On Diff #9662)

What exactly is the problem here?

Wait so why aren't we deleting that file, if it's being moved here? Do we need it in both places long-term? Why is that?

We have folly.cmake for backup and for blob services. This repetition is tracked by https://linear.app/comm/issue/ENG-324/avoid-repetitions-in-services-code.

Additionally, there's the same code in the tunnelbroker's cmake but I don't know at what stage the tunnelbroker is and I don't want to go there and mess it up too much just like that.

If there are any more problems or I misunderstood something, please let me know.

BTW I can see more and more complaints about these repetitions. We can prioritize the task of getting rid of them. At first, the idea was to first finish the services and then look at the file, see what's mutual and exclude it to the common space. But since it's become a big problem, I can take care of it and get back to the things I'm working on now after that.

I have two points of feedback I want to emphasize.

Your tasks are confusing

We have folly.cmake for backup and for blob services. This repetition is tracked by https://linear.app/comm/issue/ENG-324/avoid-repetitions-in-services-code

That task does not mention anything about this. This is what I was getting at above with ENG-804 too... you tell me that these tasks track these issues, but there is nothing in the task to indicate what they are tracking. They are both effectively "generic DRY" tasks.

You should not have generic tasks like this. They should be more precise, with more details linked and more in the description. I frequently find myself having to edit your tasks, add comments for clarification... this shouldn't be necessary. The tasks should be clearly labeled, and you should never have to tell somebody what they are tracking... it should be clear in the task. Right now I can't even tell what the difference is supposed to be between ENG-324 and ENG-804, and neither of them mention anything about Folly.

Besides the tasks, your responses to my questions also feel sort of similar... my original question was about why you're copy-pasting Folly, and you still haven't answered it. I don't understand why this is necessary, and why we can't just have a single Folly config. Until I understand that, you are going to keep getting comments on every diff that touches Folly, because I still have no idea how Folly is configured and I am unable to effectively review your diffs without this knowledge.

When somebody asks a question, you should answer it. If you point them to a task instead, that task should answer the question. It's not okay to deflect a question with a link to a task that doesn't discuss that question at all.

I have no idea what's going on

BTW I can see more and more complaints about these repetitions. We can prioritize the task of getting rid of them. At first, the idea was to first finish the services and then look at the file, see what's mutual and exclude it to the common space. But since it's become a big problem, I can take care of it and get back to the things I'm working on now after that.

I'm actually fine with this prioritization. The reason you keep getting pushback on this stuff is because I have no idea what's going on with the current state. I constantly encounter weird configs in diffs, and when I push back you just tell me it's tracked in some generic task that has like one sentence in its description.

The result is that I don't have any idea of the state of the world – what's currently misconfigured and what needs to be done to fix it. If I had this context, I wouldn't keep needing to ask you about it in every diff. In an ideal world, I understand what is being copy-pasted, why it's being copy-pasted, and how we're planning to make it better. Currently I have none of that, and the result is that both of us have to waste time on every diff.

This task is about moving mutual files from all services into some mutual space. So what am I supposed to do? List every file that I'm concerned may be re-used between services? If there's the same file in multiple services then it is tracked by this task. If the task is too generic I can mention there every file that is being repeated.

Besides the tasks, your responses to my questions also feel sort of similar... my original question was about why you're copy-pasting Folly, and you still haven't answered it. I don't understand why this is necessary, and why we can't just have a single Folly config. Until I understand that, you are going to keep getting comments on every diff that touches Folly, because I still have no idea how Folly is configured and I am unable to effectively review your diffs without this knowledge.

I don't understand what is the problem. I stated this before and you didn't answer inline.

here https://phabricator.ashoat.com/D3198#92902 I asked specifically:

What exactly is the problem here?

Without me understanding what is the problem, you will not receive a decent answer from me. Is one able to answer a question when they do not know the question itself? I'm highly doubting that.

I'm copy-pasting folly because I need it in two places. And the reason we're not reusing one code yet is because of the prioritization that we've talked about and you've agreed on that firstly we finish the services(up to some point) and then pull the mutual stuff to the common space.

When somebody asks a question, you should answer it. If you point them to a task instead, that task should answer the question. It's not okay to deflect a question with a link to a task that doesn't discuss that question at all.

I asked what exactly is a problem here and you didn't answer either. There's clearly a problem in communication and understanding each other. The sad thing about this is that it escalated from something simple I believe.

The result is that I don't have any idea of the state of the world – what's currently misconfigured and what needs to be done to fix it. If I had this context, I wouldn't keep needing to ask you about it in every diff. In an ideal world, I understand what is being copy-pasted, why it's being copy-pasted, and how we're planning to make it better. Currently I have none of that, and the result is that both of us have to waste time on every diff.

Ok, let's try to fix this up.

How about that: let's end this discussion and leave it behind. Next, please construct clear questions on what you exactly do not understand, ideally in points. Then, I'm going to try to answer all of them. Thanks.

I don't understand what is the problem. I stated this before and you didn't answer inline.

That's unfortunate to hear, given I have spent at this point probably around 30 minutes trying to articulate precisely what the problem is.

It's clear that we're not making any traction over Phabricator. As I suggested to you last week, let's drop the conversation here and follow up in our 1:1 this week.