[services] Build arm containers for m1 mac
ClosedPublic

Authored by karol on Jun 28 2022, 1:00 AM.

Details

Summary

The point is to avoid launching Intel-based containers on M1 machines. I rebuilt the base image on M1 and pushed it to the hub with a different tag. With a simple check in the bash scripts, we can identify whether we're on M1 and use the proper version of the base image.
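A minimal sketch of the kind of check described here (the image name `commapp/services-base` and the tags are placeholders, not necessarily what this diff uses):

```shell
# Hypothetical helper: map the host's hardware name (as reported by
# `uname -m`) to the matching base-image tag. Image name and tags are
# placeholders for illustration only.
pick_base_tag() {
  case "$1" in
    arm64|aarch64) echo "commapp/services-base:1.1-arm64" ;;
    *)             echo "commapp/services-base:1.1" ;;
  esac
}

BASE_IMAGE="$(pick_base_tag "$(uname -m)")"
echo "Using base image: ${BASE_IMAGE}"
```

On an M1 Mac `uname -m` reports `arm64`, so the script would select the ARM tag; everywhere else it falls back to the default tag.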

Test Plan

You have to be on an M1 Mac.
Discard the changes from this diff and do this:

cd services
yarn run-backup-service

Go to Docker Desktop and see that the container has an amd64 label. When hovered, it says that this image may be inefficient due to virtualization.

Remove this container, apply the changes from this diff, and repeat all operations. That label won't appear for the new container. I also observed a significant improvement in the container's performance.

Diff Detail

Repository
rCOMM Comm
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

There are a very large number of changes, so older changes are hidden.
services/scripts/run_server_image.sh
32–35 ↗(On Diff #13879)

Can we use one Dockerfile and pass in the build arch via an ARG? (Or if necessary, an env variable)
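For illustration, the single-Dockerfile approach suggested here could look roughly like the following (the `BASE_IMAGE_TAG` ARG name and tags are hypothetical, not from this diff):

```shell
# Sketch of driving one Dockerfile for multiple architectures via a
# build ARG. The Dockerfile would start with something like:
#   ARG BASE_IMAGE_TAG=1.1
#   FROM commapp/services-base:${BASE_IMAGE_TAG}
# and the run script would pick the tag from the host arch. The build
# command is echoed rather than executed in this sketch.
ARCH="$(uname -m)"
case "$ARCH" in
  arm64|aarch64) TAG="1.1-arm64" ;;
  *)             TAG="1.1" ;;
esac
echo "docker build --build-arg BASE_IMAGE_TAG=${TAG} -t commapp/tunnelbroker ."
```

This keeps a single Dockerfile as the source of truth and confines the arch-specific logic to one small block in the run script.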

This revision now requires changes to proceed. Jun 28 2022, 1:18 PM

use arg instead of multiple dockerfiles

karol retitled this revision from [DRAFT] [services] Build arm containers for m1 mac to [services] Build arm containers for m1 mac. Jun 30 2022, 1:52 AM
karol added inline comments.
services/scripts/run_server_image.sh
32–35 ↗(On Diff #13879)

Sure, I'm going to push the changes for this. Also, I'm assuming the idea is good, so I'm going to drop the draft label.

max requested changes to this revision. Jun 30 2022, 3:30 AM

I switched to M1 a few months ago and didn't notice any issues with that. Have a few questions to @karol-bisztyga about the purpose of this diff:

  • When we change something and rebuild the Docker image locally on M1, it rebuilds as the efficient (native) image, right?
  • What is the purpose of the M1 service image in Docker Hub? I think it's only for development, and this image can also be created if we change something and Docker rebuilds the image as the M1-efficient one, right? In that case, this image will be useful only in one scenario: when the developer runs the service image and doesn't change anything in it.
  • Is the performance difference worth making and maintaining these changes to the main Dockerfile? Personally, I didn't notice any performance issues when I switched to M1.

My instinct says that we should push only production-state images to the hub (but production images will run on AWS, which is not M1) and not maintain the rarely used scenarios in it.

This revision now requires changes to proceed. Jun 30 2022, 3:30 AM
ashoat requested changes to this revision. Jun 30 2022, 6:40 AM

Curious about @geekbrother's feedback as well.

Separately – do we really need to use a separate name for each different arch? I've noticed Docker can support multiple archs for the same name, I think. This is why we need to have this line, for instance.

I switched to M1 a few months ago and didn't notice any issues with that. Have a few questions to @karol-bisztyga about the purpose of this diff:

Ok, can you share a screenshot of your Docker Desktop when you're running some services? (I'm particularly interested in the "containers" tab.) It depends on what you consider to be an issue. For me it built successfully, but I saw a performance drop.

When we change something and rebuild the Docker image locally on M1, it rebuilds as the efficient (native) image, right?

If I understand this question correctly, then yes.

What is the purpose of the M1 service image in Docker Hub? I think it's only for development, and this image can also be created if we change something and Docker rebuilds the image as the M1-efficient one, right? In that case, this image will be useful only in one scenario: when the developer runs the service image and doesn't change anything in it.

For me, images based on the base image were only built for M1 when the base image itself was built on M1. Did it work for you without that?

Is the performance difference worth making and maintaining these changes to the main Dockerfile? Personally, I didn't notice any performance issues when I switched to M1.

Yes, they were significant for me.

My instinct says that we should push only production-state images to the hub (but production images will run on AWS, which is not M1) and not maintain the rarely used scenarios in it.

Curious for others' perspectives on this one. Why don't we put a base image for devs in the docker hub? What's so wrong with that?

Separately – do we really need to use a separate name for each different arch? I've noticed Docker can support multiple archs for the same name, I think. This is why we need to have this line, for instance.

Not sure now, if there are no more doubts, I will investigate this. Thx.

Separately – do we really need to use a separate name for each different arch? I've noticed Docker can support multiple archs for the same name, I think. This is why we need to have this line, for instance.

Not sure now, if there are no more doubts, I will investigate this. Thx.

I think it would be good to research this, as it impacts the best practice for this diff.

In D4375#125426, @karol-bisztyga wrote:

I switched to M1 a few months ago and didn't notice any issues with that. Have a few questions to @karol-bisztyga about the purpose of this diff:

Ok, can you share a screenshot of your Docker Desktop when you're running some services? (I'm particularly interested in the "containers" tab.) It depends on what you consider to be an issue. For me it built successfully, but I saw a performance drop.

Yes, sure (PM to me if you need any other info)!

Снимок экрана 2022-07-04 в 16.46.37.png (1×2 px, 263 KB)

For me, images based on the base image were only built for M1 when the base image itself was built on M1. Did it work for you without that?

Yes, I didn't rebuild anything. I can try it one more time from scratch to make sure.

Curious for others' perspectives on this one. Why don't we put a base image for devs in the docker hub? What's so wrong with that?

I'm personally not against putting the base image on Docker Hub, but I'm concerned that we are modifying a lot for this purpose alone. In most scenarios (I think) the developers would rebuild it anyway, and these modifications would not be used. I'm not 100% against this, and I'm curious about other team members' thoughts.

Not sure now, if there are no more doubts, I will investigate this. Thx.

It would be great if you shared it here, thanks!

ashoat requested changes to this revision. Jul 4 2022, 6:27 PM

To @karol-bisztyga's queue

This revision now requires changes to proceed. Jul 4 2022, 6:27 PM

See, I'm getting something like this (when not building with the .m1 suffix)

Screenshot 2022-07-05 at 13.32.22.png (81×788 px, 13 KB)

Screenshot 2022-07-05 at 13.32.27.png (121×828 px, 18 KB)

and it takes a really long time to build, etc. So, assuming some of us get this and some of us don't, this change is not expensive to make. Or maybe something is messed up in my Docker settings? I just went with the defaults when I installed Docker on my M1.

Also, you worked only on the tunnelbroker; can you try building backup/blob?

cd services
yarn run-backup-service
yarn run-blob-service

We should find what makes a difference. Can you provide:

  • os version
  • docker version

Not sure what else may matter...

As for the one name for multiple images:

You can build and push with separate commands on different hosts in a cluster, each sending to a different tag. And then after all tags for each platform have been pushed, you can use docker manifest to create a multiplatform manifest that points to all images with a single tag. This tool currently requires experimental support to be enabled.

It depends on whether we want to go with something that is "experimental"

Separately – do we really need to use a separate name for each different arch? I've noticed Docker can support multiple archs for the same name, I think. This is why we need to have this line, for instance.

Not sure now, if there are no more doubts, I will investigate this. Thx.

I think it would be good to research this, as it impacts the best practice for this diff.

(Just a reminder, this diff is blocked on this investigation)

(Just a reminder, this diff is blocked on this investigation)

As for the one name for multiple images:

You can build and push with separate commands on different hosts in a cluster, each sending to a different tag. And then after all tags for each platform have been pushed, you can use docker manifest to create a multiplatform manifest that points to all images with a single tag. This tool currently requires experimental support to be enabled.

It depends on whether we want to go with something that is "experimental"

This is labeled as an "experimental" Docker feature. If we're good with that, I can try to do this; if we'd rather stick to stable solutions, then I'm going to look for something else, or we can leave this.
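The manifest flow quoted from the docs boils down to something like this (image name and tags are placeholders; the commands are only echoed here, not executed):

```shell
# Multi-arch manifest flow as described above: push a per-arch tag from
# each host, then stitch them under a single tag with `docker manifest`
# (an experimental Docker CLI feature). Image name/tags are placeholders.
IMAGE="commapp/services-base"

manifest_cmds() {
  echo "docker push ${IMAGE}:1.2-amd64"
  echo "docker push ${IMAGE}:1.2-arm64"
  echo "docker manifest create ${IMAGE}:1.2 ${IMAGE}:1.2-amd64 ${IMAGE}:1.2-arm64"
  echo "docker manifest push ${IMAGE}:1.2"
}

manifest_cmds
```

After the final `docker manifest push`, pulling `${IMAGE}:1.2` would resolve to the matching per-arch image on each platform.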

ashoat requested changes to this revision. Jul 6 2022, 9:56 AM

Is that how the mysql image works? Is that the only way to have a single image name that supports multiple platforms? Is the [platform key](https://github.com/CommE2E/comm/blob/master/keyserver/docker-compose.yml#L31-L34) in Docker Compose also experimental?

My suspicion is that there is a separate production mechanism that the mysql image is using. If that suspicion is incorrect, and the mysql image is using this "experimental" feature, then we can probably use the "experimental" feature too – the mysql image is a hugely popular production image. However, I'd like to see some investigation into how the mysql image works to confirm that it is indeed using the mechanism you discuss above.

This revision now requires changes to proceed. Jul 6 2022, 9:56 AM

Maybe @jimpo or @jonringer-comm could help?

Separately, we checked on Jacek's M1 machine and he also had that "amd" label on the blob service. I also have this label for the tunnelbroker, and it was building really slowly. Wondering how @geekbrother managed to run the tunnelbroker without this.

The way I understand what is happening: we have the base image, right? Currently the base image is built for Intel, and if we build a service image on top of it, we end up with a service image built for Intel as well. When using M1/ARM, we have to build the base image for ARM first, and only then build the service image on top of it.

The only explanation I see is that @geekbrother rebuilt the base image locally on his m1. Otherwise, I don't know what happened on his machine...
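One way to check this explanation on a given machine (hedged sketch; the image arch would come from something like `docker image inspect --format '{{.Architecture}}' <image>`, and the function name is hypothetical):

```shell
# A container runs natively only when the image architecture matches
# the host. $1 is the host arch (as `uname -m` reports it), $2 is the
# image arch (as `docker image inspect` reports it).
is_emulated() {
  case "$1:$2" in
    arm64:arm64|aarch64:arm64) return 1 ;;  # native on Apple silicon
    x86_64:amd64)              return 1 ;;  # native on Intel
    *)                         return 0 ;;  # emulated, e.g. amd64 on M1
  esac
}

if is_emulated arm64 amd64; then
  echo "an amd64 image on an M1 host runs under qemu emulation"
fi
```

This matches the symptom in the thread: a base image built for Intel propagates `amd64` to every service image built on top of it, so those containers get the emulation warning on M1.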

@geekbrother can you show your "images" tab as well?

What do you mean by "label"?

What do you mean by "label"?

Here is the "amd64" label I'm talking about: https://phab.comm.dev/D4375#126085

Got it, so it's a visual indication you're seeing. Do you have an idea of what that label indicates, though? Is there a formal name for that label in the Docker docs? Have you been able to find any documentation about how Docker handles different architectures online?

I just really want us to get away from trying to hack Docker to do what we want with architectures, and instead learn how Docker handles different architectures, and how popular production Docker images handle different architectures (ie. learning best practices). It feels to me that the research to understand this stuff has not been done, and I think it's going to be hard to proceed effectively here until that research is done. Apologies if I'm missing something.

Here are 2 images - one with and one without the label:

Screenshot 2022-07-07 at 15.11.00.png (103×812 px, 20 KB)

Yes, everything you see in the docker desktop is a visual indication but it shows what really goes on underneath, doesn't it?

The warning is pretty straightforward to me:

IMAGE MAY HAVE POOR PERFORMANCE, OR FAIL, IF RUN VIA EMULATION

5 secs of googling brought me here: https://stackoverflow.com/questions/70765522/docker-amd64-warning-on-apple-m1-computer
Then, it leads you to the docs: https://docs.docker.com/desktop/mac/apple-silicon/#known-issues

The second known issue says:

Not all images are available for ARM64 architecture. You can add --platform linux/amd64 to run an Intel image under emulation. In particular, the mysql image is not available for ARM64. You can work around this issue by using a mariadb image.

However, attempts to run Intel-based containers on Apple silicon machines under emulation can crash as qemu sometimes fails to run the container. In addition, filesystem change notification APIs (inotify) do not work under qemu emulation. Even when the containers do run correctly under emulation, they will be slower and use more memory than the native equivalent.

In summary, running Intel-based containers on Arm-based machines should be regarded as “best effort” only. We recommend running arm64 containers on Apple silicon machines whenever possible, and encouraging container authors to produce arm64, or multi-arch, versions of their containers. We expect this issue to become less common over time, as more and more images are rebuilt supporting multiple architectures.

Okay, thanks. That docs page links to https://www.docker.com/blog/multi-arch-build-and-images-the-simple-way/, which appears to be a guide on best practices for multi-arch builds. Have you had a chance to review that?
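For reference, the buildx approach from that guide amounts to a single command per image (echoed sketch; the image name and tag are placeholders):

```shell
# Sketch of a multi-arch build via `docker buildx`, per the linked
# guide: one invocation builds for both platforms and pushes a single
# tag whose manifest covers both. The command is echoed, not run, and
# the image name/tag are placeholders.
buildx_cmd() {
  echo "docker buildx build --platform linux/amd64,linux/arm64 -t $1 --push ."
}

buildx_cmd "commapp/services-base:1.2"
```

Compared with the manual `docker manifest` flow, buildx handles the per-arch builds and the combined manifest in one step.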

rebase + apply for unit tests

@geekbrother could you please share a screenshot of the "images" tab in the docker desktop app? I requested this here and also pinged you on the Comm app:

Screenshot 2022-07-11 at 12.57.21.png (158×1 px, 29 KB)

I'd really love to figure out what exactly is happening especially since it was your argument against this change.

Planning changes since I've not done the research yet

In D4375#128102, @karol-bisztyga wrote:

@geekbrother could you please share a screenshot of the "images" tab in the docker desktop app? I requested this here and also pinged you on the Comm app:

Screenshot 2022-07-11 at 12.57.21.png (158×1 px, 29 KB)

I'd really love to figure out what exactly is happening especially since it was your argument against this change.

Planning changes since I've not done the research yet

Sure! This is my images tab:

Снимок экрана 2022-07-12 в 10.03.38.png (1×2 px, 395 KB)

The docker buildx approach from the article @ashoat linked above seems like it could be a solution that avoids adding an m1 suffix.

Thx @geekbrother.
My suspicion is still that you've built the base image locally.
Could you remove the base image and rebuild the tunnelbroker image?
The best way of doing this would be to just clean up Docker completely and run tunnelbroker start.
The goal is to see that the base image is being downloaded from the hub.

Of course, this should be done using the 1.1 tag; 1.2 is the multi-arch one that I pushed to the hub, so it doesn't show anything.

If you don't want to / don't have the time to do this, I'm just going to assume that you built your base image locally instead of pulling it from the hub. In that case, we want the change introduced here.

The article I linked earlier indicates:

If you are on Mac or Windows, you have nothing to worry about, buildx is shipped with Docker Desktop. If you are on linux, you might need to install it by following the documentation here https://github.com/docker/buildx

Since our production environment is on Linux, is it the case that this change will require an additional installation step there?

Removing from everybody's queue except @geekbrother, since @karol-bisztyga is asking him a question

In D4375#128265, @karol-bisztyga wrote:

Thx @geekbrother.
My suspicion is still that you've built the base image locally.
Could you remove the base image and rebuild the tunnelbroker image?
The best way of doing this would be to just clean up Docker completely and run tunnelbroker start.

After I removed everything completely and the base image was downloaded from the hub, I now have that warning badge about the architecture.
I don't mind using your approach, but I'm curious whether we can achieve the same result in a more elegant way, without adding an additional M1 tag to the build script. Thanks for the research!

This revision is now accepted and ready to land. Jul 14 2022, 7:06 AM
This revision now requires review to proceed. Jul 14 2022, 7:06 AM
ashoat requested changes to this revision. Jul 14 2022, 12:42 PM

Okay awesome, now that @geekbrother has accepted I'll ping my question:

The article I linked earlier indicates:

If you are on Mac or Windows, you have nothing to worry about, buildx is shipped with Docker Desktop. If you are on linux, you might need to install it by following the documentation here https://github.com/docker/buildx

Since our production environment is on Linux, is it the case that this change will require an additional installation step there?

This revision now requires changes to proceed. Jul 14 2022, 12:42 PM

add buildx note to the docs

After I removed everything completely and the base image was downloaded from the hub, I now have that warning badge about the architecture.

Ok, just as I thought it would be, thanks.

I don't mind using your approach, but I'm curious whether we can achieve the same result in a more elegant way, without adding an additional M1 tag to the build script. Thanks for the research!

You probably didn't check the new changes in this diff. Since I started using buildx, there's no separate tag for m1.

Since our production environment is on Linux, is it the case that this change will require an additional installation step there?

Yes, but only if someone wants to rebuild the base image, right?

I added a note about this to the docs.

ashoat requested changes to this revision. Jul 18 2022, 9:52 AM

Correct me if I'm wrong but I think my question still hasn't been responded to?

This revision now requires changes to proceed. Jul 18 2022, 9:52 AM

Correct me if I'm wrong but I think my question still hasn't been responded to?

This question: https://phab.comm.dev/D4375#128329

Since our production environment is on Linux, is it the case that this change will require an additional installation step there?

Has been answered here: https://phab.comm.dev/D4375#129703

And one older question: https://phab.comm.dev/D4375#126962 (about the multi-arch image)

has been answered by adding changes https://phab.comm.dev/D4375#128263

If you mean yet another question, please let me know.

Thanks @karol-bisztyga, my apologies... I simply wasn't paying enough attention.

Some nits inline – please address before landing!

docs/dev_services.md
105 ↗(On Diff #14559)

Some nits:

  1. docker desktop should be Docker desktop
  2. Mac OS should be macOS
This revision is now accepted and ready to land. Jul 19 2022, 8:28 AM

@atul, any idea why the ShellCheck Buildkite workflow isn't running here? I wanted to confirm that ShellCheck ran on build_base_image.sh, but I can't see it.

Based on this, it seems like maybe it still needs to be added via Phabricator?

@atul, any idea why the ShellCheck Buildkite workflow isn't running here? I wanted to confirm that ShellCheck ran on build_base_image.sh, but I can't see it.

Based on this, it seems like maybe it still needs to be added via Phabricator?

Yeah, the plan was to add it to Phabricator once all the existing issues had been resolved (so it wouldn't fail for everybody). I'll take a look to see how quickly that can be wrapped up (if not already done).

Thanks @karol-bisztyga, my apologies... I simply wasn't paying enough attention.

Some nits inline – please address before landing!

No problem, thanks for the review.

docs/dev_services.md
105 ↗(On Diff #14559)

Ok

This revision was landed with ongoing or failed builds. Jul 20 2022, 4:09 AM
This revision was automatically updated to reflect the committed changes.