The point is to not launch intel-based containers on m1 machines. I rebuilt the base image on m1 and pushed it to the hub with a different tag. With a simple check in the bash scripts we can identify whether we're on m1 or not and use the proper version of the base image.
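A minimal sketch of the kind of check described above (the tag names here are placeholders, not necessarily the ones used in the diff):

```bash
# Pick the base image tag based on the host architecture.
# macOS on Apple silicon reports "arm64" from uname -m.
if [[ "$(uname -m)" == "arm64" ]]; then
  BASE_IMAGE_TAG="1.1.m1"   # base image rebuilt on/for m1
else
  BASE_IMAGE_TAG="1.1"      # original intel base image
fi
echo "Using base image tag: ${BASE_IMAGE_TAG}"
```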
Details
- Reviewers
  • max
  • ashoat
- Commits
  - rCOMM32ed95a79ff9: [services] Build arm containers for m1 mac
You have to be on the m1 mac.
Discard the changes from this diff and do this:
cd services
yarn run-backup-service
Go to the docker desktop and see that the container has an amd label. When hovered, it'll say that this image may be inefficient due to virtualization.
Remove this container, apply the changes from this diff, and repeat all the operations. The new container won't have that label. I also observed a significant improvement in the container's performance.
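(If you'd rather check from the CLI than Docker Desktop, the architecture an image was built for can also be read with a standard inspect command; the image name below is only illustrative.)

```bash
# Reports "amd64" for the emulated intel image; after this diff it should report "arm64".
docker image inspect commapp/backup-server --format '{{.Architecture}}'
```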
Diff Detail
- Repository
  rCOMM Comm
- Branch
  arcpatch-D4375
- Lint
  No Lint Coverage
- Unit
  No Test Coverage
Event Timeline
services/scripts/run_server_image.sh:32–35 (On Diff #13879)
Can we use one Dockerfile and pass in the build arch via an ARG? (Or if necessary, an env variable)
services/scripts/run_server_image.sh:32–35 (On Diff #13879)
Sure, I'm going to push the changes for this. Also, I'm assuming the idea is good, so I'm going to put down the draft label.
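A rough sketch of what that could look like, assuming a hypothetical ARG name and build context (not the actual ones from the diff):

```bash
# Detect the host arch and pass it into a single Dockerfile as a build argument.
TARGET_ARCH="$(uname -m)"   # "x86_64" on intel, "arm64" on Apple silicon
docker build \
  --build-arg TARGET_ARCH="${TARGET_ARCH}" \
  -t commapp/services-base:latest \
  services/base-image
# The Dockerfile would then declare `ARG TARGET_ARCH` and branch on its value.
```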
I switched to M1 a few months ago and didn't notice any issues with that. Have a few questions for @karol-bisztyga about the purpose of this diff:
- When we change something and rebuild the Docker image locally on M1, it rebuilds as the efficient image, right?
- What is the purpose of the M1 service image in the Docker hub? I think it's only used for development, and this image can also be created when we change something and Docker rebuilds the image as the M1-efficient one, right? In that case, this image will be useful only in one scenario: when the developer runs the service image and doesn't change anything in it.
- Is the performance difference worth making and maintaining these changes to the main Dockerfile? Personally, I didn't notice any performance issues when I switched to M1.
My instinct says that we should put only production-state images to the hub (but production images will run on AWS, which is not M1) and not maintain the rarely used scenarios in it.
Curious about @geekbrother's feedback as well.
Separately – do we really need to use a separate name for each different arch? I've noticed Docker can support multiple archs for the same name, I think. This is why we need to have this line, for instance.
I switched to M1 a few months ago and didn't notice any issues with that. Have a few questions for @karol-bisztyga about the purpose of this diff:
Ok, can you share a screenshot of your docker desktop when you're running some services? (I'm particularly interested in the "containers" tab.) It depends on what you consider to be an issue. For me it built successfully, but I saw a performance drop.
When we change something and rebuild the Docker image locally on M1, it rebuilds as the efficient image, right?
If I understand this question correctly, then yes.
What is the purpose of the M1 service image in the Docker hub? I think it's only used for development, and this image can also be created when we change something and Docker rebuilds the image as the M1-efficient one, right? In that case, this image will be useful only in one scenario: when the developer runs the service image and doesn't change anything in it.
For me, the images based on the base image were only built for the m1 when the base image itself was built on the m1. Did it work for you without that?
Is the performance difference worth making and maintaining these changes to the main Dockerfile? Personally, I didn't notice any performance issues when I switched to M1.
Yes, they were significant for me.
My instinct says that we should put only production-state images to the hub (but production images will run on AWS, which is not M1) and not maintain the rarely used scenarios in it.
Curious for others' perspectives on this one. Why don't we put a base image for devs in the docker hub? What's so wrong with that?
Separately – do we really need to use a separate name for each different arch? I've noticed Docker can support multiple archs for the same name, I think. This is why we need to have this line, for instance.
Not sure right now; if there are no more doubts, I will investigate this. Thx.
Separately – do we really need to use a separate name for each different arch? I've noticed Docker can support multiple archs for the same name, I think. This is why we need to have this line, for instance.
Not sure right now; if there are no more doubts, I will investigate this. Thx.
I think it would be good to research this, as it impacts the best practice for this diff.
Yes, sure (PM me if you need any other info)!
For me, the images based on the base image were only built for the m1 when the base image itself was built on the m1. Did it work for you without that?
Yes, I didn't rebuild anything. I can try it one more time from scratch to make sure.
Curious for others' perspectives on this one. Why don't we put a base image for devs in the docker hub? What's so wrong with that?
I'm personally not against putting the base image on the Docker hub, but I'm concerned that we are modifying a lot for this purpose only. In most scenarios (I think) developers would rebuild it anyway, and these modifications would not be used. I'm not 100% against this, and I'm curious about other team members' thoughts.
Not sure right now; if there are no more doubts, I will investigate this. Thx.
It would be great if you shared it here, thanks!
See, I'm getting something like this (when not building with the .m1 suffix)
and it takes a really long time to build, etc. So, assuming some of us get this and some of us don't, it's not expensive to make this change. Or maybe I have something messed up in my Docker settings? I just went with the defaults when I installed Docker on my m1.
Also, you've only worked on the tunnelbroker; can you try building backup/blob?
cd services
yarn run-backup-service
yarn run-blob-service
We should find out what makes the difference. Can you provide:
- os version
- docker version
Not sure what else may matter...
As for the one name for multiple images:
You can build and push with separate commands on different hosts in a cluster, each sending to a different tag. And then after all tags for each platform have been pushed, you can use docker manifest to create a multiplatform manifest that points to all images with a single tag. This tool currently requires experimental support to be enabled.
It depends on whether we want to go with something that's "experimental".
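For reference, a hedged sketch of that flow with placeholder image names and tags (the real ones would differ):

```bash
# Build and push one tag per architecture, each from a host of that architecture.
docker build -t commapp/services-base:1.1-amd64 .   # on an intel host
docker push commapp/services-base:1.1-amd64
docker build -t commapp/services-base:1.1-arm64 .   # on an m1 host
docker push commapp/services-base:1.1-arm64

# Then stitch them together under a single multi-arch tag (the "experimental" CLI feature).
docker manifest create commapp/services-base:1.1 \
  commapp/services-base:1.1-amd64 \
  commapp/services-base:1.1-arm64
docker manifest push commapp/services-base:1.1
```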
As for the one name for multiple images:
You can build and push with separate commands on different hosts in a cluster, each sending to a different tag. And then after all tags for each platform have been pushed, you can use docker manifest to create a multiplatform manifest that points to all images with a single tag. This tool currently requires experimental support to be enabled.
It depends on whether we want to go with something that's "experimental".
This is labeled as an "experimental" docker feature. If we're good with that, I can try to do this; if we'd rather stick to the stable solutions, then I'm going to look for something else, or we can leave this.
Is that how the mysql image works? Is that the only way to have a single image name that supports multiple platforms? Is the [platform key](https://github.com/CommE2E/comm/blob/master/keyserver/docker-compose.yml#L31-L34) in Docker Compose also experimental?
My suspicion is that there is a separate production mechanism that the mysql image is using. If that suspicion is incorrect, and the mysql image is using this "experimental" feature, then we can probably use the "experimental" feature too – the mysql image is a hugely popular production image. However, I'd like to see some investigation into how the mysql image works to confirm that it is indeed using the mechanism you discuss above.
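(One quick way to investigate that, not from this thread: the published manifest of any hub image can be inspected directly, which shows whether it's a single-arch image or a multi-arch manifest list.)

```bash
# Lists one entry per platform if the image is published as a multi-arch manifest list.
docker manifest inspect mysql:latest
```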
Separately - we checked on Jacek's m1 machine and he also had that "amd" label on the blob service. I also have this label for the tunnelbroker, and it was building really slowly. Wondering how @geekbrother managed to run the tunnelbroker without this.
The way I understand what is happening is that we have the base image, right? Now, we have the base image built for intel, and if we build some service image on top of this base image, then we end up with the service image built for intel as well. When using m1/arm, we have to build the base image for arm, and only then build the service image on top of it.
The only explanation I see is that @geekbrother rebuilt the base image locally on his m1. Otherwise, I don't know what happened on his machine...
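To illustrate the chain being described (image names and build contexts below are hypothetical):

```bash
# A service image inherits the architecture of the base image it is built FROM,
# so an amd64 base pulled from the hub yields an amd64 service image even on m1.
docker build -t commapp/services-base:1.1 services/base-image
docker build -t commapp/tunnelbroker-server services/tunnelbroker
docker image inspect commapp/tunnelbroker-server --format '{{.Architecture}}'
```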
Got it, so it's a visual indication you're seeing. Do you have an idea of what that label indicates, though? Is there a formal name for that label in the Docker docs? Have you been able to find any documentation about how Docker handles different architectures online?
I just really want us to get away from trying to hack Docker to do what we want with architectures, and instead learn how Docker handles different architectures, and how popular production Docker images handle different architectures (ie. learning best practices). It feels to me that the research to understand this stuff has not been done, and I think it's going to be hard to proceed effectively here until that research is done. Apologies if I'm missing something.
Yes, everything you see in the docker desktop is a visual indication but it shows what really goes on underneath, doesn't it?
The warning is pretty straightforward to me:
IMAGE MAY HAVE POOR PERFORMANCE, OR FAIL, IF RUN VIA EMULATION
5 secs of googling brought me there: https://stackoverflow.com/questions/70765522/docker-amd64-warning-on-apple-m1-computer
Then, it leads you to the docs: https://docs.docker.com/desktop/mac/apple-silicon/#known-issues
The second known issue says:
Not all images are available for ARM64 architecture. You can add --platform linux/amd64 to run an Intel image under emulation. In particular, the mysql image is not available for ARM64. You can work around this issue by using a mariadb image.
However, attempts to run Intel-based containers on Apple silicon machines under emulation can crash as qemu sometimes fails to run the container. In addition, filesystem change notification APIs (inotify) do not work under qemu emulation. Even when the containers do run correctly under emulation, they will be slower and use more memory than the native equivalent.
In summary, running Intel-based containers on Arm-based machines should be regarded as “best effort” only. We recommend running arm64 containers on Apple silicon machines whenever possible, and encouraging container authors to produce arm64, or multi-arch, versions of their containers. We expect this issue to become less common over time, as more and more images are rebuilt supporting multiple architectures.
Okay, thanks. That docs page links to https://www.docker.com/blog/multi-arch-build-and-images-the-simple-way/, which appears to be a guide on best practices for multi-arch builds. Have you had a chance to review that?
@geekbrother could you please share a screenshot of the "images" tab in the docker desktop app? I requested this here and also pinged you on the Comm app:
I'd really love to figure out what exactly is happening especially since it was your argument against this change.
Planning changes since I've not done the research yet
The docker buildx approach from the article @ashoat linked above seems like it could be a solution without adding an m1 suffix.
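For reference, a sketch of the buildx flow from that article, with a placeholder image name and build context:

```bash
# Create a builder that supports multi-platform builds and select it.
docker buildx create --name multiarch --use

# Build for both architectures and push a single multi-arch tag to the hub.
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t commapp/services-base:1.2 \
  --push \
  services/base-image
```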
Thx @geekbrother
My suspicion is still that you've built the base image locally.
Could you remove the base image and rebuild the tunnelbroker image?
The best way of doing this would be to just clean up Docker completely and run tunnelbroker start.
The goal is to see that the base image is being downloaded from the hub.
Of course, this should be done using the 1.1 tag; 1.2 is the multi-arch one that I pushed to the hub, so it doesn't show anything.
If you don't want to / don't have the time to do this, I'm just going to assume that you built your base image locally instead of pulling it from the hub. In that case, we want the change introduced here.
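A sketch of the verification steps being asked for (the yarn command name here is an assumption, by analogy with run-backup-service):

```bash
# Wipe unused local images/containers so nothing can be reused from the cache.
docker system prune -a

cd services
yarn run-tunnelbroker-service   # assumed command name for "run tunnelbroker start"
# In the build output, the base image (1.1 tag) should be pulled from the hub,
# and the resulting container should show the "amd" badge in Docker Desktop.
```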
The article I linked earlier indicates:
If you are on Mac or Windows, you have nothing to worry about, buildx is shipped with Docker Desktop. If you are on linux, you might need to install it by following the documentation here https://github.com/docker/buildx
Since our production environment is on Linux, is it the case that this change will require an additional installation step there?
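For what it's worth, the manual installation on Linux per the buildx README roughly looks like the sketch below; the release version is only a placeholder, and newer Docker Engine packages may already bundle buildx.

```bash
# Install buildx as a Docker CLI plugin for the current user.
mkdir -p ~/.docker/cli-plugins
curl -L -o ~/.docker/cli-plugins/docker-buildx \
  https://github.com/docker/buildx/releases/download/v0.8.2/buildx-v0.8.2.linux-amd64
chmod +x ~/.docker/cli-plugins/docker-buildx
docker buildx version   # confirm the plugin is detected
```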
Removing from everybody's queue except @geekbrother, since @karol-bisztyga is asking him a question
After I removed it all completely and the base image was downloaded from the hub, I now have that warning badge about architecture.
I don't mind using your approach, but I'm curious whether we can achieve the same result in a more elegant way, without adding an additional M1 tag to the build script. Thanks for the research!
After I removed it all completely and the base image was downloaded from the hub, I now have that warning badge about architecture.
Ok, just as I thought it would be, thanks.
I don't mind using your approach, but I'm curious whether we can achieve the same result in a more elegant way, without adding an additional M1 tag to the build script. Thanks for the research!
You probably didn't check the new changes in this diff. Since I started using buildx, there's no separate tag for m1.
Since our production environment is on Linux, is it the case that this change will require an additional installation step there?
Yes, but only if someone wants to rebuild the base image, right?
I added a note about this to the docs.
This question: https://phab.comm.dev/D4375#128329
Since our production environment is on Linux, is it the case that this change will require an additional installation step there?
Has been answered here: https://phab.comm.dev/D4375#129703
And one older question: https://phab.comm.dev/D4375#126962 (about the multi-arch image)
has been answered by adding changes https://phab.comm.dev/D4375#128263
If you mean yet another question, please let me know.
Thanks @karol-bisztyga, my apologies... I simply wasn't paying enough attention.
Some nits inline – please address before landing!
docs/dev_services.md:105 (On Diff #14559)
Some nits:
Yeah the plan was to add it to Phabricator once all the existing issues had been resolved (so it wouldn’t fail for everybody). I’ll take a look to see how quickly that can be wrapped up (if not done already)