Page MenuHomePhabricator

[Nix] Add nix installation script
ClosedPublic

Authored by jon on Aug 15 2022, 6:59 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, Nov 17, 10:14 AM
Unknown Object (File)
Sun, Nov 3, 7:45 PM
Unknown Object (File)
Sun, Nov 3, 7:45 PM
Unknown Object (File)
Sun, Nov 3, 7:45 PM
Unknown Object (File)
Sun, Nov 3, 7:45 PM
Unknown Object (File)
Sun, Nov 3, 7:45 PM
Unknown Object (File)
Sun, Nov 3, 7:45 PM
Unknown Object (File)
Sun, Nov 3, 7:45 PM

Details

Summary

Add nix installation script

Currently this will:

  • Install nix, if missing
  • Prompt user for powerline installation
  • Configure nix with:
    • flakes
    • trusted users (for cache)
    • Decent job and core defaults

This is implemented with a "reconcilation" configuration model, meaning
the individual actions should be idempotent. Thus safe to run many times.

Depends on D4843

Test Plan
./scripts/install_nix.sh

# If missing nix, ensure that nix then exits (E.g. which nix)
# Try powerline prompt
# Assert nix.conf has max-jobs, cores, trusted-users, and flakes enabled

Diff Detail

Repository
rCOMM Comm
Branch
jonringer/install-nix
Lint
No Lint Coverage
Unit
No Test Coverage

Event Timeline

Harbormaster returned this revision to the author for changes because remote builds failed.Aug 15 2022, 6:59 PM
Harbormaster failed remote builds in B11369: Diff 15634!

Don't source nix files in docker container

Can you explain where \h:\W \u\$ is coming from? Or do we not need to check that anymore? If we don't need to check it anymore, why is that?

EDIT – ah looks like this is explained in D4843

(Ignore Identity build failure, restarted it)

trusted users (for cache)

When we ran nix together, this was never added to my nix.conf file. For reference, this is what my nix.conf looks like:

image.png (966×1 px, 210 KB)

Can you explain the purpose of trusted users? I read through the Nix documentation for it but I'm still a bit confused. Why do we need to give root access?

Aside Not sure if you prefer using nano as your default git editor but I notice on a lot of your diffs the lines in the summary or test plan are cut off ~70-80 characters (and wrap to the next line). This can be solved by making vim your default git editor. I don't remember the exact steps, but I ran:

git config --global core.editor "vim"

I think if you run with --local instead of --global it will only set this for the comm git repository.

Additionally, in my .zshrc I have:

export EDITOR=vim
export VISUAL=vim

I don't think you need to do all these things, so here's an additional reference: https://stackoverflow.com/questions/2596805/how-do-i-make-git-use-the-editor-of-my-choice-for-editing-commit-messages

You can also do this for any other editor you want, not just vim. But I know vim doesn't truncate lines when pushing to Phabricator.

scripts/install_nix.sh
8 ↗(On Diff #15635)
17 ↗(On Diff #15635)

Neat! Didn't know about this.

image.png (966×1 px, 185 KB)

25 ↗(On Diff #15635)

Why do we need || true here? It's to force the cd to exit successfully, but won't the pwd run anyways since bash will ignore a cd failure (unless you used &&)?

Example: https://riptutorial.com/bash/example/17289/failed-commands-do-not-stop-script-execution

This revision now requires changes to proceed.Aug 16 2022, 10:23 AM
jon marked 3 inline comments as done.

Use double braces

scripts/install_nix.sh
8 ↗(On Diff #15635)

Yea, copied it from /etc/zshrc

25 ↗(On Diff #15635)

it was to make shellcheck happy.

I originally left it off, as it's always able to succeed.

Can you explain the purpose of trusted users? I read through the Nix documentation for it but I'm still a bit confused. Why do we need to give root access?

I think I did a paragraph, but I guess I deleted it after refactoring. Usage of binary caches allows for man-in-the-middle attacks as nix kind of "trusts" another party to have done the build correctly. It's possible that build artifacts have been tampered with. This is also why that binary caches have an associated public key for verification as in https://phab.comm.dev/D4816

Aside Not sure if you prefer using nano as your default git editor but I notice on a lot of your diffs the lines in the summary or test plan are cut off ~70-80 characters (and wrap to the next line). This can be solved by making vim your default git editor. I don't remember the exact steps, but I ran:

I use vim, and my git log shows everything correctly (which uses bat as a PAGER). I generally try to limit it to 80 chars as exported patches are supposed to fit in ancient email clients. I think what is confusing you is sometimes I forget to do it; so when I do it, it looks weird.

Additionally, in my .zshrc I have:

Same, https://github.com/jonringer/nixpkgs-config/blob/master/bash.nix#L34. TIL VISUAL is a similar variable https://bash.cyberciti.biz/guide/%24VISUAL_variable . Although git uses EDITOR

Cool! Just some more inline questions I missed on my first pass.

scripts/install_nix.sh
32 ↗(On Diff #15669)

What are the consequences of this? I'm not too familiar with threaded or parallel computing, but to me this sentence sounds sort of worrying. That is, you explain why this is "the best compromise" but I'm not sure if I understand your explanation.

From what I got, if we over-subscribe cores, if most builds are single threaded it won't matter since only one core will do the work anyways? But what happens when something is massively parallel? I'm just failing to see how this is a good compromise, partly because of my lack of knowledge with this matter.

32 ↗(On Diff #15669)

Where is this coming from?

25 ↗(On Diff #15635)

Ah yeah, that makes sense. pwd is a benign command, so I assumed we didn't need the || true but if it was a rm or something not including a || exit could be catastrophic.

This revision now requires changes to proceed.Aug 16 2022, 11:06 AM
jon added inline comments.
scripts/install_nix.sh
32 ↗(On Diff #15669)

All of the job and core nonsense will be significantly mitigated by D4816 where it will be unlikely for anyone to build anything locally.

But to respond:

What are the consequences of this? I'm not too familiar with threaded or parallel computing, but to me this sentence sounds sort of worrying

The consequences is that there's a potential that every vCPU on the computer could be doing 2 things, which means 100% CPU load for extended periods of time. When a computer is maxed out like this, it means that the computer's usability could be diminished (think of cleaning and running an xcode project).

When a computer tries to do a 1000 different things, then you have to worry about things like contexting switching overhead between threads causing a lot of lost productivity, but limiting the cores to 4 puts a low upper bound to this.

From what I got, if we over-subscribe cores, if most builds are single threaded it won't matter since only one core will do the work anyways?

Correct, if you have (n/2) single-threaded builds going concurrently, then you will only be at 50% cpu load. If you have 8 cores, and one of those jobs starts using 4 cores for the build, then you will jump up to 7/8 cores being used, which is ~87%. Which means the computer should still be pretty usable while the build is going.

But what happens when something is massively parallel?

Then they will go up to a max of 4 cores to attempt that job/build.

I'm just failing to see how this is a good compromise

On one end, you can do many single core jobs, or a single many core job (which is likely bound by the available cpu count, as there's no benefit to providing more than what the cpu is capable of).

The compromise here is that most builds are either single threaded, or have long single-threaded critical sections (e.g. cmake .. or linking).

For example, in our current dev shell, aws-sdk-cpp, protobuf, and nodejs use more than a single core for their build and take longer than a few seconds to build. The aws-sdk-cpp is only building 3 apis, so it's finished in a minute or so. Protobuf may take a few minutes. However, nodejs has a very large build, and will like sit on 4 cores for 10-20 mins even after all other builds have completed. In this example, a 6 or 8 core machine will have a potential few minute window where they are maxed out, but for most of it, they are just under 100%, which allows for people to do other things and still use the machine.

I've also learned from experience that some C++ builds can't have incredibly high ram usage if they have a lot going on (e.g. 1-4 GBs per thread). So having a set core limit of 4 is also to prevent paging in ram (using swap) which makes the build significantly slower.

32 ↗(On Diff #15669)

It's a header to the settings set provide below. It is a bit odd to read as it's providing context to something which hasn't happened yet. I could set the values, then describe them in a comment below it. But we've been doing comments before code everywhere else and felt that comment after code would be even more jarring.

jon marked 2 inline comments as done.
abosh added inline comments.
scripts/install_nix.sh
32 ↗(On Diff #15669)

Thank you for this explanation!

32 ↗(On Diff #15669)

Ah I see. Maybe you could rewrite it to

jobs * cores = (n/2)*4 = 2n

to help decrease some confusion, but not necessary.

This revision is now accepted and ready to land.Aug 16 2022, 12:11 PM
jon marked 2 inline comments as done.

Re-order job comments to be a bit more readable

scripts/install_nix.sh
32 ↗(On Diff #15669)

Originally I wanted to put this on the same 80char line, which is why it's so short. I'll just move it to the next line, and move the big block below the code

Asking for a review again to make sure it's still fine with @abosh

This revision is now accepted and ready to land.Aug 16 2022, 1:14 PM

Check if nix.conf is writable for root

@abosh , please test the nix.conf workflow again.

scripts/install_nix.sh
48 ↗(On Diff #15678)

@abosh, ah, this is why it wasn't writing for you, I added this check because on NixOS the nix.conf is read only, and I wanted to check if it was write for sudo, but it's checking if it's able to be written for USER, which is false.

In D4844#140414, @jon wrote:

@abosh , please test the nix.conf workflow again.

Yay, looks like it added after I ran ./scripts/install_nix.sh.

image.png (966×1 px, 223 KB)

scripts/install_nix.sh
48 ↗(On Diff #15678)

That makes sense. So now that I ran install_nix.sh the trusted_users flake ensures this command works correctly when run by me. And the tee is run with sudo so that we can write to nix.conf since it's read-only.

This revision is now accepted and ready to land.Aug 16 2022, 2:04 PM
This revision was automatically updated to reflect the committed changes.

What's the upgrade path for devs for this new scripts/install_nix.sh script? Should we all just run it? If so, should we make sure we follow up with each dev about that, using a process similar to what I did with the MariaDB migration?

Looks like "just run the script" seems to work:

ashoat@ashoatmbp2021 [~/src/comm]$ cat /etc/nix/nix.conf

build-users-group = nixbld
experimental-features = nix-command flakes
ashoat@ashoatmbp2021 [~/src/comm]$ scripts/install_nix.sh
Password:
ashoat@ashoatmbp2021 [~/src/comm]$ cat /etc/nix/nix.conf

build-users-group = nixbld
experimental-features = nix-command flakes
trusted-users = @admin ashoat
cores = 4
max-jobs = 5
ashoat@ashoatmbp2021 [~/src/comm]$ cat $HOME/Library/Caches/app.comm/enable-powerline
1

Sounds like we need to make sure each dev runs this script... so does it make sense to follow up with each dev about that, using a process similar to what I did with the MariaDB migration?

What's the upgrade path for devs for this new scripts/install_nix.sh script?

@ashoat the script was written in a way where it will only attempt the changes once. Anyone should be able to run it multiple times, or run it if they previously installed nix.

so does it make sense to follow up with each dev about that, using a process similar to what I did with the MariaDB migration?

I created a similar ticket https://linear.app/comm/issue/ENG-1635 to track developer migration.

Unrelated:
I just realized that with this script, the user needs experimental-features = flakes nix-command in their nix.conf for the powerline logic to work

Created D4862 to fix ordering issue of nix then powerline prompt