Page MenuHomePhabricator

[CI] Potentially improve reliability of Android CI
ClosedPublic

Authored by atul on May 18 2022, 10:58 AM.
Tags
None
Referenced Files
F3538104: D4072.id12981.diff
Wed, Dec 25, 10:22 PM
F3534688: D4072.id12910.diff
Wed, Dec 25, 12:55 PM
F3534687: D4072.id12912.diff
Wed, Dec 25, 12:55 PM
F3534640: D4072.id12894.diff
Wed, Dec 25, 12:51 PM
F3534639: D4072.id12916.diff
Wed, Dec 25, 12:50 PM
F3534608: D4072.id12895.diff
Wed, Dec 25, 12:39 PM
F3534602: D4072.id12889.diff
Wed, Dec 25, 12:36 PM
F3534593: D4072.id12891.diff
Wed, Dec 25, 12:33 PM

Details

Summary

These changes are restricted to the Buildkite CI that gets triggered on arc diff. These changes aren't necessary for the GitHub Actions CI since every build happens in a brand new "clean" virtual environment... and because it has been working without issue (barring unavoidable JCenter/Maven repo 500s).

  1. Include ./gradlew clean step before ./gradlew bundleRelease. This step is included in /native/package.json: yarn clean-all, but doesn't get included in normal yarn cleaninstall.
  2. Specify --no-daemon flag. From the Gradle docs:

Since Gradle 3.0, we enable Daemon by default and recommend using it for both developers' machines and Continuous Integration servers. However, if you suspect that Daemon makes your CI builds unstable, you can disable it to use a fresh runtime for each build since the runtime is completely isolated from any previous builds. (https://docs.gradle.org/current/userguide/gradle_daemon.html)

There's varying opinions online about whether this is necessary, but anecdotally I've noticed quite a few error messages involving the Gradle daemon when scanning through the logs of failed builds. It doesn't seem to add much overhead to the build time, and it seems like there's a possibility that it improves reliability. I haven't tested this in any meaningful/robust way...

  1. Pass in jvmargs from the command-line. Was previously doing this via environment variable (GRADLE_OPTS) and $HOME/gradle.properties... but looks like passing from command-line has highest precedence... and is the only option that has higher precedence than project-level gradle.properties. I was previously mistaken in thinking GRADLE_OPTS and $HOME/gradle.properties were being taken into account when I previously tried to resolve the Out of Memory issue and claimed I'd tried adjusting jvmargs.

Here is the discussion on precedence from the docs (https://docs.gradle.org/current/userguide/build_environment.html#sec:gradle_configuration_properties):

1209.png (896×2 px, 266 KB)

Here are the arguments I'm passing:

"-Dorg.gradle.jvmargs=-Xmx32g -XX:MaxPermSize=32g -XX:+HeapDumpOnOutOfMemoryError"

All of the Android CI machines currently have >= 64GB of RAM, may have to adjust this up/down depending on what machine-types we go with in the future. In the future we could be smart about this and get information about available RAM from /proc/meminfo?

Test Plan

Ran the builds a few times with/without these changes and observed that they took marginally longer with the changes.

I think we'll just need to see over time whether this improves reliability?

Diff Detail

Repository
rCOMM Comm
Branch
may18 (branched from master)
Lint
No Lint Coverage
Unit
No Test Coverage

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
atul requested review of this revision.May 18 2022, 11:36 AM
atul retitled this revision from CI, ignore to [CI] Potentially improve reliability of Android CI.May 19 2022, 6:07 AM
atul edited the summary of this revision. (Show Details)
atul edited the test plan for this revision. (Show Details)
atul added reviewers: tomek, ashoat.
ashoat added a subscriber: max.

I wonder if ./gradlew clean may cause more failures than it reduces (I remember we had an issue with it failing before, cc @geekbrother) but the rest of it makes clear sense to me

This revision is now accepted and ready to land.May 19 2022, 9:43 PM

I wonder if ./gradlew clean may cause more failures than it reduces (I remember we had an issue with it failing before, cc @geekbrother) but the rest of it makes clear sense to me

Ah yeah I vaguely remember some issue along those lines... something to do with the native C++ code (eg glog, folly, grpc) being built and linked?


I'll remove ./gradlew clean for now since I think the --no-daemon flag and jvmargs are more likely to make a difference. Can always revisit after seeing how things go.

remove ./gradlew clean (for now)

Ah yeah I vaguely remember some issue along those lines... something to do with the native C++ code (eg glog, folly, grpc) being built and linked?

Yeah. Agree with the decision to remove for now... we can bring it back if we continue to have problems and think it may help.