Page MenuHomePhabricator

[lib] Generate converters from validators
ClosedPublic

Authored by michal on Jun 28 2023, 4:23 AM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, May 1, 3:42 PM
Unknown Object (File)
Tue, Apr 30, 8:28 PM
Unknown Object (File)
Tue, Apr 30, 6:35 PM
Unknown Object (File)
Tue, Apr 30, 8:08 AM
Unknown Object (File)
Sun, Apr 28, 4:40 PM
Unknown Object (File)
Sun, Apr 28, 4:06 PM
Unknown Object (File)
Sun, Apr 28, 2:26 AM
Unknown Object (File)
Fri, Apr 26, 8:03 AM
Subscribers

Details

Summary

https://linear.app/comm/issue/ENG-3995/create-functions-for-migrating-the-client-stores

We need to migrate client db stores, and redux persist. We can't use validators themselves because they may be changed in the future. This diffs adds a script that takes a validator and generated a function (as a string containing js code) that converts the type described by this validator. This way the migration code will remain static even if the user updates the app. The generated function are kept in lib/_generated/migration-utils.js. The only manual change (other than pasting the generating functions) in this file is adding the required imports.

Example of the generated code for the linkStore:

export function convertInviteLinksStoreToNewIDSchema(
  input: InviteLinksStore,
): InviteLinksStore {
  return {
    ...input,
    links: Object.fromEntries(
      entries(input.links).map(([key, value]) => [
        '256|' + key,
        {
          ...value,
          primaryLink:
            value.primaryLink !== null && value.primaryLink !== undefined
              ? {
                  ...value.primaryLink,
                  role: '256|' + value.primaryLink.role,
                  communityID: '256|' + value.primaryLink.communityID,
                }
              : value.primaryLink,
        },
      ]),
    ),
  };
}
Test Plan

Tested with the rest of the diffs, by running schema id migrations on clients using the generated functions.

Diff Detail

Repository
rCOMM Comm
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

lib/utils/validation-utils.js
29–39 ↗(On Diff #28209)

This was needed to get the union discriminator values for conditional branching. The only other place where this is used is when debug printing the validators so this makes them potentially clearer to debug because we have concrete values.

  1. Can you annotate every any-cast with an explanation as to why it is needed? Please include the Flow error you're seeing, and the other things you've tried. Going forward, please always include this when using an any-cast (or something comparable, like Object or Function or $FlowFixMe)
  2. I worry that the codegenned file might get out of sync with its inputs. Before landing, can you create a Linear task to add a CI job that regenerates the file and confirms there are no changes? We should make sure we complete that task before we ship this project.
  1. I worry that the codegenned file might get out of sync with its inputs. Before landing, can you create a Linear task to add a CI job that regenerates the file and confirms there are no changes? We should make sure we complete that task before we ship this project.

Might be helpful to take a look at the JSI codegen CI approach:

cd native && yarn codegen-jsi && git diff --exit-code

where git diff --exit-code will return nonzero exit code if there are any changes in working copy and fail the CI job.

  1. I worry that the codegenned file might get out of sync with its inputs. Before landing, can you create a Linear task to add a CI job that regenerates the file and confirms there are no changes? We should make sure we complete that task before we ship this project.

Do you mean that it will desync before getting released or later? Because they not being synchronized, is the reason why we are generating static functions and not using validators directly. If at some point in the future, we update the validators we don't want to also update these functions because the migration code shouldn't change. So the CI job would have to be removed after shipping anyway.

Improve any-casts. Now they are cast immediately to the correct type, similar to how it is done in lib/utils/validation-utils.js. The main problem is that flow doesn't constrain th TType to one of it's variants (e.g. TInterface or TUnion) after checking the meta.kind field so we need to do that ourselves with the any casts.

Added converters for ConnectionInfo and CalendarFilter

This needs a very close review on the types. It looks pretty daunting, and I don't have time for this tonight unfortunately. Hopefully I will have some time tomorrow, but we'll see.

It would be a lot easier if you had done as requested and annotated each any type individually with an explanation for why it's necessary, as well as the error you're seeing. Ideally in a code comment. The easier you make to review for your reviewers, the sooner you'll be able to land it, and the less frustrating you'll make our experience

Do you mean that it will desync before getting released or later? Because they not being synchronized, is the reason why we are generating static functions and not using validators directly. If at some point in the future, we update the validators we don't want to also update these functions because the migration code shouldn't change. So the CI job would have to be removed after shipping anyway.

Makes sense

ashoat requested changes to this revision.Jul 2 2023, 7:19 PM

Copy-pasting my comment from earlier:

Can you annotate every any-cast with an explanation as to why it is needed? Please include the Flow error you're seeing, and the other things you've tried. Going forward, please always include this when using an any-cast (or something comparable, like Object or Function or $FlowFixMe)

Can you actually do this please?

This revision now requires changes to proceed.Jul 2 2023, 7:19 PM

Removed some uses of any, and annotated the rest. Additionaly:

  • here's a minimal repro of the error: flow playground
  • this is handled in the same way as we handled the error in D7488

Thank you, the types look great now!! I unfortunately haven't had time for a detailed review (just to complain about types 😅), so going to resign here so one of the original reviewers can take a closer look

The diffs looks really interesting!

Tested with the rest of the diffs, by running schema id migrations on clients using the generated functions.

Can we check if after the migration all the ids are prefixed?

keyserver/src/scripts/generate-converter-from-validator.js
28–29 ↗(On Diff #28383)

It is a really good idea to not reference other comments / functions in a comment. The referenced place could change independently and we won't notice. Instead, you can simply repeat a comment.

This revision is now accepted and ready to land.Jul 5 2023, 1:36 AM

Fixed comment

Can we check if after the migration all the ids are prefixed?

I've looked through redux and sqlite on both web and native, and checked if the ids were converted.

I've also checked the converted db with the validators:

const { messages, threads, messageStoreThreads } =
        await commCoreModule.getClientDBStore();

      const rawMessageInfos = messages.map(
        translateClientDBMessageInfoToRawMessageInfo,
      );

      convertClientIDsToServerIDs(
        '256',
        t.list(rawMessageInfoValidator),
        rawMessageInfos,
      );

      const rawThreadInfos = threads.map(
        convertClientDBThreadInfoToRawThreadInfo,
      );

      convertClientIDsToServerIDs(
        '256',
        t.list(rawThreadInfoValidator),
        rawThreadInfos,
      );

      const rawMessageStoreThreadInfos =
        translateClientDBThreadMessageInfos(messageStoreThreads);

      if (
        Object.keys(rawMessageStoreThreadInfos).some(
          threadID => !threadID.startsWith('256|'),
        )
      ) {
        console.log('INVALID');
      }

      console.log('done');
  • before migration: I got invalid_client_id_prefix error (because the ids weren't migrated)
  • after migration: no errors

There's actually another problem: the generated functions can't be typed with our types because they will get out of sync. What I mean by that is that these functions operate on data that is stored in sqlite/redux at this point in time. If in the future someone modifies e.g. RawMessageInfo then this new type could conflict with the generated functions' body and make a flow error.

Because of this I've changed the typing of the generated functions to any (and changed the generating function to match). The only other option would be to make a snapshot of our current types and keep them in some other file, but that's a much worse solution. I don't having these functions typed as any => any matters much, because:

  1. They will never be changed
  2. They will never be used anywhere else (except for the current id schema migration)
tomek requested changes to this revision.Jul 12 2023, 3:07 AM

There's actually another problem: the generated functions can't be typed with our types because they will get out of sync. What I mean by that is that these functions operate on data that is stored in sqlite/redux at this point in time. If in the future someone modifies e.g. RawMessageInfo then this new type could conflict with the generated functions' body and make a flow error.

That makes sense, and we're assuming state type in our migrations, so it seems consistent.

One issue though is that after landing this it might get out of sync before applying the migration. It is even worse when we use any as we won't have a chance to notice that. There are two solutions:

  1. Apply a migration in this diff
  2. Keep the types and replace them with any in a diff that applies the migration
This revision now requires changes to proceed.Jul 12 2023, 3:07 AM
michal added a subscriber: bartek.

I've readded the types. MediaValidator needed to be special cased because of flow issues (that's the same issue as @bartek had). The types are replaced with any in D8386.

I guess this makes sense. We should make sure that the boundaries of the any are well-enforced... any types can sometimes "infect" a codebase and travel to unexpected places.

I can't tell where the any types are used from a cursory glance at the diff stack, but would appreciate if @michal could check the types of things that reference the any types, to make sure that the any type isn't "infecting" anything.

This revision is now accepted and ready to land.Jul 13 2023, 1:25 AM

Yeah, use of this any is contastrained to this one specific migration, and it's used as a function passed to other functions (that are typed correctly) so it shouldn't 'infect' other code