Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add output object type inference based upon Schema definition #697

Merged
merged 27 commits into from
Jun 18, 2024

Conversation

shawnmcknight
Copy link
Member

This PR enhances the capabilities of MVOM for type inference based on the schema definition. These changes allow, given a schema, to infer the output types of either a Document or a Model instance.

The following is a summary of the changes:

Schema class

The Schema class has been changed so that there is now a generic TSchemaDefinition on the class which will be inferred based on the provided definition parameter to the Schema constructor. This generic is the basis for all other changes included in this PR.

Several new utility types have been added to the schema.ts module to allow for output object type inference:

The exported InferDocumentObject type accepts a Schema as a generic and will output a type which aligns with the structure of a Document that was instantiated based upon that schema's definition.

The exported InferModelObject type accepts a Schema as a generic and will output a type which aligns with the structure of a Model that was instantiated based upon that schema's definition. The primary difference between this type and the InferDocumentObject type is that a Model instance will include the _id and __v properties, so this is effectively an extension of the InferDocumentObject type.

An internal InferSchemaType type provides most of the work regarding the inference. It will accept a schema's type definition (that which is assigned to a schema's property value) and recursively process it to determine the output type of a property in the schema's definition. It will handle each scalar type as well as embedded definitions, embedded schemas, scalar arrays, nested scalar arrays, and document arrays.

An internal InferStringType type provides additional utility for string schema type definitions. Since those type definitions may have an enumeration constraint, this type will check if there is a defined enumeration in the definition. If there is, and it is a readonly array, then the output of the string type will be a union of the enumerated strings.

An internal InferRequiredType type will detect whether a scalar type is required or not. If it is not required then the resulting output will be unioned with null.

Usage

Consumers of MVOM can use the InferModelObject and InferDocumentObject types to generate a type which will provide the shape of the output for a Model or Document as follows:

import { Schema } from 'mvom';
import type { InferDocumentObject, InferModelObject } from 'mvom';

const schema = new Schema({ stringProp: { type: 'string', path: '1' } });

type DocumentOutput = InferDocumentObject<typeof schema>; // { stringProp: string | null }
type ModelOutput = InferModelObject<typeof schema>; // { _id: string; __v: string; stringProp: string | null }

Testing of the utility types

The Schema.test.ts suite was updated to have several new tests to confirm that the utility types emit the expected output type. A new utility type named Assert was created which will compare two types and return true if they match and an error output if they do not. This Assert type was used to compare various schemas to their expected output.

This test suite can also provide a lot of insight into the expected output types for various schema definitions.

Note: A failed type assertion would not fail any unit tests, but it would trigger typechecking errors. They have the same end result which is that the CI suite would fail.

compileModel.ts & dynamic Model class

The compileModel function has been augmented to take advantage of the InferModelObject inference. The schema provided to the function will have its output inferred. This output is used to form a new ModelCompositeValue type which is the intersection of the Model instance and the inferred output. The various static and instance methods on the Model class which return instances of the Model will now be of this composite object. Effectively, those methods will now return a type which has all of the properties strongly typed as defined by the schema.

Additionally, when instantiating a new Model, the data property supplied to the constructor must now comply with the inferred object shape. That is, there is type safety applied to the data that would construct a Model instance.

Document class

Similar to the changes made to the Model class, the Document class now outputs a DocumentCompositeValue from the static methods that can instantiate a new Document instance. This composite type will be the intersection of the Document instance and the inferred output object.

Schema type validator changes

Prior to this PR, the validators for a schema type would accept not only the value being validated, but also the Document instance they were being validated for. The passing of the Document instance was never used by any validators. Because the Document is now generic, this complicated the validation as all schema types would have needed to be provided the type of the Schema which they were a member of. Since this validation was never used anywhere, it was far simpler to simply remove the document parameter from the validators.

Query class changes

The Query class accept a Model constructor previous to this PR. However, TypeScript was complaining about this constructor due to the new composite output type of the Model methods. Instead, the Query class was modified to individually accept as parameters the things it was using from the Model constructor -- the connection, schema, and file. The Model method were adjusted to provide this information directly instead of providing its own constructor. This should have no impact to anything with the Query class as the end result is identical.

Copy link
Contributor

@philfuster philfuster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good. Lots to learn here about the things you can do with TypeScript types. Thank you for requesting my review.
I just had a few questions. Thank you.

TString['enum'] extends readonly (infer E)[] ? E : string;

/** Infer the output type of a schema type definition */
type InferSchemaType<TSchemaTypeDefinition> =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice.

@@ -376,3 +378,767 @@ describe('transformPathsToDbPositions', () => {
expect(schema.transformPathsToDbPositions(['not.here'])).toEqual([]);
});
});

describe('utility types', () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you are testing a handful of cases per type.
I can understand why you wouldn't want each one to have its own test block, that's quite a few.
The one benefit of having their own test block would be to provide a description, which I think would help clarify exactly what part of the inference you are testing. For example, testing that the required property is infered correctly vs. when required is not present/false.

I am wondering if in place of separate test blocks, a comment would provide that same clarity.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the space between each schema definition and test pair does a pretty good job, but a comment might further clarify things for someone new to the mvom library.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, tests 3 and 4 in the "should infer string type" block are definitely readable. It took me a second, but I was able to surmise what functionality you're testing there. I would imagine someone looking at mvom for the first time might find it difficult to truly understand your intent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An argument against this could be maintenance burden. If something changes, then the developer needs to make sure they update these comments. I guess the same could be said about test descriptions, but comments don't stand out as much, so maybe js-doc comments would be more eye-catching.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to make a bunch of tests because these aren't real tests. They're just a container to keep the type assertions in.

I did add comments to each individual test section.

const schema1 = new Schema({ isoTimeProp: { type: 'ISOTime', path: '1' } });
const test1: Assert<
InferDocumentObject<typeof schema1>,
{ isoTimeProp: `${number}:${number}:${number}.${number}` | null }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are hard coding the expected string type interpolation for the ISOCalendarDate, ISOTime, ISOCalendarDateTime quite a few times in this file. Granted they probably won't ever change and if you get it wrong then there'd by type errors, but is it possible/is there any merit in making those interpolations shareable some how?
maybe something like:

export type ISOTimeFormat = `${number}:${number}:${number}.${number}`;

I tried it locally and it looks like it might work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added an exported type for each of these types and used them where possible.

@@ -96,7 +105,7 @@ class DocumentArrayType extends BaseSchemaType {
}

/** Generate subdocument instances */
private *makeSubDocument(record: MvRecord): Generator<Document> {
private *makeSubDocument(record: MvRecord): Generator<Document<TSchema, TSchemaDefinition>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You learn something new every time. This is my first time noticing the use of the * operator here and had to read up on Generators. I could see this being very helpful for large DEPOSITS records.

Copy link
Member

@kthompson23 kthompson23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great but I think we need special handling for _raw type records. In those cases the Schema is null so requires _raw to be defined somewhere.

export type InferModelObject<TSchema extends Schema<SchemaDefinition>> = {
_id: string;
__v: string;
} & InferDocumentObject<TSchema> extends infer O
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to have a special Model type for _raw records?

Since there's no Schema how dos this inference work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made some changes to improve the experience of schema-less models and documents in a794f2b. It wasn't really broken previously -- it just ended up using a generic SchemaDefinition which couldn't be used to infer anything at all. However, it could be improved, particularly around the behavior of the _raw property to ensure it is typed appropriately based on whether a schema is supplied or not.

I further added tests in 0a3cb8d to confirm the output types for Document and Model methods and included both schema-less and schema variants in them.

Copy link
Contributor

@philfuster philfuster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your changes look good to me! Thank you.

Copy link
Member

@kthompson23 kthompson23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thank you.

@shawnmcknight shawnmcknight merged commit d5cf1ca into main Jun 18, 2024
4 checks passed
@shawnmcknight shawnmcknight deleted the infer-type branch June 18, 2024 16:30
shawnmcknight added a commit that referenced this pull request Aug 6, 2024
#697 modified the `_raw` property of a document so that it was only
conditionally undefined if the document did not have a `Schema`
associated with it. However, this change had the side-effect of making
the `_raw` property an object key in 100% of document instances whereas
it was previously only conditionally added as a property. There is a
subtle nuance to a property not being present and being present but
undefined, and this exposed that nuance.

This PR changes things slightly such that the `_raw` property will only
be assigned a value (and thus only become an object property) if there
is no schema. If a schema is present then the type of `_raw` will be
`never` and it will not be assigned.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants