Skip to content

Provide syntax for functional record update operations #3072

@ggreif

Description

@ggreif

From the README.md of a certain project:

Technically the record types of Motoko form a meet sub-semilattice
of the global type lattice. Unfortunately the value side of things is less
convenient, and Motoko doesn't (yet) provide the record extension
(concatenation) and record field update operations (along with
correspondingly intuitive syntax). In contrast, moving up in the meet
semilattice is accomplished by record subsumption (i.e. forgetting
fields is always automatic) already.

Sadly, the lack of these shows in the Motoko code, we have to copy fields a lot. With the record features the thing would be much more compact.

@krpeacock also asked for this in the past.

This all needs some design work, e.g. should record concatenation be biased to the left when duplicated fields occur, etc.
(My suggestion would be to error on non-empty joins, and require users to upcast in order to forget fields.)

Strawman

Let's explore the linguistic construct like "I want sports seats for my existing car". That means I have existing_car and get a new version of it such that revised_car.seats == #sport holds. So it would be neat to do such revisions in Motoko with the syntax

let revised_car = { seats = #sport for existing_car };

So I suggest:

(Repeated) field extension/update could be written with the for keyword:

{ new_field = 42; another = "Easy" for existing_record }

When existing_record already has a mentioned field then it would be updated, otherwise extended. Semicolon would be optional before the for. One could have and instead of for, which would give us nice record concatenation syntax too:

let settings = get_settings(key, settings_db);
let funds = get_funds(key, finances_db);
return { settings and funds }

It would also parallel how record type meets are written. This syntax is currently forbidden by Motoko:

> { settings and funds };
stdin: syntax error [M0001], unexpected token 'and', expected one of token or <phrase> sequence:
  }
  <annot_opt>
  ; seplist(<exp_field>,<semicolon>)
  = <exp(ob)>
  <annot_opt> = <exp(ob)>

Iterated concatenation (also combined with field extension/update) is possible:

{ timestamp = now() and settings and funds }

Open questions

Punning

What about field name punning?

let timestamp = now();
{ timestamp for settings and funds }

This would work, but below not:

let timestamp = now();
{ timestamp and settings and funds }

Here timestamp wants to be a punned field, not a record.

Presence of var fields

Since this issue is about functional record update, I am inclined to leave those out for now. But if somebody needs them and can give a clean and sound spec, we can put those in later.

Bias

Should we allow common fields in { recL and recR }? And if so, which side's field value should show up in the result?

Other keyword or a new one?

The white paper suggests with (inspired from the OCaml language?) as a freestanding binary operator like the boolean connective and. This would be a fresh introduction of a new keyword, which could render previously written programs uncompilable.

But this would become a strangely typed operator, with different rules than the other freestanding ones. Thus my suggestion to always have braces around as a visual clue that record extension is happening. Even OCaml does this with

{ record_val with field = value }

Similarly, braces in { recL and recR } make clear that this is not the good old boolean connective. This is similar in spirit to the type intersection syntax type User = Profile and Funds where the type is an indicative of a different context.

One could ponder using the in keyword too.

Being a pattern?

Can we have switch rec { case { f1; f2 in other_fields } ...} patterns? Here other_fields pattern variable would absorb all the fields not mentioned as f1, f2. Note that other_fields would have monomorphic type (upcasted rec with the explicitly mentioned fields stripped). We currently don't have type-level machinery to deal with universally quantified field lists.

Formalisation

This section goes into the finer details of the proposal above, with a more formal approach to syntax, typing and evaluation.

Syntax for record field extension/update

<exp_obj> ::= 
    '{' <list(<exp_field>, ';')> '}'                   # object literal
    '{' <exp_field> (';' <exp_field>)* 'in' <exp> '}'  # field extension/update

Basically, a record/object extension/update looks like a literal record portion followed by in and a base object. The syntax doesn't distinguish between field update and extension, we leave that to the semantics, as laid out in the next subsections.

Notably there is no semicolon allowed before the in and the closing brace.

Typing rule for record field extension

Consider a single field extension/update $\left\{ lab = exp \: \mathtt{in} \: base \right\}$. Furthermore assume that base has an object type and all fields of base are immutable.

We say a record gets extended by a $\left\{ lab : t \right\}$ when base.lab is a type error, e.g. base is label-disjoint to {lab}.
In this case we have a rule

$$ exp : t \hspace{55pt} base : \left\{ l_1 : t_1; …; l_n : t_n \right\} \over \left\{ lab = exp \: \mathtt{in} \: base \right\} : \left\{ lab : t; l_1 : t_1; …; l_n : t_n \right\} $$

Typing rule for record field update

When base is not label-disjoint to {lab}, then we have $base : \left\{ lab : t'; l_1 : t_1; …; l_n : t_n \right\}$ and we define field update as a composition of field stripping (i.e. upcast, eliminating lab) followed by field extension:

$$ exp : t \hspace{30pt} base : \left\{ lab : t'; l_1 : t_1; …; l_n : t_n \right\} \over \left\{ lab = exp \: \mathtt{in} \: base \right\} : \left\{ lab : t; l_1 : t_1; …; l_n : t_n \right\} $$

Typing for many fields

For repeated fields $\left\{ lab_1 = exp_1; lab_2 = exp_2; lab_3 = exp_3; …; lab_m = exp_m \: \mathtt{in} \: base \right\}$ the invariant holds that the record literals formed by $\left\{\overline{lab_n = exp_n}\right\}$ are label-disjoint, so the order in which the extensions/updates are applied doesn't change the resulting type. Hence we define the multi-field case as iterated application of either of the above rules to a successively built-up

  • $b_0 = base$,
  • $b_n = \left\{ lab_n = exp_n \: \mathtt{in} \: b_{n-1} \right\}$.

Thus $\left\{ lab_1 = exp_1; …; lab_m = exp_m \: \mathtt{in} \: base \right\} := b_m$.

Evaluation rule for record field extension/update

The building of extended/updated records is operationally performed just like the creation of the record literals, and is driven by the resulting type:

For { lab_1 = exp_1; … lab_m = exp_m in base } : { l_1 : t_1; …; l_n : t_n } we perform as follows:
bind local variable v_n : t_n to

  • exp_x, if lab_x = l_n
  • base.l_n otherwise.

Form the resulting runtime object (on the heap) as { l_1 : v_1; …; l_n : v_n }.

Metadata

Metadata

Assignees

Labels

P3low priority, resolve when there is timefeatureNew feature or requestlanguage designRequires design work

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions