Ticket #112 (new defect)

Opened 1 year ago

Last modified 3 days ago

The semantics of 'const'

Reported by: lth
Assigned to: anonymous
Type: defect
Priority: major
Milestone:
Component: Spec
Version: Harmony
Keywords: const variables
Cc: brendan, graydon, jeffdyer, waldemar, david-sarah@jacaranda.org

Description (last modified by lth)

What are the semantics of const? Where can const be used?

We've previously had this general notion that const definitions are "write-once variables", but without a lot of detail beyond that (e.g., what happens if you try to write it a second time, or what you get if you read it before you write it the first time).

However, "write-once variables" is not what was implemented in the RI. Jeff summarized those semantics as follows:

  • const instance variables are non-const in the constructor but read-only thereafter
  • const static variables must have an initializer in the variable definition
  • const variables elsewhere are write-once

Those semantics seem somewhat arbitrary and do not fit the notion of "write-once variable" very well.

There seem to be two schools of thought:

  • SpiderMonkey (Firefox) implements a simple and fairly permissive notion of const, which is detailed in comment number 5 below
  • Other participants have expressed a desire for a more rigid notion of constant, where the program throws an exception if a const is read before it's been initialized or if it's written after it's been initialized. Efficient implementation is a little harder than for SpiderMonkey semantics, because fairly pervasive read and write barriers are required.

The proposal currently is to follow SpiderMonkey.

Change History

  Changed 1 year ago by lth

Also see #24.

  Changed 1 year ago by lth

  • owner deleted
  • component changed from Proposals to Spec

  Changed 1 year ago by lth

Write-once is nice in principle but suffers the same problems as null -- too great a risk of getting exceptions later (when an uninitialized variable is read).

Requiring initializers everywhere is a bit inflexible.

We could allow the use of settings in classes, as for non-nullable values; this eases the pain a little there. But since static consts are just instance consts on the metaclass, static consts must have initializers. This may be too inflexible, but maybe not.

Current thinking: require initializers (or settings on ivars); see what breaks. No decision at Oct 16 phone meeting, though.

  Changed 1 year ago by lth

Some cases that aren't fixed by the "current thinking" without some further thought.

Case 1: class initializer causes class constant that will be initialized later to be accessed from the outside; that access does not know about the structure of the class. Ergo a read barrier is required.

    function f(obj) { return obj.x }   // reads x without knowing the structure of obj
    class C {
        public static var a = f(C)     // runs before the initializer of x below
        public static const x = 10
    }

Case 2: by normal rules for initialization, properties are created on the variable object before code is run; in the following example the global constant x will be accessed before it is initialized. Thus the access requires a read barrier.

    function f() { return x }   // the property x already exists when f runs
    f();                        // reads x before its initializer has run
    const x = 10

Conclusion: if we want to avoid read barriers on property accesses (or really, the exceptions that might be thrown when the barrier faults), "const" needs to have more restrictions than "var".
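For comparison only: later editions of the language (ECMAScript 2015 onward, which postdate this comment) ended up with a read barrier of exactly this kind for const, so a read before the initializer has run throws. The sketch below replays Case 2 in present-day JavaScript. The wrapper function case2 and the names early and late are invented here for illustration and are not part of the ticket.

```javascript
// Case 2 in present-day JavaScript, wrapped in a function so it is self-contained.
function case2() {
  function f() { return x; }   // x is in scope here but may not be initialized yet

  let early;
  try {
    early = f();               // runs before the initializer of x below
  } catch (e) {
    early = e.name;            // record which kind of error the barrier raised
  }

  const x = 10;                // from here on, reads of x succeed
  return { early, late: f() };
}
```

Calling case2() reports a ReferenceError for the early read and 10 for the late one.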

What are the options for const (broadly speaking)?

  • write-once var, if you read before defining you get undefined (write barrier)
  • write-once var, if you read before defining you get exception (read and write barriers)
  • "Jeff semantics", see the Description (read(?) and write barriers plus syntactic loopholes)
  • initialized-at-creation constant, syntactic restrictions (unknown at the moment) to prevent reading before writing (as for nullability), no chance of undefined or exception
  • defer to ES5
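The first two options differ only in what an early read does. A rough model of the two barrier styles, written in present-day JavaScript rather than ES4: the class WriteOnceCell, its throwOnEarlyRead flag, and the choice to throw on a second write are all assumptions made for this sketch, since the list above does not say how a rejected write should behave.

```javascript
// Hypothetical model of a write-once "const" slot; not ES4 and not engine code.
class WriteOnceCell {
  constructor(name, throwOnEarlyRead) {
    this.name = name;
    this.throwOnEarlyRead = throwOnEarlyRead; // false: option 1, true: option 2
    this.initialized = false;
    this.value = undefined;
  }

  // Read barrier: only option 2 needs this check.
  read() {
    if (!this.initialized && this.throwOnEarlyRead) {
      throw new ReferenceError(this.name + " read before initialization");
    }
    return this.value; // option 1 yields undefined before the first write
  }

  // Write barrier: both options need this check.
  write(value) {
    if (this.initialized) {
      // Assumption: a second write throws; it could equally fail silently.
      throw new TypeError(this.name + " is already initialized");
    }
    this.value = value;
    this.initialized = true;
  }
}
```

The cost argument in the discussion is about where such checks must be emitted: the write check is needed under both options, the read check only under the second.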

I think const as an idea and as a promise is too valuable to defer. Remember also that most code will want to initialize early; all the code in the builtins initializes in the definition (or in the settings, in the case of instance variables). The guarantee that is provided in practice by "I know I've initialized it and its value will not change" is worth a bit.

I favor either of the "write-once var" solutions (probably the second one if I had to choose now). If those are adopted, it does not matter if "const" hoists or not (though one must worry about multiple initialization and either allow repeated writing with the same value, or forbid reinitialization; in the latter case a syntactic restriction to the top level makes the most sense). The solutions are compatible with SpiderMonkey AFAIK. Strict mode can be given some teeth to, say, require all const instance vars to be initialized by the end of the settings, as for nullability (this is good style anyhow) and probably require statics to have initializers (ditto).

(Note how similar these are to "Jeff semantics" too, in strict mode.)

Cost: In general, every property access in the system has a read barrier to handle getters. Optimizations are possible if the type of the object is known and the field is known not to be a getter; the same applies to a constant: if a constant property is known to be initialized by the time the object has been created, and the referent knows the type of the object, then the read barrier can be elided. This will be known (statically) to the runtime.

Code in a package accessing global variables in the package will often know whether a read barrier is needed (though not always). Etc.

So what about those exceptions? I say they are a red herring, that const is not like nullability, that almost every const will be initialized early and then can't get a wrong value by mistake later, and that that's the use case.

  Changed 1 year ago by lth

After further consideration, I'm prepared to be happy with the SpiderMonkey semantics, which are as follows:

  • only const statements can initialize constant properties anywhere
  • const can reinitialize: in a loop the const value can be updated
  • only one const statement can exist in the scope of a variable object for a particular name
  • there are no restrictions on reading a const name before it's set
  • const hoists
  • const creates ReadOnly properties, which is to say that writing to a const-defined name (by anything but a const statement) silently fails
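A loose approximation of the ReadOnly and reinitialization points above, using ordinary property attributes in present-day JavaScript. This is not SpiderMonkey's implementation. The helpers constStatement and plainAssignment and the scope object are hypothetical names standing in for a const statement, an ordinary assignment, and a variable object.

```javascript
// Stand-in for executing `const name = value` against a variable object.
function constStatement(scope, name, value) {
  Object.defineProperty(scope, name, {
    value: value,
    writable: false,     // ordinary writes are rejected
    enumerable: true,
    configurable: true,  // lets a later const statement redefine the value
  });
}

// Stand-in for `name = value` written anywhere other than a const statement.
// Reflect.set reports rejection by returning false instead of throwing.
function plainAssignment(scope, name, value) {
  return Reflect.set(scope, name, value);
}
```

With these helpers, an assignment after constStatement(scope, "x", 1) returns false and leaves x at 1, while running constStatement again (as a loop body would) updates the value.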

Amendments:

  • settings in class initializers can set const instance properties
  • the default value of a const property depends on its type; if the type is int and the property is read before being initialized, then the reader sees 0

I would like to recommend that strict mode does the following:

  • prohibit const statements except at the top level (so hoisting does not matter)
  • if the type of an object is known and code tries to set a property that is known to be const, then verification fails.

  Changed 1 year ago by lth

  • cc changed from brendan,graydon,jeffdyer to brendan, graydon, jeffdyer, waldemar
  • description changed from Jeff summarizes: * const instance variables are non-const in the constructor but read-only thereafter * const static variables must have an initializer in the variable definition * const variables elsewhere are write-once Lars flips out. Discuss. to What are the semantics of `const`? Where can `const` be used? We've previously had this general notion that `const` definitions are "write-once variables", but without a lot of detail beyond that (eg, what happens if you try to write it the second time, what do you get if you read it before you write it the first time). However, "write-once variables" is not what was implemented in the RI. Jeff summarized those semantics as follows: * const instance variables are non-const in the constructor but read-only thereafter * const static variables must have an initializer in the variable definition * const variables elsewhere are write-once Those semantics seem somewhat arbitrary and do not fit the notion of "write-once variable" very well. There seems to be two schools of thought: * !SpiderMonkey (Firefox) implements a simple and fairly permissive notion of `const`, which is detailed in comment number 5 below * Other participants have expressed a desire for a more rigid notion of constant, where the program throws an exception if a `const` is read it before it's been initialized or if it's written after it's been initialized. Efficient implementation is a little harder than for !SpiderMonkey semantics, fairly pervasive read and write barriers are required. The proposal currently is to follow !SpiderMonkey.

  Changed 1 year ago by lth

Amending comment 5, const in SpiderMonkey hoists to the nearest variable object (by analogy with var). In ES4, the intent is for const properties to be visible as properties on classes, unlike let or let const.

follow-up: ↓ 9   Changed 1 year ago by lth

  • summary changed from Diverging 'const' semantics to The semantics of 'const'

in reply to: ↑ 8   Changed 1 year ago by waldemar

This is too ad-hoc to me. The important principle of const is that it doesn't change -- if you read it successfully twice, you'll get the same value. Thus, reading a const before it has been set should not be allowed; the same goes for reinitialization. For a local const, any attempt to reference it before it has been set should be a compile-time error.

  Changed 1 year ago by brendan

Waldemar, compile time analysis is not something to mandate, since ES1-3 have avoided it and very small (~100K code size) target device implementations for ES4 are under way. So the general agreement has been to stick to ES3 Chapter 16's allowance for compile-time exceptions in some cases, while not mandating them. We acknowledge that a prime mover can force a de-facto standard here, but it has not been an issue so far.

/be

  Changed 1 year ago by waldemar

That has been an issue so far. For example, try returning an EvalError if someone renames eval.

The small device argument doesn't make sense. I've worked with full compilers that ran in 64Kbytes, and static scope checking was not a problem there.

  Changed 1 year ago by lth

EvalError is an anomaly. In the case of general names the compiler must keep an inventory of all names seen in a scope so that it can signal an error if a later definition of the name is found; this data structure is of no utility during evaluation, and it can become arbitrarily large. Often it's not very big, to be sure, but sometimes it will be. In the case of eval, on the other hand, only one specific name matters, so even though eval can be declared locally and the compiler must handle that (eg signalling errors for eval = 37 if it occurs without a shadowing binding for eval), only a constant amount of data per scope is necessary. In practice, if you want to throw EvalError you do it at run-time anyhow, because later error reporting is in the spirit of the language.

I'm not aware of aspects of ES3 that require the compiler to maintain arbitrary amounts of compile-time only data in order to report errors at compile time.

  Changed 1 year ago by waldemar

You misunderstood the significance of EvalError. The problem it causes is that since ES3 gave implementations an option to throw it or not, in fact no implementation can throw it because some don't and in practice that would break programs. The consequence is that implementations must do extra analysis and keep extra state just in case someone calls eval using a different name. This is strong evidence against Brendan's point above.

  Changed 1 year ago by brendan

The analysis required to deal with indirect eval is a burden, but it's different in both degree and kind from the analysis required to give compile-time error on use of a const before the const has been initialized. Lars has experience from Opera dealing with eval (IIRC, Opera originally tried throwing EvalError).

So I do not think EvalError provides strong evidence against a different proposition. The problem with mandating a different analysis for const may be specific to the details and costs of that analysis. We should talk about this in due course, get it on a meeting agenda. It's fine to keep going here, but more specific arguments that compare the analyses are needed to avoid repetition. And email may be better, since Trac's little mind can't really hold a properly threaded, deep discussion.

/be

  Changed 1 year ago by lth

If you interpret ES3 strictly, the only legal way of using it is as "eval(s)" -- no receiver object, no renaming allowed. Ergo, eval is an operator, and a simple implementation just emits code for static uses of eval that checks whether the current binding of eval, at the time the code is reached, is identical to the original binding of eval, and if it is, invokes suitable system code. (This requires a constant small amount of state at run-time and constant work at compile time, assuming identifiers are interned.) The global function eval simply throws EvalError.

On the web that's not quite sufficient because you need to allow "window.eval" for arbitrary window objects, but that's something that can be handled internally in the eval function. Still no analysis required.
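The identity check described above might look roughly like this when written out by hand. It is only a sketch of the idea: evalCallSite and originalEval are hypothetical names, and the indirect call used here evaluates in the global scope, whereas the "suitable system code" in a real engine would evaluate in the caller's scope.

```javascript
// Captured once at startup; stands in for the engine's built-in eval.
const originalEval = globalThis.eval;

// Hypothetical lowering of a syntactic `eval(src)` call site.
// currentBinding is whatever the name `eval` resolves to when the code is reached.
function evalCallSite(currentBinding, src) {
  if (currentBinding === originalEval) {
    // The name still refers to the built-in: treat the call as the eval operator.
    return originalEval(src);
  }
  // The name was rebound: this is an ordinary call to a user function.
  return currentBinding(src);
}
```

The only run-time state is the single captured reference, which is the constant amount of data the comment refers to.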

  Changed 1 year ago by lth

Some progress:

  • a const can be written just once
  • a const can't be read before it's written
  • strict mode can do more than standard mode to flag errors

What we have not made progress on:

  • whether errors are compile-time or run-time (see #253 for similar issues)
  • whether const hoists like var or is block-scoped like let

  Changed 1 year ago by graydon

This progress is encouraging, but I am still a little confused about when const properties are checked. In particular, if I place a const field in a class, shall I check that it is initialized during the settings of the class (as I would for a non-null field) or shall I leave the const potentially uninitialized until arbitrarily far in the future, when it is "first written"? The former reuses machinery we already have; the latter requires a new bit representing "const but not yet written-to by user code", orthogonal to initialization state, as nullable consts would presumably auto-initialize to null despite this not representing a "true" first write.

I am going to pursue the former -- model const the same way as non-nullable, using the initialization status -- for the time being, and fault if a const field of a class is not written-to during the settings of the constructor. But I'd like confirmation: it is essential to nail down not only which cases of writing constitute errors, but also which cases of not writing constitute errors (if any).

  Changed 1 year ago by lth

Graydon, there's as yet no indication at all that we will go for the nullability restriction here. Global, static, and local const properties will almost certainly require a barrier from what I have managed to see of consensus so far. That doesn't have to be true for instances, but I'd almost argue it ought to be, for uniformity.

So adding to my above list:

What we have not made progress on:

  • Any rules for whether there are deadlines for const initialization

follow-up: ↓ 20   Changed 1 year ago by lth

Building on earlier agreements (see above) and various discussions, here is a further proposal.

The general idea is that const bindings/properties should behave like non-nullable variables in order to reuse that machinery, since a not-initialized exception for const is very much like a not-initialized exception for a non-nullable variable.

Details:

  • const instance properties must have values by the time the constructor body is entered; they can be initialized by the settings section of the constructor or by initializers in the const directives in the class body
  • const bindings that are global, static, or local (whether "const" or "let const") must have an initializer
  • const hoists to the nearest variable object
  • accesses to a const binding before it has been initialized are run-time errors in standard mode
  • we will define circumstances for which strict mode will be required to flag those errors at compile time, notably, uses of local const names not dominated by the initialization of those same names.

Rationale for hoisting: The spirit of ES is that every binding hoists, and in ES4 only "let" names (including "let const") hoist to the block object, not the variable object. Ergo const hoists to the variable object.

Rationale for run-time errors: The spirit of ES is that errors are deferred until they are unavoidable, for greater programming convenience. Strict mode can be used by those who desire earlier error detection.

(In other words, there are precedents for choosing hoisting and run-time errors over non-hoisting and compile-time errors.)
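To make the instance-property rule concrete, here is a rough emulation in present-day JavaScript, which has no settings section. The helper initConstFields, the Point class, and the settings argument are all invented for this sketch; the helper call stands in for the ES4 settings and must come first in the constructor.

```javascript
// Hypothetical helper: installs every required "const" field from the settings,
// or fails before the constructor body proper gets to run.
function initConstFields(target, required, settings) {
  for (const name of required) {
    if (!Object.prototype.hasOwnProperty.call(settings, name)) {
      throw new TypeError("const field '" + name + "' not initialized by the settings");
    }
    Object.defineProperty(target, name, {
      value: settings[name],
      writable: false,      // fixed from here on
      enumerable: true,
      configurable: false,
    });
  }
}

class Point {
  constructor(settings) {
    initConstFields(this, ["x", "y"], settings); // stands in for the settings section
    // Constructor body proper: x and y already have their final values here.
    this.norm = Math.hypot(this.x, this.y);
  }
}
```

Under this model new Point({ x: 3, y: 4 }) succeeds, while new Point({ x: 3 }) fails at construction time rather than at some later read.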

in reply to: ↑ 19   Changed 1 year ago by graydon

Replying to lth:

The general idea is that const bindings/properties should behave like non-nullable variables in order to reuse that machinery, since a not-initialized exception for const is very much like a not-initialized exception for a non-nullable variable.

I find the details presented in this post coherent, sufficient and agreeable, with one exception: by talking about initializer-dominance, you are suggesting a weakened definite-assignment analysis. I'm uncertain about this (and if we accept it, whether we should push the same rule into instance initialization, rather than demanding settings).

Can others chime in? We're getting closer on this bug.

  Changed 1 year ago by lth

I am absolutely open to other/more precise language when it comes to what strict mode should do.

  Changed 1 year ago by graydon

Two more weeks have passed. Can the others CC'ed on this bug chime in? Notably:

  • Should const instance members require initialization before the constructor is entered? Or by the end of the constructor? Ever?
  • Should const globals and statics have a must-be-initialized point, or should we delay checking them until first access?
  • Should a write to an already initialized const fail silently or throw?
  • If we delay checking globals and statics, should we allocate them to "uninitialized" state -- such that a read throws -- or should we allocate them to "default value" state for cases where such a value exists, as we do for instance members? For example, should const x : int be 0 before it is "first explicitly written"?

The RI continues to behave in a way that surprises most people. This is bad.

  Changed 1 year ago by waldemar

I commented on most of these before. To summarize:

  • Const instance members should be initialized at the very least before the constructor leaves. I'm agnostic on whether they should be initialized before the constructor enters.
  • If you make a const global, the question of a must-be-initialized point seems moot. The only way to initialize one is to do it in the definition itself, so the global gets initialized whenever that definition is executed. I assume const statics work the same way.
  • A write to an already initialized const should throw. There may be sticky issues with backwards compatibility here, but I believe that throwing is the right thing to do if at all possible. If it's not possible in some cases due to backwards compatibility, this might be an area where strict mode would diverge from regular mode.
  • You must never be able to produce a situation where reading the same const variable twice returns different values. As such, there must not be a default state.
  • Where it makes sense, I'm in favor of not even making it possible to write code that would attempt to read a const variable (say, local) before it's initialized.

  Changed 11 months ago by graydon

Can anyone who disagrees with waldemar's position please chime in? Both the positions he's expressed here -- which I'm happy to support and implement -- and also the lingering two open points he does not define, but which I will express my preference on:

  • Required point of initialization: before constructor is entered, like a non-nullable.
  • Ability to write code that reads-from before initialization: permitted only in settings list, but throws if executed in an order that attempts to read an uninitialized value. The only way to prohibit it entirely is to require that a const var has its value given at the point of declaration, which is fine for statics and locals, but I think too restrictive for instance consts. They need to be able to take values passed in new expressions.

  Changed 6 months ago by airforce1

  Changed 3 days ago by David-Sarah Hopwood

  • cc changed from brendan, graydon, jeffdyer, waldemar to brendan, graydon, jeffdyer, waldemar, david-sarah@jacaranda.org
  • keywords set to const variables
  • version changed from 4 to Harmony
  • milestone deleted