• Ingen resultater fundet

2.2 Language Constructs for the Basic Concepts

2.2.2 Values and Immutability

The purpose of this section is to explain why it is sometimes necessary to men-tion that values are immutable. It is included because the concept of `value' and the concept of `object' sometimes seem to be hard to keep clear of each other. The reader who thinks that this is a trivial problem may want to skip the rest of this section. The concept of values and its relation to objects has been analyzed in [69], but we are not aware of a treatment that deals with the specic source of confusion which is the topic of this section.

In a typical computer hardware design, as seen from the machine code level and ignoring details about caches etc., there are a few CPUs and a store which is realized as an array of cells of 8-bit binary values, normally augmented with processor instructions for accessing 2, 4, or 8 of those bytes as a unit.

This design provides those values which may be represented in a few bytes with a special status, since they may be stored and loaded directly with built-in processor built-instructions, atomically. The view of the value as the state of a small group of atomically accessed memory cells, and of the group of memory cells themselves as a container for such a value is easy to grasp. Two dierent groups of cells may hold the same value and still be distinct groups, and two dierent values may be loaded from the same group of cells at dierent times.

Sofar, there is no need to talk about values being immutable, just like there is no need to emphasize that a value like `123' is immutable; `123' is `123', and any attempt to change it would be considered a silly and counter-productive exercise; for example, we don't want to worry about things like Do you mean the `123' of today or the `123' of yesterday?

However, values may be arbitrarily complex, so built-in hardware instruc-tions for the retrieval, storage, and management of values in general cannot (realistically) be implemented. As a simple case, a character string of arbitrary length may be kept in a contiguous area of memory cells and transferred to a similar area of cells by a loop which copies the contents of a few cells per iteration. The fact that retrieving and storing this value is not atomic in terms of machine code actions does make a dierence.

In context of concurrency, the abovementioned implementation of string value transfers might be considered wrong, because another thread might change single cells in the source area during the transfer, such that the value produced at the destination is not equal to the value present at the source at any point in time. If the source holds the value 'Hello, world!' and is later assigned the value'Veni, vidi, vixi!', then the destination might receive the value

'Hello, wor, vixi!'. As this shows, not even when considering the whole series of intermediate states at the source can it be explained what the resulting state is at the destination, because the state of the source was never 'Hello,

wor, vixi!'; in other words, this behavior cannot be explained using string val-ues, it can only be explained in terms of charvalues kept in separate, mutable,

char-sized chunks of memory.

Even in the strictly single-threaded case, aliasing invalidates any attempt to explain the behavior using string values only. An access path to any subset of the string storage area, e.g., access to a single memory cell via a (simple,char typed) variable, will enable changes to the composite value, the string, indirectly by changing the value of the simple variable. Again, since no computation, in-termediate or not, in the execution of the program produced the resulting value of the string, there is no way to explain this resulting value without describ-ing the strdescrib-ing as a composite, mutable entity containdescrib-ing a number of separate memory cells.

If we cannot use the concept of a composite value when explaining the behav-ior of program executions, then composite values are simply not implemented correctly.

Hence, in order to be able to explain the semantics using a notion of com-posite values, we must restrict the possible behaviors by imposing a certain discipline on the use of the individual memory cells. The discipline includes the

2.2. LANGUAGE CONSTRUCTS FOR THE BASIC CONCEPTS 21 following:

The entire set of memory cells used to hold a composite value must be updated only as a whole (no subparts of it can be updated via other access paths, e.g., using variables with simpler types)

Any change to the value by means of a sequence of changes to parts of the representation must happen atomically (in a critical region), such that no retrieval of the composite value will ever obtain an intermediate, half-updated result

Any retrieval of the composite value must happen atomically, perhaps overlapped with other retrievals, but not overlapped with updates A simple solution which satises this is to allocate fresh memory to hold a given composite value and then never change it afterwards; any change to a variable holding that value would require allocation of more fresh memory and construction of the new value in there; on the other hand, it is safe to let many variables refer to that storage, to represent the fact that they hold that particular composite value. This is a typical implementation in functional programming languages, where composite values play a very important role. The unlimited aliasing may avoid a lot of copying, but on the other hand the computation of many similar composite values is expensive (e.g., when a long string is being edited interactively it will be copied with every change).

This discipline on the usage of memory cells establishes a closer connection between the groups of memory cells themselves and the contents, the composite value. This is because that area of memory is used for nothing but holding the value, during the entire period from allocation to garbage. That connection probably causes the confusion which it is the purpose of this section to remove.

A given area of memory may be used to represent an objecta mutable entity with object identityor it may be allocated, lled in with a value, and then kept unchanged, in order to hold the given value. In the rst case we might very well nd two areas of storage with the same bit pattern, thus representing two dierent objects which happen to be in the same state; but in the second case, two areas of storage with the same bit pattern would generally represent the same value,4and even though they would have dierent memory addresses, it should not be possible within the (high-level) language to distinguish between them, e.g., to discover whether they were allocated in the same or in two dierent areas of memory. As a consequence, such a composite value representation is not the same as an object, not even the same as an object whose state is declared immutable (const), if the language supports such a concept.

In summary, because composite values cannot be handled atomically at the machine code level, it is impossible to obtain the correct semantics in the man-agement of a composite value without a special discipline on the usage of memory

4Value equality may of course be more complex; the values might, e.g., be graphs which include pointers representing internal edges, and they would of course not have the same bit pattern, but rather describe the same graph

cells containing such a composite value. A safe and simple discipline is to ensure that memory containing a composite value is never changed (until it is garbage collected, at least). This discipline makes the representation of composite values resemble the representation of composite, mutable entities (objects), and hence it is tempting to use the term immutable for the former, to distinguish it from the latter. This is purely an implementation concern; semantically, a mutable value is an absurdity, so values are not immutable, they are just values.