Skip to main content

CS4215 Project Report

Title: Explicit-control evaluator (ECE) for Go

Team Members: Rama Aryasuta Pangestu (A0236444E), Yeung Man Tsung (A0255829N)

Repository URL: https://github.com/benson1029/go-slang

Live Demo: https://benson1029.github.io/go-slang/

Presentation Slides: CS4215_Presentation_Slides.pdf

This report can be viewed online on https://benson1029.github.io/go-slang/docs/project-report/report.

Table of Contents

Project Objectives

We implement an explicit-control evaluator (ECE) for the following subset of the Go programming language:

  • Sequential constructs: Variable declarations and assignments, expressions, control flow (if, for), functions, and complex types (strings, structs, arrays, slices).
  • Concurrent constructs: Goroutines, mutex, wait groups, channels, and select statements.

Additionally, we implement a low-level memory management system with both reference counting and mark-and-sweep garbage collection. All data structures, including control stack, heap, and environments, are stored in a single array buffer. We also implement a visualizer for the ECE machine.

All objectives are met in the final implementation. However, the following language features are not in scope:

  • Sequential constructs: switch statements, defer statements, and panic and recover functions. Anonymous structs, indexing and slicing of strings, pointers, maps and tuples are also not implemented.
  • Concurrent constructs: Closing of channels is not implemented.

The exact subset of the Go programming language that we support, along with some examples, can be found in the language specification page.

Language Processing Steps

Externally, this ECE machine is an interpreter for Go written in JavaScript, which is then run in a JavaScript interpreter (the browser). This is shown using the following T-diagram:

T-diagram

Internally, the Go program is first parsed into an abstract syntax tree (AST) using a parser, generated with peg.js using our own syntax definition. The AST is then preprocessed by a preprocessor, which annotates the global scope declarations with their necessary captures and then sorts the global scope declarations such that their dependencies are satisified. The preprocessed AST is then passed to a loader, which loads the entire program into the heap and initializing the Control stack in the ECE machine. The ECE machine then executes the instructions in the Control stack.

ECE Specification

Instruction Set

The following tables show the instruction set of the ECE machine.

Helper Instructions

InstructionDescriptionParametersStash
PUSH_IPushes a value into the stashvalue (value)None
POP_IPops a value from the stashNonevalue (value)

Expressions

InstructionDescriptionParametersStash
BINARYPerforms a binary operation (includes logical operations)operator (string), left_operand (rvalue expression), right_operand (rvalue expression)None
BINARY_IPerforms a binary operationoperator (string)None
LOGICAL_IMM_IPerforms a logical operation with only the left operand in the stashoperator (string), right_operand (rvalue expression)left_operand (value)
LOGICAL_IPerforms a logical operation with only the right operand in the stashoperator (string)right_operand (value)
UNARYPerforms a unary operationoperator (string), operand (rvalue expression)None
UNARY_IPerforms a unary operationoperator (string)None
LITERALPushes a literal value into the stashtype (string), value (value)None

Variables

InstructionDescriptionParametersStash
ASSIGNAssigns a value to a variablename (lvalue expression), value (rvalue expression)None
ASSIGN_IAssigns a value to a variableNonename (address), value (value)
VARDeclares a variable with an initial valuename (string), value (rvalue expression)None
VAR_IDeclares a variable with an initial valuename (string)value (value)
NAMEAccesses the value of a variablename (string)None
NAME_ADDRESSAccesses the address of a variablename (string)None

Control Flow

InstructionDescriptionParametersStash
SEQUENCEExecutes a sequence of instructionsbody (linked list of instructions)None
BLOCKStarts a scope, i.e. pushes a new environment framebody (sequence)None
EXIT_SCOPE_IExits the current scope, i.e. pops the environment frameNoneNone
IFConditional statementcondition (rvalue expression), then_body (block), else_body (block)None
IF_IConditional statementthen_body (block), else_body (block)condition (boolean)
FORIterative statementinit (statement), condition (rvalue expression), update (statement), body (block)None
FOR_IIterative statementcondition (rvalue expression), update (statement), body (block), loopVar (name or null)condition (boolean)
MARKER_IMarker to signify the end of the loop bodyNoneNone
BREAKBreaks out of the loopNoneNone
CONTINUEContinues to the next iteration of the loopNoneNone

Functions

InstructionDescriptionParametersStash
FUNCTIONDeclares a functionname (string), param_names (array of strings), capture_names (array of strings), body (block)None
CALLCalls a functionfunction (rvalue expression), args (array of rvalue expressions)None
CALL_ICalls a functionnum_args (int)function (value), args (array of values)
CALL_STMTCalls a function and discards the return valuebody (a CALL instruction)None
RETURNReturns a value from a functionvalue (rvalue expression)None
RETURN_IReturns a value from a functionNonevalue (value)
RESTORE_ENV_IRestores the environment frame before the function callframe (environment frame)None

Constructors

InstructionDescriptionParametersStash
DEFAULT_MAKEPushes the default value of a type into the stashtype (string), args (array of any)None
MAKEConstructs a value of a type according to the make function calltype (string), args (array of rvalue expressions)None
MAKE_IConstructs a value of a type according to the make function calltype (string), num_args (int)args (array of values)
CONSTRUCTORConstructs a value of a type according to the initializer list {...}type (string), args (array of rvalue expressions)None
CONSTRUCTOR_IConstructs a value of a type according to the initializer list {...}type (string), num_args (int)args (array of values)

Arrays

InstructionDescriptionParametersStash
INDEXAccesses the value of an array elementarray (rvalue expression), index (rvalue expression)None
INDEX_IAccesses the value of an array elementNonearray (value), index (value)
INDEX_ADDRESSAccesses the address of an array elementarray (rvalue expression), index (rvalue expression)None
INDEX_ADDRESS_IAccesses the address of an array elementNonearray (address), index (value)

Structs

InstructionDescriptionParametersStash
STRUCTDeclares a structname (string), fields (array of names and types)None
METHODDeclares a method for a structstruct_name (string, name of the struct), name (string), self_name (string), param_names (array of strings), capture_names (array of strings), body (block)None
MEMBERAccesses the value of a struct memberobject (rvalue expression), member (string)None
MEMBER_IAccesses the value of a struct memberNoneobject (value), member (string)
MEMBER_ADDRESSAccesses the address of a struct memberobject (rvalue expression), member (string)None
MEMBER_ADDRESS_IAccesses the address of a struct memberNoneobject (address), member (string)
METHOD_MEMBERAccesses the value of a method member, and pushes the current object into the stash (for the method call to use)object (rvalue expression), member (string) , struct (string, name of the struct, annotated during type checking)None

Slices

InstructionDescriptionParametersStash
SLICESlices an arrayarray (rvalue expression), start (rvalue expression), end (rvalue expression)None
SLICE_ISlices an arrayNonearray (value), start (value), end (value)
SLICE_ADDRESSSlices an arrayarray (rvalue expression), start (rvalue expression), end (rvalue expression)None
SLICE_ADDRESS_ISlices an arrayNonearray (address), start (value), end (value)

Goroutines and Channels

InstructionDescriptionParametersStash
GO_CALL_STMTCalls a function in a new goroutinebody (a CALL instruction)None
GO_CALL_ICalls a function in a new goroutinenum_args (int)function (value), args (array of values)
CHAN_RECEIVEReceives a value from a channelchannel (rvalue expression)None
CHAN_RECEIVE_IReceives a value from a channelNonechannel (address)
CHAN_RECEIVE_STMTReceives a value from a channel and discards itbody (a CHAN_RECEIVE instruction)None
CHAN_SENDSends a value to a channelchannel (rvalue expression), value (rvalue expression)None
CHAN_SEND_ISends a value to a channelNonechannel (address), value (value)
SELECTSelects a case from a list of casescases (array of cases)None
SELECT_ISelects a case from a list of cases, with the expressions already evaluatedselect (a SELECT instruction)The evaluated expressions for all cases
CASE_SENDA case for sending to a channelchannel (rvalue expression), value (rvalue expression), body (sequence)None
CASE_RECEIVEA case for receiving from a channelchannel (rvalue expression), assign (null or lvalue expression), body (sequence)None
CASE_DEFAULTA default casebody (sequence)None

State Representation

info

\cdot \mathbin\Vert \cdot denotes the concatenation of two elements. For example, xyx \mathbin\Vert y is the concatenation of xx and yy. In particular, x=xx \mathbin\Vert \varnothing = x, y=y\varnothing \mathbin\Vert y = y, and =\varnothing \mathbin\Vert \varnothing = \varnothing.

The ECE machine has the scheduler T=Ti1Ti2\mathcal{T} = T_{i_1} \mathbin\Vert T_{i_2} \mathbin\Vert \ldots, where Ti1,Ti2,T_{i_1}, T_{i_2}, \ldots are the (unblocked) threads in the scheduler. Each thread TiT_i is a tuple (C,S,E)(C, S, E), where CC is the control stack (containing instructions), SS is the stash (containing runtime values for instructions), and EE is the environment.

The control stack and the stash are represented as the concatenation of its elements x1x2x3xkx_1 \mathbin\Vert x_2 \mathbin\Vert x_3 \mathbin\Vert \ldots \mathbin\Vert x_k. The environment is a tuple E=(ΔN,ΔS)E = (\Delta_N, \Delta_S), where ΔN\Delta_N and ΔS\Delta_S are the name and struct frames, respectively.

  • ΔN\Delta_N is a concatenation of environment frames Δ1Δ2Δn\Delta_1 \mathbin\Vert \Delta_2 \mathbin\Vert \ldots \mathbin\Vert \Delta_n, where each environment frame Δi\Delta_i is a hash table that maps variable names to its address in the heap.
  • ΔS\Delta_S is a hash table that maps struct names and methods to their addresses in the heap. Currently, we only support struct declarations on the global scope, so there is a single struct frame ΔS\Delta_S for the entire program.

A variable xx is a pointer to a value vv in the heap (the value for the variable). Therefore, a variable will have a fixed address aa in the heap. When a variable is inserted into an environment frame, the environment frame hash table keeps a mapping from the variable name (represented by a string object in the heap) to its fixed address aa in the heap.

We define the following operations on the environment:

  • Δ\varnothing \mathbin\Vert \Delta is the environment frame Δ\Delta with a new empty child frame.
  • Δ[xv]\Delta[x \gets v] is the environment frame Δ\Delta with the variable xx bound to the value vv.
  • Δ(x)\Delta(x) is the value bound to the variable xx in the environment frame Δ\Delta.
  • Δ(x)a\Delta(x)_a is the address of the variable xx in the environment frame Δ\Delta.

If we update the value of an object in address aa to a new value vv, this change is reflected in all threads that have a reference to the address aa. As this is a global change, we denote this operation as H[av]\mathcal{H}[a \gets v], where H\mathcal{H} is the heap. Similarly, we can denote H(a)\mathcal{H}(a) as the value at address aa in the heap.

However, it's not just the values of environment variables which are stored in the heap. The scheduler T\mathcal{T}, each of its threads Ti=(Ci,Si,Ei)T_i = (C_i, S_i, E_i), the control stack CiC_i, the stash SiS_i, and the environment Ei=(ΔNi,ΔSi)E_i = (\Delta_{N_i}, \Delta_{S_i}) are all stored in the heap. We denote the address Ta\mathcal{T}_a of the scheduler in the heap such that H(Ta)=T\mathcal{H}(\mathcal{T}_a) = \mathcal{T}.

Therefore, since all objects determining the state of the ECE machine are stored in the heap, we have the following state representation:

info

The state of the ECE machine is the heap H\mathcal{H}.

For clarity, however, we will slightly abuse notation and denote the state of the ECE machine as (T,H)(\mathcal{T}, \mathcal{H}), where T=H(Ta)\mathcal{T} = \mathcal{H}(\mathcal{T}_a). It is important to remember that this is a shorthand notation to easily refer to the scheduler T\mathcal{T}, and the actual state of the ECE machine only depends on the heap H\mathcal{H}.

This shorthand notation allows us to treat T\mathcal{T} as if it's a separate entity from the heap H\mathcal{H}, but note that any changes to T\mathcal{T} are also reflected in H\mathcal{H}. In other words, we can view T\mathcal{T} in (T,H)(\mathcal{T}, \mathcal{H}) as a (mutable) reference to the scheduler object H(Ta)\mathcal{H}(\mathcal{T}_a) in the heap H\mathcal{H}. Implicitly, we treat the children of the scheduler T\mathcal{T} (i.e. the threads TiT_i and their control stacks CiC_i, stashes SiS_i, and environments Ei=(ΔNi,ΔSi)E_i = (\Delta_{N_i}, \Delta_{S_i})) as mutable references to their respective objects in the heap H\mathcal{H} as well.

We define the transition function T,H\rightrightarrows_{\mathcal{T}, \mathcal{H}} that maps the current state (T,H)(\mathcal{T}, \mathcal{H}) to the next state (T,H)(\mathcal{T}', \mathcal{H}') after evaluation, i.e. (T,H)T,H(T,H)(\mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} (\mathcal{T}', \mathcal{H}').

To avoid clutter, most of the time we will only show the scheduler T\mathcal{T} in the inference rules, and only mention the heap H\mathcal{H} when we modify an existing object in the heap directly other than the scheduler and its children. As a shorthand, we define the transition function T\rightrightarrows_{\mathcal{T}} that maps the current state (T,H)(\mathcal{T}, \mathcal{H}) to the next state (T,H[TaT])(\mathcal{T}', \mathcal{H}[\mathcal{T}_a \gets \mathcal{T}']) after evaluation, i.e. TTT\mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T}'.

Inference Rules for Selected Parts

We define inference rules for selected instructions of our ECE machine.

Helper Instructions

PUSH_I and POP_I have the following inference rules:

(PUSH_I vC,S,E)TTT(C,vS,E)\begin{matrix} \hline (\texttt{PUSH\_I} \ v \mathbin\Vert C, S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (C, v \mathbin\Vert S, E) \end{matrix}

(POP_IC,vS,E)TTT(C,S,E)\begin{matrix} \hline (\texttt{POP\_I} \mathbin\Vert C, v \mathbin\Vert S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (C, S, E) \end{matrix}

Expressions

Literals, unary operators, and binary operators are evaluated using the LITERAL, UNARY, and BINARY instructions. However, for binary logical operators, we have short-circuit evaluation. The LOGICAL_IMM_I instruction evaluates the left operand first, and if the result is sufficient to determine the result of the logical operation, it will not evaluate the right operand.

This is shown in the following inference rules:

I=(BINARY operator EL ER)(operator=&&operator=||)(I C,S,E)TTT(ELLOGICAL_IMM_I operator ERC,S,E)\begin{matrix} I = (\texttt{BINARY} \ \texttt{operator} \ E_L \ E_R) \qquad (\texttt{operator} = \texttt{\&\&} \lor \texttt{operator} = \texttt{||}) \\ \hline (I \ \mathbin\Vert C, S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (E_L \mathbin\Vert \texttt{LOGICAL\_IMM\_I} \ \texttt{operator} \ E_R \mathbin\Vert C, S, E) \end{matrix}

I=(LOGICAL_IMM_I && ER)(I C,falseS,E)TTT(C,falseS,E)\begin{matrix} I = (\texttt{LOGICAL\_IMM\_I} \ \texttt{\&\&} \ E_R) \\ \hline (I \ \mathbin\Vert C, \texttt{false} \mathbin\Vert S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (C, \texttt{false} \mathbin\Vert S, E) \end{matrix}

I=(LOGICAL_IMM_I && ER)(I C,trueS,E)TTT(ERLOGICAL_I &&C,S,E)\begin{matrix} I = (\texttt{LOGICAL\_IMM\_I} \ \texttt{\&\&} \ E_R) \\ \hline (I \ \mathbin\Vert C, \texttt{true} \mathbin\Vert S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (E_R \mathbin\Vert \texttt{LOGICAL\_I} \ \texttt{\&\&} \mathbin\Vert C, S, E) \end{matrix}

I=(LOGICAL_IMM_I || ER)(I C,trueS,E)TTT(C,trueS,E)\begin{matrix} I = (\texttt{LOGICAL\_IMM\_I} \ \texttt{||} \ E_R) \\ \hline (I \ \mathbin\Vert C, \texttt{true} \mathbin\Vert S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (C, \texttt{true} \mathbin\Vert S, E) \end{matrix}

I=(LOGICAL_IMM_I || ER)(I C,falseS,E)TTT(ERLOGICAL_I ||C,S,E)\begin{matrix} I = (\texttt{LOGICAL\_IMM\_I} \ \texttt{||} \ E_R) \\ \hline (I \ \mathbin\Vert C, \texttt{false} \mathbin\Vert S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (E_R \mathbin\Vert \texttt{LOGICAL\_I} \ \texttt{||} \mathbin\Vert C, S, E) \end{matrix}

I=(LOGICAL_I operator)(I C,falseS,E)TTT(C,falseS,E)\begin{matrix} I = (\texttt{LOGICAL\_I} \ \texttt{operator}) \\ \hline (I \ \mathbin\Vert C, \texttt{false} \mathbin\Vert S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (C, \texttt{false} \mathbin\Vert S, E) \end{matrix}

I=(LOGICAL_I operator)(I C,trueS,E)TTT(C,trueS,E)\begin{matrix} I = (\texttt{LOGICAL\_I} \ \texttt{operator}) \\ \hline (I \ \mathbin\Vert C, \texttt{true} \mathbin\Vert S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (C, \texttt{true} \mathbin\Vert S, E) \end{matrix}

Variables

We can declare a variable with an initial value using the VAR instruction. The VAR instruction will push the initial value into the stash, followed by the VAR_I instruction to create a new variable in the environment frame.

This is shown in the following inference rules:

(VAR x EC,S,(ΔN,ΔS))TTT(EVAR_I xC,S,(ΔN,ΔS))\begin{matrix} \hline (\texttt{VAR} \ x \ E \mathbin\Vert C, S, (\Delta_N, \Delta_S)) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (E \mathbin\Vert \texttt{VAR\_I} \ x \mathbin\Vert C, S, (\Delta_N, \Delta_S)) \end{matrix}

(VAR_I xC,vS,(ΔN,ΔS))TTT(C,S,(ΔN[xv],ΔS))\begin{matrix} \hline (\texttt{VAR\_I} \ x \mathbin\Vert C, v \mathbin\Vert S, (\Delta_N, \Delta_S)) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (C, S, (\Delta_N[x \gets v], \Delta_S)) \end{matrix}

info

The variable name xx is an address to a string object in the heap, and the expression EE is an address to a instruction object in the heap. The result of evaluating the expression EE will be pushed into the stash, from which the VAR_I instruction will pop the value and bind it to the variable xx in the environment frame.

After declaring a variable, we can get its value using the NAME instruction and its address using the NAME_ADDRESS instruction. Both instructions will push the value or address of the variable into the stash.

This is shown in the following inference rules:

(NAME xC,S,(ΔN,ΔS))TTT(C,ΔN(x)S,(ΔN,ΔS))\begin{matrix} \hline (\texttt{NAME} \ x \mathbin\Vert C, S, (\Delta_N, \Delta_S)) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (C, \Delta_N(x) \mathbin\Vert S, (\Delta_N, \Delta_S)) \end{matrix}

(NAME_ADDRESS xC,S,(ΔN,ΔS))TTT(C,ΔN(x)aS,(ΔN,ΔS))\begin{matrix} \hline (\texttt{NAME\_ADDRESS} \ x \mathbin\Vert C, S, (\Delta_N, \Delta_S)) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (C, \Delta_N(x)_a \mathbin\Vert S, (\Delta_N, \Delta_S)) \end{matrix}

With the NAME_ADDRESS instruction, we can assign a new value (from an expression EE) to an existing variable using the ASSIGN instruction. The ASSIGN instruction will push the value of the expression EE into the stash, followed by the address of the variable into the stash, and then the ASSIGN_I instruction will assign the value to the variable.

This is shown in the following inference rules:

(ASSIGN x EC,S,(ΔN,ΔS))TTT(ENAME_ADDRESS xASSIGN_IC,S,(ΔN,ΔS))\begin{matrix} \hline (\texttt{ASSIGN} \ x \ E \mathbin\Vert C, S, (\Delta_N, \Delta_S)) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (E \mathbin\Vert \texttt{NAME\_ADDRESS} \ x \mathbin\Vert \texttt{ASSIGN\_I} \mathbin\Vert C, S, (\Delta_N, \Delta_S)) \end{matrix}

((ASSIGN_IC,avS,(ΔN,ΔS))T,H)T,H(T(C,S,(ΔN,ΔS)),H[av])\begin{matrix} \hline ((\texttt{ASSIGN\_I} \mathbin\Vert C, a \mathbin\Vert v \mathbin\Vert S, (\Delta_N, \Delta_S)) \mathbin\Vert \mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} (\mathcal{T} \mathbin\Vert (C, S, (\Delta_N, \Delta_S)), \mathcal{H}[a \gets v]) \end{matrix}

Sequence and Block

A sequence of instructions is executed using the SEQUENCE instruction. The SEQUENCE instruction will execute the first instruction II in the sequence, followed by the SEQUENCE instruction with the rest of the instructions II'. In the heap, the SEQUENCE instruction is represented as a linked list of instructions.

This is shown in the following inference rules:

Inil(SEQUENCE I IC,S,E)TTT(ISEQUENCE IC,S,E)\begin{matrix} I \neq \texttt{nil} \\ \hline (\texttt{SEQUENCE} \ I \ I' \mathbin\Vert C, S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (I \mathbin\Vert \texttt{SEQUENCE} \ I' \mathbin\Vert C, S, E) \end{matrix}

I=nil(SEQUENCE IC,S,E)TTT(C,S,E)\begin{matrix} I = \texttt{nil} \\ \hline (\texttt{SEQUENCE} \ I \mathbin\Vert C, S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (C, S, E) \end{matrix}

A block of instructions is executed using the BLOCK instruction. The BLOCK instruction will push a new environment frame into the environment. After the block is executed, the EXIT_SCOPE_I instruction will pop the environment frame.

This is shown in the following inference rule:

(BLOCK bodyC,S,(ΔN,ΔS))TTT(bodyEXIT_SCOPE_IC,S,(ΔN,ΔS))\begin{matrix} \hline (\texttt{BLOCK} \ \texttt{body} \mathbin\Vert C, S, (\Delta_N, \Delta_S)) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (\texttt{body} \mathbin\Vert \texttt{EXIT\_SCOPE\_I} \mathbin\Vert C, S, (\varnothing \mathbin\Vert \Delta_N, \Delta_S)) \end{matrix}

An EXIT_SCOPE_I instruction will pop the environment frame. This is shown in the following inference rule:

(EXIT_SCOPE_IC,S,(FΔN,ΔS))TTT(C,S,(ΔN,ΔS))\begin{matrix} \hline (\texttt{EXIT\_SCOPE\_I} \mathbin\Vert C, S, (F \mathbin\Vert \Delta_N, \Delta_S)) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (C, S, (\Delta_N, \Delta_S)) \end{matrix}

For Loop

A for loop statement is defined in BNF as follows:

ForStmt = "for" WhiteSpace InitStmt ";" Expression ";" UpdStmt Block
InitStmt = Assignment | VarDecl
UpdStmt = Assignment | PostfixStmt

When the FOR instruction is executed, the init part is executed first, followed by the condition part. Next, the FOR_I instruction will checks for the result of evaluting the condition and determine the next steps. As FOR_I needs to know the loop variable, it is also determined and passed to the FOR_I instruction. The FOR instruction also needs to create a new environment frame for the loop variable (if any).

This is shown in the following inference rules:

init=VAR x E(FOR init condition update body C,S,(ΔN,ΔS))TTT(initconditionFOR_I condition update body xEXIT_SCOPE_IC,S,(ΔN,ΔS))\begin{matrix} \texttt{init} = \texttt{VAR} \ x \ E \\ \hline (\texttt{FOR} \ \texttt{init} \ \texttt{condition} \ \texttt{update} \ \texttt{body} \ \mathbin\Vert C, S, (\Delta_N, \Delta_S)) \mathbin\Vert \mathcal{T} \\ \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (\texttt{init} \mathbin\Vert \texttt{condition} \mathbin\Vert \texttt{FOR\_I} \ \texttt{condition} \ \texttt{update} \ \texttt{body} \ x \mathbin\Vert \texttt{EXIT\_SCOPE\_I} \mathbin\Vert C, S, (\varnothing \mathbin\Vert \Delta_N, \Delta_S)) \end{matrix}

initVAR x E(FOR init condition update body C,S,(ΔN,ΔS))TTT(initconditionFOR_I condition update body nilEXIT_SCOPE_IC,S,(ΔN,ΔS))\begin{matrix} \texttt{init} \neq \texttt{VAR} \ x \ E \\ \hline (\texttt{FOR} \ \texttt{init} \ \texttt{condition} \ \texttt{update} \ \texttt{body} \ \mathbin\Vert C, S, (\Delta_N, \Delta_S)) \mathbin\Vert \mathcal{T} \\ \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (\texttt{init} \mathbin\Vert \texttt{condition} \mathbin\Vert \texttt{FOR\_I} \ \texttt{condition} \ \texttt{update} \ \texttt{body} \ \texttt{nil} \mathbin\Vert \texttt{EXIT\_SCOPE\_I} \mathbin\Vert C, S, (\varnothing \mathbin\Vert \Delta_N, \Delta_S)) \end{matrix}

When the FOR_I instruction is executed, it takes the topmost value in the stash as the result of the condition. If the condition is true, the body is executed, followed by the update part, the condition part, and the FOR_I instruction again. If the condition is false, the loop is exited.

If there is a loop variable, a new environment frame is created for an iteration-scope copy of the loop variable. This copy should be copied back to the loop-scope variable at the end of the iteration.

Additionally, there is a MARKER_I instruction after the body to handle CONTINUE instructions. We cannot use FOR_I as the marker since the copying of the loop variable must be done after a CONTINUE.

This is shown in the following inference rules:

I=FOR_I condition update body loopVarloopVar=nil(I C,trueS,(ΔN,ΔS))TTT(bodyMARKER_IupdateconditionIC,S,(ΔN,ΔS))\begin{matrix} I = \texttt{FOR\_I} \ \texttt{condition} \ \texttt{update} \ \texttt{body} \ \texttt{loopVar} \qquad \texttt{loopVar} = \texttt{nil} \\ \hline (I \ \mathbin\Vert C, \texttt{true} \mathbin\Vert S, (\Delta_N, \Delta_S)) \mathbin\Vert \mathcal{T} \\ \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (\texttt{body} \mathbin\Vert \texttt{MARKER\_I} \mathbin\Vert \texttt{update} \mathbin\Vert \texttt{condition} \mathbin\Vert I \mathbin\Vert C, S, (\Delta_N, \Delta_S)) \end{matrix}

I=FOR_I condition update body loopVarloopVar=xnil(I C,trueS,(ΔN,ΔS))TTT(bodyMARKER_INAME xEXIT_SCOPE_INAME_ADDRESS xASSIGN_IupdateconditionIC,S,([xΔN(x)]ΔN,ΔS))\begin{matrix} I = \texttt{FOR\_I} \ \texttt{condition} \ \texttt{update} \ \texttt{body} \ \texttt{loopVar} \qquad \texttt{loopVar} = x \neq \texttt{nil} \\ \hline (I \ \mathbin\Vert C, \texttt{true} \mathbin\Vert S, (\Delta_N, \Delta_S)) \mathbin\Vert \mathcal{T} \\ \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert ( \texttt{body} \mathbin\Vert \texttt{MARKER\_I} \mathbin\Vert \texttt{NAME} \ x \mathbin\Vert \texttt{EXIT\_SCOPE\_I} \mathbin\Vert \texttt{NAME\_ADDRESS} \ x \mathbin\Vert \texttt{ASSIGN\_I} \mathbin\Vert \\ \texttt{update} \mathbin\Vert \texttt{condition} \mathbin\Vert I \mathbin\Vert C, S, (\varnothing[x \gets \Delta_N(x)] \mathbin\Vert \Delta_N, \Delta_S)) \end{matrix}

I=FOR_I condition update body loopVar(I C,falseS,(ΔN,ΔS))TTT(C,S,(ΔN,ΔS))\begin{matrix} I = \texttt{FOR\_I} \ \texttt{condition} \ \texttt{update} \ \texttt{body} \ \texttt{loopVar} \\ \hline (I \ \mathbin\Vert C, \texttt{false} \mathbin\Vert S, (\Delta_N, \Delta_S)) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (C, S, (\Delta_N, \Delta_S)) \end{matrix}

info

[xΔN(x)]ΔN\varnothing[x \gets \Delta_N(x)] \mathbin\Vert \Delta_N is the environment frame ΔN\Delta_N with a new child frame containing a copy of the variable xx. After the loop body is executed, the NAME instruction retrieves the value of the loop variable from the iteration-scope copy, the NAME_ADDRESS instruction (after EXIT_SCOPE_I) retrieves the address of the loop variable in the loop-scope frame, and the ASSIGN_I instruction assigns the value of the loop variable to the loop-scope frame.

The evaluation of a MARKER_I instruction does not produce side effects:

(MARKER_I C,S,E)TTT(C,S,E)\begin{matrix} \hline (\texttt{MARKER\_I} \ \mathbin\Vert C, S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (C, S, E) \end{matrix}

A BREAK instruction will create a BREAK_I instruction in the control stack, which pops from the control stack until it pops out the FOR_I instruction. On the other hand, a CONTINUE instruction will create a CONTINUE_I instruction in the control stack, which pops from the control stack until it pops out the MARKER_I instruction. All instructions except EXIT_SCOPE_I will be ignored during the popping process.

This is shown in the following inference rules:

(BREAKC,S,E)TTT(BREAK_IC,S,E)\begin{matrix} \hline (\texttt{BREAK} \mathbin\Vert C, S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (\texttt{BREAK\_I} \mathbin\Vert C, S, E) \end{matrix}

I=FOR_I(BREAK_IIC,S,E)TTT(C,S,E)\begin{matrix} I = \texttt{FOR\_I} \\ \hline (\texttt{BREAK\_I} \mathbin\Vert I \mathbin\Vert C, S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (C, S, E) \end{matrix}

I=EXIT_SCOPE_I(BREAK_IIC,S,(FΔN,ΔS))TTT(BREAK_IC,S,(ΔN,ΔS))\begin{matrix} I = \texttt{EXIT\_SCOPE\_I} \\ \hline (\texttt{BREAK\_I} \mathbin\Vert I \mathbin\Vert C, S, (F \mathbin\Vert \Delta_N, \Delta_S)) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (\texttt{BREAK\_I} \mathbin\Vert C, S, (\Delta_N, \Delta_S)) \end{matrix}

IFOR_IIEXIT_SCOPE_I(BREAK_IIC,S,E)TTT(BREAK_IC,S,E)\begin{matrix} I \neq \texttt{FOR\_I} \qquad I \neq \texttt{EXIT\_SCOPE\_I} \\ \hline (\texttt{BREAK\_I} \mathbin\Vert I \mathbin\Vert C, S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (\texttt{BREAK\_I} \mathbin\Vert C, S, E) \end{matrix}

(CONTINUEC,S,E)TTT(CONTINUE_IC,S,E)\begin{matrix} \hline (\texttt{CONTINUE} \mathbin\Vert C, S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (\texttt{CONTINUE\_I} \mathbin\Vert C, S, E) \end{matrix}

I=MARKER_I(CONTINUE_IIC,S,E)TTT(C,S,E)\begin{matrix} I = \texttt{MARKER\_I} \\ \hline (\texttt{CONTINUE\_I} \mathbin\Vert I \mathbin\Vert C, S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (C, S, E) \end{matrix}

I=EXIT_SCOPE_I(CONTINUE_IIC,S,(FΔN,ΔS))TTT(CONTINUE_IC,S,(ΔN,ΔS))\begin{matrix} I = \texttt{EXIT\_SCOPE\_I} \\ \hline (\texttt{CONTINUE\_I} \mathbin\Vert I \mathbin\Vert C, S, (F \mathbin\Vert \Delta_N, \Delta_S)) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (\texttt{CONTINUE\_I} \mathbin\Vert C, S, (\Delta_N, \Delta_S)) \end{matrix}

IMARKER_IIEXIT_SCOPE_I(CONTINUE_IIC,S,E)TTT(CONTINUE_IC,S,E)\begin{matrix} I \neq \texttt{MARKER\_I} \qquad I \neq \texttt{EXIT\_SCOPE\_I} \\ \hline (\texttt{CONTINUE\_I} \mathbin\Vert I \mathbin\Vert C, S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (\texttt{CONTINUE\_I} \mathbin\Vert C, S, E) \end{matrix}

Go Function Call

A go function call is defined in BNF as follows:

GoStmt = "go" FunctionCall

When the GO_CALL_STMT instruction is executed, it will evaluate the function and function arguments, followed by GO_CALL_I. The GO_CALL_I instruction will create a new thread with a single instruction CALL_I in the control stack, while the stash and environment are copied from the current goroutine.

This is shown in the following inference rules:

(GO_CALL_STMT CALL function args1  argsnC,S,E)TTT(functionargsnargs1GO_CALL_I nC,S,E)\begin{matrix} \hline (\texttt{GO\_CALL\_STMT} \ \texttt{CALL} \ \texttt{function} \ \texttt{args}_1 \ \dots \ \texttt{args}_n \mathbin\Vert C, S, E) \mathbin\Vert \mathcal{T} \\ \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert ( \texttt{function} \mathbin\Vert \texttt{args}_n \mathbin\Vert \dots \mathbin\Vert \texttt{args}_1 \mathbin\Vert \texttt{GO\_CALL\_I} \ n \mathbin\Vert C, S, E) \end{matrix}

(GO_CALL_I nC,args1argsnfunctionS,E)TTT(C,S,E)(CALL_I n,args1argsnfunctionS,E)\begin{matrix} \hline (\texttt{GO\_CALL\_I} \ n \mathbin\Vert C, \texttt{args}_1 \mathbin\Vert \dots \mathbin\Vert \texttt{args}_n \mathbin\Vert \texttt{function} \mathbin\Vert S, E) \mathbin\Vert \mathcal{T} \\ \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (C, S, E) \mathbin\Vert (\texttt{CALL\_I} \ n, \texttt{args}_1 \mathbin\Vert \dots \mathbin\Vert \texttt{args}_n \mathbin\Vert \texttt{function} \mathbin\Vert S, E) \end{matrix}

info

A goroutine may contain unused values in the stash and environment. However, this does not affect the correctness, and when the goroutine finishes execution, the unused values are freed due to reference counting. Note that the newly spawned thread only has the CALL_I instruction in the control stack, so it terminates after the function call is completed.

Mutex Lock and Unlock

Mutex is implemented as a struct, which contains a special heap object as its member. We define a mutex as a tuple (s,Q)(s, Q) where ss is the state of the mutex (true if locked and false if unlocked) and QQ is its wait queue (a list of threads, similar to the scheduler). We encapsulate blocked threads in QQ by a special heap object called Waker(C,S,E)\texttt{Waker}_{(C, S, E)}.

info

We design Waker(C,S,E)\texttt{Waker}_{(C, S, E)} to have a special behaviour:

  • The first time that it wakes up a thread (C,S,E)(C, S, E), it replaces itself with a Waker\texttt{Waker}_{\varnothing}.
  • The second time onwards, if it is requested to wake up a thread again (since it holds \varnothing), it does nothing.

We design Waker\texttt{Waker} this way to be able to implement select, as we need to be able to wait until one of the channels is ready to send or receive; the channels involved in a select will have a reference to the same Waker\texttt{Waker} object, so it only wakes up the thread the first time a channel is ready (and does nothing the second time onwards). In the inference rules below for mutex, we do not enforce Waker\texttt{Waker}'s behaviour as we only require this behaviour in the channel implementation.

When Lock() is called on a mutex, the mutex is locked and the thread is pushed into the scheduler if it is unlocked. Otherwise, the current thread is added to the wait queue of the mutex.

M=H(Ma)=(false,Q)((CALL_I 0C,sync.MutexLock (builtin)MS,E),H)TT,H(T(C,S,E),H[Ma(true,Q)])\begin{matrix} M = \mathcal{H}(M_a) = (\texttt{false}, Q) \\ \hline ((\texttt{CALL\_I} \ 0 \mathbin\Vert C, \texttt{sync.MutexLock (builtin)} \mathbin\Vert M \mathbin\Vert S, E), \mathcal{H}) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}, \mathcal{H}} \left(\mathcal{T} \mathbin\Vert (C, S, E), \mathcal{H}\left[M_a \gets (\texttt{true}, Q)\right] \right) \end{matrix}

M=H(Ma)=(true,Q)((CALL_I 0C,sync.MutexLock (builtin)MS,E)T,H)T,H(T,H[Ma(true,QWaker(C,S,E))])\begin{matrix} M = \mathcal{H}(M_a) = (\texttt{true}, Q) \\ \hline ((\texttt{CALL\_I} \ 0 \mathbin\Vert C, \texttt{sync.MutexLock (builtin)} \mathbin\Vert M \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} \left(\mathcal{T}, \mathcal{H}\left[M_a \gets (\texttt{true}, Q \mathbin\Vert \texttt{Waker}_{(C, S, E)})\right]\right) \end{matrix}

When Unlock() is called on a mutex, the mutex is unlocked and the first thread in the wait queue is popped and pushed into the scheduler (if any). It throws an error if the mutex is already unlocked.

M=H(Ma)=(true,WakerTQ)((CALL_I 0C,sync.MutexUnlock (builtin)MS,E)T,H)T,H(T(C,S,E)T,H[Ma(true,Q)])\begin{matrix} M = \mathcal{H}(M_a) = (\texttt{true}, \texttt{Waker}_T \mathbin\Vert Q) \\ \hline ((\texttt{CALL\_I} \ 0 \mathbin\Vert C, \texttt{sync.MutexUnlock (builtin)} \mathbin\Vert M \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} \left(\mathcal{T} \mathbin\Vert (C, S, E) \mathbin\Vert T, \mathcal{H}\left[M_a \gets (\texttt{true}, Q)\right]\right) \end{matrix}

M=H(Ma)=(true,)((CALL_I 0C,sync.MutexUnlock (builtin)MS,E)T,H)T,H(T(C,S,E),H[Ma(false,)])\begin{matrix} M = \mathcal{H}(M_a) = (\texttt{true}, \varnothing) \\ \hline ((\texttt{CALL\_I} \ 0 \mathbin\Vert C, \texttt{sync.MutexUnlock (builtin)} \mathbin\Vert M \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} \left(\mathcal{T} \mathbin\Vert (C, S, E), \mathcal{H}\left[M_a \gets (\texttt{false}, \varnothing)\right]\right) \end{matrix}

M=H(Ma)=(false,Q)((CALL_I 0C,sync.MutexUnlock (builtin)MS,E)T,H)T,Hε\begin{matrix} M = \mathcal{H}(M_a) = (\texttt{false}, Q) \\ \hline ((\texttt{CALL\_I} \ 0 \mathbin\Vert C, \texttt{sync.MutexUnlock (builtin)} \mathbin\Vert M \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} \varepsilon \end{matrix}

where ε\varepsilon is the error state.

Channel Send and Receive

The implementation of channels uses the same idea as above, except that there is one waiting queue for sending and one waiting queue for receiving. Also, they use specialized instructions CHAN_SEND_I and CHAN_RECEIVE_I rather than using built-in functions.

We define a channel as a tuple (B,QS,QR)(B, Q_S, Q_R) where BB is the buffer of the channel, QSQ_S is the send waiting queue, and QRQ_R is the receive waiting queue. We encapsulate blocked threads in QSQ_S and QRQ_R by special heap objects called WaitingInstanceW,V,b\texttt{WaitingInstance}_{W, V, b} where:

  • WW is a Waker(C,S,E)\texttt{Waker}_{(C, S, E)}
  • VV is the value to be sent (on receive, V=nilV = \texttt{nil})
  • bb is an instruction to be pushed into the control stack after the thread is woken up. This is mainly used for select implementation, although we need to define it here.

We use a shorthand WaitingInstance(C,S,E),V\texttt{WaitingInstance}_{(C, S, E), V} to denote WaitingInstanceWaker(C,S,E),V,\texttt{WaitingInstance}_{\texttt{Waker}_{(C, S, E)}, V, \varnothing}.

For a channel X=(B,QS,QR)X = (B, Q_S, Q_R), we maintain the following invariants at all times:

  • If QSQ_S is non-empty, then BB is full (len(B)=cap(B)\texttt{len}(B) = \texttt{cap}(B)).
  • If QRQ_R is non-empty, then BB is empty (len(B)=0\texttt{len}(B) = 0).

Furthermore, we enforce the special behaviour of Waker\texttt{Waker} as described in the mutex section.

When Send() is called on a channel, the value is sent to a thread in QRQ_R if any, or pushed into the buffer if there is space. Otherwise, the current thread is added to the send waiting queue of the channel. This is shown in the following inference rules:

(CHAN_SEND channel valueC,S,E)TTT(valuechannelCHAN_SEND_IC,S,E)\begin{matrix} \hline (\texttt{CHAN\_SEND} \ \texttt{channel} \ \texttt{value} \mathbin\Vert C, S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (\texttt{value} \mathbin\Vert \texttt{channel} \mathbin\Vert \texttt{CHAN\_SEND\_I} \mathbin\Vert C, S, E) \end{matrix}

X=H(Xa)=(B,QS,WaitingInstanceW,,bQR)len(B)=0W=H(Wa)=Waker((CHAN_SEND_IC,XVS,E)T,H)T,H((CHAN_SEND_IC,XVS,E)T,H[Xa(B,QS,QR)])\begin{matrix} X = \mathcal{H}(X_a) = (B, Q_S, \texttt{WaitingInstance}_{W, \varnothing, b} \mathbin\Vert Q_R) \qquad \texttt{len}(B) = 0 \qquad W = \mathcal{H}(W_a) = \texttt{Waker}_{\varnothing} \\ \hline ((\texttt{CHAN\_SEND\_I} \mathbin\Vert C, X \mathbin\Vert V \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} \\ ((\texttt{CHAN\_SEND\_I} \mathbin\Vert C, X \mathbin\Vert V \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}[X_a \gets (B, Q_S, Q_R)]) \end{matrix}

X=H(Xa)=(B,QS,WaitingInstanceW,,bQR)len(B)=0W=H(Wa)=Waker(CW,SW,EW)((CHAN_SEND_IC,XVS,E)T,H)T,H(T(C,S,E)(bCW,VSW,EW),H[Xa(B,QS,QR)][WaWaker])\begin{matrix} X = \mathcal{H}(X_a) = (B, Q_S, \texttt{WaitingInstance}_{W, \varnothing, b} \mathbin\Vert Q_R) \qquad \texttt{len}(B) = 0 \qquad W = \mathcal{H}(W_a) = \texttt{Waker}_{(C_W, S_W, E_W)} \\ \hline ((\texttt{CHAN\_SEND\_I} \mathbin\Vert C, X \mathbin\Vert V \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} \\ \left(\mathcal{T} \mathbin\Vert (C, S, E) \mathbin\Vert (b \mathbin\Vert C_W, V \mathbin\Vert S_W, E_W), \mathcal{H}\left[X_a \gets (B, Q_S, Q_R)\right][W_a \gets \texttt{Waker}_{\varnothing}]\right) \end{matrix}

X=H(Xa)=(B,QS,QR)len(QR)=0len(B)<cap(B)((CHAN_SEND_IC,XVS,E)T,H)T,H(T(C,S,E),H[Xa(BV,QS,QR)])\begin{matrix} X = \mathcal{H}(X_a) = (B, Q_S, Q_R) \qquad \texttt{len}(Q_R) = 0 \qquad \texttt{len}(B) < \texttt{cap}(B) \\ \hline ((\texttt{CHAN\_SEND\_I} \mathbin\Vert C, X \mathbin\Vert V \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} \\ \left(\mathcal{T} \mathbin\Vert (C, S, E), \mathcal{H}\left[X_a \gets (B \mathbin\Vert V, Q_S, Q_R)\right]\right) \end{matrix}

X=H(Xa)=(B,QS,QR)len(QR)=0len(B)=cap(B)((CHAN_SEND_IC,XVS,E)T,H)T,H(T,H[Xa(B,QSWaitingInstance(C,S,E),V,QR)])\begin{matrix} X = \mathcal{H}(X_a) = (B, Q_S, Q_R) \qquad \texttt{len}(Q_R) = 0 \qquad \texttt{len}(B) = \texttt{cap}(B) \\ \hline ((\texttt{CHAN\_SEND\_I} \mathbin\Vert C, X \mathbin\Vert V \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} \\ \left(\mathcal{T}, \mathcal{H}\left[X_a \gets (B, Q_S \mathbin\Vert \texttt{WaitingInstance}_{(C, S, E), V}, Q_R)\right]\right) \end{matrix}

Note that if QRQ_R is non-empty, we will keep popping the frontmost WaitingInstance\texttt{WaitingInstance} as long as it holds Waker\texttt{Waker}_{\varnothing}.

When Receive() is called on a channel, the value is popped from the buffer if there is any, and if there is a thread in QSQ_S, its value is pushed into the buffer. Otherwise, the current thread is added to the receive waiting queue of the channel. This is shown in the following inference rules:

(CHAN_RECEIVE channelC,S,E)TTT(channelCHAN_RECEIVE_IC,S,E)\begin{matrix} \hline (\texttt{CHAN\_RECEIVE} \ \texttt{channel} \mathbin\Vert C, S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (\texttt{channel} \mathbin\Vert \texttt{CHAN\_RECEIVE\_I} \mathbin\Vert C, S, E) \end{matrix}

X=H(Xa)=(B,WaitingInstanceW,V,bQS,QR)len(B)=cap(B)W=H(Wa)=Waker((CHAN_RECEIVE_IC,XS,E)T,H)T,H((CHAN_RECEIVE_IC,XS,E)T,H[Xa(B,QS,QR)])\begin{matrix} X = \mathcal{H}(X_a) = (B, \texttt{WaitingInstance}_{W, V', b} \mathbin\Vert Q_S, Q_R) \qquad \texttt{len}(B) = \texttt{cap}(B) \qquad W = \mathcal{H}(W_a) = \texttt{Waker}_{\varnothing} \\ \hline ((\texttt{CHAN\_RECEIVE\_I} \mathbin\Vert C, X \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} \\ ((\texttt{CHAN\_RECEIVE\_I} \mathbin\Vert C, X \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}\left[X_a \gets (B, Q_S, Q_R)\right]) \end{matrix}

X=H(Xa)=(VB,WaitingInstanceW,V,bQS,QR)len(B)=cap(B)>0W=H(Wa)=Waker(CW,SW,EW)((CHAN_RECEIVE_IC,XS,E)T,H)T,H(T(C,VS,E)(bCW,SW,EW),H[Xa(BV,QS,QR)][WaWaker])\begin{matrix} X = \mathcal{H}(X_a) = (V \mathbin\Vert B, \texttt{WaitingInstance}_{W, V', b} \mathbin\Vert Q_S, Q_R) \qquad \texttt{len}(B) = \texttt{cap}(B) > 0 \qquad W = \mathcal{H}(W_a) = \texttt{Waker}_{(C_W, S_W, E_W)} \\ \hline ((\texttt{CHAN\_RECEIVE\_I} \mathbin\Vert C, X \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} \\ \left(\mathcal{T} \mathbin\Vert (C, V \mathbin\Vert S, E) \mathbin\Vert (b \mathbin\Vert C_W, S_W, E_W), \mathcal{H}\left[X_a \gets (B \mathbin\Vert V', Q_S, Q_R)\right]\left[W_a \gets \texttt{Waker}_{\varnothing}\right]\right) \end{matrix}

X=H(Xa)=(B,WaitingInstanceW,V,bQS,QR)len(B)=cap(B)=0W=H(Wa)=Waker(CW,SW,EW)((CHAN_RECEIVE_IC,XS,E)T,H)T,H(T(C,VS,E)(bCW,SW,EW),H[Xa(B,QS,QR)][WaWaker])\begin{matrix} X = \mathcal{H}(X_a) = (B, \texttt{WaitingInstance}_{W, V', b} \mathbin\Vert Q_S, Q_R) \qquad \texttt{len}(B) = \texttt{cap}(B) = 0 \qquad W = \mathcal{H}(W_a) = \texttt{Waker}_{(C_W, S_W, E_W)} \\ \hline ((\texttt{CHAN\_RECEIVE\_I} \mathbin\Vert C, X \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} \\ \left(\mathcal{T} \mathbin\Vert (C, V' \mathbin\Vert S, E) \mathbin\Vert (b \mathbin\Vert C_W, S_W, E_W), \mathcal{H}\left[X_a \gets (B, Q_S, Q_R)\right]\left[W_a \gets \texttt{Waker}_{\varnothing}\right]\right) \end{matrix}

X=H(Xa)=(VB,QS,QR)len(QS)=0((CHAN_RECEIVE_IC,XS,E)T,H)T,H(T(C,VS,E),H[Xa(B,QS,QR)])\begin{matrix} X = \mathcal{H}(X_a) = (V \mathbin\Vert B, Q_S, Q_R) \qquad \texttt{len}(Q_S) = 0 \\ \hline ((\texttt{CHAN\_RECEIVE\_I} \mathbin\Vert C, X \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} \\ \left(\mathcal{T} \mathbin\Vert (C, V \mathbin\Vert S, E), \mathcal{H}\left[X_a \gets (B, Q_S, Q_R)\right]\right) \end{matrix}

X=H(Xa)=(B,QS,QR)len(QS)=0len(B)=0((CHAN_RECEIVE_IC,XS,E)T,H)T,H(T,H[Xa(B,QS,QRWaitingInstance(C,S,E),)])\begin{matrix} X = \mathcal{H}(X_a) = (B, Q_S, Q_R) \qquad \texttt{len}(Q_S) = 0 \qquad \texttt{len}(B) = 0 \\ \hline ((\texttt{CHAN\_RECEIVE\_I} \mathbin\Vert C, X \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} \\ \left(\mathcal{T}, \mathcal{H}\left[X_a \gets (B, Q_S, Q_R \mathbin\Vert \texttt{WaitingInstance}_{(C, S, E), \varnothing})\right]\right) \end{matrix}

Note that if QSQ_S is non-empty, we will keep popping the frontmost WaitingInstance\texttt{WaitingInstance} as long as it holds Waker\texttt{Waker}_{\varnothing}.

Select Statement

Go's select statement is a way to wait on multiple channel operations with several optimizations. Generally, it can be defined as follows:

  • If there is at least one channel operation that can proceed without blocking, it will randomly choose one of them to execute, and continue with the case body.
  • If all channel operations would block, and a default case is present, it will execute the default case.
  • Otherwise, it will block until one of the channel operations can proceed, and then execute the corresponding case body.

Note that in a case channel send or receive, it may have consist of complex operations. For example, the channel involved in a case may be a result of a function call or even a blocking expression (e.g., the receive of a channel of channels). Similarly, on case with a channel send, the value to be sent may be blocking. Therefore, our implementation of select will evaluate all involved channels and send values (which may block) in cases before actually doing the select.

To evaluate all involved channels and send values, we use the SELECT instruction, with the following inference rule:

(SELECT cases1  casesnC,S,E)TTT(f(casesn)f(cases1)SELECT_I casesC,S,E)\begin{matrix} \hline (\texttt{SELECT} \ \texttt{cases}_1 \ \dots \ \texttt{cases}_n \mathbin\Vert C, S, E) \mathbin\Vert \mathcal{T} \rightrightarrows_{\mathcal{T}} \mathcal{T} \mathbin\Vert (f(\texttt{cases}_n) \mathbin\Vert \dots \mathbin\Vert f(\texttt{cases}_1) \mathbin\Vert \texttt{SELECT\_I} \ \texttt{cases} \mathbin\Vert C, S, E) \end{matrix}

where we define f(case)f(\texttt{case}) as:

case=case_send channel value bodyf(case)=valuechannel\begin{matrix} \texttt{case} = \texttt{case\_send} \ \texttt{channel} \ \texttt{value} \ \texttt{body} \\ \hline f(\texttt{case}) = \texttt{value} \mathbin\Vert \texttt{channel} \end{matrix}

case=case_receive channel bodyf(case)=channel\begin{matrix} \texttt{case} = \texttt{case\_receive} \ \texttt{channel} \ \texttt{body} \\ \hline f(\texttt{case}) = \texttt{channel} \end{matrix}

case=case_default bodyf(case)=\begin{matrix} \texttt{case} = \texttt{case\_default} \ \texttt{body} \\ \hline f(\texttt{case}) = \varnothing \end{matrix}

In our actual implementation, we also allow case_receive\texttt{case\_receive} to have an ASSIGN instruction to assign the received value to a variable. However, for simplicity, we omit this in the inference rules.

When SELECT_I is executed, it will randomly choose one of the cases that can proceed without blocking, and execute the body\texttt{body} of the case. If all cases would block, it will block until one of the cases can proceed.

Since we need to be able to add body\texttt{body} to the top of the control stack, we need to encode it in WaitingInstance\texttt{WaitingInstance}. We encode the body as the parameter bb in WaitingInstanceW,V,b\texttt{WaitingInstance}_{W, V, b} previously defined, where bb is the body of the case (possibly b=b = \varnothing).

To write all possible inference rules mathematically for nn cases would take an exponential amount of space; however, algorithmically, it is straightforward to implement with multiple if statements. Therefore, for simplicity, we will only show the inference rule for the following SELECT_I instruction:

  • There are two cases with a channel send and a channel receive, respectively.
  • The first case is a channel send is on a channel X=(BX,QXS,QXR)X = (B_X, Q_{XS}, Q_{XR}) with a value VV to be sent, and it has a body bXb_X.
  • The second case is a channel receive is on a channel Y=(BY,QYS,QYR)Y = (B_Y, Q_{YS}, Q_{YR}), and it has a body bYb_Y. The return value of the receive is pushed into the stash. In the actual implementation, the first instruction of bYb_Y is either an ASSIGN instruction (to assign the received value on top of the stash to a variable) or a POP instruction (to pop the received value from the stash). We assume bYb_Y follows this convention.
  • XX and YY are different channels, and both channels are not unbuffered (cap(X)>0\texttt{cap}(X) > 0 and cap(Y)>0\texttt{cap}(Y) > 0).
  • If QXRQ_{XR} is non-empty, then the frontmost WaitingInstance\texttt{WaitingInstance} in QXRQ_{XR} does not hold Waker\texttt{Waker}_{\varnothing}.
  • If QYSQ_{YS} is non-empty, then the frontmost WaitingInstance\texttt{WaitingInstance} in QYSQ_{YS} does not hold Waker\texttt{Waker}_{\varnothing}.
  • If both cases can proceed without blocking, the case with the channel send (channel XX) will be executed.

If channel XX can proceed without blocking, we have the two following inference rules. Note that it is very similar to the inference rules for CHAN_SEND_I.

X=H(Xa)=(BX,QXS,WaitingInstanceW,,bQXR)len(BX)=0W=H(Wa)=Waker(CW,SW,EW)((SELECT_I casesC,XVYS,E)T,H)T,H(T(bXC,S,E)(bCW,VSW,EW),H[Xa(BX,QXS,QXR)][WaWaker])\begin{matrix} X = \mathcal{H}(X_a) = (B_X, Q_{XS}, \texttt{WaitingInstance}_{W, \varnothing, b} \mathbin\Vert Q_{XR}) \qquad \texttt{len}(B_X) = 0 \qquad W = \mathcal{H}(W_a) = \texttt{Waker}_{(C_W, S_W, E_W)} \\ \hline ((\texttt{SELECT\_I} \ \texttt{cases} \mathbin\Vert C, X \mathbin\Vert V \mathbin\Vert Y \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} \\ \left(\mathcal{T} \mathbin\Vert (b_X \mathbin\Vert C, S, E) \mathbin\Vert (b \mathbin\Vert C_W, V \mathbin\Vert S_W, E_W), \mathcal{H}\left[X_a \gets (B_X, Q_{XS}, Q_{XR})\right]\left[W_a \gets \texttt{Waker}_{\varnothing}\right]\right) \end{matrix}

X=H(Xa)=(BX,QXS,QXR)len(QXR)=0len(BX)<cap(BX)((SELECT_I casesC,XVYS,E)T,H)T,H(T(bXC,S,E),H[Xa(BXV,QXS,QXR)])\begin{matrix} X = \mathcal{H}(X_a) = (B_X, Q_{XS}, Q_{XR}) \qquad \texttt{len}(Q_{XR}) = 0 \qquad \texttt{len}(B_X) < \texttt{cap}(B_X) \\ \hline ((\texttt{SELECT\_I} \ \texttt{cases} \mathbin\Vert C, X \mathbin\Vert V \mathbin\Vert Y \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} \\ \left(\mathcal{T} \mathbin\Vert (b_X \mathbin\Vert C, S, E), \mathcal{H}\left[X_a \gets (B_X \mathbin\Vert V, Q_{XS}, Q_{XR})\right]\right) \end{matrix}

If the channel XX would block, and the channel YY can proceed without blocking, we have the following inference rule. Note that it is very similar to the inference rules for CHAN_RECEIVE_I.

X=H(Xa)=(BX,QXS,QXR)len(QXR)=0len(BX)=cap(BX)Y=H(Ya)=(VYBY,WaitingInstanceW,VY,bQYS,QYR)len(BY)=cap(BY)>0W=H(Wa)=Waker(CW,SW,EW)((SELECT_I casesC,XVYS,E)T,H)T,H(T(bYC,VYS,E)(bCW,SW,EW),H[Ya(BYVY,QYS,QYR)][WaWaker])\begin{matrix} X = \mathcal{H}(X_a) = (B_X, Q_{XS}, Q_{XR}) \qquad \texttt{len}(Q_{XR}) = 0 \qquad \texttt{len}(B_X) = \texttt{cap}(B_X) \\ Y = \mathcal{H}(Y_a) = (V_Y \mathbin\Vert B_Y, \texttt{WaitingInstance}_{W, V_Y', b} \mathbin\Vert Q_{YS}, Q_{YR}) \qquad \texttt{len}(B_Y) = \texttt{cap}(B_Y) > 0 \qquad W = \mathcal{H}(W_a) = \texttt{Waker}_{(C_W, S_W, E_W)} \\ \hline ((\texttt{SELECT\_I} \ \texttt{cases} \mathbin\Vert C, X \mathbin\Vert V \mathbin\Vert Y \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} \\ \left(\mathcal{T} \mathbin\Vert (b_Y \mathbin\Vert C, V_Y \mathbin\Vert S, E) \mathbin\Vert (b \mathbin\Vert C_W, S_W, E_W), \mathcal{H}\left[Y_a \gets (B_Y \mathbin\Vert V_Y', Q_{YS}, Q_{YR})\right]\left[W_a \gets \texttt{Waker}_{\varnothing}\right]\right) \end{matrix}

X=H(Xa)=(BX,QXS,QXR)len(QXR)=0len(BX)=cap(BX)Y=H(Ya)=(VYBY,QYS,QYR)len(QYS)=0((SELECT_I casesC,XVYS,E)T,H)T,H(T(bYC,VYS,E),H[Ya(BY,QYS,QYR)])\begin{matrix} X = \mathcal{H}(X_a) = (B_X, Q_{XS}, Q_{XR}) \qquad \texttt{len}(Q_{XR}) = 0 \qquad \texttt{len}(B_X) = \texttt{cap}(B_X) \\ Y = \mathcal{H}(Y_a) = (V_Y \mathbin\Vert B_Y, Q_{YS}, Q_{YR}) \qquad \texttt{len}(Q_{YS}) = 0 \\ \hline ((\texttt{SELECT\_I} \ \texttt{cases} \mathbin\Vert C, X \mathbin\Vert V \mathbin\Vert Y \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} \\ \left(\mathcal{T} \mathbin\Vert (b_Y \mathbin\Vert C, V_Y \mathbin\Vert S, E), \mathcal{H}\left[Y_a \gets (B_Y, Q_{YS}, Q_{YR})\right]\right) \end{matrix}

If both channels would block, we have the following inference rule:

X=H(Xa)=(BX,QXS,QXR)len(QXR)=0len(BX)=cap(BX)Y=H(Ya)=(BY,QYS,QYR)len(QYS)=0len(BY)=0W=H(Wa)=Waker(C,S,E)IX=WaitingInstanceW,V,bXIY=WaitingInstanceW,,bY((SELECT_I casesC,XVYS,E)T,H)T,H(T,H[Xa(BX,QXSIX,QXR)][Ya(BY,QYS,QYRIY)])\begin{matrix} X = \mathcal{H}(X_a) = (B_X, Q_{XS}, Q_{XR}) \qquad \texttt{len}(Q_{XR}) = 0 \qquad \texttt{len}(B_X) = \texttt{cap}(B_X) \\ Y = \mathcal{H}(Y_a) = (B_Y, Q_{YS}, Q_{YR}) \qquad \texttt{len}(Q_{YS}) = 0 \qquad \texttt{len}(B_Y) = 0 \\ W = \mathcal{H}(W_a) = \texttt{Waker}_{(C, S, E)} \qquad I_X = \texttt{WaitingInstance}_{W, V, b_X} \qquad I_Y = \texttt{WaitingInstance}_{W, \varnothing, b_Y} \\ \hline ((\texttt{SELECT\_I} \ \texttt{cases} \mathbin\Vert C, X \mathbin\Vert V \mathbin\Vert Y \mathbin\Vert S, E) \mathbin\Vert \mathcal{T}, \mathcal{H}) \rightrightarrows_{\mathcal{T}, \mathcal{H}} \\ \left(\mathcal{T}, \mathcal{H}\left[X_a \gets (B_X, Q_{XS} \mathbin\Vert I_X, Q_{XR})\right]\left[Y_a \gets (B_Y, Q_{YS}, Q_{YR} \mathbin\Vert I_Y)\right]\right) \end{matrix}

Project Source

Our project source code is available on GitHub at benson1029/go-slang. It is a standalone React application that can be run locally, or deployed to GitHub Pages on https://benson1029.github.io/go-slang/.

To run the project locally, you need to have Node.js and npm installed. After that, you can run the application locally by running the following commands:

git clone https://github.com/benson1029/go-slang.git
cd go-slang
npm install
npm start

Then, open http://localhost:3000 to view it in your browser.

Test Cases

Unit tests for the parser and preprocessor (specially, the module for sorting global declarations) are available in src/go/parser/go.test.js and src/go/ece/loader/globalSort.test.ts. Integration tests for the ECE machine are available under the directory src/go/ece/tests, which cover all language constructs that are supported by the ECE machine.

For most test cases that does not involve recursive calls, we additionally check whether all objects in the heap are freed after the program execution. This confirms that our reference counting garbage collector is working correctly. Recursive calls will create cyclic data structures in the heap, thus we do not check for unfreed objects in these cases.

Additionally, for all test cases, we will run it a second time, triggering mark and sweep garbage collection for every heap allocation. This is to ensure that the mark and sweep garbage collector, especially the handling of temporary roots, is working correctly. Therefore, we have high confidence that our garbage collector is working correctly.

The purpose of each test is documented in the source code.

To run the tests, you can run the following command:

npm test