## Disclaimer
Yes, I know [DIV Go](http://www.divgo.net/), I know
[the source code is on GitHub](https://github.com/DIVGAMES/DIV-Games-Studio)
and I also know this is probably a useless project but, you know?
**I don't care.**
## Index
1. How does DIV work?
2. The execution model
3. The memory model
4. The execution pipeline
![DIV2 running on DOS-BOX](./imgs/div2-dosbox.png)
DIV programs have a point of entry...
![DIV2 main program](./imgs/div2-main-program.png)
...and a set of processes.
![DIV2 processes](./imgs/div2-processes.png)
DIV2 processes are similar to Linux processes in a single-thread.
They run for a while, until executing a `FRAME` sentence.
![Processes list](./imgs/processes-list.png)
![Process Execution](./imgs/div2-process-execution.png)
When all the processes have executed `FRAME`, the graphic engine draws a new frame.
And the execution continues **after the `FRAME`**...
![Resume process list](./imgs/resume-execution.png)
...until finding another one.
![Next frame](./imgs/div2-process-execution-next-frame.png)
Do you know what in JavaScript is very similar to this?
A [generator](https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Statements/function*)!
```js
function* process() {
let counter = 0;
while (true) {
yield counter; // This is like FRAME.
counter++;
}
}
```
![Process hierarchy](./imgs/div2-process-hierarchy.png)
Processes have a hierarchy, a type, an ID...
DIV memory is **manually managed**.
![DIV2 Memory](./imgs/memory.png)
It is a continous chunk of 4-byte aligned space.
And you know what in JavaScript is very similar to this?
A [TypedArray](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays)!
```js
const memory = new Int32Array(5); // 5 32-bit intenger cells.
const ANGLE = 2;
memory[ANGLE] = 90000; // Variable assignation.
memory[ANGLE + 1] = 90000; // Pointer aritmethic!
```
Let's consider generators.
With generators, a DIV process...
```div
PROCESS asteroid()
BEGIN
graph = 1;
WHILE (NOT out_of_region(ID, 0))
graph = ((graph + 1) % 10 + 1);
y += 10;
FRAME;
END
END
```
...would be translated into a JavaScript generator:
```js
function* asteroid(mem) {
const GRAPH = 100;
const Y = 101;
mem[GRAPH] = 1;
while (!out_of_region(mem[ID], 0)) {
mem[GRAPH] = ((mem[GRAPH] + 1) % 10 + 1);
mem[Y] += 10;
yield { type: 'frame', completion: 100 };
}
}
```
Creating a new process would look like:
```js
function newProcess(name, args) {
const processType = processes[name];
const mem = new Int32Array(PROCESS_SIZE);
initializeMemory(mem, args);
const process = processType(mem);
process.mem = mem;
processList.push(process);
}
```
And the scheduler would look like this:
```js
function runProcessList() {
let retValue;
while (processList.length) {
for (let i = 0, current; current = processList[i]; i++) {
const result = current.next(retValue); // run until yield.
retValue = handle(result, current.mem);
};
drawFrame();
}
}
```
`handle()` would manipulate the process list according to the result of the execution of
the process.
`retValue` is sent to the process since some statements can return values (i.e. _calling a process returns its ID_).
* `FRAME` continues with the next process.
* `<EOP>` removes the process from the list.
* `anotherProcess()` adds a new process to the list.
* `DEBUG` starts the debugger, then continues with the current process.
* `CLONE` adds a new process, copying the state of the current process.
```div
PROGRAM test;
BEGIN
write_int(0, 0, 0, 0, ID);
write(0, 0, 10, 0, "Only the parent.");
CLONE
x = 100;
write_int(0, x, 20, 0, ID);
write(0, x, 30, 0, "Only the child.");
END
write_int(0, x, 40, 0, ID);
write(0, x, 50, 0, "Both.");
LOOP FRAME; END // alt+x to quit
END
```
![Clone execution](./imgs/clone-execution.png)
Here is the problem: copying the memory state is possible but **copying the execution state is not**.
So I tried [the standard way](https://esdiscuss.org/topic/proposal-generator-clone-and-generator-goto).
![Discussion about adding generator#clone and generator#goto](./imgs/generator-clone.png)
I knew the solution was a state machine...
So [I wrote my own](https://github.com/delapuente/div2js/blob/master/src/context.js#L139).
This is how translating `CLONE` looks like:
```js
translators.CloneSentence = function (divClone, context) {
var insideCloneLabel = context.newLabel();
var afterCloneLabel = context.newLabel();
context.clone(insideCloneLabel, afterCloneLabel);
context.label(insideCloneLabel);
translateBody(divClone, context);
context.label(afterCloneLabel);
};
```
And this is how the `CLONE` example looks once transpiled:
```js
function program_test(mem, exec, args) {
while (true) {
switch (exec.pc) {
case 1:
return __yieldCallFunction(2, 'write_int', [0, 0, 0, 0, mem[ID]]);
case 2:
exec.retv.dequeue();
return __yieldCallFunction(4, 'write', [0, 0, 10, 0, 'Only the parent.']);
case 4:
exec.retv.dequeue();
return __yieldClone(6, 11);
case 6:
mem[x] = 100;
return __yieldCallFunction(8, 'write_int', [0, mem[x], 20, 0, mem[ID] ]);
case 8:
exec.retv.dequeue();
return __yieldCallFunction(10, 'write', [0, mem[x], 30, 0, 'Only the child.']);
case 10:
exec.retv.dequeue();
case 11:
return __yieldCallFunction(12, 'write_int', [0, mem[x], 40, 0, mem[ID]]);
case 12:
exec.retv.dequeue();
return __yieldCallFunction(14, 'write', [0, mem[x], 50, 0, 'Both.']);
case 14:
exec.retv.dequeue();
case 15:
exec.pc = 15;
break;
case 16:
return __yieldEnd;
}
}
}
```
You migh be wondering **why so many yields**?
Is it necessary to yield on **every** call?
Because you need to return to the scheduler to have the opportunity of launching the debugger.
Each process has a _reserved_ memory region.
![Local data](./imgs/div2-local-data.png)
Implementation goal:
Preserve pointer semantics across regions.
Memory is a `Int32Array`
```
const mem = new Int32Array(PROCESS_SIZE);
```
So, first question, what is `PROCESS_SIZE`?
Hypothesis: _with no privates, size of the process is size of locals_.
Gut feeling: _`ID` is the address of the process_.
According to the debugger, there are **44 locals**.
```div
PROGRAM calc_process_size;
BEGIN
p();
p();
DEBUG;
END
PROCESS p();
BEGIN
LOOP FRAME; END
END
```
![Substracting one process ID fron another, you have the process size](./imgs/div2-process-size.png)
1751 - 1707 = 44; 1795 - 1751 = 44
Fair enough.
Hypothesis: _with privates, size of the process is size of locals + privates_.
```div
PROGRAM calc_process_size;
BEGIN
p();
q();
DEBUG;
END
PROCESS p();
PRIVATE
a;
BEGIN
LOOP FRAME; END
END
PROCESS q();
PRIVATE
a;
b;
BEGIN
LOOP FRAME; END
END
```
![Size is constant](./imgs/div2-constant-size.png)
1769 - 1723 = 46; 1815 - 1769 = 46
WTF?
So, DIV has a fixed process size, which is the sum of local segment and the biggest private region.
![Process memory region](./imgs/memory-segment.png)
Regions include enough space for _locals_ and _private_ variables.
In addition to a fixed number of process, the design leads to:
* Fixed program footprint.
* Very fast process allocation.
Variable names will be array indices
```
mem[X] = 10; // x = 10
mem[Y] = 10; // y = 10
```
So, second question is, where is the data inside these regions?
What is the first variable?
```div
PROGRAM memory_map;
PRIVATE
pid;
BEGIN
pid = p();
(*pid) = 1234567890;
DEBUG;
END
PROCESS p();
BEGIN
LOOP FRAME; END
END
```
![First local is process id](./imgs/div2-first-local.png)
Where are the other?
```div
PROGRAM memory_offsets;
PRIVATE
target;
result;
BEGIN
target = &reserved.id_scan;
result = target - ID;
write_int(0, 0, 0, 0, &result);
LOOP FRAME; END
END
```
![id_scan offset is 1](./imgs/div2-memory-locations.png)
Double check privates come after locals.
```div
PROGRAM segment_order;
PRIVATE
a;
BEGIN
*(ID + 44) = 1234567890;
DEBUG;
END
```
![Privates goes after locals](./imgs/div2-segment-order.png)
Pointer airthmetic goes in between the brackets
```js
mem[X] = mem[MOUSE_STRUCT + MOUSE_X]; // x = mouse.x
```
The only type in DIV (first version) was the 4 byte int. So, no need for types at all since this was also the width of the pointer.
But DIV2 adds `BYTE`, `WORD` and `INT` to explicitly refer to 1 byte, 2 byte and 4 byte values respectively.
And it also adds `POINTER` to those types and the `STRING` view.
The alignment is still 4 bytes but DIV2 ensures arrays are continuous. So...
|Value | Space required | Space used |
| ------------- |
|WORD[1] a; | 4 bytes | 4 bytes|
|WORD b, c; | 4 bytes | 8 bytes|
So, how to access `a[1]`?
```div
PROGRAM pointer_arithmetic;
PRIVATE
WORD POINTER pa, pb, pc;
WORD[1] a;
WORD b, c;
BEGIN
pa = &a;
pb = &b;
pc = &c;
DEBUG;
END
```
![Contiguous memory](./imgs/div2-contiguous-memory.png)
```div
PROGRAM pointer_arithmetic;
PRIVATE
WORD POINTER pa, pb, pc;
WORD[1] a;
WORD b, c;
BEGIN
pa = &a[1];
pb = &b;
pc = &c;
DEBUG;
END
```
![No offset](./imgs/div2-no-offset.png)
```div
PROGRAM pointer_arithmetic;
PRIVATE
WORD POINTER pa, pb, pc;
WORD[1] a;
WORD b, c;
BEGIN
pa = &a[1];
pb = &b;
pc = &c;
*pa = 123;
*(pa + 1) = 456;
DEBUG;
END
```
![Double pointer arithmetic](./imgs/div2-double-arithmetic.png)
```div
PROGRAM pointer_arithmetic;
PRIVATE
WORD POINTER pa, pb, pc;
WORD[1] a;
WORD b, c;
BEGIN
pa = &a[1];
pb = &b;
pc = &c;
*pa = 123;
*pa = 456 << 16 or *pa;
DEBUG;
END
```
![Pointer arithmetic for array values](./imgs/div2-double-arithmetic-fixed.png)
So, there is a **double pointer arithmetic**.
A DIV legacy one for **4 byte aligned values**.
```js
mem[A] = 123;
mem[A + 1] = 456;
```
Another for **array values**.
```js
mem[A] = 456 << 16 | mem[A];
```
### Globals and text segments
Is it possible to achieve total pointer compatibility?
Yes, if processes receive **all the memory**.
Fortunately, source is now on GitHub so [figuring
out global memory](https://github.com/DIVGAMES/DIV-Games-Studio/blob/0c006cca548f9d6dc66d174d4f05d167148c7e78/dll/div.h) map was much easier.
- So, where the globals are?
- And the text of the program?
- Are there gaps in between?
## The Execution Pipeline
The process to execute a DIV program
traverses several stages.
1. Transpilation
2. Load
3. Execution
Is the process for which a DIV program is translated into a JavaScript.
```div
PROGRAM test;
BEGIN
text_z = 1;
DEBUG;
a();
DEBUG;
text_z = 3;
DEBUG;
END
PROCESS a();
BEGIN
text_z = 2;
END
```
During **parsing**, source code is translated into a **DIV Abstract Syntax Tree**.
```js
{
"type": "Unit",
"program": {
"type": "Program",
"name": {
"type": "Identifier",
"name": "test"
},
"consts": null,
"globals": null,
"locals": null,
"privates": null,
"body": {
"type": "ProcessBody",
"sentences": [
{
"type": "ExpressionSentence",
"expression": {
"type": "AssignmentExpression",
"operator": "=",
"left": {
"type": "Identifier",
"name": "text_z"
},
"right": {
"type": "Literal",
"value": 1,
"raw": "1"
}
}
},
{
"type": "DebugSentence"
},
{
"type": "ExpressionSentence",
"expression": {
"type": "CallExpression",
"callee": {
"type": "Identifier",
"name": "a"
},
"arguments": []
}
},
{
"type": "DebugSentence"
},
{
"type": "ExpressionSentence",
"expression": {
"type": "AssignmentExpression",
"operator": "=",
"left": {
"type": "Identifier",
"name": "text_z"
},
"right": {
"type": "Literal",
"value": 3,
"raw": "3"
}
}
},
{
"type": "DebugSentence"
}
]
}
},
"processes": [
{
"type": "Process",
"name": {
"type": "Identifier",
"name": "a"
},
"privates": null,
"body": {
"type": "ProcessBody",
"sentences": [
{
"type": "ExpressionSentence",
"expression": {
"type": "AssignmentExpression",
"operator": "=",
"left": {
"type": "Identifier",
"name": "text_z"
},
"right": {
"type": "Literal",
"value": 2,
"raw": "2"
}
}
}
]
}
}
]
}
```
[JISON](http://zaa.ch/jison/) is used to write the grammar and parse the source code.
> JISON: Your friendly JavaScript parser generator!
[_friendly_](https://github.com/delapuente/div2js/blob/master/src/grammar.y)
The **semantic checker** runs now (which includes the type checker) extracting the enough context to drive the translation.
```js
{
"_processes": {
"a": true
},
"_auxNames": {},
"_currentProcessPrivates": {}
}
```
With the DIV AST + context, the translator has
enough information to create a **JavaScript AST**.
```js
{
"type": "Program",
"body": [
{
"type": "VariableDeclaration",
"declarations": [
{
"type": "VariableDeclarator",
"id": {
"type": "Identifier",
"name": "G_BASE"
},
"init": {
"type": "Literal",
"value": 0,
"raw": "0"
}
},
{
"type": "VariableDeclarator",
"id": {
"type": "Identifier",
"name": "G_TEXT_Z"
},
"init": {
"type": "Literal",
"value": 0,
"raw": "0"
}
}
],
"kind": "var"
},
{
"type": "FunctionDeclaration",
"id": {
"type": "Identifier",
"name": "program_test"
},
"params": [
{
"type": "Identifier",
"name": "mem"
},
{
"type": "Identifier",
"name": "exec"
},
{
"type": "Identifier",
"name": "args"
}
],
"defaults": [],
"body": {
"type": "BlockStatement",
"body": [
{
"type": "WhileStatement",
"test": {
"type": "Literal",
"value": true,
"raw": "true"
},
"body": {
"type": "BlockStatement",
"body": [
{
"type": "SwitchStatement",
"discriminant": {
"type": "MemberExpression",
"computed": false,
"object": {
"type": "Identifier",
"name": "exec"
},
"property": {
"type": "Identifier",
"name": "pc"
}
},
"cases": [
{
"type": "SwitchCase",
"test": {
"type": "Literal",
"value": 1,
"raw": "1"
},
"consequent": [
{
"type": "ExpressionStatement",
"expression": {
"type": "AssignmentExpression",
"operator": "=",
"left": {
"type": "MemberExpression",
"computed": true,
"object": {
"type": "Identifier",
"name": "mem"
},
"property": {
"type": "BinaryExpression",
"operator": "+",
"left": {
"type": "Identifier",
"name": "G_BASE"
},
"right": {
"type": "Identifier",
"name": "G_TEXT_Z"
}
}
},
"right": {
"type": "Literal",
"value": 1,
"raw": "1"
}
}
},
{
"type": "ReturnStatement",
"argument": {
"type": "CallExpression",
"callee": {
"type": "Identifier",
"name": "__yieldDebug"
},
"arguments": [
{
"type": "Literal",
"value": 3,
"raw": "3"
}
]
}
}
]
},
{
"type": "SwitchCase",
"test": {
"type": "Literal",
"value": 3,
"raw": "3"
},
"consequent": [
{
"type": "ReturnStatement",
"argument": {
"type": "CallExpression",
"callee": {
"type": "Identifier",
"name": "__yieldNewProcess"
},
"arguments": [
{
"type": "Literal",
"value": 4,
"raw": "4"
},
{
"type": "Literal",
"value": "a",
"raw": "\"a\""
},
{
"type": "ArrayExpression",
"elements": []
}
]
}
}
]
},
{
"type": "SwitchCase",
"test": {
"type": "Literal",
"value": 4,
"raw": "4"
},
"consequent": [
{
"type": "ExpressionStatement",
"expression": {
"type": "CallExpression",
"callee": {
"type": "MemberExpression",
"computed": false,
"object": {
"type": "MemberExpression",
"computed": false,
"object": {
"type": "Identifier",
"name": "exec"
},
"property": {
"type": "Identifier",
"name": "retv"
}
},
"property": {
"type": "Identifier",
"name": "dequeue"
}
},
"arguments": []
}
},
{
"type": "ReturnStatement",
"argument": {
"type": "CallExpression",
"callee": {
"type": "Identifier",
"name": "__yieldDebug"
},
"arguments": [
{
"type": "Literal",
"value": 6,
"raw": "6"
}
]
}
}
]
},
{
"type": "SwitchCase",
"test": {
"type": "Literal",
"value": 6,
"raw": "6"
},
"consequent": [
{
"type": "ExpressionStatement",
"expression": {
"type": "AssignmentExpression",
"operator": "=",
"left": {
"type": "MemberExpression",
"computed": true,
"object": {
"type": "Identifier",
"name": "mem"
},
"property": {
"type": "BinaryExpression",
"operator": "+",
"left": {
"type": "Identifier",
"name": "G_BASE"
},
"right": {
"type": "Identifier",
"name": "G_TEXT_Z"
}
}
},
"right": {
"type": "Literal",
"value": 3,
"raw": "3"
}
}
},
{
"type": "ReturnStatement",
"argument": {
"type": "CallExpression",
"callee": {
"type": "Identifier",
"name": "__yieldDebug"
},
"arguments": [
{
"type": "Literal",
"value": 8,
"raw": "8"
}
]
}
}
]
},
{
"type": "SwitchCase",
"test": {
"type": "Literal",
"value": 8,
"raw": "8"
},
"consequent": [
{
"type": "ReturnStatement",
"argument": {
"type": "Identifier",
"name": "__yieldEnd"
}
}
]
}
]
}
]
}
}
]
},
"generator": false,
"expression": false
},
{
"type": "FunctionDeclaration",
"id": {
"type": "Identifier",
"name": "process_a"
},
"params": [
{
"type": "Identifier",
"name": "mem"
},
{
"type": "Identifier",
"name": "exec"
},
{
"type": "Identifier",
"name": "args"
}
],
"defaults": [],
"body": {
"type": "BlockStatement",
"body": [
{
"type": "WhileStatement",
"test": {
"type": "Literal",
"value": true,
"raw": "true"
},
"body": {
"type": "BlockStatement",
"body": [
{
"type": "SwitchStatement",
"discriminant": {
"type": "MemberExpression",
"computed": false,
"object": {
"type": "Identifier",
"name": "exec"
},
"property": {
"type": "Identifier",
"name": "pc"
}
},
"cases": [
{
"type": "SwitchCase",
"test": {
"type": "Literal",
"value": 1,
"raw": "1"
},
"consequent": [
{
"type": "ExpressionStatement",
"expression": {
"type": "AssignmentExpression",
"operator": "=",
"left": {
"type": "MemberExpression",
"computed": true,
"object": {
"type": "Identifier",
"name": "mem"
},
"property": {
"type": "BinaryExpression",
"operator": "+",
"left": {
"type": "Identifier",
"name": "G_BASE"
},
"right": {
"type": "Identifier",
"name": "G_TEXT_Z"
}
}
},
"right": {
"type": "Literal",
"value": 2,
"raw": "2"
}
}
},
{
"type": "ReturnStatement",
"argument": {
"type": "Identifier",
"name": "__yieldEnd"
}
}
]
}
]
}
]
}
}
]
},
"generator": false,
"expression": false
}
]
}
```
The output AST follows the [ESTree](https://github.com/estree/estree) specification which is based in the [Mozilla Parser API](https://developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/Parser_API).
Which is the AST for this:
```js
var G_BASE = 0, G_TEXT_Z = 0;
function program_test(mem, exec, args) {
while (true) {
switch (exec.pc) {
case 1:
mem[G_BASE + G_TEXT_Z] = 1;
return __yieldDebug(3);
case 3:
return __yieldNewProcess(4, 'a', []);
case 4:
exec.retv.dequeue();
return __yieldDebug(6);
case 6:
mem[G_BASE + G_TEXT_Z] = 3;
return __yieldDebug(8);
case 8:
return __yieldEnd;
}
}
}
function process_a(mem, exec, args) {
while (true) {
switch (exec.pc) {
case 1:
mem[G_BASE + G_TEXT_Z] = 2;
return __yieldEnd;
}
}
}
```
This AST is then combined with a fixed one,
which contains a set of **bridge functions** to
communicate with the runtime.
```js
(function (rt) {
'use strict';
/* Here comes the memory map */
function __yieldDebug(npc) {
return new rt.Baton('debug', { npc: npc });
}
function __yieldNewProcess(npc, processName, args) {
return new rt.Baton('newprocess', {
npc: npc,
processName: processName,
args: args
});
}
var __yieldEnd = new rt.Baton('end');
return { /* Here come the process and memory maps */ };
});
```
[Esprima](http://esprima.org/) is used to parse the code above.
Now the —3rd party— **generator** ([escodegen](https://github.com/estools/escodegen)) can create the proper JavaScript code.
```js
(function (rt) {
'use strict';
var G_BASE = 0, G_TEXT_Z = 0;
function __yieldDebug(npc) {
return new rt.Baton('debug', { npc: npc });
}
function __yieldNewProcess(npc, processName, args) {
return new rt.Baton('newprocess', {
npc: npc,
processName: processName,
args: args
});
}
var __yieldEnd = new rt.Baton('end');
return {
pmap: {
program: function program_test(mem, exec, args) {
while (true) {
switch (exec.pc) {
case 1:
mem[G_BASE + G_TEXT_Z] = 1;
return __yieldDebug(3);
case 3:
return __yieldNewProcess(4, 'a', []);
case 4:
exec.retv.dequeue();
return __yieldDebug(6);
case 6:
mem[G_BASE + G_TEXT_Z] = 3;
return __yieldDebug(8);
case 8:
return __yieldEnd;
}
}
},
process_a: function process_a(mem, exec, args) {
while (true) {
switch (exec.pc) {
case 1:
mem[G_BASE + G_TEXT_Z] = 2;
return __yieldEnd;
}
}
}
},
mmap: {
G: {
G_BASE: 0,
G_TEXT_Z: 0
},
L: {},
P: {}
}
};
});
```
Loading the program means linking the object artefact with the runtime to create a **executable program**.
First the unit is evaluated with `eval()` passing the runtime functions needed to communicate with the scheduler.
```js
const unit = eval(objText)(runtime);
const processMap = unit.pmap;
const memoryMap = unit.mmap;
```
Now the process and memory maps are used to create a new program.
```js
const program = new runtime.Runtime(processMap, memoryMap)
```
Before executing the program, we can listen for some events.
```js
// Both events don't finish until resolving a promise.
program.ondebug = () => openDebugger();
program.onfinished = () => showFinishMessage();
```
But our real interest is to run the program.
```js
program.run();
```