I’ve spent quite a lot of time, during the past few weeks, working on the representation of my dataflow language. In this post I will attempt to outline my current thoughts and hopefully generate some feedback on what I got right and wrong, what I could improve, how the language could be made more intuitive, easier (and faster) to develop in and less bug-prone.
For this post, I will define some basic terminology. This will help me explain my ideas consistently, but may not match existing terminology. If there is such a mismatch, please let me know and I’ll update the post. Later, I will probably compile a list of terminology specific to my language, using existing terminology where possible, as a reference for people viewing my language for the first time. For now, however, I’ll simply use these:
- operation – a fundamental instruction or external block of code, which acts as a single node in the dataflow network
- component – either a single operation or a collection of components. The definition is recursive: each member of a collection is itself either a single operation or a further collection of components.
- expression – a deterministic calculation whose output is a function of its inputs, calculated using the substitution model of computation. Expressions consist of a limited set of predefined (usually mathematical or logical) operators.
- pre-condition – a logical expression which evaluates to either True or False and must evaluate to True for a component to be considered executable. May optionally trigger an error state if it evaluates to False.
- post-condition – a logical expression which evaluates to either True or False and must evaluate to True for a component to propagate its output to connected nodes. May optionally trigger an error state if it evaluates to False.
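To make the pre- and post-condition definitions above a little more concrete, here is a minimal Python sketch of the gating behaviour. The language itself is graphical, so this is only an illustration: the names `Component`, `fire` and `NOT_FIRED` are hypothetical, and the optional error state is left out.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

# Sentinel meaning "nothing was propagated" (hypothetical name).
NOT_FIRED = object()


@dataclass
class Component:
    """A single node: an expression guarded by optional conditions."""
    expression: Callable[..., Any]
    pre: Optional[Callable[..., bool]] = None     # checked on the inputs
    post: Optional[Callable[[Any], bool]] = None  # checked on the output

    def fire(self, *inputs: Any) -> Any:
        # A failed pre-condition means the component is not executable.
        if self.pre is not None and not self.pre(*inputs):
            return NOT_FIRED
        result = self.expression(*inputs)
        # A failed post-condition means the output is not propagated.
        if self.post is not None and not self.post(result):
            return NOT_FIRED
        return result


halve_even = Component(
    expression=lambda x: x // 2,
    pre=lambda x: x % 2 == 0,  # only even inputs are executable
    post=lambda y: y > 0,      # only positive results are propagated
)
```

Here `halve_even.fire(8)` produces a value, while an odd input fails the pre-condition and an input of 0 passes the pre-condition but fails the post-condition, so neither propagates anything.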
For now, I have defined a set of data types for use in the language. The final set of supported data types is still undecided, as I need to do more real-world tests to see what is useful and what is not. The data types I have currently defined are:
- integer – a signed whole number whose range is defined by the host system’s native integer size (i.e. 32-bit on a 32-bit machine).
- real – a signed floating point number whose range is defined by the host system’s native floating point support.
- byte – an unsigned 8-bit value used to store binary data.
- character – a single Unicode character.
- pair – a container of two values, where each value may be of a different data type.
- array – a container of a fixed number of values, where each value is of the same data type.
- record – a container of a predetermined set of named fields, where each field may be of a different data type.
Additionally, there’s a special data type: the sequence. A sequence is a stream of values of the same data type and is the connection between components – that is, I am naming the flow of data between nodes a sequence.
I am also contemplating whether a variant data type should also be provided as this could be useful, together with a guard or pattern matching mechanism, for receiving data from external sources.
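As a rough illustration of how the container types and the sequence described above fit together, here is a small Python sketch. It is an approximation under stated assumptions: Python integers are unbounded, so the native-range rule is not modelled, the variant type is not covered, and `Reading` and `scale` are hypothetical names made up for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple

# pair: a container of two values whose types may differ.
Pair = Tuple[int, str]
labelled: Pair = (1, "one")


# record: a predetermined set of named fields of possibly different types.
@dataclass(frozen=True)
class Reading:  # hypothetical record type
    sensor: str
    value: float


def scale(readings: Iterable[Reading], factor: float) -> Iterator[Reading]:
    """A component: consumes one sequence and lazily produces another."""
    for r in readings:
        yield Reading(r.sensor, r.value * factor)


# A sequence is a stream of values that all share one data type.
source = iter([Reading("a", 1.5), Reading("b", 2.0)])
scaled = list(scale(source, 2.0))
```

The generator plays the role of a node: values arrive one at a time on the incoming sequence and leave one at a time on the outgoing one.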
To demonstrate the language syntax, I will construct a network to perform the following sample pseudocode:
input a, b and c
value d = (a * b) - 1
if d > 10 then
    value e = c - 1
This simple piece of code would be represented in my dataflow language as:
This simple dataflow network shows two “merge” components connected together. Before explaining how this network works, I will briefly introduce the merge components.
A basic merge component takes one of the following two forms:
A merge component has two or three inputs and a single output. It also has an associated expression. The output is the result of applying the expression to the inputs.
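In textual form, a basic merge component could be sketched roughly like this in Python. The class name `Merge` and the `arity` parameter are hypothetical; the sketch only captures the rule of two or three inputs, one expression and one output.

```python
from typing import Any, Callable


class Merge:
    """Combines two or three inputs into one output via an expression."""

    def __init__(self, expression: Callable[..., Any], arity: int) -> None:
        if arity not in (2, 3):
            raise ValueError("a merge component has two or three inputs")
        self.expression = expression
        self.arity = arity

    def __call__(self, *inputs: Any) -> Any:
        if len(inputs) != self.arity:
            raise TypeError(f"expected {self.arity} inputs, got {len(inputs)}")
        # The output is the expression applied to the inputs.
        return self.expression(*inputs)


# First node of the sample network: d = (a * b) - 1
first = Merge(lambda a, b: (a * b) - 1, arity=2)
d = first(3, 4)
```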
There also exist additional variants of most operations – those with optional pre- and/or post-conditions attached. Pre-conditions are marked as a green block on the left and post-conditions are marked as a red block on the right. In the sample code, presented above, the second merge node has a pre-condition assigned to it.
In the code editor, it would be possible to expand, and therefore view, the pre-conditions, expressions and post-conditions. The diagram above uses the collapsed view to demonstrate that a more compact view of the code, which does not display all available information, would be available in an editor. The expanded view would look as follows:
Note the use of coloured markers where the inputs are used. This is done because, when programmers are forced to name inputs that it may not make sense to name, we tend, being as lazy as we are, to give them meaningless names. This only adds to the noise and makes the program more difficult to read and understand. Using coloured markers makes it immediately obvious which inputs are used where, without polluting the code with meaningless identifiers. It should, however, be possible to optionally label the markers, if it makes sense to do so. Perhaps the labels could be displayed as a tooltip when hovering the mouse over the marker. It may also be useful to highlight each occurrence of an input within the component when hovering the mouse over it.
This code shows that, in the first merge component, the expression (a * b) - 1 is computed. The result of this expression is passed to the next node, which contains the pre-condition that its first input (d in the pseudocode) must be greater than 10. If it is not, then that input (and the accompanying input c) are discarded and no further action is taken. If the pre-condition passes, however, then the expression is evaluated, subtracting 1 from input c.
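The behaviour of this two-node network can be sketched as plain Python functions. This is only an approximation of the graphical form: the function names are hypothetical, the second expression follows the pseudocode (e = c - 1), and returning `None` to mean "inputs discarded, nothing emitted" is an assumption made for the sketch.

```python
from typing import Optional


def first_merge(a: int, b: int) -> int:
    # Expression of the first merge node.
    return (a * b) - 1


def second_merge(d: int, c: int) -> Optional[int]:
    # Pre-condition: the first input must be greater than 10.
    if not d > 10:
        return None  # both inputs are discarded; nothing is emitted
    # Expression of the second merge node, following the pseudocode.
    return c - 1


def run_network(a: int, b: int, c: int) -> Optional[int]:
    # The output of the first node feeds the first input of the second.
    return second_merge(first_merge(a, b), c)
```

With a = 3, b = 4 and c = 5, d is 11, so the pre-condition passes and the second node emits a value. With a = 2 and b = 3, d is 5 and nothing is emitted.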
The example below is a bit more complex and introduces variants of the “merge” component (multiple expressions, single input, etc.) and a new node: the “route” component. The route component evaluates a condition and then routes the inputs to one of two possible sets of outputs. Route components may have a pre-condition, but cannot have any post-conditions, as the inputs are never modified by this node – the same effect may be achieved through pre-conditions.
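A route component could be sketched in Python along the following lines. The class name `Route`, the use of `None` for "this set of outputs receives nothing", and the choice that a true condition selects the first set of outputs are all assumptions made for illustration.

```python
from typing import Any, Callable, Optional, Tuple

Inputs = Tuple[Any, ...]


class Route:
    """Passes its inputs, unmodified, to one of two sets of outputs."""

    def __init__(self, condition: Callable[..., bool],
                 pre: Optional[Callable[..., bool]] = None) -> None:
        self.condition = condition
        self.pre = pre

    def __call__(self, *inputs: Any) -> Tuple[Optional[Inputs], Optional[Inputs]]:
        # Failed pre-condition: the inputs are discarded entirely.
        if self.pre is not None and not self.pre(*inputs):
            return (None, None)
        if self.condition(*inputs):
            return (inputs, None)  # routed to the first set of outputs
        return (None, inputs)      # routed to the second set of outputs


by_sign = Route(condition=lambda x: x >= 0, pre=lambda x: x != 0)
```

Because the inputs pass through untouched, there is nothing for a post-condition to check that a pre-condition could not check already.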
This example deserves a more in-depth explanation than I have time for, so I will edit this post over the next week or so to add more detail and explain my ideas better.



