Functional programming places a high value on immutability and functional purity. If all data is immutable and functions never have any side effects, then a function call and the value returned are interchangeable. This aspect, called referential transparency, simplifies reading code. It also makes concurrent code easier to write since race conditions are non-existent. The downside of pure code is that since it has no side effects it cannot interact with the world — including us.
Continuing with the MNIST dataset example from the previous post, let’s say we want to write a function that loads the MNIST data set. A possible type signature would be:
get_mnist_data :: () -> MNIST
The ()
is the unit type. The signature says that the function
get_mnist_data
takes no input and returns the MNIST
dataset. It is telling
us the same information we would get from a C/C++
prototype (where ()
is
the equivalent of void
). This simple signature may be sufficient since we can
guess what it is most likely to be doing. However, it might be doing something
other than loading the MNIST dataset, such as randomly generating a simulation
of the MNIST data.
The get_mnist_data
function is of the more general form
f :: () -> a
The reverse of this general form is a → ()
. Functions of this reversed form
include plot, print, or cache functions that write something to the
environment. Other fucntions of this form may alter the environment in more
exotic ways, like commanding a robot to take lift its hand.
What these functions have in common is that they are interacting with the environment. They all take input from or execute output in the environment. We can label all these effects as Input/Ouput, IO, effects.
There are other types of effects that are also not captured in a pure mathematical function.
One such effect is mutation. Immutability is one of the core principles of functional programming, but it is sometimes desirable to mutate an object in place. Mutating data structures in place is required for many efficient algorithms. Mutation is also useful when we want to avoid keeping multiple copies of an object in memory. For example, if we are working with a dataset of millions of images, and have limited memory, we might want to discard the intermediate images produced as we go through several steps of scaling, recoloring, cropping, and noise filtering.
Suppose we have an increment
function that adds one to an input integer in
place. Without special type-level support, its signature would be:
increment :: Int -> Int
Now suppose we use this function in the following code:
x = 21
y = add (increment x) x
Is the correct result 42 or 41? If increment x
is called first, x
will be
changed in memory and add
will return 42. This is the most reasonable result.
However, there is no guarantee that a particular add
function will operate in
this order. Further, the morloc
compiler may choose to evaluate the two
inputs in parallel and feed the results back to add
. So use of this mutating
increment
function could be lead to a race condition.
However, if morloc
knew increment
mutated its input, then it could
guarantee that no race conditions or inconsistencies arise by ensuring that
variables are not reused after mutation.
Here is a deeper example. Suppose we have the functions "transpose" and "scale" that perform matrix operations in place. It would be valid to write:
munge filename = scale (transpose (read_matrix filename))
This function reads a matrix from a file, transposes and scales the matrix, and
returns the result. All this would be done efficiently in place. In fact, if
scale
and transpose
did not mutate the matrix, the function would still
work. Further, the munge
function is not a mutating function since it does
not mutate its input filename
. So a function may be non-mutating overall even
if internally it uses mutatable data. In contrast, the function
mungeMut x = scale (transpose x)
does mutate its intput (x
) if scale
or transpose
is mutating. Thus to
know how mungeMut
behaves we must know how scale
and transpose
behave. So
we need to know what effects are involved in every function.
A third effect is exception raising. A function that can go wrong may raise an exception that winds up the call stack. If any function in a nested function composition can raise an exception, then the entire composition can raise an exception (assuming the exception is not handled). Thus the exception effect is qualitatively different from the mutation effect (where mutation doesn’t always propagate up).
A fourth effect is non-determinism. The most common type of non-determinism is
random number generation. For example, the following function generations n
numbers from a Gaussian (normal) distribution with a given mean and standard
deviation:
gaussian :: mean:Num -> sd:Num -> n:Int -> [Num]_{n}
Every time this function is called, it will return a different list of numbers. Most languages deal with random number generation by having a global pseudorandom number generator with a seed that is set using noise from the environment (or made from the pid and date). Every time a random nuber is generated, the global state of the generator is updated. In concurrent code, this can lead to two different kinds of non-determinism: 1) the random numbers are non-deterministic and 2) since the threads are running in parallel using a shared mutable generator, the numbers would be non-deterministic even if the original seeds were the same.
One alternative way to handle random numbers is to pass the random number generator and seed to the random funtion:
seededGaussian :: rng:(Num -> (() -> Num @mut)) -> seed:Num -> mean:Num -> sd:Num -> [Num]_{n}
If the function were actually pulling truly random data from outside (say random fluctuations in meteorological data), then the random function would have both an IO and non-determinism effect.
There are several approaches to handling effects in functional languages. Haskell programmers typically use a monadic approach. Another solution is the "algebraic effect" system demonstrated elegantly in Microsoft’s language Koka [1].
At the moment, I am leaning towards an algebraic effect system for morloc
.
Implementing this system, and ensuring it works elegantly across languages,
will be a major research area going forwards. Type checking effects in morloc
will be complicated by the fact that effects may vary between languages.
General algorithms that are implemented purely in Haskell may use mutation in
C++
. Or some implementations may add print statements (and thus io or console
effects) but this should not raise errors at compile time.
Here are a few examples of what type signatures might look like with added algrbraic effects:
lookup :: [(k,v)] -> k -> v @exp
sort :: [a] -> [a]
sort Haskell :: [a] -> [a]
sort Cpp :: [a] -> [a] @mut
accessDb :: conn -> key -> a @io
accessDb py :: conn -> key -> a @console
accessDb R :: conn -> key -> a
transpose :: [a]_{x,y} -> [a]_{y,x} @mut
gaussian :: mean:Num -> sd:Num -> n:Int -> [Num]_{n} @idet
lookup
searches a key/value list for a particular key. If the key does not
exist, it raises an exception and kills the program. An alternative
implementation could use an optional datatype, such as the Haskell Maybe monad.
So algebraic effects and monadic systems are not mutually exclusive.
sort
sorts a list. In the Haskell implementation, a new list is returned. In
the C++
implementation, the list is sorted in place and the input list is
lost. morloc
needs to determine whether the mutable sort can be used
(possibly rearranging order of evaluation to avoid conflict). I am not yet
certain if this problem is tactable or if allowing implementations to differ on
mutability is desirable; much research remains to be done.
accessDb
connects to some database and returns some value from it. This is
fundamentally an IO operation. So @io
is added to the general type. The
Python implementation adds a @console
effect since it writes comments to the
STDERR. The R implementation does not have the @console
effect since it,
presumably, captures and deletes any console info. Allowing @console
effects
to be added to implementations allows error messages to be added without
without changing the general type (which may have been imported from an
external library).
For transpose
mutation is not strictly necessary. It would be reasonable for
some implementations to be mutable and others immutable, so putting the @mut
tag at the general level is not a good idea. Instead, any effect at the general
level should be considered a necessary effect that is always passed to
implementations (i.e., implementations inherit effects). Any effect specified
at the implementation type level is an added effect.
gaussian
generates n
numbers from a Gaussian distribution using an
unspecified random number generator. Since each call to this function returns
different values, it has the @ndet
effect. seededGaussian
, however, is a
pure function, even though it contains mutability internally, since given the
seed and a random number generator, the function returns the same result every
time it is called and has no side effects.
I do not want to require everyone adds effects to all their signatures.
Instead, I want "gradual effects" where no effect annotations implies the
effect is not known. Absence of any effect is specified by adding @pure
. I
would also like to allow effect negation. This would mean that a function could
be annotated as, for example, not having an io
effect without saying anything
else specific about the other effect types.
Another challenge morloc
faces is how to determine that the stated effects
are accurately reflect the effects in the source code. Functions in morloc
may be imported from arbitrary sources and are thus black boxes. Judicious fuzz
testing may be able to automatically detect most effects. Static analysis may
also be applicable in some cases (perhaps all, in the far future).
I’ve only touched on the type annotations associated with algebraic effects,
not on how they are implemented in the compiler and specified in actual
morloc
code. Specifically, using algebraic effects requires specifying effect
handling functions. I will circle back to this topic in a future, for now this
post serves to introduce the problem and point towards a possible solution.
References
[1] D. Leijen, Koka: Programming with Row Polymorphic Effect Types. Electronic Proceedings in Theoretical Computer Science, 153 (2014) 100–126. https://doi.org/10.4204/eptcs.153.8.