Language Features
Polyglot: import functions from any supported language and export as a polyglot module
    source Cpp from "core.hpp" ("morloc_map" as map)
    source py from "core.py" ("morloc_map" as map)
    source R from "core.R" ("morloc_map" as map)

    export map
Typed: assign type signatures to describe the general data form and the language-specific data structures in memory
    map :: (a -> b) -> [a] -> [b]
    map Cpp :: (a -> b) -> "vector<$1>" a -> "vector<$1>" b
    map py :: (a -> b) -> "list" a -> "list" b
    map R :: (a -> b) -> "list" a -> "list" b
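As an illustration, the Python function that core.py exposes as morloc_map might look like the sketch below. The body is an assumption: the source only states that core.py provides a function with this name. The sketch shows one implementation consistent with the general signature (a -> b) -> [a] -> [b] and the Python-specific "list" type.

```python
from typing import Callable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def morloc_map(f: Callable[[A], B], xs: List[A]) -> List[B]:
    """Apply f to every element of xs and return the results as a new list.

    Hypothetical body: the original names morloc_map but does not show it.
    """
    return [f(x) for x in xs]
```

Returning a concrete list, rather than a lazy iterator, keeps the result in line with the "list" type declared for py.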
Functional: build complexity by composing functions without worrying about their particular implementation
    import core (map)
    import math (mean, sqrt, mul)

    square x = mul x x
    rms xs = sqrt (mean (map square xs))

    export rms
Generative: let the compiler handle language-interoperability and the generation of user interfaces
    $ morloc make -o rms rms.loc
    $ ./rms -h
    The following commands are exported:
      rms
        param 1: [Num]
        return: Num
    $ ./rms rms "[1,2,3]"
    2.1602468994692869
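To make the composition concrete, here is a rough single-language equivalent of the rms definition, written in plain Python. The helpers mul and mean are stand-ins for the functions imported from the math module in the morloc script; their bodies are assumptions, not the library's actual code.

```python
import math


def mul(x, y):
    """Stand-in for the imported mul (assumed to be plain multiplication)."""
    return x * y


def mean(xs):
    """Stand-in for the imported mean (assumed to be the arithmetic mean)."""
    return sum(xs) / len(xs)


def square(x):
    # square x = mul x x
    return mul(x, x)


def rms(xs):
    # rms xs = sqrt (mean (map square xs))
    return math.sqrt(mean([square(x) for x in xs]))
```

For the input [1, 2, 3] this computes sqrt((1 + 4 + 9) / 3), which is approximately 2.16025 and agrees with the command line output shown above.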
Use Case
Alice is a programmer developing a new protein alignment algorithm. She implements her algorithm in her favorite language, C++.
    #include <vector>
    #include <string>
    using namespace std;

    ... // any required helper functions

    vector<string> align(vector<string> seq){
        ...
    }
This function takes a list of strings and returns a new list of strings padded with gaps such that they are the same length but with maximized columnwise similarity.
    CATTACA
    GATACA
    GAAACA
    CATTACA
    GAT-ACA
    GAA-ACA
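The description above pins down two checkable properties of the output: all rows have the same length, and removing the gaps from each row gives back the corresponding input. A minimal Python sketch of such a check follows. The name is_valid_alignment is hypothetical, and the check does not attempt to judge whether columnwise similarity is maximized.

```python
from typing import List

GAP = "-"


def is_valid_alignment(original: List[str], aligned: List[str]) -> bool:
    """Check the structural properties of an alignment (hypothetical helper)."""
    # One aligned row per input sequence.
    if len(original) != len(aligned):
        return False
    # Every aligned row must have the same length.
    if len({len(row) for row in aligned}) > 1:
        return False
    # Stripping gaps from a row must recover the original sequence.
    return all(row.replace(GAP, "") == seq for seq, row in zip(original, aligned))
```

Applied to the example, is_valid_alignment(["CATTACA", "GATACA", "GAAACA"], ["CATTACA", "GAT-ACA", "GAA-ACA"]) returns True.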
She shares her function with her team and they are enthusiastic about the algorithm but can't use the raw C++ function. Some of them want a Python module. Others want a command line tool they can integrate into their pipelines. To make matters worse, everyone wants the tool to read input and write output in multiple formats and handle error validation across all of them:
    >MG123456|Human|2020-04-01
    CATTACA
    >MG34343|Pig|2021-06-11
    GATACA
    >MG34346|Pig|2021-04-24
    GAAACA
    >MG123456|Human|2020-04-01
    CATTACA
    >MG34343|Pig|2021-06-11
    GAT-ACA
    >MG34346|Pig|2021-04-24
    GAA-ACA
    CLUSTAL W (1.82) multiple sequence alignment

    MG123456    CATTACA 7
    MG34343     GAT-ACA 7
    MG34346     GAA-ACA 7
                **. ***
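To make her users happy, Alice would have to wrap her function in a Python module, build a command line tool around it, and write readers, writers, and validation for each of these formats.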
So Alice is unhappy. What started as an interesting algorithm implementation turned into a huge software project that she (or some unfortunate future programmer) will have to maintain for years.
Morloc offers an alternative solution that preserves the original algorithm's simplicity while granting Alice's users all the extra functionality they desired (and more, as we shall see). All Alice needs to do is tell the morloc compiler how to use her function through a pair of type signatures:
    module align

    source Cpp from "align.h" ("align")

    export align

    -- the general type, list of strings to list of strings
    align :: [String] -> [String]

    -- the language-specific type, C++ vector of strings
    align Cpp :: "vector<$1>" "string" -> "vector<$1>" "string"
After uploading this morloc script and the original C++ source code to GitHub, Alice's role is finished.
Now Alice's team can customize and build the tools they want within the morloc ecosystem through function composition.
They can find sequence readers/writers in the morloc bio module
The general readFasta function may be implemented in many languages, such as R, Python, and C++. It is the role of the compiler to choose the most appropriate one and generate any code needed to make it interoperable with our align function.
For example, readFasta reads sequence data as a list of header/sequence pairs, and writeFasta writes such a list back to a file:
    -- read fasta file as list of header/sequence pairs
    readFasta :: File -> [(String, String)]

    -- write annotated sequence file and return filename
    writeFasta :: File -> [(String, String)] -> File
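Since the compiler may pick a Python implementation for these functions, a minimal sketch of what one could look like is given below. The names read_fasta and write_fasta and their bodies are assumptions for illustration; they are not taken from the bio module. The sketch follows the two signatures: reading returns a list of (header, sequence) pairs, and writing returns the filename.

```python
from typing import List, Tuple

Record = Tuple[str, str]  # (header, sequence)


def read_fasta(path: str) -> List[Record]:
    """Read a FASTA file into a list of (header, sequence) pairs."""
    records: List[Record] = []
    header = None
    chunks: List[str] = []
    with open(path) as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if header is not None:
                    records.append((header, "".join(chunks)))
                header = line[1:]
                chunks = []
            else:
                if header is None:
                    raise ValueError("sequence data found before the first header")
                chunks.append(line)  # sequences may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def write_fasta(path: str, records: List[Record]) -> str:
    """Write (header, sequence) pairs as FASTA and return the filename."""
    with open(path, "w") as handle:
        for header, sequence in records:
            handle.write(f">{header}\n{sequence}\n")
    return path
```

Returning the path from write_fasta mirrors the File result in the writeFasta signature, which lets the call sit at the end of a composed pipeline.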
Generic functions for working with lists and pairs can be found in the morloc base module
    -- unzip one list of pairs into two lists
    unzipPair :: [(a,b)] -> ([a],[b])

    -- call a function on the second element in a pair
    withsnd :: (b->c) -> (a,b) -> (a,c)

    -- merge two lists into one list of pairs
    zipPair :: ([a],[b]) -> [(a,b)]
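The three signatures above are small enough to sketch directly. The Python versions below are illustrative assumptions (the snake_case names and the bodies are not taken from the base module); they only aim to match the stated types.

```python
from typing import Callable, List, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def unzip_pair(pairs: List[Tuple[A, B]]) -> Tuple[List[A], List[B]]:
    """unzipPair :: [(a,b)] -> ([a],[b])"""
    return ([a for a, _ in pairs], [b for _, b in pairs])


def with_snd(f: Callable[[B], C], pair: Tuple[A, B]) -> Tuple[A, C]:
    """withsnd :: (b->c) -> (a,b) -> (a,c)"""
    first, second = pair
    return (first, f(second))


def zip_pair(lists: Tuple[List[A], List[B]]) -> List[Tuple[A, B]]:
    """zipPair :: ([a],[b]) -> [(a,b)]"""
    firsts, seconds = lists
    # Note: zip stops at the shorter list if the lengths differ.
    return list(zip(firsts, seconds))
```

In this sketch, zip_pair(unzip_pair(pairs)) returns a list equal to pairs, which is the property the alignment pipeline relies on to keep headers attached to their sequences.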
Alice's team can then specify the tool they want in a morloc script
    import align (align)
    import bio (readFasta, writeFasta)
    import base (zipPair, withsnd, unzipPair)

    export alignFasta

    -- align annotated sequences without altering annotations
    alignG seq = zipPair (withsnd align (unzipPair seq))

    -- map an arbitrary function over (header, sequence) pairs
    withFasta f i o = writeFasta o (f (readFasta i))

    -- align a fasta file
    alignFasta i o = withFasta alignG i o
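To show how the alignG composition threads the headers past the aligner, here is a self-contained Python sketch. The helper functions are restated compactly so the block runs on its own, and pad_right is a deliberately trivial placeholder that only right-pads with gaps. It is not Alice's algorithm, whose body the source leaves elided.

```python
def unzip_pair(pairs):
    # [(a,b)] -> ([a],[b])
    return ([a for a, _ in pairs], [b for _, b in pairs])


def with_snd(f, pair):
    # (b->c) -> (a,b) -> (a,c)
    first, second = pair
    return (first, f(second))


def zip_pair(lists):
    # ([a],[b]) -> [(a,b)]
    firsts, seconds = lists
    return list(zip(firsts, seconds))


def pad_right(seqs):
    """Placeholder aligner (assumption): pad every sequence to equal length."""
    width = max((len(s) for s in seqs), default=0)
    return [s.ljust(width, "-") for s in seqs]


def align_g(records, align=pad_right):
    # alignG seq = zipPair (withsnd align (unzipPair seq))
    return zip_pair(with_snd(align, unzip_pair(records)))
```

The aligner only ever sees the list of sequences; the headers travel alongside as the first element of the pair and are reattached in order by zip_pair.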
The program can be built in the morloc terminal

    $ morloc install yourname/align             # GitHub repo username/repo
    $ morloc install base                       # from morloclib on GitHub
    $ morloc install bio                        # from morloclib on GitHub
    $ morloc make -o aligner aligner.loc        # make the CLI tool
    $ ./aligner -h                              # view usage info
    $ ./aligner alignFasta '"myseq.fasta"'      # run the program

The above code builds a command line tool. Compiling instead to a basic web app or language-specific module will soon be supported. Adding these features would only require changing the backend code generators.