Basil
elucent (41)

The Basil Programming Language

Hello! Basil is our (@basilTeam: @elucent, @minchingtonak) entry to the Repl.it language jam.

Unique Compilation

The goal of Basil is to try to have it all - high-level language features like macros, dynamic code evaluation, minimal type annotations - without any of the runtime costs normally associated with high-level languages - large runtime systems, slow performance, high memory use.

To accomplish this, Basil makes use of a novel system to compile its code. The main innovation is that the Basil compiler is actually an interpreter, able to interpret any Basil expression. When it gets to an expression it wants to save for runtime, it wraps it up and saves it to the end of the program. These runtime values propagate and result in an expression tree of all code reachable at runtime - notably without unnecessary operations between constants or dead code. Basil also statically type-checks this tree, so none of the code it ultimately compiles incurs a dynamic typing or boxing overhead.

Expressive Freedom

The primary goal of Basil as a language, however, isn't just speed. Basil exists to help programmers express complex problems in natural ways. While keeping Basil totally compiled and statically-typed, we've managed to implement the following features to enable uniquely customizable syntax:

  1. Simple, homoiconic code! Basil is a Lisp dialect at its core, and has extremely simple base syntax with no reserved words and few special symbols.

  2. Extremely simple macros! Only one primitive for code interpolation/evaluation, as opposed to the three or so usually present in Lisp dialects.

  3. User-defined infix functions and macros! Can have any arity (unary, binary, or more!) and custom precedence.

  4. Mixfix function syntax! Functions can be defined with keywords in their argument lists, to help break up high-arity function calls into natural-language phrases.

Implementation Details

The Basil compiler was implemented entirely within the language jam, almost entirely from scratch in C++. Only two libraries were used (both created before the jam):

  1. Jasmine, a runtime x86_64 assembler and object format.

  2. Collections, an alternative C++ standard library.

Everything else is totally original, created since August 10th.

The Basil compiler has no runtime dependencies other than libc, and the entire compiler and runtime fit comfortably in under 400 kB.

The code is first tree-walk interpreted by the compiler, using dynamically-typed Basil values to represent code. This is then lowered for runtime values to typed nodes, which are then lowered to a simple IR and then to x86_64 machine code. This code is then linked, loaded and executed just-in-time by the Basil environment.

Try It Out!

The best way to give Basil a try is to work through the interactive tour! It walks you through every core Basil feature, from the basic syntax to infix macros.

You can also try running the Basil REPL. You can reach it by exiting the tour on the project page, and then running ./basil in command line. The REPL mode features a number of commands to view compiler intermediates or generated assembly, so if you're curious how that all fits together give it a shot!

If you're curious how some part of the project works, or are looking for a list of all the language's features, take a look at our README for a somewhat informal language specification.

Support the Project

If you liked the project, consider giving us a star on GitHub, or following the devs (@elucentdev, @minchyAK) on Twitter!

If you'd like to get in touch (with a question, suggestion, etc), the best way to reach us is on Twitter or Discord (our handles are elucent#1839 and minchy#2474).

Thanks so much for taking a look at our project! It was a lot of fun making it, and we're going to be continuing work on it past the jam - so stay tuned! :)

You are viewing a single comment. View All
xxpertHacker (545)

@fuzzyastrocat

Obviously, this would incur some memory overhead, as well as a slight performance overhead since every time a reference to the value was made it would have to be added to the value's reference list and it would have to be removed when that reference goes out of scope.

Sounds like reference counting, can you confirm? Or are you suggesting something that is moderately different?
If so, deferred reference counting, at a language-level, could help out quite a lot here, falling back to normal reference counting when it can't be used.

The difficulty I see is overlap — what happens when f's result is passed to g? You'd have to have two different "kinds" of g: one that accepts something from the stack and one that accepts something from the heap.

Yes! This is precisely what I had referred to.

(Can functions be called with these different memory management methods and still work?)

If function f and g both accept an object, or value that needs to fit into memory, as opposed to a single integer/float, the function could be given an object that was "moved" and is stack-allocated, or it could be called with an object that is stored in "normal" memory. In reality, the instructions that would operate on the data are inherently different, meaning a branch must be somewhere.
Maybe it's like C++'s overloading or generic code, where a separate function would be statically linked and called in place of what the programmer thought was only one function.
Or we could have a runtime branch, but that would introduce runtime overhead.

How would one deal with functions then?

T f (reference) {
    ...
}

T g (reference) {
    ...
}

T h (reference) {
    let functionArray = [ f, g ];
    functionArray[ index ]( reference );
}

Stuff like that (not uncommon in my own code), would not be fun to statically analyze. One could insert extra code to intercept which function was called, a sort of "boxed" function, almost allowing dynamic types to be passed in, and then calling and returning the proper function (ideally while reusing the stack frame for extra optimization).

There are infinite ways one might solve this problem, I wonder how Basil will handle it.

The static analysis of code paths is another thing I've been thinking about. It gets difficult since memory can be used in different ways based on runtime situations:

Oh, of course, static analysis can always be complicated, but the key idea is that when it can help, put the information gained to use, which would improve what programs that it can, while leaving the rest unchanged. If it's too complicated, the compiler can bail out on the optimization and leave it to whatever it currently would do, if it had not applied the optimization at all.