The Onyx programming language
vladfaust (2)

State of System Programming

This section is an excerpt from my original article, System Programming in 2k20. You're welcome to read it after finishing this post on Repl.it. 🙂

Nowadays, application programming is seemingly ubiquitous. It is easy to spin up a web server with Ruby on Rails, write a low-poly game in Unity or create a messenger in Electron.

But programming is more than that. Ruby itself is written in C. Unity is presumably written in C++ and native C#. Electron is C++.

Dynamic language runtimes, game engines, GUI libraries, media applications such as graphics and music editors, neural networks, medical software, automotive software, operating system kernels... It all needs bare-metal performance. It is all written in system programming languages.

Despite the everlasting growth of no-code tools and higher-level FFIs like PyTorch, we still need to program at the lowest level. What languages do you think the tools and libraries you rely on are written in? System developers are still needed to create and maintain this low-level software.

Which language to choose for system programming?

For smaller projects, there is a plethora of languages to choose from, including C, Zig and many esoteric ones.

But when it comes to a big project with long-term maintainability requirements, the choice is narrow: it's either C++ or Rust.

There are also other challengers like Crystal, Nim and Julia, but most of them are not really about "system" programming. Read more on that in the original article.

Long story short, neither C++ nor Rust is ideal, so I propose another system programming language, named Onyx.

The Language

Onyx is a general-purpose statically typed programming language suitable both for application and system programming.

On the surface, Onyx has syntax inspired by Crystal, hence Ruby, and Rust. It is a C-family language.

Onyx is designed with the following principles in mind:

  • Safe but slower by default, while still providing tools for unsafe optimizations. It is still possible to shoot yourself in the foot, but doing so requires explicitly transferring safety responsibility from the compiler to you.

  • Have grammar defined by a finite set of rules with a small(er) number of exceptions. This leads to what we call "an intuitive language".

  • Unlike C (int x), have an identifier-centric type system (let x or let x : SBin32).

  • Infer as much as possible unless ambiguous.

  • Absolute platform agnosticism. The language itself does not make any assumptions about the operating system it runs on. It is only aware of the ISA and, optionally, ABI information. A standard library in the traditional sense is not part of the language, as the language knows nothing about memory, threading, interrupts etc.

Onyx is practically somewhere in the middle between Rust and C++.

C++ is extremely unsafe. 'Nuff said.

But Rust treats safety as an absolute, which brings in a ton of hard-to-understand abstractions. When programming in Rust, instead of keeping in mind all the undefined behaviour an implementation is capable of, as in C++, you have to keep in mind all these Mut, Box, Arc, unwrap(), as_ref(), borrow() etc. concepts.

Read more on why not C++ or Rust in the corresponding sections of the original article.

Pointers

One of the main consequences of Rust's design is that safely passing pointers around, which is essential in system programming, needs additional abstractions.

This problem is solved in Onyx by introducing pointer scopes. A pointer in Onyx is a simple raw pointer; nothing is added at run time. However, at compile time, an Onyx pointer is aware of the scope of the pointee.

For example, a pointer to a local variable cannot be safely returned from a function in Onyx, but can be safely passed to a callee, because the caller is guaranteed to outlive its callee:

# `foo` accepts an argument named `val`
# which has type `SBin32*cw`, a *w*riteable
# pointer to `SBin32` with *c*aller scope.
def foo(val : SBin32*cw)
  # Change the `val` by-reference
  *val = 42
end

def main
  let x = 0

  # Taking address of `x` returns
  # a local-scoped pointer, but
  # it can be safely passed as
  # a caller-scoped argument
  foo(&x)

  @assert(x == 42)
end
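
For readers coming from C, here is a rough analogue of the example above. It is only a sketch to show the two directions a local pointer can travel: C has no pointer scopes, so the compiler accepts both, and the names pass_down and leak_up are made up for this illustration.

```c
#include <assert.h>

/* C analogue of the Onyx `foo`: writes through a pointer
   to storage owned by the caller. */
void foo(int *val) {
    *val = 42;
}

/* Passing the address of a local down the call stack is fine:
   `x` is still alive for the whole duration of the call to `foo`. */
int pass_down(void) {
    int x = 0;
    foo(&x);
    assert(x == 42);
    return x;
}

/* The opposite direction is the mistake that pointer scopes are
   meant to reject: returning `&x` hands out a pointer to a stack
   slot that no longer exists. A C compiler accepts this (at best
   with a warning), so it is left commented out here.

int *leak_up(void) {
    int x = 0;
    return &x;
}
*/
```

Calling pass_down() returns 42, mirroring the @assert in the Onyx version. The difference is that in C nothing but discipline keeps leak_up out of the code base.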

Taking the address of a static variable (i.e. one defined in the global scope) returns a statically-scoped pointer.

Obtaining a pointer from the C world returns a pointer with undefined scope, because we have no information about it. You can safely pass a pointer with undefined scope around, but once the time comes to read from or write to such a pointer, the operation is unsafe.

You can read more about scopes in a WIP language reference found at https://fancysoft.xyz/onyx-ref/#_scopes.

Interoperability

Another important concern of system programming is the ability to reuse existing C code or compile the program into a reusable library.

Well, in Onyx all functions exist in the Onyx context only. There is deliberately no ABI planned for Onyx functions.

Instead, to be able to interact with the "outer world", an Onyx function must be explicitly exported. The def main example shown above has little practical use: it defines no actual entry function.

A canonical "Hello, world!" example in Onyx looks like this:

import "stdio.h"

export void main() {
  unsafe! $puts(&"Hello, world!\0")
}

It is then a system linker's responsibility to declare main as the entry function of the program, and maybe expect a return value other than void.

Note that what follows export is practically a C function prototype, up to the opening {. What's inside the exported function is Onyx code. For example, it could be export int main(int argc, char** argv) {.

We had to wrap the call into the unsafe! statement, as any C call is unsafe.
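
For comparison, this is roughly what the body of that exported main asks libc to do, written as plain C. It is a sketch, not anything an Onyx compiler is known to emit, and say_hello is a made-up name. Note that the Onyx example spells out the terminating \0 explicitly, whereas a C string literal carries it implicitly.

```c
#include <stdio.h>

/* Hypothetical helper for this sketch: the C-side equivalent
   of `unsafe! $puts(&"Hello, world!\0")`. */
int say_hello(void) {
    /* A C string literal already ends with a NUL byte,
       so no explicit "\0" is needed here. */
    return puts("Hello, world!");
}
```

puts returns a non-negative value on success and EOF on failure, which is why the helper forwards its result instead of discarding it.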

Interestingly, to avoid Rust's mistake of "yelling" macro names, the bangs are only used in explicit safety blocks, such as threadsafe!, fragile! and unsafe!. This also makes it easier to grep the code, as it distinguishes safety modifiers in declarations (e.g. unsafe def foo) from explicit safety statements (e.g. unsafe! foo()).

Yes, there are three levels of safety, because you also need guarantees about data races.

Lastly, you may have noticed the import semantics. Onyx is able to import C entities from headers, and they can be easily referenced from Onyx code by prepending $ to an identifier.

Even C preprocessor macros can be used in Onyx code as long as they evaluate to valid C constants or literal initializers! For example, let x = ${C_MACRO} could evaluate to let x = 42.

C functions, structs, unions, enums, typedefs and macros can be exported, imported and declared as external in Onyx.

You can read more about interoperability in https://fancysoft.xyz/onyx-ref/#_interoperability and https://nxsf.org/onyx/#_interoperability.

Macros

The third pillar of Onyx is macros written in Lua. Yes, plain Lua, with some extra functionality that allows accessing and modifying the Onyx AST and emitting new tokens into the source code.

As there is a full Lua environment, you may make use of existing Lua libraries using your favourite package manager, such as LuaRocks. And you can even debug your compilation with Lua debugging facilities. Yes, I'm talking about putting a breakpoint into the compilation process!

A couple of macro examples:

import "stdio.h"

export void main() {
  {% for i = 0, 2 do %}
    unsafe! $puts(&"i = {{ i }}\0")
  {% end %}
}

Since a Lua numeric for loop includes both of its bounds, the code would expand exactly to:

import "stdio.h"

export void main() {
  unsafe! $puts(&"i = 0\0")
  unsafe! $puts(&"i = 1\0")
  unsafe! $puts(&"i = 2\0")
}

Delegating computations to compilation time:

{%
  -- Local context is preserved
  -- during this file compilation
  local function fib(n)
    local function inner(m)
      if m < 2 then
        return m
      end

      return inner(m - 1) + inner(m - 2)
    end

    return inner(n)
  end
%}

# This is a macro "function", which
# may be used directly from Onyx code.
macro @fib(n)
  {{ fib(n) }}
end

import "stdio.h"

export void main() {
  unsafe! $printf(&"%d\n", @fib(10))
}

This code would evaluate simply to $printf(&"%d\n", 55).
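
As a sanity check of the value 55, here is the same naive recurrence in plain C. It is only a sketch for verifying the arithmetic: unlike the macro, it runs at run time rather than during compilation.

```c
/* Same recursion as the Lua `inner` helper above:
   fib(0) = 0, fib(1) = 1, fib(n) = fib(n - 1) + fib(n - 2). */
unsigned fib(unsigned n) {
    if (n < 2) {
        return n;
    }
    return fib(n - 1) + fib(n - 2);
}
```

fib(10) walks the sequence 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55 and returns 55, the literal that the macro leaves in the expanded source.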

Macro possibilities are virtually endless!

You can find more macro examples in https://vladfaust.com/posts/2020-08-20-the-onyx-programming-language/#macros.

Other Features

There is a ton of other features, big and small: traits as composable units of behaviour, object lifetimes with much simpler semantics than in Rust and C++, an incredibly versatile generics system, complex types implying a pair of real and imaginary types, aliases, distinct aliases, modern types such as SIMD vectors and matrices, and many more.

I do not want to repeat myself, so please read the full Onyx introduction article: https://vladfaust.com/posts/2020-08-20-the-onyx-programming-language.

The Foundation

If I want to see Onyx prospering, I have to think about its foundation in advance.

It is common for a seemingly promising language to vanish because of the famous chicken-and-egg problem: nobody wants to use a language without a decent number of libraries, but nobody wants to write libraries for a language which is not used by anybody.

This problem is covered in the corresponding section of the System Programming article, and the solution may be a "source-on-demand" model akin to Spotify. It seems like a good idea to reward authors of popular libraries.

But to reward those authors, some kind of fund is needed. That is where the Onyx Software Foundation kicks in.

NXSF is to be an official non-profit organization accepting tax-exempt donations to sponsor development of the Onyx programming language itself and its ecosystem, including the funding of popular packages. Seems win-win to me.

Oh yes, the Foundation would mainly care about developing official standards through RFCs and votes. It would also selectively fund implementations, but the main effort goes into the standards. Because standardization matters more than implementation, change my mind.

For example, if I were a custom SoC developer, I'd love to have some kind of open standard to quickly implement an Onyx compiler for my device. And it is generally good to have multiple competing, but compatible, implementations.

Again, the problem of standardization is covered in more depth in the corresponding section of the System Programming article.

Currently, there are six planned standards governed by NXSF.

Read more about the Onyx Software Foundation at https://vladfaust.com/posts/2020-08-20-the-onyx-programming-language/#the-onyx-software-foundation.

The Jam

Honestly, upon entering the jam I thought it would be possible to create a simple Onyx compiler able to compile the simplest "Hello, World!" example. I even called on a friend of mine, a researcher working on GPUs in a physics institute, to help me with the code.

Well, at that time I did not have any standard or reference drafts published and did not have any writings on what Onyx is and why it exists. And it was inevitable that I would need these writings sooner or later, because people need some introduction to the language, as well as an answer to the "why another language?" question.

So I spent my time polishing the drafts before publication and writing those articles because without them, it'd be hard for judges to assess the full potential of the language.

As time flew by, I realized that I did not want to write an ad-hoc implementation, while an implementation done right requires more time, because the Onyx grammar is quite complex to implement. For instance, the grammar implies a lot of basic, abstract AST blocks chained recursively. It is not as easy as simply iterating through tokens with some dirty ifs. Also, as the implementation relies on LLVM, and existing non-C++ wrappers are limited, I had to use C++, which is not the best language for writing a high-level compiler in 20 days.

That said, I had to give up on the idea of quickly implementing an ad-hoc compiler. However, I did make some progress on it during the jam, which you can see at https://github.com/fancysofthq/fnxc/tree/development.

Along with that, I've managed to publish:

Therefore, during the jam, I effectively pushed the overall development of the language forward.

A small nitpick: the jam rules emphasize "[...] designs and prototypes a new language", which does not necessarily mean an implementation; and I have not publicly revealed the language before the jam, so technically I did meet the requirements. 🤔

Conclusion

These were rough days of hard work, but I'm glad I've finally told the world I'm working on something big.

As I can see in the adjacent jam entries, it's a tradition to promise to continue working on the language. Well, there is already a 160-page language specification, and I would be an idiot to abandon the chance to really change the world. I plan to continue working on the implementation, and you're welcome to join!

You can also find the NXSF roadmap on its website.

Even if I don't win anything, the real results are already there. So, thank you, Repl.it, for the opportunity!

P.S: Dear judges, please read the full introduction article found at https://vladfaust.com/posts/2020-08-20-the-onyx-programming-language for a better picture of the language.

sugarfi (581)

@vladfaust np! great language btw!