Making a compiled programming language
MocaCDeveloper (321)

Hi. I am currently starting a new programming language called Bite. It is aimed at providing advanced features for working with memory, as well as advanced features for low-level development that make it not only easier to do but also more readable and flexible.

Bite is interpreted as of right now, but that is actually a really BIG issue because Bite is going to depend on being fast.
And this is where the issue emerges. A compiled language tends to run faster than interpreted languages and this is why I NEED to make Bite compiled.

C'mon now, let's be serious: Would it make sense to make a low-level language aimed to make low-level development easier if it is interpreted? NO!

So, here's the question

Does anyone know of any documentation (or articles) that goes step by step through (or just explains) how to create a compiled programming language?

I have researched it multiple times before but maybe I am not digging deep enough and I often run into time issues/time stumps.

So, will anyone be willing to help me?? Please and thank you, it will be much appreciated.

P.S.

I know that a compiler gathers all of the source code, tokenizes it, and builds a syntax tree off of the tokens. I just always find myself needing a runtime, and this is where I get a bit confused as to where the compiled side of things comes in.

Any help would be great!

THANKS!!

Answered by Viper2211 (82) [earned 5 cycles]
Viper2211 (82)

Go to https://craftinginterpreters.com/ . The second article on compiling was really helpful for me, so I would definitely recommend checking it out! It's written in C, so I think that'll help you a lot!

MocaCDeveloper (321)

@Viper2211 Thank you a lot! I will make sure to take a look at it!!

xxpertHacker (390)

Compilation is just the generation of executable machine instructions. Whether it be JIT or AOT, executable code is generated nonetheless.

Some would consider outputting Asm to be compiling too, as there is a (roughly) one-to-one correlation between a binary format and its Asm textual representation. Although Asm itself isn't executable until it has been assembled.
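To make that correspondence concrete, here is a minimal sketch that hand-encodes two x86 instructions. The helper names `encode_mov_eax` and `encode_ret` are made up for this example, and only these two instructions are covered. (x86 allows more than one encoding for some instructions, so the mapping is only roughly one-to-one.)

```python
import struct

def encode_mov_eax(value):
    """Encode `movl $value, %eax`: opcode byte 0xB8, then a 32-bit little-endian immediate."""
    return b"\xb8" + struct.pack("<I", value & 0xFFFFFFFF)

def encode_ret():
    """Encode `ret`: the single opcode byte 0xC3."""
    return b"\xc3"

# Each line of assembly text corresponds to one byte sequence.
listing = [
    ("movl $42, %eax", encode_mov_eax(42)),
    ("ret", encode_ret()),
]

# The machine code is just those byte sequences back to back.
machine_code = b"".join(encoded for _, encoded in listing)

for text, encoded in listing:
    print(f"{encoded.hex(' '):<16} {text}")
```

These bytes are not a runnable program on their own; an executable file format (ELF, PE, Mach-O) still has to wrap them, which is part of what an assembler and linker do for you.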

You could learn binary / Asm and directly output machine instructions into a file, but that would be foolish.

I recommend looking into compilation libraries like LLVM, QBE, Cranelift, etc.

(for the time being, I personally am going to output direct, unoptimized binary for my language from the recent PL jam, but I'll probably switch to a library, or make my own optimization system and program synthesizer :) )

fuzzyastrocat (767)

@xxpertHacker

"that would be foolish."

I would have to disagree... I learned so much from making a compiler to AT&T assembly. I think it's a very viable option for a little personal lang.

However, you might run into trouble making it cross-platform. If you get the lang to the point of being something that others will use (which probably won't happen for quite some time), then yes, it might make more sense to switch to something like LLVM.

xxpertHacker (390)

@fuzzyastrocat For the long term, it is definitely foolish. There is no way you can compete with the libraries out there. Now there is definitely a lot to learn from Asm languages, sure. But eventually, your job will be much harder trying to generate Asm / binary for everything, catching edge cases, optimization, etc.

fuzzyastrocat (767)

@xxpertHacker Right. I'm just saying that outputting x86 can be beneficial for a first-time compiled language since it takes much less setup than a big library like LLVM and really gives you a low-level view into what tools like LLVM actually end up doing.
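As a rough idea of how little setup that route needs, here is a sketch of a code generator that turns a tiny expression tree into AT&T-syntax x86-64 assembly. The tuple-based tree format and the names `gen_expr` and `compile_program` are assumptions made for this example (they are not from Bite or from any guide), and it assumes x86-64 Linux, where the entry symbol is `main` and its return value travels in `%rax`.

```python
def gen_expr(node, out):
    """Append instructions that leave the value of `node` in %rax."""
    kind = node[0]
    if kind == "num":
        out.append(f"    movq ${node[1]}, %rax")
        return
    gen_expr(node[1], out)         # left operand ends up in %rax
    out.append("    pushq %rax")   # save it while the right side is computed
    gen_expr(node[2], out)         # right operand ends up in %rax
    out.append("    popq %rcx")    # left operand comes back in %rcx
    if kind == "add":
        out.append("    addq %rcx, %rax")    # rax = right + left
    elif kind == "mul":
        out.append("    imulq %rcx, %rax")   # rax = right * left
    elif kind == "sub":
        out.append("    subq %rax, %rcx")    # rcx = left - right
        out.append("    movq %rcx, %rax")    # move the result into %rax
    else:
        raise ValueError(f"unknown node kind: {kind}")

def compile_program(ast):
    """Wrap the expression in a `main` that returns its value."""
    out = ["    .globl main", "main:"]
    gen_expr(ast, out)
    out.append("    ret")
    return "\n".join(out) + "\n"

# 2 + 3 * 4 as a hand-built tree
ast = ("add", ("num", 2), ("mul", ("num", 3), ("num", 4)))
print(compile_program(ast))
```

If you save the output as `out.s`, something like `gcc out.s -o out` followed by `./out; echo $?` should assemble and run it on x86-64 Linux, with the result of the expression coming back as the exit status. No runtime is involved: the generated file is the whole program.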

xxpertHacker (390)

@fuzzyastrocat I'm following my own advice here; since I already know a low-level language, for the time being, I don't need something as powerful as LLVM, yet.

But if you don't know an Asm language already, it might just be better to start with LLVM or a higher-level IR format instead of Asm / raw binary instructions. But I'm not forcing either way upon others.

fuzzyastrocat (767)

@xxpertHacker Yeah, I think both ways can have lots of benefits. Either way will be very useful, I just encourage taking the x86 route but LLVM is fine too.

fuzzyastrocat (767)

I know that the accepted answer is https://craftinginterpreters.com, but I actually find Nora Sandler's guide https://norasandler.com/2017/11/29/Write-a-Compiler.html very helpful. It sadly isn't complete, but it does a great job of keeping things simple and balancing between showing you stuff and having you do stuff. And, even though Sandler's guide is incomplete, by the end of it I had the knowledge I needed to continue building my compiler. I'm not trying to say that craftinginterpreters is bad, I just think that it might be a lot to take in for a first time building a compiled language.

However, the opposite could be true: Sandler's guide compiles directly to machine code (by way of x86 assembly), a "true" compiler, while Crafting Interpreters compiles to bytecode for a virtual machine. The virtual machine approach will naturally be slower, but might be easier to translate to. Finding documentation for x86 can be tricky sometimes, but if you make the VM yourself you'll know how it works.

In the end, it depends on what you want. Since your language seems to be so low-level, I would highly suggest compiling to machine code. But, either option will work.
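For comparison, here is a minimal sketch of the virtual machine approach: the tree is compiled to a flat list of made-up bytecode instructions, and a small loop interprets them. The opcode names, the tuple-based tree format, and the functions `compile_to_bytecode` and `run` are all assumptions for this example, not how Crafting Interpreters lays things out.

```python
# Hypothetical opcodes for a tiny stack machine.
PUSH, ADD, SUB, MUL = range(4)

BINARY_OPS = {"add": ADD, "sub": SUB, "mul": MUL}

def compile_to_bytecode(node, code=None):
    """Flatten an expression tree into stack-machine bytecode."""
    if code is None:
        code = []
    if node[0] == "num":
        code += [PUSH, node[1]]          # PUSH is followed by its operand
    else:
        compile_to_bytecode(node[1], code)
        compile_to_bytecode(node[2], code)
        code.append(BINARY_OPS[node[0]])
    return code

def run(code):
    """The runtime: a loop that decodes and executes one instruction at a time."""
    stack = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == PUSH:
            stack.append(code[pc])
            pc += 1
            continue
        right = stack.pop()
        left = stack.pop()
        if op == ADD:
            stack.append(left + right)
        elif op == SUB:
            stack.append(left - right)
        elif op == MUL:
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()

ast = ("add", ("num", 2), ("mul", ("num", 3), ("num", 4)))
bytecode = compile_to_bytecode(ast)
print(bytecode, "->", run(bytecode))
```

The `run` loop is the runtime that has to ship with every program, and it is where the extra cost comes from. When compiling to machine code, the CPU itself plays that role.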

MocaCDeveloper (321)

@fuzzyastrocat So, you suggest that the link/documentation/article you found would be easier and will still give me what I want: A compiled programming language?

fuzzyastrocat (767)

@MocaCDeveloper Yes. In fact, I would not consider what craftinginterpreters.com builds to be a compiled language. It converts the user code to bytecode for an interpreted virtual machine. I'm not saying it's a bad site, it's a really great tutorial, but it's just not a compiled language per se (after all, it's called "crafting interpreters"). If you truly want a compiled language, you'll want to generate machine code, which craftinginterpreters does not teach you how to do.

MocaCDeveloper (321)

@fuzzyastrocat The link you have given me teaches how to create a compiler for C...I am guessing I am just going to have to carry that knowledge into making my own compiled language?

fuzzyastrocat (767)

@MocaCDeveloper Right. It's just like how craftinginterpreters.com shows you how to create a virtual machine interpreter for Lox.

Coder100 (8721)

you should try making an interpreted language first, or transpile the language into like C++ and let C++ compile it for you

MocaCDeveloper (321)

@Coder100 Do you think that the language would be able to be a success if it is interpreted when it is aimed to be low-level and fast?

Coder100 (8721)

but it will be easier to make @MocaCDeveloper

MocaCDeveloper (321)

@Coder100 well that is an issue, I want it to be fast.

Would it make sense though? To have an interpreted low-level language?

Coder100 (8721)

@MocaCDeveloper well yeah, so you can make it transpiled or use something like llvm (or your own vm!)

MocaCDeveloper (321)

@Coder100 So, interpret it in C. Set up the language's "usefulness" in C, then transpile to another language, most likely C++, and then use LLVM?

MocaCDeveloper (321)

@Coder100 Also I am available to code now! Sorry I was in a class. I have classes 8am through 3:30pm

Coder100 (8721)

actually those are all 3 different things. Whichever one you pick, the steps are:
1. Lexer
2. Parser
3. Execution. How you want to execute it is up to you
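The first two steps can be sketched in a few lines. This is only an illustration for arithmetic expressions: the token and tree formats, and the names `tokenize` and `parse`, are assumptions made for the example rather than anything Bite has to use.

```python
import re

def tokenize(src):
    """Lexer: turn source text into a list of (kind, value) tokens."""
    tokens = []
    for text in re.findall(r"\d+|\S", src):   # numbers, or any single non-space char
        if text.isdigit():
            tokens.append(("num", int(text)))
        elif text in "+-*()":
            tokens.append(("op", text))
        else:
            raise SyntaxError(f"unexpected character: {text!r}")
    return tokens

def parse(tokens):
    """Parser: recursive descent, producing a tree of nested tuples."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("eof", None)

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def primary():
        tok = advance()
        if tok[0] == "num":
            return tok
        if tok == ("op", "("):
            node = expr()
            if advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token: {tok}")

    def term():                      # handles '*', which binds tighter than '+' and '-'
        node = primary()
        while peek() == ("op", "*"):
            advance()
            node = ("mul", node, primary())
        return node

    def expr():                      # handles '+' and '-'
        node = term()
        while peek() in (("op", "+"), ("op", "-")):
            _, op = advance()
            node = ("add" if op == "+" else "sub", node, term())
        return node

    tree = expr()
    if peek()[0] != "eof":
        raise SyntaxError(f"unexpected token: {peek()}")
    return tree

print(parse(tokenize("2 + 3 * 4")))
```

Step 3 then becomes a walk over that tree: evaluate it directly (interpreter), print source code in another language (transpiler), or emit assembly or IR (compiler).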

MocaCDeveloper (321)

@Coder100 Hang on. I am writing it with C so how would I compile it with llvm? Wouldn't I have to transpile it first?

Oh, and Hi Coder100! How are you :)

MocaCDeveloper (321)

@Coder100 I want to make the language a compiled language, but I am writing it in C. And isn't LLVM a C++ library (or something of the sort)?

I don't know, I am really confused :(

Oh, and that's good. I am glad to hear you're doing great!

Coder100 (8721)

@MocaCDeveloper I've never used llvm before, I would say you are best with transpiling it
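Transpiling mostly means walking the tree and printing source code in the target language. Here is a minimal sketch that targets C rather than C++ (a swap made for this example, since the language itself is being written in C). The tuple-based tree format and the names `to_c_expr` and `transpile` are assumptions for illustration.

```python
C_OPERATORS = {"add": "+", "sub": "-", "mul": "*"}

def to_c_expr(node):
    """Turn an expression tree into a fully parenthesized C expression."""
    if node[0] == "num":
        return str(node[1])
    left = to_c_expr(node[1])
    right = to_c_expr(node[2])
    return f"({left} {C_OPERATORS[node[0]]} {right})"

def transpile(node):
    """Wrap the expression in a complete C program that prints its value."""
    return (
        "#include <stdio.h>\n"
        "\n"
        "int main(void) {\n"
        f'    printf("%d\\n", {to_c_expr(node)});\n'
        "    return 0;\n"
        "}\n"
    )

ast = ("add", ("num", 2), ("mul", ("num", 3), ("num", 4)))
print(transpile(ast))
```

The generated file then goes through an ordinary C compiler, which does the machine-code generation and optimization for you.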

MocaCDeveloper (321)

@Coder100 And how would that happen? I have never attempted to transpile before :/

I am sorry for so many questions!

Coder100 (8721)

oh it's ok

so how does your tree roughly look?
@MocaCDeveloper

MocaCDeveloper (321)

@Coder100 PFFFTTT I haven't even gotten that far!

I am that one person that stresses about something that comes along the road 5000 miles ahead rather than the obstacle right in my face :)

Coder100 (8721)

@MocaCDeveloper lol

well, go make your lexer and parser first, they will determine how we do the evaluation step.

MocaCDeveloper (321)

@Coder100 I will get to it when I am ready for the frustrations of facing the hardship of figuring out what the heck to do for each step!

But, if I put my head into it, I could probably write up a very well functioning lexer AND parser in about 2-3 hours!

Also, we need to start working on that Election Poll app soon, and another question, are you at all familiar with programming language development and the C language?

Coder100 (8721)

@MocaCDeveloper I have made a programming language, but I don't like C

fuzzyastrocat (767)

@MocaCDeveloper

"And isn't LLVM a C++ library (or something of the sort)?
I don't know, I am really confused :("

I will try to help un-confuse you :D
LLVM is a C++ library (with a C API, so you can use it from C). The simplest way to explain it is that you tell it how to compile your code based on your AST. Basically, for each node of your tree (your AST), you'll tell LLVM what "machine code" (really LLVM IR, its intermediate representation) to create. Then, LLVM handles the generation of the real machine code for you.

While this sounds easy, LLVM is a huge library and it can be hard to get started. I'd suggest only using LLVM once you've had experience with hand-compiling.
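To give a feel for the "one piece of output per tree node" idea without installing anything, here is a sketch that writes LLVM IR as plain text instead of calling the LLVM library itself. That is a simplification for this example, and the tuple-based tree format and the name `emit_ir` are assumptions, not part of LLVM.

```python
def emit_ir(ast):
    """Walk the tree and emit one textual LLVM IR instruction per operator node."""
    lines = []
    counter = 0

    def fresh():
        # Every IR value gets a unique name, since IR values are assigned only once.
        nonlocal counter
        counter += 1
        return f"%t{counter}"

    def gen(node):
        if node[0] == "num":
            return str(node[1])              # constants are used inline
        left = gen(node[1])
        right = gen(node[2])
        name = fresh()
        # The node kinds "add", "sub" and "mul" happen to match the IR opcode names.
        lines.append(f"  {name} = {node[0]} i32 {left}, {right}")
        return name

    result = gen(ast)
    body = "\n".join(lines + [f"  ret i32 {result}"])
    return "define i32 @main() {\nentry:\n" + body + "\n}\n"

ast = ("add", ("num", 2), ("mul", ("num", 3), ("num", 4)))
print(emit_ir(ast))
```

Saved as `out.ll`, the output can be handed to LLVM's own tools, for example `lli out.ll` to run it directly or `clang out.ll -o out` to get a native executable. A real front end would normally build the same instructions through the library's builder API rather than by printing strings.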

xxpertHacker (390)

@MocaCDeveloper LLVM is, despite the name, less a virtual machine than a compiler library: it takes its IR format as input and outputs native machine instructions, by default for the device that it is run on. It is very useful for portability and optimization.

Now, as @fuzzyastrocat mentioned, LLVM is large, thus not easy to learn quickly. I would recommend at least dedicating a few weeks towards just learning LLVM, that is, if you're going to use it at all.

Once you've become familiar with it, it should be easy.