How To Make A True Coding Language: Part 1
CSharpIsGud (425)

I'm making this tutorial series because almost every language I've seen posted to Repl Talk doesn't use a parsing algorithm, and I think it would be nice to see some that do. These languages typically come in two flavors:
1) They use string splitting and regular expressions

Technically you can call this "parsing", and the result a language of some sort.
But you will very quickly run into syntax limitations, like needing a dedicated separator character between almost everything.
Example:
set:var,Hello World;print:var

2) They do nothing at all but define some classes or variables

I don't know how people get away with this and then tell you to calm down when someone calmly points out what the project is and what it isn't, even when the criticism is worded as carefully as possible so it doesn't attack the repl itself.
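To make the first flavor concrete, here is a minimal sketch of a split-based interpreter for the set/print example above. The function name `run_split_language` and its error handling are illustrative assumptions, not code from any repl mentioned here.

```python
def run_split_language(source):
    """Interpret 'set:name,value' and 'print:name' statements separated by ';'."""
    variables = {}
    output = []
    for statement in source.split(";"):
        command, _, argument = statement.partition(":")
        if command == "set":
            name, _, value = argument.partition(",")
            variables[name] = value
        elif command == "print":
            output.append(variables[argument])
        else:
            raise ValueError(f"unknown command: {command!r}")
    return output


print(run_split_language("set:var,Hello World;print:var"))

# The separators are now reserved characters: a ';' inside the text
# cuts the statement in two, and the second half is not a valid command.
try:
    run_split_language("set:var,Hello; World;print:var")
except ValueError as error:
    print("broke on:", error)
```

The second call fails because the splitter has no notion of a string literal, which is exactly the kind of limitation a real lexer removes.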

That is why I have decided to write a tutorial on making a programming language, in the hope that people start making ones that don't have the above flaws.

This tutorial goes from the bottom up with no dependencies at all, and covers everything from the lexer all the way up to a handwritten recursive-descent parser!

The Lexer (or scanner, tokenizer, whatever you wish to call it)

Located in lexer.py

The other components will get their own files as they are created.
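The contents of lexer.py are not reproduced in this text. As a rough idea of what a handwritten lexer does, here is a minimal sketch. The `Token` and `Lexer` names, the token kinds, and the operator set are assumptions for illustration, not the tutorial's actual code.

```python
from dataclasses import dataclass


@dataclass
class Token:
    kind: str  # "NUMBER", "IDENT", "STRING" or "OP"
    text: str


class Lexer:
    OPERATORS = "+-*/()="

    def __init__(self, source):
        self.source = source
        self.pos = 0  # index of the next unread character

    def peek(self):
        """Return the next character without consuming it ('' at end of input)."""
        return self.source[self.pos] if self.pos < len(self.source) else ""

    def take_while(self, predicate):
        """Consume characters while predicate holds and return them."""
        start = self.pos
        while self.peek() and predicate(self.peek()):
            self.pos += 1
        return self.source[start:self.pos]

    def tokens(self):
        result = []
        while self.peek():
            ch = self.peek()
            if ch.isspace():
                self.pos += 1  # whitespace only separates tokens
            elif ch.isdigit():
                result.append(Token("NUMBER", self.take_while(str.isdigit)))
            elif ch.isalpha() or ch == "_":
                name = self.take_while(lambda c: c.isalnum() or c == "_")
                result.append(Token("IDENT", name))
            elif ch == '"':
                self.pos += 1  # skip the opening quote
                text = self.take_while(lambda c: c != '"')
                if not self.peek():
                    raise SyntaxError("unterminated string")
                self.pos += 1  # skip the closing quote
                result.append(Token("STRING", text))
            elif ch in self.OPERATORS:
                self.pos += 1
                result.append(Token("OP", ch))
            else:
                raise SyntaxError(f"unexpected character {ch!r} at {self.pos}")
        return result


for token in Lexer('print("Hello World") x1 = 2 + 30').tokens():
    print(token)
```

Because the lexer reads character by character, a string like "Hello World" can contain spaces or punctuation without any special separator in the surrounding syntax.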

Next: https://repl.it/talk/learn/How-To-Make-A-Language-Parsing/39832

Comments
CodeSalvageON (534)

im pretty tired of seeing fake langs on talk. they're worse than fake os's

CodeLongAndPros (982)

@CodeSalvageON I don't think of them as OSes, more of shells.

DynamicSquid (2678)

@CodeSalvageON I think fake OS are worse

std::cout << "Booting up system. Please wait a while\n";

for (int a = 0; a < infinity; ++a)
    std::cout << a << "% loading...\n";

@CodeLongAndPros empty shells

AmazingMech2418 (693)

You do know that there is more than one way to make a programming language, right? Your type one not programming languages are actually programming languages, just without as many powerful functions. For example, you could create a LOLCODE interpreter using split functions and regular expressions and LOLCODE is a programming language. It's not the most useful, but still a language. Same with Forth which is even easier to create an interpreter for. Then, the type two "not programming languages" really aren't programming languages and just created dialects for known languages. However, Clojure is a dialect of Lisp and is considered a separate language, so why couldn't some of what you call "not programming languages" actually be programming languages? For example, the in-development THAIL programming language is really a dialect of Adapt (my programming language which is also in development). Also, please stop arguing with everyone about the things you call "fake". There is still hard work put into it, just maybe not as much as a real OS or full interpreted/compiled programming language.

CSharpIsGud (425)

@AmazingMech2418 Ok yeah, I guess you could consider something like deflang an actual language. But at least deflang does some kind of parsing, tell me how you can make a dialect of a language without any form of parsing at all

AmazingMech2418 (693)

@CSharpIsGud Do you consider Dart a language although it normally just transpiles into JavaScript? Also, a dialect could technically just be a "language" with the same syntax and different function names.

CSharpIsGud (425)

@AmazingMech2418 dart actually compiles, they didn't just rename javascript

CSharpIsGud (425)

@AmazingMech2418 Also he technically didn't make a compiler in 3 lines, he just used the existing one which compiles into python bytecode, so roylat keeps his life savings

AmazingMech2418 (693)

@CSharpIsGud Yes, Dart actually compiles and he didn't make a compiler, but he still renamed the functions which qualifies as a dialect in my opinion.

[deleted]

@CSharpIsGud jokes on you I dont have any so either way I dont lose anything

AmazingMech2418 (693)

@roylatgnail Honestly, all of my languages use full interpreters. Well, besides Link and XPL which really just use JSON and XML syntax respectively and evaluate the functions to allow you to do something other than store data in JSON and XML.

xxpertHacker (341)

@AmazingMech2418 T.H.A.I.L. never even happened, thus your entire point is invalid.

xxpertHacker (341)

@AmazingMech2418 Dart is its own language, a whole VM was made just for Dart, it has its own parser, tokenizer, everything. Dart is normally just transpiled. There are browsers that support Dart natively with improved performance over JavaScript.

Now can we Consider TypeScript a language? No, it's clearly a dialect of JavaScript.

CSharpIsGud (425)

@StudentFires I mean, typescript has its own parser. its a dialect of javascript in that it inherits most of its syntax and compiles to it, but typescript adds major things javascript just doesn't have, like static typing.

xxpertHacker (341)

@CSharpIsGud That's it, it ends there. Interfaces are for type checking, types are for type checking, what it calls "function overloading" is the stupidest implementation of function overloading I've ever seen, is just for type checking, templates are just for type checking.

All it is, is JS with types.

Dart has types too, and a whole different syntax, but there's more than syntax and types that makes a language unique. Until TS really branches off from JS, it's still just a dialect. It doesn't add anything new.

Whereas, like I said before, Dart has a whole browser dedicated to running it, parsing it, etc. Look at TypeScript, Deno, one of the only TypeScript run-times that I've heard of, barely came out a few months ago.

xxpertHacker (341)

@CSharpIsGud It's comparable to using the Python type checking extension, is it still Python? Of course!

xxpertHacker (341)

@CSharpIsGud TypeScript has room for improvement. Let's look at its function overloading: it just sets up a pattern of types that a function can accept.
Yet, if it's compiled, can't the TS compiler rename the functions before compiling and separate them? I'm sure it could! I can, and I'm sure you can too, so why can't Microsoft? It wouldn't matter to the developer, since it should all become minified anyways.

BobNeo (39)

Can you like not hate on people’s projects just because they don’t fit your idea of a coding language?

CodeSalvageON (534)

@BobNeo they're modules at most, not "coding languages"

BobNeo (39)

That doesn’t excuse it. @CodeSalvageON

CodeSalvageON (534)

@BobNeo still, criticism is good and people should know what they're actually programming before they call it something else.

CSharpIsGud (425)

@BobNeo "my idea of a coding language" there is no my idea. its not a language if its just python with some defined classes, basically the same with .split and regexp

BobNeo (39)

That doesn’t excuse it either @CodeSalvageON

CSharpIsGud (425)

@BobNeo it does, because its true. but there isn't an excuse for blindly defending posts that don't even know what they are

LoganSpong (46)

@CSharpIsGud Also, I know who you're targeting.

CSharpIsGud (425)

@LoganSpong this applies to everyone who makes either of those types of not programming languages, you're just the most recent

LoganSpong (46)

@CSharpIsGud Yours literally is Python. print(), Its just classes as well! And I can prove it! I can write any old Python code, and it will work!

LoganSpong (46)

@CSharpIsGud IsAlpha() is literally a redefinition of the str class and the function is .isalpha

Its also only 96 lines.
In Syntax.md IT LITERALLY SAYS: Some syntax borrowed from other languages.

Huh?

CSharpIsGud (425)

@LoganSpong good idea with isalpha, however by Standard syntax shared by most languages under expressions I obviously meant stuff like 1 + 2 * 3 which most languages share.

also its 97 lines because this is just the lexer and its in python.
if you look at my other langs like my python compiler you will see it quickly rises into the 3 digit range

LoganSpong (46)

@CSharpIsGud Python Compiler? Really? C'mon, triple-digits?
I can make one in 3-ish lines.

LoganSpong (46)

@CSharpIsGud Because you only have string analysing functions?

CSharpIsGud (425)

@LoganSpong I mean a real to native executable compiler, not a call to eval nested in a loop(which isn't a compiler anyway)

the one you make in 3 lines would just be using the existing python interpreter instead of making one

https://repl.it/@CSharpIsGud/CPython

[deleted]

@LoganSpong i will bet my life savings that you cannot make a python compiler in 3 lines

CSharpIsGud (425)

@LoganSpong Mine uses classes, but I never said the C++ classes were classes in my own language. If you look, you will see that the compiler doesn't actually support python classes because I haven't gotten to parsing those yet

And obviously I have to make a program for the compiler to compile

CodeSalvageON (534)

@LoganSpong he's not targeting anyone. If you call criticism "targeting" then "pls give upvotes"

LoganSpong (46)

@CSharpIsGud In python there is a compile function.

Python compiler:

# Hand the source to Python's built-in compiler (which emits bytecode), then run the result
code = input('Enter code here:')
exec(compile(code, '<input>', 'exec'))

NoelB33 (308)

@royaltgnail So when will you give your life savings to @LoganSpong

NoelB33 (308)

He’s obviously targeting somebody, this post came out right after he commented on the post he is targeting. @CodeSalvageON

CSharpIsGud (425)

@LoganSpong also that's just using an existing compiler, and it only compiles to bytecode (as far as I know CPython compiles source TO BYTECODE, NOT NATIVE code, so the compile function is basically the same amount of cheating as exec and eval)
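For reference, a minimal sketch of what the built-in compile function produces in CPython: a code object holding bytecode, which exec then runs on the existing interpreter. The source string and the "<example>" filename are placeholders, and the exact instruction names printed vary between Python versions.

```python
import dis
import types

# compile() returns a code object, not a native executable.
code_object = compile("print(1 + 2 * 3)", "<example>", "exec")
print(isinstance(code_object, types.CodeType))
print(type(code_object.co_code))  # the raw bytecode is a bytes object

# List the bytecode instructions the existing interpreter will execute.
for instruction in dis.get_instructions(code_object):
    print(instruction.opname, instruction.argrepr)

# Running it still relies on the Python interpreter that is already installed.
exec(code_object)
```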

BobNeo (39)

@CSharpIsGud I just got like 30 notifications from just this lol

JordanDixon1 (310)

@BobNeo @CSharpIsGud @CodeSalvageON Listen, heated discussions are not the reason repl.it was made. It was meant for making and sharing projects. It was meant for people that don't want to install the programming language on their computer that may be around 200mb! Also, I do believe CONSTRUCTIVE criticism is good, however, the keyword is constructive. You don't need to create a post about how someone else's post is invalid and wrong. You can simply comment on their post suggesting the name be changed to something different. @LoganSpong s module is actually really good, and although he may have the description wrong, it can still be really helpful for developers. I am working with him on making his module on pypi and I hope to see it on there soon. Anyway, I don't mean to point fingers, harass, or anything like that. I am simply trying to put an end to this heated discussion.

AdCharity (1270)

if you think about it, most of the "fake languages" are actually languages, they're just implemented very poorly. It's literally like me ragging on you for not separating the lexer and tokenizer because they're completely different.

CSharpIsGud (425)

@AdCharity I mean, I may have exploded on .split and regexp in the past, but you can technically call those languages. But when people define some classes, import them in another file, and genuinely believe it isn't just python, I impulsively point that out

AdCharity (1270)

@CSharpIsGud I get your point (in fact one of my "languages" was the same thing, I'm going to pretend I'm different now). Nice project though :P

Spiered (2)

Whether it is a "true" or a "fake" language (as you call them), both are useless in the sense that, barring exceptions, nobody will use them (except maybe for fun). I don't think we should blame people for making these "fake" languages or "true" languages, because both are very interesting to code. It is a question of skill: if you are skillful and experienced, then make a "real" language, but if you are a beginner or don't have a lot of time (whatever), code a "fake" language. Nothing bad with that.

Otherwise this tutorial sounds interesting :)

Codemonkey51 (803)

Yeh I agree they're semi-fake. They are technically languages, but this is better. I think I'll add this to RePy, a language by me and @SushiPython, it would really improve it. Also I've always wanted to make a proper language so here I go

TheForArkLD (660)

is deflang a true language?

CSharpIsGud (425)

@TheForArkLD Technically you could call .split and regexp a parser by the definition of the word, but you run into limitations really fast. Note how you had to require multiple separators to split statements, parameters, etc., and you would still need an expression parsing algorithm like the shunting yard algorithm if you want stuff like 5 + 2 * 3 (including order of operations, of course)
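For anyone curious what the shunting yard algorithm looks like, here is a minimal sketch that handles 5 + 2 * 3 with the usual order of operations. The function names and the `OPERATORS` table are illustrative assumptions. It only supports non-negative integers, the four basic operators, and parentheses, and it does no error handling (characters it doesn't know are simply skipped).

```python
import operator
import re

# operator symbol -> (precedence, function); all four are left-associative
OPERATORS = {
    "+": (1, operator.add),
    "-": (1, operator.sub),
    "*": (2, operator.mul),
    "/": (2, operator.truediv),
}


def to_postfix(expression):
    """Convert an infix expression to postfix (RPN) using the shunting yard algorithm."""
    output, stack = [], []
    for token in re.findall(r"\d+|[-+*/()]", expression):
        if token.isdigit():
            output.append(token)
        elif token == "(":
            stack.append(token)
        elif token == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()  # discard the "("
        else:
            # Pop operators of greater or equal precedence before pushing this one.
            while (stack and stack[-1] != "("
                   and OPERATORS[stack[-1]][0] >= OPERATORS[token][0]):
                output.append(stack.pop())
            stack.append(token)
    while stack:
        output.append(stack.pop())
    return output


def evaluate_postfix(postfix):
    """Evaluate a postfix token list with a value stack."""
    values = []
    for token in postfix:
        if token.isdigit():
            values.append(int(token))
        else:
            right, left = values.pop(), values.pop()
            values.append(OPERATORS[token][1](left, right))
    return values[-1]


postfix = to_postfix("5 + 2 * 3")
print(postfix)                    # ['5', '2', '3', '*', '+']
print(evaluate_postfix(postfix))  # 11
```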

xxpertHacker (341)

@CSharpIsGud Thinking about it, can we call Amasad's BASIC a language? It's really just a transpiler.

rediar (324)

"Enter a math equation to evaluate (like 2 + 2): 5^5
5"
hmmmm
Also, quick question, I couldn't figure out what self.cur (lexer.py) was supposed to do

TheDrone7 (1089)

May I recommend editing the posts to link to your next tutorials (and, in future posts, the previous ones) so that people can easily navigate between them?

LoganSpong (46)

Hm.. This is cool! I would like to make a parser, thanks!

DannyIsCoding (523)

Yay! I can't wait to make my own language :D

DynamicSquid (2678)

I'm really curious about making my own language, but I'm too busy now :(

The way I understand it is that you have an input

a += 7

And you have to split that up into characters

a, +, =, 7

And that part is called the lexer?

CSharpIsGud (425)

@DynamicSquid it's more than characters, it's tokens, like operators, numbers, strings

DynamicSquid (2678)

@CSharpIsGud oh, okay. and then the next step is the parser?

CSharpIsGud (425)

@DynamicSquid Yes, I use recursive-descent parsers in my interpreters/compilers

DynamicSquid (2678)

@CSharpIsGud Oh, okay. And is that most of it?

CSharpIsGud (425)

@DynamicSquid after the parser you make the actual interpreter or compiler

DynamicSquid (2678)

@CSharpIsGud and that's where the user enters their input?

CSharpIsGud (425)

@DynamicSquid interpreter gets input -> passes to parser which passes it to the lexer -> parser generates a syntax tree -> interpreter traverses that tree
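A minimal sketch of that pipeline for arithmetic expressions, with the lexer, a recursive-descent parser, and a tree-walking interpreter in one place. All names here (`lex`, `Parser`, `evaluate`, `interpret`) and the tuple-based tree are assumptions for illustration, not the code from the tutorial's repl. It does not check for leftover tokens after the expression.

```python
import re


def lex(source):
    """Lexer: split the source into number, operator and parenthesis tokens."""
    return re.findall(r"\d+|[-+*/()]", source)


class Parser:
    """Recursive-descent parser that builds a tuple-based syntax tree."""

    def __init__(self, source):
        self.tokens = lex(source)  # the parser is what calls the lexer
        self.index = 0

    def peek(self):
        return self.tokens[self.index] if self.index < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.index += 1
        return token

    def expression(self):  # expression := term (("+" | "-") term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            node = (self.advance(), node, self.term())
        return node

    def term(self):  # term := factor (("*" | "/") factor)*
        node = self.factor()
        while self.peek() in ("*", "/"):
            node = (self.advance(), node, self.factor())
        return node

    def factor(self):  # factor := NUMBER | "(" expression ")"
        token = self.advance()
        if token == "(":
            node = self.expression()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is not None and token.isdigit():
            return ("num", int(token))
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(node):
    """Interpreter: walk the syntax tree and compute its value."""
    if node[0] == "num":
        return node[1]
    op, left, right = node
    a, b = evaluate(left), evaluate(right)
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    return a / b


def interpret(source):
    tree = Parser(source).expression()  # input -> parser (-> lexer) -> tree
    return tree, evaluate(tree)         # the interpreter traverses the tree


print(interpret("1 + 2 * 3"))
```

Each grammar rule becomes one method, and operator precedence falls out of which rule calls which: `expression` calls `term`, so multiplication binds tighter than addition.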

DynamicSquid (2678)

@CSharpIsGud ohh.. okay, it makes sense now. thanks!

xxpertHacker (341)

@DynamicSquid The += gets parsed as one operator, so that's only 3 tokens.
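A small sketch of how a lexer can produce += as a single token: try the two-character operators before the one-character ones (longest match first). The `TOKEN_PATTERN` regex and `tokenize` function are illustrative assumptions, and using a regex here is just for brevity.

```python
import re

# Two-character operators are listed before single-character ones,
# so "+=" is matched as one token instead of "+" followed by "=".
TOKEN_PATTERN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|\+=|-=|\*=|/=|==|[-+*/=])")


def tokenize(source):
    tokens, position = [], 0
    source = source.rstrip()
    while position < len(source):
        match = TOKEN_PATTERN.match(source, position)
        if match is None:
            raise SyntaxError(f"unexpected character at {position}")
        tokens.append(match.group(1))
        position = match.end()
    return tokens


print(tokenize("a += 7"))  # ['a', '+=', '7']
```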

PYer (3293)

Hey! Great tutorial/idea! Haven't seen one of these yet. (But don't insult other people's projects either...)

DynamicSquid (2678)

@PYer unrelated, but you're on your way to 3000

PYer (3293)

lol thanks! I have my first cycle special planned for it. it's been planned for 2 months, and I'm not sure if i'll have it finished in time @DynamicSquid

DynamicSquid (2678)

@PYer ooh... can't wait to see that! And I'm sure once you release your cycles special, it won't be long before you take over mat!

PYer (3293)

ummm... i think you're going to be disappointed. And I feel like it's going to take some time to beat mat. I've been ahead of him once before, but lost the lead. I'm satisfied being second. @CodingCactus and @Vandesm14 are slowly creeping up (quickly in the case of CodingCactus). The cycle special is REDACTED. I'll delete that in one minute though :) @DynamicSquid

DynamicSquid (2678)

@PYer oh, well, still, 3000 cycles! That's pretty good, well done :)

LoganSpong (46)

There. I changed the name of my post to: A collection of powerful functions. Like it now? I can also change it to: Some functions I made called Inspyre.

NoelB33 (308)

Your post was a really good post, and you don’t have to change its name just because some person doesn’t like it. @LoganSpong

LoganSpong (46)

@NoelBryan Meh, I don't care about cycles anyways. It's not like I'm gonna get banned from Repl.it. I guess he is right.
But, @CSharpIsGud, can we stop fighting?
I get it. It was not a coding language. Yours may be better! But Repl.it is a website for innovation, not hating. Let's stop this.

JordanDixon1 (310)

@LoganSpong If you are still interested, we can upload your project to pypi. I haven't been on for awhile due to power outage, but I'm back sooooo...

LoganSpong (46)

@JordanDixon1 Yeah! I just wrote up README.md

JordanDixon1 (310)

@LoganSpong okay, I may or may not be able to help you, so instead I will give you the info on how to do it using my tutorial. You can also fork the project; when you're done, make sure you use the version control option on the left to create a GitHub repository. On top of this, replace the information in setup.py with the info you want.