Learn to Code via Tutorials on Repl.it!

← Back to all posts
How To Make A Language: Parsing

The last tutorial didn't really give an in depth explanation as to what everything was doing aside from some comments, however for parsing I feel I should explain how it works.

Markdown was really weird on this post.

We are going to be using a recursive descent parser, written all by hand and with no dependencies.

Lets start with an expression `8 + 5 * 4 / 2`

The first thing we are going to do is tokenize it, so this expression will tokenize to this

``````1: ("Number", 8)
2: ("Operator", "+")
3: ("Number", 5)
4: ("Operator", "*")
5: ("Number", 4)
6: ("Operator", "/")
7: ("Number", 2)``````

Im going to do a little experiment here, and if it works you should be able to read the above stream of tokens just like an rdp parser. When you see `Add(), Sub(), Mul(), Div() and Literal()` in the text below jump to the corresponding label!

Left = Sub()

``````Is the next token a + operator? If so then:
Right = Sub()
Return (Left, "+", Right)

Otherwise return the left side``````

Sub:
Left = Mul()

``````Is the next token a - operator? If so then:
Right = Mul()
Return (Left, "-", Right)

Otherwise return the left side``````

Mul:
Left = Div()

``````Is the next token a * operator? If so then:
Right = Div()
Return (Left, "*", Right)

Otherwise return the left side``````

Div:
Left = Literal()

``````Is the next token a / operator? If so then:
Right = Literal()
Return (Left, "/", Right)

Otherwise return the left side``````

Literal:
Return the current token and advance to the next one

Now, if you went through everything above and kept track of where you were in the sequence you will notice that you end with a tree that looks roughly like this:

[deleted]

# I like this

Soz about the 10yr old making Python classes rather than a programming language

LoganSpong (52)

@BXRStudios I get it already! I took down the post!

[deleted]

@LoganSpong I don’t think you are @CSharpIsGud but if you are hello otherwise I was talking to @CSharpIsGud

LoganSpong (52)

@BXRStudios No, I'm the 10 year old that made the uhh... thing.

CodeLongAndPros (1537)

Can people just

# Get over it!

It is not a big deal! Point it out once, then be done!

You can ask them to change the name, but really?

This kind of behavior is downright mean. Some people just like to make stuff and show off what they made!

## </rant>

CSharpIsGud (797)

@CodeLongAndPros Mean is kind of relative, since people will call calmly pointing out that importing a module doesn't make a new language mean while also ones that say the same thing in an insulting tone mean so there are 2 kinds of mean on replit

CodeLongAndPros (1537)

@CSharpIsGud There are some people that don’t like it because it is “not canonical”. I mean, sure, you can point it out, but don’t fixate on it. I’ve pointed some things to that effect, but I don’t focus on them. You could say that some people are stating a fact, and others are fixated on that same fact.

[deleted]

Well let's just stop here then and all just take a break. :)

bramley (248)

Please check the Repl.it TOS, @LoganSpong

LoganSpong (52)

@bramley I did. I could'nt find the section at which you're talking about. Which one is it?

LoganSpong (52)

@bramley Is it the copyright one? Like this: You may be held accountable for damages (including costs and attorneys' fees) for misrepresentation or bad-faith claims on the infringement of any User Content found on and/or through the Service on your copyright.

Or is it this one?
If we learn of any user that is under the age of 13, we will immediately terminate the user’s Account and delete any and all information we may have about that user.

Viper2211 (84)

Ummmm, is that supposed to happen?

CSharpIsGud (797)

@Viper2211 oh oops, I didn't implement negative numbers

Viper2211 (84)

@CSharpIsGud That makes more sense. I thought the parser was broken or something...

CSharpIsGud (797)

@Viper2211 negative numbers work now

studentAlfredAl (440)

I know I'm really late but... @CSharpIsGud I found another bug. Negative numbers don't work properly. I typed in `-1-3` and it returned -1... I'm pretty sure that's not the answer lol
But other than that, this was great! :D

Highwayman (1439)

Hm ok so I’ve mulled over this method for quite a while and I still don’t get it. The diagram at the end seems a big messed, but I’m not sure. How are we supposed to traverse the tree?

PythonPrograms (14)

Wow I like this, this is pretty cool and it’s very well explained

LoganSpong (52)

Cool! I wanted to learn how to make a parser, now I know! But what is the node.type thing?

CSharpIsGud (797)

@LoganSpong It could be a Binary Operator like 1 + 2, it could be an assignment like var = 15 or it could just be a Literal like "Hello World"

LoganSpong (52)

@CSharpIsGud cool. I'm making an CLI. It's pretty helpful

[deleted]

just make a bunch of functions and classes /s

DynamicSquid (4532)

xxpertHacker (649)

All this really does is parse an expression and then call Python's built-in operator, which probably calls C's built-in operator, which already sounds inefficient. Now if this were compiled into some binary that performed this, and maybe held some variables...

CSharpIsGud (797)

@StudentFires Thats basically how any interpreted language does it, this tutorial isn't going to cover a compiler but I might make a separate one after this series is finished.

xxpertHacker (649)

@CSharpIsGud Gonna show of the process of how you're making Qyl? If you do it might hype up Qyl so when it arrives it might get noticed.

VulcanWM (2231)

How do I add a parser to this: https://repl.it/@medcho/VainSmoggyApplicationsuite
I still don’t get what you mean?

VulcanWM (2231)

Thanks man, I seriously needed this

TheForArkLD (743)

I like these tutorial, can you make language with me by nodejs?