There is a thing on your computer called the stack.
It is more of a concept than a thing.
It resides in your RAM and is what is known as a last-in, first-out (LIFO) data structure.
Let us discuss what that is for a second.
LIFO data structures
Imagine you have a stack of books.
This stack of books has a top and some stuff under it.
Let us say that there are 5 books currently on this stack.
Let us order them like this:
Book #5
Book #4
Book #3
Book #2
Book #1
And you have another book, Book #6, off to the side.
If you wanted to take off Book #4 you would first have to take off Book #5.
That is known as popping from the stack.
If you were to put Book #6 on top that would be pushing to the stack.
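For the C programmers, here is a minimal sketch of the book stack as an array plus a count. The names book_stack, push_book, and pop_book are made up for this illustration, and the fixed capacity of 8 is an arbitrary assumption.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_BOOKS 8 // arbitrary capacity, chosen only for this sketch

// A pile of books: books[count - 1] is the book on top.
struct book_stack {
    int books[MAX_BOOKS];
    size_t count;
};

// Put a book on top of the pile. Returns false if the pile is full.
bool push_book(struct book_stack *s, int book) {
    if (s->count == MAX_BOOKS) {
        return false;
    }
    s->books[s->count] = book;
    s->count++;
    return true;
}

// Take the top book off the pile and store its number in *book.
// Returns false if the pile is empty.
bool pop_book(struct book_stack *s, int *book) {
    if (s->count == 0) {
        return false;
    }
    s->count--;
    *book = s->books[s->count];
    return true;
}
```

Starting from an empty pile, pushing Books #1 through #5 and then popping twice hands back Book #5 first and Book #4 second, which is the last-in, first-out order described above.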
Let us look at a more relevant example with data.
Data LIFO
Let us say you have one register, and let's say it is eax.
And you have a stack.
The stack has a pointer register called esp.
This pointer register points to the top of the stack.
Let us have a visual of all of this data with a few pushes and pops.
Before anything:
esp: 16
eax: 0
stack (one line per place, highest address first):
12: 0
8: 0
4: 0
0: 0
Here esp holds 16, the address just past the highest place, which means the stack is empty.
Notice that esp starts at 16. This is because on x86 processors the stack starts at the top of its memory and grows downward toward lower addresses.
Also, we only have 4 places in the stack (a real stack would be much bigger than this), but esp is at 16.
This is because each place in the stack is 4 bytes and each memory location is 1 byte, so we take the number of places in the stack and multiply it by 4 to get the starting value of esp.
OK, so now let's push 9 to the stack. push first subtracts 4 from esp and then stores the value at the address esp now holds.
esp: 12
eax: 0
stack:
12: 9 - esp *points* to this, meaning that its value is this memory location
8: 0
4: 0
0: 0
Now let's pop 9 back out into eax. pop first copies the value esp points to into eax and then adds 4 to esp.
esp: 16
eax: 9
stack:
12: 9 - no longer part of the stack; the old value just stays in memory until something overwrites it
8: 0
4: 0
0: 0
For you C programmers out there I will also provide a structure-form of a stack.
The stack is actually placed upside-down in RAM as well: it starts at a high address and grows toward lower ones.
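A minimal sketch of what such a structure could look like, assuming the tiny 16-byte stack from the example above. The names (struct stack, stack_init, stack_push, stack_pop) are invented for this illustration; a real CPU has no such struct, only memory and the esp register.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 16 // 4 places of 4 bytes each, as in the example

struct stack {
    uint8_t mem[STACK_BYTES]; // the RAM the stack lives in, one byte per location
    uint32_t esp;             // offset of the top of the stack within mem
};

// An empty stack: esp sits just past the highest byte.
void stack_init(struct stack *s) {
    memset(s->mem, 0, sizeof s->mem);
    s->esp = STACK_BYTES;
}

// Like push: move esp down by 4, then store the value there.
bool stack_push(struct stack *s, uint32_t value) {
    if (s->esp < 4) {
        return false; // no room left
    }
    s->esp -= 4;
    memcpy(&s->mem[s->esp], &value, 4);
    return true;
}

// Like pop: read the value at esp, then move esp up by 4.
bool stack_pop(struct stack *s, uint32_t *location) {
    if (s->esp > STACK_BYTES - 4) {
        return false; // nothing to pop
    }
    memcpy(location, &s->mem[s->esp], 4);
    s->esp += 4;
    return true;
}
```

After stack_init, esp is 16. Pushing 9 moves it to 12, and popping into a variable standing in for eax gives back 9 and returns esp to 16.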
The stack (continued)
OK, so how do we do pushes and pops?
Well, let us review each separately.
Push
Syntax:
push <value>
value can be a register, a pointer, straight-out data (such as just the number 5), or a few other things that we will discuss later.
Pop
Syntax:
pop <location>
location can be a register, a pointer, or a few other things that we will discuss later.
Ok! Now you have a decent understanding of the stack! You can use it to store things, preserve things, and to pass arguments to labels.
Basic Math!
Add
add takes 2 arguments: the first is a location, and the second is either a location or straight-out data.
So like this:
add eax, 1
It adds them and saves the value into the first argument which is the location.
Sub
sub also takes 2 arguments: the first is a location, and the second is either a location or straight-out data.
So like this:
sub eax, 1
It subtracts the second from the first and saves the value into the first argument which is the location.
You could think of both add and sub like so:
// location points at the first argument; the result is written back through it
void add(int* location, int data) {
*location = *location + data;
}
void sub(int* location, int data) {
*location = *location - data;
}
int main() {
int x = 9;
add(&x, 10); // x should now be 19
sub(&x, 10); // x should again be 9
return 0;
}
Dec and Inc
Both dec and inc take 1 argument which must be a location.
dec subtracts 1 from the location and inc adds 1 to it.
You could think of them like...
// location points at the value to change; the result is written back through it
void inc(int* location) {
*location = *location + 1;
}
void dec(int* location) {
*location = *location - 1;
}
int main() {
int x = 9;
inc(&x); // x should now be 10
dec(&x); // x should now be 9 again
return 0;
}
The jmp and call instructions.
If we define a label alongside our _start label like so...
section .text
global _start
_start:
; code...
; exit our program
mov eax, 1
mov ebx, 0
int 0x80 ; syscall
add20:
; this adds 20 to eax
add eax, 20
How do we call it?
Well, we use the jmp instruction!
So, let's do that.
jmp syntax:
jmp <label>
So let's update our code.
section .text
global _start
_start:
; code...
jmp add20
; exit our program
mov eax, 1
mov ebx, 0
int 0x80 ; syscall
add20:
; this adds 20 to eax
add eax, 20
The problem is that the code below the jmp (our exit code) never gets executed, because jmp never comes back.
There is, however, a way around this: the finicky but useful call instruction!
The call instruction has this syntax:
call <label>
Let us update our code with the call instruction.
section .text
global _start
_start:
; code...
call add20
; exit our program
mov eax, 1
mov ebx, 0
int 0x80 ; syscall
add20:
; this adds 20 to eax
add eax, 20
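Still doesn't work...
That is because we never returned from our label: after add eax, 20 the CPU just keeps going into whatever bytes come next.
call pushes the address of the next instruction (the return address) onto the stack before it jumps, and the ret instruction pops that address and jumps back to it!
ret needs no arguments here, just the keyword ret.
So let us add that to the end of add20.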
section .text
global _start
_start:
; code...
call add20
; exit our program
mov eax, 1
mov ebx, 0
int 0x80 ; syscall
add20:
; this adds 20 to eax
add eax, 20
ret
Let's see if our code is working by adding a bit of extra code to put 20 in eax before calling add20 and exiting with the result (which should be 40).
section .text
global _start
_start:
; code...
mov eax, 20
call add20
; exit our program
mov ebx, eax
mov eax, 1
int 0x80 ; syscall
add20:
; this adds 20 to eax
add eax, 20
ret
There ya go! We did it!
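To see why ret made the difference, here is a rough sketch in C of a toy machine that runs the same program. It assumes that call amounts to pushing the address of the next instruction and then jumping, and that ret amounts to popping that address and jumping back to it. The instruction set, the struct names, and the run function are all invented for this illustration and are not how a real assembler or CPU is structured.

```c
#include <stdint.h>

// A made-up, tiny instruction set covering just what the program above uses.
enum opcode { MOV_EAX, ADD_EAX, CALL, RET, EXIT_WITH_EAX };

struct instruction {
    enum opcode op;
    int32_t arg; // a number for MOV_EAX/ADD_EAX, a target index for CALL
};

// Runs the program and returns the exit code, or -1 if the toy stack is misused.
int32_t run(const struct instruction *program) {
    int32_t eax = 0;
    int32_t stack[16]; // return addresses live here
    int sp = 0;        // number of entries currently on the stack
    int ip = 0;        // index of the instruction being executed

    for (;;) {
        struct instruction in = program[ip];
        switch (in.op) {
        case MOV_EAX:
            eax = in.arg;
            ip++;
            break;
        case ADD_EAX:
            eax += in.arg;
            ip++;
            break;
        case CALL:
            if (sp == 16) return -1;
            stack[sp++] = ip + 1; // remember where to come back to
            ip = in.arg;          // then jump to the label
            break;
        case RET:
            if (sp == 0) return -1;
            ip = stack[--sp];     // jump back to the remembered address
            break;
        case EXIT_WITH_EAX:
            return eax;
        }
    }
}

// The add20 program from above, with labels replaced by instruction indexes.
int32_t run_add20_program(void) {
    const struct instruction program[] = {
        { MOV_EAX, 20 },      // 0: mov eax, 20
        { CALL, 3 },          // 1: call add20
        { EXIT_WITH_EAX, 0 }, // 2: exit with the value in eax
        { ADD_EAX, 20 },      // 3: add20: add eax, 20
        { RET, 0 },           // 4: ret
    };
    return run(program);
}
```

run_add20_program() follows the same path as the assembly: set eax to 20, call add20, add 20, return, and exit with 40. Without the RET entry the toy machine, like the real one, would run off the end of the program.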
Conditional jumps.
OK. So what if we wanted to jump only when some condition is true?
Let's make a goal for this section. Let's write a program that exits with 0 if eax is greater than 10, otherwise, it exits with 1.
Ok, firstly we need to know how to compare the two numbers.
For this we use the cmp instruction!
The syntax for cmp is as follows:
cmp <x>, <y>
Please note that x must be a location (a register or a pointer), while y can be a location or data; they cannot both be pointers.
So we do that, now what?
Well cmp sets a few flags inside of the CPU.
If we want to jump based on these flags we use a few special jump statements.
These are exactly like the regular jmp, except that they only jump when the flags say the condition holds. The syntax is exactly the same; only the keyword is different.
The most basic are as follows (for jg, jl, jge, and jle, x and y are treated as signed numbers):
jg: jump if x was greater than y
jl: jump if x was less than y
jge: jump if x was greater than or equal to y
jle: jump if x was less than or equal to y
je: jump if x was equal to y
jne: jump if x was not equal to y
jz: jump if the last result was 0 (after a cmp this is the same as je, because cmp subtracts y from x)
jnz: jump if the last result was not 0 (after a cmp this is the same as jne)
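For the curious, here is a sketch in C of how these jumps decide. It assumes the usual x86 rules: cmp computes x - y without storing it and records the zero flag (ZF), sign flag (SF), and overflow flag (OF), and each conditional jump just tests those flags. The struct and function names are invented for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

struct flags {
    bool zf; // zero flag: the result of x - y was 0
    bool sf; // sign flag: the top bit of the result was set
    bool of; // overflow flag: x - y did not fit in a signed 32-bit number
};

// Models cmp x, y: subtract, keep only the flags, throw the result away.
struct flags cmp(int32_t x, int32_t y) {
    uint32_t ux = (uint32_t)x;
    uint32_t uy = (uint32_t)y;
    uint32_t result = ux - uy; // unsigned math wraps around like the CPU does

    struct flags f;
    f.zf = (result == 0);
    f.sf = (result >> 31) != 0;
    // Overflow happens when x and y have different signs
    // and the result's sign differs from x's sign.
    f.of = (((ux ^ uy) & (ux ^ result)) >> 31) != 0;
    return f;
}

bool je(struct flags f)  { return f.zf; }                  // same test as jz
bool jne(struct flags f) { return !f.zf; }                 // same test as jnz
bool jg(struct flags f)  { return !f.zf && f.sf == f.of; }
bool jge(struct flags f) { return f.sf == f.of; }
bool jl(struct flags f)  { return f.sf != f.of; }
bool jle(struct flags f) { return f.zf || f.sf != f.of; }
```

For example, jg(cmp(11, 10)) is true, while jg(cmp(10, 10)) is false and je(cmp(10, 10)) is true.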
Ok, what we want for our goal is jg.
So let's write our code.
First we define the stuff that we will need.
section .text
global _start
_start:
mov eax, 11 ; 11 is greater than 10, so it should exit with 0
; now what?
exit0:
mov eax, 1
mov ebx, 0
int 0x80
exit1:
mov eax, 1
mov ebx, 1
int 0x80
Ok, well we want to use jg so we will first compare with:
cmp eax, 10
Then we will jump if greater with jg:
jg exit0
So let's tack:
cmp eax, 10
jg exit0
On the end of _start.
section .text
global _start
_start:
mov eax, 11 ; 11 is greater than 10, so it should exit with 0
cmp eax, 10
jg exit0
exit0:
mov eax, 1
mov ebx, 0
int 0x80
exit1:
mov eax, 1
mov ebx, 1
int 0x80
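OK, if it doesn't jump, execution just falls through into exit0 (the next code in the file), so it would exit with 0 either way. We need to jump to exit1 when eax is not greater than 10.
Remember, because it's jmp, it doesn't return, so we will just tack:
jmp exit1
onto the end of _start to catch the case where it doesn't jump to exit0.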
section .text
global _start
_start:
mov eax, 11 ; 11 is greater than 10, so it should exit with 0
cmp eax, 10
jg exit0
; if it doesn't jump up there, it needs to go to exit1
jmp exit1
exit0:
mov eax, 1
mov ebx, 0
int 0x80
exit1:
mov eax, 1
mov ebx, 1
int 0x80
And there we go!
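In C terms, the finished program behaves roughly like the function below. This is only an analogy: exit_code_for is a made-up name, and the value passed in stands for whatever is in eax.

```c
// Returns the exit code the assembly program above would produce
// for a given starting value of eax.
int exit_code_for(int eax) {
    if (eax > 10) { // cmp eax, 10 followed by jg exit0
        return 0;   // exit0
    }
    return 1;       // jmp exit1
}
```

With 11 it returns 0, and with 10 or anything smaller it returns 1.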
Function Prologue and Epilogue
As a safeguard, programmers invented the function prologue and epilogue, which go at the start and end of labels that are reached with call, to make them safer.
FYI, C compilers generate these.
For example:
int main() {
return 0;
}
Would compile to:
main:
push ebp
mov ebp, esp
mov eax, 0
leave
ret
Or some equivalent.
So what are these lines?
Well, for now you really don't need to know what they mean. Just know that ebp is the base pointer.
Prologue
Put this at the start of your callable labels.
push ebp
mov ebp, esp
Epilogue
Put this at the end of your callable labels.
leave
ret
What leave actually does is undo the prologue, like so:
mov esp, ebp
pop ebp
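The prologue saves the caller's ebp and records where esp was on entry; the epilogue puts both back, so the stack pointer doesn't get all messed up no matter what the label pushed in between.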
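Here is a small C sketch of that bookkeeping, using a pretend CPU with a 64-byte stack. It assumes the behavior described above: the prologue is push ebp then mov ebp, esp, and leave is mov esp, ebp then pop ebp. The struct cpu type and the helper names are invented for this illustration.

```c
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64 // arbitrary size for the sketch

struct cpu {
    uint32_t esp; // stack pointer (an offset into mem)
    uint32_t ebp; // base pointer
    uint8_t mem[STACK_BYTES];
};

// No bounds checks, to keep the sketch short.
void push32(struct cpu *c, uint32_t value) {
    c->esp -= 4;
    memcpy(&c->mem[c->esp], &value, 4);
}

uint32_t pop32(struct cpu *c) {
    uint32_t value;
    memcpy(&value, &c->mem[c->esp], 4);
    c->esp += 4;
    return value;
}

// push ebp / mov ebp, esp
void prologue(struct cpu *c) {
    push32(c, c->ebp);
    c->ebp = c->esp;
}

// leave, which is mov esp, ebp / pop ebp
void leave(struct cpu *c) {
    c->esp = c->ebp;
    c->ebp = pop32(c);
}

// A pretend callable label that pushes three values and never pops them.
void messy_label(struct cpu *c) {
    prologue(c);
    push32(c, 1);
    push32(c, 2);
    push32(c, 3);
    leave(c); // esp and ebp are back to what they were on entry
}
```

Even though messy_label leaves three values on the stack, leave throws them away and restores the caller's ebp, so a following ret would find the return address right where call left it.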
Conclusion.
Let's first write a program with all we have learned so far.
The goal of this program is to increment eax until it is greater than or equal to 100.
It will also put the number of times it has looped into ebx.
Let's write this.
Firstly we need to define _start and our adduntil label.
section .text
global _start
_start:
mov eax, 50 ; starting at 50, it should loop 50 times
mov ebx, 0 ; the loop counter starts at 0
call loop ; calling the loop
; exit with the number of times it has looped (ebx already holds it)
mov eax, 1
int 0x80
adduntil:
push ebp
mov ebp, esp
; now what?
leave
ret
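Now we can define another label called loop, which will loop over. (Note that loop is also the name of an x86 instruction, so if your assembler rejects it as a label name, rename the label to something like addloop everywhere it appears.)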
section .text
global _start
_start:
mov eax, 50 ; starting at 50, it should loop 50 times
mov ebx, 0 ; the loop counter starts at 0
call loop ; calling the loop
; exit with the number of times it has looped (ebx already holds it)
mov eax, 1
int 0x80
adduntil:
push ebp
mov ebp, esp
; now what?
leave
ret
loop:
inc eax
inc ebx
; ???
Now at the end of loop we put the code that checks whether eax has reached 100 yet.
section .text
global _start
_start:
mov eax, 50 ; starting at 50, it should loop 50 times
mov ebx, 0 ; the loop counter starts at 0
call loop ; calling the loop
; exit with the number of times it has looped (ebx already holds it)
mov eax, 1
int 0x80
adduntil:
push ebp
mov ebp, esp
; now what?
leave
ret
loop:
inc eax
inc ebx
cmp eax, 100
jl loop
; ???
OK, now we need to cut off the end of _start and give it its own label, end, so we can jump to it.
section .text
global _start
_start:
mov eax, 50 ; starting at 50, it should loop 50 times
mov ebx, 0 ; the loop counter starts at 0
call loop ; calling the loop
adduntil:
push ebp
mov ebp, esp
; now what?
leave
ret
loop:
inc eax
inc ebx
cmp eax, 100
jl loop
jmp end
end:
; exit with the number of times it has looped (ebx already holds it)
mov eax, 1
int 0x80
And tidy up...
section .text
global _start
_start:
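mov eax, 50 ; starting at 50, it should loop 50 times
mov ebx, 0 ; the loop counter starts at 0
call adduntil ; calling the func
jmp end ; adduntil only returns here if eax was already 100 or more
adduntil:
push ebp
mov ebp, esp
cmp eax, 100
jl loop
leave
ret
loop:
inc eax
inc ebx
cmp eax, 100
jl loop
jmp end
end:
; exit with the number of times it has looped (ebx already holds it)
mov eax, 1
int 0x80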
And we are done!
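For comparison, here is roughly the same logic in C. It is a sketch rather than a translation: add_until is a made-up name, eax is passed and returned by value, and ebx is passed as a pointer so the count can be read afterwards.

```c
// Increments eax until it is at least 100, counting the steps in *ebx.
// Returns the final value of eax.
int add_until(int eax, int *ebx) {
    while (eax < 100) { // cmp eax, 100 / jl loop
        eax++;          // inc eax
        (*ebx)++;       // inc ebx
    }
    return eax;
}
```

Starting from 50 with the counter at 0, this returns 100 and leaves 50 in the counter, which is the exit code the assembly program reports.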
In this case, the function prologue and epilogue for adduntil aren't really required, but why not? (If they cause a problem for you, remove them; I just wanted to show them off.)
Please upvote if you liked it, it helps more people see the tutorial :-).
@programmeruser I think this tutorial is pretty long already, imo, cycle squeezing would be rather cutting a tutorial in very small pieces, but @Waku made a pretty lengthy tutorial lol! :D
x86 Assembly Tutorial Part 2 (again, its big i promise..)
More parts coming soon!