This page was last updated on March 13th, 2020 (UTC).
28 - Hello DOS Assembly
Let's start off with a nice proper "hello world" for DOS.
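.intel_syntax noprefix
.section .text
.global _main
_main: pushd offset MSG   # push the address of the string (the only parameter)
call _puts                # call the C library's puts
add esp, 4                # throw away the 4 bytes we pushed
xor eax, eax              # eax = 0, our return value
ret                       # return to whoever called _main
.section .data
MSG: .asciz "meow"        # the letters, followed by a 0 byte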
OK, so what's going on here is quite simple. We have a new section (".data"), which is where we want to put our data. It's not absolutely necessary to make a new section, but it's good practice. ".global" marks a label so that it gets recorded in the object file in a way that other object files can refer to it. The C library and its startup code (libc.a and crt0.o in DJGPP) get "linked" in by LD by default for most of these projects, and the startup code expects a "main" symbol to exist. For DJGPP, symbols shared with the C library get a prefix of "_". You won't see that on every platform (ELF systems such as Linux don't do it), and it's pretty simple to deal with, but it's something to keep in mind.
"pushd" is an instruction that subtracts the size of the operand (4 bytes) from esp (the stack pointer register) and then acts as a "mov" to the memory location esp now points at. Other "push" instructions of different sizes also exist, including an ambiguous "push" instruction whose size is automatically determined by the size of the operand. I'll explain in more detail when we get to C, but for now, understand that C functions receive parameters via offsets of esp, pushed in reverse order (when the function starts, the first param ends up being at ESP+4, the second at ESP+8, etc., in the 32-bit mode of x86 CPUs). Now, push has an equivalent "pop" instruction, which basically undoes the push (and stores the popped data in the destination operand), but we're not using it. That's why we "add esp, 4" after the call instead: it throws away the parameter we pushed.
The "offset" is required by the current setup (but not all setups) to say "I want the address of MSG, not what's stored there." Without it, the assembler treats MSG as a memory operand, so the first 4 bytes of the string get pushed instead of its address. "puts" would then treat those bytes as a pointer, and you'd most likely get a memory protection fault (error), because you'd be trying to access memory the OS thinks you shouldn't have access to (memory protection is there to keep programs from messing with the OS by accident).
"call" basically acts as a "push" of the address of the first instruction after the call, followed by a "jmp" to the operand. The code at the specified location is presumed to end with a "ret" instruction, which basically pops that address back into "eip" (the instruction pointer), so in this case execution picks up again at "add esp, 4".
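If you already read a bit of C, here's a rough model of what push, pop, call, and ret do to esp and eip. It's only a sketch: the "stack" is a plain array, and every name in it (ToyCpu, toy_pushd, toy_call, and so on) is made up for illustration, not something DJGPP or the CPU provides.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { STACK_BYTES = 64 };  /* arbitrary size for the toy stack */

typedef struct {
    uint8_t  mem[STACK_BYTES]; /* pretend stack memory */
    uint32_t esp;              /* index into mem; the stack grows downward */
    uint32_t eip;              /* pretend instruction pointer */
} ToyCpu;

void toy_init(ToyCpu *cpu, uint32_t eip)
{
    memset(cpu->mem, 0, sizeof cpu->mem);
    cpu->esp = STACK_BYTES;    /* empty stack */
    cpu->eip = eip;
}

/* pushd: subtract 4 from esp, then store the operand at [esp] */
void toy_pushd(ToyCpu *cpu, uint32_t value)
{
    assert(cpu->esp >= 4);
    cpu->esp -= 4;
    memcpy(&cpu->mem[cpu->esp], &value, sizeof value);
}

/* pop: load the dword at [esp], then add 4 to esp */
uint32_t toy_popd(ToyCpu *cpu)
{
    uint32_t value;
    assert(cpu->esp + 4 <= STACK_BYTES);
    memcpy(&value, &cpu->mem[cpu->esp], sizeof value);
    cpu->esp += 4;
    return value;
}

/* call: push the address of the instruction after the call, then jump */
void toy_call(ToyCpu *cpu, uint32_t target, uint32_t return_address)
{
    toy_pushd(cpu, return_address);
    cpu->eip = target;
}

/* ret: pop the saved return address into eip */
void toy_ret(ToyCpu *cpu)
{
    cpu->eip = toy_popd(cpu);
}

/* Read the dword at [esp + offset] without moving esp,
   the way a called function looks at its parameters */
uint32_t toy_peek(const ToyCpu *cpu, uint32_t offset)
{
    uint32_t value;
    assert(cpu->esp + offset + 4 <= STACK_BYTES);
    memcpy(&value, &cpu->mem[cpu->esp + offset], sizeof value);
    return value;
}
```

If you push a second parameter, then a first parameter, and then toy_call, toy_peek(cpu, 0) gives the return address, toy_peek(cpu, 4) gives the first parameter, and toy_peek(cpu, 8) gives the second. After toy_ret, esp still sits below the parameters, which is the job "add esp, 4" does in the real program.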
Effectively, what this means is that "_main" gets called by the C library, which holds the real "entrypoint" or "start of the program." Our program gets "wrapped" by the C library, so it can go into 32-bit mode for us and do a few other really tedious things so we don't have to. Odds are, the actual entrypoint is called "_start", so if you're curious what all it does, feel free to go looking. The MZ executable format (which I doubt you want to learn unless you plan on using DOS forever) is handled by LD, which is invoked by GCC, hiding quite a bit of boilerplate code from us. "puts" is what C calls a "function," and it prints out a "null-terminated string." That is to say, it prints out letters one at a time until it finds a byte with the value of 0 (which is why .asciz exists: it puts a 0 at the end for you, since this has become the standard convention).
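To make the "null-terminated" idea concrete, here's a small C sketch of the loop a puts-like function runs. It isn't the C library's actual code; toy_puts and TOY_MSG are names I made up for the example, and the only real library call in it is putchar.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* What `MSG: .asciz "meow"` lays out in memory: four letters and a 0 byte */
const char TOY_MSG[] = { 'm', 'e', 'o', 'w', 0 };

/* Print bytes one at a time until the 0 terminator is reached.
   Returns how many letters were printed (the terminator is not counted). */
size_t toy_puts(const char *s)
{
    size_t count = 0;
    assert(s != NULL);
    while (s[count] != 0) {
        putchar((unsigned char)s[count]);
        count++;
    }
    putchar('\n');  /* the standard puts also writes a newline at the end */
    return count;
}
```

Calling toy_puts(TOY_MSG) prints the four letters and stops at the fifth byte. If that 0 were missing, the loop would just keep walking through whatever memory comes next.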
©Copyright 2010-2023. All rights reserved.