MSIL is the lowest-level .NET language. All languages targeting the .NET Framework generate MSIL. As a C# developer, you will probably never write MSIL code directly, but you will often look at the MSIL disassembly of your application for answers to dependency, versioning and optimization questions. To start disassembling a C# (or any other .NET) application, run ildasm.exe from either the VS.NET command prompt or the .NET Framework command prompt. On my computer, ildasm.exe is at C:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\Bin.
To get you excited about MSIL, let me ask you a question: where does the term boxing come from? Is there such a keyword in C#? What about VB.NET? Why do we call value-to-object conversion boxing and the opposite conversion unboxing? Because MSIL uses the box and unbox instructions to perform these conversions.
So, let's study MSIL. The simplest MSIL program is one which does nothing and has no data:
.assembly hello{}
.class hello
{
.method static public void main() il managed
{
.entrypoint
ret
}
}
MSIL is an object-oriented assembly language. As such, it retains the object-oriented constructs of the source languages, e.g. private and public methods. Every MSIL application needs to have an entry point. Any method (not just Main) can serve as the entry point as long as it is marked with the .entrypoint directive. MSIL programs are assembled with the IL assembler, ilasm.exe, which is located in the same directory as the disassembler. Here is a more complicated program that, once again, does not do anything but has some data:
.assembly hello{}
.class hello{
.method static public void main() il managed{
.entrypoint
.locals( string V_0)
ldstr "hi there"
stloc.0
ret
}
}
This program has a statement .locals(string V_0), which declares a single local variable of type string. The ldstr instruction pushes a reference to "hi there" onto the evaluation stack, and stloc.0 pops it from the stack and stores it in local variable 0. Since you are working in a managed environment, you cannot leave data on the stack when the method returns: a void method has to reach ret with an empty evaluation stack, so every value you push has to be popped again. Every program also needs to start with a declaration of the assembly it belongs to. In our case, we choose the assembly name to be the same as the class name.
The IL assembler is very forgiving, and you can easily crash an MSIL application by inserting invalid instructions into the code. For example, try adding ldstr "hi there" after the ret instruction above.
Let's take a look at a slightly more complicated example which still doesn't do anything useful.
//allocating and deallocating multiple variables on the stack
.assembly hello{}
.class hello
{
.method static public void main() il managed
{
.maxstack 2
.entrypoint
.locals( string V_0, string V_1) //we have two local variables now
ldstr "hi there" //push this string on the stack
ldstr "bye here" //push the second string on the stack
stloc.0 //pop the top string ("bye here") from the stack and store it in local variable 0
//you do not need to worry about deallocating local variables - it is done by the runtime
stloc.0 //pop the remaining string ("hi there") and store it in the same local variable ("bye here" is overwritten)
ret
}
}
There is a new element in this program: the .maxstack declaration. We use .maxstack to declare the maximum number of items we plan to have on the evaluation stack at any given time. If the declaration is omitted, ilasm assumes a default of 8, so small methods like these can safely leave it out.
Here is a hello world program written in MSIL:
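To make the stack depth concrete, here is a minimal sketch that pushes three integers before consuming any of them, so the stack holds three items at its deepest point. The assembly name hello and the method name main are simply carried over from the examples above, and the stack comments are my own annotations.

```il
//sum 1 + 2 + 3 with all three operands pushed first
.assembly extern mscorlib {}
.assembly hello {}
.method static public void main() il managed
{
.entrypoint
.maxstack 3
ldc.i4.1 //stack: 1
ldc.i4.2 //stack: 1, 2
ldc.i4.3 //stack: 1, 2, 3 (deepest point: three items)
add //stack: 1, 5
add //stack: 6
call void [mscorlib]System.Console::WriteLine(int32) //stack: empty
ret
}
```

Reordering the instructions (push 1, push 2, add, push 3, add) would compute the same sum with at most two items on the stack.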
//compile with ilasm
.assembly hello {}
.method static public void main() il managed
{
.entrypoint
ldstr "Hello MS IL!"
call void [mscorlib]System.Console::WriteLine(string)
ret
}
All MSIL directives start with a period. Any MSIL component (except a module) is an assembly. Ilasm allows classless assemblies (see the code above). However, classless assemblies are not compatible with assemblies generated from higher-level .NET languages (e.g. C# and VB.NET).
Together, .entrypoint and ret play the role of main(){ ... }.
ldstr pushes a string reference onto the evaluation stack, and the call to WriteLine picks it up from there. WriteLine consumes its argument before it displays "Hello MS IL!", so we do not need to pop anything from the stack. We will get a runtime error if we do.
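The situation is different when a called method returns a value that we do not use. The following sketch, which is my own illustration rather than part of the original series of examples, calls Console::ReadLine and discards the returned string with the pop instruction so that the stack is empty again at ret.

```il
//wait for the user to press Enter
.assembly extern mscorlib {}
.assembly hello {}
.method static public void main() il managed
{
.entrypoint
ldstr "Press Enter to quit"
call void [mscorlib]System.Console::WriteLine(string) //consumes the string, returns nothing
call string [mscorlib]System.Console::ReadLine() //pushes the string it returns
pop //discard the unused return value so the stack is empty at ret
ret
}
```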
Here is a program which illustrates how to store data into local variables and how to overwrite them:
.assembly hello{}
.assembly extern mscorlib {}
.class hello
{
.method static public void main() il managed{
.maxstack 2
.entrypoint
.locals(string V_0, string V_1)
//we have two local variables now
ldstr "hi there" //push this string on the stack
ldstr "bye here" //push the second string on the stack
stloc.0 //pop the top string ("bye here") from the stack and store it in local variable 0
//you do not need to worry about deallocating local variables - it is done by the runtime
stloc.0 //pop the remaining string ("hi there") and store it in the same local variable ("bye here" is overwritten)
ldloc.0 //push the value of local variable 0 ("hi there") back on the stack
call void [mscorlib]System.Console::WriteLine(string)
ret
}
}
It is always a lot of fun to manipulate integers with assembly language.
//print number 2
.assembly hello {}
.method public static void Main() il managed
{
.entrypoint
.locals(int32 V_0)
ldc.i4.2
stloc.0
ldloc.0
call void [mscorlib]System.Console::WriteLine(int32)
ret
}
The next program adds two integers:
//add two numbers 1 and 3
.assembly hello {}
.assembly extern mscorlib {}
.class public hello
{
.method static public void main()
{
.entrypoint
.maxstack 2
.locals(int32 V_0, int32 V_1) //declare two local variables
ldc.i4.1 //put number 1 on the stack
ldc.i4.3 //put number 3 on the stack
stloc.0 //pop 3 (the top of the stack) and store it in local variable 0
ldloc.0 //push the value of local variable 0 (3) back on the stack
add //pop both values (1 and 3), add them and push the result (4)
//you should not try to pop the operands yourself; add consumes them
call void [mscorlib]System.Console::WriteLine(int32)
ret
}
}
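Arithmetic instructions can be chained, because each one leaves its result on the stack for the next one. As a small sketch of that idea, the program below computes (1 + 3) * 2 without any local variables. The mul instruction and the stack comments are my additions; the names hello and main are reused from the examples above.

```il
//compute (1 + 3) * 2 on the stack
.assembly extern mscorlib {}
.assembly hello {}
.method static public void main() il managed
{
.entrypoint
.maxstack 2
ldc.i4.1 //stack: 1
ldc.i4.3 //stack: 1, 3
add //stack: 4
ldc.i4.2 //stack: 4, 2
mul //stack: 8
call void [mscorlib]System.Console::WriteLine(int32) //prints 8
ret
}
```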
It is sometimes very useful to have an explicit conversion between a value and an object. This is done with the box instruction. The example below outputs an object value, so we need to explicitly convert the value on the stack to a boxed object.
.assembly hello{}
.method public static void Main() il managed
{
.entrypoint
ldc.i4.s 100 //put 100 on the stack
box [mscorlib]System.Int32 //convert it to an object in place
call void [mscorlib]System.Console::WriteLine(object) //print the value of the object
ret
}
The example above was a bit contrived to keep things simple. Here is a more realistic example:
.assembly hello{}
.method public static void Main() il managed
{
.entrypoint
.maxstack 2
.locals (int32 V_0)
ldstr "Please enter your age:"
call void [mscorlib]System.Console::WriteLine(string)
call string [mscorlib]System.Console::ReadLine()
call int32 [mscorlib]System.Int32::Parse(string)
stloc.0
ldstr "You are {0} years old "
ldloc.0
box [mscorlib]System.Int32 //convert int32 to an object on the stack
call void [mscorlib]System.Console::WriteLine(string, object)
ret
}
Note that there is no System.Console::WriteLine(string, int32) method, therefore the int32 needs to be boxed into an object to match WriteLine(string, object) and allow output to the console.
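The opposite conversion, unboxing, was mentioned at the beginning but is not shown in the examples above. Here is a minimal sketch of it, assuming we box 100, keep the object in a local variable, and then turn it back into an int32. The unbox instruction leaves a pointer to the value inside the box, so ldind.i4 is used to load the actual integer.

```il
//box 100, then unbox it back to an int32
.assembly extern mscorlib {}
.assembly hello {}
.method public static void Main() il managed
{
.entrypoint
.maxstack 1
.locals (object V_0)
ldc.i4.s 100 //put 100 on the stack
box [mscorlib]System.Int32 //replace it with a reference to a boxed copy
stloc.0 //store the object reference in local variable 0
ldloc.0 //push the object reference again
unbox [mscorlib]System.Int32 //replace the reference with a pointer to the int32 inside the box
ldind.i4 //load the int32 that the pointer refers to
call void [mscorlib]System.Console::WriteLine(int32) //prints 100
ret
}
```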
Exercises: