How to combine assembly with the C++ skills
Introduction
How access to the structures
Accessing to the fields of structures declared in our source
A trick to get easily the path of all subfields of a structure
Accessing to structures whereby a pointer
How to use the offsets of a structure
What is a macro?
Accessing to the StrGlobaliTomb4 structure whereby the MOVE_OFFSET() macro
The MOVE_VALUE() macro
Understanding the Pointers
What macro is better using?
How to get the field paths for structures different than the global structure?
MOVE_OFFSET_FIELD(): the generic macro to access to any structure
How to manage a field path when there is another pointer in the middle
The intermediate computations to access to middle pointers or indexing of vectors
Macros to access to the fields of most important structures: Moveable Items, Statics and Room structures
The GET_ROOM_OFFSET() macro
The GET_ITEM_OFFSET() macro
The GET_STATIC_OFFSET() macro
How to get the pair of indices to access to static items
The CONVERT_STATIC_INDEX() macro
How to call a C function from assembly
How to call the C functions of Tomb Raider program
How to examine "Functions.h" and "DefTomb4Funct.h" sources for accessing to Tomb Raider functions
How to declare in our source a Tomb Raider function we discovered
How to call a tomb raider function from assembly
How to convert the C++ types in Assembly types?
Warning about little arguments
Troubles about "bool" or "BOOL" types
What is the "void" argument?
How to preserve our register when we call a C++ function
Using PUSHAD / POPAD to preserve all registers
Using SAVE_MY_REGISTERS() and RESTORE_MY_REGISTERS() macros to preserve registers
That problem when we save/restore the EAX register
How to preserve the registers for the caller
Introduction
There are a lot of advanced features of the C++ language that could help us in our job.
With C++ many things are easy: reading or writing a file on disk, performing floating-point operations, managing complicated arrays of structures, showing message boxes and windows, and performing advanced graphic functions on the screen.
Most of the above operations are very complicated using only assembly language.
The good news is that we can use all the above C++ features from assembly with only a very little knowledge of C++ programming.
The goal of this document is to give that limited knowledge, creating a C-to-assembly bridge that takes advantage of the most advanced features of the C++ compiler.
How access to the structures
We have already seen the declaration of a structure in the First Steps in Assembly document; now we'll try to explain how to access these structures from assembly language.
First of all, it's important to understand why access to structures matters so much...
Most of the data managed by the tomb4 engine is based on structures.
The data about a moveable, like Lara or an enemy, is a structure.
The data about the position and status of static objects is stored in structures.
The data about textures, collisions, lights and sounds is stored in structures, too.
The data of the trng library is stored in structures as well...
For the above reasons it's really not possible to skip this topic.
Accessing to the fields of structures declared in our source
We could begin with a simple structure:
typedef struct StrPosition {
int OrgX;
int OrgY;
int OrgZ;
}stuffPosition;
Our goal is to discover how to access a single field, like OrgZ, knowing where the StrPosition structure begins.
When we declare this structure in our own source, this is easy enough:
StrPosition MyData;
mov eax, dword ptr [MyData.OrgZ]
The only (little) problem in the above case is that we have to read the structure prototype (what you see after the "typedef struct" statement), find the precise field name (in our example "OrgZ") and copy it after the "." (dot).
This operation is easy in the previous example because the StrPosition structure is very simple, but sometimes the situation is more tedious.
It's necessary to know that a structure very often contains other structures inside it.
For example, keeping in mind the previous StrPosition structure, we can see another structure like this:
typedef struct StrRobot {
DWORD Flags;
StrPosition StartPosition;
StrPosition EndPosition;
int SpeedX;
int SpeedZ;
}RobotFields;
In the above StrRobot structure we can see declarations of other StrPosition structures. This situation is very common (unfortunately).
If we have to access the OrgZ field of the StartPosition of StrRobot, we'll have to type this instruction:
StrRobot MyRobot;
mov eax, dword ptr [MyRobot.StartPosition.OrgZ]
This instruction doesn't seem so complicated either, but in this case we had to read both structure prototypes: that of StrRobot (to find the "StartPosition" name) and then that of StrPosition (to find the precise name of OrgZ).
There are structures hugely more complicated than the one above, with a lot of structures nested in others.
In these situations it becomes really tiring to build the whole list of fields and subfields.
Fortunately there is a trick to speed up this operation...
A trick to get easily the path of all subfields of a structure
So far we have been wondering how to type the full path of fields in assembly, but how does it work in the C++ language?
To try it, just type the name of our MyRobot structure inside a C function.
We can also create an empty C function for this experiment:
void MyTestingCProc(void)
{
StrRobot MyRobot;
_
}
Now move the cursor to where you see the _ sign, and type (slowly, so it's more fun :) the name: MyRobot
OK, now type the dot "." on the keyboard, and this will happen:
The list of the first-level fields of the MyRobot structure will be shown.
Now we can use the up/down arrow keys to select the wished field, in our example the "StartPosition" field, and then hit the dot "." key again.
Now we'll see the new list of subfields.
Now we can move up/down to select the wished field, OrgZ, and then hit the SPACE key.
We'll have gotten our final result (MyRobot.StartPosition.OrgZ) without going to read the source where the prototypes of the StrRobot and StrPosition structures were typed.
This auto-enumerate feature is marvellous, and absolutely necessary when we work with huge structures with dozens of fields and many sub-structures contained in them.
It's a pity that this auto-enumeration of fields does not work well in assembly procedures; anyway, we can perform this operation inside a C function, then copy the full path (MyRobot.StartPosition.OrgZ) and paste it into our assembly instruction:
mov eax, dword ptr [MyRobot.StartPosition.OrgZ]
Note: the behaviour of the auto-enumerate feature in assembly is mysterious...
Probably there is a bug in the Microsoft compiler, because in theory the auto-enumeration should also work in assembly code, but very often it doesn't.
I noticed that when you use a structure variable with a macro (enclosing it in the arguments, between round parentheses), it works.
This is good news, since we'll use macros very often (see What is a macro? ).
Accessing to structures whereby a pointer
When the structure has not been declared in our source and we only have the address where it begins, we have to use a pointer to access it.
In theory it is not a problem to reach it in assembly: just copy this address into an extended register, like ESI, and then we can read or write the values of this structure.
The problem that remains is that of the field and subfield names, and in this case it's more complicated because we can't use the previous trick, since this code is not valid:
;we suppose to have the address of our StrRobot structure
;in esi register
; this instruction is not correct
mov edx, StartPosition.OrgZ[esi]
; and neither this
mov edx, MyRobot.StartPosition[esi]
The first instruction will give an error because StartPosition will not be recognized by the compiler, since it's not a declared variable but only an internal field.
This syntax could work better:
mov edx, [esi].StartPosition.OrgZ
In this case the compiler understands that ".StartPosition.OrgZ" are offsets of a structure. This method doesn't always work, however: when a field name has also been used in another structure, the compiler will give an error:
mov ecx, [eax].CordX
error C2410: 'CordX' : ambiguous member name in 'second operand'
As for the second instruction:
mov edx, MyRobot.StartPosition[esi]
It will be accepted, but its execution will give a wrong result, since it adds to the address of our structure the address of another structure (MyRobot) declared in our source. A big mess!
In C, when we have a pointer, we don't use the "." (dot) but the pair of characters "->".
See the following image.
We can see that, on the left, we have to use the "->" pointer operator, while after StartPosition we use "." again because StartPosition is not a pointer.
Anyway, at the end we get this path in our C function:
pRobot->StartPosition.OrgZ
Unfortunately, this path isn't useful for our goal either, because we can't use the "->" operator in assembly (it doesn't exist in the assembler), and the second part, "StartPosition.OrgZ", doesn't work either, as we have just said.
How to use the offsets of a structure
When we have the address of a structure and we wish to access its single fields, we can use information that is present in the structure prototype itself.
Let's take another look at our Robot structure:
typedef struct StrRobot {
DWORD Flags;
StrPosition StartPosition;
StrPosition EndPosition;
int SpeedX;
int SpeedZ;
}RobotFields;
We know how to use the "StrRobot" definition: to allocate the memory that stores this structure, so that we can write and read values in its fields.
StrRobot MyRobot;
MyRobot.SpeedX =34;
MyRobot.StartPosition.OrgX = 1440;
But what is the meaning of the "RobotFields" name at the end of the structure prototype?
It is used as a reference for the offsets of the single fields, i.e. the distance of each field from position 0, the beginning of the structure.
For example, accessing the fields of RobotFields with the "." operator gives the following values:
RobotFields = 0
RobotFields.Flags = 0
RobotFields.StartPosition = 4
RobotFields.EndPosition = 16
RobotFields.SpeedX = 28
RobotFields.SpeedZ = 32
Those growing numbers are the distance of each field from 0.
For example, .StartPosition begins at offset 4 because before it there is only the .Flags variable, which is a "DWORD" type (4 bytes). Each StrPosition holds three int fields, so it takes 12 bytes.
This feature is ideal for working with pointers to structures (and it's not a coincidence, of course ;-)
So when we have the address of our Robot structure in a pointer (a register in this case), we'll be able to access its fields using the RobotFields offsets:
;we suppose to have in [esi] register the address of
;a StrRobot structure
mov ecx, RobotFields.Flags[esi]
mov dword ptr [esi+RobotFields.StartPosition.OrgX], 100
In the above way, it will work!
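The offsets can also be checked from the C++ side. The following is only a sketch: it assumes that DWORD and int are both 32-bit values (as they are in a 32-bit Windows build), and ReadIntAt() and OffsetAccessWorks() are hypothetical helpers invented for this illustration, not part of trng.

```cpp
#include <cassert>
#include <cstddef>   // offsetof
#include <cstdint>   // fixed-width integer types
#include <cstring>   // std::memcpy

typedef std::uint32_t DWORD;   // assumption: DWORD is an unsigned 32-bit value

typedef struct StrPosition {
    std::int32_t OrgX;
    std::int32_t OrgY;
    std::int32_t OrgZ;
} stuffPosition;

typedef struct StrRobot {
    DWORD Flags;
    StrPosition StartPosition;
    StrPosition EndPosition;
    std::int32_t SpeedX;
    std::int32_t SpeedZ;
} RobotFields;

// The compiler confirms the distance of each field from the start of the structure.
static_assert(offsetof(StrRobot, Flags) == 0, "Flags is the first field");
static_assert(offsetof(StrRobot, StartPosition) == 4, "after the 4 bytes of Flags");
static_assert(offsetof(StrRobot, EndPosition) == 16, "4 + 12 bytes of StartPosition");
static_assert(offsetof(StrRobot, SpeedX) == 28, "16 + 12 bytes of EndPosition");
static_assert(offsetof(StrRobot, SpeedZ) == 32, "28 + 4 bytes of SpeedX");

// Hypothetical helper: read a 32-bit value at base + offset,
// the C++ counterpart of "mov ecx, dword ptr [esi + offset]".
inline std::int32_t ReadIntAt(const void* base, std::size_t offset) {
    std::int32_t value;
    std::memcpy(&value, static_cast<const char*>(base) + offset, sizeof value);
    return value;
}

// Writes a field by name, then reads it back knowing only the base address and the offset.
inline bool OffsetAccessWorks() {
    StrRobot robot = {};
    robot.StartPosition.OrgX = 1440;
    const std::size_t offset =
        offsetof(StrRobot, StartPosition) + offsetof(StrPosition, OrgX);
    return ReadIntAt(&robot, offset) == 1440;
}
```

The offset of a nested field is simply the sum of the offsets along the path, which is what the assembler computes for RobotFields.StartPosition.OrgX.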
What is a macro?
I start by saying that it is not necessary for you to understand macros or how to create one.
You can use them as if they were common assembly instructions, avoiding other complications.
Here I briefly explain the core of macros only for inquiring people.
You create a macro using the #define directive we have already seen in the First steps in assembly document.
Really, on that page we spoke about using #define to define constant values with mnemonic names, but we can also use this directive to create a macro.
When we set arguments in a #define directive we create a macro with arguments, a very powerful tool to customize our assembly instructions.
In practice, we could now rewrite all instructions, giving them new names and syntax.
Anyway, we'll use macros for more substantial reasons: as a shortcut to get a big piece of assembly code in our final project, in spite of having typed only a few characters (the name of the macro and its arguments).
An example of a macro with arguments is this:
#define MOVE_ARG(Register, ArgNumber) \
__asm mov Register, dword ptr [ESP+4*ArgNumber]
Looking at the above macro we can discover some info about its syntax:
When there is no space between the name and the opening parenthesis in the first row, this is not a common #define but a macro.
When the first row (and any further ones) ends with the "\" sign, the macro continues on the next row: it can generate several rows of code even though, when we use it, it always takes a single row.
When we have a pair of round parentheses, this is a macro with arguments, and the names typed inside the parentheses are the arguments, separated by commas.
The above macro will be used in this way:
MOVE_ARG(ecx, 2)
;then we can use the ecx register where it will be copied
;the wished value: the second argument
cmp ecx, 4
jz GoIf4
This macro is used to read an argument from the stack (first=1, second=2, third=3, etc.).
We should use it only in the first rows of a procedure that has been called in C style, with the arguments pushed onto the stack.
It's interesting to see how the compiler will convert that macro in the above code.
It will become:
__asm mov ecx, dword ptr [ESP+4*2]
Looking at this and comparing it with the previous declaration of the macro, we begin to understand how it works.
The compiler simply replaces the names supplied in the definition (Register, ArgNumber) with the arguments given in the usage (ecx, 2), and it uses the resulting text to create the real code.
As for the "__asm" directive, it's necessary to remind the compiler that we are using assembly syntax, otherwise it will give an error.
As a last illustration of the above macro, we can show how to use it to extract three arguments from the stack and copy them into the eax, ebx and ecx registers:
MOVE_ARG(eax,1)
MOVE_ARG(ebx,2)
MOVE_ARG(ecx,3)
It could seem like a real assembly instruction but it isn't, of course. I invented it only to make a specific, very common operation easier.
If we don't use that macro, we can get the same result by typing real assembly instructions in this way:
mov eax, dword ptr [esp+4]
mov ebx, dword ptr [esp+8]
mov ecx, dword ptr [esp+12]
That is a bit more complicated: we have to type more text and remember (and compute) that each argument requires 4 bytes, so to get the third argument we have to type [esp+12].
Isn't it easier to type "MOVE_ARG(ecx,3)", where the 3 stands for the third argument?
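To watch the text replacement at work without the assembler, we can let the preprocessor build the instruction as a string. This is only a sketch: MOVE_ARG_TEXT, ARG_OFFSET and ExpandMoveArg are hypothetical names invented here, and the "#" operator (which turns a macro argument into text) is used in place of the real "__asm" code.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for MOVE_ARG: instead of emitting the instruction,
// it produces the TEXT of the instruction after the arguments are substituted.
// The "\" at the end of the first row continues the macro on the next row.
#define MOVE_ARG_TEXT(Register, ArgNumber) \
    "mov " #Register ", dword ptr [ESP+4*" #ArgNumber "]"

// Hypothetical macro for the arithmetic alone: each argument takes 4 bytes.
#define ARG_OFFSET(ArgNumber) (4 * (ArgNumber))

// The third argument is found at [esp+12], as in the hand-written version.
static_assert(ARG_OFFSET(1) == 4, "first argument at esp+4");
static_assert(ARG_OFFSET(3) == 12, "third argument at esp+12");

// Returns the text that the usage MOVE_ARG(ecx, 2) turns into.
inline std::string ExpandMoveArg() {
    return MOVE_ARG_TEXT(ecx, 2);
}
```

Calling ExpandMoveArg() gives the same text we saw in the expansion of MOVE_ARG(ecx, 2), without the leading "__asm".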
Accessing to the StrGlobaliTomb4 structure whereby the MOVE_OFFSET() macro
We can access the subfields of the main structure of trng, the "StrGlobaliTomb4" structure (whose beginning address is in the [Trng.pGlobTomb4] pointer variable), by means of the MOVE_OFFSET() macro.
This structure is huge: it contains almost all the structures used by trng and also the pointers to all tomb4 structures.
To locate a single value in this huge structure you can act in different ways according to the language you are using.
If you use the C++ language it's easy: you can use the auto-list function already described, getting for example this result:
int Blink;
// read the time for blinking bar in the damage rooms
Blink = Trng.pGlobTomb4->DamageRoom.BlinkTime;
In the above example we have all the help of the auto-list feature. Just begin with "Trng", type the "." and then choose the field, etc.
The only thing to take care of is to remember to use "." or "->" according to whether the field on the left is a declared variable or a pointer. Pointer names begin with a lowercase "p" or with "ptr".
When we see "vet" it means "vector", and you have to use square brackets to set the index of the element you wish to read from that vector.
All of the above is about the C++ language.
In assembly we'll have to use the MOVE_OFFSET() macro in this way:
int Blink;
// read the time for blinking bar in the damage rooms
MOVE_OFFSET(eax, GlobalFields.DamageRoom.BlinkTime)
mov ecx, dword ptr [eax]
mov dword ptr [Blink], ecx
This macro computes the offset of the specific field you typed, using the GlobalFields offsets of the StrGlobaliTomb4 structure, and then adds this value to the real address of the structure, stored in the Trng.pGlobTomb4 pointer.
Remark: As already explained in the A trick to get easily the path of all subfields of a structure chapter, in case you don't see the auto-list of members working from assembly, it's necessary to type the macro temporarily inside a C function to enable the auto-list feature and select the fields and subfields of the structure. Then we'll copy the macro to our assembly procedure.
Anyway, the member auto-list feature works (partially, at the first level of "." fields) when you use it in the macro arguments.
The MOVE_VALUE() macro
MOVE_VALUE() is very similar to MOVE_OFFSET(), but in this case we immediately get the wished value in the supplied register. With the MOVE_OFFSET() macro you get the address of the wished field, and then you use that address to read or write that field; with the MOVE_VALUE() macro you can only read the given field, but the reading operation is performed immediately.
Example of MOVE_VALUE() usage:
int Blink;
MOVE_VALUE(ecx, GlobalFields.DamageRoom.BlinkTime)
mov dword ptr [Blink], ecx
About the MOVE_VALUE() macro, it's important to choose a register of the right size (the first argument of the macro), because this size has to be the same as that of the field you are reading.
For example, if we wish to read a BYTE field we'll use a BYTE register, but the matter is a bit more complicated when the field is a pointer:
Understanding the Pointers
BYTE MyFlags;
MOVE_VALUE(ecx, GlobalFields.Adr.pFlagsScriptDat)
mov al, byte ptr [ecx]
mov byte ptr [MyFlags], al
Look carefully at the above rows: the field we read is "pFlagsScriptDat", which contains info for the current level stored in the script.dat file. Anyway, we now have to understand the type of these values.
In the prototype of the global structure (here we call "global" the main structure of our plugin, "StrGlobaliTomb4") that you find in the Tomb_NextGeneration.h source, our field has been declared in the following way:
BYTE *pFlagsScriptDat;
This is a pointer to a BYTE value.
Anyway, we have to understand that in this 32-bit program a pointer ALWAYS has the size of a DWORD (4 bytes), because it contains a memory address. This memory address can then point to a BYTE, a short, another structure, chars, etc., but a pointer is always a DWORD.
For this reason we typed the macro in this way:
MOVE_VALUE(ecx, GlobalFields.Adr.pFlagsScriptDat)
mov al, byte ptr [ecx]
mov byte ptr [MyFlags], al
We copied the pointer into the ecx register (a dword register), and then we used the address saved in ecx to read the byte.
To understand better, we can compare the MOVE_VALUE() macro with the MOVE_OFFSET() macro used for the same goal.
Using the MOVE_OFFSET() macro to get the value of the flags for the current level (FlagsScriptDat), we should type the code in this way:
BYTE MyFlags;
MOVE_OFFSET(esi, GlobalFields.Adr.pFlagsScriptDat)
;in esi now there is the address of pFlagsScriptDat
;we have to read the pointer stored in the pFlagsScriptDat
;field
mov ecx, dword ptr [esi]
;now in ecx we have the address where the byte
;of FlagsScriptDat is
mov al, byte ptr [ecx]
;now in al there is the value of FlagsScriptDat
mov byte ptr [MyFlags], al
I know that pointers are a somewhat vague concept at the beginning.
We can try to understand better with an image showing the addresses and memory cells involved.
In the above image the black boxes are the memory cells where we can store numbers.
The numbers outside the boxes (on the left) are the addresses of those memory cells.
Now we can suppose that the memory cell with address 412350 is the pFlagsScriptDat field.
Inside that memory cell there is the dword number 420580; this is a number, but it will be used as the address of another memory cell, where we'll (finally) find the value we wished: the flags of the current level.
In the above example this value is the number 79.
Using the above image as an example, we can show how the values of the registers change in our code.
With MOVE_OFFSET() macro...
MOVE_OFFSET(esi, GlobalFields.Adr.pFlagsScriptDat)
;in this moment esi = 412350 (see above image)
mov ecx, dword ptr [esi]
;now ecx= 420580
mov al, byte ptr [ecx]
mov byte ptr [MyFlags], al
;now MyFlags = 79
or with MOVE_VALUE() macro...
MOVE_VALUE(ecx, GlobalFields.Adr.pFlagsScriptDat)
;now ecx=420580
mov al, byte ptr [ecx]
mov byte ptr [MyFlags], al
;now MyFlags = 79
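The same two steps can be written in C++, where the difference between "the address of the field" and "the value stored in the field" is easier to see. This is a sketch under stated assumptions: MiniGlobal and MiniAdr are hypothetical miniatures that keep only the pFlagsScriptDat field, and the three functions are invented names that mimic what MOVE_OFFSET() and MOVE_VALUE() give us; they are not the trng macros.

```cpp
#include <cassert>
#include <cstdint>

typedef unsigned char BYTE;

// Hypothetical miniature of the global structure: only the field we discuss.
struct MiniAdr {
    BYTE* pFlagsScriptDat;   // a pointer: it stores the address of a BYTE
};
struct MiniGlobal {
    std::int32_t Unused;     // hypothetical filler before the Adr sub-structure
    MiniAdr Adr;
};

// A pointer has the same size whatever it points to (4 bytes in a 32-bit program).
static_assert(sizeof(BYTE*) == sizeof(void*), "pointer size does not depend on the pointed type");

// Like MOVE_OFFSET(): the address OF the field (the cell 412350 of the image).
inline BYTE** AddressOfField(MiniGlobal* global) {
    return &global->Adr.pFlagsScriptDat;
}

// Like MOVE_VALUE(): the value stored IN the field, itself an address (420580 in the image).
inline BYTE* ValueOfField(MiniGlobal* global) {
    return *AddressOfField(global);
}

// The last step, "mov al, byte ptr [ecx]": read the byte the pointer points to (79 in the image).
inline BYTE ReadFlags(MiniGlobal* global) {
    return *ValueOfField(global);
}
```

Since ValueOfField() returns a pointer, it can be used to write the pointed byte as well as to read it, which is the exception mentioned for MOVE_VALUE() when the field is a pointer.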
What macro is better using?
In many circumstances we can get the same result with both macros; anyway, we could say that:
- The MOVE_OFFSET() macro is more general purpose, since we get only the address of a field and then we can manage that value as we wish.
By contrast, MOVE_VALUE() always reads the value of some field of the global structure; for this reason you can't use MOVE_VALUE() to write a value in a field of the structure. (Note: there is an exception when you use MOVE_VALUE() to get the value of a pointer. In this case you'll be able to use that pointer both to read and to write the data it points to.)
- When you have to read single values the MOVE_VALUE() macro is briefer than MOVE_OFFSET(), since it immediately returns the wished value in the register you chose, while with MOVE_OFFSET() you'll have to add another instruction to read the value pointed to by the address in the register.
- The MOVE_VALUE() macro uses the EBX register internally, therefore you have the limitation that you can't use EBX, BX, BL or BH as the Register argument.
Anyway, the value of EBX is saved and restored by the MOVE_VALUE() macro, so you need not fear losing the value stored in it.
How to get the field paths for structures different than the global structure?
We saw that the MOVE_OFFSET() and MOVE_VALUE() macros both work to access the global structure StrGlobaliTomb4, but what happens if we wish to access a different structure?
I start by reminding you that this is a problem ONLY when you have just the address (pointer) to access this structure. This means that for all the other structures, those you declared in your source, you'll be able to access the fields easily using the trick already shown above (A trick to get easily the path of all subfields of a structure).
The situations where you'll have a pointer to some structure external to your source and different from the global structure are rare, but they exist.
It could happen in your callback procedures, where trng will pass some structures to your code by address, i.e. with pointers.
MOVE_OFFSET_FIELD(): the generic macro to access to any structure
We can now see another macro that is able to work with any structure.
The trick is to add another argument to host the address where this structure begins (i.e. the pointer to this structure).
#define MOVE_OFFSET_FIELD(Register, Address, Fields) \
__asm mov Register, Address \
__asm add Register, Fields
The meaning of the arguments:
Register: the target register (always an extended type like eax, ecx, ebx, esi, edi, ebp) where the result will be copied
Address: the address where our structure begins, i.e. its pointer
Fields: the offset reference of the structure type, something like "nameFields...."
Then we can use it with different structures: for example, the StrMeshTr4 and StrOrient structures (you find both in the "structures.h" source file).
;we have the base of StrMeshTr4 structure in ebx register
;it computes the displacement to access to its SphereRadius field
MOVE_OFFSET_FIELD(ecx, ebx, MeshTr4Fields.SphereRadius)
; now in [ecx] there will be the address that points to "SphereRadius" variable of our structure
;to read the value of SphereRadius we'll use this:
mov ax, word ptr [ecx]
;to write a value in SphereRadius we'll use this:
mov word ptr [ecx], 0
;another example, using the StrOrient structure
;we suppose to have the base of the StrOrient structure in [esi] register
MOVE_OFFSET_FIELD(eax, esi, OrientFields.OrientR)
; now in [eax] there will be the address of the "OrientR" variable of our structure
;to read the value of OrientR:
mov cx, word ptr [eax]
;to change the value in OrientR:
add word ptr [eax], 16
Remark:
Looking at the definition of the MOVE_OFFSET_FIELD() macro:
#define MOVE_OFFSET_FIELD(Register, Address, Fields) \
__asm mov Register, Address \
__asm add Register, Fields
It seems so small and easy that one may wonder whether it's necessary to use a macro instead of simply typing the original two assembly instructions to reach the same goal.
Well, obviously, you can omit the use of this macro and compute by yourself the final address of a subfield of your structure.
Anyway, besides sparing instruction rows (you type only one row instead of two), when you use a macro the auto-enumeration of field members works fine, while in common assembly instructions it doesn't (for obscure reasons).
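For comparison, here is how the same "base address plus field offset" idea could look as a C++ macro. It is only a sketch: FIELD_ADDRESS, ReadWord and WriteWord are hypothetical names, and MiniMesh and MiniOrient are invented two-field miniatures (only the names SphereRadius and OrientR come from the examples above; the real layouts are in "structures.h").

```cpp
#include <cassert>
#include <cstddef>   // offsetof
#include <cstdint>
#include <cstring>   // std::memcpy

// Hypothetical C++ counterpart of MOVE_OFFSET_FIELD():
// "mov Register, Address" followed by "add Register, Fields".
#define FIELD_ADDRESS(Address, StructType, Field) \
    (static_cast<char*>(static_cast<void*>(Address)) + offsetof(StructType, Field))

// Invented miniatures: the real structures have many more fields.
struct MiniMesh {
    std::int32_t CentreX;        // hypothetical field
    std::int16_t SphereRadius;
};
struct MiniOrient {
    std::int16_t OrientH;        // hypothetical field
    std::int16_t OrientR;
};

// "mov ax, word ptr [ecx]": read 16 bits at the computed address.
inline std::int16_t ReadWord(const char* address) {
    std::int16_t value;
    std::memcpy(&value, address, sizeof value);
    return value;
}

// "mov word ptr [ecx], value": write 16 bits at the computed address.
inline void WriteWord(char* address, std::int16_t value) {
    std::memcpy(address, &value, sizeof value);
}
```

A usage such as FIELD_ADDRESS(&mesh, MiniMesh, SphereRadius) works for any structure type, because the structure is named in the arguments instead of being fixed inside the macro.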
How to manage a field path when there is another pointer in the middle
In the previous chapters we saw how it's possible to get a field path using the C skills for structure access.
Once we have a field path like:
GlobalFields.DamageRoom.BlinkTime
We can use it to get an offset (of the field in the real structure) with the MOVE_OFFSET macro, or the real value with the MOVE_VALUE macro.
The macros and the method shown above work fine, but they have a limit.
When there is another pointer in the middle of the field path (in the above examples there was a pointer, but it was the very first element), the above trick can't work.
The reason is that the C language has greater skills in computing addresses.
For example, this C instruction:
MyValue = GlobalFields.Adr.pVetRooms[3].Z_SizeSectors
It performs many computations to reach the final "Z_SizeSectors" field.
It reaches the .pVetRooms field, as happens in assembly too, but to handle the room pointer "pVetRooms" the C code extracts the value stored in the [pVetRooms] pointer, multiplies the size of the Room structure by 3, and then adds these two values (the address in the pVetRooms pointer and the total of 3 * sizeof(Room structure)) to reach the beginning of the room structure with index = 3.
Assembly is not so powerful.
The computation of a field path works only for single fields or subfields; it is not able to read an internal pointer or to compute by itself the index in the brackets [ ].
In practice, a field path that contains the pointer operator ("->") or the indexing syntax ("[ index ]") will not be handled by assembly.
When you need to access a field with the above problems, it's not possible to perform the access with a single instruction.
In this case we'll have to perform some intermediate computations to manage the middle pointer.
The intermediate computations to access to middle pointers or indexing of vectors
When we have to access a sub-field of a structure like the one shown here in C++ syntax:
MyValue = GlobalFields.Adr.pVetRooms[3].Z_SizeSectors
In assembly we have to perform some intermediate steps to reach the final field, in this example the "Z_SizeSectors" word.
Since we are able to get a field when there are only "." operators before it, we can read the first side of the path, up to the pointer.
Since in the above example the pointer is the variable "pVetRooms", we can begin by reading the content of "pVetRooms":
MOVE_VALUE(esi, GlobalFields.Adr.pVetRooms)
Now in esi we have the address that was stored in the "pVetRooms" pointer.
Since in our example we have to access the room with index 3 (pVetRooms[3]), we still have some computation to do, because now the esi register is pointing to the first room of our vector, i.e. the room with index [0].
To reach the room with index 3, we have to add to the vector base (that we have in the esi register) the size of each room structure, multiplied by 3.
Looking in the "structures.h" source, we discover that the room structure has a size of 0x94 bytes (94h in hexadecimal, 148 in decimal).
// room structure for tr4 (size 0x94)
typedef struct StrRoomTr4 {
void *pStaticDataRoom; // 00
StrBaseDoors *pDoors; // 04
Therefore we have to multiply 94h by 3 and add the total to our esi register.
MOVE_VALUE(esi, GlobalFields.Adr.pVetRooms)
mov ecx, 94h
imul ecx, 3
add esi, ecx
At this moment esi points to the beginning of pVetRooms[3], but we wish to read the "Z_SizeSectors" field of that structure.
Now we use the macro that gets the offset in any structure, MOVE_OFFSET_FIELD() (see MOVE_OFFSET_FIELD(): the generic macro to access to any structure):
WORD MyValue;
MOVE_VALUE(esi, GlobalFields.Adr.pVetRooms)
mov ecx, 94h
imul ecx, 3
add esi, ecx
// now esi, points to pVetRooms[3]
MOVE_OFFSET_FIELD(edx, esi , RoomFields.Z_SizeSectors)
// now edx points to pVetRooms[3].Z_SizeSectors
mov ax, word ptr [edx] ; read the value of pVetRooms[3].Z_SizeSectors
mov word ptr [MyValue], ax ;and save it to MyValue variable
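The three steps above (read the pointer, add index * size, add the field offset) are what the C++ compiler does silently for pVetRooms[3].Z_SizeSectors. The sketch below repeats them by hand. MiniRoom is a hypothetical structure: only its total size of 0x94 bytes and the Z_SizeSectors name come from the text, while the position of the field inside it is invented for the illustration; RoomFieldAddress and ReadWordAt are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>   // offsetof
#include <cstdint>
#include <cstring>   // std::memcpy

// Hypothetical room structure: same size as the real one (0x94 bytes),
// but the offset of Z_SizeSectors is an assumption made for this sketch.
struct MiniRoom {
    std::uint8_t  Before[0x2A];          // filler standing for the preceding fields
    std::uint16_t Z_SizeSectors;
    std::uint8_t  After[0x94 - 0x2C];    // filler up to the full size
};
static_assert(sizeof(MiniRoom) == 0x94, "each room takes 94h bytes");

// Manual address computation, step by step like the assembly code.
inline const char* RoomFieldAddress(const MiniRoom* pVetRooms,
                                    std::size_t index,
                                    std::size_t fieldOffset) {
    const char* address = reinterpret_cast<const char*>(pVetRooms); // esi = pVetRooms
    address += sizeof(MiniRoom) * index;                            // esi += 94h * index
    return address + fieldOffset;                                   // edx = esi + field offset
}

// "mov ax, word ptr [edx]": read the 16-bit value at the final address.
inline std::uint16_t ReadWordAt(const char* address) {
    std::uint16_t value;
    std::memcpy(&value, address, sizeof value);
    return value;
}
```

For index 3 the amount added to the base is 3 * 94h = 1BCh (444 bytes), and the final address is the same one that C++ computes for &pVetRooms[3].Z_SizeSectors.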
Note: since the above operations are a bit tedious, I created some specialized macros that make it easier to access some problematic structures by means of a call to a C++ function.
Macros to access to the fields of most important structures: Moveable Items, Statics and Room structures
Why create macros just for rooms, moveables and statics?
There is a reason, or rather two reasons:
- The above three items (moveables, statics and rooms) are the core of each tomb raider procedure. For this reason it's very important to be able to access them in a fast and easy way.
- Access to the above "core" items is more difficult than for other tomb raider stuff, because they are stored in vectors and we have to access them by computing their index value.
In spite of the above complexity, using the following macros it will be easy to access these items from assembly too.
The GET_ROOM_OFFSET() macro
This macro has the following syntax:
GET_ROOM_OFFSET(RoomIndex, RoomField)
When you need to access a room structure, you have to know its index.
Then you can get the address of that room structure in this way:
;we suppose to have in edx register the index of our room
;to get the address of its structure, we'll use this code:
GET_ROOM_OFFSET(edx, RoomFields)
;now in [eax] there will be address of our room
;we can also get the address of a given field of the wished room
GET_ROOM_OFFSET(edx, RoomFields.ColorIntensityLight)
;now in [eax] there is the address of ColorIntensityLight field
; of room with index [edx].
;you can, for example, change its value in this way:
mov [eax], 0
As we saw, the result of the address computation is always put in the eax register.
Indeed, when we don't specify the target register for our macros, it's better to follow the C convention, which wants the result stored in the eax register.
The GET_ITEM_OFFSET() macro
By "item" we mean a moveable, like Lara or the enemies.
This macro has the following syntax:
GET_ITEM_OFFSET(ItemIndex, ItemField)
Every time we have the (tomb raider) index of a moveable, we can access its structure using the GET_ITEM_OFFSET() macro:
// we suppose to have the index of our moveable
// in ecx register
GET_ITEM_OFFSET(ecx, ItemFields)
// now in [eax] there is the pointer to moveable structure
// with index ecx
// we can access directly to some fields of a moveable structure
GET_ITEM_OFFSET(ecx, ItemFields.StateIdCurrent)
// now in [eax] there is the address of StateIdCurrent
// field of item with index ecx.
// we can read or write it:
mov dx, word ptr [eax] ;read the state id of moveable
// we can change it
mov word ptr [eax], 12 ;force stateid 12 for moveable
The GET_STATIC_OFFSET() macro
This macro has the following syntax:
GET_STATIC_OFFSET(StaticIndex, RoomIndex, StaticField)
This macro is a bit more complex. It has three arguments instead of two, like the above macros.
The reason is that there is no global vector of static items; rather, there are many local vectors, one for each room.
For this reason, when we wish to access some static item, we have to know its (local) index but also the index of the room where it is.
Examples
;we suppose to have in ecx the index of the room where our static is
;while we have in edx register the local index of our static
GET_STATIC_OFFSET(edx, ecx, StaticFields)
;now in [eax] there is the address of the (StrMeshInfo) static structure
;we can also access to a single field of a static structure
GET_STATIC_OFFSET(edx, ecx, StaticFields.OCB)
;now in [eax] there is the address of OCB field
;we can test its value
test word ptr [eax], 8 ;is there 8 ocb in the OCB field?
jnz GoYes8FlagEnabled
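The idea of "one local vector of statics for each room" can be pictured in C++ with the following minimal sketch. The DemoStatic and DemoRoom structures and the GetStaticOcbAddress() function are hypothetical and simplified: they are not the real StrMeshInfo and room structures of the plugin sources, they only show why two indices are needed to reach a field.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified layouts: the real structures of the
// plugin sources have more fields than these.
struct DemoStatic {
    int32_t  x, y, z;
    uint16_t OCB;
};

struct DemoRoom {
    std::vector<DemoStatic> statics;  // local vector: one for each room
};

// Returns the address of the OCB field of the static with the given
// local index inside the given room. This is the kind of value that
// the macro leaves in eax.
uint16_t* GetStaticOcbAddress(std::vector<DemoRoom>& rooms,
                              std::size_t roomIndex,
                              std::size_t staticIndex)
{
    return &rooms[roomIndex].statics[staticIndex].OCB;
}
```

With the returned address we can read or write the field, exactly as the assembly code does with word ptr [eax].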
How to get the pair of indices to access to static items
Sometimes we have only a single, univocal index for a static: the index we got from the NGLE program (the old winroomedit.exe) or from script.dat.
To access the corresponding static item structure we have to convert this univocal index into the pair of indices: room index and static index.
In practice, each room holds its own vector with the static items present in that room.
Therefore we need two indices: the index of the room where the static is, and the internal index of that static in its room.
The CONVERT_STATIC_INDEX() macro
To get the pair of indices needed to access a static item structure, you can use the CONVERT_STATIC_INDEX() macro.
Syntax:
CONVERT_STATIC_INDEX(OutStaticIndex, OutRoomIndex, NgleStaticIndex)
The arguments have the following meaning:
- OutStaticIndex
This has to be an extended register (its name begins with "e", like ecx or edx) where the macro writes the internal static index.
- OutRoomIndex
This has to be an extended register, too.
In this register the macro will save the room index of your static.
- NgleStaticIndex
This argument is the NGLE static index you wish to convert.
Note: the first two arguments have "Out" in front to remind you that they are outputs of the macro. Their value when you call the macro is not important, because the macro itself will write values into these two registers.
The only input value is the third argument, where you have to insert the univocal static index to convert.
Example:
We wish to read the OCB value of the static whose static index, read from the script, is 234
CONVERT_STATIC_INDEX(edx, ecx, 234)
;now: edx= LocalStaticIndex ; ecx=RoomIndex
;we use above values to get the address of the OCB field
;of this static
GET_STATIC_OFFSET(edx, ecx, StaticFields.OCB)
;now eax holds the address of the OCB field
;we read it to check its value
mov bx, word ptr [eax]
cmp bx, 2
jz OCBIs2
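To picture what "converting a univocal index into a pair of indices" means, here is a small C++ sketch. It is only an illustration under a stated assumption: the table ngleIndexOf[room][local] and the ConvertStaticIndexDemo() function are hypothetical, and the way the real macro finds the pair may be quite different.

```cpp
#include <cstddef>
#include <vector>

// Result of the conversion: the pair of indices plus a "found" flag.
struct StaticPair {
    bool found;
    std::size_t roomIndex;
    std::size_t localIndex;
};

// Hypothetical layout: ngleIndexOf[room][local] is the univocal (NGLE)
// index of the static stored at that position of that room.
StaticPair ConvertStaticIndexDemo(
    const std::vector<std::vector<int> >& ngleIndexOf, int ngleStaticIndex)
{
    for (std::size_t room = 0; room < ngleIndexOf.size(); ++room) {
        for (std::size_t local = 0; local < ngleIndexOf[room].size(); ++local) {
            if (ngleIndexOf[room][local] == ngleStaticIndex) {
                StaticPair hit = { true, room, local };
                return hit;
            }
        }
    }
    StaticPair miss = { false, 0, 0 };  // no static with that index
    return miss;
}
```

The two output fields play the role of the OutRoomIndex and OutStaticIndex registers.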
How to call a C function from assembly
To use advanced C++ features we have to type some code in a C function, and then call this C function from our assembly code.
For this reason the first step is to understand how to call a C function from assembly.
Now look at this C function:
int MyCFunction(int Arg1, int Arg2, int Arg3)
{
int Result;
Result=Arg1+Arg2*Arg3;
return Result;
}
For now we are not interested in the body of the C code but only in its first row, where we see the input arguments and the type of the returned result (output):
int MyCFunction(int Arg1, int Arg2, int Arg3)
This function has three arguments, and it returns an int value (a 32-bit signed number).
In assembly we can call the above function in this way:
push ecx ; value for Arg3 argument
push 01h ; value for Arg2 argument
push -5 ; value for Arg1 argument
call MyCFunction
add esp, 12 ;remove the 3 arguments from the stack (3*4 =12)
;now we can use the value returned by the function. It will be always in eax (or ax or al) register.
cmp eax, 0
jz GoSomewhere
Looking at the above example we discover some facts:
- The arguments have to be passed with the PUSH instruction, which saves these values on the stack
- The arguments have to be pushed in reverse order, like a countdown, beginning from the last argument (Arg3 in our example) down to the first argument
- Immediately after the call instruction, we have to use an "add esp, ..." instruction to remove the arguments; the number to add is always the number of arguments multiplied by 4
- When the function returns a value, it will always be in the eax register
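The four points above can be followed step by step with a toy model of the stack written in C++. This is only a sketch: ToyStack and CallLikeAssembly() are hypothetical names, and the return address is replaced by a dummy value, but the order of the pushes and the final "add esp, 12" are the same as in the assembly listing.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of the stack: the last element plays the role of [esp].
struct ToyStack {
    std::vector<int32_t> slots;

    void push(int32_t value) { slots.push_back(value); }

    // Argument n (1 = first) as seen by the callee, whose return
    // address sits on top of the stack: [esp + 4*n].
    int32_t arg(std::size_t n) const { return slots[slots.size() - 1 - n]; }

    // "add esp, bytes": discard bytes/4 dword slots.
    void addEsp(std::size_t bytes) { slots.resize(slots.size() - bytes / 4); }
};

int32_t MyCFunction(int32_t Arg1, int32_t Arg2, int32_t Arg3)
{
    return Arg1 + Arg2 * Arg3;
}

// Mirrors the assembly listing step by step.
int32_t CallLikeAssembly(ToyStack& st, int32_t ecx)
{
    st.push(ecx);   // push ecx  -> Arg3
    st.push(1);     // push 01h  -> Arg2
    st.push(-5);    // push -5   -> Arg1
    st.push(0);     // "call" pushes the return address (dummy value here)
    int32_t eax = MyCFunction(st.arg(1), st.arg(2), st.arg(3));
    st.addEsp(4);   // "ret" pops the return address
    st.addEsp(12);  // add esp, 12 (3 arguments * 4 bytes)
    return eax;     // the result is in eax
}
```

Because the last argument is pushed first, the first argument ends up nearest to the top of the stack, and after the "add esp, 12" the stack is back to the level it had before the call.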
How to call the C functions of Tomb Raider program
The Tomb Raider program has been developed with the Microsoft C compiler 2.0.
So, its procedures are C functions.
The way to call them is the same as for other C functions; anyway, in this case we have no direct label (like the "MyCFunction" declared in our source) to call, so we have to use the address where they start.
The problem is to discover what a procedure (that probably begins at a given address) does, how many arguments it has and what they mean.
In the last years I discovered much of this information, and you can find it in two files of your Plugin_trng sources.
How to examine "Functions.h" and "DefTomb4Funct.h" sources for accessing to Tomb Raider functions
In the "Functions.h" source you find the name of each function and its address in the tomb4 executable.
For example:
TYPE_Draw2DSprite Draw2DSprite = (TYPE_Draw2DSprite) 0x48B1A0;
The above row (which you find in "functions.h") gives us two pieces of information:
- From the name, we can understand its purpose. In this case it is clear that it is used to draw a sprite.
- From the value used to initialize it (the number 0x48B1A0 in this case) we discover its address, i.e. where it begins.
To discover the arguments and the returned value, instead, we have to look for that function pointer type (in this case "TYPE_Draw2DSprite") in the "DefTomb4Funct.h" source.
The "DefTomb4Funct.h" source contains the prototypes of the Tomb Raider functions.
If we find the prototype declaration for "TYPE_Draw2DSprite", we read:
typedef void (__cdecl *TYPE_Draw2DSprite) (int CordX, int CordY, int IndiceSprite, DWORD Colore, int Mistero);
// or, translating some words into English, it becomes:
typedef void (__cdecl *TYPE_Draw2DSprite) (int CordX, int CordY, int SpriteIndex, DWORD Color, int Mistery);
In this way we discover the arguments used, their order and, very often, their meaning.
In the above case we can understand that Draw2DSprite() has 5 arguments: the first two are the 2D (x, y) coordinates, the third is the index of the sprite in the sprite slot, the fourth is the color, while the meaning of the fifth is not yet known (mystery).
Before showing how to call a Tomb Raider function declared in this way, it is useful to explain the syntax of these function pointer declarations.
How to declare in our source a Tomb Raider function we discovered
The "DefTomb4Funct.h" source contains declarations of function types.
Remember that you cannot place your own declarations in that source; you have to type them in the "Tomb4Discoveries_mine.h" source.
A declaration like this:
typedef void (__cdecl *TYPE_Draw2DSprite) (int CordX, int CordY, int SpriteIndex, DWORD Color, int Mistery);
can be described as:
typedef ReturnedValue (TYPE_FUNCT * MyTypeName) (List of arguments, divided by commas);
"typedef" is the keyword used to give a name to a new type, whether it is a structure, a function pointer or any other kind of variable we wish to create. It will always be present in our function type declarations.
ReturnedValue is the type of the value returned by the function. It could be "WORD", "bool", "BYTE", "int", StrItemTr4 * (a pointer to a StrItemTr4 structure) etc.
If there is no returned value, the type "void" is used, which means "null, nothing".
The "__cdecl *" part (TYPE_FUNCT * in the scheme above) marks the new type as a pointer to a function that uses the C calling convention, i.e. a pointer that hosts the address of a function.
"MyTypeName" is the name we choose for this new kind of function pointer. We should try to assign a name that recalls the real name of the function. I used the convention of adding the "TYPE_" text in front of the name of the function I am declaring.
In the following pair of round parentheses we place all the arguments of the function, with the same syntax we use for common C functions.
So, we created a function type, that is a pointer to a function, but now we also have to supply the address where the function begins in Tomb Raider.
For the known functions this assignment has been placed in the "functions.h" source, but you have to type your own assignments in your "constants_mine.h" source.
If you read this row:
TYPE_Draw2DSprite Draw2DSprite = (TYPE_Draw2DSprite) 0x48B1A0;
you should see that it is not so different from the declaration and initialization of a variable like this:
WORD MyWord = 0x4434;
where instead of "WORD" there is our new type "TYPE_Draw2DSprite", and where we had to perform a cast to convert a constant value like "0x48B1A0" to the same type as our declared variable. Therefore "(TYPE_Draw2DSprite)" works like "(WORD)", which converts the value at its right to WORD (see The cast-type with the (CASTING) operator).
Let's make an example with a new tomb4 function you discovered.
- You analyse a function that begins at offset 0x4321F0 in the Tomb Raider program
- You understand that it accepts a single argument, a pointer to a StrItemTr4 structure
- You understand that the item structure is always that of Lara
- You discover that this function doesn't return any value
- Looking at the code, you suppose that it computes something about turning underwater
Now you can transform all these discoveries into a Tomb Raider function declaration.
In the "Tomb4Discoveries_mine.h" source you can type the declaration of the function type, with the number and type of its arguments.
typedef void (__cdecl* TYPE_LaraTurnsUnderWater) (StrItemTr4 *pLara);
We typed much of the information we discovered: the void (no returned value), a meaningful name "LaraTurnsUnderWater" that describes what the function computes, and the number and type of arguments (a pointer to a StrItemTr4 structure, but it is always Lara, so we named the argument pLara).
Now we have to declare the pointer variable for this function and initialize it with the address where the function is.
We do this in the "constants_mine.h" source:
TYPE_LaraTurnsUnderWater LaraTurnsUnderWater = (TYPE_LaraTurnsUnderWater) 0x4321F0;
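The same "type declaration plus address assignment" pair can be tried outside the game with a small C++ sketch. Everything here is hypothetical: DemoAdd() stands in for a function that lives at a known address, and TYPE_DemoAdd follows the "TYPE_" naming convention only for illustration. The __cdecl keyword is left out to keep the sketch portable; in Microsoft compilers it is the default convention for ordinary functions.

```cpp
#include <cstdint>

// Hypothetical stand-in for a function that lives at a known address.
static int DemoAdd(int a, int b)
{
    return a + b;
}

// Declaration of the function pointer type, in the same style
// as TYPE_Draw2DSprite.
typedef int (*TYPE_DemoAdd)(int a, int b);

int CallThroughAddress()
{
    // In the plugin the address is a constant like 0x48B1A0.
    // Here we take it from the stand-in so the sketch can really run.
    std::uintptr_t address = reinterpret_cast<std::uintptr_t>(&DemoAdd);

    // The cast converts the plain number to our function pointer type,
    // like (TYPE_Draw2DSprite) 0x48B1A0 does.
    TYPE_DemoAdd DemoAddPtr = reinterpret_cast<TYPE_DemoAdd>(address);

    // From now on the pointer is used like a normal function.
    return DemoAddPtr(2, 3);
}
```

Once the pointer variable is initialized, C++ code calls it with the usual function syntax, while assembly code calls the address stored in it.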
How to call a tomb raider function from assembly
Now we continue with our example: how to call the Draw2DSprite() function.
We recall its address from "functions.h":
TYPE_Draw2DSprite Draw2DSprite = (TYPE_Draw2DSprite) 0x48B1A0;
and its arguments from the function type declared in "DefTomb4Funct.h":
typedef void (__cdecl *TYPE_Draw2DSprite) (int CordX, int CordY, int SpriteIndex, DWORD Color, int Mistery);
With all the above information it is easy to call this function:
push 0 ;Arg5 Mistery
push 5500ffh ;Arg4 Color
push 2 ;Arg3 SpriteIndex
push 100 ;Arg2 CordY
push 230 ;Arg1 CordX
call dword ptr [Draw2DSprite] ;call the address where the function begins
add esp, 20 ;five arguments to remove from the stack (5*4 = 20)
How to convert the C++ types in Assembly types?
The C language has a huge number of variable types, while in assembly we have only BYTE, WORD and DWORD, signed or unsigned.
When we see a C function with "weird" argument types, we may have trouble understanding how to manage these variables in assembly.
Example:
char* MisteriousFunction(StrItemTr4* pItem)
When we find a "*" (asterisk) sign next to some argument, or at the left of the function name (the returned value), it means that value is a pointer.
As we have already explained (Understanding the Pointers), a pointer is always a dword value that represents the address where we will find the wished value.
This means we can always use dword registers as pointers.
The text at the left of the * tells us the type of the value we will find at the address held by that pointer.
We could use the above function in this way:
StrItemTr4 MyStrItem;
push offset MyStrItem ;address of MyStrItem structure
call MisteriousFunction
add esp, 4 ;remove argument from stack
;now eax holds the address where we can
;find a string of characters, for example it could be the
;text: "Position is x=3939 y=-256 z=34022"
;followed by a zero value, as it happens for all texts in C
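What "a text followed by a zero value" means for a char* can be seen in this short C++ sketch. CountChars() is a hypothetical helper: it walks the characters starting from the address held by the pointer until it meets the zero byte, which is what an assembly loop would do comparing byte ptr [eax] with 0.

```cpp
#include <cstddef>

// Counts the characters of a C text, stopping at the final zero byte.
// The pointer plays the role of the address returned in eax.
std::size_t CountChars(const char* text)
{
    std::size_t count = 0;
    while (text[count] != 0) {  // like: cmp byte ptr [eax], 0
        ++count;                // like: inc eax
    }
    return count;
}
```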
Warning about little arguments
When the size of some argument is less than 32 bits (i.e. smaller than int, dword, long or a pointer *) you should understand how to manage the pushing of these values.
void MyFunction(BYTE Arg1, short Arg2)
In this case the first argument has to be a BYTE, while the second is a short (16-bit signed).
It would be an error to call this function in this way:
WORD MyValue;
push word ptr [MyValue] ;arg2
push cl ;arg1 the value is in cl byte register
call MyFunction
add esp, 8
The above passing of arguments is wrong.
The reason is that we always pass DWORD arguments, even though their size may be smaller.
In addition, we could clear the higher part of our dword to signal that only the lower part is valid.
Now we see how to perform the above call in the correct way:
WORD MyValue;
;copy in eax the value of MyValue extending the word to
;dword with sign
movsx eax, word ptr [MyValue]
push eax
and ecx, 0ffh ;mask only the byte, clearing the higher bits of ecx
push ecx
call MyFunction
add esp, 8
In reality it is not necessary to clear the higher bits, because the code of the C function will take care of it according to the size of its arguments.
Anyway, the point to remember is that we always pass DWORD arguments with the PUSH instruction.
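The two widening operations used in the correct call can be checked with plain C++ arithmetic. In this sketch the function names are hypothetical: PushedShort() does what "movsx eax, word ptr [MyValue]" does, PushedByte() does what "and ecx, 0ffh" does, and CalleeReadsShort() shows that the C function looks only at the lower 16 bits of the dword it received.

```cpp
#include <cstdint>

// movsx: extend a signed 16-bit value to 32 bits, keeping the sign.
uint32_t PushedShort(int16_t value)
{
    return static_cast<uint32_t>(static_cast<int32_t>(value));
}

// and ecx, 0ffh: keep only the lowest byte, clearing the higher bits.
uint32_t PushedByte(uint32_t ecx)
{
    return ecx & 0xFFu;
}

// What the called function sees for a "short" argument:
// only the lower 16 bits of the pushed dword.
int16_t CalleeReadsShort(uint32_t pushedDword)
{
    return static_cast<int16_t>(static_cast<uint16_t>(pushedDword));
}
```

For example the short value -2 is pushed as the dword 0FFFFFFFEh, and the called function reads it again as -2.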
We can also meet the opposite situation, when it is the returned value that is a BYTE (or a short, or a bool variable), with a size less than a DWORD.
BYTE MyFunction(int Arg1, int Arg2)
We know that the returned value ends up in the eax register but in this case, since it is a byte, we have to use only the al register, the byte version of the eax register.
push edx ;arg2
push 40 ;arg1
call MyFunction
add esp, 8
cmp al, 0
jz GoDoSomething
We use "cmp" with "al" because the result is only in AL, since it is a byte.
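Why we compare al and not the whole eax can be shown with a small C++ sketch. LowByteAl() is a hypothetical helper that extracts the lowest byte of a 32-bit value, i.e. the al part of eax; the higher bits used in the example are just an invented leftover value.

```cpp
#include <cstdint>

// Returns the al part (lowest byte) of a 32-bit eax value.
uint8_t LowByteAl(uint32_t eax)
{
    return static_cast<uint8_t>(eax & 0xFFu);
}

// True when "cmp al, 0" would find zero.
bool AlIsZero(uint32_t eax)
{
    return LowByteAl(eax) == 0;
}

// True when "cmp eax, 0" would find zero.
bool EaxIsZero(uint32_t eax)
{
    return eax == 0;
}
```

If a function returning a BYTE leaves 0 in al but some old value in the higher bits (say eax = 0ABCDEF00h), the test on al gives the right answer while the test on eax does not.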
Troubles about "bool" or "BOOL" types
The bool type keeps the result of a condition or a test. When it is zero it means: false, no, null etc., while when it is different from zero it means: true, ok, enable, engage, depending on the context.
The problem is its size, because Microsoft made a mess.
In the past it gave bool the same size as an int (4 bytes), then it changed its mind and turned it into a byte.
Another matter is the value of TRUE. Sometimes it is "01" but it could be -1 in other formats. BOOL is the 4-byte version of bool.
To avoid all these complications it is better to always treat it as a byte, reading only the first byte in all situations.
For example:
bool IsTrue(int Arg1, BOOL TestEngage)
;we manage above function in this way
push 01 ;arg2 = true, or we can use 00 for false
push ecx ;arg1
call IsTrue
add esp, 8
cmp al, 0
jz ItWasFalse
jnz ItIsTrue
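The advice "always read only the first byte" can be sketched in C++ as follows. ByteIsTrue() is a hypothetical helper that applies the same test as "cmp al, 0" to a returned eax value; the sketch only shows that both common encodings of TRUE (1 and -1) pass this test, it does not cover every value a function could return.

```cpp
#include <cstdint>

// Applies the "cmp al, 0" test to the value returned in eax:
// only the lowest byte is examined.
bool ByteIsTrue(uint32_t eax)
{
    return static_cast<uint8_t>(eax & 0xFFu) != 0;
}

// The two encodings of TRUE mentioned in the text, as 32-bit values.
const uint32_t TRUE_AS_ONE       = 1u;
const uint32_t TRUE_AS_MINUS_ONE = 0xFFFFFFFFu;  // -1
```

Both TRUE_AS_ONE and TRUE_AS_MINUS_ONE have a non-zero first byte, so the same test works whichever encoding the function uses.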
What is the "void" argument?
In the declaration of some functions we can find the "void" keyword as argument or as returned value.
void SomeFunction(int Alfa);
bool IsFullScreenMode(void);
The "void" simply means "no arguments" or "no returned value".
SomeFunction() above doesn't return any value, while IsFullScreenMode() returns a bool variable but doesn't require any argument.
How to preserve our registers when we call a C++ function
In the above examples we showed only the way to pass arguments and read the returned value, but in reality when we call C functions we also have to take care of the values in our registers, because the function could change their values with unpredictable results.
There is a rule, a sort of gentleman's agreement, about which registers a C function may change and which it may not.
The rule is that every function is free to change the values of these registers:
EAX, ECX, EDX
while a C function should preserve (for the caller) the values stored in these other registers:
EBX, EBP, ESI, EDI
Keeping this rule in mind, we also have to decide which registers of our assembly procedure hold values we wish to preserve, and for which others it is not necessary.
Then we save those general purpose registers whose values we want to preserve.
Example:
mov ecx, 6
MyLoop:
mov eax, MyVect[ecx*4]
cmp eax, 0
jz Skip
.... other code ....
Skip:
push eax ;arg2
push 0 ;arg1
call MyFunction
add esp, 8
dec ecx
jns MyLoop
In the above procedure we use the ecx register to count the cycles while reading the vector MyVect.
In the middle of this cycle we call the C function MyFunction().
The above code is wrong, because MyFunction() may change the value of the ECX register; we would then lose its real value, with bad results.
In this case we have to save the ecx register before calling MyFunction() and restore it afterwards.
mov ecx, 6
MyLoop:
mov eax, MyVect[ecx*4]
cmp eax, 0
jz Skip
.... other code ....
Skip:
push ecx ; save the value of ecx
push eax ;arg2
push 0 ;arg1
call MyFunction
add esp, 8
pop ecx ;restore the original value of ecx
dec ecx
jns MyLoop
Above we see how we should change our code to save and restore the value of the ecx register.
As a general rule, we place the PUSH instructions for the registers to save before the PUSH instructions of the arguments we pass to the function, and we extract the saved values from the stack with POP instructions after the "add esp, ..." instruction that removes the arguments from the stack.
Using PUSHAD / POPAD to preserve all registers
The method shown above is right, but it is easy to miss a register, for example forgetting that we also have to save/restore another register like EDX.
When this happens we get very annoying bugs.
To avoid risks, we can always save ALL registers before any call to a C function.
The assembly instruction to save all registers is PUSHAD (PUSH All Dword registers), and the corresponding instruction to restore their values is POPAD.
Example:
mov ecx, 6
MyLoop:
mov eax, MyVect[ecx*4]
cmp eax, 0
jz Skip
.... other code ....
Skip:
pushad ;save all registers;
push eax ;arg2
push 0 ;arg1
call MyFunction
add esp, 8
popad ; restore all registers
dec ecx
jns MyLoop
Above we see the code changed to save/restore all registers.
In this way we are sure that all registers have been saved; no oversight can hurt us.
The only inconvenience of this method is that with the PUSHAD/POPAD instructions we also save and restore registers for which it is not necessary, like ESI, EDI, EBX and EBP, which are already preserved by the C function we are calling.
Using SAVE_MY_REGISTERS() and RESTORE_MY_REGISTERS() macros to preserve registers
A good compromise between manually saving only the necessary registers (with the risk of missing something) and saving/restoring all registers, even when it is not necessary, is to use two macros that save and restore only the registers that could be changed by the function we call.
SAVE_MY_REGISTERS will save the following registers:
PUSH eax
PUSH ecx
PUSH edx
while RESTORE_MY_REGISTERS restores the above registers in the opposite (correct) order:
POP edx
POP ecx
POP eax
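Why the POP instructions must be in the opposite order can be verified with a toy stack in C++. This is a sketch with hypothetical names (Regs3, SaveMyRegs, RestoreMyRegs, RestoreWrongOrder): the stack gives back the last value pushed first, so restoring in the same order used for saving would swap eax and edx.

```cpp
#include <cstdint>
#include <vector>

struct Regs3 { int32_t eax, ecx, edx; };

// Like SAVE_MY_REGISTERS: push eax, push ecx, push edx.
void SaveMyRegs(std::vector<int32_t>& stack, const Regs3& r)
{
    stack.push_back(r.eax);
    stack.push_back(r.ecx);
    stack.push_back(r.edx);
}

// Removes and returns the value on top of the stack (a POP).
static int32_t Pop(std::vector<int32_t>& stack)
{
    int32_t value = stack.back();
    stack.pop_back();
    return value;
}

// Like RESTORE_MY_REGISTERS: pop edx, pop ecx, pop eax.
void RestoreMyRegs(std::vector<int32_t>& stack, Regs3& r)
{
    r.edx = Pop(stack);
    r.ecx = Pop(stack);
    r.eax = Pop(stack);
}

// Wrong: pops in the same order used for the pushes.
void RestoreWrongOrder(std::vector<int32_t>& stack, Regs3& r)
{
    r.eax = Pop(stack);
    r.ecx = Pop(stack);
    r.edx = Pop(stack);
}
```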
For example, working on the previous code, we can solve the problem in this way:
mov ecx, 6
MyLoop:
mov eax, MyVect[ecx*4]
cmp eax, 0
jz Skip
.... other code ....
Skip:
SAVE_MY_REGISTERS ;save registers: eax, ecx, edx
push eax ;arg2
push 0 ;arg1
call MyFunction
add esp, 8
RESTORE_MY_REGISTERS ; restore registers: eax, ecx, edx
dec ecx
jns MyLoop
That problem when we save/restore the EAX register
When we save/restore the registers using PUSHAD or SAVE_MY_REGISTERS, the eax register is also saved (and then restored).
This can cause trouble in this situation:
mov ecx, 6
MyLoop:
mov eax, MyVect[ecx*4]
cmp eax, 0
jz Skip
.... other code ....
Skip:
SAVE_MY_REGISTERS ;save registers: eax, ecx, edx
push eax ;arg2
push 0 ;arg1
call MyFunction
add esp, 8
RESTORE_MY_REGISTERS ; restore registers: eax, ecx, edx
cmp eax, 0 ;read the returned value from MyFunction (?)
jnz GoThere
.... other code
GoThere:
dec ecx
jns MyLoop
When the function returns a value, it is stored in the eax register.
If we look at eax after the POPAD or RESTORE_MY_REGISTERS instruction, the value we are checking is not the one returned by the function but the value that eax had before we used the PUSHAD or SAVE_MY_REGISTERS instruction.
In practice, we lose the returned value when we restore our registers.
A first idea to solve the problem is easy: just read the new value of eax before restoring the registers with POPAD or RESTORE_MY_REGISTERS.
mov ecx, 6
MyLoop:
mov eax, MyVect[ecx*4]
cmp eax, 0
jz Skip
.... other code ....
Skip:
SAVE_MY_REGISTERS ;save registers: eax, ecx, edx
push eax ;arg2
push 0 ;arg1
call MyFunction
add esp, 8
cmp eax, 0 ;read the returned value from MyFunction (?)
jnz GoThere
RESTORE_MY_REGISTERS ; restore registers: eax, ecx, edx
.... other code
GoThere:
dec ecx
jns MyLoop
In this case we checked the returned value in eax before restoring the registers.
It is true that the value we are checking with (cmp eax, 0) is really the one returned by the function, but the above code still has a problem: when eax is different from 0 we jump to the GoThere label, skipping the RESTORE_MY_REGISTERS macro. This is very wrong, since every time we save registers we ALWAYS have to restore them, to keep the stack balanced.
In these situations we can save the returned value in a variable.
int MyValue;
mov ecx, 6
MyLoop:
mov eax, MyVect[ecx*4]
cmp eax, 0
jz Skip
.... other code ....
Skip:
SAVE_MY_REGISTERS ;save registers: eax, ecx, edx
push eax ;arg2
push 0 ;arg1
call MyFunction
add esp, 8
mov dword ptr [MyValue], eax ;save the returned value
RESTORE_MY_REGISTERS ; restore registers: eax, ecx, edx
mov eax, dword ptr [MyValue] ;read the returned value
cmp eax, 0
jnz GoThere
.... other code
GoThere:
dec ecx
jns MyLoop
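The whole reasoning about the lost returned value can be replayed with a toy model of the registers in C++. It is only a sketch with hypothetical names: ToyRegs holds eax, ecx and edx, ToyCall() behaves like a C function that changes all three registers and leaves its result in eax, and the save/restore of the registers is modelled by copying the whole structure.

```cpp
#include <cstdint>

struct ToyRegs { int32_t eax, ecx, edx; };

// Behaves like a called C function: it is free to change eax, ecx
// and edx, and it leaves its returned value in eax.
void ToyCall(ToyRegs& r, int32_t result)
{
    r.ecx = 0x1111;
    r.edx = 0x2222;
    r.eax = result;
}

// Reads eax AFTER restoring: we get the old eax, not the result.
int32_t ReadAfterRestore(ToyRegs& r, int32_t result)
{
    ToyRegs saved = r;      // SAVE_MY_REGISTERS
    ToyCall(r, result);     // call MyFunction
    r = saved;              // RESTORE_MY_REGISTERS
    return r.eax;           // cmp eax, 0  (value before the call!)
}

// Stores the result in a variable before restoring, then reloads it.
int32_t ReadThroughVariable(ToyRegs& r, int32_t result)
{
    ToyRegs saved = r;      // SAVE_MY_REGISTERS
    ToyCall(r, result);     // call MyFunction
    int32_t MyValue = r.eax;  // mov dword ptr [MyValue], eax
    r = saved;              // RESTORE_MY_REGISTERS
    r.eax = MyValue;        // mov eax, dword ptr [MyValue]
    return r.eax;           // cmp eax, 0  (value returned by the call)
}
```

In both versions ecx and edx come back with their original values, but only the second one lets us test the value really returned by the function, and the restore is always executed.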
How to preserve the registers for the caller
Until now we examined the problem of preserving the values we set in the registers, preventing the function we call from changing them.
Now we should analyse the opposite problem.
Our assembly code, too, has been called from other subroutines or functions, and we have to respect the gentleman's agreement that prescribes that each function should preserve the original values of the ESI, EDI, EBX and EBP registers.
In this case we have to save the previous values of these registers, and restore them before exiting from our code.
This means that our assembly procedure should always have these instructions:
BEGIN_ASM_PROC(MyProcedure)
;save the original values for registers of the caller
push ebp
push ebx
push esi
push edi
;here we type the code of our procedure
;restore the original values of the caller
pop edi
pop esi
pop ebx
pop ebp
ret
END_ASM_PROC
In theory we could save only the registers we really intend to use (changing their value).
I mean that if we don't change the ebp register (for example), it is not necessary to save and restore it.
Anyway there is the usual problem: we could make a mistake and forget that we really used one of these registers, omitting to save and restore its value.
If you choose to always save and restore all the registers for the caller (EBX, EBP, ESI and EDI), you can use the macros SAVE_REGISTERS() and RESTORE_REGISTERS().
For example the previous code could become:
BEGIN_ASM_PROC(MyProcedure)
;save the original values for registers of the caller
SAVE_REGISTERS
;here we type the code of our procedure
;restore the original values of the caller
RESTORE_REGISTERS
ret
END_ASM_PROC
How to read the arguments passed to our assembly function in C style
In the previous chapter we saw how to pass parameters to C functions, using PUSH instructions.
Now we see how to read the parameters passed to our assembly procedure.
The parameters are pushed on the stack, so they are stored in the stack memory, the memory pointed to by the ESP register. But in what position?
At the beginning of our procedure, the first argument is at position [ESP+4], the second at position [ESP+8] and so on.
For example, if our procedure requires 3 arguments, it can read them in this way:
BEGIN_ASM_PROC(MyProcedure)
mov eax, dword ptr [ESP+4] ;first argument
mov ecx, dword ptr [ESP+8] ;second argument
mov edx, dword ptr [ESP+12] ;third argument
The formula to compute the number in the ESP+... syntax is: position of the argument * 4.
Anyway, there is a macro to read the arguments that makes the computation easier.
Reading arguments passed to our procedure with MOVE_ARG() macro
To read the value of the arguments passed to our assembly procedure we can use the MOVE_ARG(Register, ArgNumber) macro.
It has two arguments: Register, the register where you wish the value of the argument to be copied, and ArgNumber, the number of the argument to read, where 1 is the first argument, 2 the second and so on.
Example:
BEGIN_ASM_PROC(MyAsmProcedure)
MOVE_ARG(eax, 1) ;read first argument and save to eax register
MOVE_ARG(ecx, 2) ;read second argument and save to ecx register
MOVE_ARG(edx, 3) ;read third argument and save to edx register
The correct disposition of Reading arguments and saving registers
At the beginning of our assembly procedure we read the arguments, but we should also preserve the registers of the caller (see How to preserve the registers for the caller). What is the correct order of these two phases?
We have to read the arguments before saving the caller's registers, in this way:
BEGIN_ASM_PROC(GosubName)
MOVE_ARG(eax, 1) ; read first argument
MOVE_ARG(ecx, 2) ; read second argument
;and now we can save registers of the caller (esi, edi, ebp, ebx)
SAVE_REGISTERS
This is the right way: if we did the opposite (first saving the registers and then reading the arguments), we would read wrong values, since the register-saving operation pushes other values on the stack.
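The effect of the pushes on the position of the arguments is simple arithmetic, and it can be written as a tiny C++ function. ArgOffset() is a hypothetical helper: it assumes that every push moves ESP down by 4 bytes, so each push done inside our procedure adds 4 to the distance between ESP and every argument.

```cpp
// Offset to add to ESP to reach argument number argNumber
// (1 = first argument) after pushesDone PUSH instructions
// executed inside our procedure.
int ArgOffset(int argNumber, int pushesDone)
{
    return 4 * argNumber + 4 * pushesDone;
}
```

At the beginning of the procedure (no pushes yet) the first argument is at [ESP+4], as the formula "position * 4" says. After saving the four registers of the caller it would be at [ESP+20], so a reading that still uses [ESP+4] would pick up one of the saved registers instead of the argument.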
It is also clear that we cannot change the caller's registers before saving them with SAVE_REGISTERS; for this reason we cannot begin our procedure in this way:
BEGIN_ASM_PROC(GosubName)
MOVE_ARG(eax, 1) ; first argument
MOVE_ARG(esi, 2) ;second argument
SAVE_REGISTERS
In the above code the error is using the esi register to receive the value of the second argument. We cannot use esi because we have not saved it yet, so we must not change its value.
For this reason you should remember to use always and only the registers eax, ecx and edx to read the arguments, and not the others (esi, edi, ebp, ebx).
We can modify the values in the caller's registers only AFTER having saved them with SAVE_REGISTERS.
At this point there is a question: what happens if the number of arguments is greater than the number of available registers (3, since they are eax, ecx and edx)?
There are easy ways to solve the problem; the easiest is to save some arguments in variables.
For example:
DWORD Radius;
short Speed;
BEGIN_ASM_PROC
MOVE_ARG(eax,1) ;first argument
mov dword ptr [Radius], eax ;copy first arg in [Radius]
MOVE_ARG(eax,2) ;second argument
mov word ptr [Speed], ax ;copy second arg in [Speed]
MOVE_ARG(ecx,3) ;third argument
MOVE_ARG(edx,4) ;fourth argument
MOVE_ARG(eax,5) ;fifth argument
SAVE_REGISTERS
In the above code we had five arguments to read but we used only three registers (eax, ecx, edx) to extract them. Some arguments, like the first and the second, were stored in two variables, where we can read them again when we need them.
Using this method we could extract an unlimited number of arguments even with the eax register alone.