The subroutine linkage convention describes the machine state at subroutine entry and exit. When followed, this scheme allows routines compiled separately in the same or different languages to be linked and executed when called.
The linkage convention allows for parameter passing and return values to be in floating-point registers (FPRs), general-purpose registers (GPRs), or both.
For AIX Version 4.3, the following discussion applies to both 32-bit mode and 64-bit mode with the following notes:
The PowerPC 32-bit architecture has 32 GPRs and 32 FPRs. Each GPR is 32 bits wide, and each FPR is 64 bits wide. There are also special registers for branching, exception handling, and other purposes. The General-Purpose Register Convention table shows how GPRs are used.
| General-Purpose Register Conventions | ||
| Register | Status | Use | 
| GPR0 | volatile | In function prologs. | 
| GPR1 | dedicated | Stack pointer. | 
| GPR2 | dedicated | Table of Contents (TOC) pointer. | 
| GPR3 | volatile | First word of a function's argument list; first word of a scalar function return. | 
| GPR4 | volatile | Second word of a function's argument list; second word of a scalar function return. | 
| GPR5 | volatile | Third word of a function's argument list. | 
| GPR6 | volatile | Fourth word of a function's argument list. | 
| GPR7 | volatile | Fifth word of a function's argument list. | 
| GPR8 | volatile | Sixth word of a function's argument list. | 
| GPR9 | volatile | Seventh word of a function's argument list. | 
| GPR10 | volatile | Eighth word of a function's argument list. | 
| GPR11 | volatile | In calls by pointer and as an environment pointer for languages that require it (for example, PASCAL). | 
| GPR12 | volatile | For special exception handling required by certain languages and in glink code. | 
| GPR13:GPR31 | nonvolatile | These registers must be preserved across a function call. | 
The preferred method of using GPRs is to use the volatile registers first. Next, use the nonvolatile registers in descending order, starting with GPR31 and proceeding down to GPR13. GPR1 and GPR2 must be dedicated as stack and Table of Contents (TOC) area pointers, respectively. GPR1 and GPR2 must appear to be saved across a call, and must have the same values at return as when the call was made.
Volatile registers are scratch registers presumed to be destroyed across a call and are, therefore, not saved by the callee. Volatile registers are also used for specific purposes as shown in the previous table. Nonvolatile and dedicated registers are required to be saved and restored if altered and, thus, are guaranteed to retain their values across a function call.
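The register classes in the General-Purpose Register Conventions table can be expressed as a small lookup. The following is only a sketch; the function names `gpr_status` and `callee_must_preserve` are hypothetical and are not part of any system interface.

```python
def gpr_status(n):
    """Return the linkage status of GPRn as listed in the GPR conventions table."""
    if not 0 <= n <= 31:
        raise ValueError("GPR number must be in 0..31")
    if n in (1, 2):
        return "dedicated"      # GPR1 is the stack pointer, GPR2 the TOC pointer
    if n >= 13:
        return "nonvolatile"    # GPR13:GPR31 are preserved across a call
    return "volatile"           # GPR0 and GPR3:GPR12 are scratch across a call


def callee_must_preserve(n):
    """True when a callee that alters GPRn must restore it before returning."""
    return gpr_status(n) != "volatile"
```

For example, `callee_must_preserve(31)` is true, while `callee_must_preserve(3)` is false because GPR3 carries an argument and a return value.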
The Floating-Point Register Conventions table shows how the FPRs are used.
| Floating-Point Register Conventions | ||
| Register | Status | Use | 
| FPR0 | volatile | As a scratch register. | 
| FPR1 | volatile | First floating-point parameter; first 8 bytes of a floating-point scalar return. | 
| FPR2 | volatile | Second floating-point parameter; second 8 bytes of a floating-point scalar return. | 
| FPR3 | volatile | Third floating-point parameter; third 8 bytes of a floating-point scalar return. | 
| FPR4 | volatile | Fourth floating-point parameter; fourth 8 bytes of a floating-point scalar return. | 
| FPR5 | volatile | Fifth floating-point parameter. | 
| FPR6 | volatile | Sixth floating-point parameter. | 
| FPR7 | volatile | Seventh floating-point parameter. | 
| FPR8 | volatile | Eighth floating-point parameter. | 
| FPR9 | volatile | Ninth floating-point parameter. | 
| FPR10 | volatile | Tenth floating-point parameter. | 
| FPR11 | volatile | Eleventh floating-point parameter. | 
| FPR12 | volatile | Twelfth floating-point parameter. | 
| FPR13 | volatile | Thirteenth floating-point parameter. | 
| FPR14:FPR31 | nonvolatile | If modified, must be preserved across a call. | 
The preferred method of using FPRs is to use the volatile registers first. Next, the nonvolatile registers are used in descending order, starting with FPR31 and proceeding down to FPR14.
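As an illustration of the descending order, the following sketch lists the nonvolatile FPRs a routine would pick when it needs m of them. The function name is hypothetical. The lowest register chosen is FPR(32-m).

```python
def nonvolatile_fprs(m):
    """FPR numbers chosen for m nonvolatile FPRs, starting at FPR31 and descending."""
    if not 0 <= m <= 18:
        raise ValueError("only FPR14:FPR31 (18 registers) are nonvolatile")
    return list(range(31, 31 - m, -1))
```

With m equal to 3, the registers are FPR31, FPR30, and FPR29.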
Only scalars are returned in multiple registers. The number of registers required depends on the size and type of the scalar. For floating-point values, the following results occur:
The Special-Purpose Register Conventions table shows the PowerPC special purpose registers (SPRs). These are the only SPRs for which there is a register convention.
| Special-Purpose Register Conventions | ||
| Register or Register Field | Status | Use | 
| LR | volatile | Used as a branch target address or holds a return address. | 
| CTR | volatile | Used for loop count decrement and branching. | 
| XER | volatile | Fixed-point exception register. | 
| FPSCR | volatile | Floating-point exception register. | 
| CR0, CR1 | volatile | Condition-register bits. | 
| CR2, CR3, CR4 | nonvolatile | Condition-register bits. | 
| CR5, CR6, CR7 | volatile | Condition-register bits. | 
Routines that alter CR2, CR3, and CR4 must save and restore at least these fields of the CR. Use of other CR fields does not require saving or restoring.
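The condition register has eight 4-bit fields, so CR2, CR3, and CR4 cover bits 8 through 19. The sketch below maps a field number to its bits and decides whether a routine must save the CR. The names used are hypothetical.

```python
NONVOLATILE_CR_FIELDS = (2, 3, 4)


def cr_field_bits(field):
    """Bit numbers (bit 0 is the most significant) covered by one CR field."""
    if not 0 <= field <= 7:
        raise ValueError("CR field must be in 0..7")
    return range(4 * field, 4 * field + 4)


def must_save_cr(fields_altered):
    """True when any altered field is one of the nonvolatile fields."""
    return any(f in NONVOLATILE_CR_FIELDS for f in fields_altered)
```

A routine that alters only CR0 and CR7 does not need to save the CR; a routine that alters CR3 does.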
The stack format convention is designed to enhance the efficiency of the following:
The Run-Time Stack figure illustrates the run-time stack. It shows the stack after the sender function calls the catcher function, but before the catcher function calls another function. This figure is based on the assumption that the catcher function will call another function. Therefore, the catcher function requires another link area (as described in the stack layout). PWn refers to the nth word of parameters that are passed.
Only one register, referred to as the stack pointer (SP), is used for addressing the stack, and GPR1 is the dedicated stack pointer register. The stack grows from numerically higher storage addresses to numerically lower addresses.
The Run-Time Stack figure illustrates what happens when the sender function calls the catcher function, and how the catcher function requires a stack frame of its own. When a function makes no calls and requires no local storage of its own, no stack frame is required and the SP is not altered.
Notes:
- To reduce confusion, data being passed from the sender function (the caller) is referred to as arguments, and the same data being received by the catcher function (the callee) is referred to as parameters. The output argument area of sender is the same as the input parameter area of catcher.
 - The address value in the stack pointer must be quadword-aligned. (The address value must be a multiple of 16.)
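The quadword-alignment rule can be checked with simple arithmetic. The following sketch rounds a frame size up to a multiple of 16 and tests an SP value. The function names and the sample addresses are made up for illustration.

```python
def align_frame_size(size, alignment=16):
    """Round a frame size up to the next multiple of alignment (a power of two)."""
    return (size + alignment - 1) & ~(alignment - 1)


def is_valid_sp(sp):
    """True when the stack pointer value is a multiple of 16."""
    return sp % 16 == 0
```

For example, a frame that needs 56 bytes is padded to 64, so an SP that was quadword-aligned before the frame was acquired remains aligned afterward.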
 
For convenience, the stack layout has been divided into eight areas numbered 1 to 8, starting from the bottom of the diagram (high address) to the top of the diagram (low address). The sender's stack pointer is pointing to the top of area 3 when the call to the catcher function is made, which is also the same SP value that is used by the catcher function on entry to its prolog. The following is a description of the stack areas, starting from the bottom of the diagram (area 1) and moving up to the top (area 8):
Area 1 is the local variable area for the sender function. It contains all local variables and temporary space required by this function.
Area 2 is the output argument area for the sender function. This area is at least eight words in size and must be doubleword-aligned. The first eight words are not used by the caller (the sender function) because their corresponding values are placed directly in the argument registers (GPR3:GPR10). The storage is reserved so that if the callee (the catcher function) takes the address of any of its parameters, the values passed in GPR3:GPR10 can be stored in their address locations (PW1:PW8, respectively). If the sender function is passing more than eight arguments to the catcher function, then it must reserve space for the excess parameters. The excess parameters must be stored as register images beyond the eight reserved words starting at offset 56 from the sender function's SP value.
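The offsets in the output argument area follow from the six-word link area that precedes it. The sketch below computes, for 32-bit mode, where the nth parameter word (PWn) lives relative to the caller's SP. It considers word positions only and does not model floating-point arguments passed in FPRs. The function names are hypothetical.

```python
LINK_AREA_BYTES = 6 * 4  # six 32-bit words in the link area


def arg_word_offset(n):
    """Offset from the caller's SP of the stack slot for PWn (n starts at 1)."""
    if n < 1:
        raise ValueError("parameter words are numbered from 1")
    return LINK_AREA_BYTES + 4 * (n - 1)


def arg_word_home(n):
    """Describe where PWn is passed: a GPR for the first eight words, else the stack."""
    offset = arg_word_offset(n)
    if n <= 8:
        return "GPR%d (stack slot reserved at offset %d)" % (n + 2, offset)
    return "stack at offset %d" % offset
```

PW1 maps to GPR3 with a reserved slot at offset 24, PW8 maps to GPR10 with a slot at offset 52, and PW9 is the first word stored on the stack, at offset 56.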
Note: This area may also be used by language processors and is volatile across calls to other functions.
Area 3 is the link area for the sender function. This area consists of six words and is at offset 0 from the sender function's SP at the time the call to the catcher function is made. Certain fields in this area are used by the catcher function as part of its prolog code. Those fields are marked in the Run-Time Stack figure and are explained below.
The first word is the back chain, the location where the sender function saved its caller's SP value prior to modifying the SP. The second word (at offset 4) is where the catcher function can save the CR if it modifies any of the nonvolatile CR fields. The third word (offset 8) is where the catcher function can save the LR if the catcher function makes any calls.
The fourth word is reserved for compilers, and the fifth word is used by binder-generated instructions. The last word in the link area (offset 20) is where the TOC area register (see "Understanding and Programming the TOC" for description) is saved by the global linkage (glink) interface routine. This occurs when an out-of-module call is performed, such as when a shared library function is called.
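The six link-area words described above can be summarized as a table of byte offsets from the SP (32-bit mode). This is a sketch only; the field descriptions paraphrase the text and the names are hypothetical.

```python
LINK_AREA = [
    (0, "back chain (SP of the previous frame)"),
    (4, "saved CR"),
    (8, "saved LR"),
    (12, "reserved for compilers"),
    (16, "used by binder-generated instructions"),
    (20, "saved TOC pointer (written by glink code)"),
]


def link_area_field(offset):
    """Return the description of the link-area word at the given byte offset."""
    for field_offset, description in LINK_AREA:
        if field_offset == offset:
            return description
    raise ValueError("offset is not the start of a link-area word")
```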
Area 4 is the floating-point register save area for the callee (the catcher function) and is doubleword-aligned. It represents the space needed to save all the nonvolatile FPRs used by the called program (the catcher function). The FPRs are saved immediately above the link area (at a lower address) at a negative displacement from the sender function's SP. The size of this area varies from zero to a maximum of 144 bytes, depending on the number of FPRs being saved (maximum number is 18 FPRs * 8 bytes each).
Area 5 is the general-purpose register save area for the catcher function and is at least word-aligned. It represents the space needed by the called program (the catcher function) to save all the nonvolatile GPRs. The GPRs are saved immediately above the FPR save area (at a lower address) at a negative displacement from the sender function's SP. The size of this area varies from zero to a maximum of 76 bytes, depending on the number of GPRs being saved (maximum number is 19 GPRs * 4 bytes each).
Notes:
- A stackless leaf procedure makes no calls and requires no local variable area, but it may use nonvolatile GPRs and FPRs.
- The save area consists of the FPR save area (4) and the GPR save area (5), which have a combined maximum size of 220 bytes. The stack floor of the currently executing function is located at 220 bytes less than the value in the SP. The area between the value in the SP and the stack floor is the maximum save area that a stackless leaf function may use without acquiring its own stack. Functions may use this area as temporary space which is volatile across calls to other functions. Execution elements such as interrupt handlers and binder-inserted code, which cannot be seen by compiled codes as calls, must not use this area.

The system-defined stack floor includes the maximum possible save area. The formula for the size of the save area is:
18*8 (for FPRs) + 19*4 (for GPRs) = 220
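The same arithmetic generalizes to a routine that saves m FPRs and n GPRs in 32-bit mode. The following sketch uses hypothetical function names and a made-up SP value.

```python
def save_area_size(m_fprs, n_gprs):
    """Bytes needed to save m FPRs (8 bytes each) and n GPRs (4 bytes each)."""
    if not 0 <= m_fprs <= 18 or not 0 <= n_gprs <= 19:
        raise ValueError("at most 18 FPRs and 19 GPRs are nonvolatile")
    return 8 * m_fprs + 4 * n_gprs


MAX_SAVE_AREA = save_area_size(18, 19)


def stack_floor(sp):
    """Address of the stack floor for a function whose SP value is sp."""
    return sp - MAX_SAVE_AREA
```

A routine that saves 2 FPRs and 3 GPRs needs 28 bytes, well inside the 220-byte maximum.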
Area 6 is the local variable area for the catcher function and contains local variables and temporary space required by this function. The catcher function addresses this area using its own SP, which points to the top of area 8, as a base register.
Area 7 is the output argument area for the catcher function and is at least eight words in size and must be doubleword-aligned. The first eight words are not used by the caller (the catcher function), because their corresponding values are placed directly in the argument registers (GPR3:GPR10). The storage is reserved so that if the catcher function's callee takes the address of any of its parameters, then the values passed in GPR3:GPR10 can be stored in their address locations (PW1:PW8, respectively). If the catcher function is passing more than eight arguments to its callee, it must reserve space for the excess parameters. The excess parameters must be stored as register images beyond the eight reserved words starting at offset 56 from the catcher function's SP value.
Note: This area can also be used by language processors and is volatile across calls to other functions.
Area 8 is the link area for the catcher function and contains the same fields as those in the sender function's link area (area 3).
All language processors and assemblers must maintain the stack-related system standard that the SP must be atomically updated by a single instruction. This ensures that there is no timing window in which an interrupt can occur while the stack pointer is only partially updated.
Note: The examples of program prologs and epilogs show the most efficient way to update the stack pointer.
Prologs and epilogs may be used for functions, including setting the registers on function entry and restoring the registers on function exit.
No predetermined code sequences representing function prologs and epilogs are dictated. However, certain operations must be performed under certain conditions. The Stack Frame Layout figure shows the stack frame layout.
A typical function's execution stack is:
The Prolog Actions and Epilog Actions tables show the conditions and actions required for prologs and epilogs.
| Prolog Actions | |
| If: | Then: | 
| Any nonvolatile FPRs (FPR14:FPR31) are used | Save them in the FPR save area (area 4 in the previous figure). | 
| Any nonvolatile GPRs (GPR13:GPR31) are used | Save them in the GPR save area (area 5 in the previous figure). | 
| LR is used for a nonleaf procedure | Save the LR at offset eight from the caller function SP. | 
| Any of the nonvolatile condition register (CR) fields are used. | Save the CR at offset four from the caller function SP. | 
| A new stack frame is required | Get a stack frame and decrement the SP by the size of the frame padded (if necessary) to a multiple of 16 to acquire a new SP and save caller's SP at offset 0 from the new SP. | 
Note: A leaf function that does not require stack space for local variables and temporaries can save its caller registers at a negative offset from the caller SP without actually acquiring a stack frame.
| Epilog Actions | |
| If: | Then: | 
| Any nonvolatile FPRs were saved | Restore the FPRs that were used. | 
| Any nonvolatile GPRs were saved | Restore the GPRs that were saved. | 
| The LR was altered because a nonleaf procedure was invoked | Restore LR. | 
| The CR was altered | Restore CR. | 
| A new stack was acquired | Restore the old SP to the value it had on entry (the caller's SP). Return to caller. | 
While the PowerPC architecture provides both load and store multiple instructions for GPRs, it discourages their use because their implementation on some machines may not be optimal. In fact, use of the load and store multiple instructions on some future implementations may be significantly slower than the equivalent series of single word loads or stores. However, saving many FPRs or GPRs with single load or store instructions in a function prolog or epilog leads to increased code size. For this reason, the system environment must provide routines that can be called from a function prolog and epilog that will do the saving and restoring of the FPRs and GPRs. The interface to these routines, their source code, and some prolog and epilog code sequences are provided.
As shown in the stack frame layout, the GPR save area is not at a fixed position from either the caller SP or the callee SP. The FPR save area starts at a fixed position, directly above the SP (lower address) on entry to that callee, but the position of the GPR save area depends on the number of FPRs saved. Thus, it is difficult to write a general-purpose GPR-saving function that uses fixed displacements from SP.
If the routine needs to save both GPRs and FPRs, use GPR12 as the pointer for saving and restoring GPRs. (GPR12 is a volatile register, but does not contain input parameters.) This results in the definition of multiple-register save and restore routines, each of which saves or restores m FPRs and n GPRs. This is achieved by executing a bla (Branch and Link Absolute) instruction to specially provided routines containing multiple entry points (one for each register number), starting from the lowest nonvolatile register.
Notes:
- There are no entry points for saving and restoring GPR and FPR numbers greater than 29. It is more efficient to save a small number of registers in the prolog than it is to call the save and restore functions.
 - If the LR is not saved or restored in the following code segments, the language processor must perform the saving and restoring as appropriate.
 
Language processors must use a proprietary method to conserve the values of nonvolatile registers across a function call.
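Three sets of save and restore routines must be made available by the system environment. These routines are:
- A pair of routines to save and restore GPRs when FPRs are not being saved and restored (_savegpr0_ and _restgpr0_).
- A pair of routines to save and restore GPRs when FPRs are being saved and restored (_savegpr1_ and _restgpr1_).
- A pair of routines to save and restore FPRs (_savefpr_ and _restfpr_).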
For a function that saves and restores n GPRs and no FPRs, the saving can be done using individual store and load instructions or by calling system-provided routines as shown in the following example:
Note: The number of registers being saved is n. Sequences such as <32-n> in the following examples indicate the first register number to be saved and restored. All registers from <32-n> to 31, inclusive, are saved and restored.
#Following are the prolog/epilog of a function that saves n GPRs
#(n>2):
mflr    r0                       #move LR into GPR0
bla     _savegpr0_<32-n>         #branch and link to save GPRs
stwu    r1,<-frame_size>(r1)     #update SP and save caller's SP
...                              #frame_size is the size of the
                                 #stack frame required
<save CR if necessary>
...
...                              #body of function
...
<reload saved CR if necessary>
...
<reload caller's SP into r1>     #see note below
ba      _restgpr0_<32-n>         #restore GPRs and return
Note: The restoring of the calling function SP can be done by either adding the frame_size value to the current SP whenever frame_size is known, or by reloading it from offset 0 from the current SP. The first approach is more efficient, but not possible for functions that use the alloca subroutine to dynamically allocate stack space.
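The two ways of restoring the caller's SP can be modeled with a dictionary standing in for memory. This is a simplified sketch, not system code: `stwu_r1` imitates the effect of `stwu r1,<-frame_size>(r1)`, and all names and addresses are hypothetical.

```python
def stwu_r1(memory, sp, frame_size):
    """Model stwu r1,-frame_size(r1): store the old SP at the new SP, return the new SP."""
    new_sp = sp - frame_size
    memory[new_sp] = sp      # back chain at offset 0 of the new frame
    return new_sp


def restore_sp_by_add(sp, frame_size):
    """Restore the caller's SP when frame_size is known at compile time."""
    return sp + frame_size


def restore_sp_by_reload(memory, sp):
    """Restore the caller's SP by loading the back chain from offset 0."""
    return memory[sp]
```

Both approaches yield the same value for a fixed-size frame. Only the reload approach remains usable when the distance between the current SP and the caller's SP is not known when the epilog is generated.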
The following example shows a GPR save routine when FPRs are not saved:
_savegpr0_13    stw     r13,-76(r1)              #save r13
_savegpr0_14    stw     r14,-72(r1)              #save r14
_savegpr0_15    stw     r15,-68(r1)              #save r15
_savegpr0_16    stw     r16,-64(r1)              #save r16
_savegpr0_17    stw     r17,-60(r1)              #save r17
_savegpr0_18    stw     r18,-56(r1)              #save r18
_savegpr0_19    stw     r19,-52(r1)              #save r19
_savegpr0_20    stw     r20,-48(r1)              #save r20
_savegpr0_21    stw     r21,-44(r1)              #save r21
_savegpr0_22    stw     r22,-40(r1)              #save r22
_savegpr0_23    stw     r23,-36(r1)              #save r23
_savegpr0_24    stw     r24,-32(r1)              #save r24
_savegpr0_25    stw     r25,-28(r1)              #save r25
_savegpr0_26    stw     r26,-24(r1)              #save r26
_savegpr0_27    stw     r27,-20(r1)              #save r27
_savegpr0_28    stw     r28,-16(r1)              #save r28
_savegpr0_29    stw     r29,-12(r1)              #save r29
                stw     r30,-8(r1)               #save r30
                stw     r31,-4(r1)               #save r31
                stw     r0 , 8(r1)               #save LR in 
                                                 #caller's frame
                blr                              #return
Note: This save routine must not be called when GPR30 or GPR31, or both, are the only registers being saved. In these cases, the saving and restoring must be done inline.
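The displacements in the save routine follow one pattern: GPRk is stored at -4*(32-k) from the caller's SP, and the entry point for saving n registers is named after register 32-n. The sketch below reproduces both rules with hypothetical helper names.

```python
def gpr_save_offset(reg):
    """Displacement from r1 at which GPR reg is stored when no FPRs are saved."""
    if not 13 <= reg <= 31:
        raise ValueError("only GPR13:GPR31 are nonvolatile")
    return -4 * (32 - reg)


def savegpr0_entry(n):
    """Entry point name for saving n GPRs; there is no entry point above 29, so n > 2."""
    if not 3 <= n <= 19:
        raise ValueError("n must be in 3..19")
    return "_savegpr0_%d" % (32 - n)
```

For example, `gpr_save_offset(13)` is -76 and `savegpr0_entry(3)` is `_savegpr0_29`, which matches the listing.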
The following example shows a GPR restore routine when FPRs are not saved:
_restgpr0_13    lwz     r13,-76(r1)              #restore r13
_restgpr0_14    lwz     r14,-72(r1)              #restore r14
_restgpr0_15    lwz     r15,-68(r1)              #restore r15
_restgpr0_16    lwz     r16,-64(r1)              #restore r16
_restgpr0_17    lwz     r17,-60(r1)              #restore r17
_restgpr0_18    lwz     r18,-56(r1)              #restore r18
_restgpr0_19    lwz     r19,-52(r1)              #restore r19
_restgpr0_20    lwz     r20,-48(r1)              #restore r20
_restgpr0_21    lwz     r21,-44(r1)              #restore r21
_restgpr0_22    lwz     r22,-40(r1)              #restore r22
_restgpr0_23    lwz     r23,-36(r1)              #restore r23
_restgpr0_24    lwz     r24,-32(r1)              #restore r24
_restgpr0_25    lwz     r25,-28(r1)              #restore r25
_restgpr0_26    lwz     r26,-24(r1)              #restore r26
_restgpr0_27    lwz     r27,-20(r1)              #restore r27
_restgpr0_28    lwz     r28,-16(r1)              #restore r28
_restgpr0_29    lwz     r0,8(r1)                 #get return
                                                 #address from
                                                 #frame
                lwz     r29,-12(r1)              #restore r29
                mtlr    r0                       #move return
                                                 #address to LR
                lwz     r30,-8(r1)               #restore r30
                lwz     r31,-4(r1)               #restore r31
                blr                              #return
Note: This restore routine must not be called when GPR30 or GPR31, or both, are the only registers being saved. In these cases, the saving and restoring must be done inline.
For a function that saves and restores n GPRs and m FPRs (n>2 and m>2), the saving can be done using individual store and load instructions or by calling system-provided routines as shown in the following example:
#The following example shows the prolog/epilog of a function
#which saves n GPRs and m FPRs:
mflr    r0                       #move LR into GPR 0
subi    r12,r1,8*m               #compute GPR save pointer
bla     _savegpr1_<32-n>         #branch and link to save GPRs
bla     _savefpr_<32-m>          #branch and link to save FPRs
stwu    r1,<-frame_size>(r1)     #update SP and save caller's SP
...
<save CR if necessary>
...
...                              #body of function
...
<reload saved CR if necessary>
...
<reload caller's SP into r1>     #see note below
subi    r12,r1,8*m               #compute GPR restore pointer
bla     _restgpr1_<32-n>         #restore GPRs
ba      _restfpr_<32-m>          #restore FPRs and return
Note: The calling function SP can be restored by either adding the frame_size value to the current SP whenever the frame_size is known or by reloading it from offset 0 from the current SP. The first approach is more efficient, but not possible for functions that use the alloca subroutine to dynamically allocate stack space.
The following example shows a GPR save routine when FPRs are saved:
_savegpr1_13    stw     r13,-76(r12)             #save r13
_savegpr1_14    stw     r14,-72(r12)             #save r14
_savegpr1_15    stw     r15,-68(r12)             #save r15
_savegpr1_16    stw     r16,-64(r12)             #save r16
_savegpr1_17    stw     r17,-60(r12)             #save r17
_savegpr1_18    stw     r18,-56(r12)             #save r18
_savegpr1_19    stw     r19,-52(r12)             #save r19
_savegpr1_20    stw     r20,-48(r12)             #save r20
_savegpr1_21    stw     r21,-44(r12)             #save r21
_savegpr1_22    stw     r22,-40(r12)             #save r22
_savegpr1_23    stw     r23,-36(r12)             #save r23
_savegpr1_24    stw     r24,-32(r12)             #save r24
_savegpr1_25    stw     r25,-28(r12)             #save r25
_savegpr1_26    stw     r26,-24(r12)             #save r26
_savegpr1_27    stw     r27,-20(r12)             #save r27
_savegpr1_28    stw     r28,-16(r12)             #save r28
_savegpr1_29    stw     r29,-12(r12)             #save r29
                stw     r30,-8(r12)              #save r30
                stw     r31,-4(r12)              #save r31
                blr                              #return
The following example shows an FPR save routine:
_savefpr_14    stfd     f14,-144(r1)             #save f14
_savefpr_15    stfd     f15,-136(r1)             #save f15
_savefpr_16    stfd     f16,-128(r1)             #save f16
_savefpr_17    stfd     f17,-120(r1)             #save f17
_savefpr_18    stfd     f18,-112(r1)             #save f18
_savefpr_19    stfd     f19,-104(r1)             #save f19
_savefpr_20    stfd     f20,-96(r1)              #save f20
_savefpr_21    stfd     f21,-88(r1)              #save f21
_savefpr_22    stfd     f22,-80(r1)              #save f22
_savefpr_23    stfd     f23,-72(r1)              #save f23
_savefpr_24    stfd     f24,-64(r1)              #save f24
_savefpr_25    stfd     f25,-56(r1)              #save f25
_savefpr_26    stfd     f26,-48(r1)              #save f26
_savefpr_27    stfd     f27,-40(r1)              #save f27
_savefpr_28    stfd     f28,-32(r1)              #save f28
_savefpr_29    stfd     f29,-24(r1)              #save f29
               stfd     f30,-16(r1)              #save f30
               stfd     f31,-8(r1)               #save f31
               stw      r0 , 8(r1)               #save LR in 
                                                 #caller's frame
               blr                               #return
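When both FPRs and GPRs are saved, the FPRs sit directly above the caller's SP and the GPRs sit above the FPRs, addressed through GPR12 (r12 = r1 - 8*m). The following sketch computes the resulting displacements from the caller's SP. The function names are hypothetical.

```python
def fpr_save_offset(reg):
    """Displacement from r1 at which FPR reg is stored by the FPR save routine."""
    if not 14 <= reg <= 31:
        raise ValueError("only FPR14:FPR31 are nonvolatile")
    return -8 * (32 - reg)


def gpr_save_offset_with_fprs(reg, m):
    """Displacement from r1 of GPR reg when m FPRs are also saved (r12 = r1 - 8*m)."""
    if not 13 <= reg <= 31:
        raise ValueError("only GPR13:GPR31 are nonvolatile")
    if not 0 <= m <= 18:
        raise ValueError("m must be in 0..18")
    return -8 * m - 4 * (32 - reg)
```

With all 18 FPRs and all 19 GPRs saved, GPR13 lands at -220 from the caller's SP, which is the maximum save area size.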
The following example shows a GPR restore routine when FPRs are saved:
_restgpr1_13    lwz     r13,-76(r12)             #restore r13
_restgpr1_14    lwz     r14,-72(r12)             #restore r14
_restgpr1_15    lwz     r15,-68(r12)             #restore r15
_restgpr1_16    lwz     r16,-64(r12)             #restore r16
_restgpr1_17    lwz     r17,-60(r12)             #restore r17
_restgpr1_18    lwz     r18,-56(r12)             #restore r18
_restgpr1_19    lwz     r19,-52(r12)             #restore r19
_restgpr1_20    lwz     r20,-48(r12)             #restore r20
_restgpr1_21    lwz     r21,-44(r12)             #restore r21
_restgpr1_22    lwz     r22,-40(r12)             #restore r22
_restgpr1_23    lwz     r23,-36(r12)             #restore r23
_restgpr1_24    lwz     r24,-32(r12)             #restore r24
_restgpr1_25    lwz     r25,-28(r12)             #restore r25
_restgpr1_26    lwz     r26,-24(r12)             #restore r26
_restgpr1_27    lwz     r27,-20(r12)             #restore r27
_restgpr1_28    lwz     r28,-16(r12)             #restore r28
_restgpr1_29    lwz     r29,-12(r12)             #restore r29
                lwz     r30,-8(r12)              #restore r30
                lwz     r31,-4(r12)              #restore r31
                blr                              #return
The following example shows an FPR restore routine:
_restfpr_14    lfd     f14,-144(r1)              #restore f14
_restfpr_15    lfd     f15,-136(r1)              #restore f15
_restfpr_16    lfd     f16,-128(r1)              #restore f16
_restfpr_17    lfd     f17,-120(r1)              #restore f17
_restfpr_18    lfd     f18,-112(r1)              #restore f18
_restfpr_19    lfd     f19,-104(r1)              #restore f19
_restfpr_20    lfd     f20,-96(r1)               #restore f20
_restfpr_21    lfd     f21,-88(r1)               #restore f21
_restfpr_22    lfd     f22,-80(r1)               #restore f22
_restfpr_23    lfd     f23,-72(r1)               #restore f23
_restfpr_24    lfd     f24,-64(r1)               #restore f24
_restfpr_25    lfd     f25,-56(r1)               #restore f25
_restfpr_26    lfd     f26,-48(r1)               #restore f26
_restfpr_27    lfd     f27,-40(r1)               #restore f27
_restfpr_28    lfd     f28,-32(r1)               #restore f28
_restfpr_29    lwz     r0,8(r1)                  #get return
                                                 #address from
                                                 #frame
               lfd     f29,-24(r1)               #restore f29
               mtlr    r0                        #move return
                                                 #address to LR
               lfd     f30,-16(r1)               #restore f30
               lfd     f31,-8(r1)                #restore f31
               blr                               #return
For a function that saves and restores m FPRs (m>2), the saving can be done using individual store and load instructions or by calling system-provided routines as shown in the following example:
#The following example shows the prolog/epilog of a function
#which saves m FPRs and no GPRs:
mflr    r0                       #move LR into GPR 0
bla     _savefpr_<32-m>          #branch and link to save FPRs
stwu    r1,<-frame_size>(r1)     #update SP and save caller's SP
...
<save CR if necessary>
...
...                              #body of function
...
<reload saved CR if necessary>
...
<reload caller's SP into r1>     #see note below
ba      _restfpr_<32-m>          #restore FPRs and return
Notes:
- There are no entry points for saving and restoring GPR and FPR numbers higher than 29. It is more efficient to save a small number of registers in the prolog than to call the save and restore functions.
 - The restoring of the calling function SP can be done by either adding the frame_size value to the current SP whenever frame_size is known, or by reloading it from offset 0 from the current SP. The first approach is more efficient, but not possible for functions that use the alloca subroutine to dynamically allocate stack space.
 
The PowerPC stwu (Store Word with Update) instruction is used for computing the new SP and saving the back chain. This instruction has a signed 16-bit displacement field, which can represent values from -32,768 to 32,767. A stack frame larger than 32K bytes therefore needs additional instructions to build the displacement in a register, but the SP itself must still be updated atomically by a single instruction.
The two assembly code examples illustrate how to update the SP in a prolog.
To compute a new SP and save the old SP for stack frames larger than or equal to 32K bytes:
addis   r12, r0, (<-frame_size> >> 16) & 0xFFFF
                        # set r12 to left half of frame size
ori     r12, r12, <-frame_size> & 0xFFFF
                        # Add right halfword of frame size
stwux   r1, r1, r12     # save old SP and compute new SP
To compute a new SP and save the old SP for stack frames smaller than 32K bytes:
stwu r1, <-frame_size>(r1) #update SP and save caller's SP
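The halfword arithmetic used for large frames can be checked outside the assembler. The sketch below splits the 32-bit two's-complement value of -frame_size into the upper and lower halfwords and recombines them. The function names and the sample frame size are made up for illustration.

```python
def split_negative_frame_size(frame_size):
    """Return (high, low) halfwords of -frame_size as a 32-bit two's-complement value."""
    value = (-frame_size) & 0xFFFFFFFF
    high = (value >> 16) & 0xFFFF
    low = value & 0xFFFF
    return high, low


def combine_halfwords(high, low):
    """Rebuild the 32-bit value from the upper halfword and the OR-ed lower halfword."""
    return ((high << 16) | low) & 0xFFFFFFFF
```

For a frame of 0x12340 bytes, the halfwords are 0xFFFE and 0xDCC0, and recombining them gives 0xFFFEDCC0, the 32-bit representation of -0x12340.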
When an assembler language program calls another program, the caller should not use the names of the called program's commands, functions, or procedures as global assembler language symbols. To avoid confusion, follow the naming conventions for the language of the called program when you create symbol names. For example, if you are calling a C language program, be certain you use the naming conventions for that language.
A called routine has two symbols associated with it: a function descriptor (Name) and an entry point (.Name). When a call is made to a routine, the compiler branches to the entry point (.Name) directly.
Except for when loading parameters into the proper registers, calls to functions are expanded by compilers to include an NOP instruction after each branch and link instruction. This extra instruction is modified by the linkage editor to restore the contents of the TOC register (register 2) on return from an out-of-module call.
The instruction sequence produced by compilers is:
bl      .foo                     #Branch to foo
cror    31,31,31                 #Special NOP 0x4ffffb82
Note: Some compilers produce a cror 15,15,15 (0x4def7b82) instruction. To avoid having to restore condition register bit 15 after a call, the linkage editor transforms cror 15,15,15 into cror 31,31,31. Condition register bit 31 is not preserved across a call and does not have to be restored.
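The two machine-code values quoted for cror can be reproduced from the instruction format. The following sketch assumes the standard XL-form layout (primary opcode 19, extended opcode 449); the function name is hypothetical.

```python
def encode_cror(bt, ba, bb):
    """Encode cror BT,BA,BB as a 32-bit PowerPC instruction word."""
    for bit in (bt, ba, bb):
        if not 0 <= bit <= 31:
            raise ValueError("CR bit numbers must be in 0..31")
    return (19 << 26) | (bt << 21) | (ba << 16) | (bb << 11) | (449 << 1)
```

`encode_cror(31, 31, 31)` gives 0x4FFFFB82 and `encode_cror(15, 15, 15)` gives 0x4DEF7B82.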
The linkage editor will do one of two things when it sees the bl instruction (in the previous instruction sequence, on a call to the foo function):
bl      .glink_of_foo            #Branch to global linkage routine for foo
l       2,20(1)                  #Restore TOC register instruction 0x80410014
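The load that replaces the NOP reads the word at offset 20 from the SP, which is the link-area slot where the TOC pointer is saved. Its encoding can be reproduced with the sketch below, which assumes the standard D-form layout for a load word (primary opcode 32). The function name is hypothetical.

```python
def encode_load_word(rt, ra, displacement):
    """Encode a D-form load word (l / lwz) RT,D(RA) as a 32-bit instruction word."""
    if not 0 <= rt <= 31 or not 0 <= ra <= 31:
        raise ValueError("register numbers must be in 0..31")
    if not -32768 <= displacement <= 32767:
        raise ValueError("displacement must fit in a signed 16-bit field")
    return (32 << 26) | (rt << 21) | (ra << 16) | (displacement & 0xFFFF)
```

`encode_load_word(2, 1, 20)` gives 0x80410014, the value shown for the restore instruction.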
The bl .glink_of_foo instruction sequence is changed to:
bl      .foo                     #Branch to foo
cror    31,31,31                 #Special NOP instruction 0x4ffffb82
Note: For any export, the linkage editor inserts the procedure's descriptor into the module.
Prologs and epilogs are used in the called routines. On entry to a routine, the following steps should be performed:
Note: If a stack overflow occurs, it will be known immediately when the store of the back chain is completed.
On exit from a procedure, perform the following step:
Every assembled or compiled program needs traceback information for the debugger to examine if the program traps or crashes during execution. This information is in a traceback table placed immediately after the last machine instruction in the program and before the program's constant data.
The traceback table starts with a full word of zeros, X'00000000', which is not a valid system instruction. The zeros are followed by 2 words (64 bits) of mandatory information and several words of optional information, as defined in the /usr/include/sys/debug.h file. Using this traceback information, the debugger can unwind the CALL chain and search forward from the point where the failure occurred until it reaches the end of the program (the word of zeros).
In general, the traceback information includes the name of the source language and information about registers used by the program, such as which general-purpose and floating-point registers were saved.
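The forward search for the word of zeros can be pictured with a short Python sketch. This is a simplified model rather than the debugger's real algorithm: it scans word-aligned offsets in a byte string and reports the first all-zero word. The function name `find_traceback_offset` and the sample bytes are hypothetical.

```python
def find_traceback_offset(code, start):
    """Return the offset of the first all-zero word at or after start, or -1."""
    offset = (start + 3) & ~3  # instructions are word aligned
    while offset + 4 <= len(code):
        if code[offset:offset + 4] == b"\x00\x00\x00\x00":
            return offset
        offset += 4
    return -1


# Hypothetical routine: three instruction words, then the zero word that
# starts the traceback table, then one word of table data.
code = bytes.fromhex("7c0802a6" "4ffffb82" "4e800020" "00000000" "000c0000")
print(find_traceback_offset(code, 4))
```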
The following is an example of assembler code called by a C routine:
#      Call this assembly routine from C routine:
#       callfile.c:
#       main()
#       {
#       examlinkage();
#       }
#       Compile as follows:
#       cc -o callfile callfile.c examlinkage.s
#
#################################################################
# On entry to a procedure (callee), all or some of the
# following steps should be done:
# 1. Save the link register at offset 8 from the
#    stack pointer for non-leaf procedures.
# 2. If any of the CR bits 8-19 (CR2, CR3, CR4) is used,
#    then save the CR at displacement 4 of the current
#    stack pointer.
# 3. Save all non-volatile FPRs used by this routine.
#    If more than three non-volatile FPRs are saved,
#    a call to ._savefn can be used to
#    save them (n is the number of the first FPR to be
#    saved).
# 4. Save all non-volatile GPRs used by this routine
#    in the caller's GPR SAVE area (negative displacement
#    from the current stack pointer r1).
# 5. Store back chain and decrement stack pointer by the
#    size of the stack frame.
#
# On exit from a procedure (callee), all or some of the
# following steps should be done:
# 1. Restore all GPRs saved.
# 2. Restore stack pointer to value it had on entry.
# 3. Restore Link Register if this is a non-leaf
#    procedure.
# 4. Restore bits 8-19 of the CR if it was saved.
# 5. Restore all FPRs saved. If any FPRs were saved then
#    a call to ._restfn can be used to restore them
#    (n is the first FPR to be restored).
# 6. Return to caller.
#################################################################
#       The following routine calls printf() to print a string.
#       The routine performs entry steps 1-5 and exit steps 1-6.
#       The prolog/epilog code is for small stack frame size.
#       DSA + 8 < 32k
#################################################################
        .file   "examlinkage.s"
#Static data entry in T(able)O(f)C(ontents)
        .toc
T.examlinkage.c:        .tc     examlinkage.c[tc],examlinkage.c[rw]
        .globl  examlinkage[ds]
#examlinkage[ds] contains definitions needed for
#runtime linkage of function examlinkage
        .csect  examlinkage[ds]
        .long   .examlinkage[PR]     
        .long   TOC[tc0]
        .long   0
#Function entry in T(able)O(f)C(ontents)
        .toc
T.examlinkage:  .tc     .examlinkage[tc],examlinkage[ds]
#Main routine
        .globl  .examlinkage[PR]
        .csect  .examlinkage[PR]
#       Set current routine stack variables
#       These values are specific to the current routine and
#       can vary from routine to routine
        .set    argarea,     32
        .set    linkarea,    24
        .set    locstckarea,  0
        .set    nfprs,       18
        .set    ngprs,       19
        .set    szdsa,       8*nfprs+4*ngprs+linkarea+argarea+locstckarea
#PROLOG:  Called Routine's Responsibilities
        #    Get link reg.
        mflr    0
        #    Get CR if current routine alters it.
        mfcr    12
        #    Save FPRs 14-31.
        bl      ._savef14
        cror 31, 31, 31
        #    Save GPRs 13-31.
        stm     13, -8*nfprs-4*ngprs(1)
        #    Save LR if non-leaf routine.
        st      0, 8(1)
        #    Save CR if current routine alters it.
        st      12, 4(1)
        #    Decrement stack ptr and save back chain.
        stu     1, -szdsa(1)
################################
#load static data address
#################################
        l       14,T.examlinkage.c(2)
        #  Load string address which is an argument to printf.
        cal 3, printing(14)
        #  Call to printf routine
        bl      .printf[PR]
        cror 31, 31, 31
#EPILOG:  Return Sequence
        #    Restore stack ptr
        ai      1, 1, szdsa
        #    Restore GPRs 13-31.
        lm      13, -8*nfprs-4*ngprs(1)
        #    Restore FPRs 14-31.
        bl      ._restf14
        cror 31, 31, 31
        #    Get saved LR.
        l       0, 8(1)
        #    Get saved CR if this routine saved it.
        l       12, 4(1)
        #    Move return address to link register.
        mtlr    0
        #    Restore CR2, CR3, & CR4 of the CR.
        mtcrf   0x38,12
        #    Return to address held in Link Register.
        brl        
        .tbtag 0x0,0xc,0x0,0x0,0x0,0x0,0x0,0x0
#       External variables
        .extern ._savef14
        .extern ._restf14
        .extern .printf[PR]
#################################
#       Data 
#################################
        .csect  examlinkage.c[rw]
        .align  2
printing:       .byte  'E,'x,'a,'m,'p,'l,'e,' ,'f,'o,'r,' 
                .byte   'P,'R,'I,'N,'T,'I,'N,'G
        .byte   0xa,0x0
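The stack-frame arithmetic in the listing above can be checked by hand or with a few lines of Python. The following sketch recomputes szdsa and the GPR save-area displacement from the .set values, and lists which condition register fields the mtcrf 0x38 mask selects. The helper name `crf_fields` is hypothetical, and the sketch assumes the high-order mask bit (0x80) corresponds to CR0.

```python
nfprs, ngprs = 18, 19  # FPR14-FPR31 and GPR13-GPR31
linkarea, argarea, locstckarea = 24, 32, 0

# Total frame size and the displacement used by stm/lm in the listing.
szdsa = 8 * nfprs + 4 * ngprs + linkarea + argarea + locstckarea
gpr_save_offset = -8 * nfprs - 4 * ngprs


def crf_fields(mask):
    """List the CR fields selected by an 8-bit mtcrf mask."""
    return [field for field in range(8) if mask & (0x80 >> field)]


print(szdsa, gpr_save_offset, crf_fields(0x38))
```

The frame size works out to 276 bytes, well under the 32K limit for the small-frame prolog, and the mask selects fields 2, 3, and 4, that is, CR bits 8-19.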
All of the fixed-point divide instructions, and some of the multiply instructions, are different for POWER and PowerPC. To allow programs to run on systems based on either architecture, a set of special routines is provided by the operating system. These are called milicode routines and contain machine-dependent and performance-critical functions. Milicode routines are located at fixed addresses in the kernel segment. These routines can be reached by a bla instruction. All milicode routines use the link register.
Notes:
- No unnecessary registers are destroyed. Refer to the definition of each milicode routine for register usage information.
 - Milicode routines do not alter any floating-point register, count register, or general-purpose registers (GPRs) 10-12. The link register can be saved in a GPR (for example, GPR 10) if the call appears in a leaf procedure that does not use nonvolatile GPRs.
 - Milicode routines do not make use of a TOC.
 
The following milicode routines are available:
The following example uses the mulh milicode routine in an assembler program:
li R3, -900
li R4, 50000
bla .__mulh
...
.extern .__mulh
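To see what value the example leaves in R3, the following Python sketch models the arithmetic, assuming that __mulh returns the high-order 32 bits of the signed 64-bit product of R3 and R4. The routine definitions remain the authority on register usage. The helper names `to_signed32` and `mulh` are hypothetical.

```python
def to_signed32(value):
    """Interpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def mulh(arg1, arg2):
    """High-order 32 bits of the signed product of two 32-bit integers."""
    product = to_signed32(arg1) * to_signed32(arg2)
    return to_signed32(product >> 32)


print(mulh(-900, 50000))
```

Because the product -45,000,000 fits in 32 bits, the high-order word is pure sign extension, so the model yields -1 (0xFFFFFFFF).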
Assembling and Linking a Program.
Understanding Assembler Passes.
Interpreting an Assembler Listing.
Interpreting a Symbol Cross-Reference.
Understanding and Programming the TOC.
The b (Branch) instruction, cror (Condition Register OR) instruction.