Previous | Table of Contents | Next |
LISTING 9.5 L9-5.ASM
; Divides an arbitrarily long unsigned dividend by a 16-bit unsigned ; divisor. C near-callable as: ; unsigned int Div(unsigned int * Dividend, ; int DividendLength, unsigned int Divisor, ; unsigned int * Quotient); ; ; Returns the remainder of the division. ; ; Tested with TASM 2. parms struc dw 2 dup (?) ;pushed BP & return address Dividend dw ? ;pointer to value to divide, stored in Intel ; order, with lsb at lowest address, msb at ; highest. Must be composed of an integral ; number of words DividendLength dw ? ;# of bytes in Dividend. Must be a multiple ; of 2 Divisor dw ? ;value by which to divide. Must not be zero, ; or a Divide By Zero interrupt will occur Quotient dw ? ;pointer to buffer in which to store the ; result of the division, in Intel order. ; The quotient returned is of the same ; length as the dividend parmsends .model small .code public _Div _Divprocnear push bp ;preserve callers stack frame mov bp,sp ;point to our stack frame push si ;preserve callers register variables push di std ;were working from msb to lsb mov ax,ds mov es,ax ;for STOS mov cx,[bp+DividendLength] sub cx,2 mov si,[bp+Dividend] add si,cx ;point to the last word of the dividend ; (the most significant word) mov di,[bp+Quotient] add di,cx ;point to the last word of the quotient ; buffer (the most significant word) mov bx,[bp+Divisor] shr cx,1 inc cx ;# of words to process sub dx,dx ;convert initial divisor word to a 32-bit ;value for DIV DivLoop: lod sw ;get next most significant word of divisor div bx sto sw ;save this word of the quotient ;DX contains the remainder at this point, ; ready to prepend to the next divisor word loop DivLoop mov ax,dx ;return the remainder cld ;restore default Direction flag setting pop di ;restore callers register variables pop si pop bp ;restore callers stack frame ret _Divendp end
LISTING 9.6 L9-6.C
/* Sample use of Div function to perform division when the result doesnt fit in 16 bits */ #include <stdio.h> extern unsigned int Div(unsigned int * Dividend, int DividendLength, unsigned int Divisor, unsigned int * Quotient); main() { unsigned long m, i = 0x20000001; unsigned int k, j = 0x10; k = Div((unsigned int *)&i, sizeof(i), j, (unsigned int *)&m); printf(%lu / %u = %lu r %u\n, i, j, m, k); }
Way back in Volume 1, Number 1 of PC TECHNIQUES, (April/May 1990) I wrote the very first of that magazines HAX (#1), which extolled the virtues of placing your most commonly-used automatic (stack-based) variables within the stacks sweet spot, the area between +127 to -128 bytes away from BP, the stack frame pointer. The reason was that the 8088 can store addressing displacements that fall within that range in a single byte; larger displacements require a full word of storage, increasing code size by a byte per instruction, and thereby slowing down performance due to increased instruction fetching time.
This takes on new prominence in 386 native mode, where straying from the sweet spot costs not one, but two or three bytes. Where the 8088 had two possible displacement sizes, either byte or word, on the 386 there are three possible sizes: byte, word, or dword. In native mode (32-bit protected mode), however, a prefix byte is needed in order to use a word-sized displacement, so a variable located outside the sweet spot requires either two extra bytes (an extra displacement byte plus a prefix byte) or three extra bytes (a dword displacement rather than a byte displacement). Either way, instructions grow alarmingly.
Performance may or may not suffer from missing the sweet spot, depending on the processor, the memory architecture, and the code mix. On a 486, prefix bytes often cost a cycle; on a 386SX, increased code size often slows performance because instructions must be fetched through the half-pint 16-bit bus; on a 386, the effect depends on the instruction mix and whether theres a cache.
On balance, though, its as important to keep your most-used variables in the stacks sweet spot in 386 native mode as it was on the 8088. |
In assembly, its easy to control the organization of your stack frame. In C, however, youll have to figure out the allocation scheme your compiler uses to allocate automatic variables, and declare automatics appropriately to produce the desired effect. It can be done: I did it in Turbo C some years back, and trimmed the size of a program (admittedly, a large one) by several Knot bad, when you consider that the sweet spot optimization is essentially free, with no code reorganization, change in logic, or heavy thinking involved.
Previous | Table of Contents | Next |