At the end of the previous chapter, X-Sharp had just acquired basic hidden-surface capability, and performance had been vastly improved through the use of fixed-point arithmetic. In this chapter, we're going to add quite a bit more: support for 8088 and 80286 PCs, a general color model, and shading. That's an awful lot to cover in one chapter (actually, it'll spill over into the next chapter), so let's get to it!
To date, X-Sharp has run on only the 386 and 486, because it uses 32-bit multiply and divide instructions that sub-386 processors don't support. I chose 32-bit instructions for two reasons: They're much faster for 16.16 fixed-point arithmetic than any approach that works on the 8088 and 286; and they're much easier to implement than any other approach. In short, I was after maximum performance, and I was perhaps just a little lazy.
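To see why wide multiply and divide instructions make 16.16 arithmetic so easy, here is a minimal C sketch, not the FIXED.ASM code itself. It assumes a 64-bit intermediate stands in for the double-width product and dividend that the 386 instructions provide; the names `fixed16_16`, `fixed_mul_wide()`, and `fixed_div_wide()` are hypothetical, and overflow and divide-by-zero are deliberately not handled.

```c
#include <stdint.h>

/* Hypothetical name for a 16.16 fixed-point value:
   upper 16 bits are the integer part, lower 16 bits the fraction. */
typedef int32_t fixed16_16;

/* Multiply: form the full double-width product, then drop the extra
   16 fractional bits. Dividing by 65536 truncates toward zero. */
fixed16_16 fixed_mul_wide(fixed16_16 a, fixed16_16 b)
{
    int64_t product = (int64_t)a * (int64_t)b;
    return (fixed16_16)(product / 65536);
}

/* Divide: pre-scale the dividend by 65536 so the quotient comes out
   in 16.16 format. The caller must ensure b is nonzero. */
fixed16_16 fixed_div_wide(fixed16_16 a, fixed16_16 b)
{
    int64_t scaled = (int64_t)a * 65536;
    return (fixed16_16)(scaled / b);
}
```

Each operation is a single widening step plus a rescale, which is the convenience that sub-386 processors lack.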
I should have known better than to try to sneak this one by you. The most common feedback I've gotten on X-Sharp is that I should make it support the 8088 and 286. Well, I can take a hint as well as the next guy. Listing 54.1 is an improved version of FIXED.ASM, containing dual 386/8088 versions of CosSin(), XformVec(), and ConcatXforms(), as well as FixedMul() and FixedDiv().
Given the new version of FIXED.ASM, with USE386 set to 0, X-Sharp will now run on any processor. That's not to say that it will run fast on any processor, or at least not as fast as it used to. The switch to 8088 instructions makes X-Sharp's fixed-point calculations about 2.5 times slower overall. Since a PC is perhaps 40 times slower than a 486/33, we're talking about a hundred-times speed difference between the low end and mainstream. A 486/33 can animate a 72-sided ball, complete with shading (as discussed later), at 60 frames per second (fps), with plenty of cycles to spare; an 8-MHz AT can animate the same ball at about 6 fps. Clearly, the level of animation an application uses must be tailored to the available CPU horsepower.
The implementation of a 32-bit multiply using 8088 instructions is a simple matter of adding together four partial products. A 32-bit divide is not so simple, however. In fact, in Listing 54.1 I've chosen not to implement a full 32×32 divide, but rather only a 32×16 divide. The reason is simple: performance. A 32×16 divide can be implemented on an 8088 with two DIV instructions, but a 32×32 divide takes a great deal more work, so far as I can see. (If anyone has a fast 32×32 divide, or has a faster way to handle signed multiplies and divides than the approach taken by Listing 54.1, please drop me a line care of the publisher.) In X-Sharp, division is used only to divide either X or Y by Z in the process of projecting from view space to screen space, so the cost of using a 32×16 divide is merely some inaccuracy in calculating screen coordinates, especially when objects get very close to the Z = 0 plane. This error is not cumulative (that is, it doesn't carry over to later frames), and in my experience doesn't cause noticeable image degradation; therefore, given the already slow performance of the 8088 and 286, I've opted for performance over precision.
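The four-partial-products multiply can be sketched in C as follows. This is an illustration of the technique rather than a transcription of Listing 54.1: the names `fixed16_16` and `fixed_mul_partial()` are hypothetical, signs are handled by multiplying magnitudes and negating afterward (one possible approach), the result is truncated rather than rounded, and overflow is not detected.

```c
#include <stdint.h>

typedef int32_t fixed16_16;   /* hypothetical name for a 16.16 value */

/* 16.16 multiply built only from 16x16->32 products, the widest
   multiply an 8088 MUL instruction provides. */
fixed16_16 fixed_mul_partial(fixed16_16 a, fixed16_16 b)
{
    int negative = (a < 0) != (b < 0);

    /* Work on magnitudes so every partial product is unsigned. */
    uint32_t ua = a < 0 ? 0u - (uint32_t)a : (uint32_t)a;
    uint32_t ub = b < 0 ? 0u - (uint32_t)b : (uint32_t)b;

    uint32_t ah = ua >> 16, al = ua & 0xFFFFu;   /* integer, fraction */
    uint32_t bh = ub >> 16, bl = ub & 0xFFFFu;

    /* a*b = (ah*bh << 32) + ((ah*bl + al*bh) << 16) + al*bl.
       The 16.16 result is that 64-bit product shifted right by 16. */
    uint32_t result = ((al * bl) >> 16)   /* fraction x fraction */
                    + (al * bh)           /* fraction x integer  */
                    + (ah * bl)           /* integer  x fraction */
                    + ((ah * bh) << 16);  /* integer  x integer  */

    return negative ? -(fixed16_16)result : (fixed16_16)result;
}
```

For example, 1.5 × 2.0 is `fixed_mul_partial(0x18000, 0x20000)`, which yields `0x30000`, or 3.0.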
At any rate, please keep in mind that the non-386 version of FixedDiv() is not a general-purpose 32×32 fixed-point division routine. In fact, it will generate a divide-by-zero error if passed a fixed-point divisor between -1 and 1. As I've explained, the non-386 version of FixedDiv() is designed to do just what X-Sharp needs, and no more, as quickly as possible.
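The following C sketch shows one way a 32×16 divide can be built from two 32-by-16 division steps, and why small divisors fail. It is a model under stated assumptions, not the code in Listing 54.1: it assumes only the integer part of the divisor is used (which is consistent with the divide-by-zero behavior just described), and the names `div32by16()` and `fixed_div_32x16()` are hypothetical. Where the real routine would fault, this version returns `false`.

```c
#include <stdbool.h>
#include <stdint.h>

typedef int32_t fixed16_16;   /* hypothetical name for a 16.16 value */

/* Models one 8088 DIV: 32-bit dividend (hi:lo) by a 16-bit divisor,
   giving a 16-bit quotient and remainder. Requires hi < d so the
   quotient fits in 16 bits. */
static uint16_t div32by16(uint16_t hi, uint16_t lo, uint16_t d,
                          uint16_t *rem)
{
    uint32_t n = ((uint32_t)hi << 16) | lo;
    *rem = (uint16_t)(n % d);
    return (uint16_t)(n / d);
}

/* Divides a 16.16 dividend by the INTEGER part of a 16.16 divisor.
   Returns false where the divisor magnitude is below 1.0. */
bool fixed_div_32x16(fixed16_16 dividend, fixed16_16 divisor,
                     fixed16_16 *out)
{
    int negative = (dividend < 0) != (divisor < 0);
    uint32_t un = dividend < 0 ? 0u - (uint32_t)dividend
                               : (uint32_t)dividend;
    uint32_t ud = divisor < 0 ? 0u - (uint32_t)divisor
                              : (uint32_t)divisor;

    uint16_t d = (uint16_t)(ud >> 16);  /* fraction is discarded */
    if (d == 0)
        return false;                   /* |divisor| < 1.0 */

    uint16_t rem;
    /* Step 1: divide the integer half of the dividend. */
    uint16_t qhi = div32by16(0, (uint16_t)(un >> 16), d, &rem);
    /* Step 2: carry the remainder into the fractional half. */
    uint16_t qlo = div32by16(rem, (uint16_t)(un & 0xFFFFu), d, &rem);

    uint32_t q = ((uint32_t)qhi << 16) | qlo;
    *out = negative ? -(fixed16_16)q : (fixed16_16)q;
    return true;
}
```

Under these assumptions, dividing 5.0 by 2.5 treats the divisor as 2 and produces 2.5 rather than 2.0, which illustrates the kind of inaccuracy accepted in exchange for speed; a divisor of 0.5 has an integer part of zero and is rejected.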