The answer varies considerably depending on what display adapter and what display mode we're talking about. The display adapter cycle-eater is worst with the Enhanced Graphics Adapter (EGA) and the original Video Graphics Array (VGA). (Many VGAs, especially newer ones, insert many fewer wait states than IBM's original VGA. On the other hand, Super VGAs have more bytes of display memory to be accessed in high-resolution mode.) While the Color/Graphics Adapter (CGA), Monochrome Display Adapter (MDA), and Hercules Graphics Card (HGC) all suffer from the display adapter cycle-eater as well, they suffer to a lesser degree. Since the VGA represents the base standard for PC graphics now and for the foreseeable future, and since it is the hardest graphics adapter to wring performance from, we'll restrict our discussion to the VGA (and its close relative, the EGA) for the remainder of this chapter.
Even on the EGA and VGA, the effect of the display adapter cycle-eater depends on the display mode selected. In text mode, the display adapter cycle-eater is rarely a major factor. It's not that the cycle-eater isn't present; however, a mere 4,000 bytes control the entire text mode display, and even with the display adapter cycle-eater it just doesn't take that long to manipulate 4,000 bytes. Even if the display adapter cycle-eater were to cause the 8088 to take as much as 5 µs per display memory access (more than five times normal), it would still take only 4,000 × 2 × 5 µs, or 40 ms, to read and write every byte of display memory. That's a lot of time as measured in 8088 cycles, but it's less than the blink of an eye in human time, and video performance only matters in human time. After all, the whole point of drawing graphics is to convey visual information, and if that information can be presented faster than the eye can see, that is by definition fast enough.
That's not to say that the display adapter cycle-eater can't matter in text mode. In Chapter 3, I recounted the story of a debate among letter-writers to a magazine about exactly how quickly characters could be written to display memory without causing snow. The writers carefully added up Intel's instruction cycle times to see how many writes to display memory they could squeeze into a single horizontal retrace interval. (On a CGA, it's only during the short horizontal retrace interval and the longer vertical retrace interval that display memory can be accessed in 80-column text mode without causing snow.) Of course, now we know that their cardinal sin was to ignore the prefetch queue; even if there were no wait states, their calculations would have been overly optimistic. There are display memory wait states as well, however, so the calculations were not just optimistic but wildly optimistic.
Text mode situations such as the above notwithstanding, where the display adapter cycle-eater really kicks in is in graphics mode, and most especially in the high-resolution graphics modes of the EGA and VGA. The problem here is not that there are necessarily more wait states per access in high-resolution graphics modes (that varies from adapter to adapter and mode to mode). Rather, the problem is simply that there are many more bytes of display memory per screen in these modes than in lower-resolution graphics modes and in text modes, so many more display memory accesses (each incurring its share of display memory wait states) are required in order to draw an image of a given size. When accessing the many thousands of bytes used in the high-resolution graphics modes, the cumulative effects of display memory wait states can seriously impact code performance, even as measured in human time.
For example, if we assume the same 5 µs per display memory access for the EGA's high-resolution graphics mode that we assumed for text mode, it would take 26,000 × 2 × 5 µs, or 260 ms, to scroll the screen once in the EGA's high-resolution graphics mode, mode 10H. That's more than one-quarter of a second: noticeable by human standards, an eternity by computer standards.
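Both estimates so far use the same formula: bytes of display memory, times two accesses per byte (one read, one write), times the time per access. The following is a minimal Python sketch of that arithmetic, using the assumed 5 µs per access from the text; the function and constant names are illustrative only and do not come from the chapter's listings.

```python
# Back-of-the-envelope estimate: time to read and write every byte once.
# The 5 us per access figure is the text's assumption, not a measurement.

ASSUMED_US_PER_ACCESS = 5.0   # assumed time per display memory access, in us
ACCESSES_PER_BYTE = 2         # one read plus one write


def full_pass_ms(byte_count, us_per_access=ASSUMED_US_PER_ACCESS):
    """Return the time, in milliseconds, to read and write byte_count bytes."""
    total_us = byte_count * ACCESSES_PER_BYTE * us_per_access
    return total_us / 1000.0  # 1 ms = 1000 us


text_mode_ms = full_pass_ms(4_000)    # text mode: 4,000 bytes
mode_10h_ms = full_pass_ms(26_000)    # byte count used in the text for mode 10H

print(f"text mode: {text_mode_ms:.0f} ms")
print(f"mode 10H:  {mode_10h_ms:.0f} ms")
```

Keeping the per-access time as a parameter makes it easy to redo the estimate with a measured figure in place of the assumed one.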
That sounds pretty serious, but we did make an unfounded assumption about memory access speed. Let's get some hard numbers. Listing 4.11 accesses display memory at the 8088's maximum speed, by way of a REP MOVSW with display memory as both source and destination. The code in Listing 4.11 executes in 3.18 µs per access to display memory: not as long as we had assumed, but a long time nonetheless.
LISTING 4.11 LST4-11.ASM
; Times speed of memory access to Enhanced Graphics
; Adapter graphics mode display memory at A000:0000.
;
        mov     ax,0010h
        int     10h             ;select hi-res EGA graphics
                                ; mode 10 hex (AH=0 selects
                                ; BIOS set mode function,
                                ; with AL=mode to select)
;
        mov     ax,0a000h
        mov     ds,ax
        mov     es,ax           ;move to & from same segment
        sub     si,si           ;move to & from same offset
        mov     di,si
        mov     cx,800h         ;move 2K words
        cld
        call    ZTimerOn
        rep     movsw           ;simply read each of the first
                                ; 2K words of the destination segment,
                                ; writing each byte immediately back
                                ; to the same address. No memory
                                ; locations are actually altered; this
                                ; is just to measure memory access
                                ; times
        call    ZTimerOff
;
        mov     ax,0003h
        int     10h             ;return to text mode
For comparison, let's see how long the same code takes when accessing normal system RAM instead of display memory. The code in Listing 4.12, which performs a REP MOVSW from the code segment to the code segment, executes in 1.39 µs per memory access. That means that on average, 1.79 µs (more than 8 cycles!) are lost to the display adapter cycle-eater on each access. In other words, the display adapter cycle-eater can more than double the execution time of 8088 code!
LISTING 4.12 LST4-12.ASM
; Times speed of memory access to normal system
; memory.
;
        mov     ax,ds
        mov     es,ax           ;move to & from same segment
        sub     si,si           ;move to & from same offset
        mov     di,si
        mov     cx,800h         ;move 2K words
        cld
        call    ZTimerOn
        rep     movsw           ;simply read each of the first
                                ; 2K words of the destination segment,
                                ; writing each byte immediately back
                                ; to the same address. No memory
                                ; locations are actually altered; this
                                ; is just to measure memory access
                                ; times
        call    ZTimerOff
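The comparison between the two timings is easy to check by hand. As a rough sketch, the Python snippet below converts the per-access difference into CPU cycles and computes the slowdown factor. The two timings are the measured figures quoted for Listings 4.11 and 4.12; the 4.77 MHz clock is an assumption (the standard speed of the 8088 in the original IBM PC), and the variable names are illustrative.

```python
# Compare the per-access times reported for Listings 4.11 and 4.12.
# ASSUMPTION: a 4.77 MHz 8088, as in the original IBM PC.

CLOCK_MHZ = 4.77                # assumed CPU clock; MHz equals cycles per us
DISPLAY_US_PER_ACCESS = 3.18    # reported for Listing 4.11 (display memory)
SYSTEM_US_PER_ACCESS = 1.39     # reported for Listing 4.12 (system memory)

# Time lost to the display adapter cycle-eater on each access.
lost_us = DISPLAY_US_PER_ACCESS - SYSTEM_US_PER_ACCESS

# us * (cycles per us) = cycles
lost_cycles = lost_us * CLOCK_MHZ

# How many times longer a display memory access takes than a system one.
slowdown = DISPLAY_US_PER_ACCESS / SYSTEM_US_PER_ACCESS

print(f"lost per access: {lost_us:.2f} us, about {lost_cycles:.1f} cycles")
print(f"slowdown factor: {slowdown:.2f}x")
```

Under that clock-speed assumption the loss comes to between 8 and 9 cycles per access, and the slowdown factor is a little over 2, consistent with the figures quoted above.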
Bear in mind that we're talking about a worst case here; the impact of the display adapter cycle-eater is proportional to the percent of time a given code sequence spends accessing display memory.
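To get a feel for that proportionality, here is a deliberately simplified Python sketch. It assumes a linear model that is not stated in the chapter: only the share of run time spent on accesses that go to display memory is stretched, by the ratio of the two measured per-access times. The function name and the sample fractions are hypothetical.

```python
# Simplified linear model (an assumption, not taken from the chapter):
# only the share of baseline run time spent accessing display memory
# is stretched by the measured display/system access-time ratio.

MEASURED_RATIO = 3.18 / 1.39    # display vs. system memory time per access


def overall_slowdown(display_fraction, ratio=MEASURED_RATIO):
    """Return total run time relative to the same code using system memory.

    display_fraction is the share (0.0 to 1.0) of the baseline run time
    spent on accesses that go to display memory.
    """
    if not 0.0 <= display_fraction <= 1.0:
        raise ValueError("display_fraction must be between 0 and 1")
    return (1.0 - display_fraction) + display_fraction * ratio


for fraction in (0.0, 0.1, 0.5, 1.0):
    print(f"{fraction:4.0%} display access -> {overall_slowdown(fraction):.2f}x")
```

In this model, code that touches display memory only occasionally pays almost nothing, while code that does little else, like the REP MOVSW in Listing 4.11, pays the full penalty.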