decade about graphics and performance programming that’s still relevant to Code Optimization is there too, and even my book Zen of Assembly. Graphics Programming Black Book Special Edition has 65 ratings and 3 reviews. Includes everything that master Abrash has ever written about optimization. Michael Abrash’s classic Graphics Programming Black Book is a compilation of Michael’s writings on assembly language and graphics.


Your job is to sequence those blocks so that they perform well. Otherwise, ZTimerReport subtracts the reference count representing the overhead of the Zen timer from the count measured between the calls to ZTimerOn and ZTimerOff, converts the result from timer counts to microseconds, and prints the resulting time in microseconds to the standard output.
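The subtract-then-convert step described above can be sketched in C. This is only an illustration under stated assumptions, not the Zen timer's actual code: the name `counts_to_microseconds` and its parameters are hypothetical, and the sketch assumes the standard PC timer input clock of about 1.19318 MHz, so that one count lasts roughly 0.838 microseconds.

```c
#include <stdint.h>

/* Assumption: the PC timer chip is clocked at about 1,193,182 Hz. */
#define TIMER_HZ 1193182UL

/* Hypothetical helper: remove the timer's own overhead from the raw count,
   then convert timer counts to microseconds, rounding to the nearest one. */
uint32_t counts_to_microseconds(uint16_t raw_count, uint16_t overhead_count)
{
    /* Clamp at zero in case the overhead exceeds the measured count. */
    uint32_t net = (raw_count > overhead_count)
                       ? (uint32_t)(raw_count - overhead_count)
                       : 0;

    /* 64-bit intermediate so net * 1,000,000 cannot overflow. */
    return (uint32_t)(((uint64_t)net * 1000000ULL + TIMER_HZ / 2) / TIMER_HZ);
}
```

For instance, a net count of 1000 corresponds to about 838 microseconds under this assumed clock rate.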

As I was writing my last game, I discovered that the program ran perceptibly faster if I used look-up tables instead of shifts and adds for my calculations. I examined the subroutine line by line, saving a cycle here and a cycle there, until the code truly seemed to be optimized. Where do we begin?
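To make the look-up-table versus shifts-and-adds trade-off concrete, here is a small C sketch. The 320x200 screen size and all function names are assumptions chosen for illustration; the source does not say which calculations the game performed. Which version is faster depends on the machine, so it is worth timing both.

```c
#include <stdint.h>

#define SCREEN_WIDTH  320   /* assumed width, for illustration only */
#define SCREEN_HEIGHT 200   /* assumed height, for illustration only */

/* One precomputed byte offset per scan line. */
static uint16_t row_offset_table[SCREEN_HEIGHT];

/* Shifts and adds: y * 320 == (y << 8) + (y << 6). */
uint16_t row_offset_shift(uint16_t y)
{
    return (uint16_t)((y << 8) + (y << 6));
}

/* Fill the table once, before any lookups. */
void init_row_offset_table(void)
{
    for (uint16_t y = 0; y < SCREEN_HEIGHT; y++)
        row_offset_table[y] = (uint16_t)(y * SCREEN_WIDTH);
}

/* Look-up table: one memory read replaces the arithmetic.
   The caller must pass y < SCREEN_HEIGHT. */
uint16_t row_offset_lookup(uint16_t y)
{
    return row_offset_table[y];
}
```

Both functions return the same offset for any valid scan line; the table version trades a little memory and a one-time setup cost for fewer instructions per lookup.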

Note that I said that an assembly programmer can generate better code than a compiler, not will generate better code. First, consider the series of MUL instructions in Listing 4.

For one thing, the rule should be 4 cycles times the number of memory accesses, not instruction bytes, since all accesses take 4 cycles on the based PC. I am not a game dev but I love to write performant code; it’s just a personal satisfaction when I know my code is making the best use of the hardware.

Michael Abrash’s Graphics Programming Black Book, Special Edition

Is the never-ending collection of information all there is to assembly optimization, then? Sure, a year from now I will have probably found a new perspective that will make me cringe at the clunkiness of some part of Quake, but at the moment it still looks pretty damn good to me. To write truly superior assembly programs, you need to know what the various instructions do and which instructions execute fastest…and more.

All three subroutines preserve all registers and all flags except the interrupt flag, so calls to these routines are transparent to the calling code.


I say engineer and not programmer because this book covers the former and how to excel at efficiency and best-practices rather than simply creating functionality in an abstracted development environment. Being an engineer back then meant knowing how to use a slide rule, and Irwin could jockey a slipstick with the best of them. Fans of the call it a bit processor. Consequently, carefully optimized assembly is not just the language of choice but the only choice for the 1 percent to 10 percent of code—usually consisting of small, well-defined subroutines—that determines overall program performance, and it is the only choice for code that must be as compact as possible, as well.

Good assembly code is better than good compiled code. There you have an important tenet of assembly language optimization:

Take a moment to examine some interesting performance aspects of the C implementation, and all should become much clearer.

For the time being, all you really need to know about the display adapter cycle-eater is that on the you can lose more than 8 cycles of execution time on each access to display memory. A full understanding of code optimization requires an understanding of cycle-eaters and their implications.

We can immediately discard all approaches that involve reading any byte of the file more than once, because disk access time is orders of magnitude slower than any data handling performed by our own code.

The subtle facts and examples I provide will help you gain the necessary experience, but you must continue the journey on your own. To produce the best code, you must decide precisely what you need to accomplish, then put together the sequence of instructions that accomplishes that end most efficiently, regardless of what the instructions are usually used for. I had dismissed much of the assembly portions as novelties until I realized the bswap instruction had a modern 64-bit variant, doubling the number of 32-bit words I could have in general-purpose registers at one time.
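As a rough illustration of what a 64-bit byte swap does, the following portable C sketch reverses the byte order of a 64-bit value, which also exchanges the two 32-bit halves held in one register. The function name `byte_swap64` is hypothetical, and the sketch does not emit the bswap instruction itself; whether a compiler turns it into one is not guaranteed.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the eight bytes of x in three steps. */
uint64_t byte_swap64(uint64_t x)
{
    /* Swap adjacent bytes. */
    x = ((x & 0x00FF00FF00FF00FFULL) << 8) | ((x >> 8) & 0x00FF00FF00FF00FFULL);
    /* Swap adjacent 16-bit pairs. */
    x = ((x & 0x0000FFFF0000FFFFULL) << 16) | ((x >> 16) & 0x0000FFFF0000FFFFULL);
    /* Swap the two 32-bit halves. */
    return (x << 32) | (x >> 32);
}
```

Applying the function twice returns the original value, which is a quick sanity check when experimenting with it.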

Graphics Programming Black Book Special Edition

Our engine also relies heavily on repeated string instructions, assuming that the memchr and memcmp library functions are properly coded. One interesting aspect of ZTimerOff is the manner in which timer 0 is stopped in order to read the timer count.
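A search built on memchr and memcmp can be sketched as follows. This is a minimal illustration rather than the book's listing: the name `find_bytes` and its signature are hypothetical. memchr scans for the first byte of the pattern, and memcmp confirms each candidate, so the heavy lifting stays inside library routines that a good implementation codes with fast string instructions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the first occurrence of
   pat (pat_len bytes) inside buf (buf_len bytes), or NULL if absent. */
const unsigned char *find_bytes(const unsigned char *buf, size_t buf_len,
                                const unsigned char *pat, size_t pat_len)
{
    if (pat_len == 0 || pat_len > buf_len)
        return NULL;

    const unsigned char *p = buf;
    /* Number of positions at which a full match could still start. */
    size_t remaining = buf_len - pat_len + 1;

    while (remaining > 0) {
        /* Find the next candidate: a byte equal to the pattern's first byte. */
        const unsigned char *hit = memchr(p, pat[0], remaining);
        if (hit == NULL)
            return NULL;

        /* Confirm the candidate against the whole pattern. */
        if (memcmp(hit, pat, pat_len) == 0)
            return hit;

        /* Skip past this candidate and keep scanning. */
        remaining -= (size_t)(hit - p) + 1;
        p = hit + 1;
    }
    return NULL;
}
```

Unlike strstr, this version takes explicit lengths, so it works on buffers that are not zero-terminated.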

Sometimes the BIU is able to use spare bus cycles to prefetch instruction bytes before the EU needs them, so in those cases instruction fetching takes no time at all, practically speaking.



Similarly, familiarity with the PC hardware is required. For intensive access to display memory, the loss really can be as high as 8 cycles (and up to 50, or even more, on s and Pentiums paired with slow VGAs), while for average graphics code the loss is closer to 4 cycles; in either case, the impact on performance is significant.

While the instructions themselves were individually optimized, the overall approach did not make the best possible use of the instructions.

As shown in Figure 3.

To summarize, the skill of assembly language optimization is a combination of knowledge, perspective, and a way of thought that makes possible the genesis of absolutely the fastest or the smallest code. The potential of assembly code to run slowly is poorly understood by a lot of people, but that potential is great, especially in the hands of the ignorant.

This is analogous to trying to write programs that incorporate features like bitmapped text and searching of multisegment buffers without using high-performance assembly language. Only on microcomputers do you have the run of the whole machine, without layers of operating systems, drivers, and the like getting in the way. This solution is obvious because it takes good advantage of the special ability of the x86 family to shift or rotate by the variable number of bits specified by CL.
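The variable-count shift or rotate mentioned above can be expressed in C as below. The excerpt does not show which problem the solution addressed, so this sketch only illustrates the operation itself; the function name `rotate_left16` is hypothetical. On x86, a rotate by a run-time count is a single instruction taking its count from CL.

```c
#include <stdint.h>

/* Hypothetical helper: rotate a 16-bit value left by a run-time count. */
uint16_t rotate_left16(uint16_t value, unsigned count)
{
    count &= 15;              /* a rotate by 16 is the same as no rotate */
    if (count == 0)
        return value;         /* avoid shifting right by the full width */
    return (uint16_t)((value << count) | (value >> (16 - count)));
}
```

Bits shifted out of the top re-enter at the bottom, so no information is lost regardless of the count.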

The closest match to what we need is strstr, which searches one string for the first occurrence of a second string. It’s pretty accessible and has engaged me right from the beginning. When the code in Listing 4.

One important safety tip when modifying the Zen timer for use with large code model C code:

The temperature climbs to 55 degrees, then 60, then 63, then 65, and finally creeps up to 68 degrees.