Magic number (programming)

From Wikipedia, the free encyclopedia

In computer programming, a magic number is a constant used to identify the file or data type employed. The term was initially found in a comment in the early Sixth Edition source code of the Unix operating system and, although it has lost its original meaning, it has become part of the computer industry lexicon.

Today's magic numbers are often chosen based on (among other factors):

Magic number origin

Deep in the Sixth Edition source code of the Unix program manager, the exec() service read the executable (binary) image from the file system. The first 20 bytes or so were a header containing the sizes of the program (text) and initialized (global) data areas. The first 16-bit word of the header was compared to two constants to determine whether the executable image contained absolute memory address references, relocatable memory references, or the newly implemented paged executable image. Comments in the code referred to these constants as magic numbers without further explanation. Given that Unix had over 10,000 lines of code and employed many constants, this was a curious comment, almost as curious as the "You are not expected to understand this"[1] comment in the context-switching section of the program manager.

However, to anyone who had spent time examining DEC PDP-11 assembly language listings and debugging PDP-11 programs, the constants had a familiar look. The high-order byte was, in fact, the operation code for the PDP-11 branch instruction. Calculating branch offsets revealed that if the magic numbers were executed as instructions, they would branch over the executable image header data and start the program. In this way these special constants provided an illusion and met the criteria for magic.
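The branch arithmetic can be sketched in Python. This is only an illustration under assumptions that the text above does not state: it uses 0407 (octal), the magic number of the classic PDP-11 a.out format, as the example value, and it assumes the usual PDP-11 branch encoding, in which the low byte of the instruction is a signed offset counted in 16-bit words from the address of the following instruction. The function names are hypothetical.

```python
# Assumed example value: 0407 octal, the classic PDP-11 a.out magic number.
MAGIC = 0o407


def decode_branch(word):
    """Split a 16-bit word into (high-order byte, signed word offset)."""
    opcode = (word >> 8) & 0xFF
    offset = word & 0xFF
    if offset >= 0x80:          # the offset is an 8-bit two's-complement value
        offset -= 0x100
    return opcode, offset


def branch_target(address, word):
    """Byte address reached if `word`, stored at `address`, is executed as a branch."""
    _, offset = decode_branch(word)
    # The program counter already points at the next word (address + 2)
    # when the offset, counted in 2-byte words, is added to it.
    return address + 2 + 2 * offset


opcode, offset = decode_branch(MAGIC)
print(opcode, offset, branch_target(0, MAGIC))
```

Under these assumptions the word decodes to a high-order byte of 1 and an offset of 7 words, so a branch stored at address 0 lands 16 bytes further on. Adding one to the magic number moves the target forward by one word.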

The Sixth Edition had paging code and the magic number illusion was further preserved since the exec() service read the file header (meta) data into kernel space and read the executable image into user space, thereby not using the magic number branching feature. The magic number concept was implemented in the Unix linker and loader and magic number branching was probably used in the suite of stand-alone diagnostic programs that came with the Sixth Edition.

The first PDP-11/20 did not have memory protection and, therefore, the absolute address reference model was used.[1] Thus, the pre-Sixth Edition Unix versions read the executable file, with header, into memory and used the branch instruction (the initial magic number) to start the program. As more executable formats were developed, new magic numbers were added by incrementing the branch offset value by one. Magic numbers were also kept in the Sixth Edition kernel as a debugging safety feature.[2]

Magic numbers in files

"Magic numbers" were first used to identify file types, then file system types. Usage of the term has expanded over time, and it is now used by many programs across many operating systems. It is a form of in-band signaling.

Many other types of files have content that identifies the type. Detecting such numbers in file content is therefore an effective way of distinguishing between file formats - and can yield further run-time information.

Some examples:

  • Compiled Java class files (bytecode) start with the bytes 0xCAFEBABE. Class files are always stored in big-endian order, so the byte sequence is the same on every platform.
  • GIF image files begin with the ASCII code for 'GIF89a' (0x474946383961) or 'GIF87a' (0x474946383761).
  • JPEG image files begin with 0xFFD8FF. JPEG/JFIF files contain the ASCII code for 'JFIF' (0x4A464946), and JPEG/EXIF files contain the ASCII code for 'Exif' (0x45786966), beginning at byte 6 in the file and followed by more metadata about the file.
  • PNG image files begin with an 8-byte signature which identifies the file as a PNG file and allows immediate detection of some common file-transfer problems: \211 P N G \r \n \032 \n (0x89504e470d0a1a0a)
  • Standard MIDI music files begin with the ASCII code for 'MThd' (0x4D546864), followed by more metadata about the file.
  • Unix script files usually start with a shebang, '#!', followed by the path to an interpreter. The two bytes are 0x23 0x21; read as a 16-bit word they give 0x2321 on big-endian processors and 0x2123 on little-endian processors.
  • PostScript files and programs start with '%!' (0x2521).
  • PDF files start with '%PDF'.
  • Old MS-DOS .exe files and the newer Microsoft Windows PE (Portable Executable) .exe files start with the ASCII string 'MZ' (0x4D5A), the initials of the designer of the file format, Mark Zbikowski. The definition allows 'ZM' as well but it is quite uncommon.
  • The Berkeley Fast File System superblock format is identified as either 0x19540119 or 0x011954 depending on version; both represent the birthday of author Marshall Kirk McKusick.
  • Executables for the Game Boy and Game Boy Advance handheld video game systems have a 48-byte or 156-byte magic number, respectively, at a fixed spot in the header. This magic number encodes a bitmap of the Nintendo logo.
  • Old Fat binaries (containing code for both 68K processors and PowerPC processors) on Classic Mac OS contained the ASCII code for 'Joy!' (0x4A6F7921) as a prefix.
  • TIFF files begin with either "II" or "MM" depending on the byte order (II for Intel, or little endian, MM for Motorola, or big endian), followed by 0x2A00 or 0x002A (decimal 42 as a 2-byte integer in Intel or Motorola byte ordering).
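The byte-order remarks in the shebang and TIFF entries can be checked with a short Python sketch using the standard struct module. The helper name tiff_byte_order is hypothetical, and the function looks only at the first four bytes of a header.

```python
import struct

# The shebang is always the two bytes 0x23 0x21 on disk; only the value
# obtained by reading them as one 16-bit word depends on byte order.
shebang = b"#!"
as_big_endian = struct.unpack(">H", shebang)[0]
as_little_endian = struct.unpack("<H", shebang)[0]
print(hex(as_big_endian), hex(as_little_endian))


def tiff_byte_order(header):
    """Return 'little', 'big', or None from the first four bytes of a file."""
    if len(header) < 4:
        return None
    if header[:2] == b"II" and struct.unpack("<H", header[2:4])[0] == 42:
        return "little"
    if header[:2] == b"MM" and struct.unpack(">H", header[2:4])[0] == 42:
        return "big"
    return None


print(tiff_byte_order(b"II\x2a\x00"), tiff_byte_order(b"MM\x00\x2a"))
```

The first line printed is 0x2321 0x2123, and the second is little big.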

The Unix utility named file reads and interprets magic numbers from files; the data file it uses to interpret them is itself called magic.
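Signature-based detection of this kind can be sketched in a few lines of Python. This is a minimal illustration built only from the signatures listed above, not a description of how the file utility works internally; the table SIGNATURES and the function identify are hypothetical names.

```python
# (leading bytes, description) pairs taken from the examples in this section.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"GIF89a", "GIF image"),
    (b"GIF87a", "GIF image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"\xca\xfe\xba\xbe", "Java class file"),
    (b"MThd", "Standard MIDI file"),
    (b"%PDF", "PDF document"),
    (b"%!", "PostScript"),
    (b"#!", "script with shebang"),
    (b"MZ", "DOS or Windows executable"),
    (b"II\x2a\x00", "TIFF image (little-endian)"),
    (b"MM\x00\x2a", "TIFF image (big-endian)"),
]


def identify(data):
    """Return a description for the first signature that `data` starts with."""
    for magic, description in SIGNATURES:
        if data.startswith(magic):
            return description
    return "unknown"


def identify_file(path):
    """Read only the first few bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return identify(handle.read(16))


print(identify(b"GIF89a\x01\x00\x01\x00"))
print(identify(b"#!/bin/sh\n"))
print(identify(b"plain text"))
```

A signature match is only a strong hint. The same leading bytes can be shared by unrelated formats (0xCAFEBABE appears in the list of debug values below with two different uses), so a real detector usually inspects more of the file.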

Magic numbers in protocols

  • ICQ FLAP-packets start with 0x2A.

Magic numbers in code

The term magic number also refers to the bad programming practice of using numbers directly in source code without explanation. In most cases this makes programs harder to read, understand, and maintain. Although most guides make an exception for the numbers zero and one, it is a good idea to define all other numbers in code as named constants.

For example, to shuffle the values in an array randomly, this pseudocode will do the job:

   for i from 1 to 52
       j := i + randomInt(53 - i) - 1
       a.swapEntries(i, j)

where a is an array object, the function randomInt(x) chooses a random integer between 1 and x, inclusive, and swapEntries(i, j) swaps the ith and jth entries in the array. In this example, 52 is a magic number. It is considered better programming style to write:

   constant int deckSize := 52
   for i from 1 to deckSize
       j := i + randomInt(deckSize + 1 - i) - 1
       a.swapEntries(i, j)

This is preferable for several reasons:

  • It is easier to read and understand. A programmer reading the first example might wonder, What does the number 52 mean here? Why 52? The programmer might infer the meaning after reading the code carefully, but it's not obvious. Magic numbers become particularly confusing when the same number is used for different purposes in one section of code.
  • It is easier to alter the value of the number, as it is not redundantly duplicated. Changing the value of a magic number is error-prone, because the same value is often used several times in different places within a program. Also, if two semantically distinct variables or numbers have the same value, they may accidentally be edited together. To modify the first example to shuffle a Tarot deck, which has 78 cards, a programmer might naively replace every instance of 52 in the program with 78. This would cause two problems. First, it would miss the value 53 on the second line of the example, which would cause the algorithm to fail in a subtle way. Second, it would likely replace the characters 52 everywhere, regardless of whether they refer to the deck size or to something else entirely, which could introduce bugs. By contrast, changing the value of the deckSize variable in the second example would be a simple, one-line change.
  • The declarations of "magic number" variables are placed together, usually at the top of a function or file, facilitating their review and change.
  • It facilitates parameterization. For example, to generalize the above example into a procedure that shuffles a deck of any number of cards, it is sufficient to turn deckSize into a parameter of that procedure, whereas the first example would require several changes:
   function shuffle (int deckSize)
      for i from 1 to deckSize
          j := i + randomInt(deckSize + 1 - i) - 1
          a.swapEntries(i, j)
  • It helps detect typos. Using a variable (instead of a literal) takes advantage of a compiler's checking (if any). Accidentally typing "62" instead of "52" would go undetected, whereas typing "dekSize" instead of "deckSize" would result in the compiler's warning that dekSize is undeclared.
  • It can reduce typing in some IDEs. If an IDE supports code completion, it will fill in most of the variable's name from the first few letters.

Disadvantages are:

  • It can increase the line length of the source code, forcing lines to be broken up if many constants are used on the same line.
  • It can make debugging more difficult, especially on systems where the debugger doesn't display the values of constants.
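For comparison, the named-constant version of the shuffle can be written as runnable Python. This is a sketch rather than a reference implementation: it uses 0-based indices where the pseudocode is 1-based, takes the size from the list itself, and the names DECK_SIZE, TAROT_DECK_SIZE and shuffle are chosen here for illustration.

```python
import random

DECK_SIZE = 52          # cards in a standard deck
TAROT_DECK_SIZE = 78    # cards in a Tarot deck


def shuffle(items, rng=random):
    """Shuffle `items` in place by swapping each entry with a later (or the same) one."""
    count = len(items)
    for i in range(count - 1):
        # Pick j uniformly from i .. count - 1, the 0-based counterpart of
        # j := i + randomInt(deckSize + 1 - i) - 1 in the pseudocode.
        j = rng.randrange(i, count)
        items[i], items[j] = items[j], items[i]


deck = list(range(DECK_SIZE))
shuffle(deck)

tarot = list(range(TAROT_DECK_SIZE))
shuffle(tarot)
```

Switching to a Tarot deck changes only the constant used to build the list; no literal inside the loop needs editing.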

Allowed use of magic numbers

Although the point is somewhat controversial, most programmers would concede that 0 (zero) and 1 are the only two magic numbers allowable in general code. There are several reasons for this.

  • Some programming languages begin arrays and lists at index 0, while others begin at index 1
  • A 0-based array requires a count minus one upper index limit in a loop. For example:
for index := 0 to list.count-1 do
    DoSomething(index);
  • 0 equates to false and 1 to true in many programming languages. For this reason these two numbers are considered valid Magic Numbers (although most programmers would argue that the explicit constants TRUE and FALSE should be used instead).
  • Some languages (notably C and C++) use 0 to indicate a null pointer constant (although many programmers would argue that the explicit constant NULL should be used instead).

Magic debug values

Magic debug values are specific values written to memory during allocation or deallocation, so that it will later be possible to tell whether or not they have become corrupted and to make it obvious when values taken from uninitialized memory are being used.

Memory is usually viewed in hexadecimal, so common values used are often repeated digits or hexspeak.

Famous and common examples include:

  • 0x..FACADE : Used by a number of RTOSes
  • 0xA5A5A5A5 : Used in embedded development because the alternating bit pattern (10100101) creates an easily recognized pattern on oscilloscopes and logic analyzers.
  • 0xABABABAB : Used by Microsoft's LocalAlloc() to mark "no man's land" guard bytes after allocated heap memory
  • 0xABADBABE : Used by Apple as the "Boot Zero Block" magic number
  • 0xBAADF00D : Used by Microsoft's LocalAlloc(LMEM_FIXED) to mark uninitialised allocated heap memory
  • 0xBADBADBADBAD : Burroughs B6700 "uninitialized" memory (48-bit words)
  • 0xBADCAB1E : Error Code returned to the Microsoft eVC debugger when connection is severed to the debuggee
  • 0xBADC0FFEE0DDF00D : Used on IBM RS/6000 64-bit systems to indicate uninitialized CPU registers
  • 0xC001D00D
  • 0xC0DEBABE
  • 0xC0EDBABE
  • 0xCACACACA
  • 0xCAFEBABE : Used both by Mach-O ("fat binary" for 68k and PowerPC) to identify object files and by the Java programming language to identify .class files
  • 0xCAFECAFE
  • 0xCAFEFEED : Used by Sun Microsystems' Solaris debugging kernel to mark kmemfree() memory
  • 0xCEFAEDFE : Seen in Intel Mach-O binaries on Apple Computer's Mac OS X platform
  • 0xCCCCCCCC : Used by Microsoft's C++ debugging heap to mark uninitialised stack memory
  • 0xCDCDCDCD : Used by Microsoft's C++ debugging heap to mark uninitialised heap memory
  • 0xDDDDDDDD : Used by MicroQuill's SmartHeap and Microsoft's C++ debugging heap to mark freed heap memory
  • 0xDEADBABE : Used at the start of Silicon Graphics' IRIX arena files
  • 0xDEADBEEF : Famously used on IBM systems such as the RS/6000, also in OPENSTEP Enterprise and the Commodore Amiga
  • 0xDEADC0DE
  • 0xDEADDEAD : A Microsoft Windows STOP Error code used when the user manually initiates the crash.
  • 0xDECAFBAD
  • 0xEBEBEBEB : From MicroQuill's SmartHeap
  • 0xFDFDFDFD : Used by Microsoft's C++ debugging heap to mark "no man's land" guard bytes before and after allocated heap memory
  • 0xFEEDBABE
  • 0xFEEDFACE : Seen in PowerPC Mach-O binaries on Apple Computer's Mac OS X platform
  • 0xFEEEFEEE : Used by Microsoft's HeapFree() to mark freed heap memory
  • 0xFEE1DEAD : Used by Linux reboot() syscall

Note that most of these are 8 nibbles (32 bits) long, as most modern computers are designed to work on 32-bit values at a time.

The prevalence of these values in Microsoft technology is no coincidence; they are discussed in detail in Steve Maguire's well-known book Writing Solid Code from Microsoft Press. He gives a variety of criteria for these values, such as:

  • They should not be useful; that is, most algorithms that operate on them should be expected to do something unusual. Numbers like zero don't fit this criterion.
  • They should be easily recognized by the programmer as invalid values in the debugger.
  • On machines that don't have byte alignment, they should be odd numbers, so that dereferencing them as addresses causes an exception.
  • They should cause an exception, or perhaps even a debugger break, if executed as code.

Since they were often used to mark areas of memory that were essentially empty, some of these terms came to be used in phrases meaning "gone, aborted, flushed from memory"; e.g. "Your program is DEADBEEF".

Pietr Brandehörst's ZUG programming language initialized memory to 0x0000, 0xDEAD or 0xFFFF in the development environment and to 0x0000 in the live environment, on the basis that uninitialised variables should be encouraged to misbehave under development so that they are trapped, but encouraged to behave in a live environment to reduce errors.
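The fill-and-check idea behind magic debug values can be sketched in Python. This is a toy model over a bytearray, not how any real allocator is implemented; it borrows the byte values 0xCD, 0xFD and 0xDD from the list above, and the function names and the guard length of four bytes are assumptions made for the example.

```python
FRESH_BYTE = 0xCD   # marks allocated but not yet written memory
GUARD_BYTE = 0xFD   # "no man's land" before and after the payload
FREED_BYTE = 0xDD   # marks memory that has been released
GUARD_LEN = 4       # assumed guard size for this sketch


def debug_alloc(size):
    """Return a block laid out as guard bytes, fresh payload, guard bytes."""
    guard = bytes([GUARD_BYTE]) * GUARD_LEN
    return bytearray(guard + bytes([FRESH_BYTE]) * size + guard)


def guards_intact(block):
    """True if neither guard region has been overwritten."""
    guard = bytes([GUARD_BYTE]) * GUARD_LEN
    return block[:GUARD_LEN] == guard and block[-GUARD_LEN:] == guard


def looks_uninitialised(block, offset, length=4):
    """True if `length` payload bytes at `offset` still hold the fresh fill."""
    start = GUARD_LEN + offset
    return block[start:start + length] == bytes([FRESH_BYTE]) * length


def debug_free(block):
    """Overwrite the whole block so stale reads are easy to spot."""
    block[:] = bytes([FREED_BYTE]) * len(block)


block = debug_alloc(8)
fresh_before_write = looks_uninitialised(block, 0)       # still 0xCDCDCDCD
block[GUARD_LEN:GUARD_LEN + 4] = (1234).to_bytes(4, "little")
fresh_after_write = looks_uninitialised(block, 0)        # payload now holds data
block[GUARD_LEN + 8] = 0                                 # write one byte past the end
overrun_detected = not guards_intact(block)
debug_free(block)
```

Reading the first four payload bytes before any write yields 0xCDCDCDCD, which is immediately recognizable in a debugger, and the one-byte overrun is caught because it damages the trailing guard.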

References

  1. Odd Comments and Strange Doings in Unix
  2. Personal communication with Dennis M. Ritchie