Hexadecimal. In hexadecimal (base 16), the digits 0-9 and the characters a-f (or A-F) are commonly used to encode sequences of binary (base 2) digits as shown in the following table.
|
|
|
|
For example, using hexadecimal we can write the binary values 0000000000000000, 0000000000000001, and 0000111111111111 as #0000, #0001, and #0fff, where each digit in the hexadecimal representation of the value corresponds to a sequence of four base-2 digits. (The leading hashtag symbol in the hexadecimal values is used simply to indicate to the reader that a sequence of hexadecimal digits follows. The hashtag itself does not alter the value; it just makes the number base obvious — assuming the reader understands the notation — and may replaced with some other symbol if desired or omitted altogether.)
At the expense of requiring an alphabet with 16 distinct symbols instead of only two, expressing a sequence of binary digits using hexadecimal is a way to compress information into about one fourth the number of digits that would be required to express the same value in base 2. Using hexadecimal to express a value instead of binary works well in the context of computer programming, because the sorts of individual values often discussed often require a cumbersome number of 1s and 0s when expressed in base 2 and the number of bits (binary units of information) held in various discrete storage locations is typically a multiple of a power of 2 — especially some multiple of four — making conversions to and from hexadecimal very easy.
Compressing binary numbers using, say, octal is done less frequently in computer science than converting base 2 values to hexadecimal, because each octal digit corresponds to three binary digits and sequences of bits used to encode values — numeric values, character values, or otherwise — are not usually a multiple of three digits in length. It is even more rare to see computer scientists express numbers in bases that are not a power of 2, since you can’t always map a certain integral number of digits in one representation of a binary value to a certain integral number of digits in another base the way you can when converting between base 2 and bases like 4, 8, 16, etc. A problem with converting base 2 numbers to bases much larger than 16 is establishing an easily understood alphabet for all the distinct digits that are needed.
Numbers. A finite number of real numbers can be expressed in a base b as a series of n + 1 digits to the left and m digits to the right of a decimal point.
(dn ⋯ d2 d1 d0 . d-1 d-2 ⋯ d-m)b (1)The number represented by (1) can be converted from base b to base 10 by summing products of an integer and an integer raised to a power, where dx is the decimal value of a digit in the alphabet of base b, and x indicates the number of places and the direction the digit is from the 1’s place.
(dn * bn) + ⋯ + (d2 * b2) + (d1 * b1) + (d0 * b0) + (d-1 * b-1) + (d-2 * b-2) + ⋯ + (d-m * b-m) (2)
Today, most computers store numbers internally (in memory and in registers, for example) as base 2 numbers. But exactly how a base 2 numbers is encoded can vary in a variety of ways. For example, when more than one byte is needed to represent a simple, non-negative integer, some computers store the most significant byte first (i.e. in the memory location with the smallest address in a byte-addressable memory) and store the least significant byte last. This is referred to as big-endian format. Other computers use little-endian format, storing the least significant byte first and the most significant byte last. Encoding options get especially numerous when numbers can be positive or negative and involve digits to the right of a decimal point. Furthermore, it is often desirable to encode special values like infinity, negative infinity or “not a number (NaN)”. The IEEE Standard for Floating-Point Arithmetic (IEEE 754) addresses these sorts of issues and is widely adopted by programming languages and computer manufacturers.
Once you understand how base 2 numbers work, it might be hard to imagine storing numbers in a digital computer any other way. However, base 2 is not the only option available when sequences of bits are used to represent numbers. There are various ways of interpreting patterns of bits as decimal (base 10) numbers, for example. (See https://en.wikipedia.org/wiki/Binary-coded_decimal.) One obvious way is to simply use four bits to represent each decimal digit: 0 = 0000, 1 = 0001, 2 = 0010, ..., 8 = 1000, 9 = 1001. Such encoding systems sacrifice space efficiency for improved accuracy when performing certain types of calculations — calculations that involve, say, tenths (dimes) or hundredths (pennies) of a unit (dollar).
Since people are used to working with numbers in base 10, high-level programming languages typically assume responsibility for translating between base 10 and base 2, allowing programmers to think in base 10 by default when working with numbers. Some numbers — any integer and certain fractions, like 0.5 and 0.125, for instance — can be represented exactly in either base 2 or base 10 using a finite number of digits. But there are many numbers less than one that can only be approximated using a finite number of digits in base 2, even though they can be specified exactly using a finite number of digits in base 10. The term round-off error refers to errors introduced as a result of approximating numbers that cannot be represented using a finite number of digits given the encoding system being employed.
For example, evaluating the expression
1 − 0.9in JavaScript on the computer I'm using now yields the value 0.09999999999999998. This is because 0.9 (base 10) converted to binary (base 2) is exactly 0.111001100…, where the ellipses (…) mean that the digits in bold repeat indefinitely. So in base 2, it’s impossible to represent the decimal value 0.9 using a finite number of 1s and 0s. But in base 10 or using some variation of binary-coded decimal we can represent the number 0.9 using a very finite number of digits: not counting the decimal point, we can use the single digit '9' or the sequence of 1s and 0s '1001'.