<< Chapter < Page Chapter >> Page >

Hundreds of different encoding systems were invented. But these encoding systems conflict with one another : two encodings can use the same number for two different characters, or use different numbers for the same character.

The Unicode standard was first published in 1991. With two bytes for each character, it can represent 216-1 different characters.

The Unicode standard has been adopted by such industry leaders as HP, IBM, Microsoft, Oracle, Sun, and many others. It is supported in many operating systems, all modern browsers, and many other products.

The obvious advantages of using Unicode are :

  • To offer significant cost savings over the use of legacy character sets.
  • To enable a single software product or a single website to be targeted across multiple platforms, languages and countries without re-engineering.
  • To allow data to be transported through many different systems without corruption.

Representation of real numbers

Basic principles

No human system of numeration can give a unique representation to real numbers. If you give the first few decimal places of a real number, you are giving an approximation to it.

Mathematicians may think of one approach : a real number x can be approximated by any number in the range from x - epsilon to x + epsilon. It is fixed-point representation. Fixed-point representations are unsatisfactory for most applications involving real numbers.

Scientists or engineers will probably use scientific notation: a number is expressed as the product of a mantissa and some power of ten.

A system of numeration for real numbers will typically store the same three data -- a sign, a mantissa, and an exponent -- into an allocated region of storage

The analogues of scientific notation in computer are described as floating-point representations.

In the decimal system, the decimal point indicates the start of negative powers of 10.

12.34 = 1 10 1 + 2 10 0 + 3 10 1 + 4 10 2 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVCI8FfYJH8YrFfeuY=Hhbbf9v8qqaqFr0xc9pk0xbba9q8WqFfeaY=biLkVcLq=JHqpepeea0=as0Fb9pgeaYRXxe9vr0=vr0=vqpWqaaeaabiGaciaacaqabeaadaqaaqaaaOqaaiaaigdacaaIYaGaaiOlaiaaiodacaaI0aGaeyypa0JaaGymaiabgEHiQiaaigdacaaIWaWaaWbaaSqabeaacaaIXaaaaOGaey4kaSIaaGOmaiabgEHiQiaaigdacaaIWaWaaWbaaSqabeaacaaIWaaaaOGaey4kaSIaaG4maiabgEHiQiaaigdacaaIWaWaaWbaaSqabeaacqGHsislcaaIXaaaaOGaey4kaSIaaGinaiabgEHiQiaaigdacaaIWaWaaWbaaSqabeaacqGHsislcaaIYaaaaaaa@4F4C@

If we are using a system in base k (ie the radix is k), the ‘radix point’ serves the same function:

101.1012 = 1 2 2 + 0 2 1 + 1 2 0 + 1 2 1 + 0 2 2 + 1 2 2 = 4 ( 10 ) + 1 ( 10 ) + 0.5 ( 10 ) + 0.125 ( 10 ) = 5.625 ( 10 ) MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVCI8FfYJH8YrFfeuY=Hhbbf9v8qqaqFr0xc9pk0xbba9q8WqFfeaY=biLkVcLq=JHqpepeea0=as0Fb9pgeaYRXxe9vr0=vr0=vqpWqaaeaabiGaciaacaqabeaadaqaaqaaaOabaeqabaGaaGymaiaaicdacaaIXaGaaiOlaiaaigdacaaIWaGaaGymaiaaikdacqGH9aqpcaaIXaGaey4fIOIaaGOmamaaCaaaleqabaGaaGOmaaaakiabgUcaRiaaicdacqGHxiIkcaaIYaWaaWbaaSqabeaacaaIXaaaaOGaey4kaSIaaGymaiabgEHiQiaaikdadaahaaWcbeqaaiaaicdaaaGccqGHRaWkcaaIXaGaey4fIOIaaGOmamaaCaaaleqabaGaeyOeI0IaaGymaaaakiabgUcaRiaaicdacqGHxiIkcaaIYaWaaWbaaSqabeaacqGHsislcaaIYaaaaOGaey4kaSIaaGymaiabgEHiQiaaikdadaahaaWcbeqaaiaaikdaaaaakeaacaaMe8UaaGjbVlaaysW7caaMf8UaaGzbVlaaywW7cqGH9aqpcaaMe8UaaGinamaaBaaaleaacaGGOaGaaGymaiaaicdacaGGPaaabeaakiabgUcaRiaaigdadaWgaaWcbaGaaiikaiaaigdacaaIWaGaaiykaaqabaGccqGHRaWkcaaIWaGaaiOlaiaaiwdadaWgaaWcbaGaaiikaiaaigdacaaIWaGaaiykaaqabaGccqGHRaWkcaaIWaGaaiOlaiaaigdacaaIYaGaaGynamaaBaaaleaacaGGOaGaaGymaiaaicdacaGGPaaabeaaaOqaaiaaysW7caaMe8UaaGjbVlaaywW7caaMf8UaaGzbVlabg2da9iaaysW7caaI1aGaaiOlaiaaiAdacaaIYaGaaGynamaaBaaaleaacaGGOaGaaGymaiaaicdacaGGPaaabeaaaaaa@8B7C@

A floating point representation allows a large range of numbers to be represented in a relatively small number of digits by separating the digits used for precision from the digits used for range.

To avoid multiple representations of the same number floating point numbers are usually normalized so that there is only one nonzero digit to the left of the ‘radix’ point, called the leading digit.

A normalized (non-zero) floating-point number will be represented using

( 1 ) s d 0 · d 1 d 2 ... d p 1 × b e MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVCI8FfYJH8YrFfeuY=Hhbbf9v8qqaqFr0xc9pk0xbba9q8WqFfeaY=biLkVcLq=JHqpepeea0=as0Fb9pgeaYRXxe9vr0=vr0=vqpWqaaeaabiGaciaacaqabeaadaqaaqaaaOqaaiaacIcacqGHsislcaaIXaGaaiykamaaCaaaleqabaGaam4CaaaakiaadsgadaWgaaWcbaGaaGimaaqabaGccqWIpM+zcaWGKbWaaSbaaSqaaiaaigdaaeqaaOGaamizamaaBaaaleaacaaIYaaabeaakiaac6cacaGGUaGaaiOlaiaadsgadaWgaaWcbaGaamiCaiabgkHiTiaaigdaaeqaaOGaey41aqRaamOyamaaCaaaleqabaGaamyzaaaaaaa@4BF7@

where

  • s is the sign,
  • d 0 · d 1 d 2 ... d p 1 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVCI8FfYJH8YrFfeuY=Hhbbf9v8qqaqFr0xc9pk0xbba9q8WqFfeaY=biLkVcLq=JHqpepeea0=as0Fb9pgeaYRXxe9vr0=vr0=vqpWqaaeaabiGaciaacaqabeaadaqaaqaaaOqaaiaadsgadaWgaaWcbaGaaGimaaqabaGccqWIpM+zcaWGKbWaaSbaaSqaaiaaigdaaeqaaOGaamizamaaBaaaleaacaaIYaaabeaakiaac6cacaGGUaGaaiOlaiaadsgadaWgaaWcbaGaamiCaiabgkHiTiaaigdaaeqaaaaa@43A8@ - termed the significand - has p significant digits, each digit satisfies 0 d i MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVCI8FfYJH8YrFfeuY=Hhbbf9v8qqaqFr0xc9pk0xbba9q8WqFfeaY=biLkVcLq=JHqpepeea0=as0Fb9pgeaYRXxe9vr0=vr0=vqpWqaaeaabiGaciaacaqabeaadaqaaqaaaOqaaiabgsMiJkaadsgadaWgaaWcbaGaamyAaaqabaaaaa@399A@ <b
  • e , e min e e max MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVCI8FfYJH8YrFfeuY=Hhbbf9v8qqaqFr0xc9pk0xbba9q8WqFfeaY=biLkVcLq=JHqpepeea0=as0Fb9pgeaYRXxe9vr0=vr0=vqpWqaaeaabiGaciaacaqabeaadaqaaqaaaOqaaiaadwgacaGGSaGaaGjbVlaadwgadaWgaaWcbaGaciyBaiaacMgacaGGUbaabeaakiabgsMiJkaadwgacqGHKjYOcaWGLbWaaSbaaSqaaiGac2gacaGGHbGaaiiEaaqabaaaaa@4539@ , is the exponent
  • b is the base (or radix)

Example

If k = 10 (base 10) and p = 3, the number 0·1 is represented as 0.100

If k = 2 (base 2) and p = 24, the decimal number 0·1 cannot be represented exactly but is approximately 1 · 1 00 11 00 11 00 11 00 11 00 11 0 1 × 2 4 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVCI8FfYJH8YrFfeuY=Hhbbf9v8qqaqFr0xc9pk0xbba9q8WqFfeaY=biLkVcLq=JHqpepeea0=as0Fb9pgeaYRXxe9vr0=vr0=vqpWqaaeaabiGaciaacaqabeaadaqaaqaaaOqaaiaabgdacaGG3cGaaeymaiaaicdacaaIWaGaaeymaiaabgdacaaIWaGaaGimaiaabgdacaqGXaGaaGimaiaaicdacaqGXaGaaeymaiaaicdacaaIWaGaaeymaiaabgdacaaIWaGaaGimaiaabgdacaqGXaGaaGimaiaabgdacqGHxdaTcaaIYaWaaWbaaSqabeaacqGHsislcaaI0aaaaaaa@4CEA@

Formally,

( 1 ) s d 0 · d 1 d 2 ... d p 1 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVCI8FfYJH8YrFfeuY=Hhbbf9v8qqaqFr0xc9pk0xbba9q8WqFfeaY=biLkVcLq=JHqpepeea0=as0Fb9pgeaYRXxe9vr0=vr0=vqpWqaaeaabiGaciaacaqabeaadaqaaqaaaOqaaiaacIcacqGHsislcaaIXaGaaiykamaaCaaaleqabaGaam4CaaaakiaadsgadaWgaaWcbaGaaGimaaqabaGccqWIpM+zcaWGKbWaaSbaaSqaaiaaigdaaeqaaOGaamizamaaBaaaleaacaaIYaaabeaakiaac6cacaGGUaGaaiOlaiaadsgadaWgaaWcbaGaamiCaiabgkHiTiaaigdaaeqaaaaa@47D8@ be represents the value ( 1 ) s ( d 0 + d 1 b 1 + d 2 b 2 ... d 1 b ( p 1 ) ) b e MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVCI8FfYJH8YrFfeuY=Hhbbf9v8qqaqFr0xc9pk0xbba9q8WqFfeaY=biLkVcLq=JHqpepeea0=as0Fb9pgeaYRXxe9vr0=vr0=vqpWqaaeaabiGaciaacaqabeaadaqaaqaaaOqaaiaacIcacqGHsislcaaIXaGaaiykamaaCaaaleqabaGaam4CaaaakiaacIcacaWGKbWaaSbaaSqaaiaaicdaaeqaaOGaey4kaSIaamizamaaBaaaleaacaaIXaaabeaakiaadkgadaahaaWcbeqaaiabgkHiTiaaigdaaaGccqGHRaWkcaWGKbWaaSbaaSqaaiaaikdaaeqaaOGaamOyamaaCaaaleqabaGaeyOeI0IaaGOmaaaakiaac6cacaGGUaGaaiOlaiaadsgadaWgaaWcbaGaeyOeI0IaaGymaaqabaGccaWGIbWaaWbaaSqabeaacaGGOaGaamiCaiabgkHiTiaaigdacaGGPaaaaOGaaiykaiaadkgadaahaaWcbeqaaiaadwgaaaaaaa@5439@

In brief, a normalized representation of a real number consist of

  • The range of the number : the number of digits in the exponent (i.e. by e max MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVCI8FfYJH8YrFfeuY=Hhbbf9v8qqaqFr0xc9pk0xbba9q8WqFfeaY=biLkVcLq=JHqpepeea0=as0Fb9pgeaYRXxe9vr0=vr0=vqpWqaaeaabiGaciaacaqabeaadaqaaqaaaOqaaiaadwgadaWgaaWcbaGaciyBaiaacggacaGG4baabeaaaaa@39CC@ ) and the base b to which it is raised
  • The precision : the number of digits p in the significand and its base b

Ieee 754/85 standard

There are many ways to represent floating point numbers. In order to improve portability most computers use the IEEE 754 floating point standard.

There are two primary formats:

  • 32 bit single precision.
  • 64 bit double precision.

Single precision consists of:

  • A single sign bit, 0 for positive and 1 for negative;
  • An 8 bit base-2 (b = 2) excess-127 exponent, with e min MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVCI8FfYJH8YrFfeuY=Hhbbf9v8qqaqFr0xc9pk0xbba9q8WqFfeaY=biLkVcLq=JHqpepeea0=as0Fb9pgeaYRXxe9vr0=vr0=vqpWqaaeaabiGaciaacaqabeaadaqaaqaaaOqaaiaadwgadaWgaaWcbaGaciyBaiaacMgacaGGUbaabeaaaaa@39CA@ = –126 (stored as 127 ( 10 ) 126 ( 10 ) = 1 = 00000001 ( 2 ) ) and e max = 127 (stored as 127 ( 10 ) + 127 ( 10 ) = 254 ( 10 ) = 11111110 ( 2 ) ).
  • a 23 bit base-2 (k=2) significand, with a hidden bit giving a precision of 24 bits (i.e. 1. d 1 d 2 ... d 23 )
Single precision memory format

Notes

  • Single precision has 24 bits precision, equivalent to about 7.2 decimal digits.
  • The largest representable non-infinite number is almost 2 × 2 127 3.402823 × 10 38
  • The smallest representable non-zero normalized number is 1 × 2 127 1.17549 × 10 38
  • Denormalized numbers (eg 0.01 × 2 126 ) can be represented.
  • There are two zeros, ± 0.
  • There are two infinities, ± .
  • A NaN (not a number) is used for results from undefined operations

Double precision floating point standard requires a 64 bit word

  • The first bit is the sign bit
  • The next eleven bits are the exponent bits
  • The final 52 bits are the fraction

Range of double numbers : [±2.225×10−308÷±1.7977×10308]

Double precision memory format

Get Jobilize Job Search Mobile App in your pocket Now!

Get it on Google Play Download on the App Store Now




Source:  OpenStax, Introduction to computer science. OpenStax CNX. Jul 29, 2009 Download for free at http://cnx.org/content/col10776/1.1
Google Play and the Google Play logo are trademarks of Google Inc.

Notification Switch

Would you like to follow the 'Introduction to computer science' conversation and receive update notifications?

Ask