
What is ASCII?

We explain what the ASCII code is and what this code of written characters is for, with a complete table of ASCII codes. In addition: ASCII art and examples.


In computer science, ASCII (pronounced "askee"), or ASCII code, is a code of written characters based on the Latin alphabet as used in modern English. It originated as a reworking of the telegraph codes in use until 1963, carried out by the American Standards Association (ASA, today the American National Standards Institute, ANSI). Its name is an acronym for American Standard Code for Information Interchange.

The original ASCII code used 7 bits to represent each character; an additional eighth bit was commonly used for error checking (for a total of 8 bits, that is, one byte). It should not be confused with the various current 8-bit codes that extend ASCII to incorporate signs from languages other than English.

Simply put, it is a numerical translation of the alphabet used by English, since computer systems handle only binary code (0 and 1) to represent their logical operations. Each character (letter, sign, or even blank space) thus corresponds in ASCII to a seven-bit number, usually stored in an eight-bit byte whose leading bit is zero.

The ASCII standard was first published in 1963, substantially revised in 1967, and last updated in 1986, bringing it to its contemporary version with 33 non-printable control characters (codes 0 through 31, plus 127) and 95 printable characters (codes 32 through 126). It is a code used almost universally by today's computer systems and is essential for handling text input devices such as keyboards.

As use of the code grew, extended versions of ASCII were created to accommodate languages other than English as well as logical, mathematical, and specialized scientific symbols. It even gave rise to "ASCII art": images built by strategically positioning characters on the page so that, viewed from afar, they form figures and drawings.

Overview

Computers only understand numbers. ASCII code is a numeric representation of a character such as ‘a’ or ‘@’.
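As a minimal sketch of this character-to-number correspondence, the following Python snippet uses the built-in `ord` and `chr` functions. The sample string and variable names are illustrative only.

```python
# Map characters to their ASCII code numbers and back again.
text = "a@ "
codes = [ord(ch) for ch in text]             # character -> number
restored = "".join(chr(n) for n in codes)    # number -> character

for ch, n in zip(text, codes):
    # Show each character with its decimal code and 7-bit binary form.
    print(repr(ch), n, format(n, "07b"))
```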

Like other character encodings, ASCII is a method for mapping bit strings to a series of symbols (alphanumeric and others), thus allowing digital devices to communicate, process, and store text. The ASCII character code, or a compatible extension of it (see below), is used in almost all computers, especially personal computers and workstations. The most appropriate name for this character code is "US-ASCII".

ASCII is strictly a seven-bit code, which means that it represents character information with bit strings of seven binary digits (values ranging from 0 to 127 in decimal).

At the time the ASCII code was introduced, many computers worked with groups of eight bits (bytes or octets) as the minimum unit of information; the eighth bit was commonly used as a parity bit for error control on communication lines, or for other device-specific functions.
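The parity scheme described above can be sketched as follows. This assumes even parity with the parity bit in the most significant position; real systems also used odd parity or other conventions, and the function names are hypothetical.

```python
def with_even_parity(code: int) -> int:
    """Return an 8-bit value: a 7-bit ASCII code plus an even-parity bit in bit 7."""
    if not 0 <= code <= 0x7F:
        raise ValueError("not a 7-bit ASCII code")
    ones = bin(code).count("1")
    parity = ones % 2              # 1 when the count of 1-bits is odd
    return (parity << 7) | code    # total number of 1-bits is now even


def parity_ok(byte: int) -> bool:
    """Check that a received 8-bit value has an even number of 1-bits."""
    return bin(byte).count("1") % 2 == 0


sent = with_even_parity(ord("C"))   # 'C' is 100 0011, three 1-bits
print(format(sent, "08b"), parity_ok(sent))
```

A single flipped bit makes `parity_ok` return `False`; two flipped bits go undetected, which is a known limit of simple parity.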

Machines that did not use parity checking would set the eighth bit to zero in most cases, although other systems, such as Prime computers running PRIMOS, set the eighth bit of the ASCII code to one.

The ASCII code defines a relationship between specific characters and sequences of bits and reserves a few control codes for text processing. It does not define any mechanism to describe the structure or appearance of the text in a document; these matters are specified by other languages, such as markup languages.

ASCII Control Characters

The ASCII code reserves the first 32 codes (numbered 0 through 31 in decimal) for control characters: codes not originally intended to represent printable information, but to control devices (such as printers) that used ASCII. For example, character 10 represents the "line feed" function, which makes a printer advance the paper, and character 27 represents the "escape" key often found in the upper left corner of common keyboards.

Code 127 (all seven bits set to one), another special character, is equivalent to "delete". Although this function resembles other control characters, the ASCII designers devised this code to be able to "erase" a section of punched paper tape (a popular storage medium until the 1980s) by punching all possible holes in a specific character position, replacing any previous information. Since code 0 was also ignored, it was possible to leave gaps (regions without holes) and make corrections later.

Many of the ASCII control characters were used to mark data packets or to control data transmission protocols (for example ENQuiry, meaning "is there a station out there?"; ACKnowledge, meaning "received"; Start Of Heading; Start of TeXt; End of TeXt; and so on).

ESCape and SUBstitute allowed a communications protocol, for example, to mark binary data that contained codes with the same values as protocol characters, so that the receiver would interpret them as data rather than as protocol characters. The designers of the ASCII code devised the separator characters for use in magnetic tape systems.

Two of the device control characters, commonly called XON and XOFF, generally served as flow-control characters, regulating the flow from a fast device (such as a computer) to a slow device (such as a printer) so that the data would not overwhelm the reception capacity of the slow device and be lost.

The early users of ASCII adopted some of the control codes to represent “meta-information” such as end-of-line, beginning/end of a data item, and so on.

These assignments often conflicted, so part of the effort to convert data from one format to another involves making the correct meta-information conversions. For example, the character that represents the end-of-line in text files varies with the operating system.

When files are copied from one system to another, the conversion system must recognize these characters as end-of-line marks and act accordingly.
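A minimal sketch of such a conversion, assuming the three common conventions (LF on Unix-like systems, CR LF on Windows, and a bare CR on classic Mac OS). The function name and its `newline` parameter are hypothetical.

```python
def normalize_newlines(data: bytes, newline: bytes = b"\n") -> bytes:
    """Rewrite every end-of-line mark (CR LF, CR, or LF) as `newline`."""
    # Handle CR LF first so it is not counted as two separate line ends.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", newline)


mixed = b"first\r\nsecond\rthird\n"
print(normalize_newlines(mixed))             # all line ends become LF
print(normalize_newlines(mixed, b"\r\n"))    # all line ends become CR LF
```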

Today, ASCII users rely on far fewer control characters (with some exceptions, such as "carriage return" or "newline"). Modern markup languages, modern communication protocols, the shift from text-based to graphics-based devices, and the decline of teleprinters, punch cards, and continuous paper have rendered most control characters obsolete.

Binary Dec Hex Abbreviation Caret Name / Meaning
0000 0000 0 00 NUL ^@ Null character
0000 0001 1 01 SOH ^A Start of heading
0000 0010 2 02 STX ^B Start of text
0000 0011 3 03 ETX ^C End of text
0000 0100 4 04 EOT ^D End of transmission
0000 0101 5 05 ENQ ^E Enquiry
0000 0110 6 06 ACK ^F Acknowledge
0000 0111 7 07 BEL ^G Bell
0000 1000 8 08 BS ^H Backspace
0000 1001 9 09 HT ^I Horizontal tab
0000 1010 10 0A LF ^J Line feed
0000 1011 11 0B VT ^K Vertical tab
0000 1100 12 0C FF ^L Form feed
0000 1101 13 0D CR ^M Carriage return
0000 1110 14 0E SO ^N Shift out
0000 1111 15 0F SI ^O Shift in
0001 0000 16 10 DLE ^P Data link escape
0001 0001 17 11 DC1 ^Q Device control 1 (XON)
0001 0010 18 12 DC2 ^R Device control 2
0001 0011 19 13 DC3 ^S Device control 3 (XOFF)
0001 0100 20 14 DC4 ^T Device control 4
0001 0101 21 15 NAK ^U Negative acknowledge
0001 0110 22 16 SYN ^V Synchronous idle
0001 0111 23 17 ETB ^W End of transmission block
0001 1000 24 18 CAN ^X Cancel
0001 1001 25 19 EM ^Y End of medium
0001 1010 26 1A SUB ^Z Substitute
0001 1011 27 1B ESC ^[ Escape
0001 1100 28 1C FS ^\ File separator
0001 1101 29 1D GS ^] Group separator
0001 1110 30 1E RS ^^ Record separator
0001 1111 31 1F US ^_ Unit separator
0111 1111 127 7F DEL ^? Delete
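The caret notation in the table above (for example ^J for line feed) follows a simple rule: flip bit 6 of the control code (XOR with 0x40) and print the resulting character after a caret. A short sketch, with a hypothetical function name:

```python
def caret_notation(code: int) -> str:
    """Return the caret notation (such as '^J') for an ASCII control code."""
    if not (0 <= code <= 31 or code == 127):
        raise ValueError("not an ASCII control character")
    # Flipping bit 6 maps 0..31 onto '@'..'_' and 127 onto '?'.
    return "^" + chr(code ^ 0x40)


for code in (0, 10, 13, 27, 127):
    print(code, caret_notation(code))
```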

ASCII Printable Characters

The character ‘space’ designates the space between words, and is normally produced by the space bar on a keyboard. Codes 32 through 126 are known as printable characters, and represent letters, digits, punctuation marks, and various symbols.

Seven-bit ASCII provides seven "national" characters and, if the specific hardware and software combination allows it, key combinations can be used to simulate other international characters: in these cases a backspace can precede a grave accent (which in the British and American standards, but only in these standards, is also called an "opening single quotation mark"), a tilde, or a "breath mark".

Binary Dec Hex Representation
0010 0000 32 20 (space)
0010 0001 33 21 !
0010 0010 34 22 "
0010 0011 35 23 #
0010 0100 36 24 $
0010 0101 37 25 %
0010 0110 38 26 &
0010 0111 39 27 '
0010 1000 40 28 (
0010 1001 41 29 )
0010 1010 42 2A *
0010 1011 43 2B +
0010 1100 44 2C ,
0010 1101 45 2D -
0010 1110 46 2E .
0010 1111 47 2F /
0011 0000 48 30 0
0011 0001 49 31 1
0011 0010 50 32 2
0011 0011 51 33 3
0011 0100 52 34 4
0011 0101 53 35 5
0011 0110 54 36 6
0011 0111 55 37 7
0011 1000 56 38 8
0011 1001 57 39 9
0011 1010 58 3A :
0011 1011 59 3B ;
0011 1100 60 3C <
0011 1101 61 3D =
0011 1110 62 3E >
0011 1111 63 3F ?

 

Binary Dec Hex Representation
0100 0000 64 40 @
0100 0001 65 41 A
0100 0010 66 42 B
0100 0011 67 43 C
0100 0100 68 44 D
0100 0101 69 45 E
0100 0110 70 46 F
0100 0111 71 47 G
0100 1000 72 48 H
0100 1001 73 49 I
0100 1010 74 4A J
0100 1011 75 4B K
0100 1100 76 4C L
0100 1101 77 4D M
0100 1110 78 4E N
0100 1111 79 4F O
0101 0000 80 50 P
0101 0001 81 51 Q
0101 0010 82 52 R
0101 0011 83 53 S
0101 0100 84 54 T
0101 0101 85 55 U
0101 0110 86 56 V
0101 0111 87 57 W
0101 1000 88 58 X
0101 1001 89 59 Y
0101 1010 90 5A Z
0101 1011 91 5B [
0101 1100 92 5C \
0101 1101 93 5D ]
0101 1110 94 5E ^
0101 1111 95 5F _

 

Binary Dec Hex Representation
0110 0000 96 60 `
0110 0001 97 61 a
0110 0010 98 62 b
0110 0011 99 63 c
0110 0100 100 64 d
0110 0101 101 65 e
0110 0110 102 66 f
0110 0111 103 67 g
0110 1000 104 68 h
0110 1001 105 69 i
0110 1010 106 6A j
0110 1011 107 6B k
0110 1100 108 6C l
0110 1101 109 6D m
0110 1110 110 6E n
0110 1111 111 6F o
0111 0000 112 70 p
0111 0001 113 71 q
0111 0010 114 72 r
0111 0011 115 73 s
0111 0100 116 74 t
0111 0101 117 75 u
0111 0110 118 76 v
0111 0111 119 77 w
0111 1000 120 78 x
0111 1001 121 79 y
0111 1010 122 7A z
0111 1011 123 7B {
0111 1100 124 7C |
0111 1101 125 7D }
0111 1110 126 7E ~
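The rows of the printable-character tables follow a regular pattern, so they can be regenerated rather than copied by hand. The sketch below uses a hypothetical helper, `table_row`, that formats one row in the same binary, decimal, hex, character layout.

```python
def table_row(ch: str) -> str:
    """Format one printable ASCII character as 'binary dec hex char'."""
    code = ord(ch)
    if not 32 <= code <= 126:
        raise ValueError("not a printable ASCII character")
    bits = format(code, "08b")                 # eight digits, leading zero
    return f"{bits[:4]} {bits[4:]} {code} {code:02X} {ch}"


printable = [chr(code) for code in range(32, 127)]
for ch in printable:
    print(table_row(ch))
```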

History

The ASCII code was developed from telegraph codes and was first used commercially as a teleprinter code promoted by Bell data services.

Bell had planned to use a six-bit code derived from Fieldata, which added punctuation and lowercase letters to the older Baudot teleprinter code, but was persuaded to join the American Standards Association (ASA) subcommittee, which had started to develop the ASCII code.

The Baudot code helped automate the sending and receiving of telegraphic messages and took many characteristics from Morse code; however, unlike Morse code, Baudot used constant-length codes.

Compared with early telegraph codes, the code proposed by Bell and the ASA was rearranged to be more convenient for sorting lists (especially since it was arranged alphabetically) and added features such as the "escape sequence".

The American Standards Association (ASA), later to become the American National Standards Institute (ANSI), first published the ASCII code in 1963.

The ASCII published in 1963 had an upward pointing arrow (↑) in place of the circumflex (^) and an arrow pointing to the left in place of the underscore (_). The 1967 version added lowercase letters, changed the names of some control codes, and moved the two control codes ACK and ESC from the lowercase letter zone to the control code zone.

ASCII was subsequently updated and published as ANSI X3.4-1968, ANSI X3.4-1977, and finally ANSI X3.4-1986. Other standardization bodies have published character codes that are identical to ASCII. These character codes are often referred to as ASCII, even though ASCII is strictly defined only by the ASA / ANSI standards:

  • The European Computer Manufacturers Association (ECMA) published editions of its ASCII clone, ECMA-6, in 1965, 1967, 1970, 1973, 1983, and 1991. The 1991 edition is identical to ANSI X3.4-1986.
  • The International Organization for Standardization (ISO) published its version, ISO 646 (later ISO/IEC 646), in 1967, 1972, 1983, and 1991. In particular, ISO 646:1972 established a set of country-specific versions in which some punctuation characters were replaced with non-English characters. The International Reference Version of ISO/IEC 646:1991 is the same as ANSI X3.4-1986.
  • The International Telecommunication Union ( ITU ) published its version of ANSI X3.4-1986, Recommendation ITU T.50, in 1992. In the early 1970s, it published a version as CCITT Recommendation V.3.
  • DIN published a version of ASCII as the DIN 66003 standard in 1974.
  • The Internet Engineering Task Force (IETF) published a version in 1969 as RFC 20, and established the standard version for the Internet, based on ANSI X3.4-1986, with the publication of RFC 1345 in 1992.
  • The IBM version of ANSI X3.4-1986 was published in IBM technical literature as code page 367.

ASCII code is also included in Unicode, constituting the first 128 characters (or the ‘lowest’).

  • The at sign, represented by the character @, is a fundamental component of email addresses, where it separates the user name from the domain name in the user@provider format.
  • It is also used in various computer applications with different functions, such as denoting a user account on Twitter (@user), Telegram, Instagram, etc. It has also become a symbol of the Internet par excellence, even serving as a pictogram on signs to indicate an Internet cafe or a place with Internet access. Within the ASCII code, it is represented by the number 64.
  • The Spanish name for the sign, "arroba", comes from the Arabic الربع (ar-rubʿ), which means "the fourth part", and was used in Spain for the unit of mass also called arroba. In English the sign is read "at" [æt], hence its use in computing.

Structural Features

  • The digits 0 through 9 are represented by their binary values prefixed with 0011 (this means that BCD-to-ASCII conversion is a simple matter of taking each BCD digit and prefixing it with 0011).
  • The bit strings of an uppercase letter and the corresponding lowercase letter differ by only one bit (the bit with value 32, or 0x20), which simplifies conversion from one group to the other.
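Both structural features reduce to single bitwise operations. The following is an illustrative sketch with hypothetical function names; the case toggle is only meaningful for the ASCII letters A to Z and a to z.

```python
def digit_to_ascii(digit: int) -> str:
    """Prefix a BCD digit (0..9) with 0011 to obtain its ASCII character."""
    if not 0 <= digit <= 9:
        raise ValueError("not a decimal digit")
    return chr(0x30 | digit)        # 0x30 is 0011 0000


def ascii_to_digit(ch: str) -> int:
    """Keep only the low four bits of an ASCII digit character."""
    return ord(ch) & 0x0F


def toggle_case(letter: str) -> str:
    """Flip the single bit (0x20) that separates upper and lower case."""
    return chr(ord(letter) ^ 0x20)


print(digit_to_ascii(7), ascii_to_digit("9"), toggle_case("k"), toggle_case("K"))
```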

Other Names For ASCII

RFC 1345 (published in June 1992) and the IANA character set registry recognize the following alternative names for ASCII for use on the Internet.

  • ANSI_X3.4-1968 (canonical name)
  • ANSI_X3.4-1986
  • ASCII
  • US-ASCII (recommended MIME name)
  • us
  • ISO646-US
  • ISO_646.irv:1991
  • iso-ir-6
  • IBM367
  • cp367
  • csASCII

Of these, only the names "US-ASCII" and "ASCII" are widely used. They are often found in the optional "charset" parameter of the Content-Type header of some MIME messages, in the equivalent "meta" element of some HTML documents, and in the character encoding declaration of some XML documents.
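As an illustration, Python's standard `codecs` module accepts several of these names as aliases for its single `ascii` codec. This reflects the alias table that ships with CPython; which names other languages or libraries accept is not covered here, and the subset below is only a sample.

```python
import codecs

# A sample of the names listed above; lookup is case-insensitive.
names = ["US-ASCII", "ASCII", "ANSI_X3.4-1968", "ISO646-US", "IBM367", "cp367"]

# codecs.lookup returns a CodecInfo object whose .name is the canonical codec.
resolved = {name: codecs.lookup(name).name for name in names}

for name, canonical in resolved.items():
    print(f"{name} -> {canonical}")
```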

Variants of ASCII

As computer technology spread throughout the world, standards bodies and companies developed many variations of the ASCII code to make it easier to write languages other than English that use Latin alphabets.

Some of these variations are classified as "Extended ASCII", although the term is sometimes misapplied to cover all variations, even those that do not preserve the original seven-bit ASCII character set.

ISO 646 (1972), the first attempt to remedy the pro-English bias of character encoding, created compatibility problems because it was also a 7-bit character code.

It did not add any codes, so it reassigned some existing ones specifically for the new languages. As a result, it became impossible to know in which variant a text was encoded and, consequently, text processors could handle only one variant.

As technology improved, the eighth bit of each byte no longer had to be reserved for parity or similar functions. Freeing this bit added another 128 character codes that were available for new assignments.

For example, IBM developed 8-bit code pages, such as code page 437, which replaced control characters with graphic symbols such as smiley faces and mapped additional graphic characters to the upper 128 positions of the code page.

Some operating systems, such as DOS, could work with these code pages, and personal computer manufacturers included support for these pages in their hardware.

Eight-bit standards like ISO 8859 and Mac OS Roman were developed as true extensions to ASCII, leaving the first 128 characters intact and adding values only above the 7-bit range.

This allowed the representation of a wider range of languages, but these standards continued to suffer from incompatibilities and limitations. Still, today, ISO-8859-1 and its variant Windows-1252 (sometimes mistakenly called ISO-8859-1) and the original 7-bit ASCII code are the most commonly used character codes.
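The incompatibility can be seen by decoding the same byte under different 8-bit encodings. This sketch relies on the `latin-1` (ISO-8859-1), `cp1252` (Windows-1252), and `cp437` codecs bundled with Python; the byte values chosen are arbitrary examples.

```python
# One byte from the upper half (above 127), decoded three ways.
high_byte = bytes([0xE9])
as_latin1 = high_byte.decode("latin-1")   # 'é' in ISO-8859-1
as_cp1252 = high_byte.decode("cp1252")    # same character in Windows-1252
as_cp437 = high_byte.decode("cp437")      # a different character in code page 437
print(as_latin1, as_cp1252, as_cp437)

# The lower half (0..127) decodes identically in all of them.
low_half = bytes(range(128))
same_low_half = (
    low_half.decode("latin-1")
    == low_half.decode("cp437")
    == low_half.decode("ascii")
)
print(same_low_half)

# ISO-8859-1 and Windows-1252 differ in the range 0x80..0x9F.
euro_differs = bytes([0x80]).decode("cp1252") != bytes([0x80]).decode("latin-1")
print(euro_differs)
```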

Unicode and ISO / IEC 10646 Universal Character Set (UCS) define a much larger character set, and their different forms of encoding have started to quickly replace ISO 8859 and ASCII in many environments.

While ASCII basically uses 7-bit codes, Unicode and UCS use relatively abstract "code points": non-negative integers that identify characters and that an encoding form maps to sequences of one or more code units of 8 or more bits. For compatibility, Unicode and UCS assign the first 128 code points to the same characters as the ASCII code.

In this way, you can think of ASCII as a very small subset of Unicode and UCS. The popular UTF-8 encoding uses one to four 8-bit values for each code point, and the first 128 code points are encoded as single bytes identical to ASCII.

Other character encodings such as UTF-16 give the first 128 Unicode characters the same numeric values as ASCII but store each character in 16 or 32 bits, so they require proper conversion for compatibility between the two character codes.
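A brief sketch of this relationship using Python's built-in codecs. The sample strings are arbitrary, and `utf-16-le` is chosen so that no byte order mark is added.

```python
text = "ASCII"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")        # byte-for-byte the same as ASCII
utf16_bytes = text.encode("utf-16-le")   # two bytes per character here

print(ascii_bytes, utf8_bytes, utf16_bytes)

# Outside ASCII, UTF-8 needs two to four bytes per code point.
for ch in ("é", "€", "😀"):
    print(ch, len(ch.encode("utf-8")), len(ch.encode("utf-16-le")))
```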

The word ASCIIbetical describes sorting according to the order of ASCII codes instead of alphabetical order.
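For example, Python's default string comparison orders ASCII text by code value, so digits sort before uppercase letters, which sort before lowercase letters. The word list below is made up for illustration.

```python
words = ["banana", "Zebra", "apple", "_tmp", "42"]

# Default sort: compares character codes ("ASCIIbetical" for ASCII text).
asciibetical = sorted(words)

# Case-insensitive sort, closer to ordinary alphabetical order.
alphabetical = sorted(words, key=str.casefold)

print(asciibetical)
print(alphabetical)
```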

The abbreviation ASCIIZ or ASCIZ refers to a character string terminated by a null character (code 0). The ASCII code is very often embedded in other, more sophisticated coding systems, so it is worth being clear about the role it plays in the character table or character map of a computer.
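A minimal sketch of the ASCIIZ convention, with hypothetical helper names: the string's bytes are followed by a single zero byte, and a reader stops at the first zero it finds.

```python
def to_asciiz(text: str) -> bytes:
    """Encode text as ASCII and append the terminating null byte."""
    return text.encode("ascii") + b"\x00"


def from_asciiz(buffer: bytes) -> str:
    """Read characters up to (not including) the first null byte."""
    end = buffer.index(b"\x00")    # raises ValueError if no terminator exists
    return buffer[:end].decode("ascii")


raw = to_asciiz("Hi")
print(raw, from_asciiz(raw + b"leftover bytes"))
```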

ASCII Art 


The ASCII character code is the medium of a niche artistic discipline, ASCII art, which consists of composing images from printable ASCII characters.

The resulting effect has been compared to pointillism, as images produced with this technique are generally appreciated in more detail when viewed from a distance. ASCII art began as an experimental art, but it soon became popular as a resource to represent images on media unable to process graphics, such as teletypes, terminals, e-mails, or some printers.


Although you can compose ASCII art manually using a text editor, you can also convert images and videos to ASCII automatically using software such as the freely licensed AAlib library, which has become quite popular. AAlib is supported by some graphic design programs, games, and video players.
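One common approach to automatic conversion is to map the darkness of each pixel to a character that covers more or less of its cell. The sketch below illustrates that idea only; it does not use AAlib, and the character ramp and the 0 to 255 "ink" scale are arbitrary assumptions.

```python
# Characters ordered from lightest to darkest (an arbitrary choice).
RAMP = " .:-=+*#%@"


def to_ascii_art(pixels):
    """Convert rows of ink values (0 = blank, 255 = solid) to text lines."""
    lines = []
    for row in pixels:
        # Scale each 0..255 value to an index into RAMP.
        line = "".join(RAMP[value * (len(RAMP) - 1) // 255] for value in row)
        lines.append(line)
    return "\n".join(lines)


gradient = [[0, 64, 128, 192, 255], [255, 192, 128, 64, 0]]
print(to_ascii_art(gradient))
```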

ASCII Code Examples

Some examples of the ASCII binary codes for common characters are the following:

  • Character “A” : 0100 0001
  • Character “C” : 0100 0011
  • Character “!” : 0010 0001
  • Character “#” : 0010 0011
  • Character “/” : 0010 1111
  • Character “K” : 0100 1011
  • Character “k” : 0110 1011
  • Character “X” : 0101 1000
  • Character “x” : 0111 1000
  • Character “[” : 0101 1011
  • Character “=” : 0011 1101
  • Character “Z” : 0101 1010
  • Character “z” : 0111 1010
  • Character “:” : 0011 1010
  • Character “,” : 0010 1100
  • Character “.” : 0010 1110
  • Character “0” : 0011 0000
  • Character “6” : 0011 0110
  • Character “9” : 0011 1001
  • Character “+” : 0010 1011
  • Character “-” : 0010 1101
  • Character “]” : 0101 1101
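Going the other way, a binary string like those in the list above can be turned back into its character. This is a small sketch with a hypothetical function name; it accepts the spaced eight-digit form used in this article.

```python
def from_binary(bits: str) -> str:
    """Convert a binary string such as '0100 0001' to its ASCII character."""
    code = int(bits.replace(" ", ""), 2)   # parse as a base-2 number
    if code > 127:
        raise ValueError("value is outside the 7-bit ASCII range")
    return chr(code)


for bits in ("0100 0001", "0111 1010", "0010 1111"):
    print(bits, "->", from_binary(bits))
```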
