From Wikipedia, the free encyclopedia
A text file (or plain text file) is a computer file that contains only ordinary textual characters with essentially no formatting. The term 'text file' is typically used in contrast with 'binary file', even though any file is fundamentally a sequence of bits, and many computer components (for example, hard disk circuitry and most system software) make no distinction between file types. Many application programs can read and use text files in some way, whereas the contents of any particular binary file can typically be interpreted by only a few programs. The distinction is therefore useful to computer users.
Text files are files in which most bytes (or short sequences of bytes) represent ordinary readable characters such as letters, digits, and punctuation (including spaces), together with a few control characters such as tabs, line feeds and carriage returns. This simplicity allows a wide variety of programs to display their contents.
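The "most bytes are readable characters" description can be turned into a rough test. The following is a minimal sketch, not a standard algorithm: the function name looks_like_text and the 95% threshold are assumptions made for illustration, and the check only recognises ASCII-style text, so a file consisting mostly of non-ASCII UTF-8 characters would be misjudged.

```python
def looks_like_text(data: bytes, threshold: float = 0.95) -> bool:
    """Guess whether a byte string is ASCII-style text (a heuristic, not a rule)."""
    if not data:
        return True  # treat an empty file as trivially textual
    if b"\x00" in data:
        return False  # NUL bytes rarely appear in ASCII-based text
    # Printable ASCII (space through tilde) plus tab, line feed, carriage return.
    allowed = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}
    printable = sum(1 for byte in data if byte in allowed)
    return printable / len(data) >= threshold


if __name__ == "__main__":
    print(looks_like_text(b"Hello, world!\r\n"))
    print(looks_like_text(b"\x89PNG\r\n\x1a\n\x00\x00"))
```

Tools that make this kind of guess generally combine several signals; a single ratio such as this one is easy to fool and is shown only to make the definition concrete.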
The similar term plaintext is most commonly used in a cryptographic context and refers to unencrypted data. The similarity sometimes causes confusion, especially among those new to computers, cryptography, or data communications.
Generally, a text file contains characters in an ASCII-based encoding, or much less commonly an EBCDIC-based encoding, without any embedded information such as font information, hyperlinks or inline images. Text files are often encoded in an extension of ASCII; these include ISO 8859, EUC, Windows code pages such as Windows-1252, the Mac OS Roman encoding used by classic Mac OS, and the Unicode encoding scheme UTF-8. UTF-16, another Unicode encoding scheme common on many platforms, is also widely used for text, although its byte sequences are not compatible with ASCII.
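A short Python sketch can illustrate how these encodings relate. It assumes the sample strings "plain text" and "café", chosen only for demonstration: ASCII-only text produces identical bytes in several ASCII extensions, while a single accented letter is stored differently in each encoding and is misread when decoded with the wrong one.

```python
# ASCII-only text is byte-for-byte identical in these ASCII extensions.
ascii_only = "plain text"
same_bytes = {
    ascii_only.encode(name)
    for name in ("ascii", "iso-8859-1", "cp1252", "utf-8")
}

# A non-ASCII character is where the encodings diverge.
word = "café"
as_utf8 = word.encode("utf-8")         # 'é' becomes two bytes
as_latin1 = word.encode("iso-8859-1")  # 'é' becomes one byte
as_utf16 = word.encode("utf-16-le")    # every character here takes two bytes

# Decoding UTF-8 bytes as ISO 8859-1 yields the familiar garbled result.
garbled = as_utf8.decode("iso-8859-1")

if __name__ == "__main__":
    print(len(same_bytes), len(as_utf8), len(as_latin1), len(as_utf16))
    print(garbled)
```

Because the bytes alone do not say which encoding was used, the encoding usually has to be known or declared separately from the file's contents.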
Although text files are often meant for humans to read, they are also commonly used for data storage by computer programs. Text files have some advantages even for data storage because they avoid certain problems with binary files, such as endianness, padding bytes, or differences in the number of bytes in a machine word. Further, when data corruption occurs in a file used for data storage, it is often far easier for a human to fix if it is a text file. It may also be easier for a program to recover from the error, because text files are typically verbose and redundant (they have a low entropy rate), while binary files are usually compact. Damaging a given number of bytes in a text file therefore tends to destroy less information than damaging the same number of bytes in a binary file.
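The endianness problem can be seen in a small Python sketch. The value 1000 and the 4-byte unsigned layout are arbitrary choices for illustration: the binary form depends on byte order and is silently misread by a reader that assumes the other order, while the text form is the same sequence of ASCII digits everywhere.

```python
import struct

value = 1000

# Binary form: four bytes whose order depends on the chosen endianness.
little = struct.pack("<I", value)  # least significant byte first
big = struct.pack(">I", value)     # most significant byte first

# A reader that assumes the wrong byte order silently gets a different number.
(misread,) = struct.unpack(">I", little)

# Text form: ASCII digits, identical regardless of the machine's byte order.
as_text = str(value).encode("ascii")
round_trip = int(as_text.decode("ascii"))

if __name__ == "__main__":
    print(little, big, misread)
    print(as_text, round_trip)
```

The text form avoids the byte-order question at the cost of a parsing step and, for large numbers, more bytes per value.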
Text files usually have the MIME type "text/plain", often with an additional parameter indicating the encoding. Prior to the advent of Mac OS X, Mac OS regarded the content of a file (the data fork) as text when its resource fork indicated that the type of the file was "TEXT". Under the Windows operating system, a file is regarded as a text file if the suffix of its name (the "extension") is "txt". However, many other suffixes are used for text files with specific purposes. For example, source code for computer programs is usually kept in text files whose file name suffixes indicate the programming language in which the source is written.
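As an illustration of suffix-based typing, the sketch below uses Python's standard mimetypes module to map a file name to a MIME type and append an encoding parameter. The helper name content_type_for and the default charset of UTF-8 are assumptions for this example; the suffix only suggests a type, and nothing here inspects the file's actual contents or encoding.

```python
import mimetypes
from typing import Optional


def content_type_for(filename: str, charset: str = "utf-8") -> Optional[str]:
    """Return a 'type; charset=...' string for text types, else None."""
    mime, _ = mimetypes.guess_type(filename)  # guess from the suffix only
    if mime is None or not mime.startswith("text/"):
        return None
    # The charset is supplied by the caller; it is not detected from the file.
    return f"{mime}; charset={charset}"


if __name__ == "__main__":
    print(content_type_for("notes.txt"))
    print(content_type_for("archive.unknownsuffix"))
```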
The ASCII standard allows ASCII-only plain text files (unlike most other file types) to be freely interchanged and read on Unix, Macintosh, Microsoft Windows, DOS, and other systems. These systems differ in their preferred line-ending convention (see newline) and in their interpretation of byte values outside the ASCII range (their character encoding).
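Differences in line-ending convention are commonly handled by converting to a single form when a file is read. The following minimal Python sketch, with the hypothetical helper normalize_newlines, rewrites carriage return plus line feed pairs (DOS and Windows) and lone carriage returns (classic Mac OS) as line feeds (Unix).

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert CR LF and lone CR line endings to LF."""
    # Replace the two-byte sequence first so its CR is not converted twice.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


if __name__ == "__main__":
    mixed = b"dos line\r\nold mac line\runix line\n"
    print(normalize_newlines(mixed))
    # str.splitlines() also recognises all three conventions.
    print(mixed.decode("ascii").splitlines())
```

The order of the two replacements matters: converting lone carriage returns first would turn each CR LF pair into two line feeds.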
Plain text is often used as a readable representation of other data that is not itself purely textual: for example, a formatted webpage is not plain text, but its HTML source is. Similarly, source code for computer programs is usually stored in text files, but is compiled into a binary form for execution.