4 User input and error handling
Music files. An MP3 file is much like a JPEG file: first, there is some
information about the music (artist, title, album, etc.), and then comes
the music itself as a stream of bytes. A typical MP3 file has a size of
something like five million bytes or five megabytes (5 MB). The exact
size depends on the complexity of the music, the length of the track,
and the MP3 resolution. On a 16 GB MP3 player you can then store
roughly 16,000,000,000/5,000,000 = 3200 MP3 files. MP3 is, like JPEG,
a compressed format. The complete data of a song on a CD (the WAV
file) contains about ten times as many bytes. As for pictures, the idea is
that one can throw away a lot of bytes in an intelligent way, such that
the human ear hardly detects the difference between a compressed and
uncompressed version of the music file.
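The storage estimate above is easy to redo in Python. The snippet below is only a sketch based on the round numbers from the text; the variable names are our own, and the factor of ten for uncompressed music is the rough figure quoted above, not an exact property of the formats.

```python
# Rough storage estimate, using the round numbers from the text
player_capacity = 16_000_000_000   # bytes on a 16 GB player
mp3_size = 5_000_000               # bytes in a typical MP3 file (assumed)
wav_factor = 10                    # uncompressed file is about ten times larger

num_mp3 = player_capacity // mp3_size
num_wav = player_capacity // (mp3_size * wav_factor)

print('MP3 files:', num_mp3)            # 3200
print('uncompressed files:', num_wav)   # 320
```

With these assumptions, the same player holds only about a tenth as many uncompressed tracks.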
PDF files. Looking at a PDF file in a pure text editor shows that the file
contains some readable text mixed with some unreadable characters. It
is not possible for a human to look at the stream of bytes and deduce the
text in the document (well, assuming that there are always
some strange people doing strange things, there might be somebody out
there who, with a lot of training, can interpret the pure PDF code by
eye). A PDF file reader can easily interpret the contents of the file
and display the text in a human-readable form on the screen.
Remarks. We have repeated many times that a file is just a stream of
bytes. A human can interpret (read) the stream of bytes if it makes sense
in a human language - or a computer language (provided the human is
a programmer). When the series of bytes does not make sense to any
human, a computer program must be used to interpret the sequence of
characters.
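A small experiment can illustrate the difference between bytes that make sense as text and bytes that do not. The function name looks_like_text below is our own invention, and the test is deliberately crude: it only checks whether the bytes can be decoded as UTF-8, which is an assumption about the text encoding and not a general way of recognizing file types.

```python
def looks_like_text(data):
    """Return True if the bytes can be decoded as UTF-8 text."""
    try:
        data.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True

# Bytes produced by typing text in an editor
text_bytes = 'Hello, world!'.encode('utf-8')
# The first four bytes of a typical JPEG file
other_bytes = bytes([0xff, 0xd8, 0xff, 0xe0])

print(looks_like_text(text_bytes))    # True
print(looks_like_text(other_bytes))   # False
```

The second sequence is perfectly meaningful to a JPEG viewer, but it is not valid text, so a program that knows the JPEG format is needed to interpret it.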
Think of a report. When you write the report as pure text in a text
editor, the resulting file contains just the characters you typed in from
the keyboard. On the other hand, if you applied a word processor like
Microsoft Word or LibreOffice, the report file contains a large number
of extra bytes describing properties of the formatting of the text. This
stream of extra bytes does not make sense to a human, and a computer
program is required to interpret the file content and display it in a
form that a human can understand. Behind the sequence of bytes in the
file there are strict rules telling what the series of bytes means. These
rules reflect the file format. When the rules or file format is publicly
documented, a programmer can use this documentation to make her own
program for interpreting the file contents (however, interpreting such files
is much more complicated than our examples on reading human-readable
files in this book). It happens, though, that secret file formats are used,
which require certain programs from certain companies to interpret the
files.
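To get a feeling for what such rules look like, one can invent a tiny file format. The format below is purely hypothetical and made up for this illustration: a 4-byte tag DEMO, then a 2-byte integer giving the length of a title, then the title itself as UTF-8 text. The sketch uses Python's standard struct module to pack and unpack the bytes; real formats like MP3 or PDF follow the same idea but with far more elaborate rules.

```python
import struct

# Rules of our made-up format: little-endian, a 4-byte tag,
# then an unsigned 16-bit integer holding the title length
HEADER = '<4sH'

def write_record(title):
    """Return the bytes of a record holding the given title."""
    title_bytes = title.encode('utf-8')
    return struct.pack(HEADER, b'DEMO', len(title_bytes)) + title_bytes

def read_record(data):
    """Interpret the bytes according to the rules and return the title."""
    header_size = struct.calcsize(HEADER)
    tag, length = struct.unpack(HEADER, data[:header_size])
    if tag != b'DEMO':
        raise ValueError('not a DEMO record')
    return data[header_size:header_size + length].decode('utf-8')

data = write_record('My song')
print(data)                # the raw bytes, partly unreadable
print(read_record(data))   # My song
```

Anyone who knows the three rules can write a read_record function of their own. Without the rules, the first six bytes are just noise in front of the title.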