Chapter 6: The pspp language
double it, e.g. ‘'it''s an apostrophe'’. White space and case of letters are
significant inside strings.
Strings can be concatenated using ‘+’, so that ‘"a" + 'b' + 'c'’ is equivalent
to ‘'abc'’. So that a long string may be broken across lines, a line break may
precede the ‘+’, follow it, or both. (However, an entirely
blank line preceding or following the ‘+’ is interpreted as ending the current
command.)
Strings may also be expressed as hexadecimal character values by prefixing
the initial quote character by ‘x’ or ‘X’. Regardless of the syntax file or
active dataset’s encoding, the hexadecimal digits in the string are interpreted as
Unicode characters in UTF-8 encoding.
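The hexadecimal rule above can be modeled in a few lines of Python. This is only an illustrative sketch, not pspp's implementation: the function name decode_hex_string is hypothetical, and it assumes the digits between the quotes have already been extracted from the literal.

```python
def decode_hex_string(digits: str) -> str:
    """Model the body of an X'...' literal: hex digits read as UTF-8 bytes.

    Assumption: `digits` is the text between the quotes, prefix removed.
    Note that bytes.fromhex also tolerates spaces between byte pairs,
    which the real lexer may not accept.
    """
    raw = bytes.fromhex(digits)   # ValueError on odd length or non-hex input
    return raw.decode("utf-8")    # UnicodeDecodeError on invalid UTF-8


print(decode_hex_string("41"))    # prints "A" (the single byte 0x41)
```

Under this model, the four bytes F0 9D 84 9E decode to the single code point U+1D11E, because that is its UTF-8 encoding.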
Individual Unicode code points may also be expressed by specifying the
hexadecimal code point number in single or double quotes preceded by ‘u’ or ‘U’.
For example, Unicode code point U+1D11E, the musical G clef character, could
be expressed as U'1D11E'. Invalid Unicode code points (above U+10FFFF or
in between U+D800 and U+DFFF) are not allowed.
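The validity check described above can be sketched in Python as follows. The function name code_point_to_char is hypothetical, and the sketch assumes the hexadecimal digits have already been taken out of the quotes; it treats the range U+D800 through U+DFFF as inclusive, matching the Unicode surrogate range.

```python
import string


def code_point_to_char(hex_digits: str) -> str:
    """Model the body of a U'...' literal: one code point given in hex."""
    # Reject anything that is not purely hex digits (int() alone would
    # also accept signs, underscores, and a "0x" prefix).
    if not hex_digits or any(c not in string.hexdigits for c in hex_digits):
        raise ValueError(f"not a hexadecimal number: {hex_digits!r}")
    value = int(hex_digits, 16)
    # Code points above U+10FFFF or in the surrogate range are invalid.
    if value > 0x10FFFF or 0xD800 <= value <= 0xDFFF:
        raise ValueError(f"invalid Unicode code point U+{value:04X}")
    return chr(value)


g_clef = code_point_to_char("1D11E")   # the musical G clef character
```

With this model, "D800" and "110000" both raise ValueError, while "1D11E" and "1d11e" yield the same character.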
When strings are concatenated with ‘+’, each segment’s prefix is considered
individually. For example, 'The G clef symbol is:' + u"1d11e" + "." inserts
a G clef symbol in the middle of an otherwise plain text string.
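To illustrate how each segment's prefix is handled on its own, here is a small Python model. It is a sketch under stated assumptions, not the pspp lexer: the names eval_segment and concat_segments are hypothetical, and the input is assumed to be already split into (prefix, body) pairs, with quote handling and range checks left out.

```python
def eval_segment(prefix: str, body: str) -> str:
    """Evaluate one string segment according to its own prefix."""
    kind = prefix.lower()
    if kind == "":
        return body                                # plain text segment
    if kind == "x":
        return bytes.fromhex(body).decode("utf-8") # hex digits as UTF-8 bytes
    if kind == "u":
        return chr(int(body, 16))                  # one code point in hex
    raise ValueError(f"unknown string prefix: {prefix!r}")


def concat_segments(segments) -> str:
    """Join segments the way '+' would, evaluating each one separately."""
    return "".join(eval_segment(prefix, body) for prefix, body in segments)


# The example from the text: 'The G clef symbol is:' + u"1d11e" + "."
text = concat_segments([("", "The G clef symbol is:"), ("u", "1d11e"), ("", ".")])
```

Only the middle segment is interpreted as a code point; the plain segments on either side pass through unchanged.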
Punctuators and Operators
These tokens are the punctuators and operators:
, / = ( ) + - * / ** < <= <> > >= ~= & | .
Most of these appear within the syntax of commands, but the period (‘.’)
punctuator is used only at the end of a command. It is a punctuator only as
the last character on a line (except white space). When it is the last non-space
character on a line, a period is not treated as part of another token, even if it
would otherwise be part of, e.g., an identifier or a floating-point number.
6.2 Forming commands of tokens
Most pspp commands share a common structure. A command begins with a command
name, such as FREQUENCIES, DATA LIST, or N OF CASES. The command name may be
abbreviated to its first word, and each word in the command name may be abbreviated to its
first three or more characters, where these abbreviations are unambiguous.
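The abbreviation rule can be approximated with the Python sketch below. It is a simplified model rather than pspp's actual matching algorithm: the function names are hypothetical, the command list is supplied by the caller, and any tie-breaking that pspp may apply between competing candidates is not modeled (an ambiguous input simply returns None).

```python
from typing import Optional, Sequence


def word_matches(abbrev: str, full: str) -> bool:
    """A word matches if typed in full, or as a prefix of 3+ characters."""
    a, f = abbrev.upper(), full.upper()
    return a == f or (len(a) >= 3 and f.startswith(a))


def match_command(typed: str, commands: Sequence[str]) -> Optional[str]:
    """Return the unique command that `typed` abbreviates, else None."""
    typed_words = typed.split()
    candidates = []
    for name in commands:
        name_words = name.split()
        # Either every word is given, or only the first word is given.
        if len(typed_words) not in (1, len(name_words)):
            continue
        if all(word_matches(t, n) for t, n in zip(typed_words, name_words)):
            candidates.append(name)
    return candidates[0] if len(candidates) == 1 else None


COMMANDS = ["FREQUENCIES", "DATA LIST", "N OF CASES"]
match = match_command("freq", COMMANDS)   # "FREQUENCIES"
```

In this model, "dat lis" and "DATA" both resolve to DATA LIST when no other command in the list shares those leading characters, while "FR" is rejected as shorter than three characters.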
The command name may be followed by one or more subcommands. Each subcommand
begins with a subcommand name, which may be abbreviated to its first three letters. Some
subcommands accept a series of one or more specifications, which follow the subcommand
name, optionally separated from it by an equals sign (‘=’). Specifications may be separated
from each other by commas or spaces. Each subcommand must be separated from the next
(if any) by a forward slash (‘/’).
There are multiple ways to mark the end of a command. The most common way is to
end the last line of the command with a period (‘.’) as described in the previous section
(see Section 6.1 [Tokens], page 28). A blank line, or one that consists only of white space
or comments, also ends a command.
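The two end-of-command rules described so far can be modeled with a short Python sketch. It is deliberately simplified and is not how pspp reads syntax: the function name split_commands is hypothetical, comment-only lines and periods inside quoted strings are not handled, and the sample commands are only illustrative.

```python
def split_commands(text: str) -> list:
    """Split syntax text into commands, ending on a final period or blank line."""
    commands, current = [], []

    def flush():
        # Join the lines collected so far into one command, if any.
        joined = " ".join(part for part in current if part)
        if joined:
            commands.append(joined)
        current.clear()

    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            flush()                        # blank line ends the command
        elif stripped.endswith("."):
            current.append(stripped[:-1].rstrip())
            flush()                        # period as last non-space character
        else:
            current.append(stripped)       # command continues on the next line
    flush()
    return commands


sample = "FREQUENCIES\n  /VARIABLES=x y\n\nN OF CASES 100.\n"
parsed = split_commands(sample)
```

Here the first command is ended by the blank line and the second by its trailing period. A period in the middle of a line, as in 1.5, does not end a command in this model, because only the last non-space character of a line is checked.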