These notational conventions are used for presenting syntax:
BNF-like syntax is used throughout, with productions having the form:
There are some families of nonterminals indexed by
precedence levels (written as a superscript). Similarly, the
nonterminals op, varop, and conop may have a double index:
a letter l, r, or n for left-, right- or nonassociativity and
a precedence level. A precedence-level variable i ranges from 0 to 9;
an associativity variable a varies over {l, r, n}.
Thus, for example
In both the lexical and the context-free syntax, there are some
ambiguities that are to be resolved by making grammatical phrases as
long as possible, proceeding from left to right (in shift-reduce
parsing, resolving shift/reduce conflicts by shifting). In the lexical
syntax, this is the "consume longest lexeme" rule. In the
context-free syntax, this means that conditionals, let-expressions, and
lambda abstractions extend to the right as far as possible.
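As a small illustration of the context-free half of this rule, the sketch below gives three definitions whose comments state the reading that "extend to the right as far as possible" selects. The names incr, cond, and letBody are hypothetical and exist only for this example.

```haskell
-- The lambda body extends as far right as possible, so this is
-- \x -> (x + 1), not (\x -> x) + 1.
incr :: Int -> Int
incr = \x -> x + 1

-- The else branch extends as far right as possible: else (2 + 3).
-- The other reading, (if c then 1 else 2) + 3, would give 4 for True.
cond :: Bool -> Int
cond c = if c then 1 else 2 + 3

-- The let body extends as far right as possible: in (y * y).
letBody :: Int
letBody = let y = 5 in y * y
```

Evaluating cond True distinguishes the two candidate readings, since only the longest-phrase reading yields 1.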
Definitions: The indentation of a lexeme is the column number
indicating the start of that lexeme; the indentation of a line is the
indentation of its leftmost lexeme. To determine the column number,
assume a fixed-width font with this tab convention: tab stops
are 8 characters apart, and a tab character causes the insertion of
enough spaces to align the current position with the next tab stop.
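The column computation can be sketched as follows. This is a minimal sketch, assuming that columns are numbered from 1 and that tab stops therefore sit at columns 1, 9, 17, and so on; the names advance and lineIndent are hypothetical, and comments (which also count as whitespace) are not handled.

```haskell
-- | Column reached after consuming one character that starts at column col.
-- Assumption: columns start at 1 and tab stops sit at columns 1, 9, 17, ...
advance :: Int -> Char -> Int
advance col '\t' = ((col - 1) `div` 8 + 1) * 8 + 1
advance col _    = col + 1

-- | Indentation of a line: the column of its first character that is not a
-- space or tab, or Nothing if the line holds only spaces and tabs.
lineIndent :: String -> Maybe Int
lineIndent = go 1
  where
    go _ [] = Nothing
    go col (c : cs)
      | c == ' ' || c == '\t' = go (advance col c) cs
      | otherwise             = Just col
```

Under these assumptions a line that begins with a single tab and a line that begins with eight spaces both have indentation 9.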
In the syntax given in the rest of the report, declaration
lists are always preceded by the keyword where, let, do,
or of, and are
enclosed within curly braces ({ }) with the individual declarations
separated by semicolons (;). For example, the syntax of a let
expression is:
let { decl1 ; decl2 ; ... ; decln [;] } in exp
Haskell permits the omission of the braces and semicolons by
using layout to convey the same information. This allows both
layout-sensitive and -insensitive styles of coding, which
can be freely mixed within one program. Because layout is
not required, Haskell programs can be straightforwardly
produced by other programs.
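To make the two styles concrete, the following sketch writes the same three declarations once with explicit braces and semicolons and once with layout. The names explicit and implicit are hypothetical.

```haskell
-- Explicit braces and semicolons: the line break and the indentation of
-- c carry no meaning inside the braces.
explicit :: Int
explicit = let { a = 1 ; b = 2 ;
  c = 3 } in a + b + c

-- The same declarations written with layout.
implicit :: Int
implicit =
  let a = 1
      b = 2
      c = 3
  in a + b + c
```

Both definitions denote the same value, which is why a program generator can always emit the brace form without tracking columns.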
The layout (or "off-side") rule takes effect
whenever the open brace is omitted after the keyword where, let,
do, or
of. When this happens, the indentation of the next lexeme (whether
or not on a new line) is remembered and the omitted open brace is
inserted (the whitespace preceding the lexeme may include comments).
For each subsequent line, if it contains only whitespace or is
indented more, then the previous item is continued (nothing is
inserted); if it is indented the same amount, then a new item begins
(a semicolon is inserted); and if it is indented less, then the
declaration list ends (a close brace is inserted). A close brace is
also inserted whenever the syntactic category containing the
declaration list ends; that is, if an illegal lexeme is encountered at
a point where a close brace would be legal, a close brace is inserted.
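The three-way indentation comparison described above can be sketched for a single declaration list. This is a deliberately simplified sketch, not the report's algorithm: it assumes the input has already been split into lines paired with their indentation, it handles no nested lists or comments, and it omits the rule that inserts a close brace on an illegal lexeme. The names Line and layoutBlock are hypothetical.

```haskell
import Data.Char (isSpace)

-- | One source line: the indentation of its first lexeme, and its text.
type Line = (Int, String)

-- | layoutBlock n first rest processes a block whose remembered
-- indentation is n and whose first item is first. It returns the block
-- with braces and semicolons made explicit, plus the lines left over
-- once the block has ended.
layoutBlock :: Int -> String -> [Line] -> (String, [Line])
layoutBlock n first = go ("{ " ++ first)
  where
    go acc [] = (acc ++ " }", [])
    go acc ls@((col, txt) : more)
      | all isSpace txt = go acc more                   -- only whitespace: continue
      | col > n         = go (acc ++ " " ++ txt) more   -- indented more: continue item
      | col == n        = go (acc ++ " ; " ++ txt) more -- same indentation: new item
      | otherwise       = (acc ++ " }", ls)             -- indented less: list ends
```

For example, layoutBlock 5 "a = 1" [(7, "+ 2"), (5, "b = 3"), (1, "main = x")] returns the text { a = 1 + 2 ; b = 3 } together with the unconsumed line at indentation 1.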
The layout rule matches only those open braces that it has
inserted; an explicit open brace must be matched by
an explicit close brace. Within these explicit open braces,
no layout processing is performed for constructs outside the
braces, even if a line is
indented to the left of an earlier implicit open brace.
Given these rules, a single newline may actually terminate several
declaration lists. Also, these rules permit:
To facilitate the use of layout at the top level of a module
(an implementation may allow several modules to reside in one file),
the keyword
module and the end-of-file token are assumed to occur in column
0 (whereas normally the first column is 1). Otherwise, all
top-level declarations would have to be indented.
Section 1.5 gives an example which uses the layout
rule.
B.1 Notational Conventions
[pattern] optional
{pattern} zero or more repetitions
(pattern) grouping
pat1 | pat2 choice
pat<pat'> difference---elements generated by pat
except those generated by pat'
fibonacci terminal syntax in typewriter font
nonterm -> alt1 | alt2 | ... | altn
aexp -> ( expi+1 qop(a,i) )
actually stands for 30 productions, with 10 substitutions for i
and 3 for a.
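The count of 30 can be checked by enumerating the substitutions. The sketch below renders every instance of the production above as a string; the type Assoc and the functions leftSection and allLeftSections are hypothetical names introduced for this illustration.

```haskell
-- | The associativity variable a, ranging over l, r, and n.
data Assoc = L | R | N
  deriving (Show, Eq, Enum, Bounded)

-- | One instance of the production for a given associativity and level.
leftSection :: Assoc -> Int -> String
leftSection a i =
  "aexp -> ( exp" ++ show (i + 1) ++ " qop(" ++ letter a ++ "," ++ show i ++ ") )"
  where
    letter L = "l"
    letter R = "r"
    letter N = "n"

-- | All instances: 3 associativities times 10 precedence levels.
allLeftSections :: [String]
allLeftSections =
  [leftSection a i | a <- [minBound .. maxBound], i <- [0 .. 9]]
```

Here length allLeftSections is 30, and the first element is the instance for a = l and i = 0.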
B.2 Lexical Syntax
program -> { lexeme | whitespace }
lexeme -> varid | conid | varsym | consym | literal | special | reservedop | reservedid
literal -> integer | float | char | string
special -> ( | ) | , | ; | [ | ] | _ | ` | { | }
whitespace -> whitestuff {whitestuff}
whitestuff -> whitechar | comment | ncomment
whitechar -> newline | vertab | formfeed | space | tab | nonbrkspc
newline -> a newline (system dependent)
space -> a space
tab -> a horizontal tab
vertab -> a vertical tab
formfeed -> a form feed
nonbrkspc -> a non-breaking space
comment -> -- {any} newline
ncomment -> {- ANYseq {ncomment ANYseq} -}
ANYseq -> {ANY}<{ANY} ( {- | -} ) {ANY}>
ANY -> any | newline | vertab | formfeed
any -> graphic | space | tab | nonbrkspc
graphic -> large | small | digit | symbol | special | : | " | '
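The comment and ncomment productions above can be illustrated with a short fragment. The name answer is hypothetical; the point of the sketch is only that an inner nested comment closes without ending the outer one.

```haskell
{- An outer nested comment.
   {- An inner comment; its closing delimiter ends only the inner one. -}
   This text is still inside the outer comment.
-}
answer :: Int
answer = 42 -- an ordinary comment runs to the end of the line
```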
small -> ASCsmall | ISOsmall
ASCsmall -> a | b | ... | z
ISOsmall -> à | á | â | ã | ä | å | æ | ç | è | é | ê | ë
| ì | í | î | ï | ð | ñ | ò | ó | ô | õ | ö | ø
| ù | ú | û | ü | ý | þ | ÿ | ß
large -> ASClarge | ISOlarge
ASClarge -> A | B | ... | Z
ISOlarge -> À | Á | Â | Ã | Ä | Å | Æ | Ç | È | É | Ê | Ë
| Ì | Í | Î | Ï | Ð | Ñ | Ò | Ó | Ô | Õ | Ö | Ø
| Ù | Ú | Û | Ü | Ý | Þ
symbol -> ASCsymbol | ISOsymbol
ASCsymbol -> ! | # | $ | % | & | * | + | . | / | < | = | > | ? | @
| \ | ^ | | | - | ~
ISOsymbol -> ¡ | ¢ | £ | ¤ | ¥ | ¦ | § | ¨ | © | ª | «
| ¬ | ­ | ® | ¯ | ° | ± | ² | ³ | ´ | µ | ¶
| · | ¸ | ¹ | º | » | ¼ | ½ | ¾ | ¿ | × | ÷
digit -> 0 | 1 | ... | 9
octit -> 0 | 1 | ... | 7
hexit -> digit | A | ... | F | a | ... | f
varid -> (small {small | large | digit | ' | _})<reservedid>
conid -> large {small | large | digit | ' | _}
reservedid -> case | class | data | default | deriving | do | else
| if | import | in | infix | infixl | infixr | instance
| let | module | newtype | of | then | type | where
specialid -> as | qualified | hiding
varsym -> ( symbol {symbol | :} )<reservedop>
consym -> (: {symbol | :})<reservedop>
reservedop -> .. | :: | = | \ | | | <- | -> | @ | ~ | =>
specialop -> - | !
varid (variables)
conid (constructors)
tyvar -> varid (type variables)
tycon -> conid (type constructors)
tycls -> conid (type classes)
modid -> conid (modules)
qvarid -> [ modid . ] varid
qconid -> [ modid . ] conid
qtycon -> [ modid . ] tycon
qtycls -> [ modid . ] tycls
qvarsym -> [ modid . ] varsym
qconsym -> [ modid . ] consym
decimal -> digit{digit}
octal -> octit{octit}
hexadecimal -> hexit{hexit}
integer -> decimal
| 0o octal | 0O octal
| 0x hexadecimal | 0X hexadecimal
float -> decimal . decimal[(e | E)[- | +]decimal]
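As an illustration of the numeric literal productions above, the sketch below writes one integer and one floating-point value in each permitted spelling. The names tens and floats are hypothetical.

```haskell
-- Five spellings of the integer ten: decimal, octal (0o, 0O),
-- and hexadecimal (0x, 0X).
tens :: [Integer]
tens = [10, 0o12, 0O12, 0xA, 0Xa]

-- A float needs digits on both sides of the point; the exponent part,
-- with its optional sign, may be omitted.
floats :: [Double]
floats = [1500.0, 1.5e3, 15.0E+2, 150000.0e-2]
```

Every element of tens equals 10 and every element of floats equals 1500.0.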
char -> ' (graphic<' | \> | space | escape<\&>) '
string -> " {graphic<" | \> | space | escape | gap} "
escape -> \ ( charesc | ascii | decimal | o octal | x hexadecimal )
charesc -> a | b | f | n | r | t | v | \ | " | ' | &
ascii -> ^cntrl | NUL | SOH | STX | ETX | EOT | ENQ | ACK
| BEL | BS | HT | LF | VT | FF | CR | SO | SI | DLE
| DC1 | DC2 | DC3 | DC4 | NAK | SYN | ETB | CAN
| EM | SUB | ESC | FS | GS | RS | US | SP | DEL
cntrl -> ASClarge | @ | [ | \ | ] | ^ | _
gap -> \ whitechar {whitechar} \
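The escape and gap productions above can be exercised with a few literals. The names below are hypothetical; the sketch shows numeric and ASCII-name escapes, the empty escape \&, and a string gap.

```haskell
-- The character 'A' (code 65) written plainly and with decimal, octal,
-- and hexadecimal escapes.
capitalAs :: [Char]
capitalAs = ['A', '\65', '\o101', '\x41']

-- Control-A written with ^cntrl, with its ASCII name, and numerically.
controlAs :: [Char]
controlAs = ['\^A', '\SOH', '\1']

-- \& stands for no character: it separates \SO from H so that the two
-- are not read as the single escape \SOH.
twoChars, oneChar :: String
twoChars = "\SO\&H"
oneChar  = "\SOH"

-- A gap (backslash, whitespace, backslash) is ignored, so a string
-- literal can be continued on the next line.
gapped :: String
gapped = "abc\
         \def"
```

The char production excludes \& because a character literal must denote exactly one character.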
f x = let a = 1; b = 2
          g y = exp2
      in exp1
making a, b and g all part of the same declaration
list.
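A runnable variant of this example is sketched below, with exp1 and exp2 replaced by hypothetical concrete expressions. The definition of g starts in the same column as a, so it begins a new item in the same declaration list, and the less-indented in closes the list.

```haskell
f :: Int -> Int
f x = let a = 1; b = 2
          g y = y * 10
      in g (a + b + x)
```

With these stand-in expressions, f 0 evaluates to 30, which shows that a, b, and g are all in scope in the body.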
module -> module modid [exports] where body
| body
body -> { [impdecls ;] [[fixdecls ;] topdecls [;]] }
| { impdecls [;] }
impdecls -> impdecl1 ; ... ; impdecln
exports -> ( export1 , ... , exportn [ , ] )
export -> qvar
| qtycon [(..) | ( qcname1 , ... , qcnamen )]
| qtycls [(..) | ( qvar1 , ... , qvarn )]
| module modid
qcname -> qvar | qcon
impdecl -> import [qualified] modid [as modid] [impspec]
impspec -> ( import1 , ... , importn [ , ] )
| hiding ( import1 , ... , importn [ , ] )
import -> var
| tycon [ (..) | ( cname1 , ... , cnamen )]
| tycls [(..) | ( var1 , ... , varn )]
cname -> var | con
fixdecls -> fix1 ; ... ; fixn
fix -> infixl [digit] ops
| infixr [digit] ops
| infix [digit] ops
ops -> op1 , ... , opn
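The effect of the fix productions above can be seen by declaring two operators that differ only in associativity. The operators |-| and |~| are hypothetical, and the sketch assumes a compiler that accepts fixity declarations at the start of the module body, as the grammar requires.

```haskell
infixl 6 |-|
infixr 6 |~|

-- Both operators compute subtraction; only the declared associativity differs.
(|-|), (|~|) :: Int -> Int -> Int
a |-| b = a - b
a |~| b = a - b

leftGrouped, rightGrouped :: Int
leftGrouped  = 10 |-| 3 |-| 2   -- parsed as (10 |-| 3) |-| 2
rightGrouped = 10 |~| 3 |~| 2   -- parsed as 10 |~| (3 |~| 2)
```

The two results differ (5 and 9), although the operator definitions are identical.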
topdecls -> topdecl1 ; ... ; topdecln
topdecl -> type simpletype = type
| data [context =>] simpletype = constrs [deriving]
| newtype [context =>] simpletype = con atype [deriving]
| class [context =>] simpleclass [where { cbody [;] }]
| instance [context =>] qtycls inst [where { valdefs [;] }]
| default (type1 , ... , typen)
| decl
decls -> decl1 ; ... ; decln
decl -> signdecl
| valdef
decllist -> { decls [;] }
signdecl -> vars :: [context =>] type
vars -> var1 , ... , varn (n>=1)
type -> btype [-> type] (function type)
btype -> [btype] atype (type application)
atype -> gtycon
| tyvar
| ( type1 , ... , typek ) (tuple type, k>=2)
| [ type ] (list type)
| ( type ) (parenthesised constructor)
gtycon -> qtycon
| () (unit type)
| [] (list constructor)
| (->) (function constructor)
| (,{,}) (tupling constructors)
context -> class
| ( class1 , ... , classn ) (n>=1)
class -> qtycls tyvar
simpletype -> tycon tyvar1 ... tyvark
constrs -> constr1 | ... | constrn (n>=1)
constr -> con [!] atype1 ... [!] atypek (arity con = k, k>=0)
| (btype | ! atype) conop (btype | ! atype) (infix conop)
| con { fielddecl1 , ... , fielddecln } (n>=1)
fielddecl -> vars :: (type | ! atype)
deriving -> deriving (dclass | (dclass1, ... , dclassn)) (n>=0)
dclass -> qtycls
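The three alternatives of constr, together with a deriving clause, might be used as in the following sketch. All type, constructor, and field names here are hypothetical.

```haskell
-- Prefix constructors; the ! marks a strict field.
data Shape = Circle !Double | Rect Double Double
  deriving (Eq, Show)

-- An infix constructor operator (a consym, so it starts with a colon).
data Frac = Int :/ Int
  deriving (Eq, Show)

-- Field labels: vars lets px and py share one field declaration,
-- and the ! makes both fields strict.
data Point = Point { px, py :: !Int }
  deriving (Eq, Show)

area :: Shape -> Double
area (Circle r) = pi * r * r
area (Rect w h) = w * h
```

Each field label also serves as a selector function, so px (Point 1 2) yields 1.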
simpleclass -> tycls tyvar
cbody -> [ cmethods [ ; cdefaults ] ]
cmethods -> signdecl1 ; ... ; signdecln (n >= 1)
cdefaults -> valdef1 ; ... ; valdefn (n >= 1)
inst -> gtycon
| ( gtycon tyvar1 ... tyvark ) (k>=0, tyvars distinct)
| ( tyvar1 , ... , tyvark ) (k>=2, tyvars distinct)
| [ tyvar ]
| ( tyvar1 -> tyvar2 ) (tyvar1 and tyvar2 distinct)
valdefs -> valdef1 ; ... ; valdefn (n>=0)
valdef -> lhs = exp [where decllist]
| lhs gdrhs [where decllist]
lhs -> pat0
| funlhs
funlhs -> var apat { apat }
| pati+1 varop(a,i) pati+1
| lpati varop(l,i) pati+1
| pati+1 varop(r,i) rpati
gdrhs -> gd = exp [gdrhs]
gd -> | exp0
exp -> exp0 :: [context =>] type (expression type signature)
| exp0
expi -> expi+1 [qop(n,i) expi+1]
| lexpi
| rexpi
lexpi -> (lexpi | expi+1) qop(l,i) expi+1
lexp6 -> - exp7
rexpi -> expi+1 qop(r,i) (rexpi | expi+1)
exp10 -> \ apat1 ... apatn -> exp (lambda abstraction, n>=1)
| let decllist in exp (let expression)
| if exp then exp else exp (conditional)
| case exp of { alts [;] } (case expression)
| do { stmts [;] } (do expression)
| fexp
fexp -> [fexp] aexp (function application)
aexp -> qvar (variable)
| gcon (general constructor)
| literal
| ( exp ) (parenthesised expression)
| ( exp1 , ... , expk ) (tuple, k>=2)
| [ exp1 , ... , expk ] (list, k>=1)
| [ exp1 [, exp2] .. [exp3] ] (arithmetic sequence)
| [ exp | qual1 , ... , qualn ] (list comprehension, n>=1)
| ( expi+1 qop(a,i) ) (left section)
| ( qop(a,i) expi+1 ) (right section)
| qcon { fbind1 , ... , fbindn } (labeled construction, n>=0)
| aexp{qcon} { fbind1 , ... , fbindn } (labeled update, n >= 1)
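Several of the aexp alternatives above are illustrated in the sketch below: sections, an arithmetic sequence inside a list comprehension, and labeled construction and update. All names are hypothetical.

```haskell
data Point = Point { px, py :: Int }
  deriving (Eq, Show)

halve :: Int -> Int
halve = (`div` 2)                  -- right section: \x -> x `div` 2

fromTen :: Int -> Int
fromTen = (10 -)                   -- left section: \x -> 10 - x

oddSquares :: [Int]
oddSquares = [x * x | x <- [1, 3 .. 9]]   -- comprehension over a sequence

origin :: Point
origin = Point { px = 0, py = 0 }  -- labeled construction

shifted :: Point
shifted = origin { px = 5 }        -- labeled update: py is copied unchanged
```

Note that (- 10) is not a section: by the lexp6 production it is the negation of 10.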
qual -> pat <- exp
| let decllist
| exp
alts -> alt1 ; ... ; altn (n>=1)
alt -> pat -> exp [where decllist]
| pat gdpat [where decllist]
gdpat -> gd -> exp [ gdpat ]
stmts -> exp [; stmts]
| pat <- exp ; stmts
| let decllist ; stmts
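The three forms of statement in the stmts production can be shown in one do expression. The sketch below uses the Maybe monad purely as a convenient example; safeDiv, withBraces, and withLayout are hypothetical names.

```haskell
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv a b = Just (a `div` b)

-- Explicit braces and semicolons, one statement of each kind:
-- a generator (pat <- exp), a let with its own braces, and a final exp.
withBraces :: Int -> Int -> Maybe Int
withBraces a b = do { q <- safeDiv a b ; let { r = q + 1 } ; return r }

-- The same statements written with layout.
withLayout :: Int -> Int -> Maybe Int
withLayout a b = do
  q <- safeDiv a b
  let r = q + 1
  return r
```

As the grammar requires, the statement list ends with an expression, not with a generator or a let.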
fbinds -> { fbind1 , ... , fbindn } (n>=0)
fbind -> var | var = exp
pat -> var + integer (successor pattern)
| pat0
pati -> pati+1 [qconop(n,i) pati+1]
| lpati
| rpati
lpati -> (lpati | pati+1) qconop(l,i) pati+1
lpat6 -> - (integer | float) (negative literal)
rpati -> pati+1 qconop(r,i) (rpati | pati+1)
pat10 -> apat
| gcon apat1 ... apatk (arity gcon = k, k>=1)
apat -> var [ @ apat] (as pattern)
| gcon (arity gcon = 0)
| qcon { fpat1 , ... , fpatk } (labeled pattern, k>=0)
| literal
| _ (wildcard)
| ( pat ) (parenthesised pattern)
| ( pat1 , ... , patk ) (tuple pattern, k>=2)
| [ pat1 , ... , patk ] (list pattern, k>=1)
| ~ apat (irrefutable pattern)
fpat -> var = pat
| var
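A few of the apat alternatives above are combined in the following sketch. The function names are hypothetical; the successor pattern is left out because not every current compiler accepts it without an extension.

```haskell
-- As pattern: whole names the entire list while x names its head.
dupHead :: [a] -> [a]
dupHead whole@(x : _) = x : whole
dupHead []            = []

-- Irrefutable pattern: the tuple is not inspected unless one of its
-- components is demanded, so the match succeeds even on undefined.
lazyZero :: (Int, Int) -> Int
lazyZero ~(_a, _b) = 0

-- Tuple, list, wildcard, and literal patterns together.
classify :: (Int, [Int]) -> String
classify (0, _)      = "zero tag"
classify (_, [])     = "empty list"
classify (_, [_, _]) = "exactly two"
classify _           = "other"
```

Because of the irrefutable pattern, lazyZero undefined evaluates to 0 instead of failing.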
gcon -> ()
| []
| (,{,})
| qcon
var -> varid | ( varsym ) (variable)
qvar -> qvarid | ( qvarsym ) (qualified variable)
con -> conid | ( consym ) (constructor)
qcon -> qconid | ( qconsym ) (qualified constructor)
varop -> varsym | ` varid` (variable operator)
qvarop -> qvarsym | ` qvarid` (qualified variable operator)
conop -> consym | ` conid` (constructor operator)
qconop -> qconsym | ` qconid` (qualified constructor operator)
op -> varop | conop (operator)
qop -> qvarop | qconop (qualified operator)
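The productions above let any identifier be used as an operator and any operator symbol be used as an identifier. The sketch below shows one instance of each conversion; the type Pair and the value names are hypothetical.

```haskell
data Pair = Pair Int Int
  deriving (Eq, Show)

prefixPlus :: Int
prefixPlus = (+) 1 2            -- a varsym in parentheses acts as a var

infixDiv :: Int
infixDiv = 7 `div` 2            -- a varid in backquotes acts as a varop

prefixCons :: [Int]
prefixCons = (:) 1 [2, 3]       -- a consym in parentheses acts as a con

infixPair :: Pair
infixPair = 3 `Pair` 4          -- a conid in backquotes acts as a conop
```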