EBNF for ice9 All input is enclosed in quotes. This is meant to be taken literally. Anything outside of quotes has a meta-meaning. For example, { exp } means zero or more exp's. Whereas, '{' exp '}' means LBRACE followed by exp followed by RBRACE. The start symbol is program. ########### Begin EBNF ############ program -> {var|type|forward|proc} {stm} # 3 complex tokens (these are usually handled in the lexer) id -> [A-Za-z][A-Za-z0-9_]* int -> [0-9]+ # decimal integer literal string -> "[^"\n]" # double-quoted string: any char but " and \n -> '[^'\n]' # single-quoted string: any char but ' and \n stm -> if | do | fa | 'break' ';' | 'exit' ';' -> 'return' ';' | 'return' exp ';' -> lvalue ':=' exp ';' -> 'write' exp ';' | 'writes' exp ';' -> exp ';' # any exp is valid -> ';' # the "empty" statement if -> 'if' exp '->' {stm} { '[]' exp '->' {stm} } 'fi' -> 'if' exp '->' {stm} { '[]' exp '->' {stm} } '[]' 'else' '->' {stm} 'fi' do -> 'do' exp '->' {stm} 'od' fa -> 'fa' id ':=' exp 'to' exp '->' {stm} 'af' proc -> 'proc' id '(' declist ')' {type|var} {stm} 'end' -> 'proc' id '(' declist ')' ':' typeid {type|var} {stm} 'end' forward -> 'forward' id '(' declist ')' -> 'forward' id '(' declist ')' ':' typeid var -> 'var' varlist varlist -> idlist ':' typeid { '[' int ']' } {, varlist} type -> 'type' id '=' typeid { '[' int ']' } idlist -> id {, id} declist -> idlist ':' typeid { ',' declist } -> # empty typeid -> id # This for semantics procid -> id # This for semantics lvalue -> id -> lvalue '[' exp ']' exp -> lvalue -> int # integer literal -> bool # boolean literal -> string -> 'read' -> '-' exp -> '?' exp # bool conversion -> procid '(' ')' # procedure call -> procid '(' exp { ',' exp } ')' # procedure call -> exp '+' exp -> exp '-' exp -> exp '*' exp -> exp '/' exp -> exp '=' exp -> exp '!=' exp -> exp '>' exp -> exp '<' exp -> exp '>=' exp -> exp '<=' exp -> '(' exp ')' ########### End EBNF ############ ############ # Comments # ############ Extends from the first sharp (#) on a line to the end of the line (determined by the newline character). ############# # Semantics # ############# 1. The expressions in 'if' and 'do' statements must be bool values. For example: int x; if x -> ... fi is not semantically correct. Instead use if x != 0 -> ... fi 2. The 'fa' statement fa loop := lo to hi -> body af executes the body once for every value of loop between lo and hi, inclusive. The expressions in 'fa' must be int. The expressions are executed exactly once and the body is executed from hi - lo + 1 times. The loop variable is an int and it is not assignable in the body of the loop. 3. The rule typeid -> id (and similarly the rule for procid) is only denoting semantics. The terminals id and typeid are syntactically indistinguishable. However, semantically typeid represents an id referring to a type (found in a different symbol table from variable ids). 4. The read expression returns an integer. Thus var a: int a := read; is correct. 5. The write statement outputs an expression AND a newline. The writes statement omits the newline. 6. The write and writes statements are over-loaded. If type of exp is an integer, then each evaluates the integer expression and outputs a decimal representation of it. For example: write 1+5; outputs the decimal '6'. On the other hand, if exp is a string, then the string is output. 7. The exit statement terminates the program. 8. Statements in the outermost scope are executed in order at startup. (Think of the outermost scope as being the C main procedure.) ################# # Symbol Tables # ################# There are three symbol tables in ice9. One each for types, variables, and procedures. Initially, the types symbol table consists of two entries: 'int' and 'bool' which resolve to the two basic types. The boolean variables 'true' and 'false' are initially in the symbol for variables. The proc table is initially empty. ################# # Scoping rules # ################# Variables, parameters, types, and procedures are visible beginning with their declaration. If defined in a procedure, the visibility stops at the end of the defining procedure. Otherwise visibility ends at the end of the file. Visibility starts with the declaration itself, allowing recursive definitions. A declaration in a nested scope overrides a previous declaration. However, two declarations in the same scope is a name clash. FORWARD: A forward statement makes a name visible before its declaration. This is useful for making mutually recursive declarations. Forward declarations are only defined for procedures. ############# # Operators # ############# - : T -> T, T={int,bool} // unary minus or boolean not ? : bool -> int // conversion from boolean to integer - : int x int -> int // integer subtraction + : T x T -> T, T={int,bool} // integer addition or boolean or * : T x T -> T, T={int,bool} // integer multiplication or boolean and / : int x int -> int // integer division = : T x T -> bool, T={int,bool} // comparison, equal != : T x T -> bool, T={int,bool} // comparison, not equal > : int x int -> bool // comparison, greater than < : int x int -> bool // comparison, less than >= : int x int -> bool // comparison, greater than or equal to <= : int x int -> bool // comparison, less than or equal to := : T x T -> nil, T={int,bool,str} // assignment ################################ # Precedence and associativity # ################################ Precedence | Associativity (highest to lowest) | ========================|=============== - (Unary minus) ? | right ------------------------|--------------- * / | left ------------------------|--------------- + - | left ------------------------|--------------- = != > < >= <= | none ---------------------------------------- The ? operator does not associate semantically because of a type conflict. The assignment operator does not commute. It binds less tightly than all the above operators. ############## # Conversion # ############## The ? operator converts a bool value to an int. It converts true to 1 and false to 0. i := ? true; # i gets 1 i := ? (1 != 1); # i gets 0 There is no inverse operator to ?. However, an int is easily converted a bool: b := x != 0; ########## # Arrays # ########## Arrays are indexed from 0 to size-1. Bounds are checked. An index that is < 0 or > size-1 generates a runtime error. ######### # Notes # ######### 1. Type checking on user-defined types is structure based. That is two arrays with the same structure are equivalent even if different names are used. 2. Basic types are passed by value. Array parameters are passed by reference. 3. In forward declarations, the name of the parameters are not important. Only the names of the parameters in the actual proc declaration are meaningful.