diff options
Diffstat (limited to 'tools/bison++/bison.info-4')
-rw-r--r-- | tools/bison++/bison.info-4 | 1304 |
1 files changed, 1304 insertions, 0 deletions
diff --git a/tools/bison++/bison.info-4 b/tools/bison++/bison.info-4 new file mode 100644 index 000000000..5c2be51f1 --- /dev/null +++ b/tools/bison++/bison.info-4 @@ -0,0 +1,1304 @@ +This is bison.info, produced by makeinfo version 4.1 from bison.texinfo. + +START-INFO-DIR-ENTRY +* bison: (bison). GNU Project parser generator (yacc replacement). +END-INFO-DIR-ENTRY + + This file documents the Bison parser generator. + + Copyright (C) 1988, 89, 90, 91, 92, 93, 95, 98, 1999 Free Software +Foundation, Inc. + + Permission is granted to make and distribute verbatim copies of this +manual provided the copyright notice and this permission notice are +preserved on all copies. + + Permission is granted to copy and distribute modified versions of +this manual under the conditions for verbatim copying, provided also +that the sections entitled "GNU General Public License" and "Conditions +for Using Bison" are included exactly as in the original, and provided +that the entire resulting derived work is distributed under the terms +of a permission notice identical to this one. + + Permission is granted to copy and distribute translations of this +manual into another language, under the above conditions for modified +versions, except that the sections entitled "GNU General Public +License", "Conditions for Using Bison" and this permission notice may be +included in translations approved by the Free Software Foundation +instead of in the original English. + + +File: bison.info, Node: How Precedence, Prev: Precedence Examples, Up: Precedence + +How Precedence Works +-------------------- + + The first effect of the precedence declarations is to assign +precedence levels to the terminal symbols declared. The second effect +is to assign precedence levels to certain rules: each rule gets its +precedence from the last terminal symbol mentioned in the components. +(You can also specify explicitly the precedence of a rule. *Note +Context-Dependent Precedence: Contextual Precedence.) + + Finally, the resolution of conflicts works by comparing the +precedence of the rule being considered with that of the look-ahead +token. If the token's precedence is higher, the choice is to shift. +If the rule's precedence is higher, the choice is to reduce. If they +have equal precedence, the choice is made based on the associativity of +that precedence level. The verbose output file made by `-v' (*note +Invoking Bison: Invocation.) says how each conflict was resolved. + + Not all rules and not all tokens have precedence. If either the +rule or the look-ahead token has no precedence, then the default is to +shift. + + +File: bison.info, Node: Contextual Precedence, Next: Parser States, Prev: Precedence, Up: Algorithm + +Context-Dependent Precedence +============================ + + Often the precedence of an operator depends on the context. This +sounds outlandish at first, but it is really very common. For example, +a minus sign typically has a very high precedence as a unary operator, +and a somewhat lower precedence (lower than multiplication) as a binary +operator. + + The Bison precedence declarations, `%left', `%right' and +`%nonassoc', can only be used once for a given token; so a token has +only one precedence declared in this way. For context-dependent +precedence, you need to use an additional mechanism: the `%prec' +modifier for rules. + + The `%prec' modifier declares the precedence of a particular rule by +specifying a terminal symbol whose precedence should be used for that +rule. It's not necessary for that symbol to appear otherwise in the +rule. The modifier's syntax is: + + %prec TERMINAL-SYMBOL + +and it is written after the components of the rule. Its effect is to +assign the rule the precedence of TERMINAL-SYMBOL, overriding the +precedence that would be deduced for it in the ordinary way. The +altered rule precedence then affects how conflicts involving that rule +are resolved (*note Operator Precedence: Precedence.). + + Here is how `%prec' solves the problem of unary minus. First, +declare a precedence for a fictitious terminal symbol named `UMINUS'. +There are no tokens of this type, but the symbol serves to stand for its +precedence: + + ... + %left '+' '-' + %left '*' + %left UMINUS + + Now the precedence of `UMINUS' can be used in specific rules: + + exp: ... + | exp '-' exp + ... + | '-' exp %prec UMINUS + + +File: bison.info, Node: Parser States, Next: Reduce/Reduce, Prev: Contextual Precedence, Up: Algorithm + +Parser States +============= + + The function `yyparse' is implemented using a finite-state machine. +The values pushed on the parser stack are not simply token type codes; +they represent the entire sequence of terminal and nonterminal symbols +at or near the top of the stack. The current state collects all the +information about previous input which is relevant to deciding what to +do next. + + Each time a look-ahead token is read, the current parser state +together with the type of look-ahead token are looked up in a table. +This table entry can say, "Shift the look-ahead token." In this case, +it also specifies the new parser state, which is pushed onto the top of +the parser stack. Or it can say, "Reduce using rule number N." This +means that a certain number of tokens or groupings are taken off the +top of the stack, and replaced by one grouping. In other words, that +number of states are popped from the stack, and one new state is pushed. + + There is one other alternative: the table can say that the +look-ahead token is erroneous in the current state. This causes error +processing to begin (*note Error Recovery::). + + +File: bison.info, Node: Reduce/Reduce, Next: Mystery Conflicts, Prev: Parser States, Up: Algorithm + +Reduce/Reduce Conflicts +======================= + + A reduce/reduce conflict occurs if there are two or more rules that +apply to the same sequence of input. This usually indicates a serious +error in the grammar. + + For example, here is an erroneous attempt to define a sequence of +zero or more `word' groupings. + + sequence: /* empty */ + { printf ("empty sequence\n"); } + | maybeword + | sequence word + { printf ("added word %s\n", $2); } + ; + + maybeword: /* empty */ + { printf ("empty maybeword\n"); } + | word + { printf ("single word %s\n", $1); } + ; + +The error is an ambiguity: there is more than one way to parse a single +`word' into a `sequence'. It could be reduced to a `maybeword' and +then into a `sequence' via the second rule. Alternatively, +nothing-at-all could be reduced into a `sequence' via the first rule, +and this could be combined with the `word' using the third rule for +`sequence'. + + There is also more than one way to reduce nothing-at-all into a +`sequence'. This can be done directly via the first rule, or +indirectly via `maybeword' and then the second rule. + + You might think that this is a distinction without a difference, +because it does not change whether any particular input is valid or +not. But it does affect which actions are run. One parsing order runs +the second rule's action; the other runs the first rule's action and +the third rule's action. In this example, the output of the program +changes. + + Bison resolves a reduce/reduce conflict by choosing to use the rule +that appears first in the grammar, but it is very risky to rely on +this. Every reduce/reduce conflict must be studied and usually +eliminated. Here is the proper way to define `sequence': + + sequence: /* empty */ + { printf ("empty sequence\n"); } + | sequence word + { printf ("added word %s\n", $2); } + ; + + Here is another common error that yields a reduce/reduce conflict: + + sequence: /* empty */ + | sequence words + | sequence redirects + ; + + words: /* empty */ + | words word + ; + + redirects:/* empty */ + | redirects redirect + ; + +The intention here is to define a sequence which can contain either +`word' or `redirect' groupings. The individual definitions of +`sequence', `words' and `redirects' are error-free, but the three +together make a subtle ambiguity: even an empty input can be parsed in +infinitely many ways! + + Consider: nothing-at-all could be a `words'. Or it could be two +`words' in a row, or three, or any number. It could equally well be a +`redirects', or two, or any number. Or it could be a `words' followed +by three `redirects' and another `words'. And so on. + + Here are two ways to correct these rules. First, to make it a +single level of sequence: + + sequence: /* empty */ + | sequence word + | sequence redirect + ; + + Second, to prevent either a `words' or a `redirects' from being +empty: + + sequence: /* empty */ + | sequence words + | sequence redirects + ; + + words: word + | words word + ; + + redirects:redirect + | redirects redirect + ; + + +File: bison.info, Node: Mystery Conflicts, Next: Stack Overflow, Prev: Reduce/Reduce, Up: Algorithm + +Mysterious Reduce/Reduce Conflicts +================================== + + Sometimes reduce/reduce conflicts can occur that don't look +warranted. Here is an example: + + %token ID + + %% + def: param_spec return_spec ',' + ; + param_spec: + type + | name_list ':' type + ; + return_spec: + type + | name ':' type + ; + type: ID + ; + name: ID + ; + name_list: + name + | name ',' name_list + ; + + It would seem that this grammar can be parsed with only a single +token of look-ahead: when a `param_spec' is being read, an `ID' is a +`name' if a comma or colon follows, or a `type' if another `ID' +follows. In other words, this grammar is LR(1). + + However, Bison, like most parser generators, cannot actually handle +all LR(1) grammars. In this grammar, two contexts, that after an `ID' +at the beginning of a `param_spec' and likewise at the beginning of a +`return_spec', are similar enough that Bison assumes they are the same. +They appear similar because the same set of rules would be active--the +rule for reducing to a `name' and that for reducing to a `type'. Bison +is unable to determine at that stage of processing that the rules would +require different look-ahead tokens in the two contexts, so it makes a +single parser state for them both. Combining the two contexts causes a +conflict later. In parser terminology, this occurrence means that the +grammar is not LALR(1). + + In general, it is better to fix deficiencies than to document them. +But this particular deficiency is intrinsically hard to fix; parser +generators that can handle LR(1) grammars are hard to write and tend to +produce parsers that are very large. In practice, Bison is more useful +as it is now. + + When the problem arises, you can often fix it by identifying the two +parser states that are being confused, and adding something to make them +look distinct. In the above example, adding one rule to `return_spec' +as follows makes the problem go away: + + %token BOGUS + ... + %% + ... + return_spec: + type + | name ':' type + /* This rule is never used. */ + | ID BOGUS + ; + + This corrects the problem because it introduces the possibility of an +additional active rule in the context after the `ID' at the beginning of +`return_spec'. This rule is not active in the corresponding context in +a `param_spec', so the two contexts receive distinct parser states. As +long as the token `BOGUS' is never generated by `yylex', the added rule +cannot alter the way actual input is parsed. + + In this particular example, there is another way to solve the +problem: rewrite the rule for `return_spec' to use `ID' directly +instead of via `name'. This also causes the two confusing contexts to +have different sets of active rules, because the one for `return_spec' +activates the altered rule for `return_spec' rather than the one for +`name'. + + param_spec: + type + | name_list ':' type + ; + return_spec: + type + | ID ':' type + ; + + +File: bison.info, Node: Stack Overflow, Prev: Mystery Conflicts, Up: Algorithm + +Stack Overflow, and How to Avoid It +=================================== + + The Bison parser stack can overflow if too many tokens are shifted +and not reduced. When this happens, the parser function `yyparse' +returns a nonzero value, pausing only to call `yyerror' to report the +overflow. + + By defining the macro `YYMAXDEPTH', you can control how deep the +parser stack can become before a stack overflow occurs. Define the +macro with a value that is an integer. This value is the maximum number +of tokens that can be shifted (and not reduced) before overflow. It +must be a constant expression whose value is known at compile time. + + The stack space allowed is not necessarily allocated. If you +specify a large value for `YYMAXDEPTH', the parser actually allocates a +small stack at first, and then makes it bigger by stages as needed. +This increasing allocation happens automatically and silently. +Therefore, you do not need to make `YYMAXDEPTH' painfully small merely +to save space for ordinary inputs that do not need much stack. + + The default value of `YYMAXDEPTH', if you do not define it, is 10000. + + You can control how much stack is allocated initially by defining the +macro `YYINITDEPTH'. This value too must be a compile-time constant +integer. The default is 200. + + +File: bison.info, Node: Error Recovery, Next: Context Dependency, Prev: Algorithm, Up: Top + +Error Recovery +************** + + It is not usually acceptable to have a program terminate on a parse +error. For example, a compiler should recover sufficiently to parse the +rest of the input file and check it for errors; a calculator should +accept another expression. + + In a simple interactive command parser where each input is one line, +it may be sufficient to allow `yyparse' to return 1 on error and have +the caller ignore the rest of the input line when that happens (and +then call `yyparse' again). But this is inadequate for a compiler, +because it forgets all the syntactic context leading up to the error. +A syntax error deep within a function in the compiler input should not +cause the compiler to treat the following line like the beginning of a +source file. + + You can define how to recover from a syntax error by writing rules to +recognize the special token `error'. This is a terminal symbol that is +always defined (you need not declare it) and reserved for error +handling. The Bison parser generates an `error' token whenever a +syntax error happens; if you have provided a rule to recognize this +token in the current context, the parse can continue. + + For example: + + stmnts: /* empty string */ + | stmnts '\n' + | stmnts exp '\n' + | stmnts error '\n' + + The fourth rule in this example says that an error followed by a +newline makes a valid addition to any `stmnts'. + + What happens if a syntax error occurs in the middle of an `exp'? The +error recovery rule, interpreted strictly, applies to the precise +sequence of a `stmnts', an `error' and a newline. If an error occurs in +the middle of an `exp', there will probably be some additional tokens +and subexpressions on the stack after the last `stmnts', and there will +be tokens to read before the next newline. So the rule is not +applicable in the ordinary way. + + But Bison can force the situation to fit the rule, by discarding +part of the semantic context and part of the input. First it discards +states and objects from the stack until it gets back to a state in +which the `error' token is acceptable. (This means that the +subexpressions already parsed are discarded, back to the last complete +`stmnts'.) At this point the `error' token can be shifted. Then, if +the old look-ahead token is not acceptable to be shifted next, the +parser reads tokens and discards them until it finds a token which is +acceptable. In this example, Bison reads and discards input until the +next newline so that the fourth rule can apply. + + The choice of error rules in the grammar is a choice of strategies +for error recovery. A simple and useful strategy is simply to skip the +rest of the current input line or current statement if an error is +detected: + + stmnt: error ';' /* on error, skip until ';' is read */ + + It is also useful to recover to the matching close-delimiter of an +opening-delimiter that has already been parsed. Otherwise the +close-delimiter will probably appear to be unmatched, and generate +another, spurious error message: + + primary: '(' expr ')' + | '(' error ')' + ... + ; + + Error recovery strategies are necessarily guesses. When they guess +wrong, one syntax error often leads to another. In the above example, +the error recovery rule guesses that an error is due to bad input +within one `stmnt'. Suppose that instead a spurious semicolon is +inserted in the middle of a valid `stmnt'. After the error recovery +rule recovers from the first error, another syntax error will be found +straightaway, since the text following the spurious semicolon is also +an invalid `stmnt'. + + To prevent an outpouring of error messages, the parser will output +no error message for another syntax error that happens shortly after +the first; only after three consecutive input tokens have been +successfully shifted will error messages resume. + + Note that rules which accept the `error' token may have actions, just +as any other rules can. + + You can make error messages resume immediately by using the macro +`yyerrok' in an action. If you do this in the error rule's action, no +error messages will be suppressed. This macro requires no arguments; +`yyerrok;' is a valid C statement. + + The previous look-ahead token is reanalyzed immediately after an +error. If this is unacceptable, then the macro `yyclearin' may be used +to clear this token. Write the statement `yyclearin;' in the error +rule's action. + + For example, suppose that on a parse error, an error handling +routine is called that advances the input stream to some point where +parsing should once again commence. The next symbol returned by the +lexical scanner is probably correct. The previous look-ahead token +ought to be discarded with `yyclearin;'. + + The macro `YYRECOVERING' stands for an expression that has the value +1 when the parser is recovering from a syntax error, and 0 the rest of +the time. A value of 1 indicates that error messages are currently +suppressed for new syntax errors. + + +File: bison.info, Node: Context Dependency, Next: Debugging, Prev: Error Recovery, Up: Top + +Handling Context Dependencies +***************************** + + The Bison paradigm is to parse tokens first, then group them into +larger syntactic units. In many languages, the meaning of a token is +affected by its context. Although this violates the Bison paradigm, +certain techniques (known as "kludges") may enable you to write Bison +parsers for such languages. + +* Menu: + +* Semantic Tokens:: Token parsing can depend on the semantic context. +* Lexical Tie-ins:: Token parsing can depend on the syntactic context. +* Tie-in Recovery:: Lexical tie-ins have implications for how + error recovery rules must be written. + + (Actually, "kludge" means any technique that gets its job done but is +neither clean nor robust.) + + +File: bison.info, Node: Semantic Tokens, Next: Lexical Tie-ins, Up: Context Dependency + +Semantic Info in Token Types +============================ + + The C language has a context dependency: the way an identifier is +used depends on what its current meaning is. For example, consider +this: + + foo (x); + + This looks like a function call statement, but if `foo' is a typedef +name, then this is actually a declaration of `x'. How can a Bison +parser for C decide how to parse this input? + + The method used in GNU C is to have two different token types, +`IDENTIFIER' and `TYPENAME'. When `yylex' finds an identifier, it +looks up the current declaration of the identifier in order to decide +which token type to return: `TYPENAME' if the identifier is declared as +a typedef, `IDENTIFIER' otherwise. + + The grammar rules can then express the context dependency by the +choice of token type to recognize. `IDENTIFIER' is accepted as an +expression, but `TYPENAME' is not. `TYPENAME' can start a declaration, +but `IDENTIFIER' cannot. In contexts where the meaning of the +identifier is _not_ significant, such as in declarations that can +shadow a typedef name, either `TYPENAME' or `IDENTIFIER' is +accepted--there is one rule for each of the two token types. + + This technique is simple to use if the decision of which kinds of +identifiers to allow is made at a place close to where the identifier is +parsed. But in C this is not always so: C allows a declaration to +redeclare a typedef name provided an explicit type has been specified +earlier: + + typedef int foo, bar, lose; + static foo (bar); /* redeclare `bar' as static variable */ + static int foo (lose); /* redeclare `foo' as function */ + + Unfortunately, the name being declared is separated from the +declaration construct itself by a complicated syntactic structure--the +"declarator". + + As a result, the part of Bison parser for C needs to be duplicated, +with all the nonterminal names changed: once for parsing a declaration +in which a typedef name can be redefined, and once for parsing a +declaration in which that can't be done. Here is a part of the +duplication, with actions omitted for brevity: + + initdcl: + declarator maybeasm '=' + init + | declarator maybeasm + ; + + notype_initdcl: + notype_declarator maybeasm '=' + init + | notype_declarator maybeasm + ; + +Here `initdcl' can redeclare a typedef name, but `notype_initdcl' +cannot. The distinction between `declarator' and `notype_declarator' +is the same sort of thing. + + There is some similarity between this technique and a lexical tie-in +(described next), in that information which alters the lexical analysis +is changed during parsing by other parts of the program. The +difference is here the information is global, and is used for other +purposes in the program. A true lexical tie-in has a special-purpose +flag controlled by the syntactic context. + + +File: bison.info, Node: Lexical Tie-ins, Next: Tie-in Recovery, Prev: Semantic Tokens, Up: Context Dependency + +Lexical Tie-ins +=============== + + One way to handle context-dependency is the "lexical tie-in": a flag +which is set by Bison actions, whose purpose is to alter the way tokens +are parsed. + + For example, suppose we have a language vaguely like C, but with a +special construct `hex (HEX-EXPR)'. After the keyword `hex' comes an +expression in parentheses in which all integers are hexadecimal. In +particular, the token `a1b' must be treated as an integer rather than +as an identifier if it appears in that context. Here is how you can do +it: + + %{ + int hexflag; + %} + %% + ... + expr: IDENTIFIER + | constant + | HEX '(' + { hexflag = 1; } + expr ')' + { hexflag = 0; + $$ = $4; } + | expr '+' expr + { $$ = make_sum ($1, $3); } + ... + ; + + constant: + INTEGER + | STRING + ; + +Here we assume that `yylex' looks at the value of `hexflag'; when it is +nonzero, all integers are parsed in hexadecimal, and tokens starting +with letters are parsed as integers if possible. + + The declaration of `hexflag' shown in the C declarations section of +the parser file is needed to make it accessible to the actions (*note +The C Declarations Section: C Declarations.). You must also write the +code in `yylex' to obey the flag. + + +File: bison.info, Node: Tie-in Recovery, Prev: Lexical Tie-ins, Up: Context Dependency + +Lexical Tie-ins and Error Recovery +================================== + + Lexical tie-ins make strict demands on any error recovery rules you +have. *Note Error Recovery::. + + The reason for this is that the purpose of an error recovery rule is +to abort the parsing of one construct and resume in some larger +construct. For example, in C-like languages, a typical error recovery +rule is to skip tokens until the next semicolon, and then start a new +statement, like this: + + stmt: expr ';' + | IF '(' expr ')' stmt { ... } + ... + error ';' + { hexflag = 0; } + ; + + If there is a syntax error in the middle of a `hex (EXPR)' +construct, this error rule will apply, and then the action for the +completed `hex (EXPR)' will never run. So `hexflag' would remain set +for the entire rest of the input, or until the next `hex' keyword, +causing identifiers to be misinterpreted as integers. + + To avoid this problem the error recovery rule itself clears +`hexflag'. + + There may also be an error recovery rule that works within +expressions. For example, there could be a rule which applies within +parentheses and skips to the close-parenthesis: + + expr: ... + | '(' expr ')' + { $$ = $2; } + | '(' error ')' + ... + + If this rule acts within the `hex' construct, it is not going to +abort that construct (since it applies to an inner level of parentheses +within the construct). Therefore, it should not clear the flag: the +rest of the `hex' construct should be parsed with the flag still in +effect. + + What if there is an error recovery rule which might abort out of the +`hex' construct or might not, depending on circumstances? There is no +way you can write the action to determine whether a `hex' construct is +being aborted or not. So if you are using a lexical tie-in, you had +better make sure your error recovery rules are not of this kind. Each +rule must be such that you can be sure that it always will, or always +won't, have to clear the flag. + + +File: bison.info, Node: Debugging, Next: Invocation, Prev: Context Dependency, Up: Top + +Debugging Your Parser +********************* + + If a Bison grammar compiles properly but doesn't do what you want +when it runs, the `yydebug' parser-trace feature can help you figure +out why. + + To enable compilation of trace facilities, you must define the macro +`YYDEBUG' when you compile the parser. You could use `-DYYDEBUG=1' as +a compiler option or you could put `#define YYDEBUG 1' in the C +declarations section of the grammar file (*note The C Declarations +Section: C Declarations.). Alternatively, use the `-t' option when you +run Bison (*note Invoking Bison: Invocation.). We always define +`YYDEBUG' so that debugging is always possible. + + The trace facility uses `stderr', so you must add +`#include <stdio.h>' to the C declarations section unless it is already +there. + + Once you have compiled the program with trace facilities, the way to +request a trace is to store a nonzero value in the variable `yydebug'. +You can do this by making the C code do it (in `main', perhaps), or you +can alter the value with a C debugger. + + Each step taken by the parser when `yydebug' is nonzero produces a +line or two of trace information, written on `stderr'. The trace +messages tell you these things: + + * Each time the parser calls `yylex', what kind of token was read. + + * Each time a token is shifted, the depth and complete contents of + the state stack (*note Parser States::). + + * Each time a rule is reduced, which rule it is, and the complete + contents of the state stack afterward. + + To make sense of this information, it helps to refer to the listing +file produced by the Bison `-v' option (*note Invoking Bison: +Invocation.). This file shows the meaning of each state in terms of +positions in various rules, and also what each state will do with each +possible input token. As you read the successive trace messages, you +can see that the parser is functioning according to its specification +in the listing file. Eventually you will arrive at the place where +something undesirable happens, and you will see which parts of the +grammar are to blame. + + The parser file is a C program and you can use C debuggers on it, +but it's not easy to interpret what it is doing. The parser function +is a finite-state machine interpreter, and aside from the actions it +executes the same code over and over. Only the values of variables +show where in the grammar it is working. + + The debugging information normally gives the token type of each token +read, but not its semantic value. You can optionally define a macro +named `YYPRINT' to provide a way to print the value. If you define +`YYPRINT', it should take three arguments. The parser will pass a +standard I/O stream, the numeric code for the token type, and the token +value (from `yylval'). + + Here is an example of `YYPRINT' suitable for the multi-function +calculator (*note Declarations for `mfcalc': Mfcalc Decl.): + + #define YYPRINT(file, type, value) yyprint (file, type, value) + + static void + yyprint (file, type, value) + FILE *file; + int type; + YYSTYPE value; + { + if (type == VAR) + fprintf (file, " %s", value.tptr->name); + else if (type == NUM) + fprintf (file, " %d", value.val); + } + + +File: bison.info, Node: Invocation, Next: Table of Symbols, Prev: Debugging, Up: Top + +Invoking Bison +************** + + The usual way to invoke Bison is as follows: + + bison INFILE + + Here INFILE is the grammar file name, which usually ends in `.y'. +The parser file's name is made by replacing the `.y' with `.tab.c'. +Thus, the `bison foo.y' filename yields `foo.tab.c', and the `bison +hack/foo.y' filename yields `hack/foo.tab.c'. + +* Menu: + +* Bison Options:: All the options described in detail, + in alphabetical order by short options. +* Option Cross Key:: Alphabetical list of long options. +* VMS Invocation:: Bison command syntax on VMS. + + +File: bison.info, Node: Bison Options, Next: Option Cross Key, Up: Invocation + +Bison Options +============= + + Bison supports both traditional single-letter options and mnemonic +long option names. Long option names are indicated with `--' instead of +`-'. Abbreviations for option names are allowed as long as they are +unique. When a long option takes an argument, like `--file-prefix', +connect the option name and the argument with `='. + + Here is a list of options that can be used with Bison, alphabetized +by short option. It is followed by a cross key alphabetized by long +option. + +`-b FILE-PREFIX' +`--file-prefix=PREFIX' + Specify a prefix to use for all Bison output file names. The + names are chosen as if the input file were named `PREFIX.c'. + +`-d' +`--defines' + Write an extra output file containing macro definitions for the + token type names defined in the grammar and the semantic value type + `YYSTYPE', as well as a few `extern' variable declarations. + + If the parser output file is named `NAME.c' then this file is + named `NAME.h'. + + This output file is essential if you wish to put the definition of + `yylex' in a separate source file, because `yylex' needs to be + able to refer to token type codes and the variable `yylval'. + *Note Semantic Values of Tokens: Token Values. + +`-l' +`--no-lines' + Don't put any `#line' preprocessor commands in the parser file. + Ordinarily Bison puts them in the parser file so that the C + compiler and debuggers will associate errors with your source + file, the grammar file. This option causes them to associate + errors with the parser file, treating it as an independent source + file in its own right. + +`-n' +`--no-parser' + Do not include any C code in the parser file; generate tables + only. The parser file contains just `#define' directives and + static variable declarations. + + This option also tells Bison to write the C code for the grammar + actions into a file named `FILENAME.act', in the form of a + brace-surrounded body fit for a `switch' statement. + +`-o OUTFILE' +`--output-file=OUTFILE' + Specify the name OUTFILE for the parser file. + + The other output files' names are constructed from OUTFILE as + described under the `-v' and `-d' options. + +`-p PREFIX' +`--name-prefix=PREFIX' + Rename the external symbols used in the parser so that they start + with PREFIX instead of `yy'. The precise list of symbols renamed + is `yyparse', `yylex', `yyerror', `yynerrs', `yylval', `yychar' + and `yydebug'. + + For example, if you use `-p c', the names become `cparse', `clex', + and so on. + + *Note Multiple Parsers in the Same Program: Multiple Parsers. + +`-r' +`--raw' + Pretend that `%raw' was specified. *Note Decl Summary::. + +`-t' +`--debug' + Output a definition of the macro `YYDEBUG' into the parser file, + so that the debugging facilities are compiled. *Note Debugging + Your Parser: Debugging. + +`-v' +`--verbose' + Write an extra output file containing verbose descriptions of the + parser states and what is done for each type of look-ahead token in + that state. + + This file also describes all the conflicts, both those resolved by + operator precedence and the unresolved ones. + + The file's name is made by removing `.tab.c' or `.c' from the + parser output file name, and adding `.output' instead. + + Therefore, if the input file is `foo.y', then the parser file is + called `foo.tab.c' by default. As a consequence, the verbose + output file is called `foo.output'. + +`-V' +`--version' + Print the version number of Bison and exit. + +`-h' +`--help' + Print a summary of the command-line options to Bison and exit. + +`-y' +`--yacc' +`--fixed-output-files' + Equivalent to `-o y.tab.c'; the parser output file is called + `y.tab.c', and the other outputs are called `y.output' and + `y.tab.h'. The purpose of this option is to imitate Yacc's output + file name conventions. Thus, the following shell script can + substitute for Yacc: + + bison -y $* + + +File: bison.info, Node: Option Cross Key, Next: VMS Invocation, Prev: Bison Options, Up: Invocation + +Option Cross Key +================ + + Here is a list of options, alphabetized by long option, to help you +find the corresponding short option. + + --debug -t + --defines -d + --file-prefix=PREFIX -b FILE-PREFIX + --fixed-output-files --yacc -y + --help -h + --name-prefix=PREFIX -p NAME-PREFIX + --no-lines -l + --no-parser -n + --output-file=OUTFILE -o OUTFILE + --raw -r + --token-table -k + --verbose -v + --version -V + + +File: bison.info, Node: VMS Invocation, Prev: Option Cross Key, Up: Invocation + +Invoking Bison under VMS +======================== + + The command line syntax for Bison on VMS is a variant of the usual +Bison command syntax--adapted to fit VMS conventions. + + To find the VMS equivalent for any Bison option, start with the long +option, and substitute a `/' for the leading `--', and substitute a `_' +for each `-' in the name of the long option. For example, the +following invocation under VMS: + + bison /debug/name_prefix=bar foo.y + +is equivalent to the following command under POSIX. + + bison --debug --name-prefix=bar foo.y + + The VMS file system does not permit filenames such as `foo.tab.c'. +In the above example, the output file would instead be named +`foo_tab.c'. + + +File: bison.info, Node: Table of Symbols, Next: Glossary, Prev: Invocation, Up: Top + +Bison Symbols +************* + +`error' + A token name reserved for error recovery. This token may be used + in grammar rules so as to allow the Bison parser to recognize an + error in the grammar without halting the process. In effect, a + sentence containing an error may be recognized as valid. On a + parse error, the token `error' becomes the current look-ahead + token. Actions corresponding to `error' are then executed, and + the look-ahead token is reset to the token that originally caused + the violation. *Note Error Recovery::. + +`YYABORT' + Macro to pretend that an unrecoverable syntax error has occurred, + by making `yyparse' return 1 immediately. The error reporting + function `yyerror' is not called. *Note The Parser Function + `yyparse': Parser Function. + +`YYACCEPT' + Macro to pretend that a complete utterance of the language has been + read, by making `yyparse' return 0 immediately. *Note The Parser + Function `yyparse': Parser Function. + +`YYBACKUP' + Macro to discard a value from the parser stack and fake a + look-ahead token. *Note Special Features for Use in Actions: + Action Features. + +`YYERROR' + Macro to pretend that a syntax error has just been detected: call + `yyerror' and then perform normal error recovery if possible + (*note Error Recovery::), or (if recovery is impossible) make + `yyparse' return 1. *Note Error Recovery::. + +`YYERROR_VERBOSE' + Macro that you define with `#define' in the Bison declarations + section to request verbose, specific error message strings when + `yyerror' is called. + +`YYINITDEPTH' + Macro for specifying the initial size of the parser stack. *Note + Stack Overflow::. + +`YYLEX_PARAM' + Macro for specifying an extra argument (or list of extra + arguments) for `yyparse' to pass to `yylex'. *Note Calling + Conventions for Pure Parsers: Pure Calling. + +`YYLTYPE' + Macro for the data type of `yylloc'; a structure with four + members. *Note Textual Positions of Tokens: Token Positions. + +`yyltype' + Default value for YYLTYPE. + +`YYMAXDEPTH' + Macro for specifying the maximum size of the parser stack. *Note + Stack Overflow::. + +`YYPARSE_PARAM' + Macro for specifying the name of a parameter that `yyparse' should + accept. *Note Calling Conventions for Pure Parsers: Pure Calling. + +`YYRECOVERING' + Macro whose value indicates whether the parser is recovering from a + syntax error. *Note Special Features for Use in Actions: Action + Features. + +`YYSTYPE' + Macro for the data type of semantic values; `int' by default. + *Note Data Types of Semantic Values: Value Type. + +`yychar' + External integer variable that contains the integer value of the + current look-ahead token. (In a pure parser, it is a local + variable within `yyparse'.) Error-recovery rule actions may + examine this variable. *Note Special Features for Use in Actions: + Action Features. + +`yyclearin' + Macro used in error-recovery rule actions. It clears the previous + look-ahead token. *Note Error Recovery::. + +`yydebug' + External integer variable set to zero by default. If `yydebug' is + given a nonzero value, the parser will output information on input + symbols and parser action. *Note Debugging Your Parser: Debugging. + +`yyerrok' + Macro to cause parser to recover immediately to its normal mode + after a parse error. *Note Error Recovery::. + +`yyerror' + User-supplied function to be called by `yyparse' on error. The + function receives one argument, a pointer to a character string + containing an error message. *Note The Error Reporting Function + `yyerror': Error Reporting. + +`yylex' + User-supplied lexical analyzer function, called with no arguments + to get the next token. *Note The Lexical Analyzer Function + `yylex': Lexical. + +`yylval' + External variable in which `yylex' should place the semantic value + associated with a token. (In a pure parser, it is a local + variable within `yyparse', and its address is passed to `yylex'.) + *Note Semantic Values of Tokens: Token Values. + +`yylloc' + External variable in which `yylex' should place the line and + column numbers associated with a token. (In a pure parser, it is a + local variable within `yyparse', and its address is passed to + `yylex'.) You can ignore this variable if you don't use the `@' + feature in the grammar actions. *Note Textual Positions of + Tokens: Token Positions. + +`yynerrs' + Global variable which Bison increments each time there is a parse + error. (In a pure parser, it is a local variable within + `yyparse'.) *Note The Error Reporting Function `yyerror': Error + Reporting. + +`yyparse' + The parser function produced by Bison; call this function to start + parsing. *Note The Parser Function `yyparse': Parser Function. + +`%left' + Bison declaration to assign left associativity to token(s). *Note + Operator Precedence: Precedence Decl. + +`%no_lines' + Bison declaration to avoid generating `#line' directives in the + parser file. *Note Decl Summary::. + +`%nonassoc' + Bison declaration to assign nonassociativity to token(s). *Note + Operator Precedence: Precedence Decl. + +`%prec' + Bison declaration to assign a precedence to a specific rule. + *Note Context-Dependent Precedence: Contextual Precedence. + +`%pure_parser' + Bison declaration to request a pure (reentrant) parser. *Note A + Pure (Reentrant) Parser: Pure Decl. + +`%raw' + Bison declaration to use Bison internal token code numbers in token + tables instead of the usual Yacc-compatible token code numbers. + *Note Decl Summary::. + +`%right' + Bison declaration to assign right associativity to token(s). + *Note Operator Precedence: Precedence Decl. + +`%start' + Bison declaration to specify the start symbol. *Note The + Start-Symbol: Start Decl. + +`%token' + Bison declaration to declare token(s) without specifying + precedence. *Note Token Type Names: Token Decl. + +`%token_table' + Bison declaration to include a token name table in the parser file. + *Note Decl Summary::. + +`%type' + Bison declaration to declare nonterminals. *Note Nonterminal + Symbols: Type Decl. + +`%union' + Bison declaration to specify several possible data types for + semantic values. *Note The Collection of Value Types: Union Decl. + + These are the punctuation and delimiters used in Bison input: + +`%%' + Delimiter used to separate the grammar rule section from the Bison + declarations section or the additional C code section. *Note The + Overall Layout of a Bison Grammar: Grammar Layout. + +`%{ %}' + All code listed between `%{' and `%}' is copied directly to the + output file uninterpreted. Such code forms the "C declarations" + section of the input file. *Note Outline of a Bison Grammar: + Grammar Outline. + +`/*...*/' + Comment delimiters, as in C. + +`:' + Separates a rule's result from its components. *Note Syntax of + Grammar Rules: Rules. + +`;' + Terminates a rule. *Note Syntax of Grammar Rules: Rules. + +`|' + Separates alternate rules for the same result nonterminal. *Note + Syntax of Grammar Rules: Rules. + + +File: bison.info, Node: Glossary, Next: Index, Prev: Table of Symbols, Up: Top + +Glossary +******** + +Backus-Naur Form (BNF) + Formal method of specifying context-free grammars. BNF was first + used in the `ALGOL-60' report, 1963. *Note Languages and + Context-Free Grammars: Language and Grammar. + +Context-free grammars + Grammars specified as rules that can be applied regardless of + context. Thus, if there is a rule which says that an integer can + be used as an expression, integers are allowed _anywhere_ an + expression is permitted. *Note Languages and Context-Free + Grammars: Language and Grammar. + +Dynamic allocation + Allocation of memory that occurs during execution, rather than at + compile time or on entry to a function. + +Empty string + Analogous to the empty set in set theory, the empty string is a + character string of length zero. + +Finite-state stack machine + A "machine" that has discrete states in which it is said to exist + at each instant in time. As input to the machine is processed, the + machine moves from state to state as specified by the logic of the + machine. In the case of the parser, the input is the language + being parsed, and the states correspond to various stages in the + grammar rules. *Note The Bison Parser Algorithm: Algorithm. + +Grouping + A language construct that is (in general) grammatically divisible; + for example, `expression' or `declaration' in C. *Note Languages + and Context-Free Grammars: Language and Grammar. + +Infix operator + An arithmetic operator that is placed between the operands on + which it performs some operation. + +Input stream + A continuous flow of data between devices or programs. + +Language construct + One of the typical usage schemas of the language. For example, + one of the constructs of the C language is the `if' statement. + *Note Languages and Context-Free Grammars: Language and Grammar. + +Left associativity + Operators having left associativity are analyzed from left to + right: `a+b+c' first computes `a+b' and then combines with `c'. + *Note Operator Precedence: Precedence. + +Left recursion + A rule whose result symbol is also its first component symbol; for + example, `expseq1 : expseq1 ',' exp;'. *Note Recursive Rules: + Recursion. + +Left-to-right parsing + Parsing a sentence of a language by analyzing it token by token + from left to right. *Note The Bison Parser Algorithm: Algorithm. + +Lexical analyzer (scanner) + A function that reads an input stream and returns tokens one by + one. *Note The Lexical Analyzer Function `yylex': Lexical. + +Lexical tie-in + A flag, set by actions in the grammar rules, which alters the way + tokens are parsed. *Note Lexical Tie-ins::. + +Literal string token + A token which constists of two or more fixed characters. *Note + Symbols::. + +Look-ahead token + A token already read but not yet shifted. *Note Look-Ahead + Tokens: Look-Ahead. + +LALR(1) + The class of context-free grammars that Bison (like most other + parser generators) can handle; a subset of LR(1). *Note + Mysterious Reduce/Reduce Conflicts: Mystery Conflicts. + +LR(1) + The class of context-free grammars in which at most one token of + look-ahead is needed to disambiguate the parsing of any piece of + input. + +Nonterminal symbol + A grammar symbol standing for a grammatical construct that can be + expressed through rules in terms of smaller constructs; in other + words, a construct that is not a token. *Note Symbols::. + +Parse error + An error encountered during parsing of an input stream due to + invalid syntax. *Note Error Recovery::. + +Parser + A function that recognizes valid sentences of a language by + analyzing the syntax structure of a set of tokens passed to it + from a lexical analyzer. + +Postfix operator + An arithmetic operator that is placed after the operands upon + which it performs some operation. + +Reduction + Replacing a string of nonterminals and/or terminals with a single + nonterminal, according to a grammar rule. *Note The Bison Parser + Algorithm: Algorithm. + +Reentrant + A reentrant subprogram is a subprogram which can be in invoked any + number of times in parallel, without interference between the + various invocations. *Note A Pure (Reentrant) Parser: Pure Decl. + +Reverse polish notation + A language in which all operators are postfix operators. + +Right recursion + A rule whose result symbol is also its last component symbol; for + example, `expseq1: exp ',' expseq1;'. *Note Recursive Rules: + Recursion. + +Semantics + In computer languages, the semantics are specified by the actions + taken for each instance of the language, i.e., the meaning of each + statement. *Note Defining Language Semantics: Semantics. + +Shift + A parser is said to shift when it makes the choice of analyzing + further input from the stream rather than reducing immediately some + already-recognized rule. *Note The Bison Parser Algorithm: + Algorithm. + +Single-character literal + A single character that is recognized and interpreted as is. + *Note From Formal Rules to Bison Input: Grammar in Bison. + +Start symbol + The nonterminal symbol that stands for a complete valid utterance + in the language being parsed. The start symbol is usually listed + as the first nonterminal symbol in a language specification. + *Note The Start-Symbol: Start Decl. + +Symbol table + A data structure where symbol names and associated data are stored + during parsing to allow for recognition and use of existing + information in repeated uses of a symbol. *Note Multi-function + Calc::. + +Token + A basic, grammatically indivisible unit of a language. The symbol + that describes a token in the grammar is a terminal symbol. The + input of the Bison parser is a stream of tokens which comes from + the lexical analyzer. *Note Symbols::. + +Terminal symbol + A grammar symbol that has no rules in the grammar and therefore is + grammatically indivisible. The piece of text it represents is a + token. *Note Languages and Context-Free Grammars: Language and + Grammar. + |