diff options
Diffstat (limited to 'tools/bison++/bison.info-1')
-rw-r--r-- | tools/bison++/bison.info-1 | 1070 |
1 files changed, 1070 insertions, 0 deletions
diff --git a/tools/bison++/bison.info-1 b/tools/bison++/bison.info-1 new file mode 100644 index 000000000..3b458c6ed --- /dev/null +++ b/tools/bison++/bison.info-1 @@ -0,0 +1,1070 @@ +This is bison.info, produced by makeinfo version 4.1 from bison.texinfo. + +START-INFO-DIR-ENTRY +* bison: (bison). GNU Project parser generator (yacc replacement). +END-INFO-DIR-ENTRY + + This file documents the Bison parser generator. + + Copyright (C) 1988, 89, 90, 91, 92, 93, 95, 98, 1999 Free Software +Foundation, Inc. + + Permission is granted to make and distribute verbatim copies of this +manual provided the copyright notice and this permission notice are +preserved on all copies. + + Permission is granted to copy and distribute modified versions of +this manual under the conditions for verbatim copying, provided also +that the sections entitled "GNU General Public License" and "Conditions +for Using Bison" are included exactly as in the original, and provided +that the entire resulting derived work is distributed under the terms +of a permission notice identical to this one. + + Permission is granted to copy and distribute translations of this +manual into another language, under the above conditions for modified +versions, except that the sections entitled "GNU General Public +License", "Conditions for Using Bison" and this permission notice may be +included in translations approved by the Free Software Foundation +instead of in the original English. + + +File: bison.info, Node: Top, Next: Introduction, Prev: (dir), Up: (dir) + + This manual documents version 2.21.5 of Bison. + +* Menu: + +* Introduction:: +* Conditions:: +* Copying:: The GNU General Public License says + how you can copy and share Bison + +Tutorial sections: +* Concepts:: Basic concepts for understanding Bison. +* Examples:: Three simple explained examples of using Bison. + +Reference sections: +* Grammar File:: Writing Bison declarations and rules. +* Interface:: C-language interface to the parser function `yyparse'. +* Algorithm:: How the Bison parser works at run-time. +* Error Recovery:: Writing rules for error recovery. +* Context Dependency:: What to do if your language syntax is too + messy for Bison to handle straightforwardly. +* Debugging:: Debugging Bison parsers that parse wrong. +* Invocation:: How to run Bison (to produce the parser source file). +* Table of Symbols:: All the keywords of the Bison language are explained. +* Glossary:: Basic concepts are explained. +* Index:: Cross-references to the text. + + --- The Detailed Node Listing --- + +The Concepts of Bison + +* Language and Grammar:: Languages and context-free grammars, + as mathematical ideas. +* Grammar in Bison:: How we represent grammars for Bison's sake. +* Semantic Values:: Each token or syntactic grouping can have + a semantic value (the value of an integer, + the name of an identifier, etc.). +* Semantic Actions:: Each rule can have an action containing C code. +* Bison Parser:: What are Bison's input and output, + how is the output used? +* Stages:: Stages in writing and running Bison grammars. +* Grammar Layout:: Overall structure of a Bison grammar file. + +Examples + +* RPN Calc:: Reverse polish notation calculator; + a first example with no operator precedence. +* Infix Calc:: Infix (algebraic) notation calculator. + Operator precedence is introduced. +* Simple Error Recovery:: Continuing after syntax errors. +* Multi-function Calc:: Calculator with memory and trig functions. + It uses multiple data-types for semantic values. +* Exercises:: Ideas for improving the multi-function calculator. + +Reverse Polish Notation Calculator + +* Decls: Rpcalc Decls. Bison and C declarations for rpcalc. +* Rules: Rpcalc Rules. Grammar Rules for rpcalc, with explanation. +* Lexer: Rpcalc Lexer. The lexical analyzer. +* Main: Rpcalc Main. The controlling function. +* Error: Rpcalc Error. The error reporting function. +* Gen: Rpcalc Gen. Running Bison on the grammar file. +* Comp: Rpcalc Compile. Run the C compiler on the output code. + +Grammar Rules for `rpcalc' + +* Rpcalc Input:: +* Rpcalc Line:: +* Rpcalc Expr:: + +Multi-Function Calculator: `mfcalc' + +* Decl: Mfcalc Decl. Bison declarations for multi-function calculator. +* Rules: Mfcalc Rules. Grammar rules for the calculator. +* Symtab: Mfcalc Symtab. Symbol table management subroutines. + +Bison Grammar Files + +* Grammar Outline:: Overall layout of the grammar file. +* Symbols:: Terminal and nonterminal symbols. +* Rules:: How to write grammar rules. +* Recursion:: Writing recursive rules. +* Semantics:: Semantic values and actions. +* Declarations:: All kinds of Bison declarations are described here. +* Multiple Parsers:: Putting more than one Bison parser in one program. + +Outline of a Bison Grammar + +* C Declarations:: Syntax and usage of the C declarations section. +* Bison Declarations:: Syntax and usage of the Bison declarations section. +* Grammar Rules:: Syntax and usage of the grammar rules section. +* C Code:: Syntax and usage of the additional C code section. + +Defining Language Semantics + +* Value Type:: Specifying one data type for all semantic values. +* Multiple Types:: Specifying several alternative data types. +* Actions:: An action is the semantic definition of a grammar rule. +* Action Types:: Specifying data types for actions to operate on. +* Mid-Rule Actions:: Most actions go at the end of a rule. + This says when, why and how to use the exceptional + action in the middle of a rule. + +Bison Declarations + +* Token Decl:: Declaring terminal symbols. +* Precedence Decl:: Declaring terminals with precedence and associativity. +* Union Decl:: Declaring the set of all semantic value types. +* Type Decl:: Declaring the choice of type for a nonterminal symbol. +* Expect Decl:: Suppressing warnings about shift/reduce conflicts. +* Start Decl:: Specifying the start symbol. +* Pure Decl:: Requesting a reentrant parser. +* Decl Summary:: Table of all Bison declarations. + +Parser C-Language Interface + +* Parser Function:: How to call `yyparse' and what it returns. +* Lexical:: You must supply a function `yylex' + which reads tokens. +* Error Reporting:: You must supply a function `yyerror'. +* Action Features:: Special features for use in actions. + +The Lexical Analyzer Function `yylex' + +* Calling Convention:: How `yyparse' calls `yylex'. +* Token Values:: How `yylex' must return the semantic value + of the token it has read. +* Token Positions:: How `yylex' must return the text position + (line number, etc.) of the token, if the + actions want that. +* Pure Calling:: How the calling convention differs + in a pure parser (*note A Pure (Reentrant) Parser: Pure Decl.). + +The Bison Parser Algorithm + +* Look-Ahead:: Parser looks one token ahead when deciding what to do. +* Shift/Reduce:: Conflicts: when either shifting or reduction is valid. +* Precedence:: Operator precedence works by resolving conflicts. +* Contextual Precedence:: When an operator's precedence depends on context. +* Parser States:: The parser is a finite-state-machine with stack. +* Reduce/Reduce:: When two rules are applicable in the same situation. +* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified. +* Stack Overflow:: What happens when stack gets full. How to avoid it. + +Operator Precedence + +* Why Precedence:: An example showing why precedence is needed. +* Using Precedence:: How to specify precedence in Bison grammars. +* Precedence Examples:: How these features are used in the previous example. +* How Precedence:: How they work. + +Handling Context Dependencies + +* Semantic Tokens:: Token parsing can depend on the semantic context. +* Lexical Tie-ins:: Token parsing can depend on the syntactic context. +* Tie-in Recovery:: Lexical tie-ins have implications for how + error recovery rules must be written. + +Invoking Bison + +* Bison Options:: All the options described in detail, + in alphabetical order by short options. +* Option Cross Key:: Alphabetical list of long options. +* VMS Invocation:: Bison command syntax on VMS. + + +File: bison.info, Node: Introduction, Next: Conditions, Prev: Top, Up: Top + +Introduction +************ + + "Bison" is a general-purpose parser generator that converts a +grammar description for an LALR(1) context-free grammar into a C +program to parse that grammar. Once you are proficient with Bison, you +may use it to develop a wide range of language parsers, from those used +in simple desk calculators to complex programming languages. + + Bison is upward compatible with Yacc: all properly-written Yacc +grammars ought to work with Bison with no change. Anyone familiar with +Yacc should be able to use Bison with little trouble. You need to be +fluent in C programming in order to use Bison or to understand this +manual. + + We begin with tutorial chapters that explain the basic concepts of +using Bison and show three explained examples, each building on the +last. If you don't know Bison or Yacc, start by reading these +chapters. Reference chapters follow which describe specific aspects of +Bison in detail. + + Bison was written primarily by Robert Corbett; Richard Stallman made +it Yacc-compatible. Wilfred Hansen of Carnegie Mellon University added +multicharacter string literals and other features. + + This edition corresponds to version 2.21.5 of Bison. + + +File: bison.info, Node: Conditions, Next: Copying, Prev: Introduction, Up: Top + +Conditions for Using Bison +************************** + + As of Bison version 1.24, we have changed the distribution terms for +`yyparse' to permit using Bison's output in non-free programs. +Formerly, Bison parsers could be used only in programs that were free +software. + + The other GNU programming tools, such as the GNU C compiler, have +never had such a requirement. They could always be used for non-free +software. The reason Bison was different was not due to a special +policy decision; it resulted from applying the usual General Public +License to all of the Bison source code. + + The output of the Bison utility--the Bison parser file--contains a +verbatim copy of a sizable piece of Bison, which is the code for the +`yyparse' function. (The actions from your grammar are inserted into +this function at one point, but the rest of the function is not +changed.) When we applied the GPL terms to the code for `yyparse', the +effect was to restrict the use of Bison output to free software. + + We didn't change the terms because of sympathy for people who want to +make software proprietary. *Software should be free.* But we +concluded that limiting Bison's use to free software was doing little to +encourage people to make other software free. So we decided to make the +practical conditions for using Bison match the practical conditions for +using the other GNU tools. + + +File: bison.info, Node: Copying, Next: Concepts, Prev: Conditions, Up: Top + +GNU GENERAL PUBLIC LICENSE +************************** + + Version 2, June 1991 + Copyright (C) 1989, 1991 Free Software Foundation, Inc. + 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA + + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + +Preamble +======== + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software--to make sure the software is free for all its users. This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Library General Public License instead.) You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it in +new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, +and (2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. + + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + 0. This License applies to any program or other work which contains a + notice placed by the copyright holder saying it may be distributed + under the terms of this General Public License. The "Program", + below, refers to any such program or work, and a "work based on + the Program" means either the Program or any derivative work under + copyright law: that is to say, a work containing the Program or a + portion of it, either verbatim or with modifications and/or + translated into another language. (Hereinafter, translation is + included without limitation in the term "modification".) Each + licensee is addressed as "you". + + Activities other than copying, distribution and modification are + not covered by this License; they are outside its scope. The act + of running the Program is not restricted, and the output from the + Program is covered only if its contents constitute a work based on + the Program (independent of having been made by running the + Program). Whether that is true depends on what the Program does. + + 1. You may copy and distribute verbatim copies of the Program's + source code as you receive it, in any medium, provided that you + conspicuously and appropriately publish on each copy an appropriate + copyright notice and disclaimer of warranty; keep intact all the + notices that refer to this License and to the absence of any + warranty; and give any other recipients of the Program a copy of + this License along with the Program. + + You may charge a fee for the physical act of transferring a copy, + and you may at your option offer warranty protection in exchange + for a fee. + + 2. You may modify your copy or copies of the Program or any portion + of it, thus forming a work based on the Program, and copy and + distribute such modifications or work under the terms of Section 1 + above, provided that you also meet all of these conditions: + + a. You must cause the modified files to carry prominent notices + stating that you changed the files and the date of any change. + + b. You must cause any work that you distribute or publish, that + in whole or in part contains or is derived from the Program + or any part thereof, to be licensed as a whole at no charge + to all third parties under the terms of this License. + + c. If the modified program normally reads commands interactively + when run, you must cause it, when started running for such + interactive use in the most ordinary way, to print or display + an announcement including an appropriate copyright notice and + a notice that there is no warranty (or else, saying that you + provide a warranty) and that users may redistribute the + program under these conditions, and telling the user how to + view a copy of this License. (Exception: if the Program + itself is interactive but does not normally print such an + announcement, your work based on the Program is not required + to print an announcement.) + + These requirements apply to the modified work as a whole. If + identifiable sections of that work are not derived from the + Program, and can be reasonably considered independent and separate + works in themselves, then this License, and its terms, do not + apply to those sections when you distribute them as separate + works. But when you distribute the same sections as part of a + whole which is a work based on the Program, the distribution of + the whole must be on the terms of this License, whose permissions + for other licensees extend to the entire whole, and thus to each + and every part regardless of who wrote it. + + Thus, it is not the intent of this section to claim rights or + contest your rights to work written entirely by you; rather, the + intent is to exercise the right to control the distribution of + derivative or collective works based on the Program. + + In addition, mere aggregation of another work not based on the + Program with the Program (or with a work based on the Program) on + a volume of a storage or distribution medium does not bring the + other work under the scope of this License. + + 3. You may copy and distribute the Program (or a work based on it, + under Section 2) in object code or executable form under the terms + of Sections 1 and 2 above provided that you also do one of the + following: + + a. Accompany it with the complete corresponding machine-readable + source code, which must be distributed under the terms of + Sections 1 and 2 above on a medium customarily used for + software interchange; or, + + b. Accompany it with a written offer, valid for at least three + years, to give any third party, for a charge no more than your + cost of physically performing source distribution, a complete + machine-readable copy of the corresponding source code, to be + distributed under the terms of Sections 1 and 2 above on a + medium customarily used for software interchange; or, + + c. Accompany it with the information you received as to the offer + to distribute corresponding source code. (This alternative is + allowed only for noncommercial distribution and only if you + received the program in object code or executable form with + such an offer, in accord with Subsection b above.) + + The source code for a work means the preferred form of the work for + making modifications to it. For an executable work, complete + source code means all the source code for all modules it contains, + plus any associated interface definition files, plus the scripts + used to control compilation and installation of the executable. + However, as a special exception, the source code distributed need + not include anything that is normally distributed (in either + source or binary form) with the major components (compiler, + kernel, and so on) of the operating system on which the executable + runs, unless that component itself accompanies the executable. + + If distribution of executable or object code is made by offering + access to copy from a designated place, then offering equivalent + access to copy the source code from the same place counts as + distribution of the source code, even though third parties are not + compelled to copy the source along with the object code. + + 4. You may not copy, modify, sublicense, or distribute the Program + except as expressly provided under this License. Any attempt + otherwise to copy, modify, sublicense or distribute the Program is + void, and will automatically terminate your rights under this + License. However, parties who have received copies, or rights, + from you under this License will not have their licenses + terminated so long as such parties remain in full compliance. + + 5. You are not required to accept this License, since you have not + signed it. However, nothing else grants you permission to modify + or distribute the Program or its derivative works. These actions + are prohibited by law if you do not accept this License. + Therefore, by modifying or distributing the Program (or any work + based on the Program), you indicate your acceptance of this + License to do so, and all its terms and conditions for copying, + distributing or modifying the Program or works based on it. + + 6. Each time you redistribute the Program (or any work based on the + Program), the recipient automatically receives a license from the + original licensor to copy, distribute or modify the Program + subject to these terms and conditions. You may not impose any + further restrictions on the recipients' exercise of the rights + granted herein. You are not responsible for enforcing compliance + by third parties to this License. + + 7. If, as a consequence of a court judgment or allegation of patent + infringement or for any other reason (not limited to patent + issues), conditions are imposed on you (whether by court order, + agreement or otherwise) that contradict the conditions of this + License, they do not excuse you from the conditions of this + License. If you cannot distribute so as to satisfy simultaneously + your obligations under this License and any other pertinent + obligations, then as a consequence you may not distribute the + Program at all. For example, if a patent license would not permit + royalty-free redistribution of the Program by all those who + receive copies directly or indirectly through you, then the only + way you could satisfy both it and this License would be to refrain + entirely from distribution of the Program. + + If any portion of this section is held invalid or unenforceable + under any particular circumstance, the balance of the section is + intended to apply and the section as a whole is intended to apply + in other circumstances. + + It is not the purpose of this section to induce you to infringe any + patents or other property right claims or to contest validity of + any such claims; this section has the sole purpose of protecting + the integrity of the free software distribution system, which is + implemented by public license practices. Many people have made + generous contributions to the wide range of software distributed + through that system in reliance on consistent application of that + system; it is up to the author/donor to decide if he or she is + willing to distribute software through any other system and a + licensee cannot impose that choice. + + This section is intended to make thoroughly clear what is believed + to be a consequence of the rest of this License. + + 8. If the distribution and/or use of the Program is restricted in + certain countries either by patents or by copyrighted interfaces, + the original copyright holder who places the Program under this + License may add an explicit geographical distribution limitation + excluding those countries, so that distribution is permitted only + in or among countries not thus excluded. In such case, this + License incorporates the limitation as if written in the body of + this License. + + 9. The Free Software Foundation may publish revised and/or new + versions of the General Public License from time to time. Such + new versions will be similar in spirit to the present version, but + may differ in detail to address new problems or concerns. + + Each version is given a distinguishing version number. If the + Program specifies a version number of this License which applies + to it and "any later version", you have the option of following + the terms and conditions either of that version or of any later + version published by the Free Software Foundation. If the Program + does not specify a version number of this License, you may choose + any version ever published by the Free Software Foundation. + + 10. If you wish to incorporate parts of the Program into other free + programs whose distribution conditions are different, write to the + author to ask for permission. For software which is copyrighted + by the Free Software Foundation, write to the Free Software + Foundation; we sometimes make exceptions for this. Our decision + will be guided by the two goals of preserving the free status of + all derivatives of our free software and of promoting the sharing + and reuse of software generally. + + NO WARRANTY + + 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO + WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE + LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT + HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT + WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT + NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND + FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE + QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE + PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY + SERVICING, REPAIR OR CORRECTION. + + 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN + WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY + MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE + LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, + INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR + INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF + DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU + OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY + OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN + ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. + + END OF TERMS AND CONDITIONS + +How to Apply These Terms to Your New Programs +============================================= + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these +terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. + + ONE LINE TO GIVE THE PROGRAM'S NAME AND A BRIEF IDEA OF WHAT IT DOES. + Copyright (C) 19YY NAME OF AUTHOR + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place - Suite 330, + Boston, MA 02111-1307, USA. + + Also add information on how to contact you by electronic and paper +mail. + + If the program is interactive, make it output a short notice like +this when it starts in an interactive mode: + + Gnomovision version 69, Copyright (C) 19YY NAME OF AUTHOR + Gnomovision comes with ABSOLUTELY NO WARRANTY; for details + type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + + The hypothetical commands `show w' and `show c' should show the +appropriate parts of the General Public License. Of course, the +commands you use may be called something other than `show w' and `show +c'; they could even be mouse-clicks or menu items--whatever suits your +program. + + You should also get your employer (if you work as a programmer) or +your school, if any, to sign a "copyright disclaimer" for the program, +if necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the program + `Gnomovision' (which makes passes at compilers) written by James Hacker. + + SIGNATURE OF TY COON, 1 April 1989 + Ty Coon, President of Vice + + This General Public License does not permit incorporating your +program into proprietary programs. If your program is a subroutine +library, you may consider it more useful to permit linking proprietary +applications with the library. If this is what you want to do, use the +GNU Library General Public License instead of this License. + + +File: bison.info, Node: Concepts, Next: Examples, Prev: Copying, Up: Top + +The Concepts of Bison +********************* + + This chapter introduces many of the basic concepts without which the +details of Bison will not make sense. If you do not already know how to +use Bison or Yacc, we suggest you start by reading this chapter +carefully. + +* Menu: + +* Language and Grammar:: Languages and context-free grammars, + as mathematical ideas. +* Grammar in Bison:: How we represent grammars for Bison's sake. +* Semantic Values:: Each token or syntactic grouping can have + a semantic value (the value of an integer, + the name of an identifier, etc.). +* Semantic Actions:: Each rule can have an action containing C code. +* Bison Parser:: What are Bison's input and output, + how is the output used? +* Stages:: Stages in writing and running Bison grammars. +* Grammar Layout:: Overall structure of a Bison grammar file. + + +File: bison.info, Node: Language and Grammar, Next: Grammar in Bison, Up: Concepts + +Languages and Context-Free Grammars +=================================== + + In order for Bison to parse a language, it must be described by a +"context-free grammar". This means that you specify one or more +"syntactic groupings" and give rules for constructing them from their +parts. For example, in the C language, one kind of grouping is called +an `expression'. One rule for making an expression might be, "An +expression can be made of a minus sign and another expression". +Another would be, "An expression can be an integer". As you can see, +rules are often recursive, but there must be at least one rule which +leads out of the recursion. + + The most common formal system for presenting such rules for humans +to read is "Backus-Naur Form" or "BNF", which was developed in order to +specify the language Algol 60. Any grammar expressed in BNF is a +context-free grammar. The input to Bison is essentially +machine-readable BNF. + + Not all context-free languages can be handled by Bison, only those +that are LALR(1). In brief, this means that it must be possible to +tell how to parse any portion of an input string with just a single +token of look-ahead. Strictly speaking, that is a description of an +LR(1) grammar, and LALR(1) involves additional restrictions that are +hard to explain simply; but it is rare in actual practice to find an +LR(1) grammar that fails to be LALR(1). *Note Mysterious Reduce/Reduce +Conflicts: Mystery Conflicts, for more information on this. + + In the formal grammatical rules for a language, each kind of +syntactic unit or grouping is named by a "symbol". Those which are +built by grouping smaller constructs according to grammatical rules are +called "nonterminal symbols"; those which can't be subdivided are called +"terminal symbols" or "token types". We call a piece of input +corresponding to a single terminal symbol a "token", and a piece +corresponding to a single nonterminal symbol a "grouping". + + We can use the C language as an example of what symbols, terminal and +nonterminal, mean. The tokens of C are identifiers, constants (numeric +and string), and the various keywords, arithmetic operators and +punctuation marks. So the terminal symbols of a grammar for C include +`identifier', `number', `string', plus one symbol for each keyword, +operator or punctuation mark: `if', `return', `const', `static', `int', +`char', `plus-sign', `open-brace', `close-brace', `comma' and many +more. (These tokens can be subdivided into characters, but that is a +matter of lexicography, not grammar.) + + Here is a simple C function subdivided into tokens: + + int /* keyword `int' */ + square (x) /* identifier, open-paren, */ + /* identifier, close-paren */ + int x; /* keyword `int', identifier, semicolon */ + { /* open-brace */ + return x * x; /* keyword `return', identifier, */ + /* asterisk, identifier, semicolon */ + } /* close-brace */ + + The syntactic groupings of C include the expression, the statement, +the declaration, and the function definition. These are represented in +the grammar of C by nonterminal symbols `expression', `statement', +`declaration' and `function definition'. The full grammar uses dozens +of additional language constructs, each with its own nonterminal +symbol, in order to express the meanings of these four. The example +above is a function definition; it contains one declaration, and one +statement. In the statement, each `x' is an expression and so is `x * +x'. + + Each nonterminal symbol must have grammatical rules showing how it +is made out of simpler constructs. For example, one kind of C +statement is the `return' statement; this would be described with a +grammar rule which reads informally as follows: + + A `statement' can be made of a `return' keyword, an `expression' + and a `semicolon'. + +There would be many other rules for `statement', one for each kind of +statement in C. + + One nonterminal symbol must be distinguished as the special one which +defines a complete utterance in the language. It is called the "start +symbol". In a compiler, this means a complete input program. In the C +language, the nonterminal symbol `sequence of definitions and +declarations' plays this role. + + For example, `1 + 2' is a valid C expression--a valid part of a C +program--but it is not valid as an _entire_ C program. In the +context-free grammar of C, this follows from the fact that `expression' +is not the start symbol. + + The Bison parser reads a sequence of tokens as its input, and groups +the tokens using the grammar rules. If the input is valid, the end +result is that the entire token sequence reduces to a single grouping +whose symbol is the grammar's start symbol. If we use a grammar for C, +the entire input must be a `sequence of definitions and declarations'. +If not, the parser reports a syntax error. + + +File: bison.info, Node: Grammar in Bison, Next: Semantic Values, Prev: Language and Grammar, Up: Concepts + +From Formal Rules to Bison Input +================================ + + A formal grammar is a mathematical construct. To define the language +for Bison, you must write a file expressing the grammar in Bison syntax: +a "Bison grammar" file. *Note Bison Grammar Files: Grammar File. + + A nonterminal symbol in the formal grammar is represented in Bison +input as an identifier, like an identifier in C. By convention, it +should be in lower case, such as `expr', `stmt' or `declaration'. + + The Bison representation for a terminal symbol is also called a +"token type". Token types as well can be represented as C-like +identifiers. By convention, these identifiers should be upper case to +distinguish them from nonterminals: for example, `INTEGER', +`IDENTIFIER', `IF' or `RETURN'. A terminal symbol that stands for a +particular keyword in the language should be named after that keyword +converted to upper case. The terminal symbol `error' is reserved for +error recovery. *Note Symbols::. + + A terminal symbol can also be represented as a character literal, +just like a C character constant. You should do this whenever a token +is just a single character (parenthesis, plus-sign, etc.): use that +same character in a literal as the terminal symbol for that token. + + A third way to represent a terminal symbol is with a C string +constant containing several characters. *Note Symbols::, for more +information. + + The grammar rules also have an expression in Bison syntax. For +example, here is the Bison rule for a C `return' statement. The +semicolon in quotes is a literal character token, representing part of +the C syntax for the statement; the naked semicolon, and the colon, are +Bison punctuation used in every rule. + + stmt: RETURN expr ';' + ; + +*Note Syntax of Grammar Rules: Rules. + + +File: bison.info, Node: Semantic Values, Next: Semantic Actions, Prev: Grammar in Bison, Up: Concepts + +Semantic Values +=============== + + A formal grammar selects tokens only by their classifications: for +example, if a rule mentions the terminal symbol `integer constant', it +means that _any_ integer constant is grammatically valid in that +position. The precise value of the constant is irrelevant to how to +parse the input: if `x+4' is grammatical then `x+1' or `x+3989' is +equally grammatical. + + But the precise value is very important for what the input means +once it is parsed. A compiler is useless if it fails to distinguish +between 4, 1 and 3989 as constants in the program! Therefore, each +token in a Bison grammar has both a token type and a "semantic value". +*Note Defining Language Semantics: Semantics, for details. + + The token type is a terminal symbol defined in the grammar, such as +`INTEGER', `IDENTIFIER' or `',''. It tells everything you need to know +to decide where the token may validly appear and how to group it with +other tokens. The grammar rules know nothing about tokens except their +types. + + The semantic value has all the rest of the information about the +meaning of the token, such as the value of an integer, or the name of an +identifier. (A token such as `','' which is just punctuation doesn't +need to have any semantic value.) + + For example, an input token might be classified as token type +`INTEGER' and have the semantic value 4. Another input token might +have the same token type `INTEGER' but value 3989. When a grammar rule +says that `INTEGER' is allowed, either of these tokens is acceptable +because each is an `INTEGER'. When the parser accepts the token, it +keeps track of the token's semantic value. + + Each grouping can also have a semantic value as well as its +nonterminal symbol. For example, in a calculator, an expression +typically has a semantic value that is a number. In a compiler for a +programming language, an expression typically has a semantic value that +is a tree structure describing the meaning of the expression. + + +File: bison.info, Node: Semantic Actions, Next: Bison Parser, Prev: Semantic Values, Up: Concepts + +Semantic Actions +================ + + In order to be useful, a program must do more than parse input; it +must also produce some output based on the input. In a Bison grammar, +a grammar rule can have an "action" made up of C statements. Each time +the parser recognizes a match for that rule, the action is executed. +*Note Actions::. + + Most of the time, the purpose of an action is to compute the +semantic value of the whole construct from the semantic values of its +parts. For example, suppose we have a rule which says an expression +can be the sum of two expressions. When the parser recognizes such a +sum, each of the subexpressions has a semantic value which describes +how it was built up. The action for this rule should create a similar +sort of value for the newly recognized larger expression. + + For example, here is a rule that says an expression can be the sum of +two subexpressions: + + expr: expr '+' expr { $$ = $1 + $3; } + ; + +The action says how to produce the semantic value of the sum expression +from the values of the two subexpressions. + + +File: bison.info, Node: Bison Parser, Next: Stages, Prev: Semantic Actions, Up: Concepts + +Bison Output: the Parser File +============================= + + When you run Bison, you give it a Bison grammar file as input. The +output is a C source file that parses the language described by the +grammar. This file is called a "Bison parser". Keep in mind that the +Bison utility and the Bison parser are two distinct programs: the Bison +utility is a program whose output is the Bison parser that becomes part +of your program. + + The job of the Bison parser is to group tokens into groupings +according to the grammar rules--for example, to build identifiers and +operators into expressions. As it does this, it runs the actions for +the grammar rules it uses. + + The tokens come from a function called the "lexical analyzer" that +you must supply in some fashion (such as by writing it in C). The +Bison parser calls the lexical analyzer each time it wants a new token. +It doesn't know what is "inside" the tokens (though their semantic +values may reflect this). Typically the lexical analyzer makes the +tokens by parsing characters of text, but Bison does not depend on +this. *Note The Lexical Analyzer Function `yylex': Lexical. + + The Bison parser file is C code which defines a function named +`yyparse' which implements that grammar. This function does not make a +complete C program: you must supply some additional functions. One is +the lexical analyzer. Another is an error-reporting function which the +parser calls to report an error. In addition, a complete C program must +start with a function called `main'; you have to provide this, and +arrange for it to call `yyparse' or the parser will never run. *Note +Parser C-Language Interface: Interface. + + Aside from the token type names and the symbols in the actions you +write, all variable and function names used in the Bison parser file +begin with `yy' or `YY'. This includes interface functions such as the +lexical analyzer function `yylex', the error reporting function +`yyerror' and the parser function `yyparse' itself. This also includes +numerous identifiers used for internal purposes. Therefore, you should +avoid using C identifiers starting with `yy' or `YY' in the Bison +grammar file except for the ones defined in this manual. + + +File: bison.info, Node: Stages, Next: Grammar Layout, Prev: Bison Parser, Up: Concepts + +Stages in Using Bison +===================== + + The actual language-design process using Bison, from grammar +specification to a working compiler or interpreter, has these parts: + + 1. Formally specify the grammar in a form recognized by Bison (*note + Bison Grammar Files: Grammar File.). For each grammatical rule in + the language, describe the action that is to be taken when an + instance of that rule is recognized. The action is described by a + sequence of C statements. + + 2. Write a lexical analyzer to process input and pass tokens to the + parser. The lexical analyzer may be written by hand in C (*note + The Lexical Analyzer Function `yylex': Lexical.). It could also + be produced using Lex, but the use of Lex is not discussed in this + manual. + + 3. Write a controlling function that calls the Bison-produced parser. + + 4. Write error-reporting routines. + + To turn this source code as written into a runnable program, you +must follow these steps: + + 1. Run Bison on the grammar to produce the parser. + + 2. Compile the code output by Bison, as well as any other source + files. + + 3. Link the object files to produce the finished product. + + +File: bison.info, Node: Grammar Layout, Prev: Stages, Up: Concepts + +The Overall Layout of a Bison Grammar +===================================== + + The input file for the Bison utility is a "Bison grammar file". The +general form of a Bison grammar file is as follows: + + %{ + C DECLARATIONS + %} + + BISON DECLARATIONS + + %% + GRAMMAR RULES + %% + ADDITIONAL C CODE + +The `%%', `%{' and `%}' are punctuation that appears in every Bison +grammar file to separate the sections. + + The C declarations may define types and variables used in the +actions. You can also use preprocessor commands to define macros used +there, and use `#include' to include header files that do any of these +things. + + The Bison declarations declare the names of the terminal and +nonterminal symbols, and may also describe operator precedence and the +data types of semantic values of various symbols. + + The grammar rules define how to construct each nonterminal symbol +from its parts. + + The additional C code can contain any C code you want to use. Often +the definition of the lexical analyzer `yylex' goes here, plus +subroutines called by the actions in the grammar rules. In a simple +program, all the rest of the program can go here. + + +File: bison.info, Node: Examples, Next: Grammar File, Prev: Concepts, Up: Top + +Examples +******** + + Now we show and explain three sample programs written using Bison: a +reverse polish notation calculator, an algebraic (infix) notation +calculator, and a multi-function calculator. All three have been tested +under BSD Unix 4.3; each produces a usable, though limited, interactive +desk-top calculator. + + These examples are simple, but Bison grammars for real programming +languages are written the same way. You can copy these examples out of +the Info file and into a source file to try them. + +* Menu: + +* RPN Calc:: Reverse polish notation calculator; + a first example with no operator precedence. +* Infix Calc:: Infix (algebraic) notation calculator. + Operator precedence is introduced. +* Simple Error Recovery:: Continuing after syntax errors. +* Multi-function Calc:: Calculator with memory and trig functions. + It uses multiple data-types for semantic values. +* Exercises:: Ideas for improving the multi-function calculator. + + +File: bison.info, Node: RPN Calc, Next: Infix Calc, Up: Examples + +Reverse Polish Notation Calculator +================================== + + The first example is that of a simple double-precision "reverse +polish notation" calculator (a calculator using postfix operators). +This example provides a good starting point, since operator precedence +is not an issue. The second example will illustrate how operator +precedence is handled. + + The source code for this calculator is named `rpcalc.y'. The `.y' +extension is a convention used for Bison input files. + +* Menu: + +* Decls: Rpcalc Decls. Bison and C declarations for rpcalc. +* Rules: Rpcalc Rules. Grammar Rules for rpcalc, with explanation. +* Lexer: Rpcalc Lexer. The lexical analyzer. +* Main: Rpcalc Main. The controlling function. +* Error: Rpcalc Error. The error reporting function. +* Gen: Rpcalc Gen. Running Bison on the grammar file. +* Comp: Rpcalc Compile. Run the C compiler on the output code. + + +File: bison.info, Node: Rpcalc Decls, Next: Rpcalc Rules, Up: RPN Calc + +Declarations for `rpcalc' +------------------------- + + Here are the C and Bison declarations for the reverse polish notation +calculator. As in C, comments are placed between `/*...*/'. + + /* Reverse polish notation calculator. */ + + %{ + #define YYSTYPE double + #include <math.h> + %} + + %token NUM + + %% /* Grammar rules and actions follow */ + + The C declarations section (*note The C Declarations Section: C +Declarations.) contains two preprocessor directives. + + The `#define' directive defines the macro `YYSTYPE', thus specifying +the C data type for semantic values of both tokens and groupings (*note +Data Types of Semantic Values: Value Type.). The Bison parser will use +whatever type `YYSTYPE' is defined as; if you don't define it, `int' is +the default. Because we specify `double', each token and each +expression has an associated value, which is a floating point number. + + The `#include' directive is used to declare the exponentiation +function `pow'. + + The second section, Bison declarations, provides information to +Bison about the token types (*note The Bison Declarations Section: +Bison Declarations.). Each terminal symbol that is not a +single-character literal must be declared here. (Single-character +literals normally don't need to be declared.) In this example, all the +arithmetic operators are designated by single-character literals, so the +only terminal symbol that needs to be declared is `NUM', the token type +for numeric constants. + |