YACC is an LALR parser generator developed in the early 1970s by Stephen C. Johnson for the Unix operating system. It automatically generates LALR(1) parsers from formal grammar specifications. YACC plays an important role in compiler and interpreter development because it provides a way to specify the grammar of a language and to produce parsers that drive the interpretation or compilation of code written in that language.
Key Concepts and Features of YACC
- Grammar Specification: The input to YACC is a context-free grammar (usually in the Backus-Naur Form, BNF) that describes the syntax rules of the language it parses.
- Parser Generation: YACC translates the grammar into a C function, yyparse(), that efficiently parses input text according to those rules.
- LALR(1) Parsing: This is a bottom-up parsing method that uses a single token of lookahead to decide the next parsing action.
- Semantic Actions: Each grammar production can have an associated action, a fragment of code (usually C) that runs when the production is recognized. Actions are typically used to build abstract syntax trees, generate intermediate representations, or handle errors.
- Attribute Grammars: Grammar symbols carry attributes, which the semantic actions use to construct parse trees or to emit code.
- Integration with Lex: YACC is often used along with Lex, a tool that generates lexical analyzers (scanners), which break the input into tokens that are then processed by the YACC parser.
Semantic Actions and Attribute Grammars in YACC
The semantic actions associated with productions build an intermediate representation or target code as follows:
- Every nonterminal symbol in the parser has an attribute.
- The semantic action associated with a production can access the attributes of the symbols used in that production: the symbol '$n' in the semantic action, where n is an integer, designates the attribute of the n-th symbol on the RHS of the production, and the symbol '$$' designates the attribute of the LHS nonterminal symbol of the production.
- The semantic action uses the values of these attributes for building the intermediate representation or target code.
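To make the '$n' and '$$' notation concrete, the following C sketch models, in simplified form, what happens to the attribute values when a rule such as expr : expr '+' term { $$ = $1 + $3; } is reduced. The rule itself, the ValueStack type, and the function names are assumptions made for illustration; this is not the code that yacc emits, and the sketch omits overflow checks.

```c
#include <stddef.h>

#define STACK_MAX 64

/* Simplified model of the parser's value stack: one slot per grammar
 * symbol that has been shifted or produced by a reduction. */
typedef struct {
    double values[STACK_MAX];
    size_t top;               /* number of values currently stored */
} ValueStack;

/* Push the attribute of a newly shifted or reduced symbol. */
void push_value(ValueStack *s, double v)
{
    s->values[s->top++] = v;
}

/* Models the reduction for the hypothetical rule
 *     expr : expr '+' term   { $$ = $1 + $3; }
 * The three RHS symbols occupy the top three slots of the stack. */
void reduce_add(ValueStack *s)
{
    double *rhs = &s->values[s->top - 3]; /* rhs[0] is $1, rhs[2] is $3 */
    double result = rhs[0] + rhs[2];      /* $$ = $1 + $3 */
    s->top -= 3;                          /* pop the three RHS attributes */
    push_value(s, result);                /* push $$ for the LHS symbol */
}
```

For example, after pushing 2.0 (for expr), a placeholder value (for '+'), and 3.0 (for term), a call to reduce_add() leaves a single value, 5.0, on the stack. The slot for '+' exists because every RHS symbol has a position, even when its value is unused.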
A parser generator is a program that takes as input a specification of a language's syntax and produces as output a procedure for recognizing that language. Historically, parser generators have also been called compiler compilers. YACC (yet another compiler-compiler) is an LALR(1) (LookAhead, Left-to-right, Rightmost derivation producer with 1 lookahead token) parser generator. YACC was originally designed to be complemented by Lex.
Input File: A YACC input file is divided into three parts.
/* definitions */
....
%%
/* rules */
....
%%
/* auxiliary routines */
....
Input File: Definition Part:
- The definition part includes information about the tokens used in the syntax definition:
%token NUMBER
%token ID
- Yacc automatically assigns numbers to tokens, but the assignment can be overridden:
%token NUMBER 621
- Yacc also recognizes single characters as tokens. Therefore, assigned token numbers should not overlap ASCII codes.
- The definition part can include C code external to the definition of the parser and variable declarations, within %{ and %} in the first column.
- It can also include the specification of the starting symbol in the grammar:
%start nonterminal
Input File: Rule Part:
- The rules part contains grammar definitions in a modified BNF form.
- Actions are C code enclosed in { } and can be embedded inside the rules (translation schemes).
Input File: Auxiliary Routines Part:
- The auxiliary routines part is only C code.
- It includes function definitions for every function needed in the rules part.
- It can also contain the main() function definition if the parser is going to be run as a program.
- The main() function must call the function yyparse().
Input File:
- If yylex() is not defined in the auxiliary routines section, then it should be included:
#include "lex.yy.c"
- The name of a YACC input file generally ends with the extension .y
Output Files:
- The output of YACC is a file named y.tab.c
- If it contains the main() definition, it can be compiled into an executable program.
- Otherwise, the file provides the definition of the function int yyparse(), to be called from other code.
- If called with the -d option on the command line, Yacc also produces a header file y.tab.h with all its specific definitions (particularly important are the token definitions, to be included, for example, in a Lex input file).
- If called with the -v option, Yacc also produces a file y.output containing a textual description of the LALR(1) parsing table used by the parser. This is useful for tracking down how the parser resolves conflicts.
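The role of the token definitions in y.tab.h, and the reason token numbers must stay clear of character codes, can be sketched in C as follows. The two #define lines stand in for what a generated header might contain; the actual numbers are chosen by Yacc and differ between implementations. The next_token() helper is a hypothetical hand-written scanner that works on a string instead of standard input, so it can be tried on its own; a real yylex() would follow the same return conventions.

```c
#include <ctype.h>
#include <stddef.h>

/* Illustrative stand-ins for token definitions from a generated y.tab.h.
 * The values are assumptions; they only need to lie above the range of
 * single-character codes. */
#define NUMBER 257
#define ID     258

int yylval; /* value of the most recent token (int is the default YYSTYPE) */

/* Hypothetical helper: scans one token starting at *cursor, advances
 * *cursor past it, and returns the token code the way yylex() would. */
int next_token(const char **cursor)
{
    const char *p = *cursor;
    int tok;

    while (*p == ' ' || *p == '\t')      /* skip blanks and tabs */
        p++;

    if (*p == '\0') {
        tok = 0;                         /* 0 signals end of input */
    } else if (isdigit((unsigned char)*p)) {
        yylval = 0;
        while (isdigit((unsigned char)*p))
            yylval = yylval * 10 + (*p++ - '0');
        tok = NUMBER;                    /* named token, value in yylval */
    } else if (isalpha((unsigned char)*p)) {
        while (isalnum((unsigned char)*p))
            p++;
        tok = ID;
    } else {
        tok = (unsigned char)*p++;       /* a single character is its own code */
    }

    *cursor = p;
    return tok;
}
```

Scanning the string "x + 42" with repeated calls yields ID, then the character code of '+', then NUMBER with yylval set to 42, and finally 0. Because '+' is returned as its own character code, a named token numbered in the ASCII range would be indistinguishable from a literal character.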
Example: Yacc File (.y)
%{
#include <ctype.h>
#include <stdio.h>
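#define YYSTYPE double /* double type for yacc stack */
int yylex(void);        /* scanner, defined in lex.yy.c (included below) */
void yyerror(char *s);  /* error handler, defined below */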
%}
%%
Lines : Lines S '\n' { printf("OK \n"); }
      | S '\n'
      | error '\n' { yyerror("Error: reenter last line:");
                     yyerrok; };
S : '(' S ')'
  | '[' S ']'
  | /* empty */ ;
%%
#include "lex.yy.c"
void yyerror(char * s)
/* yacc error handler */
{
fprintf (stderr, "%s\n", s);
}
int main(void)
{
return yyparse();
}
Lex File (.l)
%{
%}
%%
[ \t] { /* skip blanks and tabs */ }
\n|. { return yytext[0]; }
%%
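As a cross-check on what the grammar in this example accepts, the rule S : '(' S ')' | '[' S ']' | empty can be mirrored by a small hand-written C function. This is only a sketch for comparison: the name matches_s and the fixed buffer size are assumptions, and the function is neither generated by Yacc nor based on LALR(1) parsing. It tests a single line without its trailing newline.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when `line` can be derived from
 *     S : '(' S ')' | '[' S ']' | empty
 * Blanks and tabs are ignored, mirroring the Lex rule that skips them. */
bool matches_s(const char *line)
{
    char buf[256];
    size_t n = 0;

    /* Copy the significant characters, dropping blanks and tabs. */
    for (const char *p = line; *p != '\0'; p++) {
        if (*p == ' ' || *p == '\t')
            continue;
        if (n == sizeof buf)
            return false;        /* too long for this sketch */
        buf[n++] = *p;
    }

    /* Peel matching bracket pairs from the outside in. */
    size_t lo = 0, hi = n;
    while (lo < hi) {
        if (hi - lo < 2)
            return false;        /* one unmatched character remains */
        char open = buf[lo];
        char close = buf[hi - 1];
        bool paired = (open == '(' && close == ')') ||
                      (open == '[' && close == ']');
        if (!paired)
            return false;
        lo++;
        hi--;
    }
    return true;                 /* empty remainder matches the empty rule */
}
```

With this definition, "([()])" and the empty string match, while "()[]" does not: the grammar describes one chain of nested brackets, not a sequence of bracket groups.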
For Compiling YACC Program:
- Write the Lex program in a file file.l and the Yacc program in a file file.y
- Open a terminal and navigate to the directory where you saved the files.
- type lex file.l
- type yacc file.y
- type cc y.tab.c -ll (y.tab.c already pulls in lex.yy.c through its #include line)
- type ./a.out
Conclusion
YACC is a long-established tool for generating LALR(1) parsers and has been widely used in compiler construction and language processing. By combining a grammar specification with semantic actions, YACC makes it possible to develop efficient parsers that analyze and interpret programming languages. Attributes attached to grammar symbols allow information to be manipulated during parsing, which supports the generation of intermediate code. Although more modern parsing tools exist, YACC's simplicity and its ability to handle complicated grammars keep it relevant. Used in conjunction with Lex, YACC forms a complete system covering both lexical analysis and parsing.