PUMA Library Reference Manual
Puma::Syntax Class Reference

#include <Puma/Syntax.h>

Description

Syntactic analysis base class.

Implements the top-down parsing algorithm (recursive descent parser). It is meant to be derived from to implement parsers for specific grammars. Provides infinite look-ahead.

This class uses a tree builder object (see Builder) to create the syntax tree, and a semantic analysis object (see Semantic) to perform required semantic analyses of the parsed code.

The parse process is started by calling Syntax::run() with a token provider as argument. Using the token provider this method reads the first core language token from the input source code and tries to parse it by applying the top grammar rule.

The top grammar rule has to be provided by reimplementing method Syntax::trans_unit(). It may call sub-rules according to the implemented language-specific grammar. Example:

Puma::CTree *MySyntax::trans_unit() {
  return parse(&MySyntax::block_seq) ? builder().block_seq() : (Puma::CTree*)0;
}

For context-sensitive grammars it may be necessary to perform initial semantic analyses of the parsed code within the grammar rules (to differentiate ambiguous syntactic constructs, resolve names, detect errors, and so on). Example:

Puma::CTree *MySyntax::block() {
  // '{' instruction instruction ... '}'
  if (parse(TOK_OPEN_CURLY)) {        // parse '{'
    semantic().enter_block();         // enter block scope
    seq(&MySyntax::instruction);      // parse sequence of instructions
    semantic().leave_block();         // leave block scope
    if (parse(TOK_CLOSE_CURLY)) {     // parse '}'
      return builder().block();       // build syntax tree for the block
    }
  }
  return (Puma::CTree*)0;             // rule failed
}

If a rule is parsed successfully, the tree builder is used to create a CTree-based syntax tree (fragment) for the parsed rule. Failing grammar rules shall return NULL. The result of the top grammar rule is the root node of the abstract syntax tree for the whole input source code.
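The rule-and-backtrack pattern described above can be sketched without the PUMA classes. The following is a minimal, self-contained illustration only: `MiniSyntax`, its string tokens, and its `bool` rule results are hypothetical stand-ins, and tree building and semantic analysis are left out. It shows a `parse()` helper that rewinds the token position when a rule fails.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parser derived from a syntax base class.
// None of these names belong to the PUMA API.
class MiniSyntax {
public:
  explicit MiniSyntax(std::vector<std::string> tokens)
      : tokens_(std::move(tokens)), pos_(0) {}

  // Top rule: one block, followed by the end of the input.
  bool trans_unit() {
    return parse(&MiniSyntax::block) && pos_ == tokens_.size();
  }

  // block: '{' instruction* '}'
  bool block() {
    if (!parse_token("{")) return false;
    while (parse(&MiniSyntax::instruction)) {}
    return parse_token("}");
  }

  // instruction: block | 'id' ';'
  bool instruction() {
    if (parse(&MiniSyntax::block)) return true;
    return parse_token("id") && parse_token(";");
  }

  std::size_t position() const { return pos_; }

private:
  // Apply a rule; if it fails, rewind to the position saved before the attempt.
  bool parse(bool (MiniSyntax::*rule)()) {
    const std::size_t saved = pos_;
    if ((this->*rule)()) return true;
    pos_ = saved;
    return false;
  }

  // Consume one token if it matches the expected text.
  bool parse_token(const std::string &expected) {
    if (pos_ < tokens_.size() && tokens_[pos_] == expected) {
      ++pos_;
      return true;
    }
    return false;
  }

  std::vector<std::string> tokens_;
  std::size_t pos_;
};
```

Because every rule is entered through `parse()`, a rule that consumed some tokens before failing leaves no trace, which is what allows alternatives to be tried in sequence.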

Classes

class State
    Parser state, the current position in the token stream.

Public Types

typedef std::bitset< TOK_NO > tokenset

Public Member Functions

pointcut parse_fct ()
    Interface for aspects that affect the syntax and parsing process.

pointcut check_fct ()

pointcut in_syntax ()

pointcut rule_exec ()

pointcut rule_call ()

pointcut rule_check ()

CTree * run (TokenProvider &tp)
    Start the parse process.

template<class T >
CTree * run (TokenProvider &tp, bool(T::*rule)())
    Start the parse process at a specific grammar rule.

virtual void configure (Config &c)
    Configure the syntactic analysis object.

TokenProvider * provider () const
    Get the token provider from which the parsed tokens are read.

Token * problem () const
    Get the last token that could not be parsed.

bool error () const
    Check if errors occurred during the parse process.

bool look_ahead (int token_type, unsigned n=1)
    Look-ahead n core language tokens and check if the n-th token has the given type.

bool look_ahead (int *token_types, unsigned n=1)
    Look-ahead n core language tokens and check if the n-th token has one of the given types.

int look_ahead (unsigned n=1)
    Look-ahead one core language token.

bool consume ()
    Consume all tokens until the next core language token.

bool predict_1 (const tokenset &ts)

template<class T >
bool parse (CTree *(T::*rule)())
    Parse the given grammar rule.

template<class T >
bool seq (CTree *(T::*rule)())
    Parse a sequence of the given grammar rule.

template<class T >
bool seq (bool(T::*rule)())
    Parse a sequence of the given grammar rule.

template<class T >
bool list (CTree *(T::*rule)(), int separator, bool trailing_separator=false)
    Parse a sequence of rule-separator pairs.

template<class T >
bool list (CTree *(T::*rule)(), int *separators, bool trailing_separator=false)
    Parse a sequence of rule-separator pairs.

template<class T >
bool list (bool(T::*rule)(), int separator, bool trailing_separator=false)
    Parse a sequence of rule-separator pairs.

template<class T >
bool list (bool(T::*rule)(), int *separators, bool trailing_separator=false)
    Parse a sequence of rule-separator pairs.

template<class T >
bool catch_error (bool(T::*rule)(), const char *msg, int *finish_tokens, int *skip_tokens)
    Parse a grammar rule automatically catching parse errors.

bool parse (int token_type)
    Parse a token with the given type.

bool parse (int *token_types)
    Parse a token with one of the given types.

bool parse_token (int token_type)
    Parse a token with the given type.

bool opt (bool dummy) const
    Optional rule parsing.

Builder & builder () const
    Get the syntax tree builder.

Semantic & semantic () const
    Get the semantic analysis object.

virtual bool trans_unit ()
    Top parse rule to be reimplemented for a specific grammar.

virtual void handle_directive ()
    Handle a compiler directive token.

State save_state ()
    Save the current parser state.

void forget_state ()
    Forget the saved parser state.

void restore_state ()
    Restore the saved parser state.

void restore_state (State state)
    Restore the saved parser state to the given state.

void set_state (State state)
    Overwrite the parser state with the given state.

bool accept (CTree *tree, State state)
    Accept the given syntax tree node.

CTree * accept (CTree *tree)
    Accept the given syntax tree node.

Token * locate_token ()
    Skip all non-core language tokens until the next core-language token is read.

void skip ()
    Skip the current token.

void skip_block (int start, int end, bool inclusive=true)
    Skip all tokens between start and end, including start and end token.

void skip_curly_block ()
    Skip all tokens between '{' and '}', including '{' and '}'.

void skip_round_block ()
    Skip all tokens between '(' and ')', including '(' and ')'.

bool parse_block (int start, int end)
    Parse all tokens between start and end, including start and end token.

bool parse_curly_block ()
    Parse all tokens between '{' and '}', including '{' and '}'.

bool parse_round_block ()
    Parse all tokens between '(' and ')', including '(' and ')'.

bool skip (int stop_token, bool inclusive=true)
    Skip all tokens until a token with the given type is read.

bool skip (int *stop_tokens, bool inclusive=true)
    Skip all tokens until a token with one of the given types is read.

bool is_in (int token_type, int *token_types) const
    Check if the given token type is in the set of given token types.

Static Public Member Functions

template<typename SYNTAX , typename RULE >
static bool seq (SYNTAX &s)
    Parse a sequence of the given grammar rule by calling RULE::check() in a loop.

template<typename SYNTAX , typename RULE >
static bool list (SYNTAX &s, int sep, bool trailing_sep=false)
    Parse a sequence of rule-separator pairs by calling RULE::check() in a loop.

template<typename SYNTAX , typename RULE >
static bool list (SYNTAX &s, int *separators, bool trailing_sep=false)
    Parse a sequence of rule-separator pairs by calling RULE::check() in a loop.

template<class SYNTAX , class RULE >
static bool catch_error (SYNTAX &s, const char *msg, int *finish_tokens, int *skip_tokens)
    Parse a grammar rule automatically catching parse errors.

template<class RULE1 , class RULE2 , class SYNTAX >
static bool ambiguous (SYNTAX &s)
    First parse RULE1 and, if that rule fails, discard all errors and parse RULE2.

Public Attributes

TokenProvider * token_provider
    Token provider for getting the tokens to parse.

Protected Member Functions

Syntax (Builder &b, Semantic &s)
    Constructor.

virtual ~Syntax ()
    Destructor.
 

Member Typedef Documentation

typedef std::bitset<TOK_NO> Puma::Syntax::tokenset

Constructor & Destructor Documentation

Puma::Syntax::Syntax ( Builder & b, Semantic & s )
inline, protected

Constructor.

Parameters
    b  The syntax tree builder.
    s  The semantic analysis object.

virtual Puma::Syntax::~Syntax ( )
inline, protected, virtual

Destructor.

Member Function Documentation

bool Puma::Syntax::accept ( CTree * tree, State state )

Accept the given syntax tree node.

If the node is NULL then the parser state is restored to the given state. Otherwise all saved states are discarded.

Parameters
    tree   Tree to accept.
    state  The saved state.
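
The restore-or-discard contract of this overload can be illustrated with a small stand-alone sketch. `Node` and `MiniParser` are hypothetical types, the parser state is reduced to a token index, and the return value (true for a non-NULL node) is an assumption, since the reference above does not state it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; these are not the PUMA classes.
struct Node {};

struct MiniParser {
  std::size_t pos = 0;             // current position in the token stream
  std::vector<std::size_t> saved;  // stack of saved positions

  // Remember the current position and hand it back to the caller.
  std::size_t save_state() {
    saved.push_back(pos);
    return pos;
  }

  // NULL node: rewind to `state`. Non-NULL node: discard all saved states.
  // Returning "node is non-NULL" is an assumption of this sketch.
  bool accept(Node *tree, std::size_t state) {
    if (tree == nullptr) {
      pos = state;
      return false;
    }
    saved.clear();
    return true;
  }
};
```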

CTree * Puma::Syntax::accept ( CTree * tree )

Accept the given syntax tree node.

Returns the given node.

Parameters
    tree  Tree to accept.

template<class RULE1 , class RULE2 , class SYNTAX >
bool Puma::Syntax::ambiguous ( SYNTAX & s )
inline, static

First parse RULE1 and, if that rule fails, discard all errors and parse RULE2.

Template Parameters
    RULE1   The class that represents the first grammar rule.
    RULE2   The class that represents the second grammar rule.
    SYNTAX  The type of syntax.

Parameters
    s  The syntax object on which the rules should be executed.
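
A minimal sketch of the try-first-then-fall-back idea follows. It is not the PUMA implementation: `MiniSyntax`, `Declaration`, and `Expression` are hypothetical, and the assumption that a rule class exposes a static `check(SYNTAX &)` is taken from the wording of the other static helpers on this page.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical syntax object: a token list, a position, and collected errors.
struct MiniSyntax {
  explicit MiniSyntax(std::vector<std::string> t)
      : tokens(std::move(t)), pos(0) {}

  // Consume the next token if it equals `expected`.
  bool eat(const std::string &expected) {
    if (pos < tokens.size() && tokens[pos] == expected) {
      ++pos;
      return true;
    }
    return false;
  }

  std::vector<std::string> tokens;
  std::size_t pos;
  std::vector<std::string> errors;
};

// Rule classes with a static check(); this shape is assumed for the sketch.
struct Declaration {  // 'type' 'id' ';'
  static bool check(MiniSyntax &s) {
    if (s.eat("type") && s.eat("id") && s.eat(";")) return true;
    s.errors.push_back("expected declaration");
    return false;
  }
};

struct Expression {  // 'id' ';'
  static bool check(MiniSyntax &s) {
    if (s.eat("id") && s.eat(";")) return true;
    s.errors.push_back("expected expression");
    return false;
  }
};

// Try RULE1; if it fails, rewind, drop the errors it reported, and try RULE2.
template <class RULE1, class RULE2, class SYNTAX>
bool ambiguous(SYNTAX &s) {
  const std::size_t saved_pos = s.pos;
  const std::size_t saved_errors = s.errors.size();
  if (RULE1::check(s)) return true;
  s.pos = saved_pos;
  s.errors.resize(saved_errors);
  return RULE2::check(s);
}
```

Discarding the first rule's errors matters here: a failed first attempt is expected for ambiguous input and should not surface as a diagnostic when the second rule succeeds.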

Builder & Puma::Syntax::builder ( ) const
inline

Get the syntax tree builder.

template<class T >
bool Puma::Syntax::catch_error ( bool(T::*)() rule, const char * msg, int * finish_tokens, int * skip_tokens )

Parse a grammar rule automatically catching parse errors.

Parameters
    rule           The rule to parse.
    msg            The error message to show if the rule fails.
    finish_tokens  Set of token types that abort parsing the rule.
    skip_tokens    If the rule fails skip all tokens until a token is read that has one of the types given here.

Returns
    False if at EOF or a finish_token is read, true otherwise.
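
The recovery scheme described by these parameters can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the token arrays are assumed to be zero-terminated, the rule is passed as a plain function pointer, and consuming the synchronisation token after skipping is a choice made here so that a caller's loop always makes progress. All names are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token types; 0 terminates token arrays and marks end of input.
enum { TOK_ID = 1, TOK_SEMI = 2, TOK_BAD = 3, TOK_CLOSE = 4 };

struct MiniParser {
  std::vector<int> tokens;
  std::size_t pos = 0;
  std::vector<std::string> messages;

  bool at_end() const { return pos >= tokens.size(); }
  int current() const { return at_end() ? 0 : tokens[pos]; }
};

// Membership test on a zero-terminated array (an assumption of this sketch).
bool is_in(int type, const int *types) {
  for (; *types != 0; ++types)
    if (*types == type) return true;
  return false;
}

// Example rule: ID ';'. Consumes tokens only on success.
bool statement(MiniParser &p) {
  if (p.pos + 1 < p.tokens.size() && p.tokens[p.pos] == TOK_ID &&
      p.tokens[p.pos + 1] == TOK_SEMI) {
    p.pos += 2;
    return true;
  }
  return false;
}

// Returns false at end of input or at a finish token, true otherwise.
// On a failed rule the message is recorded and input is skipped up to and
// including the next skip token.
bool catch_error(MiniParser &p, bool (*rule)(MiniParser &), const char *msg,
                 const int *finish_tokens, const int *skip_tokens) {
  if (p.at_end() || is_in(p.current(), finish_tokens)) return false;
  if (rule(p)) return true;
  p.messages.push_back(msg);
  while (!p.at_end() && !is_in(p.current(), skip_tokens)) ++p.pos;
  if (!p.at_end()) ++p.pos;  // also consume the synchronisation token
  return true;
}
```

A typical use is a loop such as `while (catch_error(p, &statement, "expected statement", finish, skip)) {}`, which keeps parsing statements after a malformed one instead of aborting at the first error.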

template<class SYNTAX , class RULE >
bool Puma::Syntax::catch_error ( SYNTAX & s, const char * msg, int * finish_tokens, int * skip_tokens )
static

Parse a grammar rule automatically catching parse errors.

Template Parameters
    SYNTAX  The type of syntax.
    RULE    The class that represents the grammar rule.

Parameters
    s              The syntax object on which the rule should be executed.
    msg            The error message to show if the rule fails.
    finish_tokens  Set of token types that abort parsing the rule.
    skip_tokens    If the rule fails skip all tokens until a token is read that has one of the types given here.

Returns
    False if at EOF or a finish_token is read, true otherwise.

pointcut Puma::Syntax::check_fct ( )

virtual void Puma::Syntax::configure ( Config & c )
virtual

Configure the syntactic analysis object.

Parameters
    c  The configuration object.

Reimplemented in Puma::CCSyntax, Puma::InstantiationSyntax, and Puma::CSyntax.

bool Puma::Syntax::consume ( )
inline

Consume all tokens until the next core language token.

bool Puma::Syntax::error ( ) const
inline

Check if errors occurred during the parse process.

void Puma::Syntax::forget_state ( )

Forget the saved parser state.

void Puma::Syntax::handle_directive ( )
inline, virtual

Handle a compiler directive token.

The default handling is to skip the compiler directive.

Reimplemented in Puma::CSyntax.

pointcut Puma::Syntax::in_syntax ( )

bool Puma::Syntax::is_in ( int token_type, int * token_types ) const

Check if the given token type is in the set of given token types.

Parameters
    token_type   The token type to check.
    token_types  The set of token types.

template<class T >
bool Puma::Syntax::list ( CTree *(T::*)() rule, int separator, bool trailing_separator = false )
inline

Parse a sequence of rule-separator pairs.

Parameters
    rule                The rule to parse at least once.
    separator           The separator token.
    trailing_separator  True if a trailing separator token is allowed.

Returns
    True if parsed successfully.
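
The following stand-alone sketch shows one way such a separated list can be parsed. The types and token numbers are hypothetical, and treating a dangling separator as a failure when `trailing_separator` is false is an assumption of this sketch rather than documented behaviour.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical token types.
enum { TOK_ID = 1, TOK_COMMA = 2 };

struct MiniParser {
  std::vector<int> tokens;
  std::size_t pos = 0;

  // Consume one token of the given type, if present.
  bool parse(int type) {
    if (pos < tokens.size() && tokens[pos] == type) {
      ++pos;
      return true;
    }
    return false;
  }
};

// Example rule: a single identifier.
bool item(MiniParser &p) { return p.parse(TOK_ID); }

// rule (separator rule)* with an optional trailing separator.
bool list(MiniParser &p, bool (*rule)(MiniParser &), int separator,
          bool trailing_separator = false) {
  if (!rule(p)) return false;  // the rule must match at least once
  while (p.parse(separator)) {
    // A separator not followed by the rule is fine only if trailing
    // separators are allowed.
    if (!rule(p)) return trailing_separator;
  }
  return true;
}
```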

template<class T >
bool Puma::Syntax::list ( CTree *(T::*)() rule, int * separators, bool trailing_separator = false )
inline

Parse a sequence of rule-separator pairs.

Parameters
    rule                The rule to parse at least once.
    separators          The separator tokens.
    trailing_separator  True if a trailing separator token is allowed.

Returns
    True if parsed successfully.

template<class T >
bool Puma::Syntax::list ( bool(T::*)() rule, int separator, bool trailing_separator = false )
inline

Parse a sequence of rule-separator pairs.

Parameters
    rule                The rule to parse at least once.
    separator           The separator token.
    trailing_separator  True if a trailing separator token is allowed.

Returns
    True if parsed successfully.

template<class T >
bool Puma::Syntax::list ( bool(T::*)() rule, int * separators, bool trailing_separator = false )
inline

Parse a sequence of rule-separator pairs.

Parameters
    rule                The rule to parse at least once.
    separators          The separator tokens.
    trailing_separator  True if a trailing separator token is allowed.

Returns
    True if parsed successfully.

template<typename SYNTAX , typename RULE >
bool Puma::Syntax::list ( SYNTAX & s, int sep, bool trailing_sep = false )
inline, static

Parse a sequence of rule-separator pairs by calling RULE::check() in a loop.

Template Parameters
    SYNTAX  The type of syntax.
    RULE    The class that represents the grammar rule.

Parameters
    s             The syntax object on which the rule should be executed.
    sep           The separator token.
    trailing_sep  True if a trailing separator token is allowed.

Returns
    True if parsed successfully.

template<typename SYNTAX , typename RULE >
bool Puma::Syntax::list ( SYNTAX & s, int * separators, bool trailing_sep = false )
inline, static

Parse a sequence of rule-separator pairs by calling RULE::check() in a loop.

Template Parameters
    SYNTAX  The type of syntax.
    RULE    The class that represents the grammar rule.

Parameters
    s             The syntax object on which the rule should be executed.
    separators    The separator tokens.
    trailing_sep  True if a trailing separator token is allowed.

Returns
    True if parsed successfully.

Token * Puma::Syntax::locate_token ( )

Skip all non-core language tokens until the next core-language token is read.

Returns
    The next core-language token.

bool Puma::Syntax::look_ahead ( int token_type, unsigned n = 1 )

Look-ahead n core language tokens and check if the n-th token has the given type.

Parameters
    token_type  The type of the n-th token.
    n           The number of tokens to look-ahead.

Returns
    True if the n-th token has the given type.

bool Puma::Syntax::look_ahead ( int * token_types, unsigned n = 1 )

Look-ahead n core language tokens and check if the n-th token has one of the given types.

Parameters
    token_types  The possible types of the n-th token.
    n            The number of tokens to look-ahead.

Returns
    True if the n-th token has one of the given types.
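
Look-ahead over core language tokens can be pictured with the sketch below. It is only an illustration: `Tok`, the `core` flag, and the two free functions are hypothetical, and counting `n = 1` as the next unconsumed core token is an assumption. The position is passed by value, so nothing is consumed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical token types; 0 is used for "no such token".
enum { TOK_WS = 1, TOK_ID = 2, TOK_SEMI = 3 };

// A token with a flag telling whether it belongs to the core language
// (as opposed to white space, comments, and similar).
struct Tok {
  int type;
  bool core;
};

// Type of the n-th core token at or after `pos`, or 0 if there is none.
int look_ahead_type(const std::vector<Tok> &tokens, std::size_t pos,
                    unsigned n = 1) {
  if (n == 0) return 0;
  for (std::size_t i = pos; i < tokens.size(); ++i) {
    if (tokens[i].core && --n == 0) return tokens[i].type;
  }
  return 0;
}

// True if the n-th core token at or after `pos` has the given type.
bool look_ahead_is(const std::vector<Tok> &tokens, std::size_t pos,
                   int token_type, unsigned n = 1) {
  return look_ahead_type(tokens, pos, n) == token_type;
}
```

The two helpers carry different names on purpose: as overloads taking `int` and `unsigned` in the same position, a call with a plain integer literal would silently pick the `int` variant.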

int Puma::Syntax::look_ahead ( unsigned n = 1 )
inline

Look-ahead one core language token.

Parameters
    n  The number of tokens to look-ahead.

Returns
    The type of the next core language token.

bool Puma::Syntax::opt ( bool dummy ) const
inline

Optional rule parsing.

Always succeeds regardless of the argument.

Parameters
    dummy  Dummy parameter, is not evaluated.

Returns
    True.
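
Because the argument expression is evaluated by the caller before the call, an always-true wrapper of this kind turns any parse attempt into an optional element of a rule. A minimal stand-alone sketch, with a hypothetical `MiniParser` and token numbers:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical token types.
enum { TOK_CONST = 1, TOK_ID = 2 };

struct MiniParser {
  std::vector<int> tokens;
  std::size_t pos = 0;

  // Consume one token of the given type, if present.
  bool parse(int type) {
    if (pos < tokens.size() && tokens[pos] == type) {
      ++pos;
      return true;
    }
    return false;
  }

  // Always succeeds; the parse attempt passed as argument has already run.
  bool opt(bool) const { return true; }
};

// declarator: 'const'? ID
bool declarator(MiniParser &p) {
  return p.opt(p.parse(TOK_CONST)) && p.parse(TOK_ID);
}
```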

template<class T >
bool Puma::Syntax::parse ( CTree *(T::*)() rule )
inline

Parse the given grammar rule.

Saves the current state of the builder, semantic, and token provider objects.

Parameters
    rule  The rule to parse.

Returns
    True if parsed successfully.

bool Puma::Syntax::parse ( int token_type )
inline

Parse a token with the given type.

Parameters
    token_type  The token type.

Returns
    True if a corresponding token was parsed.

bool Puma::Syntax::parse ( int * token_types )

Parse a token with one of the given types.

Parameters
    token_types  The token types.

Returns
    True if a corresponding token was parsed.

bool Puma::Syntax::parse_block ( int start, int end )

Parse all tokens between start and end, including start and end token.

Parameters
    start  The start token type.
    end    The end token type.

Returns
    False if the stop token is not found, true otherwise.

bool Puma::Syntax::parse_curly_block ( )

Parse all tokens between '{' and '}', including '{' and '}'.

Returns
    False if the stop token '}' is not found, true otherwise.

pointcut Puma::Syntax::parse_fct ( )

Interface for aspects that affect the syntax and parsing process.

bool Puma::Syntax::parse_round_block ( )

Parse all tokens between '(' and ')', including '(' and ')'.

Returns
    False if the stop token ')' is not found, true otherwise.

bool Puma::Syntax::parse_token ( int token_type )

Parse a token with the given type.

Parameters
    token_type  The token type.

Returns
    True if a corresponding token was parsed.

bool Puma::Syntax::predict_1 ( const tokenset & ts )
inline

Token * Puma::Syntax::problem ( ) const
inline

Get the last token that could not be parsed.

TokenProvider * Puma::Syntax::provider ( ) const
inline

Get the token provider from which the parsed tokens are read.

void Puma::Syntax::restore_state ( )

Restore the saved parser state.

Triggers restoring the syntax and semantic trees to the saved state.

void Puma::Syntax::restore_state ( State state )

Restore the saved parser state to the given state.

Triggers restoring the syntax and semantic trees.

Parameters
    state  The state to which to restore.

pointcut Puma::Syntax::rule_call ( )

pointcut Puma::Syntax::rule_check ( )

pointcut Puma::Syntax::rule_exec ( )

CTree * Puma::Syntax::run ( TokenProvider & tp )

Start the parse process.

Parameters
    tp  The token provider from where to get the tokens to parse.

Returns
    The resulting syntax tree.

template<class T >
CTree * Puma::Syntax::run ( TokenProvider & tp, bool(T::*)() rule )

Start the parse process at a specific grammar rule.

Parameters
    tp    The token provider from where to get the tokens to parse.
    rule  The grammar rule where to start.

Returns
    The resulting syntax tree.

State Puma::Syntax::save_state ( )

Save the current parser state.

Calls save_state() on the builder, semantic, and token provider objects.

Returns
    The current parser state.

Semantic & Puma::Syntax::semantic ( ) const
inline

Get the semantic analysis object.

template<class T >
bool Puma::Syntax::seq ( CTree *(T::*)() rule )
inline

Parse a sequence of the given grammar rule.

Parameters
    rule  The rule to parse at least once.

Returns
    True if parsed successfully.

template<class T >
bool Puma::Syntax::seq ( bool(T::*)() rule )
inline

Parse a sequence of the given grammar rule.

Parameters
    rule  The rule to parse at least once.

Returns
    True if parsed successfully.

template<typename SYNTAX , typename RULE >
bool Puma::Syntax::seq ( SYNTAX & s )
static

Parse a sequence of the given grammar rule by calling RULE::check() in a loop.

Template Parameters
    SYNTAX  The type of syntax.
    RULE    The class that represents the grammar rule.

Parameters
    s  The syntax object on which the rule should be executed.

Returns
    True if parsed successfully.
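
The loop over `RULE::check()` can be sketched as below. `CharSyntax` and `Digit` are hypothetical, the static `check(SYNTAX &)` signature is an assumption, and the requirement of at least one match is borrowed from the member-function overloads of `seq`.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical syntax object working on characters instead of tokens.
struct CharSyntax {
  std::string text;
  std::size_t pos = 0;
};

// Rule class: matches one decimal digit and consumes it.
struct Digit {
  static bool check(CharSyntax &s) {
    if (s.pos < s.text.size() &&
        std::isdigit(static_cast<unsigned char>(s.text[s.pos]))) {
      ++s.pos;
      return true;
    }
    return false;
  }
};

// RULE+ : the rule must match once, then it is repeated until it fails.
template <typename SYNTAX, typename RULE>
bool seq(SYNTAX &s) {
  if (!RULE::check(s)) return false;
  while (RULE::check(s)) {
  }
  return true;
}
```

Passing the rule as a class instead of a member-function pointer lets the compiler resolve the call statically, at the cost of writing one small class per rule.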

void Puma::Syntax::set_state ( State state )

Overwrite the parser state with the given state.

Parameters
    state  The new parser state.

void Puma::Syntax::skip ( )

Skip the current token.

bool Puma::Syntax::skip ( int stop_token, bool inclusive = true )

Skip all tokens until a token with the given type is read.

Parameters
    stop_token  The type of the token to stop.
    inclusive   If true, the stop token is skipped too.

Returns
    False if the stop token is not found, true otherwise.
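
The effect of the `inclusive` flag is easiest to see on a plain token vector. The free function `skip_until` below is a hypothetical stand-in that works on integer token types and an explicit position; it is a sketch of the documented behaviour, not the member function itself.

```cpp
#include <cstddef>
#include <vector>

// Advance `pos` to the first token of type `stop_token`.
// Returns false if no such token exists (pos then equals tokens.size()).
// If `inclusive` is true, the stop token itself is skipped as well.
bool skip_until(const std::vector<int> &tokens, std::size_t &pos,
                int stop_token, bool inclusive = true) {
  while (pos < tokens.size() && tokens[pos] != stop_token) ++pos;
  if (pos == tokens.size()) return false;
  if (inclusive) ++pos;
  return true;
}
```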

bool Puma::Syntax::skip ( int * stop_tokens, bool inclusive = true )

Skip all tokens until a token with one of the given types is read.

Parameters
    stop_tokens  The types of the token to stop.
    inclusive    If true, the stop token is skipped too.

Returns
    False if the stop token is not found, true otherwise.

void Puma::Syntax::skip_block ( int start, int end, bool inclusive = true )

Skip all tokens between start and end, including start and end token.

Parameters
    start      The start token type.
    end        The end token type.
    inclusive  If true, the stop token is skipped too.

void Puma::Syntax::skip_curly_block ( )

Skip all tokens between '{' and '}', including '{' and '}'.

void Puma::Syntax::skip_round_block ( )

Skip all tokens between '(' and ')', including '(' and ')'.

bool Puma::Syntax::trans_unit ( )
inline, virtual

Top parse rule to be reimplemented for a specific grammar.

Returns
    The root node of the syntax tree, or NULL.

Reimplemented in Puma::CSyntax.

Member Data Documentation

TokenProvider * Puma::Syntax::token_provider

Token provider for getting the tokens to parse.