#include <Puma/Syntax.h>

Description

Syntactic analysis base class.

Implements the top-down parsing algorithm (recursive descent parser). Derive from this class to implement a parser for a specific grammar. Provides infinite look-ahead.

This class uses a tree builder object (see Builder) to create the syntax tree, and a semantic analysis object (see Semantic) to perform required semantic analyses of the parsed code.

The parse process is started by calling Syntax::run() with a token provider as argument. Using the token provider this method reads the first core language token from the input source code and tries to parse it by applying the top grammar rule.

return parse(&Puma::Syntax::trans_unit) ? builder().Top() : (Puma::CTree*)0;

The top grammar rule has to be provided by reimplementing method Syntax::trans_unit(). It may call sub-rules according to the implemented language-specific grammar. Example:

Puma::CTree *MySyntax::trans_unit() {
  return parse(&MySyntax::block_seq) ? builder().block_seq() : (Puma::CTree*)0;
}

For context-sensitive grammars, the grammar rules may need to perform initial semantic analyses of the parsed code (to differentiate ambiguous syntactic constructs, resolve names, detect errors, and so on). Example:

Puma::CTree *MySyntax::block() {
  // '{' instruction instruction ... '}'
  if (parse(TOK_OPEN_CURLY)) {             // parse '{'
    semantic().enter_block();              // enter block scope
    seq(&MySyntax::instruction);           // parse sequence of instructions
    semantic().leave_block();              // leave block scope
    if (parse(TOK_CLOSE_CURLY)) {          // parse '}'
      return builder().block();            // build syntax tree for the block
    }
  }
  return (CTree*)0;                        // rule failed
}

If a rule is parsed successfully, the tree builder is used to create a CTree-based syntax tree (fragment) for the parsed rule. Failing grammar rules shall return NULL. The result of the top grammar rule is the root node of the abstract syntax tree for the whole input source code.
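
The contract described above (build a tree fragment on success, return NULL and leave the input untouched on failure) can be illustrated without Puma itself. The following is a minimal, self-contained sketch and not Puma code: MiniParser, Node, and the Tok codes are hypothetical stand-ins for Syntax, CTree, and the TOK_* constants, and the parser state is reduced to a single index into a token vector.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical token codes standing in for Puma's TOK_* constants.
enum Tok { OPEN_CURLY, CLOSE_CURLY, INSTR };

// Hypothetical stand-in for a CTree node: it only records its child count.
struct Node { std::size_t children; };

class MiniParser {
public:
  explicit MiniParser(std::vector<Tok> toks) : toks_(std::move(toks)) {}

  // Parse one token of the given type; the token is consumed only on success.
  bool parse(Tok t) {
    if (pos_ < toks_.size() && toks_[pos_] == t) { ++pos_; return true; }
    return false;
  }

  // Rule: '{' INSTR* '}'. Returns a node on success, nullptr on failure.
  std::unique_ptr<Node> block() {
    const std::size_t saved = pos_;           // save the parser state
    if (parse(OPEN_CURLY)) {
      std::size_t n = 0;
      while (parse(INSTR)) ++n;               // sequence of instructions
      if (parse(CLOSE_CURLY))
        return std::unique_ptr<Node>(new Node{n});  // build the tree fragment
    }
    pos_ = saved;                             // rule failed: restore the state
    return nullptr;
  }

  std::size_t position() const { return pos_; }

private:
  std::vector<Tok> toks_;
  std::size_t pos_ = 0;
};
```

A caller can rely on position() being unchanged after a failed block(), which is what makes it safe to try an alternative rule next.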

Classes
class	State
	Parser state, the current position in the token stream.

Public Types
typedef std::bitset< TOK_NO >	tokenset

Public Member Functions
pointcut	parse_fct ()
	Interface for aspects that affect the syntax and parsing process.

pointcut	check_fct ()

pointcut	in_syntax ()

pointcut	rule_exec ()

pointcut	rule_call ()

pointcut	rule_check ()

CTree *	run (TokenProvider &tp)
	Start the parse process.

template<class T>
CTree *	run (TokenProvider &tp, bool(T::*rule)())
	Start the parse process at a specific grammar rule.

virtual void	configure (Config &c)
	Configure the syntactic analysis object.

TokenProvider *	provider () const
	Get the token provider from which the parsed tokens are read.

Token *	problem () const
	Get the last token that could not be parsed.

bool	error () const
	Check if errors occurred during the parse process.

bool	look_ahead (int token_type, unsigned n=1)
	Look-ahead n core language tokens and check if the n-th token has the given type.

bool	look_ahead (int *token_types, unsigned n=1)
	Look-ahead n core language tokens and check if the n-th token has one of the given types.

int	look_ahead (unsigned n=1)
	Look-ahead one core language token.

bool	consume ()
	Consume all tokens until the next core language token.

bool	predict_1 (const tokenset &ts)

template<class T>
bool	parse (CTree *(T::*rule)())
	Parse the given grammar rule.

template<class T>
bool	seq (CTree *(T::*rule)())
	Parse a sequence of the given grammar rule.

template<class T>
bool	seq (bool(T::*rule)())
	Parse a sequence of the given grammar rule.

template<class T>
bool	list (CTree *(T::*rule)(), int separator, bool trailing_separator=false)
	Parse a sequence of rule-separator pairs.

template<class T>
bool	list (CTree *(T::*rule)(), int *separators, bool trailing_separator=false)
	Parse a sequence of rule-separator pairs.

template<class T>
bool	list (bool(T::*rule)(), int separator, bool trailing_separator=false)
	Parse a sequence of rule-separator pairs.

template<class T>
bool	list (bool(T::*rule)(), int *separators, bool trailing_separator=false)
	Parse a sequence of rule-separator pairs.

template<class T>
bool	catch_error (bool(T::*rule)(), const char *msg, int *finish_tokens, int *skip_tokens)
	Parse a grammar rule automatically catching parse errors.

bool	parse (int token_type)
	Parse a token with the given type.

bool	parse (int *token_types)
	Parse a token with one of the given types.

bool	parse_token (int token_type)
	Parse a token with the given type.

bool	opt (bool dummy) const
	Optional rule parsing.

Builder &	builder () const
	Get the syntax tree builder.

Semantic &	semantic () const
	Get the semantic analysis object.

virtual bool	trans_unit ()
	Top parse rule to be reimplemented for a specific grammar.

virtual void	handle_directive ()
	Handle a compiler directive token.

State	save_state ()
	Save the current parser state.

void	forget_state ()
	Forget the saved parser state.

void	restore_state ()
	Restore the saved parser state.

void	restore_state (State state)
	Restore the saved parser state to the given state.

void	set_state (State state)
	Overwrite the parser state with the given state.

bool	accept (CTree *tree, State state)
	Accept the given syntax tree node.

CTree *	accept (CTree *tree)
	Accept the given syntax tree node.

Token *	locate_token ()
	Skip all non-core language tokens until the next core-language token is read.

void	skip ()
	Skip the current token.

void	skip_block (int start, int end, bool inclusive=true)
	Skip all tokens between start and end, including start and end token.

void	skip_curly_block ()
	Skip all tokens between '{' and '}', including '{' and '}'.

void	skip_round_block ()
	Skip all tokens between '(' and ')', including '(' and ')'.

bool	parse_block (int start, int end)
	Parse all tokens between start and end, including start and end token.

bool	parse_curly_block ()
	Parse all tokens between '{' and '}', including '{' and '}'.

bool	parse_round_block ()
	Parse all tokens between '(' and ')', including '(' and ')'.

bool	skip (int stop_token, bool inclusive=true)
	Skip all tokens until a token with the given type is read.

bool	skip (int *stop_tokens, bool inclusive=true)
	Skip all tokens until a token with one of the given types is read.

bool	is_in (int token_type, int *token_types) const
	Check if the given token type is in the set of given token types.

Static Public Member Functions
template<typename SYNTAX, typename RULE>
static bool	seq (SYNTAX &s)
	Parse a sequence of the given grammar rule by calling RULE::check() in a loop.

template<typename SYNTAX, typename RULE>
static bool	list (SYNTAX &s, int sep, bool trailing_sep=false)
	Parse a sequence of rule-separator pairs by calling RULE::check() in a loop.

template<typename SYNTAX, typename RULE>
static bool	list (SYNTAX &s, int *separators, bool trailing_sep=false)
	Parse a sequence of rule-separator pairs by calling RULE::check() in a loop.

template<class SYNTAX, class RULE>
static bool	catch_error (SYNTAX &s, const char *msg, int *finish_tokens, int *skip_tokens)
	Parse a grammar rule automatically catching parse errors.

template<class RULE1, class RULE2, class SYNTAX>
static bool	ambiguous (SYNTAX &s)
	First parse rule1 and if that rule fails discard all errors and parse the rule2.

Public Attributes
TokenProvider *	token_provider
	Token provider for getting the tokens to parse.

Protected Member Functions
	Syntax (Builder &b, Semantic &s)
	Constructor.

virtual	~Syntax ()
	Destructor.

Member Typedef Documentation

◆ tokenset

typedef std::bitset<TOK_NO> Puma::Syntax::tokenset
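
A tokenset holds one bit per token type, so checking whether a token type belongs to a set is a single bit test. The sketch below illustrates the idea under stated assumptions: the token codes are hypothetical, TOK_NO is assumed to be the number of token types, and make_set() and in_set() are illustrative helpers that are not part of Puma.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Hypothetical token codes; TOK_NO is assumed to be the number of token types.
enum { TOK_ID = 1, TOK_INT, TOK_COMMA, TOK_NO };

typedef std::bitset<TOK_NO> tokenset;

// Build a set with one bit switched on per listed token type.
tokenset make_set(std::initializer_list<int> types) {
  tokenset ts;
  for (int t : types) ts.set(static_cast<std::size_t>(t));
  return ts;
}

// True if the given token type is a member of the set.
// Out-of-range types are rejected before the bit test.
bool in_set(const tokenset &ts, int token_type) {
  return token_type >= 0 && token_type < TOK_NO &&
         ts.test(static_cast<std::size_t>(token_type));
}
```

A rule could use such a set to decide from the next token type alone whether it is worth attempting a parse.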

Constructor & Destructor Documentation

◆ Syntax()

Puma::Syntax::Syntax	(	Builder &	b,
		Semantic &	s )

inline protected

Constructor.

Parameters

b	The syntax tree builder.
s	The semantic analysis object.

◆ ~Syntax()

virtual Puma::Syntax::~Syntax ( )

inline protected virtual

Destructor.

Member Function Documentation

◆ accept() [1/2]

CTree * Puma::Syntax::accept ( CTree * tree )

Accept the given syntax tree node.

Returns the given node.

Parameters

tree	Tree to accept.

◆ accept() [2/2]

bool Puma::Syntax::accept	(	CTree *	tree,
		State	state )

Accept the given syntax tree node.

If the node is NULL then the parser state is restored to the given state. Otherwise all saved states are discarded.

Parameters

tree	Tree to accept.
state	The saved state.
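
The two outcomes described above can be sketched as follows. This is a simplified model and not Puma's implementation: MiniSyntax, Node, and the index-based State are hypothetical, and the meaning of the return value (true for a non-null tree) is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {};                    // hypothetical stand-in for CTree
typedef std::size_t State;         // hypothetical: a state is a token index

struct MiniSyntax {
  std::size_t pos = 0;             // current position in the token stream
  std::vector<State> saved;        // stack of saved states

  // Remember the current position and hand it back to the caller.
  State save_state() {
    saved.push_back(pos);
    return pos;
  }

  // Null tree: roll back to `state`. Non-null tree: drop all saved states.
  bool accept(Node *tree, State state) {
    if (!tree) {
      pos = state;
      return false;                // assumed meaning: the rule failed
    }
    saved.clear();
    return true;                   // assumed meaning: the rule succeeded
  }
};
```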

◆ ambiguous()

template<class RULE1, class RULE2, class SYNTAX>

bool Puma::Syntax::ambiguous ( SYNTAX & s )

inline static

First parse rule1 and if that rule fails discard all errors and parse the rule2.

Template Parameters

RULE1	The class that represents the first grammar rule
RULE2	The class that represents the second grammar rule
SYNTAX	The type of syntax

Parameters

s	The syntax object on which the rules should be executed
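
The try-first-then-second behaviour can be sketched with two rule classes that expose a static check() function. Everything in the sketch is hypothetical (MiniSyntax, Declaration, Expression, and the check(SYNTAX&) signature); it only illustrates rewinding the input and discarding the errors of the first attempt before the second rule runs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical syntax object: a token cursor plus collected error messages.
struct MiniSyntax {
  std::vector<int> toks;
  std::size_t pos = 0;
  std::vector<std::string> errors;
};

// Try RULE1; if it fails, rewind, discard its errors, and try RULE2.
template <class RULE1, class RULE2, class SYNTAX>
bool ambiguous(SYNTAX &s) {
  const std::size_t saved_pos = s.pos;
  const std::size_t saved_errs = s.errors.size();
  if (RULE1::check(s)) return true;
  s.pos = saved_pos;               // rewind the token stream
  s.errors.resize(saved_errs);     // drop the errors reported by RULE1
  return RULE2::check(s);
}

// Hypothetical rule: matches token code 1.
struct Declaration {
  static bool check(MiniSyntax &s) {
    if (s.pos < s.toks.size() && s.toks[s.pos] == 1) { ++s.pos; return true; }
    s.errors.push_back("expected declaration");
    return false;
  }
};

// Hypothetical rule: matches token code 2.
struct Expression {
  static bool check(MiniSyntax &s) {
    if (s.pos < s.toks.size() && s.toks[s.pos] == 2) { ++s.pos; return true; }
    s.errors.push_back("expected expression");
    return false;
  }
};
```

Because SYNTAX is the last template parameter, it can be deduced from the argument: ambiguous<Declaration, Expression>(s).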

◆ builder()

Builder & Puma::Syntax::builder ( ) const

inline

Get the syntax tree builder.

◆ catch_error() [1/2]

template<class T>

bool Puma::Syntax::catch_error	(	bool(T::*	rule )(),
		const char *	msg,
		int *	finish_tokens,
		int *	skip_tokens )

Parse a grammar rule automatically catching parse errors.

Parameters

rule	The rule to parse.
msg	The error message to show if the rule fails.
finish_tokens	Set of token types that abort parsing the rule.
skip_tokens	If the rule fails skip all tokens until a token is read that has one of the types given here.

Returns: False if at EOF or a finish_token is read, true otherwise.
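
A typical way to use such a function is as the condition of a loop that keeps parsing until a finish token or EOF is reached. The following sketch models that recovery scheme under several assumptions that the text above does not confirm: token sets are 0-terminated int arrays, the synchronizing token is consumed after skipping, and MiniSyntax with its statement() rule is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token codes; 0 is reserved as the end marker of a token set.
enum { ID = 1, SEMI, RBRACE, JUNK };

// Membership test over a 0-terminated array (the terminator is an assumption).
bool is_in(int type, const int *set) {
  for (; *set != 0; ++set)
    if (*set == type) return true;
  return false;
}

struct MiniSyntax {
  std::vector<int> toks;
  std::size_t pos = 0;
  std::vector<std::string> errors;

  // Hypothetical rule: ID ';'
  bool statement() {
    if (pos + 1 < toks.size() && toks[pos] == ID && toks[pos + 1] == SEMI) {
      pos += 2;
      return true;
    }
    return false;
  }

  bool catch_error(bool (MiniSyntax::*rule)(), const char *msg,
                   const int *finish_tokens, const int *skip_tokens) {
    if (pos >= toks.size() || is_in(toks[pos], finish_tokens))
      return false;                     // EOF or finish token: stop
    const std::size_t start = pos;
    if (!(this->*rule)()) {
      pos = start;                      // undo partial progress
      errors.push_back(msg);            // report the failure
      while (pos < toks.size() && !is_in(toks[pos], skip_tokens))
        ++pos;                          // skip to a synchronizing token
      if (pos < toks.size()) ++pos;     // sketch choice: consume it as well
    }
    return true;                        // the caller may try the rule again
  }
};
```

Consuming the synchronizing token is a choice made here so that the loop always makes progress; a real implementation may handle this differently.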

◆ catch_error() [2/2]

template<class SYNTAX, class RULE>

bool Puma::Syntax::catch_error	(	SYNTAX &	s,
		const char *	msg,
		int *	finish_tokens,
		int *	skip_tokens )

static

Parse a grammar rule automatically catching parse errors.

Template Parameters

SYNTAX	The type of syntax
RULE	The class that represents the grammar rule

Parameters

s	A reference to the syntax object on which the rule should be executed
msg	The error message to show if the rule fails.
finish_tokens	Set of token types that abort parsing the rule.
skip_tokens	If the rule fails skip all tokens until a token is read that has one of the types given here.

Returns: False if at EOF or a finish_token is read, true otherwise.

◆ check_fct()

pointcut Puma::Syntax::check_fct ( )

◆ configure()

virtual void Puma::Syntax::configure ( Config & c )

virtual

Configure the syntactic analysis object.

Parameters

c	The configuration object.

Reimplemented in Puma::CCSyntax, Puma::CSyntax, and Puma::InstantiationSyntax.

◆ consume()

bool Puma::Syntax::consume ( )

inline

Consume all tokens until the next core language token.

◆ error()

bool Puma::Syntax::error ( ) const

inline

Check if errors occurred during the parse process.

◆ forget_state()

void Puma::Syntax::forget_state ( )

Forget the saved parser state.

◆ handle_directive()

void Puma::Syntax::handle_directive ( )

inline virtual

Handle a compiler directive token.

The default handling is to skip the compiler directive.

Reimplemented in Puma::CSyntax.

◆ in_syntax()

pointcut Puma::Syntax::in_syntax ( )

◆ is_in()

bool Puma::Syntax::is_in	(	int	token_type,
		int *	token_types ) const

Check if the given token type is in the set of given token types.

Parameters

token_type	The token type to check.
token_types	The set of token types.

◆ list() [1/6]

template<class T>

bool Puma::Syntax::list	(	bool(T::*	rule )(),
		int *	separators,
		bool	trailing_separator = false )

inline

Parse a sequence of rule-separator pairs.

Parameters

rule	The rule to parse at least once.
separators	The separator tokens.
trailing_separator	True if a trailing separator token is allowed.

Returns: True if parsed successfully.

◆ list() [2/6]

template<class T>

bool Puma::Syntax::list	(	bool(T::*	rule )(),
		int	separator,
		bool	trailing_separator = false )

inline

Parse a sequence of rule-separator pairs.

Parameters

rule	The rule to parse at least once.
separator	The separator token.
trailing_separator	True if a trailing separator token is allowed.

Returns: True if parsed successfully.
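
The effect of trailing_separator is easiest to see in a small model. The sketch below is not Puma code: MiniSyntax, item(), and the token codes are hypothetical, and it assumes that a separator without a following rule makes the list fail unless trailing_separator is set.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum { ID = 1, COMMA };            // hypothetical token codes

struct MiniSyntax {
  std::vector<int> toks;
  std::size_t pos = 0;

  // Parse one token of the given type; consumed only on success.
  bool parse(int type) {
    if (pos < toks.size() && toks[pos] == type) { ++pos; return true; }
    return false;
  }

  // Hypothetical rule: a single identifier.
  bool item() { return parse(ID); }

  // rule (separator rule)* with an optional trailing separator.
  bool list(bool (MiniSyntax::*rule)(), int separator,
            bool trailing_separator = false) {
    if (!(this->*rule)()) return false;       // the rule must match at least once
    while (parse(separator)) {
      if (!(this->*rule)())
        return trailing_separator;            // dangling separator: allowed or not
    }
    return true;
  }
};
```

With this model, `a, b` is accepted in both modes, while `a,` is accepted only when trailing_separator is true.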

◆ list() [3/6]

template<class T>

bool Puma::Syntax::list	(	CTree *(T::*	rule )(),
		int *	separators,
		bool	trailing_separator = false )

inline

Parse a sequence of rule-separator pairs.

Parameters

rule	The rule to parse at least once.
separators	The separator tokens.
trailing_separator	True if a trailing separator token is allowed.

Returns: True if parsed successfully.

◆ list() [4/6]

template<class T>

bool Puma::Syntax::list	(	CTree *(T::*	rule )(),
		int	separator,
		bool	trailing_separator = false )

inline

Parse a sequence of rule-separator pairs.

Parameters

rule	The rule to parse at least once.
separator	The separator token.
trailing_separator	True if a trailing separator token is allowed.

Returns: True if parsed successfully.

◆ list() [5/6]

template<typename SYNTAX, typename RULE>

bool Puma::Syntax::list	(	SYNTAX &	s,
		int *	separators,
		bool	trailing_sep = false )

inline static

Parse a sequence of rule-separator pairs by calling RULE::check() in a loop.

Template Parameters

SYNTAX	The type of syntax
RULE	The class that represents the grammar rule

Parameters

s	A reference to the syntax object on which the rule should be executed
separators	The separator tokens
trailing_sep	True if a trailing separator token is allowed.

Returns: True if parsed successfully.

◆ list() [6/6]

template<typename SYNTAX, typename RULE>

bool Puma::Syntax::list	(	SYNTAX &	s,
		int	sep,
		bool	trailing_sep = false )

inline static

Parse a sequence of rule-separator pairs by calling RULE::check() in a loop.

Template Parameters

SYNTAX	The type of syntax
RULE	The class that represents the grammar rule

Parameters

s	A reference to the syntax object on which the rule should be executed
sep	The separator token
trailing_sep	True if a trailing separator token is allowed.

Returns: True if parsed successfully.

◆ locate_token()

Token * Puma::Syntax::locate_token ( )

Skip all non-core language tokens until the next core-language token is read.

Returns: The next core-language token.

◆ look_ahead() [1/3]

bool Puma::Syntax::look_ahead	(	int *	token_types,
		unsigned	n = 1 )

Look-ahead n core language tokens and check if the n-th token has one of the given types.

Parameters

token_types	The possible types of the n-th token.
n	The number of tokens to look-ahead.

Returns: True if the n-th token has one of the given types.

◆ look_ahead() [2/3]

bool Puma::Syntax::look_ahead	(	int	token_type,
		unsigned	n = 1 )

Look-ahead n core language tokens and check if the n-th token has the given type.

Parameters

token_type	The type of the n-th token.
n	The number of tokens to look-ahead.

Returns: True if the n-th token has the given type.
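
Looking ahead over core language tokens means that tokens such as white space or comments do not count toward n. The sketch below models this with a hypothetical Token type that carries a `core` flag; it assumes that n = 1 refers to the next unread core token and that nothing is consumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum { ID = 1, COMMA, SPACE };     // hypothetical token codes

// Hypothetical token: a type plus a flag marking core language tokens.
struct Token {
  int type;
  bool core;
};

// Check whether the n-th core token at or after `pos` has the given type.
// Non-core tokens are stepped over without being counted.
bool look_ahead(const std::vector<Token> &toks, std::size_t pos,
                int token_type, unsigned n = 1) {
  if (n == 0) return false;                   // there is no 0-th token
  for (std::size_t i = pos; i < toks.size(); ++i) {
    if (!toks[i].core) continue;              // skip white space and the like
    if (--n == 0) return toks[i].type == token_type;
  }
  return false;                               // fewer than n core tokens left
}
```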

◆ look_ahead() [3/3]

int Puma::Syntax::look_ahead ( unsigned n = 1 )

inline

Look-ahead one core language token.

Parameters

n	The number of tokens to look-ahead.

Returns: The type of the next core language token.

◆ opt()

bool Puma::Syntax::opt ( bool dummy ) const

inline

Optional rule parsing.

Always succeeds regardless of the argument.

Parameters

dummy Dummy parameter, is not evaluated.

Returns: True.
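
Although opt() ignores the value of its argument, the argument expression is still executed by the caller before opt() is entered, so an optional element is consumed when present and silently absent otherwise. A minimal sketch of this idiom, with a hypothetical MiniSyntax, declaration() rule, and token codes:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum { KW_STATIC = 1, ID, SEMI };  // hypothetical token codes

struct MiniSyntax {
  std::vector<int> toks;
  std::size_t pos = 0;

  // Parse one token of the given type; consumed only on success.
  bool parse(int type) {
    if (pos < toks.size() && toks[pos] == type) { ++pos; return true; }
    return false;
  }

  // Always succeeds; the argument's value is ignored.
  bool opt(bool) const { return true; }

  // Hypothetical rule: 'static'? ID ';'
  bool declaration() {
    const std::size_t saved = pos;
    if (opt(parse(KW_STATIC)) && parse(ID) && parse(SEMI))
      return true;
    pos = saved;                   // rule failed: rewind
    return false;
  }
};
```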

◆ parse() [1/3]

template<class T>

bool Puma::Syntax::parse ( CTree *(T::* rule )() )

inline

Parse the given grammar rule.

Saves the current state of the builder, semantic, and token provider objects.

Parameters

rule	The rule to parse.

Returns: True if parsed successfully.

◆ parse() [2/3]

bool Puma::Syntax::parse ( int * token_types )

Parse a token with one of the given types.

Parameters

token_types The token types.

Returns: True if a corresponding token was parsed.

◆ parse() [3/3]

bool Puma::Syntax::parse ( int token_type )

inline

Parse a token with the given type.

Parameters

token_type The token type.

Returns: True if a corresponding token was parsed.

◆ parse_block()

bool Puma::Syntax::parse_block	(	int	start,
		int	end )

Parse all tokens between start and end, including start and end token.

Parameters

start	The start token type.
end	The end token type.

Returns: False if the stop token is not found, true otherwise.

◆ parse_curly_block()

bool Puma::Syntax::parse_curly_block ( )

Parse all tokens between '{' and '}', including '{' and '}'.

Returns: False if the stop token '}' is not found, true otherwise.

◆ parse_fct()

pointcut Puma::Syntax::parse_fct ( )

Interface for aspects that affect the syntax and parsing process.

◆ parse_round_block()

bool Puma::Syntax::parse_round_block ( )

Parse all tokens between '(' and ')', including '(' and ')'.

Returns: False if the stop token ')' is not found, true otherwise.

◆ parse_token()

bool Puma::Syntax::parse_token ( int token_type )

Parse a token with the given type.

Parameters

token_type The token type.

Returns: True if a corresponding token was parsed.

◆ predict_1()

bool Puma::Syntax::predict_1 ( const tokenset & ts )

inline

◆ problem()

Token * Puma::Syntax::problem ( ) const

inline

Get the last token that could not be parsed.

◆ provider()

TokenProvider * Puma::Syntax::provider ( ) const

inline

Get the token provider from which the parsed tokens are read.

◆ restore_state() [1/2]

void Puma::Syntax::restore_state ( )

Restore the saved parser state.

Triggers restoring the syntax and semantic trees to the saved state.

◆ restore_state() [2/2]

void Puma::Syntax::restore_state ( State state )

Restore the saved parser state to the given state.

Triggers restoring the syntax and semantic trees.

Parameters

state The state to which to restore.

◆ rule_call()

pointcut Puma::Syntax::rule_call ( )

◆ rule_check()

pointcut Puma::Syntax::rule_check ( )

◆ rule_exec()

pointcut Puma::Syntax::rule_exec ( )

◆ run() [1/2]

CTree * Puma::Syntax::run ( TokenProvider & tp )

Start the parse process.

Parameters

tp	The token provider from where to get the tokens to parse.

Returns: The resulting syntax tree.

◆ run() [2/2]

template<class T>

CTree * Puma::Syntax::run	(	TokenProvider &	tp,
		bool(T::*	rule )() )

Start the parse process at a specific grammar rule.

Parameters

tp	The token provider from where to get the tokens to parse.
rule	The grammar rule where to start.

Returns: The resulting syntax tree.

◆ save_state()

State Puma::Syntax::save_state ( )

Save the current parser state.

Calls save_state() on the builder, semantic, and token provider objects.

Returns: The current parser state.

◆ semantic()

Semantic & Puma::Syntax::semantic ( ) const

inline

Get the semantic analysis object.

◆ seq() [1/3]

template<class T>

bool Puma::Syntax::seq ( bool(T::* rule )() )

inline

Parse a sequence of the given grammar rule.

Parameters

rule	The rule to parse at least once.

Returns: True if parsed successfully.

◆ seq() [2/3]

template<class T>

bool Puma::Syntax::seq ( CTree *(T::* rule )() )

inline

Parse a sequence of the given grammar rule.

Parameters

rule	The rule to parse at least once.

Returns: True if parsed successfully.

◆ seq() [3/3]

template<typename SYNTAX, typename RULE>

bool Puma::Syntax::seq ( SYNTAX & s )

static

Parse a sequence of the given grammar rule by calling RULE::check() in a loop.

Parameters

s	A reference to the syntax object on which the rule should be executed

Template Parameters

SYNTAX	The type of syntax
RULE	The class that represents the grammar rule

Returns: True if parsed successfully.
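
Calling RULE::check() in a loop amounts to a one-or-more repetition. The sketch below shows that shape with hypothetical types (MiniSyntax, Digit) and an assumed check(SYNTAX&) signature; it is an illustration, not the Puma implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical syntax object: a token cursor.
struct MiniSyntax {
  std::vector<int> toks;
  std::size_t pos = 0;
};

// One or more repetitions of RULE; fails if the first attempt fails.
template <typename SYNTAX, typename RULE>
bool seq(SYNTAX &s) {
  if (!RULE::check(s)) return false;
  while (RULE::check(s)) {
    // keep going until the rule no longer matches
  }
  return true;
}

// Hypothetical rule: matches token code 1 and consumes it.
struct Digit {
  static bool check(MiniSyntax &s) {
    if (s.pos < s.toks.size() && s.toks[s.pos] == 1) { ++s.pos; return true; }
    return false;
  }
};
```

The loop terminates only if a successful check() consumes input, so rule classes used this way should never succeed on an empty match.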

◆ set_state()

void Puma::Syntax::set_state ( State state )

Overwrite the parser state with the given state.

Parameters

state The new parser state.

◆ skip() [1/3]

void Puma::Syntax::skip ( )

Skip the current token.

◆ skip() [2/3]

bool Puma::Syntax::skip	(	int *	stop_tokens,
		bool	inclusive = true )

Skip all tokens until a token with one of the given types is read.

Parameters

stop_tokens	The types of the token to stop.
inclusive	If true, the stop token is skipped too.

Returns: False if the stop token is not found, true otherwise.

◆ skip() [3/3]

bool Puma::Syntax::skip	(	int	stop_token,
		bool	inclusive = true )

Skip all tokens until a token with the given type is read.

Parameters

stop_token	The type of the token to stop.
inclusive	If true, the stop token is skipped too.

Returns: False if the stop token is not found, true otherwise.

◆ skip_block()

void Puma::Syntax::skip_block	(	int	start,
		int	end,
		bool	inclusive = true )

Skip all tokens between start and end, including start and end token.

Parameters

start	The start token type.
end	The end token type.
inclusive	If true, the stop token is skipped too.
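
Skipping a delimited block usually has to account for nested start/end pairs. The following sketch shows one way to do that with a depth counter. Whether Puma tracks nesting in this way is an assumption, and the free function, its return value (the new position), and the token codes are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum { OPEN = 1, CLOSE, ID };      // hypothetical token codes

// Skip a block that begins at `pos` with `start` and ends at the matching
// `end`. Returns the new position: just past the end token if `inclusive`,
// otherwise the position of the end token itself.
std::size_t skip_block(const std::vector<int> &toks, std::size_t pos,
                       int start, int end, bool inclusive = true) {
  if (pos >= toks.size() || toks[pos] != start)
    return pos;                                 // not at a block: do nothing
  int depth = 0;
  for (; pos < toks.size(); ++pos) {
    if (toks[pos] == start)
      ++depth;                                  // entering a (nested) block
    else if (toks[pos] == end && --depth == 0)
      return inclusive ? pos + 1 : pos;         // matching end token found
  }
  return pos;                                   // unterminated: stopped at EOF
}
```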

◆ skip_curly_block()

void Puma::Syntax::skip_curly_block ( )

Skip all tokens between '{' and '}', including '{' and '}'.

◆ skip_round_block()

void Puma::Syntax::skip_round_block ( )

Skip all tokens between '(' and ')', including '(' and ')'.

◆ trans_unit()

bool Puma::Syntax::trans_unit ( )

inline virtual

Top parse rule to be reimplemented for a specific grammar.

Returns: The root node of the syntax tree, or NULL.

Reimplemented in Puma::CSyntax.

Member Data Documentation

◆ token_provider

TokenProvider* Puma::Syntax::token_provider

Token provider for getting the tokens to parse.
