curly.parser
This module provides the second of Curly's two main functions, parse()
.
The main idea of parsing is to take a stream of
tokens and convert it into an abstract syntax tree. Each node in the
tree is represented by a Node
instance, and each instance has a
list of child nodes (hence the tree).
parse()
produces such a tree, and it can be rendered with the
method Node.process()
.
Example:
>>> from curly.lexer import tokenize
>>> from curly.parser import parse
>>> text = '''\
... Hello! My name is {{ name }}.\
... {% if likes %}And I like these things: {% loop likes %}\
... {{ item }},{% /loop %}{% /if %}'''
>>> tokens = tokenize(text)
>>> print(repr(parse(tokens)))
[{'done': True,
'nodes': [],
'raw_string': "<LiteralToken(raw=' Hello! My name is ', contents={'text': "
"' Hello! My name is '})>",
'text': ' Hello! My name is ',
'type': 'LiteralNode'},
{'done': True,
'expression': ['name'],
'nodes': [],
'raw_string': "<PrintToken(raw='{{ name }}', contents={'expression': "
"['name']})>",
'type': 'PrintNode'},
{'done': True,
'nodes': [],
'raw_string': "<LiteralToken(raw='.', contents={'text': '.'})>",
'text': '.',
'type': 'LiteralNode'},
{'done': True,
'else': {},
'expression': ['likes'],
'nodes': [{'done': True,
'nodes': [],
'raw_string': "<LiteralToken(raw='And I like these things: ', "
"contents={'text': 'And I like these things: '})>",
'text': 'And I like these things: ',
'type': 'LiteralNode'},
{'done': True,
'expression': ['likes'],
'nodes': [{'done': True,
'expression': ['item'],
'nodes': [],
'raw_string': "<PrintToken(raw='{{ item }}', "
"contents={'expression': ['item']})>",
'type': 'PrintNode'},
{'done': True,
'nodes': [],
'raw_string': "<LiteralToken(raw=',', contents={'text': "
"','})>",
'text': ',',
'type': 'LiteralNode'}],
'raw_string': "<StartBlockToken(raw='{% loop likes %}', "
"contents={'expression': ['likes'], 'function': "
"'loop'})>",
'type': 'LoopNode'}],
'raw_string': "<StartBlockToken(raw='{% if likes %}', contents={'expression': "
"['likes'], 'function': 'if'})>",
'type': 'IfNode'}]
-
class
curly.parser.
BlockTagNode
(token)[source]¶ Bases:
curly.parser.ExpressionMixin
,curly.parser.Node
Node which represents a block tag token.
Block tag example:
{% if something %}
. This is a one-to-one representation of a
curly.lexer.StartBlockToken
token.-
emit
(context)¶ Return generator which emits rendered chunks of text.
Axiom:
"".join(self.emit(context)) == self.process(context)
Parameters: context (dict) – Dictionary with context variables. Returns: Generator of rendered text chunks. Return type: Generator[str]
-
evaluate_expression
(context)¶ Evaluate expression in given context.
Parameters: context (dict) – Variables for template rendering. Returns: Evaluated expression.
-
expression
¶ Expression from the underlying token.
-
function
¶ Function from the underlying token.
-
process
(context)¶ Return rendered content of the node as a string.
Parameters: context (dict) – Dictionary with context variables. Returns: Rendered template. Return type: str
-
raw_string
¶ Raw content of the related token.
For example, for token
{{ var }}
it returns literally{{ var }}
.
-
-
class
curly.parser.
ConditionalNode
(token)[source]¶ Bases:
curly.parser.BlockTagNode
Node which represents a condition.
This is not a real node in the AST tree; it is a preliminary node which is popped on closing and replaced by an actual
IfNode
. Such a fictional node is required to simplify the parsing logic for if/elif/elif/else blocks. If conditions are nested, we need to identify the groups of conditional flows and attach
IfNode
and ElseNode
to the correct parents. Parameters: token ( curly.lexer.StartBlockToken
) – Token which starts producing that node. Basically, it is the first token of the if
block.-
evaluate_expression
(context)¶ Evaluate expression in given context.
Parameters: context (dict) – Variables for template rendering. Returns: Evaluated expression.
-
expression
¶ Expression from the underlying token.
-
function
¶ Function from the underlying token.
-
process
(context)¶ Return rendered content of the node as a string.
Parameters: context (dict) – Dictionary with context variables. Returns: Rendered template. Return type: str
-
raw_string
¶ Raw content of the related token.
For example, for token
{{ var }}
it returns literally{{ var }}
.
-
-
class
curly.parser.
ElseNode
(token)[source]¶ Bases:
curly.parser.BlockTagNode
Node which represents
else
statement. For an idea of how it works, please check the description of
IfNode
.-
emit
(context)¶ Return generator which emits rendered chunks of text.
Axiom:
"".join(self.emit(context)) == self.process(context)
Parameters: context (dict) – Dictionary with context variables. Returns: Generator of rendered text chunks. Return type: Generator[str]
-
evaluate_expression
(context)¶ Evaluate expression in given context.
Parameters: context (dict) – Variables for template rendering. Returns: Evaluated expression.
-
expression
¶ Expression from the underlying token.
-
function
¶ Function from the underlying token.
-
process
(context)¶ Return rendered content of the node as a string.
Parameters: context (dict) – Dictionary with context variables. Returns: Rendered template. Return type: str
-
raw_string
¶ Raw content of the related token.
For example, for token
{{ var }}
it returns literally{{ var }}
.
-
-
class
curly.parser.
ExpressionMixin
[source]¶ Bases:
object
A small helper mixin for
Node
which adds expression-related methods.-
evaluate_expression
(context)[source]¶ Evaluate expression in given context.
Parameters: context (dict) – Variables for template rendering. Returns: Evaluated expression.
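As the doctest output above suggests, an expression is stored as a list of names such as ['name']. A minimal sketch of what such evaluation might look like (this is an assumption for illustration, not the actual curly implementation):

```python
# Hypothetical sketch: resolve each name of an expression inside the
# context dictionary, one level at a time.
def evaluate_expression(expression, context):
    value = context
    for name in expression:
        value = value[name]  # a missing name raises KeyError
    return value

greeting_context = {"name": "Alice"}
rendered = evaluate_expression(["name"], greeting_context)
```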
-
expression
¶ Expression from the underlying token.
-
-
class
curly.parser.
IfNode
(token)[source]¶ Bases:
curly.parser.BlockTagNode
Node which represents the
if
statement (and elif
as well). Actually, since we have
ConditionalNode
, it is possible to use only one node type for ifs. Here is why:

{
    "conditional": [
        {"if": "expression1", "nodes": []},
        {"if": "expression2", "nodes": []},
        {"else": "", "nodes": []}
    ]
}
Here is the idea of how
if
/elif
/else
look with a conditional: you have a list of IfNode
instances and one (optional) ElseNode
at the end. So if the first if
does not match, you go to the next one. If it is true
, emit its nodes and exit the conditional
.-
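The branch-selection idea above can be sketched with plain (condition_name, text) pairs standing in for IfNode instances and a trailing default standing in for the optional ElseNode (an illustration only, not the real node classes):

```python
# Walk the conditional chain: return the text of the first branch whose
# condition is truthy in the context, falling back to the else branch.
def render_conditional(branches, else_text, context):
    for condition, text in branches:
        if context.get(condition):
            return text
    return else_text

branches = [("likes", "has likes"), ("friends", "has friends")]
result = render_conditional(branches, "nothing", {"friends": True})
```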
evaluate_expression
(context)¶ Evaluate expression in given context.
Parameters: context (dict) – Variables for template rendering. Returns: Evaluated expression.
-
expression
¶ Expression from the underlying token.
-
function
¶ Function from the underlying token.
-
process
(context)¶ Return rendered content of the node as a string.
Parameters: context (dict) – Dictionary with context variables. Returns: Rendered template. Return type: str
-
raw_string
¶ Raw content of the related token.
For example, for token
{{ var }}
it returns literally{{ var }}
.
-
-
class
curly.parser.
LiteralNode
(token)[source]¶ Bases:
curly.parser.Node
Node which represents literal text.
This is a one-to-one representation of
curly.lexer.LiteralToken
in the AST tree. Parameters: token ( curly.lexer.LiteralToken
) – Token which produced that node.-
process
(context)¶ Return rendered content of the node as a string.
Parameters: context (dict) – Dictionary with context variables. Returns: Rendered template. Return type: str
-
raw_string
¶ Raw content of the related token.
For example, for token
{{ var }}
it returns literally{{ var }}
.
-
text
¶ Rendered text.
-
-
class
curly.parser.
LoopNode
(token)[source]¶ Bases:
curly.parser.BlockTagNode
Node which represents
loop
statement. This node repeats its content as many times as there are elements in its evaluated expression. On every iteration it injects an
item
variable into the context (the incoming context is safe and untouched). For dicts, it emits {"key": k, "value": v} where
k
and v
are taken from expression.items()
. For any other iterable it emits the item as is.-
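The iteration semantics described above can be sketched as follows (dict-based stand-in, assuming only the behaviour stated in this docstring):

```python
# For dicts, items become {"key": k, "value": v}; any other iterable is
# used as is. Each iteration gets a copy of the context with "item" set,
# so the incoming context stays untouched.
def loop_contexts(value, context):
    if isinstance(value, dict):
        items = [{"key": k, "value": v} for k, v in value.items()]
    else:
        items = list(value)
    for item in items:
        new_context = dict(context)
        new_context["item"] = item
        yield new_context

contexts = list(loop_contexts(["tea", "cats"], {"name": "Alice"}))
```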
evaluate_expression
(context)¶ Evaluate expression in given context.
Parameters: context (dict) – Variables for template rendering. Returns: Evaluated expression.
-
expression
¶ Expression from the underlying token.
-
function
¶ Function from the underlying token.
-
process
(context)¶ Return rendered content of the node as a string.
Parameters: context (dict) – Dictionary with context variables. Returns: Rendered template. Return type: str
-
raw_string
¶ Raw content of the related token.
For example, for token
{{ var }}
it returns literally{{ var }}
.
-
-
class
curly.parser.
Node
(token)[source]¶ Bases:
collections.UserList
Node of an AST tree.
It has two methods for rendering the node content:
Node.emit()
and Node.process()
. The first one is a generator over the rendered content; the second one just concatenates the chunks into a single string. So, if you define your own node type, you only need to define Node.emit()
; Node.process()
stays the same. If you want to render a template to a string, use
Node.process()
. This is the thing you are looking for. Parameters: token ( curly.lexer.Token
) – Token which produced that node.-
emit
(context)[source]¶ Return generator which emits rendered chunks of text.
Axiom:
"".join(self.emit(context)) == self.process(context)
Parameters: context (dict) – Dictionary with context variables. Returns: Generator of rendered text chunks. Return type: Generator[str]
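The emit()/process() contract can be sketched with minimal stand-in classes (hypothetical names, not the real curly classes): subclasses override emit(), a generator of text chunks, and process() just joins the chunks, so the axiom above holds by construction.

```python
# Minimal sketch of the emit()/process() relationship.
class SketchNode:
    def __init__(self, children=()):
        self.children = list(children)

    def emit(self, context):
        # Default behaviour: yield whatever the children emit.
        for child in self.children:
            yield from child.emit(context)

    def process(self, context):
        # process() is always just the concatenation of emit().
        return "".join(self.emit(context))

class SketchText(SketchNode):
    def __init__(self, text):
        super().__init__()
        self.text = text

    def emit(self, context):
        yield self.text

root = SketchNode([SketchText("Hello, "), SketchText("world")])
assert "".join(root.emit({})) == root.process({})  # the axiom
```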
-
process
(context)[source]¶ Return rendered content of the node as a string.
Parameters: context (dict) – Dictionary with context variables. Returns: Rendered template. Return type: str
-
raw_string
¶ Raw content of the related token.
For example, for token
{{ var }}
it returns literally{{ var }}
.
-
-
class
curly.parser.
PrintNode
(token)[source]¶ Bases:
curly.parser.ExpressionMixin
,curly.parser.Node
Node which represents a print token.
This is a one-to-one representation of
curly.lexer.PrintToken
in the AST tree. An example of such a node is the one for the {{ var }}
token.Parameters: token ( curly.lexer.PrintToken
) – Token which produced that node.-
evaluate_expression
(context)¶ Evaluate expression in given context.
Parameters: context (dict) – Variables for template rendering. Returns: Evaluated expression.
-
expression
¶ Expression from the underlying token.
-
process
(context)¶ Return rendered content of the node as a string.
Parameters: context (dict) – Dictionary with context variables. Returns: Rendered template. Return type: str
-
raw_string
¶ Raw content of the related token.
For example, for token
{{ var }}
it returns literally{{ var }}
.
-
-
class
curly.parser.
RootNode
(nodes)[source]¶ Bases:
curly.parser.Node
Node class for the topmost node, the root.
Parameters: nodes (list[Node]) – Nodes for root. -
emit
(context)¶ Return generator which emits rendered chunks of text.
Axiom:
"".join(self.emit(context)) == self.process(context)
Parameters: context (dict) – Dictionary with context variables. Returns: Generator of rendered text chunks. Return type: Generator[str]
-
process
(context)¶ Return rendered content of the node as a string.
Parameters: context (dict) – Dictionary with context variables. Returns: Rendered template. Return type: str
-
raw_string
¶ Raw content of the related token.
For example, for token
{{ var }}
it returns literally{{ var }}
.
-
-
curly.parser.
parse
(tokens)[source]¶ One of the main functions (see also
curly.lexer.tokenize()
.The idea of parsing is simple: we have a flow of well-defined tokens taken from
curly.lexer.tokenize()
and now we need to build an AST from them. Curly does that by maintaining a single stack. There could be different implementations, some of them more efficient, but we use the single-stack implementation because it is the most obvious way of expressing the idea at the current scale of the template language. If you decide to fork one day, please consider other options.
Please read the following material first (at least the Wikipedia articles):
- https://en.wikipedia.org/wiki/Shift-reduce_parser
- https://en.wikipedia.org/wiki/LR_parser
- https://en.wikipedia.org/wiki/Shunting-yard_algorithm
- https://en.wikipedia.org/wiki/Operator-precedence_parser
- http://blog.reverberate.org/2013/09/ll-and-lr-in-context-why-parsing-tools.html
- http://blog.reverberate.org/2013/07/ll-and-lr-parsing-demystified.html
The current implementation is an LR(0) parser. Feel free to compose a formal grammar if you want (
curly.lexer.LiteralToken
is a terminal, everything except it a non-terminal). I am going to describe just the main idea in simple words, pretending that no theory was created before. Now, the algorithm.
Read from the Left (look, ma! The L from LR!) of the stream, without going back. This allows us to use
tokens
as an iterator. For every token, check its class and call the corresponding function which handles it.
After all tokens are consumed, check that all nodes in the stack are done (
done
attribute) and build the resulting RootNode
instance.
The main idea is to maintain a stack. The stack is the list of children for the root node. We read token by token and put corresponding nodes onto the stack. Each node has two states: done or not done. Done means that the node is ready and processed; not done means that further squashing will be performed when the corresponding terminating token comes to the parser.
So, let’s assume that we have the following list of tokens (stack on the left, incoming tokens on the right; the top of the token stream is the one which is going to be consumed next).
Some notation: an exclamation mark before a node means that the node is finished; it is finalized and ready to participate in rendering.
|                       |   | LiteralToken          |
|                       |   | StartBlockToken(if)   |
|                       |   | PrintToken            |
|                       |   | StartBlockToken(elif) |
|                       |   | StartBlockToken(loop) |
|                       |   | PrintToken            |
|                       |   | EndBlockToken(loop)   |
|                       |   | EndBlockToken(if)     |
Read the
LiteralToken
. It is fine as is, so wrap it into a LiteralNode
and put it onto the stack.

|                       |   | StartBlockToken(if)   |
|                       |   | PrintToken            |
|                       |   | StartBlockToken(elif) |
|                       |   | StartBlockToken(loop) |
|                       |   | PrintToken            |
|                       |   | EndBlockToken(loop)   |
| !LiteralNode          |   | EndBlockToken(if)     |
And now it is time for a
curly.lexer.StartBlockToken
. A kind reminder: this is the start tag of the {% function expression %}...{% /function %}
construction. The point of such a tag is that it encloses other tokens, so those tokens have to become subnodes of the related node. This happens in the reduce phase described a few paragraphs below, but right now pay attention to the done
attribute of the node: if it is False
, it means that we are still collecting the contents of this block's subnodes. True
means that the node is finished. The function of this token is
if
, so we need to add a ConditionalNode
as a marker of the enclosure and the first IfNode
in this enclosure.

|                       |   | PrintToken            |
|                       |   | StartBlockToken(elif) |
|                       |   | StartBlockToken(loop) |
| IfNode                |   | PrintToken            |
| ConditionalNode       |   | EndBlockToken(loop)   |
| !LiteralNode          |   | EndBlockToken(if)     |
The upcoming
curly.lexer.PrintToken
is a single functional node: to emit the rendered template, we need to resolve its expression in the given context. This is one finished node, a PrintNode
.

|                       |   | StartBlockToken(elif) |
| !PrintNode            |   | StartBlockToken(loop) |
| IfNode                |   | PrintToken            |
| ConditionalNode       |   | EndBlockToken(loop)   |
| !LiteralNode          |   | EndBlockToken(if)     |
Now it is time for the next
curly.lexer.StartBlockToken
, which is responsible for {% elif %}
. It means that the scope of the first, initial if
is completed, but not the scope of the corresponding ConditionalNode
! Anyway, we can safely add the PrintNode
from the top of the stack to the nodelist of the IfNode
. To do so, we pop the stack till that IfNode
and add the popped content to the nodelist. After that, we can finally mark the IfNode
as done.

|                       |   | StartBlockToken(elif) |
|                       |   | StartBlockToken(loop) |
| !IfNode(!PrintNode)   |   | PrintToken            |
| ConditionalNode       |   | EndBlockToken(loop)   |
| !LiteralNode          |   | EndBlockToken(if)     |
The stack was rewound and we can add a new
IfNode
to the condition.

| IfNode                |   | StartBlockToken(loop) |
| !IfNode(!PrintNode)   |   | PrintToken            |
| ConditionalNode       |   | EndBlockToken(loop)   |
| !LiteralNode          |   | EndBlockToken(if)     |
The next token is a loop (
{% loop items %}
). The same story as with the IfNode
: push a LoopNode
onto the top of the stack.

| LoopNode              |   |                       |
| IfNode                |   |                       |
| !IfNode(!PrintNode)   |   | PrintToken            |
| ConditionalNode       |   | EndBlockToken(loop)   |
| !LiteralNode          |   | EndBlockToken(if)     |
Add the
curly.lexer.PrintToken
as a PrintNode
.

| !PrintNode            |   |                       |
| LoopNode              |   |                       |
| IfNode                |   |                       |
| !IfNode(!PrintNode)   |   |                       |
| ConditionalNode       |   | EndBlockToken(loop)   |
| !LiteralNode          |   | EndBlockToken(if)     |
The next token is the
curly.lexer.EndBlockToken
for the loop ({% /loop %}
). So we can rewind the stack to the loop node, putting all popped nodes into the nodelist of the LoopNode
.

| !LoopNode(!PrintNode) |   |                       |
| IfNode                |   |                       |
| !IfNode(!PrintNode)   |   |                       |
| ConditionalNode       |   |                       |
| !LiteralNode          |   | EndBlockToken(if)     |
And now it is time for the
curly.lexer.EndBlockToken
for if
({% /if %}
). Now we need to rewind the stack twice. The first rewind completes the IfNode
which is almost on the top of the stack.

| !IfNode(!LoopNode(...)) |   |                     |
| !IfNode(!PrintNode)     |   |                     |
| ConditionalNode         |   |                     |
| !LiteralNode            |   | EndBlockToken(if)   |
And the second rewind finishes the nearest
ConditionalNode
.

| !ConditionalNode(!IfNode, !IfNode) |   |          |
| !LiteralNode                       |   |          |
And that is all. The token list is empty, so it is time to compose the resulting
RootNode
with the contents of the stack:

!RootNode(!LiteralNode, !ConditionalNode(!IfNode, !IfNode))

We have just built the AST.
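The whole walkthrough can be condensed into a toy, dict-based sketch of the single-stack algorithm (an illustration only, not the real curly implementation): literals land on the stack finished, start tags land unfinished, and an end tag rewinds the stack into the nearest unfinished block, marking it done.

```python
# Toy single-stack shift-reduce sketch. Tokens are ("literal", text),
# ("start", fn) and ("end", fn) tuples; nodes are dicts with a "done" flag.
def parse_toy(tokens):
    stack = []
    for kind, value in tokens:
        if kind == "literal":
            stack.append({"type": "literal", "text": value, "done": True})
        elif kind == "start":
            stack.append({"type": value, "nodes": [], "done": False})
        else:  # kind == "end": the reduce step
            popped = []
            while stack and stack[-1]["done"]:
                popped.append(stack.pop())
            if not stack or stack[-1]["type"] != value:
                raise ValueError("unbalanced block tags")
            stack[-1]["nodes"] = list(reversed(popped))
            stack[-1]["done"] = True
    if not all(node["done"] for node in stack):
        raise ValueError("unclosed block tag")
    return stack

tree = parse_toy([("literal", "Hi "), ("start", "if"),
                  ("literal", "there"), ("end", "if")])
```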
Parameters: tokens (Iterator[ curly.lexer.Token
]) – A stream of tokens. Returns: Parsed AST tree. Return type: RootNode
Raises: curly.exceptions.CurlyParserError
: if a token is unknown.
-
curly.parser.
parse_end_block_token
(stack, token)[source]¶ This function does parsing of
curly.lexer.EndBlockToken
.Since this token behaves differently depending on the function stored in it, parsing is dispatched as follows:
Token function Parsing function if parse_end_if_token()
loop parse_end_loop_token()
Parameters: - stack (list[
Node
]) – Stack of the parser. - token (
curly.lexer.EndBlockToken
) – Token to process.
Returns: Updated stack.
Return type: list[
Node
]Raises: curly.exceptions.CurlyParserUnknownEndBlockError
: if function of end block is unknown.- stack (list[
-
curly.parser.
parse_end_if_token
(stack, token)[source]¶ Parsing of token for
{% /if %}
.Check
parse()
for details. Also, it pops out the redundant ConditionalNode
and chains the IfNode
and ElseNode
instances, verifying that there is at most one else and that it is placed at the end. Parameters: - stack (list[
Node
]) – Stack of the parser. - token (
curly.lexer.EndBlockToken
) – Token to process.
Returns: Updated stack.
Return type: list[
Node
]- stack (list[
-
curly.parser.
parse_end_loop_token
(stack, token)[source]¶ Parsing of token for
{% /loop %}
.Check
parse()
for details. Stack rewinding is performed withrewind_stack_for()
.Parameters: - stack (list[
Node
]) – Stack of the parser. - token (
curly.lexer.EndBlockToken
) – Token to process.
Returns: Updated stack.
Return type: list[
Node
]- stack (list[
-
curly.parser.
parse_literal_token
(stack, token)[source]¶ This function does parsing of
curly.lexer.LiteralToken
.Since there is nothing special to do with literals, it just puts the corresponding
LiteralNode
on the top of the stack.Parameters: - stack (list[
Node
]) – Stack of the parser. - token (
curly.lexer.LiteralToken
) – Token to process.
Returns: Updated stack.
Return type: list[
Node
]- stack (list[
-
curly.parser.
parse_print_token
(stack, token)[source]¶ This function does parsing of
curly.lexer.PrintToken
.Since there is nothing special to do with print tokens, it just puts the corresponding
PrintNode
on the top of the stack.Parameters: - stack (list[
Node
]) – Stack of the parser. - token (
curly.lexer.PrintToken
) – Token to process.
Returns: Updated stack.
Return type: list[
Node
]- stack (list[
-
curly.parser.
parse_start_block_token
(stack, token)[source]¶ This function does parsing of
curly.lexer.StartBlockToken
.Since this token behaves differently depending on the function stored in it, parsing is dispatched as follows:
Token function Parsing function if parse_start_if_token()
elif parse_start_elif_token()
else parse_start_else_token()
loop parse_start_loop_token()
Parameters: - stack (list[
Node
]) – Stack of the parser. - token (
curly.lexer.StartBlockToken
) – Token to process.
Returns: Updated stack.
Return type: list[
Node
]Raises: curly.exceptions.CurlyParserUnknownStartBlockError
: if token function is unknown.- stack (list[
-
curly.parser.
parse_start_elif_token
(stack, token)[source]¶ Parsing of token for
{% elif function expression %}
It rewinds the stack with
rewind_stack_for()
till the previous IfNode
first, and appends a new one. Check the docs for parse()
to understand why.Parameters: - stack (list[
Node
]) – Stack of the parser. - token (
curly.lexer.StartBlockToken
) – Token to process.
Returns: Updated stack.
Return type: list[
Node
]- stack (list[
-
curly.parser.
parse_start_else_token
(stack, token)[source]¶ Parsing of token for
{% else %}
It rewinds the stack with
rewind_stack_for()
till the previous IfNode
first, and appends a new ElseNode
. Check the docs for parse()
to understand why.Parameters: - stack (list[
Node
]) – Stack of the parser. - token (
curly.lexer.StartBlockToken
) – Token to process.
Returns: Updated stack.
Return type: list[
Node
]- stack (list[
-
curly.parser.
parse_start_if_token
(stack, token)[source]¶ Parsing of token for
{% if function expression %}
It puts two nodes on the stack:
ConditionalNode
and IfNode
. Check the docs for parse()
to understand why.Parameters: - stack (list[
Node
]) – Stack of the parser. - token (
curly.lexer.StartBlockToken
) – Token to process.
Returns: Updated stack.
Return type: list[
Node
]- stack (list[
-
curly.parser.
parse_start_loop_token
(stack, token)[source]¶ Parsing of token for
{% loop iterable %}
.Check
parse()
for details.Parameters: - stack (list[
Node
]) – Stack of the parser. - token (
curly.lexer.StartBlockToken
) – Token to process.
Returns: Updated stack.
Return type: list[
Node
]- stack (list[
-
curly.parser.
rewind_stack_for
(stack, *, search_for)[source]¶ Rewind the stack until some node is found.
This function performs stack reduction during parsing. The idea is quite simple: we pop nodes until an unfinished one is found. If it has the type of node we are looking for, we are good: it basically means that we have found the node which should take the popped results as subnodes. Otherwise: exception.
At the end of the procedure the updated node is placed on the top of the stack.
Parameters: Returns: Updated stack.
Return type: list[
Node
]Raises: curly.exceptions.CurlyParserNoUnfinishedNodeError
: if it is not possible to find an open start statement.curly.exceptions.CurlyParserUnexpectedUnfinishedNodeError
: if we expected to find one open statement but found another.
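The rewinding step can be sketched on dict-based stand-in nodes (the real function works on Node instances and raises the curly-specific exceptions listed above):

```python
# Pop finished nodes, require the first unfinished one to match
# search_for, attach the popped nodes as its children, and mark it done.
def rewind_stack_sketch(stack, *, search_for):
    popped = []
    while stack and stack[-1]["done"]:
        popped.append(stack.pop())
    if not stack:
        raise ValueError("no unfinished node found")
    node = stack[-1]
    if node["type"] != search_for:
        raise ValueError("unexpected unfinished node: " + node["type"])
    node["nodes"] = list(reversed(popped))
    node["done"] = True
    return stack

stack = [{"type": "loop", "done": False, "nodes": []},
         {"type": "literal", "done": True}]
stack = rewind_stack_sketch(stack, search_for="loop")
```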
-
curly.parser.
validate_for_all_nodes_done
(root)[source]¶ Validate that all nodes in the given AST tree are marked as done.
It simply does an in-order traversal of the tree, verifying the done attribute.
Parameters: root ( RootNode
) – Root of the tree.Raises: curly.exceptions.CurlyParserFoundNotDoneError
: if a node which is not closed is found.
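The validation pass can be sketched on dict-based stand-in nodes (the real function works on Node instances and raises the exception above):

```python
# Traverse the tree and fail on the first node that was never closed.
def validate_all_done(node):
    if not node["done"]:
        raise ValueError("found a node which is not done")
    for child in node.get("nodes", []):
        validate_all_done(child)

validate_all_done({"done": True, "nodes": [{"done": True, "nodes": []}]})
```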