Zend_Markup Parsers
Zend_Markup currently ships with two parsers: a BBCode parser
and a Textile parser.
Theory of Parsing
The parsers of Zend_Markup are classes that convert text with
markup to a token tree. Although we use the BBCode parser as an example here, the
idea of the token tree remains the same across all parsers. We will start with this
piece of BBCode as an example:
    [b]foo[i]bar[/i][/b]baz
Then the BBCode parser will take that value, tear it apart and create the following
tree:
    [b]
        foo
        [i]
            bar
    baz
You will notice that the closing tags are gone: they don't show up as content in the
tree structure, because a closing tag isn't part of the actual content. This does not
mean that the closing tag is lost; it is stored inside the tag information for the tag
itself. Also note that this is a simplified view of the tree. The actual tree contains
a lot more information, like the tag's attributes and its name.
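To make the idea of a token tree more concrete, the following is a minimal, self-contained sketch of how a BBCode string could be turned into such a tree. It is not the Zend_Markup implementation: the Node class, buildTokenTree() and dumpTree() are hypothetical names invented for this illustration, the input string is assumed, and only plain tags without attributes are handled.

```php
<?php
// Illustrative sketch only; this is not Zend_Markup's actual parser.

final class Node
{
    public $type;             // 'root', 'tag' or 'text'
    public $value;            // tag name or text content
    public $stopper = null;   // closing tag, kept as tag information
    public $children = [];

    public function __construct(string $type, string $value = '')
    {
        $this->type  = $type;
        $this->value = $value;
    }
}

function buildTokenTree(string $input): Node
{
    $root  = new Node('root');
    $stack = [$root];

    // Split into tags and text, keeping the tags as separate pieces.
    $parts = preg_split(
        '/(\[\/?[a-z]+\])/i',
        $input,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    foreach ($parts as $part) {
        $current = end($stack);
        if (preg_match('/^\[([a-z]+)\]$/i', $part, $m)) {
            // Opening tag: the new node becomes the current parent.
            $node = new Node('tag', strtolower($m[1]));
            $current->children[] = $node;
            $stack[] = $node;
        } elseif (preg_match('/^\[\/([a-z]+)\]$/i', $part, $m)
            && $current->type === 'tag'
            && $current->value === strtolower($m[1])) {
            // Matching closing tag: stored on the tag node, not as content.
            $current->stopper = $part;
            array_pop($stack);
        } else {
            // Plain text (or an unmatched closing tag) stays as content.
            $current->children[] = new Node('text', $part);
        }
    }

    return $root;
}

// Render the simplified, indented view of the tree.
function dumpTree(Node $node, int $depth = 0): string
{
    $out = '';
    foreach ($node->children as $child) {
        $label = $child->type === 'tag' ? '[' . $child->value . ']' : $child->value;
        $out  .= str_repeat('  ', $depth) . $label . "\n";
        $out  .= dumpTree($child, $depth + 1);
    }
    return $out;
}

echo dumpTree(buildTokenTree('[b]foo[i]bar[/i][/b]baz'));
```

In this sketch the closing tag never becomes a child node; it is only recorded in the stopper property of the tag it closes, which mirrors the behaviour described above.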
The BBCode parser
The BBCode parser is a Zend_Markup parser that converts BBCode to
a token tree. The syntax of all BBCode tags is:
Some examples of valid BBCode tags are:
By default, all tags are closed by using the format '[/tagname]'.
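As a rough illustration of what reading a single opening tag involves, the sketch below splits a tag into a name and its attributes. The accepted shapes ([name], [name=value] and [name attribute=value], with optional double quotes around values) are common BBCode conventions assumed here, not a statement of the grammar Zend_Markup accepts. The function parseOpeningTag() and the sample tags are hypothetical.

```php
<?php
// Illustrative sketch; the accepted tag shapes are assumptions,
// not the grammar implemented by Zend_Markup.

/**
 * Parse one opening BBCode tag into its name and attributes.
 * Returns null when the string is not an opening tag.
 */
function parseOpeningTag(string $tag): ?array
{
    $name  = '[a-z][a-z0-9]*';
    $value = '"[^"]*"|[^\s\]]+';   // quoted value or bare value

    $pattern = '/^\[(' . $name . ')(?:=(' . $value . '))?'
             . '((?:\s+' . $name . '=(?:' . $value . '))*)\]$/i';

    if (!preg_match($pattern, $tag, $m)) {
        return null;
    }

    $tagName    = strtolower($m[1]);
    $attributes = [];

    // A value given directly after the tag name is stored under the tag's
    // own name (a choice made for this sketch).
    if ($m[2] !== '') {
        $attributes[$tagName] = trim($m[2], '"');
    }

    // Remaining name=value pairs.
    preg_match_all(
        '/\s+(' . $name . ')=(' . $value . ')/i',
        $m[3],
        $pairs,
        PREG_SET_ORDER
    );
    foreach ($pairs as $pair) {
        $attributes[strtolower($pair[1])] = trim($pair[2], '"');
    }

    return ['name' => $tagName, 'attributes' => $attributes];
}

var_dump(parseOpeningTag('[b]'));
var_dump(parseOpeningTag('[list=1]'));
var_dump(parseOpeningTag('[url="http://framework.zend.com/" title="Zend Framework!"]'));
var_dump(parseOpeningTag('[/b]')); // closing tags are rejected here
```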
The Textile parser
The Textile parser is a Zend_Markup parser that converts Textile
to a token tree. Because Textile doesn't have a tag structure, the following is a list
of example tags:
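List of basic Textile tags

Sample input                                  Sample output
*foo*                                         <strong>foo</strong>
_foo_                                         <em>foo</em>
??foo??                                       <cite>foo</cite>
-foo-                                         <del>foo</del>
+foo+                                         <ins>foo</ins>
^foo^                                         <sup>foo</sup>
~foo~                                         <sub>foo</sub>
%foo%                                         <span>foo</span>
PHP(PHP Hypertext Preprocessor)               <acronym title="PHP Hypertext Preprocessor">PHP</acronym>
"Zend Framework":http://framework.zend.com/   <a href="http://framework.zend.com/">Zend Framework</a>
h1. foobar                                    <h1>foobar</h1>
h6. foobar                                    <h6>foobar</h6>
!http://framework.zend.com/images/logo.gif!   <img src="http://framework.zend.com/images/logo.gif" />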
The Textile parser also wraps all tags in paragraphs. A paragraph ends at two
consecutive newlines; if more tags follow, a new paragraph is started.
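The inline markers and the paragraph rule can be sketched in a few lines. This is a simplified illustration, not the Zend_Markup Textile parser: the functions renderInlineTextile() and renderTextileParagraphs() are hypothetical, the marker-to-element mapping follows common Textile conventions and is assumed here, and links, images, acronyms and headings are left out.

```php
<?php
// Illustrative sketch; not the Zend_Markup Textile parser.

// Replace simple paired inline markers with HTML elements.
function renderInlineTextile(string $text): string
{
    // Marker => HTML element (assumed, following common Textile usage).
    $map = [
        '*'  => 'strong',
        '_'  => 'em',
        '??' => 'cite',
        '-'  => 'del',
        '+'  => 'ins',
        '^'  => 'sup',
        '~'  => 'sub',
        '%'  => 'span',
    ];

    // Escape HTML special characters before adding our own tags.
    $text = htmlspecialchars($text, ENT_NOQUOTES);

    foreach ($map as $marker => $tag) {
        $q    = preg_quote($marker, '/');
        $text = preg_replace('/' . $q . '(.+?)' . $q . '/', "<$tag>\$1</$tag>", $text);
    }

    return $text;
}

// Split the input on blank lines and wrap each block in a paragraph.
function renderTextileParagraphs(string $text): string
{
    $blocks = preg_split('/\n{2,}/', trim($text));
    $html   = [];
    foreach ($blocks as $block) {
        $html[] = '<p>' . renderInlineTextile($block) . '</p>';
    }
    return implode("\n", $html);
}

echo renderTextileParagraphs("*foo* and _bar_\n\nH~2~O");
```

A naive marker replacement like this one also rewrites ordinary hyphens or plus signs that happen to appear twice on a line, which is one reason a real parser builds a token tree first instead of relying on search and replace.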
Lists
The Textile parser also supports two types of lists: numbered lists, using the "#"
character, and bulleted lists, using the "*" character. An example of both lists:
    # Item 1
    # Item 2

    * Item 1
    * Item 2
The above will generate two lists: the first, numbered; and the second, bulleted.
Inside list items, you can use normal tags such as strong (*) and emphasis (_). Tags
that need to start on a new line (like 'h1') cannot be used inside lists.
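The list handling can be illustrated with a small line-based sketch. It is not the Zend_Markup implementation: renderTextileLists() is a hypothetical function, the sample input is assumed, nested lists are ignored, and inline tags inside items are not processed.

```php
<?php
// Illustrative sketch; not the Zend_Markup Textile parser.

// Convert lines starting with "# " or "* " into HTML lists.
// Lines that are not list items only close the currently open list.
function renderTextileLists(string $text): string
{
    $html = [];
    $open = null; // currently open list element: 'ol', 'ul' or null

    foreach (explode("\n", $text) as $line) {
        if (preg_match('/^([#*]) (.+)$/', $line, $m)) {
            $tag = $m[1] === '#' ? 'ol' : 'ul';
            if ($open !== $tag) {
                // Switch list type: close the old list, open the new one.
                if ($open !== null) {
                    $html[] = "</$open>";
                }
                $html[] = "<$tag>";
                $open   = $tag;
            }
            $html[] = '<li>' . htmlspecialchars($m[2], ENT_NOQUOTES) . '</li>';
        } elseif ($open !== null) {
            $html[] = "</$open>";
            $open   = null;
        }
    }

    if ($open !== null) {
        $html[] = "</$open>";
    }

    return implode("\n", $html);
}

echo renderTextileLists("# Item 1\n# Item 2\n\n* Item 1\n* Item 2");
```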