zope.structuredtext Documentation

Using Structured Text

The goal of StructuredText is to make it possible to express structured text using a relatively simple plain text format. Simple structures, like bullets or headings, are indicated through conventions that are natural, for some definition of “natural”. Hierarchical structures are indicated through indentation. The use of indentation to express hierarchical structure is inspired by the Python programming language.

Use of StructuredText consists of one to three logical steps. In the first step, a text string is converted to a network of objects using the structurize() facility, as in the following example:

from zope.structuredtext.stng import structurize

# Read the whole document; the file is closed when the block exits.
with open("mydocument.txt") as f:
    raw = f.read()
st = structurize(raw)

The output of structurize() is simply a StructuredTextDocument object containing StructuredTextParagraph objects arranged in a hierarchy. Paragraphs are delimited by strings of two or more whitespace characters beginning and ending with newline characters. Hierarchy is indicated by indentation. The indentation of a paragraph is the minimum number of leading spaces in a line containing non-white-space characters after converting tab characters to spaces (assuming a tab stop every eight characters).
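The indentation rule described above can be sketched in plain Python. This is only an illustration of the rule, not the library's own code; the helper name paragraph_indent and its tabsize parameter are hypothetical.

```python
def paragraph_indent(paragraph, tabsize=8):
    """Return the minimum number of leading spaces over non-blank lines."""
    indents = []
    for line in paragraph.splitlines():
        # Convert tabs to spaces, assuming a tab stop every `tabsize` columns.
        expanded = line.expandtabs(tabsize)
        if expanded.strip():  # ignore lines that contain only whitespace
            indents.append(len(expanded) - len(expanded.lstrip(" ")))
    return min(indents) if indents else 0
```

A paragraph whose lines are indented by four and two spaces has an indentation of two, and a line that starts with a single tab counts as eight spaces.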

StructuredTextNode objects support the read-only subset of the Document Object Model (DOM) API. It should be possible to process StructuredTextNode hierarchies using XML tools such as XSLT.

The second step in using StructuredText is to apply additional structuring rules based on text content. A variety of different text rules can be used. Typically, these are used to implement a structured text language for producing documents, but any sort of structured text language could be implemented in the second step. For example, it is possible to use StructuredText to implement structured text formats for representing structured data. The second step, which could consist of multiple processing steps, is performed by processing, or “coloring”, the hierarchy of generic StructuredTextParagraph objects into a network of more specialized objects. Typically, the objects produced should also implement the DOM API to allow processing with XML tools.

A document processor is provided to convert a StructuredTextDocument object containing only StructuredTextParagraph objects into a StructuredTextDocument object containing a richer collection of objects such as bullets, headings, emphasis, and so on, using hints in the text. Hints are selected based on conventions of the sort typically seen in electronic mail or news-group postings. These conventions are somewhat culturally dependent; fortunately, the document processor is easily customized to implement alternative rules. Here’s an example of using the document processor to convert the output of the previous example:

from zope.structuredtext.document import Document
doc = Document()(st)

The final step is to process the colored networks produced from the second step to produce additional outputs. The final step could be performed by Python programs, or by XML tools. A Python outputter is provided for the document processor output that produces Hypertext Markup Language (HTML) text:

from zope.structuredtext.html import HTML
html = HTML()(doc)

Customizing the document processor

The document processor is driven by two tables. The first table, named paragraph_types, is a sequence of callable objects or method names for coloring paragraphs. If a table entry is a string, then it is the name of a method of the document processor to be used. For each input paragraph, the objects in the table are called until one returns a value (not ‘None’). The value returned replaces the original input paragraph in the output. If none of the objects in the paragraph types table return a value, then a copy of the original paragraph is used. The new object returned by calling a paragraph type should implement the ReadOnlyDOM, StructuredTextColorizable, and StructuredTextSubparagraphContainer interfaces. See the zope.structuredtext.document source file for examples.

A paragraph type may return a list or tuple of replacement paragraphs, which allows a paragraph to be split into multiple paragraphs.
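The dispatch over the paragraph types table can be sketched as follows. This is a simplified illustration under stated assumptions, not the document processor's actual implementation: paragraphs are plain strings here, and color_paragraph, DemoProcessor, and split_on_semicolon are hypothetical names.

```python
import copy


def color_paragraph(processor, paragraph, paragraph_types):
    """Return the list of paragraphs that replace `paragraph`."""
    for entry in paragraph_types:
        # A string entry names a method of the processor itself.
        handler = getattr(processor, entry) if isinstance(entry, str) else entry
        result = handler(paragraph)
        if result is None:
            continue  # this type did not recognise the paragraph
        # A list or tuple means the paragraph was split into several.
        return list(result) if isinstance(result, (list, tuple)) else [result]
    # No type matched: fall back to a copy of the original paragraph.
    return [copy.copy(paragraph)]


class DemoProcessor:
    """Hypothetical processor with one method-based paragraph type."""

    def doc_bullet(self, paragraph):
        if paragraph.startswith("* "):
            return {"type": "bullet", "text": paragraph[2:]}
        return None


def split_on_semicolon(paragraph):
    """Hypothetical callable type that splits a paragraph in two or more."""
    if ";" in paragraph:
        return [part.strip() for part in paragraph.split(";")]
    return None


demo_types = ["doc_bullet", split_on_semicolon]
```

With this table, "* item" is replaced by a bullet object, "a; b" is split into two paragraphs, and any other paragraph is copied unchanged.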

The second table, text_types, is a sequence of callable objects or method names for coloring text. The callable objects in this table are used in sequence to transform the input text into new text or objects. The callable objects are passed a string and return nothing (None) or a three-element tuple consisting of:

  • a replacement object,
  • a starting position, and
  • an ending position

The text from the starting position to the ending position is (logically) replaced with the replacement object. The replacement object is typically an object that implements the ReadOnlyDOM and StructuredTextColorizable interfaces. The replacement object can also be a string or a list of strings or objects. Replacement is done from beginning to end, and text after the replacement ending position will be passed to the text type objects for processing.
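The text type protocol can be illustrated with a small sketch. The names doc_strong_demo and apply_text_type are hypothetical, the replacement object is a plain tuple rather than a DOM object, and the loop is a simplification of whatever the document processor really does.

```python
import re

STRONG = re.compile(r"\*\*(.+?)\*\*")


def doc_strong_demo(text):
    """Hypothetical text type: return (replacement, start, end) or None."""
    match = STRONG.search(text)
    if match is None:
        return None
    return ("strong", match.group(1)), match.start(), match.end()


def apply_text_type(text, text_type):
    """Apply one text type repeatedly, working from beginning to end."""
    pieces = []
    while text:
        found = text_type(text)
        if found is None:
            break
        replacement, start, end = found
        if start:
            pieces.append(text[:start])  # untouched text before the match
        pieces.append(replacement)
        text = text[end:]  # only the text after the match is searched again
    if text:
        pieces.append(text)
    return pieces
```

For the input "a **b** c **d**" this yields the plain pieces "a " and " c " interleaved with two ("strong", ...) replacement objects.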

Contents:

zope.structuredtext API

zope.structuredtext.document

Structured text document parser

class zope.structuredtext.document.Document[source]

Bases: object

Class instance calls (for example, x()) require a structured text structure. Doc will then parse each paragraph in the structure and find the special structures within each paragraph. Each special structure will be stored as an instance. Special structures within another special structure are stored within the ‘top’ structure. For example, ‘-underline this-’ would be turned into an underline instance, and ‘-underline **this**-’ would be stored as an underline instance with a strong instance stored in its string.

parse(raw_string, text_type, type=<type 'type'>)[source]

Parse accepts a raw_string, an expr to test the raw_string, and the raw_string’s subparagraphs.

Parse will continue to search through raw_string until all instances of expr in raw_string are found.

If no instances of expr are found, raw_string is returned. Otherwise a list of substrings and instances is returned

color_text(text, types=None)[source]

Search the paragraph for each special structure

doc_sgml(s, expr=<built-in method search of _sre.SRE_Pattern object>)[source]

SGML text is ignored and output as-is.

class zope.structuredtext.document.DocumentWithImages[source]

Bases: zope.structuredtext.document.Document

Document with images

zope.structuredtext.stletters

Structured text character classes

zope.structuredtext.stng

Core document model.

zope.structuredtext.stng.indention(str, front=<built-in method match of _sre.SRE_Pattern object>)[source]

Find the number of leading spaces. If none, return 0.

zope.structuredtext.stng.insert(struct, top, level)[source]

Find what will be the parent paragraph of a sentence and return that paragraph’s sub-paragraphs. The new paragraph will be appended to those sub-paragraphs.

zope.structuredtext.stng.display(struct)[source]

Runs through the structure and prints out the paragraphs. If the insertion works correctly, display’s results should mimic the original paragraphs.

zope.structuredtext.stng.display2(struct)[source]

Runs through the structure and prints out the paragraphs. If the insertion works correctly, display’s results should mimic the original paragraphs.

zope.structuredtext.stng.findlevel(levels, indent)[source]

Remove all level information of levels with a greater level of indentation. Then return which level should insert this paragraph

zope.structuredtext.stng.structurize(paragraphs, delimiter=<_sre.SRE_Pattern object>)[source]

Accepts paragraphs, which is a list of lines to be parsed. structurize creates a structure which mimics the structure of the paragraphs. Structure => [paragraph,[sub-paragraphs]]
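The [paragraph, [sub-paragraphs]] shape can be illustrated with a short sketch that nests paragraphs by indentation. It assumes the paragraphs have already been split and measured, uses plain lists instead of StructuredTextParagraph objects, and the name nest_by_indent is hypothetical.

```python
def nest_by_indent(paragraphs):
    """Turn (indent, text) pairs into nested [text, [children]] lists."""
    root = []
    # Stack of (indent, children list); indent -1 stands for the document root.
    stack = [(-1, root)]
    for indent, text in paragraphs:
        # Forget levels indented at least as deeply as this paragraph.
        while stack[-1][0] >= indent:
            stack.pop()
        node = [text, []]
        stack[-1][1].append(node)  # attach to the nearest shallower level
        stack.append((indent, node[1]))
    return root
```

Given a title at indentation 0 followed by two paragraphs at indentation 4, both paragraphs end up as sub-paragraphs of the title.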

class zope.structuredtext.stng.StructuredTextParagraph(src, subs=None, **kw)[source]

Bases: zope.structuredtext.stdom.Element

getChildren()[source]

Get a Python sequence of children

class zope.structuredtext.stng.StructuredTextDocument(subs=None, **kw)[source]

Bases: zope.structuredtext.stng.StructuredTextParagraph

A StructuredTextDocument holds StructuredTextParagraphs as its subparagraphs.

getChildren()[source]

Get a Python sequence of children

class zope.structuredtext.stng.StructuredTextExample(subs, **kw)[source]

Bases: zope.structuredtext.stng.StructuredTextParagraph

Represents a section of document with literal text, as for examples

class zope.structuredtext.stng.StructuredTextBullet(src, subs=None, **kw)[source]

Bases: zope.structuredtext.stng.StructuredTextParagraph

Represents a section of a document with a title and a body

class zope.structuredtext.stng.StructuredTextNumbered(src, subs=None, **kw)[source]

Bases: zope.structuredtext.stng.StructuredTextParagraph

Represents a section of a document with a title and a body

class zope.structuredtext.stng.StructuredTextDescriptionTitle(src, subs=None, **kw)[source]

Bases: zope.structuredtext.stng.StructuredTextParagraph

Represents a section of a document with a title and a body

class zope.structuredtext.stng.StructuredTextDescriptionBody(src, subs=None, **kw)[source]

Bases: zope.structuredtext.stng.StructuredTextParagraph

Represents a section of a document with a title and a body

class zope.structuredtext.stng.StructuredTextDescription(title, src, subs, **kw)[source]

Bases: zope.structuredtext.stng.StructuredTextParagraph

Represents a section of a document with a title and a body

getChildren()[source]

Get a Python sequence of children

class zope.structuredtext.stng.StructuredTextSectionTitle(src, subs=None, **kw)[source]

Bases: zope.structuredtext.stng.StructuredTextParagraph

Represents a section of a document with a title and a body

class zope.structuredtext.stng.StructuredTextSection(src, subs=None, **kw)[source]

Bases: zope.structuredtext.stng.StructuredTextParagraph

Represents a section of a document with a title and a body

class zope.structuredtext.stng.StructuredTextTable(rows, src, subs, **kw)[source]

Bases: zope.structuredtext.stng.StructuredTextParagraph

rows is a list of lists containing tuples, which represent the columns/cells in each row. Example: rows = [[(‘row 1:column1’,1)],[(‘row2:column1’,1)]]

getColorizableTexts()[source]

Return a tuple where each item is a column/cell’s contents. The tuple, result, will be of this format: (“r1 col1”, “r1 col2”, “r2 col1”, “r2 col2”)

setColorizableTexts(texts)[source]

texts is going to be a tuple where each item is the result of being mapped to the colortext function. Need to insert the results appropriately into the individual columns/cells.
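The round trip between these two methods can be sketched with plain data. The functions below are hypothetical and operate on the list-of-(text, span) layout described for rows, not on the real table, row, and column objects.

```python
def get_cell_texts(rows):
    """Flatten rows of (text, span) cells into one tuple of cell texts."""
    return tuple(text for row in rows for text, _span in row)


def set_cell_texts(rows, texts):
    """Write processed texts back into the cells, in the same order."""
    remaining = iter(texts)
    return [[(next(remaining), span) for _text, span in row] for row in rows]


demo_rows = [
    [("r1 col1", 1), ("r1 col2", 1)],
    [("r2 col1", 1), ("r2 col2", 1)],
]
```

Flattening in row-major order and writing back in the same order is what keeps each processed text attached to the cell it came from.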

class zope.structuredtext.stng.StructuredTextRow(row, kw)[source]

Bases: zope.structuredtext.stng.StructuredTextParagraph

row is a list of tuples, where each tuple is the raw text for a cell/column and the span of that cell/column. EX [(‘this is column one’,1), (‘this is column two’,1)]

class zope.structuredtext.stng.StructuredTextColumn(text, span, align, valign, typ, kw)[source]

Bases: zope.structuredtext.stng.StructuredTextParagraph

StructuredTextColumn is a cell/column in a table. A cell can hold multiple paragraphs. The cell is either classified as a StructuredTextTableHeader or StructuredTextTableData.

class zope.structuredtext.stng.StructuredTextTableHeader(src, subs=None, **kw)[source]

Bases: zope.structuredtext.stng.StructuredTextParagraph

class zope.structuredtext.stng.StructuredTextTableData(src, subs=None, **kw)[source]

Bases: zope.structuredtext.stng.StructuredTextParagraph

class zope.structuredtext.stng.StructuredTextMarkup(value, **kw)[source]

Bases: zope.structuredtext.stdom.Element

getChildren()[source]

Get a Python sequence of children

class zope.structuredtext.stng.StructuredTextLiteral(value, **kw)[source]

Bases: zope.structuredtext.stng.StructuredTextMarkup

class zope.structuredtext.stng.StructuredTextEmphasis(value, **kw)[source]

Bases: zope.structuredtext.stng.StructuredTextMarkup

class zope.structuredtext.stng.StructuredTextStrong(value, **kw)[source]

Bases: zope.structuredtext.stng.StructuredTextMarkup

Bases: zope.structuredtext.stng.StructuredTextMarkup

Bases: zope.structuredtext.stng.StructuredTextMarkup

class zope.structuredtext.stng.StructuredTextUnderline(value, **kw)[source]

Bases: zope.structuredtext.stng.StructuredTextMarkup

class zope.structuredtext.stng.StructuredTextSGML(value, **kw)[source]

Bases: zope.structuredtext.stng.StructuredTextMarkup

Bases: zope.structuredtext.stng.StructuredTextMarkup

class zope.structuredtext.stng.StructuredTextXref(value, **kw)[source]

Bases: zope.structuredtext.stng.StructuredTextMarkup

class zope.structuredtext.stng.StructuredTextImage(value, **kw)[source]

Bases: zope.structuredtext.stng.StructuredTextMarkup

A simple embedded image

zope.structuredtext.stdom

DOM implementation in StructuredText: read-only methods

exception zope.structuredtext.stdom.DOMException[source]

Bases: exceptions.Exception

exception zope.structuredtext.stdom.IndexSizeException[source]

Bases: zope.structuredtext.stdom.DOMException

exception zope.structuredtext.stdom.DOMStringSizeException[source]

Bases: zope.structuredtext.stdom.DOMException

exception zope.structuredtext.stdom.HierarchyRequestException[source]

Bases: zope.structuredtext.stdom.DOMException

exception zope.structuredtext.stdom.WrongDocumentException[source]

Bases: zope.structuredtext.stdom.DOMException

exception zope.structuredtext.stdom.InvalidCharacterException[source]

Bases: zope.structuredtext.stdom.DOMException

exception zope.structuredtext.stdom.NoDataAllowedException[source]

Bases: zope.structuredtext.stdom.DOMException

exception zope.structuredtext.stdom.NoModificationAllowedException[source]

Bases: zope.structuredtext.stdom.DOMException

exception zope.structuredtext.stdom.NotFoundException[source]

Bases: zope.structuredtext.stdom.DOMException

exception zope.structuredtext.stdom.NotSupportedException[source]

Bases: zope.structuredtext.stdom.DOMException

exception zope.structuredtext.stdom.InUseAttributeException[source]

Bases: zope.structuredtext.stdom.DOMException

class zope.structuredtext.stdom.ParentNode[source]

Bases: object

A node that can have children, or, more precisely, that implements the child access methods of the DOM.

getChildNodes(type=<type 'type'>, sts=(<type 'unicode'>, <type 'str'>))[source]

Returns a NodeList that contains all children of this node. If there are no children, this is an empty NodeList.

getFirstChild(type=<type 'type'>, sts=(<type 'unicode'>, <type 'str'>))[source]

The first child of this node. If there is no such node this returns None

getLastChild(type=<type 'type'>, sts=(<type 'unicode'>, <type 'str'>))[source]

The last child of this node. If there is no such node this returns None.

class zope.structuredtext.stdom.NodeWrapper(aq_self, aq_parent)[source]

Bases: zope.structuredtext.stdom.ParentNode

This is an acquisition-like wrapper that provides parent access for DOM sans circular references!

getParentNode()[source]

The parent of this node. All nodes except Document, DocumentFragment, and Attr may have a parent.

getPreviousSibling()[source]

The node immediately preceding this node. If there is no such node, this returns None.

getNextSibling()[source]

The node immediately following this node. If there is no such node, this returns None.

getOwnerDocument()[source]

The Document object associated with this node, if any.
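The idea of a wrapper that supplies parent access without circular references can be sketched as follows. The classes DemoNode and DemoWrapper are hypothetical and much simpler than NodeWrapper: the point is only that the node stores no parent pointer, while the wrapper remembers the parent it was reached through.

```python
class DemoNode:
    """Plain tree node: it knows its children but not its parent."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


class DemoWrapper:
    """Pairs a node with the wrapper of the parent it was reached through."""

    def __init__(self, node, parent=None):
        self.node = node
        self.parent = parent  # a DemoWrapper or None; the node is left unchanged

    def children(self):
        return [DemoWrapper(child, self) for child in self.node.children]

    def _sibling(self, offset):
        if self.parent is None:
            return None
        siblings = self.parent.node.children
        # Locate the wrapped node among its parent's children by identity.
        index = next(i for i, s in enumerate(siblings) if s is self.node) + offset
        if 0 <= index < len(siblings):
            return DemoWrapper(siblings[index], self.parent)
        return None

    def previous_sibling(self):
        return self._sibling(-1)

    def next_sibling(self):
        return self._sibling(+1)
```

Because wrappers are created on the way down the tree and discarded afterwards, the underlying nodes never point back at their parents.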

class zope.structuredtext.stdom.Node[source]

Bases: zope.structuredtext.stdom.ParentNode

Node Interface

getNodeName()[source]

The name of this node, depending on its type

getNodeValue()[source]

The value of this node, depending on its type

getParentNode()[source]

The parent of this node. All nodes except Document, DocumentFragment, and Attr may have a parent.

getChildren()[source]

Get a Python sequence of children

getPreviousSibling()[source]

The node immediately preceding this node. If there is no such node, this returns None.

getNextSibling()[source]

The node immediately following this node. If there is no such node, this returns None.

getAttributes()[source]

Returns a NamedNodeMap containing the attributes of this node (if it is an element) or None otherwise.

getOwnerDocument()[source]

The Document object associated with this node, if any.

hasChildNodes()[source]

Returns true if the node has any children, false if it doesn’t.

getNodeType()[source]

A code representing the type of the node.

class zope.structuredtext.stdom.TextNode(str)[source]

Bases: zope.structuredtext.stdom.Node

getNodeName()[source]

The name of this node, depending on its type

getNodeValue()[source]

The value of this node, depending on its type

class zope.structuredtext.stdom.Element[source]

Bases: zope.structuredtext.stdom.Node

Element interface

getTagName()[source]

The name of the element

getNodeName()

The name of the element

getNodeValue()[source]

The value of this node, depending on its type

getParentNode()[source]

The parent of this node. All nodes except Document, DocumentFragment, and Attr may have a parent.

getAttribute(name)[source]

Retrieves an attribute value by name.

getAttributeNode(name)[source]

Retrieves an Attr node by name or None if there is no such attribute.

getAttributes()[source]

Returns a NamedNodeMap containing the attributes of this node (if it is an element) or None otherwise.

getElementsByTagName(tagname)[source]

Returns a NodeList of all the Elements with a given tag name in the order in which they would be encountered in a preorder traversal of the Document tree. Parameter: tagname The name of the tag to match (* = all tags). Return Value: A new NodeList object containing all the matched Elements.
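The preorder matching described for getElementsByTagName can be sketched with a tiny stand-in tree. DemoElement and elements_by_tag_name are hypothetical names, and the sketch returns a plain list rather than a NodeList.

```python
class DemoElement:
    """Hypothetical stand-in for an element with a tag name and children."""

    def __init__(self, tag, children=()):
        self.tag = tag
        self.children = list(children)


def elements_by_tag_name(node, tagname):
    """Collect descendants of `node` matching `tagname`, in preorder."""
    found = []
    for child in node.children:
        if tagname == "*" or child.tag == tagname:
            found.append(child)  # a matching child comes before its own subtree
        found.extend(elements_by_tag_name(child, tagname))
    return found


demo_tree = DemoElement("doc", [
    DemoElement("section", [DemoElement("p"), DemoElement("section")]),
    DemoElement("p"),
])
```

Searching demo_tree for "*" visits the outer section, its paragraph, the nested section, and finally the top-level paragraph, which is the order a preorder traversal encounters them.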

class zope.structuredtext.stdom.NodeList(list=None)[source]

Bases: object

NodeList interface - Provides the abstraction of an ordered collection of nodes.

Python extensions: can use sequence-style ‘len’, ‘getitem’, and ‘for..in’ constructs.

item(index)[source]

Returns the index-th item in the collection

getLength()[source]

The length of the NodeList
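A collection that offers both the DOM accessors and the Python sequence protocol might look like the sketch below. DemoNodeList is hypothetical; in particular, returning None for an out-of-range index follows the W3C DOM description of item() and is an assumption here, not a statement about this library's behaviour.

```python
class DemoNodeList:
    """Hypothetical ordered collection with DOM-style and Python-style access."""

    def __init__(self, nodes=None):
        self._nodes = list(nodes or [])

    # DOM-style accessors
    def item(self, index):
        """Return the index-th node, or None if the index is out of range."""
        if 0 <= index < len(self._nodes):
            return self._nodes[index]
        return None

    def getLength(self):
        return len(self._nodes)

    # Python extensions: len(), indexing, and for..in
    def __len__(self):
        return len(self._nodes)

    def __getitem__(self, index):
        # Raising IndexError past the end is what lets for..in terminate.
        return self._nodes[index]
```

Defining `__getitem__` is enough for iteration, because Python falls back to indexing from 0 until IndexError is raised.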

class zope.structuredtext.stdom.NamedNodeMap(data=None)[source]

Bases: object

NamedNodeMap interface - Is used to represent collections of nodes that can be accessed by name. NamedNodeMaps are not maintained in any particular order.

Python extensions: can use sequence-style ‘len’, ‘getitem’, and ‘for..in’ constructs, and mapping-style ‘getitem’.

item(index)[source]

Returns the index-th item in the map.

This is arbitrary because maps have no order.

getLength()[source]

The length of the NamedNodeMap

getNamedItem(name)[source]

Retrieves a node specified by name. Parameters: name Name of a node to retrieve. Return Value A Node (of any type) with the specified name, or None if the specified name did not identify any node in the map.

class zope.structuredtext.stdom.Attr(name, value, specified=1)[source]

Bases: zope.structuredtext.stdom.Node

Attr interface - The Attr interface represents an attribute in an Element object. Attr objects inherit the Node interface.

getNodeName()[source]

The name of this node, depending on its type

getName()

The name of this node, depending on its type

getNodeValue()[source]

The value of this node, depending on its type

getSpecified()[source]

If this attribute was explicitly given a value in the original document, this is true; otherwise, it is false.

zope.structuredtext.html

HTML renderer for STX documents.

zope.structuredtext.docbook

Render STX document as docbook.

class zope.structuredtext.docbook.DocBook[source]

Bases: object

Structured text document renderer for Docbook.

class zope.structuredtext.docbook.DocBookChapter[source]

Bases: zope.structuredtext.docbook.DocBook

class zope.structuredtext.docbook.DocBookChapterWithFigures[source]

Bases: zope.structuredtext.docbook.DocBookChapter

class zope.structuredtext.docbook.DocBookArticle[source]

Bases: zope.structuredtext.docbook.DocBook

zope.structuredtext

Zope structured text markup

Consider the following example:

>>> from zope.structuredtext.stng import structurize
>>> from zope.structuredtext.document import DocumentWithImages
>>> from zope.structuredtext.html import HTMLWithImages
>>> from zope.structuredtext.docbook import DocBook
>>> from zope.structuredtext.docbook import DocBookChapterWithFigures
>>> from zope.structuredtext.docbook import DocBookArticle

We first need to structurize the string and make a full-blown document out of it:

>>> structured_string = '''
... Title Here
...
...     Body text here.'''
>>> struct = structurize(structured_string)
>>> doc = DocumentWithImages()(struct)

Now feed it to some output generator, in this case HTML or DocBook:

>>> HTMLWithImages()(doc, level=1)
'<html>...'
>>> DocBook()(doc, level=1)
'<!DOCTYPE book ...<book>...'
>>> DocBookArticle()(doc, level=1)
'<!DOCTYPE article ...<article>...'
>>> DocBookChapterWithFigures()(doc, level=1)
'<chapter>...'

For HTML, there is a shortcut:

>>> from zope.structuredtext import stx2html
>>> stx2html(structured_string)
'<html>...'

If we have references in the text we can use a different function:

>>> from zope.structuredtext import stx2htmlWithReferences
>>> stx2htmlWithReferences(structured_string)
'<html>...'
zope.structuredtext.stx2html(aStructuredString, level=1, header=1)[source]

A shortcut to produce HTML.

zope.structuredtext.stx2htmlWithReferences(text, level=1, header=1)[source]

A shortcut to produce HTML with references

Changes

4.3 (2018-10-09)

  • Add support for Python 3.7.

4.2.0 (2017-09-05)

  • Add support for Python 3.5 and 3.6.
  • Drop support for Python 2.6 and 3.3.
  • Add support for PyPy and PyPy3.
  • Support several new elements (inner and named links, underlines, etc) in the docbook writer.
  • Fix the XML output of DocBookBook.
  • 100% test coverage, maintained by CI and tox.
  • Unused internal code in the stdom module was removed. See issue 3.

4.1.0 (2014-12-29)

  • Drop dependency on six.
  • Add support for Python 3.4.
  • Add support for testing on Travis.

4.0.0 (2013-02-25)

  • Add support for Python 3.3.
  • Drop support for Python 2.4 and 2.5.

3.5.1 (2010-12-03)

  • Remove antique copyright assertions in regression texts, in conformance with repository policy.

3.5.0 (2010-04-30)

  • Update docs to conform to ZTK / Sphinx usage.
  • LP #120376: Output valid html for non-ASCII characters.

3.4.0 (2007/09/01)

  • Public release for completeness of Zope 3.4.

3.2.0 (2006/01/05)

  • Corresponds to the version of the zope.structuredtext package shipped as part of the Zope 3.2.0 release.
  • Only coding style / documentation changes.

3.0.0 (2004/11/07)

  • Corresponds to the version of the zope.structuredtext package shipped as part of the Zope X3.0.0 release.
