==========================================
Literate Programming with reStructuredText
==========================================

:Author: limodou
:Contact: limodou AT gmail.com
:Version: 0.1
:Homepage: http://wiki.woodpecker.org.cn/moin/NewEdit
:Blog: http://www.donews.net/limodou
:Copyright: FDL

.. contents:: Contents
.. sectnum::

Preface
=======

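"Have I read that right?" you may ask. You have. This is an idea that came to me after learning some basics of `literate programming`_. Anyone who has learned Python_ may know that docutils_ is a document-processing module for Python. It uses a plain-text markup called reStructuredText (abbreviated reST below), which grew out of StructuredText. By following a few text conventions, a conversion tool can turn a plain-text document into HTML, LaTeX, DocBook, and many other formats. Although docutils is written in Python, reST is not limited to Python-related uses: you can write anything in it, and many Python projects use it for their documentation.

What is literate programming? Put simply, it means writing programs the way one writes literature. It therefore places more weight on documentation, and the result is a mixture of documentation and code. Donald Knuth proposed it in 1984. See the literateprogramming_ site for details.

So I thought: wouldn't it be nice to extend reST with literate-programming features?

.. _`literate programming`: http://www.literateprogramming.com/
.. _docutils: http://docutils.sourceforge.net/
.. _Python: http://www.python.org/
.. _literateprogramming: http://www.literateprogramming.com/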

Format Requirements
===================

After studying docutils, I concluded that this could be done by adding a new directive. A standard reST directive has the following form::

    .. name:: parameters
       :option: value
    
       content
    
docutils also provides a mechanism for adding new directives, so you can create your own. I therefore created a new ``code`` directive with the following form::

    .. code:: code_name
       :file: output_filename
       :display: on|off

       content
    
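Here ``code_name`` is the name of the code block. The optional ``file`` option names the file that the block's output is saved to; the file name may include a path. Code blocks that share the same file name are written to the same file, subject to some additional requirements described later. The optional ``display`` option controls whether the code is shown in the document (``on``, the default) or hidden (``off``). ``content`` is the body of the code block. At present, references between code blocks within a single file are supported. Within ``content``, a line may contain only one reference, written as::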

    [indentation]<<code name>>
    
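A reference is enclosed in ``<<`` and ``>>``. The name may contain spaces, but they are all removed in the end, so the following forms all refer to the same name::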

    << code name >>
    <<code   name>>
    <<codename>>
    
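A reference may be preceded by whitespace (note that it must be spaces). On output, every line of an indented referenced block is indented by the same amount, which makes this well suited to indentation-sensitive languages.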
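
The reference syntax described above can be sketched in a few lines of Python. This is only an illustration, not the tool's actual parser: the names ``REFERENCE``, ``normalize_name`` and ``parse_reference`` are hypothetical, and the pattern assumes that a reference line holds nothing but leading spaces and a single ``<<name>>``.

```python
import re

# Hypothetical pattern: optional leading spaces, then one <<name>> reference.
REFERENCE = re.compile(r'^( *)<<([^<>]+)>>\s*$')


def normalize_name(name):
    """Remove every space so that all spellings of a name compare equal."""
    return name.replace(' ', '')


def parse_reference(line):
    """Return (indent_width, normalized_name), or None for an ordinary line."""
    match = REFERENCE.match(line)
    if match is None:
        return None
    return len(match.group(1)), normalize_name(match.group(2))


print(parse_reference('    << code name >>'))  # (4, 'codename')
print(parse_reference('<<code   name>>'))      # (0, 'codename')
print(parse_reference('printf("hi");'))        # None
```

The indent width is kept next to the name because the same number of spaces has to be put in front of every line of the referenced block on output.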

Requirements for Output Code
============================

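First, code to be output must be written with the ``code`` directive. A code block may specify a ``file`` option, which names the file the output is saved to. Every code block needs a name, and the name must be unique among the code blocks that belong to the same output file. In general, the code blocks for an output file should include one named ``main``: it is the root block, and its content links to the other code blocks. If no block named ``main`` is found among the blocks for an output file, the first block is taken as the root for that file. It follows that if code blocks in a document have different ``file`` values, they are written to different files.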

For example, the following code blocks are valid::

    .. code:: main
       :file: a.c

       #include <stdio.h>
       main()
       {
           printf("hello, world.\n");
       }

    .. code:: main
       :file: a.py

       print "hello, world."

They share a name but have different ``file`` options.
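
The rule for choosing the root block of an output file can be expressed as a small function. The sketch below assumes the block names of one output file are available as a list in document order; ``find_root`` is a hypothetical name used only for illustration.

```python
def find_root(block_names):
    """Pick the root block for one output file.

    block_names holds the names of that file's code blocks in document
    order (an assumption made for this sketch).
    """
    if not block_names:
        raise ValueError('no code blocks for this output file')
    if 'main' in block_names:
        return 'main'
    # No block is called main, so the first one becomes the root.
    return block_names[0]


print(find_root(['helpers', 'main']))  # main
print(find_root(['setup', 'body']))    # setup
```

Since the root is chosen per output file, two blocks named ``main`` do not conflict as long as they belong to different files.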

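Because a document is processed sequentially, once a code block sets the ``file`` option, that file becomes the current output file. A later code block that does not set ``file`` uses the same file as the preceding block. For example::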

    .. code:: main
       :file: a.c

       #include <stdio.h>
       main()
       {
           <<main_code>>
       }

    .. code:: main_code

       printf("hello, world.\n");
       printf("hello, reST.\n");
    
The output will be::

    #include <stdio.h>
    main()
    {
        printf("hello, world.\n");
        printf("hello, reST.\n");
    }

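Two code blocks are defined above, and the content of ``main`` contains ``<<main_code>>``. The ``main`` block sets ``file`` to ``a.c``, but ``main_code`` does not, so ``main_code`` automatically uses the file of the preceding block, which is the same as that of ``main``. ``main`` and ``main_code`` can therefore be regarded as one group of code blocks (that is, they correspond to one output file), and ``<<main_code>>`` refers directly to the ``main_code`` block below it. Note also that when the content of ``main_code`` is inserted into ``main``, every line is indented: the reference in ``main`` is preceded by spaces, so each line of the inserted content is indented by the same amount.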
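
The two steps just described, inheriting the current file and expanding references with indentation, can be sketched as follows. This is a simplified illustration rather than the tool's own code: ``group_by_file`` and ``expand`` are hypothetical names, blocks are assumed to be ``(name, file, lines)`` tuples, and the check for circular references is an addition of this sketch.

```python
import re

# Hypothetical pattern: optional leading spaces, then one <<name>> reference.
REFERENCE = re.compile(r'^( *)<<([^<>]+)>>\s*$')


def group_by_file(blocks):
    """Group (name, file_or_None, lines) tuples by output file.

    A block without a file belongs to the file set by the nearest
    preceding block that has one.
    """
    groups = {}
    current = None
    for name, filename, lines in blocks:
        if filename:
            current = filename
        groups.setdefault(current, {})[name.replace(' ', '')] = lines
    return groups


def expand(name, group, active=()):
    """Return the lines of block `name` with references replaced."""
    if name in active:
        raise ValueError('circular reference: ' + name)
    result = []
    for line in group[name]:
        match = REFERENCE.match(line)
        if match is None:
            result.append(line)
            continue
        indent = match.group(1)
        target = match.group(2).replace(' ', '')
        if target not in group:
            result.append(line)  # unknown reference: keep the line unchanged
            continue
        inner = expand(target, group, active + (name,))
        # Every inserted line gets the indentation of the reference.
        result.extend(indent + text for text in inner)
    return result


blocks = [
    ('main', 'a.c',
     ['#include <stdio.h>', 'main()', '{', '    <<main_code>>', '}']),
    ('main_code', None,
     ['printf("hello, world.\\n");', 'printf("hello, reST.\\n");']),
]
groups = group_by_file(blocks)
print('\n'.join(expand('main', groups['a.c'])))
```

Run on the two blocks from the example, the sketch prints the same six lines as the ``a.c`` output shown above, with both ``printf`` lines indented by four spaces.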

Conversion Tool
===============

Before using the conversion tool, you need to install the latest version of the docutils_ module.

All of the above is carried out by a conversion tool. It consists of two files:

#. doc.py
   The main program.

#. docnodes.py
   The node-processing module.

doc.py uses the same command-line interface as rst2html.py and adds no new command-line options. To see how to use it, run::

    python doc.py --help
    
to see the command-line options in detail.

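When it runs, all the code defined in the reST document is written to the specified files. As each file is created, the step is reported on the command line. The tool also creates directories automatically when needed.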
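
The file-writing step can be pictured roughly like this. The sketch is written for Python 3 and is not the tool's own code; ``write_output`` is a hypothetical helper, and the demonstration writes into a temporary directory so that it does not touch any real project files.

```python
import os
import tempfile


def write_output(filename, lines):
    """Write lines to filename, creating missing parent directories first."""
    directory = os.path.dirname(filename)
    if directory and not os.path.exists(directory):
        print('create dir', directory)
        os.makedirs(directory)
    print('create file', filename)
    with open(filename, 'w') as handle:
        handle.write('\n'.join(lines) + '\n')


# Demonstration inside a temporary directory (an assumption of this sketch).
base = tempfile.mkdtemp()
target = os.path.join(base, 'src', 'hello', 'a.py')
write_output(target, ['print("hello, world.")'])
```

Reporting each directory and file as it is created mirrors the progress messages the tool shows on the command line.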

Example command::

    python doc.py t.txt t.html
    
Summary
=======

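The doc.py extension makes a limited form of literate programming possible. The code has limited functionality, however, and has not been tested thoroughly. I expect to keep improving it as I use it. If you are interested, you are welcome to get in touch.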
    
As an example, the source of both programs is included in this document. You can run::

    python doc.py doc.txt doc.html
    
to see the output. Note that the generated files are named ``doc_.py`` and ``docnodes_.py``.

Appendix A: doc.py Code
=======================

.. code:: main
   :file: doc_.py

    from docutils import nodes, utils
    from docutils.parsers.rst import directives, states

    import docnodes

    display_values = ('on', 'off')

    def display(argument):
        return directives.choice(argument, display_values)

    def code(name, arguments, options, content, lineno,
              content_offset, block_text, state, state_machine):
        opt = {'display':'on'}
        opt.update(options)
        
        docnodes.Node(content, ''.join(arguments), **opt)
        if opt['display'].lower() == 'on':
            return [nodes.literal_block('', '\n'.join(content))]
        else:
            return []

    code.content = 1
    code.arguments = (1, 0, 1)
    code.options = {
                    'file':directives.unchanged,    #used to save code to the file
                    'display':display,              #if show code in document
                    }

    directives.register_directive('code', code)

    try:
        import locale
        locale.setlocale(locale.LC_ALL, '')
    except:
        pass

    from docutils.core import publish_cmdline, default_description


    description = ('Generates (X)HTML documents from standalone reStructuredText '
                   'sources.  ' + default_description)

    publish_cmdline(writer_name='html', description=description)

    docnodes.render()
    
Appendix B: docnodes.py Code
============================

.. code:: main
   :file: docnodes_.py

    import re
    import sys
    import os.path
    import traceback

    DEBUG = 0

    class Node(object):
        node_pattern = re.compile(r'^(?P<blank>\s*)<<(?P<nodename>[^<]+)>>\s*')
        def __init__(self, text, name, **opts):
            """text is passed in by docutils as a list of strings"""
            self.text = text
            self.pieces = []
            self.output = None
            self.name = Node.compressname(name)
            self.outputfile = opts.get('file', None)
            
            self.init()
            self.parent_nodelist = add_node(self)
            
        def render(self, nodelist):
            if self.output is None:
                buf = []
                for p in self.pieces:
                    if isinstance(p, (str, unicode)):
                        buf.append(p)
                    else:
                        buf.extend(p.render(nodelist))
                self.output = buf
            return self.output
                        
        def init(self):
            for i in self.text:
                b = Node.node_pattern.search(i)
                if b:
                    nodename = b.groupdict()['nodename']
                    indent = len(b.groupdict()['blank'])
                    self.pieces.append(LinkNode(Node.compressname(nodename), indent))
                else:
                    self.pieces.append(i)
                    
        def compressname(name):
            return name.replace(' ', '')
        compressname = staticmethod(compressname)
        
    class LinkNode(object):
        def __init__(self, name, indent=0):
            self.name = name
            self.indent = indent
            
        def render(self, nodelist):
            node = nodelist.get(self.name, None)
            if node:
                return [' '*self.indent + x for x in node.render(nodelist)]
            else:
                return [' '*self.indent + '<<' + self.name + '>>']
            
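    class OrderedDict(dict):
        """A dict that also remembers the order in which keys were first added."""
        def __init__(self, d=None):
            dict.__init__(self)
            self._sequence = []
            # Go through __setitem__ so that initial keys are recorded as well.
            for key, val in (d or {}).items():
                self[key] = val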
            
        def __setitem__(self, key, val):
            if key not in self:
                self._sequence.append(key)
            dict.__setitem__(self, key, val)

        def __delitem__(self, key):
            dict.__delitem__(self, key)
            self._sequence.remove(key)
            
        def getlist(self):
            return self._sequence
            
    class NodeList(object):
        def __init__(self):
            self.list = {}
            self.currentfile = None
            
        def add_node(self, node):
            if node.outputfile:
                self.currentfile = node.outputfile
            nodelist = self.list.setdefault(self.currentfile, OrderedDict({}))
            nodelist[node.name] = node
            return nodelist
        
        def render(self):
            for filename, nodelist in self.list.items():
                if filename:
                    basedir, filen = os.path.split(filename)
                    if basedir and not os.path.exists(basedir):
                        try:
                            print 'create dir', basedir
                            os.makedirs(basedir)
                        except:
                            error_output('Error: failed to create directory ' + basedir)
                    try:
                        print 'create file', filename
                        f = file(filename, 'w')
                    except:
                        error_output('Error: failed to create file ' + filename)
                else:
                    f = sys.stdout
                    
                f.write(self._render(nodelist))
                f.write('\n')
                # Leave sys.stdout open; only close files opened here.
                if f is not sys.stdout:
                    f.close()
                
        def _render(self, nodelist):
            main = self.find_mainnode(nodelist)
            return '\n'.join(flatlist(main.render(nodelist)))
            
        def find_mainnode(self, nodelist):
            """Return the node named main if there is one; otherwise the
            first node added is the main node."""
            main = nodelist.get("main", None)
            if not main:
                main = nodelist.get(nodelist.getlist()[0])
            return main
                    

    _nodelist = NodeList()

    def add_node(node):
        return _nodelist.add_node(node)

    def get_root_nodelist():
        return _nodelist

    def render():
        get_root_nodelist().render()

    def error_output(msg):    
        print msg
        if DEBUG:
            traceback.print_exc()
        else:
            print 'Set DEBUG = 1 in docnodes to see the traceback'
        sys.exit(1)
            
    def flatlist(alist):
        buf = []
        for i in alist:
            if isinstance(i, list):
                buf.extend(flatlist(i))
            else:
                buf.append(i)
        return buf
                
    if __name__ == '__main__':
        text = """Test program
        << node 1 >>
            << node 2 >>
            """
        node = Node(text.splitlines(), "main")
        
        node1text = """if __name__ == '__main__':
    """
        node1 = Node(node1text.splitlines(), "node1")

        node2text = """print "hello, world"
    """
        node2 = Node(node2text.splitlines(), "node2")

        render()