Changeset 779

Timestamp:
Mon Feb 27 08:09:54 2006
Author:
osmond
Message:

YuLin finished translating

Files:

  • zh-translations/branches/diveintopython-zh-5.4/zh-cn/xml/kgp.xml

    r201 r779  
    3 3 <?dbhtml filename="xml_processing/index.html"?>  
    4 4 <title>&xml; Processing</title>  
    5   <titleabbrev id="kgp.numberonly">Chapter 9</titleabbrev>  
      5 <titleabbrev id="kgp.numberonly">Chapter 9</titleabbrev>
    5 5 <section id="kgp.divein">  
    6   <title>Diving in</title>  
    7   <para>These next two chapters are about &xml; processing in &python;.  It would be helpful if you already knew what an &xml; document looks like, that it's made up of structured tags to form a hierarchy of elements, and so on.  If this doesn't make sense to you, there are <ulink url="&url_xmltutorial;">many &xml; tutorials</ulink> that can explain the basics.</para>  
    8   <para>If you're not particularly interested in XML, you should still read these chapters, which cover important topics like &python; packages, Unicode, command line arguments, and how to use &getattr; for method dispatching.</para>  
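      6 <title>Diving in</title>
      7 <para>The next two chapters are about &xml; processing in &python;. It helps if you already know what an &xml; document looks like: that it is made up of structured tags, that those tags form a hierarchy of elements, and so on. If none of this makes sense to you, there are <ulink url="&url_xmltutorial;">many &xml; tutorials</ulink> that explain the basics.</para>
      8 <para>If you are not particularly interested in XML, you should still read these chapters, because they cover important topics such as &python; packages, Unicode, command line arguments, and how to use &getattr; for method dispatching.</para>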
    9 9 <para>Being a philosophy major is not required, although if you have ever had the misfortune of being subjected to the writings of Immanuel Kant, you will appreciate the example program a lot more than if you majored in something useful, like computer science.</para>  
    10 10 <abstract>  
    11 11 <title/>  
    12   <para>There are two basic ways to work with &xml;.  One is called &sax; (<quote>Simple &api; for &xml;</quote>), and it works by reading the &xml; a little bit at a time and calling a method for each element it finds.  (If you read <xref linkend="dialect"/>, this should sound familiar, because that's how the &sgmllib_modulename; module works.)  The other is called &dom; (<quote>Document Object Model</quote>), and it works by reading in the entire &xml; document at once and creating an internal representation of it using native &python; classes linked in a tree structure.  &python; has standard modules for both kinds of parsing, but this chapter will only deal with using the &dom;.</para>  
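      12 <para>There are two basic ways to work with &xml;. One is called &sax; (<quote>Simple &api; for &xml;</quote>); it works by reading the &xml; a little at a time and calling a method for each element it finds. (If you read <xref linkend="dialect"/>, this should sound familiar, because that is how the &sgmllib_modulename; module works.) The other is called &dom; (<quote>Document Object Model</quote>); it works by reading in the entire &xml; document at once and building an internal representation of it out of &python; classes linked in a tree structure. &python; has standard modules for both kinds of parsing, but this chapter deals only with &dom;.</para>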
    12 12 </abstract>  
    13   <para>The following is a complete &python; program which generates pseudo-random output based on a context-free grammar defined in an &xml; format.  Don't worry yet if you don't understand what that means; you'll examine both the program's input and its output in more depth throughout these next two chapters.</para>  
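      13 <para>The following is a complete &python; program that generates pseudo-random output based on a context-free grammar defined in an &xml; format. Don't worry if you don't understand what that means; the next two chapters examine the program's input and output in depth.</para>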
    13 13 <example>  
    14 14 <title>&kgp_filename;</title>  
     
    198 198 </programlisting>  
    199 199 </example>  
    200   <para>Run the program &kgp_filename; by itself, and it will parse the default &xml;-based grammar, in &kantxml_filename;, and print several paragraphs worth of philosophy in the style of Immanuel Kant.</para>  
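      200 <para>Run the program &kgp_filename; by itself, and it will parse the default &xml;-based grammar in &kantxml_filename; and print several paragraphs of philosophy in the style of Immanuel Kant.</para>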
    200 200 <example>  
    201 201 <title>Sample output of &kgp_filename;</title>  
     
    236 236 [...snip...]</computeroutput></screen>  
    237 237 </example>  
    238   <para>This is, of course, complete gibberish.  Well, not complete gibberish.  It is syntactically and grammatically correct (although very verbose -- Kant wasn't what you would call a get-to-the-point kind of guy).  Some of it may actually be true (or at least the sort of thing that Kant would have agreed with), some of it is blatantly false, and most of it is simply incoherent.  But all of it is in the style of Immanuel Kant.</para>  
    239   <para>Let me repeat that this is much, much funnier if you are now or have ever been a philosophy major.</para>  
    240   <para>The interesting thing about this program is that there is nothing Kant-specific about it.  All the content in the previous example was derived from the grammar file, &kantxml_filename;.  If you tell the program to use a different grammar file (which you can specify on the command line), the output will be completely different.</para>  
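      238 <para>This is, of course, gibberish. Well, not complete gibberish. It is syntactically and grammatically correct (although very verbose; Kant was not what you would call a get-to-the-point kind of guy). Some of it may actually be true (or at least the sort of thing Kant would have agreed with), some of it is plainly false, and most of it is simply incoherent. But all of it is in the style of Immanuel Kant.
      239 </para>
      240 <para>Let me repeat that this is much, much funnier if you are now, or have ever been, a philosophy major.</para>
      241 <para>The interesting thing about this program is that nothing in it is specific to Kant. All of the content comes from the context-free grammar file, &kantxml_filename;. If you tell the program to use a different grammar file (which you can specify on the command line), the output will be completely different.</para>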
    241 242 <example>  
    242   <title>Simpler output from &kgp_filename;</title>  
      243 <title>Simpler output from &kgp_filename;</title>
    242 243 <screen><prompt>[you@localhost kgp]$ python kgp.py -g binary.xml</prompt>  
    243 244 <computeroutput>00101001</computeroutput>  
     
    246 247 <computeroutput>10110100</computeroutput></screen>  
    247 248 </example>  
    248   <para>You will take a closer look at the structure of the grammar file later in this chapter.  For now, all you need to know is that the grammar file defines the structure of the output, and the &kgp_filename; program reads through the grammar and makes random decisions about which words to plug in where.</para>  
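      249 <para>You will take a closer look at the structure of the grammar file later in this chapter. For now, all you need to know is that the grammar file defines the structure of the output, and that the &kgp_filename; program reads the grammar and decides at random which words to plug in where.
      250 </para>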
    249 251 </section>  
    250 252 <section id="kgp.packages">  
    251 253 <?dbhtml filename="xml_processing/packages.html"?>  
    252   <title>Packages</title>  
      254 <title>Packages</title>
    252 254 <abstract>  
    253   <para>Actually parsing an &xml; document is very simple: one line of code.  However, before you get to that line of code, you need to take a short detour to talk about packages.</para>  
      255 <para>Actually parsing an &xml; document is very simple: one line of code. Before you get to that line, however, you need a short detour to talk about packages.</para>
    253 255 </abstract>  
    254 256 <example>  
    255   <title>Loading an &xml; document (a sneak peek)</title>  
      257 <title>Loading an &xml; document (a sneak peek)</title>
    255 257 <screen>  
    256 258 &prompt;<userinput>from xml.dom import minidom</userinput> <co id="kgp.packages.1.1"/>  
     
    261 263 <calloutlist>  
    262 264 <callout arearefs="kgp.packages.1.1">  
    263   <para>This is a syntax you haven't seen before.  It looks almost like the &frommoduleimport; you know and love, but the <literal>"."</literal> gives it away as something above and beyond a simple import.  In fact, &xml_packagename; is what is known as a package, &dom_packagename; is a nested package within &xml_packagename;, and &minidom_modulename; is a module within &xmldom_packagename;.</para>  
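      265 <para>This is a syntax you haven't seen before. It looks almost like the &frommoduleimport; you know and love, but the <literal>"."</literal> gives it away as something more than a simple import. In fact, &xml_packagename; is what is known as a package, &dom_packagename; is a nested package within &xml_packagename;, and &minidom_modulename; is a module within &xmldom_packagename;.</para>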
    263 265 </callout>  
    264 266 </calloutlist>  
    265 267 </example>  
    266   <para>That sounds complicated, but it's really not.  Looking at the actual implementation may help.  Packages are little more than directories of modules; nested packages are subdirectories.  The modules within a package (or a nested package) are still just <filename class="headerfile">.py</filename> files, like always, except that they're in a subdirectory instead of the main <filename class="directory">lib/</filename> directory of your &python; installation.</para>  
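      268 <para>That sounds complicated, but it really isn't. Looking at the actual implementation may help. Packages are little more than directories of modules; nested packages are subdirectories. The modules within a package (or a nested package) are still just <filename class="headerfile">.py</filename> files, as always, except that they live in a subdirectory instead of the main <filename class="directory">lib/</filename> directory of your &python; installation.</para>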
    266 268 <example>  
    267   <title>File layout of a package</title>  
    268   <screen><computeroutput>&python;21/           root &python; installation (home of the executable)  
      269 <title>File layout of a package</title>
      270 <screen><computeroutput>&python;21/           root &python; installation (home of the executable)
    269 271 |  
    270   +--lib/             library directory (home of the standard library modules)  
      272 +--lib/             library directory (home of the standard library modules)
    270 272    |  
    271      +-- xml/         xml package (really just a directory with other stuff in it)  
      273    +-- xml/         xml package (really just a directory with other stuff in it)
    271 273        |  
    272          +--sax/      xml.sax package (again, just a directory)  
      274        +--sax/      xml.sax package (again, just a directory)
    272 274        |  
    273          +--dom/      xml.dom package (contains minidom.py)  
      275        +--dom/      xml.dom package (contains minidom.py)
    273 275        |  
    274          +--parsers/  xml.parsers package (used internally)</computeroutput></screen>  
      276        +--parsers/  xml.parsers package (used internally)</computeroutput></screen>
    274 276 </example>  
    275   <para>So when you say <literal>from xml.dom import minidom</literal>, &python; figures out that that means <quote>look in the &xml_packagename; directory for a &dom_packagename; directory, and look in <emphasis>that</emphasis> for the &minidom_modulename; module, and import it as &minidom_modulename;</quote>.  But &python; is even smarter than that; not only can you import entire modules contained within a package, you can selectively import specific classes or functions from a module contained within a package.  You can also import the package itself as a module.  The syntax is all the same; &python; figures out what you mean based on the file layout of the package, and automatically does the right thing.</para>  
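      277 <para>So when you say <literal>from xml.dom import minidom</literal>, &python; takes that to mean <quote>look in the &xml_packagename; directory for a &dom_packagename; directory, look in <emphasis>that</emphasis> for the &minidom_modulename; module, and import it as &minidom_modulename;</quote>. But &python; is even smarter than that: not only can you import entire modules contained within a package, you can also selectively import specific classes or functions from a module within a package. You can also import the package itself as a module. The syntax is the same in every case; &python; works out what you mean from the file layout of the package and automatically does the right thing.
      278 </para>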
    276 279 <example>  
    277   <title>Packages are modules, too</title>  
      280 <title>Packages are modules, too</title>
    277 280 <screen>&prompt;<userinput>from xml.dom import minidom</userinput>         <co id="kgp.packages.2.1"/>  
    278 281 &prompt;<userinput>minidom</userinput>  
     
    301 304 <calloutlist>  
    302 305 <callout arearefs="kgp.packages.2.1">  
    303   <para>Here you're importing a module (&minidom_modulename;) from a nested package (&xmldom_packagename;).  The result is that &minidom_modulename; is imported into your <link linkend="dialect.locals">namespace</link>, and in order to reference classes within the &minidom_modulename; module (like &element_classname;), you need to preface them with the module name.</para>  
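      306 <para>Here you are importing a module (&minidom_modulename;) from a nested package (&xmldom_packagename;). The result is that &minidom_modulename; is imported into your <link linkend="dialect.locals">namespace</link>, and to reference classes within the &minidom_modulename; module (such as &element_classname;), you must prefix them with the module name.</para>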
    303 306 </callout>  
    304 307 <callout arearefs="kgp.packages.2.2">  
    305   <para>Here you are importing a class (&element_classname;) from a module (&minidom_modulename;) from a nested package (&xmldom_packagename;).  The result is that &element_classname; is imported directly into your namespace.  Note that this does not interfere with the previous import; the &element_classname; class can now be referenced in two ways (but it's all still the same class).</para>  
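      308 <para>Here you are importing a class (&element_classname;) from a module (&minidom_modulename;) within a nested package (&xmldom_packagename;). The result is that &element_classname; is imported directly into your namespace. Note that this does not interfere with the previous import; the &element_classname; class can now be referenced in two ways (but it is still the same class).</para>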
    305 308 </callout>  
    306 309 <callout arearefs="kgp.packages.2.3">  
    307   <para>Here you are importing the &dom_packagename; package (a nested package of &xml_packagename;) as a module in and of itself.  Any level of a package can be treated as a module, as you'll see in a moment.  It can even have its own attributes and methods, just like the modules you've seen before.</para>
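      310 <para>Here you are importing the &dom_packagename; package (a nested package of &xml_packagename;) as a module in its own right. Any level of a package can be treated as a module, as you will see in a moment. It can even have its own attributes and methods, just like the modules you have seen before.</para>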
    307 310 </callout>  
    308 311 <callout arearefs="kgp.packages.2.4">  
    309   <para>Here you are importing the root level &xml_packagename; package as a module.</para>  
      312 <para>Here you are importing the root-level &xml_packagename; package as a module.</para>
    309 312 </callout>  
    310 313 </calloutlist>  
    311 314 </example>  
    312   <para>So how can a package (which is just a directory on disk) be imported and treated as a module (which is always a file on disk)?  The answer is the magical &init_filename; file.  You see, packages are not simply directories; they are directories with a specific file, &init_filename;, inside.  This file defines the attributes and methods of the package.  For instance, &xmldom_packagename; contains a &node_classname; class, which is defined in <filename>xml/dom/__init__.py</filename>.  When you import a package as a module (like &dom_packagename; from &xml_packagename;), you're really importing its &init_filename; file.</para>  
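      315 <para>So how can a package (which is just a directory on disk) be imported and treated as a module (which is always a file on disk)? The answer is the magical &init_filename; file. You see, packages are not simply directories; they are directories that contain a special file, &init_filename;. This file defines the attributes and methods of the package. For instance, &xmldom_packagename; contains a &node_classname; class, which is defined in <filename>xml/dom/__init__.py</filename>. When you import a package as a module (like &dom_packagename; from &xml_packagename;), you are really importing its &init_filename; file.</para>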
    312 315 <note>  
    313 316 <title>What makes a package</title>  
    314   <para>A package is a directory with the special &init_filename; file in it.  The &init_filename; file defines the attributes and methods of the package.  It doesn't need to define anything; it can just be an empty file, but it has to exist.  But if &init_filename; doesn't exist, the directory is just a directory, not a package, and it can't be imported or contain modules or nested packages.</para>  
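      317 <para>A package is a directory with the special &init_filename; file in it. The &init_filename; file defines the attributes and methods of the package. It does not need to define anything; it can be an empty file, but it has to exist. If &init_filename; does not exist, the directory is just a directory, not a package, and it cannot be imported or contain modules or nested packages.</para>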
    314 317 </note>  
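The role of `__init__.py` can be checked from the interpreter. The following is a minimal sketch, assuming a standard CPython 3 installation whose standard library is stored as ordinary files on disk; the variable names are illustrative only.

```python
import os
import types

import xml.dom

# A package imported by name is an ordinary module object.
is_module = isinstance(xml.dom, types.ModuleType)

# Its __file__ points at the package's __init__ file, not at the directory.
init_name = os.path.basename(xml.dom.__file__)

# Names defined in xml/dom/__init__.py become attributes of the package.
has_node = hasattr(xml.dom, "Node")

print(is_module, init_name, has_node)
```

On an installation that ships only compiled files, the file name may end in `.pyc` rather than `.py`, but it should still be the package's `__init__` file.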
    315   <para>So why bother with packages?  Well, they provide a way to logically group related modules.  Instead of having an &xml_packagename; package with &sax_packagename; and &dom_packagename; packages inside, the authors could have chosen to put all the &sax_packagename; functionality in <filename>xmlsax.py</filename> and all the &dom_packagename; functionality in <filename>xmldom.py</filename>, or even put all of it in a single module.  But that would have been unwieldy (as of this writing, the &xml; package has over 3000 lines of code) and difficult to manage (separate source files mean multiple people can work on different areas simultaneously).</para>  
    316   <para>If you ever find yourself writing a large subsystem in &python; (or, more likely, when you realize that your small subsystem has grown into a large one), invest some time designing a good package architecture.  It's one of the many things &python; is good at, so take advantage of it.</para>  
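      318 <para>So why bother with packages? Well, they provide a way to group related modules logically. Instead of an &xml_packagename; package with &sax_packagename; and &dom_packagename; packages inside it, the authors could have put all the &sax_packagename; functionality in <filename>xmlsax.py</filename> and all the &dom_packagename; functionality in <filename>xmldom.py</filename>, or even put everything in a single module. But that would have been unwieldy (as of this writing, the &xml; package has over 3000 lines of code) and difficult to manage (separate source files mean several people can work on different areas at the same time).</para>
      319 <para>If you ever find yourself writing a large subsystem in &python; (or, more likely, when you realize that your small subsystem has grown into a large one), invest some time in designing a good package architecture. It is one of the many things &python; is good at, so take advantage of it.</para>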
    317 320 </section>  
    318 321 <section id="kgp.parse">  
    319 322 <?dbhtml filename="xml_processing/parsing_xml.html"?>  
    320   <title>Parsing &xml;</title>  
      323 <title>Parsing &xml;</title>
    320 323 <abstract>  
    321   <para>As I was saying, actually parsing an &xml; document is very simple: one line of code.  Where you go from there is up to you.</para>  
      324 <para>As I was saying, actually parsing an &xml; document is very simple: one line of code. Where you go from there is up to you.</para>
    321 324 </abstract>  
    322 325 <example>  
    323   <title>Loading an &xml; document (for real this time)</title>  
      326 <title>Loading an &xml; document (for real this time)</title>
    323 326 <screen>  
    324 327 &prompt;<userinput>from xml.dom import minidom</userinput>                                          <co id="kgp.parse.1.1"/>  
     
    349 352 <calloutlist>  
    350 353 <callout arearefs="kgp.parse.1.1">  
    351   <para>As you saw in the <link linkend="kgp.packages">previous section</link>, this imports the &minidom_modulename; module from the &xmldom_packagename; package.</para>  
      354 <para>As you saw in the <link linkend="kgp.packages">previous section</link>, this statement imports the &minidom_modulename; module from the &xmldom_packagename; package.</para>
    351 354 </callout>  
    352 355 <callout arearefs="kgp.parse.1.2">  
    353   <para>Here is the one line of code that does all the work: &minidomparse_functionname; takes one argument and returns a parsed representation of the &xml; document.  The argument can be many things; in this case, it's simply a filename of an &xml; document on my local disk.  (To follow along, you'll need to change the path to point to your downloaded examples directory.)  But you can also pass a <link linkend="fileinfo.files">file object</link>, or even a <link linkend="dialect.extract.urllib">file-like object</link>.  You'll take advantage of this flexibility later in this chapter.</para>  
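      356 <para>Here is the one line of code that does all the work: &minidomparse_functionname; takes one argument and returns a parsed representation of the &xml; document. The argument can be many things; in this case, it is simply the filename of an &xml; document on my local disk. (To follow along, you will need to change the path to point to your downloaded examples directory.) But you can also pass a <link linkend="fileinfo.files">file object</link>, or even a <link linkend="dialect.extract.urllib">file-like object</link>. You will take advantage of this flexibility later in this chapter.</para>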
    353 356 </callout>  
    354 357 <callout arearefs="kgp.parse.1.3">  
    355   <para>The object returned from &minidomparse_functionname; is a &document_classname; object, a descendant of the &node_classname; class.  This &document_classname; object is the root level of a complex tree-like structure of interlocking &python; objects that completely represent the &xml; document you passed to &minidomparse_functionname;.</para>  
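      358 <para>The object returned from &minidomparse_functionname; is a &document_classname; object, a descendant of the &node_classname; class. This &document_classname; object is the root of a complex tree-like structure of interlocking &python; objects that completely represents the &xml; document you passed to &minidomparse_functionname;.</para>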
    355 358 </callout>  
    356 359 <callout arearefs="kgp.parse.1.4">  
    357   <para>&toxml_functionname; is a method of the &node_classname; class (and is therefore available on the &document_classname; object you got from &minidomparse_functionname;).  &toxml_functionname; returns, as a string, the &xml; that this &node_classname; represents.  For the &document_classname; node, that string is the entire &xml; document.</para>
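      360 <para>&toxml_functionname; is a method of the &node_classname; class (and is therefore available on the &document_classname; object you got from &minidomparse_functionname;). &toxml_functionname; returns, as a string, the &xml; that this &node_classname; represents. For the &document_classname; node, that string is the entire &xml; document.</para>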
    357 360 </callout>  
    358 361 </calloutlist>  
    359 362 </example>  
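Because the parse function accepts a file-like object as well as a filename, the same call can be tried without any file on disk. This is a sketch under stated assumptions: it targets Python 3, and `SAMPLE_GRAMMAR` is a small hypothetical stand-in for a grammar file, not the actual contents of the example files.

```python
import io
from xml.dom import minidom

# Hypothetical stand-in grammar; not the real file from the examples directory.
SAMPLE_GRAMMAR = b"""<?xml version="1.0"?>
<grammar>
<ref id="bit">
  <p>0</p>
  <p>1</p>
</ref>
<ref id="byte">
  <p>00101001</p>
</ref>
</grammar>"""

# minidom.parse takes a filename or any object with a read() method.
xmldoc = minidom.parse(io.BytesIO(SAMPLE_GRAMMAR))

# toxml() serializes the whole tree back to a string.
xml_text = xmldoc.toxml()
root_name = xmldoc.documentElement.tagName

print(root_name)
print(xml_text)
```

Passing an in-memory buffer here is only a convenience for experimenting; with a real grammar file, the filename form shown in the example above works the same way.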
    360   <para>Now that you have an &xml; document in memory, you can start traversing through it.</para>  
      363 <para>Now that you have an &xml; document in memory, you can start traversing it.</para>
    360 363 <example id="kgp.parse.gettingchildnodes.example">  
    361   <title>Getting child nodes</title>  
      364 <title>Getting child nodes</title>
    361 364 <screen>  
    362 365 &prompt;<userinput>xmldoc.childNodes</userinput>    <co id="kgp.parse.2.1"/>  
     
    374 377 <calloutlist>  
    375 378 <callout arearefs="kgp.parse.2.1">  
    376   <para>Every &node_classname; has a &childnodes_attr; attribute, which is a list of &node_classname; objects.  A &document_classname; always has only one child node, the root element of the &xml; document (in this case, the &grammarnode; element).</para>
      379 <para>Every &node_classname; has a &childnodes_attr; attribute, which is a list of &node_classname; objects. A &document_classname; has only one child node, the root element of the &xml; document (in this case, the &grammarnode; element).</para>
    376 379 </callout>  
    377 380 <callout arearefs="kgp.parse.2.2">  
    378   <para>To get the first (and in this case, the only) child node, just use regular list syntax.  Remember, there is nothing special going on here; this is just a regular &python; list of regular &python; objects.</para>  
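      381 <para>To get the first (and in this case, the only) child node, just use regular list syntax. Remember, nothing special is going on here; this is just a regular &python; list of regular &python; objects.</para>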
    378 381 </callout>  
    379 382 <callout arearefs="kgp.parse.2.3">  
    380   <para>Since getting the first child node of a node is a useful and common activity, the &node_classname; class has a &firstchild_attr; attribute, which is synonymous with <literal>childNodes[0]</literal>.  (There is also a &lastchild_attr; attribute, which is synonymous with <literal>childNodes[-1]</literal>.)</para>  
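      383 <para>Since getting the first child node of a node is a useful and common operation, the &node_classname; class has a &firstchild_attr; attribute, which is synonymous with <literal>childNodes[0]</literal>. (There is also a &lastchild_attr; attribute, which is synonymous with <literal>childNodes[-1]</literal>.)</para>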
    380 383 </callout>  
    381 384 </calloutlist>  
    382 385 </example>  
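The relationship between the child list and its two shortcut attributes can be confirmed directly. A minimal sketch, assuming Python 3 and a tiny hypothetical document written on one line so that no whitespace nodes get in the way:

```python
from xml.dom import minidom

# Hypothetical one-line document: no line breaks between the tags.
xmldoc = minidom.parseString("<grammar><ref id='bit'/><ref id='byte'/></grammar>")

children = xmldoc.childNodes      # list-like collection of Node objects
grammar_node = xmldoc.firstChild  # shortcut for childNodes[0]

same_first = grammar_node is children[0]
same_last = xmldoc.lastChild is children[-1]

# The same attributes work one level down, on the root element.
first_id = grammar_node.firstChild.getAttribute("id")
last_id = grammar_node.lastChild.getAttribute("id")

print(len(children), first_id, last_id)
```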
    383 386 <example>  
    384   <title>&toxml_functionname; works on any node</title>  
      387 <title>&toxml_functionname; works on any node</title>
    384 387 <screen>  
    385 388 &prompt;<userinput>grammarNode = xmldoc.firstChild</userinput>  
     
    401 404 <calloutlist>  
    402 405 <callout arearefs="kgp.parse.3.1">  
    403   <para>Since the &toxml_functionname; method is defined in the &node_classname; class, it is available on any &xml; node, not just the &document_classname; node.</para>
      406 <para>Since the &toxml_functionname; method is defined in the &node_classname; class, it is available on any &xml; node, not just the &document_classname; node.</para>
    403 406 </callout>  
    404 407 </calloutlist>  
    405 408 </example>  
    406 409 <example id="kgp.parse.childnodescanbetext.example">  
    407   <title>Child nodes can be text</title>  
      410 <title>Child nodes can be text</title>
    407 410 <screen>  
    408 411 &prompt;<userinput>grammarNode.childNodes</userinput>                  <co id="kgp.parse.4.1"/>  
     
    431 434 <calloutlist>  
    432 435 <callout arearefs="kgp.parse.4.1">  
    433   <para>Looking at the &xml; in &binaryxml_filename;, you might think that the &grammarnode; has only two child nodes, the two &refnode; elements.  But you're missing something: the carriage returns!  After the <literal>'&lt;grammar>'</literal> and before the first <literal>'&lt;ref>'</literal> is a carriage return, and this text counts as a child node of the &grammarnode; element.  Similarly, there is a carriage return after each <literal>'&lt;/ref>'</literal>; these also count as child nodes.  So <literal>grammarNode.childNodes</literal> is actually a list of 5 objects: 3 &text_classname; objects and 2 &element_classname; objects.</para>
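      436 <para>Looking at the &xml; in &binaryxml_filename;, you might think that &grammarnode; has only two child nodes, the two &refnode; elements. But you are forgetting something: the carriage returns! After the <literal>'&lt;grammar>'</literal> and before the first <literal>'&lt;ref>'</literal> is a carriage return, and this text counts as a child node of the &grammarnode; element. Similarly, there is a carriage return after each <literal>'&lt;/ref>'</literal>; these also count as child nodes. So <literal>grammarNode.childNodes</literal> is actually a list of 5 objects: 3 &text_classname; objects and 2 &element_classname; objects.</para>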
    433 436 </callout>  
    434 437 <callout arearefs="kgp.parse.4.2">  
    435   <para>The first child is a &text_classname; object representing the carriage return after the <literal>'&lt;grammar>'</literal> tag and before the first <literal>'&lt;ref>'</literal> tag.</para>  
      438 <para>The first child is a &text_classname; object representing the carriage return after the <literal>'&lt;grammar>'</literal> tag and before the first <literal>'&lt;ref>'</literal> tag.</para>
    435 438 </callout>  
    436 439 <callout arearefs="kgp.parse.4.3">  
    437   <para>The second child is an &element_classname; object representing the first &refnode; element.</para>  
      440 <para>The second child is an &element_classname; object representing the first &refnode; element.</para>
    437 440 </callout>  
    438 441 <callout arearefs="kgp.parse.4.4">  
    439   <para>The fourth child is an &element_classname; object representing the second &refnode; element.</para>  
      442 <para>The fourth child is an &element_classname; object representing the second &refnode; element.</para>
    439 442 </callout>  
    440 443 <callout arearefs="kgp.parse.4.5">  
    441   <para>The last child is a &text_classname; object representing the carriage return after the <literal>'&lt;/ref>'</literal> end tag and before the <literal>'&lt;/grammar>'</literal> end tag.</para>  
      444 <para>The last child is a &text_classname; object representing the carriage return after the <literal>'&lt;/ref>'</literal> end tag and before the <literal>'&lt;/grammar>'</literal> end tag.</para>
    441 444 </callout>  
    442 445 </calloutlist>  
     
    470 473 <calloutlist>  
    471 474 <callout arearefs="kgp.parse.5.1">  
    472   <para>As you saw in the previous example, the first <sgmltag>ref</sgmltag> element is <literal>grammarNode.childNodes[1]</literal>, since <literal>childNodes[0]</literal> is a &text_classname; node for the carriage return.</para>
      475 <para>As you saw in the previous example, the first <sgmltag>ref</sgmltag> element is <literal>grammarNode.childNodes[1]</literal>, since <literal>childNodes[0]</literal> is a &text_classname; node for the carriage return.</para>
    472 475 </callout>  
    473 476 <callout arearefs="kgp.parse.5.2">  
    474   <para>The <sgmltag>ref</sgmltag> element has its own set of child nodes, one for the carriage return, a separate one for the spaces, one for the <sgmltag>p</sgmltag> element, and so forth.</para>  
      477 <para>The <sgmltag>ref</sgmltag> element has its own set of child nodes: one for the carriage return, a separate one for the spaces, one for the <sgmltag>p</sgmltag> element, and so forth.</para>
    474 477 </callout>  
    475 478 <callout arearefs="kgp.parse.5.3">  
    476   <para>You can even use the &toxml_functionname; method here, deeply nested within the document.</para>  
      479 <para>You can use the &toxml_functionname; method even here, deeply nested within the document.</para>
    476 479 </callout>  
    477 480 <callout arearefs="kgp.parse.5.4">  
    478   <para>The <sgmltag>p</sgmltag> element has only one child node (you can't tell that from this example, but look at <literal>pNode.childNodes</literal> if you don't believe me), and it is a &text_classname; node for the single character <literal>'0'</literal>.</para>  
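      481 <para>The <sgmltag>p</sgmltag> element has only one child node (you can't tell that from this example, but look at <literal>pNode.childNodes</literal> if you don't believe me), and it is a &text_classname; node for the single character <literal>'0'</literal>.</para>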
    478 481 </callout>  
    479 482 <callout arearefs="kgp.parse.5.5">  
    480   <para>The <literal>.data</literal> attribute of a &text_classname; node gives you the actual string that the text node represents.  But what is that <literal>'u'</literal> in front of the string?  The answer to that deserves its own section.</para>  
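      483 <para>The <literal>.data</literal> attribute of a &text_classname; node gives you the actual string that the text node represents. But what is that <literal>'u'</literal> in front of the string? The answer deserves its own section.</para>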
    480 483 </callout>  
    481 484 </calloutlist>  
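The whitespace nodes and the drill-down to `.data` described in the callouts above can be reproduced with a small script. This is a sketch, assuming Python 3; `SAMPLE` is a hypothetical stand-in grammar rather than the real example file, and `element_children` is an illustrative helper, not part of `minidom`.

```python
from xml.dom import minidom

# Hypothetical stand-in grammar with line breaks between the tags.
SAMPLE = """<grammar>
<ref id="bit">
  <p>0</p>
  <p>1</p>
</ref>
<ref id="byte">
  <p>00101001</p>
</ref>
</grammar>"""


def element_children(node):
    """Return only the Element children of node, skipping Text nodes."""
    return [child for child in node.childNodes
            if child.nodeType == child.ELEMENT_NODE]


grammar_node = minidom.parseString(SAMPLE).firstChild

# Classify each direct child of <grammar>: the line breaks show up as Text nodes.
kinds = ["text" if child.nodeType == child.TEXT_NODE else "element"
         for child in grammar_node.childNodes]

# Drill down: first <ref>, then its first <p>, then that element's Text node.
ref_node = element_children(grammar_node)[0]
p_node = element_children(ref_node)[0]
digit = p_node.firstChild.data

print(kinds)
print(repr(digit))
```

Filtering on `nodeType` avoids hard-coding list positions such as `childNodes[1]`, which depend on exactly how the whitespace in the file is split into Text nodes.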
     
    492 495 <abstract>  
    493 496 <title/>  
    494   <para>Unicode is a system to represent characters from all the world's different languages.  When &python; parses an &xml; document, all data is stored in memory as unicode.</para>  
      497 <para>Unicode is a system for representing characters from all the world's different languages. When &python; parses an &xml; document, all data is stored in memory as unicode.</para>
    494 497 </abstract>  
    495   <para>You'll get to all that in a minute, but first, some background.</para>  
      498 <para>You will get to all that in a minute, but first, some background.</para>
    495 498 <formalpara>  
    496   <title>Historical note</title>  
    497   <para>Before unicode, there were separate character encoding systems for each language, each using the same numbers (0-255) to represent that language's characters.  Some languages (like Russian) have multiple conflicting standards about how to represent the same characters; other languages (like Japanese) have so many characters that they require multiple-byte character sets.  Exchanging documents between systems was difficult because there was no way for a computer to tell for certain which character encoding scheme the document author had used; the computer only saw numbers, and the numbers could mean different things.  Then think about trying to store these documents in the same place (like in the same database table); you would need to store the character encoding alongside each piece of text, and make sure to pass it around whenever you passed the text around.  Then think about multilingual documents, with characters from multiple languages in the same document.  (They typically used escape codes to switch modes; poof, you're in Russian koi8-r mode, so character 241 means this; poof, now you're in Mac Greek mode, so character 241 means something else.  And so on.)  These are the problems which unicode was designed to solve.</para>  
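      499 <title>Historical note</title>
      500 <para>Before unicode, there were separate character encoding systems for each language, each using the same numbers (0-255) to represent that language's characters. Some languages (like Russian) have multiple conflicting standards for representing the same characters; other languages (like Japanese) have so many characters that they require multiple-byte character sets. Exchanging documents between systems was difficult because a computer had no way to tell for certain which character encoding scheme the document's author had used; the computer saw only numbers, and the numbers could mean different things. Then think about trying to store these documents in the same place (such as the same database table); you would need to store the character encoding alongside each piece of text, and make sure to pass it around whenever you passed the text around. Then think about multilingual documents, with characters from several languages in the same document. (They typically used escape codes to switch modes; poof, you're in Russian koi8-r mode, so character 241 means this; poof, now you're in Mac Greek mode, so character 241 means something else. And so on.) These are the problems that unicode was designed to solve.
      501 </para>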
    498 502 </formalpara>  
    499   <para>To solve these problems, unicode represents each character as a 2-byte number, from 0 to 65535.<footnote><para>This, sadly, is <emphasis>still</emphasis> an oversimplification.  Unicode now has been extended to handle ancient Chinese, Korean, and Japanese texts, which had so many different characters that the 2-byte unicode system could not represent them all.  But &python; doesn't currently support that out of the box, and I don't know if there is a project afoot to add it.  You've reached the limits of my expertise, sorry.</para></footnote>  Each 2-byte number represents a unique character used in at least one of the world's languages.  (Characters that are used in multiple languages have the same numeric code.)  There is exactly 1 number per character, and exactly 1 character per number.  Unicode data is never ambiguous.</para>  
    500   <para>Of course, there is still the matter of all these legacy encoding systems.  7-bit &ascii;, for instance, which stores English characters as numbers ranging from 0 to 127.  (65 is capital <quote><literal>A</literal></quote>, 97 is lowercase <quote><literal>a</literal></quote>, and so forth.)  English has a very simple alphabet, so it can be completely expressed in 7-bit &ascii;.  Western European languages like French, Spanish, and German all use an encoding system called ISO-8859-1 (also called <quote>latin-1</quote>), which uses the 7-bit &ascii; characters for the numbers 0 through 127, but then extends into the 128-255 range for characters like n-with-a-tilde-over-it (241), and u-with-two-dots-over-it (252).  And unicode uses the same characters as 7-bit &ascii; for 0 through 127, and the same characters as ISO-8859-1 for 128 through 255, and then extends from there into characters for other languages with the remaining numbers, 256 through 65535.</para>  
    501   <para>When dealing with unicode data, you may at some point need to convert the data back into one of these other legacy encoding systems.  For instance, to integrate with some other computer system which expects its data in a specific 1-byte encoding scheme, or to print it to a non-unicode-aware terminal or printer.  Or to store it in an &xml; document which explicitly specifies the encoding scheme.</para>  
    502   <para>And on that note, let's get back to &python;.</para>  
    503   <para>&python; has had unicode support throughout the language since version 2.0.  The &xml; package uses unicode to store all parsed &xml; data, but you can use unicode anywhere.</para>  
      503 <para>为了解决这些问题,unicode用一个 2 字节数字表示每个字符,从 0 到 65535。<footnote><para>这一点,很不幸<emphasis>仍然</emphasis> 过分简单了。现在unicode已经扩展用来处理古老的汉字、韩文和日文文本,它们有太多不同的字符,以至于2字节的unicode系统不能全部表示。但当前 &python; 不支持超出范围的编码,并且我不知道是否有正在计划进行解决的项目。对不起,你已经到了我经验的极限了。</para></footnote> 每个 2 字节数字表示至少在一种世界语言中使用的一个唯一字符。(在多种语言中都使用的字符具有相同的数字码。)这样就确保每个字符一个数字,并且每个数字一个字符。Unicode数据永远不会模棱两可。</para>  
      504 <para>当然,仍然还存在着所有那些遗留的编码系统的情况。例如,7位 &ascii;,它可以将英文字符存储为从0到127的数值。(65是大写字母<quote><literal>A</literal></quote>,97是小写字母<quote><literal>a</literal></quote>,等等。)英语有着非常简单的字母表,所以它可以完全用7位 &ascii; 来表示。像法语、西班牙语和德语之类的西欧语言都使用叫做ISO-8859-1的编码系统(也叫做<quote>latin-1</quote>),它使用7位 &ascii; 字符表示从0到127的数字,但接着扩展到了128-255的范围来表示像n上带有一个波浪线(241),和u上带有两个点(252)的字符。Unicode使用同7位 &ascii; 码一样的字符表示0到127,同ISO-8859-1一样的字符表示128到255,接着使用剩余的数字,256到65535,扩展到表示其它语言的字符。</para>  
      505 <para>在处理unicode数据时,在某些地方你可能需要将数据转换回这些遗留编码系统之一。例如,为了同其它一些计算机系统集成,这些系统期望它的数据使用一种特定的单字节编码模式,或将数据打印输出到一个非unicode识别终端或打印机。或将数据保存到一个明确指定编码模式的  &xml; 文档中。</para>  
      506 <para>在了解这个注解之后,让我们回到 &python;上来。</para>  
      507 <para>从2.0版本开始,&python; 在整个语言的基础上已经支持unicode。&xml; 包使用unicode来保存所有解析了的 &xml; 数据,而且你可以在任何地方使用unicode。</para>  
    504 508 <example>  
    505   <title>Introducing unicode</title>  
      509 <title>unicode介绍</title>  
    505 509 <screen>  
    506 510 &prompt;<userinput>s = u'Dive in'</userinput>            <co id="kgp.unicode.1.1"/>  
     
    514 518 <calloutlist>  
    515 519 <callout arearefs="kgp.unicode.1.1">  
    516   <para>To create a unicode string instead of a regular &ascii; string, add the letter <quote><literal>u</literal></quote> before the string.  Note that this particular string doesn't have any non-&ascii; characters.  That's fine; unicode is a superset of &ascii; (a very large superset at that), so any regular &ascii; string can also be stored as unicode.</para>  
      520 <para>为了创建一个unicode字符串而不是通常的 &ascii; 字符串,要在字符串前面加上字母<quote><literal>u</literal></quote>。注意这个特殊的字符串没有任何非 &ascii; 的字符。这样很好;unicode是 &ascii; 的一个超集(一个非常大的超集),所以任何正常的 &ascii; 都可以以unicode形式保存起来。</para>  
    516 520 </callout>  
    517 521 <callout arearefs="kgp.unicode.1.2">  
    518   <para>When printing a string, &python; will attempt to convert it to your default encoding, which is usually &ascii;.  (More on this in a minute.)  Since this unicode string is made up of characters that are also &ascii; characters, printing it has the same result as printing a normal &ascii; string; the conversion is seamless, and if you didn't know that <varname>s</varname> was a unicode string, you'd never notice the difference.</para>  
      522 <para>在打印字符串时,&python; 试图将字符串转换为你的默认编码,通常是 &ascii; 。(过会儿有更详细的说明。)因为组成这个unicode字符串的字符都是 &ascii; 字符,打印结果与打印正常的 &ascii; 字符串是一样的;转换是无缝的,而且如果你没有注意到<varname>s</varname>是一个unicode字符串的话,你永远也不会注意到两者之间的差别。</para>  
    518 522 </callout>  
    519 523 </calloutlist>  
    520 524 </example>  
    521 525 <example>  
    522   <title>Storing non-&ascii; characters</title>  
      526 <title>存储非 &ascii; 字符</title>  
    522 526 <screen>  
    523 527 &prompt;<userinput>s = u'La Pe\xf1a'</userinput>         <co id="kgp.unicode.2.1"/>  
     
    532 536 <calloutlist>  
    533 537 <callout arearefs="kgp.unicode.2.1">  
    534   <para>The real advantage of unicode, of course, is its ability to store non-&ascii; characters, like the Spanish <quote><literal>&ntilde;</literal></quote> (<literal>n</literal> with a tilde over it).  The unicode character code for the tilde-n is <literal>0xf1</literal> in hexadecimal (241 in decimal), which you can type like this: <literal>\xf1</literal>.</para>  
      538 <para>unicode真正的优势,理所当然的是它保存非 &ascii; 字符的能力,例如西班牙语的<quote><literal>&ntilde;</literal></quote>(<literal>n</literal>上带有一个波浪线)。用来表示波浪线n的unicode字符编码是十六进制的<literal>0xf1</literal> (十进制的241),你可以像这样输入:<literal>\xf1</literal>。</para>  
    534 538 </callout>  
    535 539 <callout arearefs="kgp.unicode.2.2">  
    536   <para>Remember I said that the &print; function attempts to convert a unicode string to &ascii; so it can print it?  Well, that's not going to work here, because your unicode string contains non-&ascii; characters, so &python; raises a <errorname>UnicodeError</errorname> error.</para>  
      540 <para>还记得我说过 &print; 函数会尝试将unicode字符串转换为 &ascii;,这样就可以打印它了吗?嗯,在这里将不会起作用,因为你的unicode字符串包含非 &ascii; 字符,所以 &python; 会引发<errorname>UnicodeError</errorname>异常。</para>  
    536 540 </callout>  
    537 541 <callout arearefs="kgp.unicode.2.3">  
    538   <para>Here's where the conversion-from-unicode-to-other-encoding-schemes comes in.  <varname>s</varname> is a unicode string, but &print; can only print a regular string.  To solve this problem, you call the <function>encode</function> method, available on every unicode string, to convert the unicode string to a regular string in the given encoding scheme, which you pass as a parameter.  In this case, you're using <literal>latin-1</literal> (also known as <literal>iso-8859-1</literal>), which includes the tilde-n (whereas the default &ascii; encoding scheme did not, since it only includes characters numbered 0 through 127).</para>  
      542 <para>这儿就是将unicode转换为其它编码模式起作用的地方。<varname>s</varname>是一个unicode字符串,但 &print; 只能打印正常的字符串。为了解决这个问题,我们调用 <function>encode</function> 方法(它可以用于每个unicode字符串)将unicode字符串转换为指定编码模式的正常字符串。我们向此函数传入一个参数。在本例中,我们使用 <literal>latin-1</literal> (也就是大家知道的 <literal>iso-8859-1</literal>),它包括带波浪线的n(然而缺省的 &ascii; 编码模式不包括,因为它只包含数值从 0 到 127 的字符)。</para>  
    538 542 </callout>  
    539 543 </calloutlist>  
    540 544 </example>  
    541   <para>Remember I said &python; usually converted unicode to &ascii; whenever it needed to make a regular string out of a unicode string?  Well, this default encoding scheme is an option which you can customize.</para>  
      545 <para>还记得我说过:一旦需要从一个unicode得到一个正常字符串,&python;通常默认将unicode转换成 &ascii; 吗?嗯,这个默认编码模式是一个可以定制的选项。</para>  
    541 545 <example>  
    542 546 <title><filename>sitecustomize.py</filename></title>  
     
    554 558 <calloutlist>  
    555 559 <callout arearefs="kgp.unicode.3.1">  
    556   <para><filename>sitecustomize.py</filename> is a special script; &python; will try to import it on startup, so any code in it will be run automatically.  As the comment mentions, it can go anywhere (as long as &import; can find it), but it usually goes in the <filename>site-packages</filename> directory within your &python; <filename>lib</filename> directory.</para>  
      560 <para><filename>sitecustomize.py</filename>是一个特殊的脚本;&python; 会在启动的时候导入它,所以在其中的任何代码都将自动运行。就像注解中提到的那样,它可以放在任何地方(只要 &import; 能够找到它),但是通常它位于 &python; 的<filename>lib</filename>目录的<filename>site-packages</filename>目录中。</para>  
    556 560 </callout>  
    557 561 <callout arearefs="kgp.unicode.3.2">  
    558   <para>The <function>setdefaultencoding</function> function sets, well, the default encoding.  This is the encoding scheme that &python; will try to use whenever it needs to auto-coerce a unicode string into a regular string.</para>  
      562 <para><function>setdefaultencoding</function> 函数设置的,嗯,就是默认编码。&python; 会在任何需要自动将unicode字符串强制转换为正常字符串的地方,使用这个编码模式。</para>  
    558 562 </callout>  
    559 563 </calloutlist>  
    560 564 </example>  
    561 565 <example>  
    562   <title>Effects of setting the default encoding</title>  
      566 <title>设置默认编码的效果</title>  
    562 566 <screen>  
    563 567 &prompt;<userinput>import sys</userinput>  
     
    572 576 <calloutlist>  
    573 577 <callout arearefs="kgp.unicode.4.1">  
    574   <para>This example assumes that you have made the changes listed in the previous example to your <filename>sitecustomize.py</filename> file, and restarted &python;.  If your default encoding still says <literal>'ascii'</literal>, you didn't set up your <filename>sitecustomize.py</filename> properly, or you didn't restart &python;.  The default encoding can only be changed during &python; startup; you can't change it later.  (Due to some wacky programming tricks that I won't get into right now, you can't even call <function>sys.setdefaultencoding</function> after &python; has started up.  Dig into <filename>site.py</filename> and search for <quote><literal>setdefaultencoding</literal></quote> to find out how.)</para>  
      578 <para>这个例子假设你已经按前一个例子中的改动对<filename>sitecustomize.py</filename>文件做了修改,并且已经重启了 &python;。如果你的默认编码还是<literal>'ascii'</literal>,可能你就没有正确设置<filename>sitecustomize.py</filename> 文件,或者是没有重新启动 &python;。默认的编码只会在 &python; 启动的时候改变;之后就不能改变了。(由于一些古怪的编程技巧,我没有马上深入,你甚至不能在 &python; 启动之后调用<function>sys.setdefaultencoding</function>函数。仔细研究<filename>site.py</filename>,并搜索<quote><literal>setdefaultencoding</literal></quote>去发现为什么吧。)</para>  
    574 578 </callout>  
    575 579 <callout arearefs="kgp.unicode.4.2">  
    576   <para>Now that the default encoding scheme includes all the characters you use in your string, &python; has no problem auto-coercing the string and printing it.</para>  
      580 <para>现在默认的编码模式已经包含了你在字符串中使用的所有字符,&python; 对字符串的自动强制转换和打印就不存在问题了。</para>  
    576 580 </callout>  
    577 581 </calloutlist>  
    578 582 </example>  
    579 583 <example>  
    580   <title>Specifying encoding in <filename>.py</filename> files</title>  
    581   <para>If you are going to be storing non-ASCII strings within your &python; code, you'll need to specify the encoding of each individual <filename>.py</filename> file by putting an encoding declaration at the top of each file.  This declaration defines the <filename>.py</filename> file to be UTF-8:</para>  
      584 <title>指定<filename>.py</filename>文件的编码</title>  
      585 <para>如果你打算在你的 &python; 代码中保存非 &ascii; 字符串,你需要在每个文件的顶端加入编码声明来指定每个<filename>.py</filename>文件的编码。这个声明定义了<filename>.py</filename>文件的编码为UTF-8:</para>  
    582 586 <programlisting>  
    583 587 #!/usr/bin/env python  
     
    587 591 </programlisting>  
    588 592 </example>  
    589   <para>Now, what about &xml;?  Well, every &xml; document is in a specific encoding.  Again, ISO-8859-1 is a popular encoding for data in Western European languages.  KOI8-R is popular for Russian texts.  The encoding, if specified, is in the header of the &xml; document.</para>  
      593 <para>现在,想想 &xml; 中的编码应该是怎样的呢?不错的是,每一个 &xml; 文档都有指定的编码。重复一下,ISO-8859-1是西欧语言存放数据的流行编码方式。KOI8-R是俄语流行的编码方式。编码,如果指定了的话,都在 &xml; 文档的首部。</para>  
    589 593 <example>  
    590 594 <title><filename>russiansample.xml</filename></title>  
     
    597 601 <calloutlist>  
    598 602 <callout arearefs="kgp.unicode.5.1">  
    599   <para>This is a sample extract from a real Russian &xml; document; it's part of a Russian translation of this very book.  Note the encoding, <literal>koi8-r</literal>, specified in the header.</para>  
      603 <para>这是从一个真实的俄语 &xml; 文档中提取出来的示例;它就是这本书俄语翻译版的一部分。注意,编码<literal>koi8-r</literal>是在首部指定的。</para>  
    599 603 </callout>  
    600 604 <callout arearefs="kgp.unicode.5.2">  
    601   <para>These are Cyrillic characters which, as far as I know, spell the Russian word for <quote>Preface</quote>.  If you open this file in a regular text editor, the characters will most likely look like gibberish, because they're encoded using the <literal>koi8-r</literal> encoding scheme, but they're being displayed in <literal>iso-8859-1</literal>.</para>  
      605 <para>这些是西里尔字符,就我所知,它们拼写的是俄语中的<quote>Preface</quote>(序言)一词。如果你在一个普通文本编辑器中打开这个文件,这些字符很可能显示为乱码,因为它们使用了<literal>koi8-r</literal>编码模式进行编码,但是却以<literal>iso-8859-1</literal>编码模式进行显示。</para>  
    601 605 </callout>  
    602 606 </calloutlist>  
    603 607 </example>  
    604 608 <example>  
    605   <title>Parsing <filename>russiansample.xml</filename></title>  
      609 <title>解析<filename>russiansample.xml</filename></title>  
    605 609 <screen>  
    606 610 &prompt;<userinput>from xml.dom import minidom</userinput>  
     
    622 626 <calloutlist>  
    623 627 <callout arearefs="kgp.unicode.6.1">  
    624   <para>I'm assuming here that you saved the previous example as <filename>russiansample.xml</filename> in the current directory.  I am also, for the sake of completeness, assuming that you've changed your default encoding back to <literal>'ascii'</literal> by removing your <filename>sitecustomize.py</filename> file, or at least commenting out the <function>setdefaultencoding</function> line.</para>  
      628 <para>我假设在这里你将前一个例子以<filename>russiansample.xml</filename>为名保存在当前目录中。也出于完整性的考虑,我假设你已经删除了<filename>sitecustomize.py</filename>文件,将缺省编码改回到<literal>'ascii'</literal>,或至少将<function>setdefaultencoding</function>一行注释起来了。</para>  
    624 628 </callout>  
    625 629 <callout arearefs="kgp.unicode.6.2">  
    626   <para>Note that the text data of the <sgmltag>title</sgmltag> tag (now in the <varname>title</varname> variable, thanks to that long concatenation of &python; functions which I hastily skipped over and, annoyingly, won't explain until the next section) -- the text data inside the &xml; document's <sgmltag>title</sgmltag> element is stored in unicode.</para>  
      630 <para>注意<sgmltag>title</sgmltag>标记的文本数据(现在在<varname>title</varname>变量中,这要归功于那一长串 &python; 函数的串联,我匆匆地将它跳了过去,而且恼人的是,要到下一节才会解释)--在 &xml; 文档的<sgmltag>title</sgmltag>元素中的文本数据是以unicode保存的。</para>  
    626 630 </callout>  
    627 631 <callout arearefs="kgp.unicode.6.3">  
    628   <para>Printing the title is not possible, because this unicode string contains non-&ascii; characters, so &python; can't convert it to &ascii; because that doesn't make sense.</para>  
      632 <para>打印title是不可能的,因为这个unicode字符串包含了非 &ascii; 字符,&python; 无法把它转换为 &ascii;,因为这样的转换没有意义。</para>  
    628 632 </callout>  
    629 633 <callout arearefs="kgp.unicode.6.4">  
    630   <para>You can, however, explicitly convert it to <literal>koi8-r</literal>, in which case you get a (regular, not unicode) string of single-byte characters (<literal>f0</literal>, <literal>d2</literal>, <literal>c5</literal>, and so forth) that are the <literal>koi8-r</literal>-encoded versions of the characters in the original unicode string.</para>  
      634 <para>不过,你可以显式地将它转换为<literal>koi8-r</literal>,这样就会得到一个(正常的,非unicode)单字节字符的字符串(<literal>f0</literal>, <literal>d2</literal>, <literal>c5</literal>,等等),它是初始unicode字符串中各字符的<literal>koi8-r</literal>编码版本。</para>  
    630 634 </callout>  
    631 635 <callout arearefs="kgp.unicode.6.5">  
    632   <para>Printing the <literal>koi8-r</literal>-encoded string will probably show gibberish on your screen, because your &python; &ide; is interpreting those characters as <literal>iso-8859-1</literal>, not <literal>koi8-r</literal>.  But at least they do print.  (And, if you look carefully, it's the same gibberish that you saw when you opened the original &xml; document in a non-unicode-aware text editor.  &python; converted it from <literal>koi8-r</literal> into unicode when it parsed the &xml; document, and you've just converted it back.)</para>  
      636 <para>打印<literal>koi8-r</literal>编码的字符串有可能会在你的屏幕上显示为乱码,因为你的 &python; &ide; 将这些字符作为  
      637 <literal>iso-8859-1</literal>的编码进行解析,而不是<literal>koi8-r</literal>编码。但是,至少它们能打印。(并且,如果你仔细看,当在一个不支持unicode的文本编辑器中打开最初的 &xml; 文档时,会看到相同的乱码。 &python; 在解析 &xml; 文档时,将它从<literal>koi8-r</literal>转换到了unicode,你只不过是将它转换回来。)</para>  
    633 638 </callout>  
    634 639 </calloutlist>  
    635 640 </example>  
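The round trip described above (KOI8-R bytes in the document, unicode after parsing, KOI8-R bytes again after an explicit `encode`) can be sketched like this. It is an illustration under stated assumptions: Python 3, and a tiny in-memory stand-in for <filename>russiansample.xml</filename> whose structure (`preface` and `title` elements) is only assumed here, since the sample file itself is not reproduced in this section.

```python
from xml.dom import minidom

# Hypothetical stand-in for russiansample.xml, built in memory as KOI8-R bytes.
document = (
    '<?xml version="1.0" encoding="koi8-r"?>'
    '<preface><title>Предисловие</title></preface>'
).encode('koi8-r')

# The parser reads the encoding from the XML header and hands back unicode.
xmldoc = minidom.parseString(document)
title = xmldoc.getElementsByTagName('title')[0].firstChild.data

# Explicitly converting back yields single-byte KOI8-R data (f0, d2, c5, ...).
koi8_title = title.encode('koi8-r')
```

Whether the bytes display correctly afterwards still depends on the encoding your terminal or editor assumes; the parsing and encoding steps themselves do not.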
    636   <para>To sum up, unicode itself is a bit intimidating if you've never seen it before, but unicode data is really very easy to handle in &python;.  If your &xml; documents are all 7-bit &ascii; (like the examples in this chapter), you will literally never think about unicode.  &python; will convert the &ascii; data in the &xml; documents into unicode while parsing, and auto-coerce it back to &ascii; whenever necessary, and you'll never even notice.  But if you need to deal with that in other languages, &python; is ready.</para>  
      641 <para>总结一下,如果你以前从没有看到过unicode,倒是有些唬人,但是在 &python; 处理unicode数据真是非常容易。如果你的 &xml; 文档都是7位的 &ascii;(像本章中的例子),你差不多永远都不用考虑unicode。&python; 在进行解析时会将 &xml; 文档中的 &ascii; 数据转换为unicode,在任何需要的时候强制转换回为 &ascii;,你甚至永远都不用注意。但是如果你要处理其它语言的数据,&python; 已经准备好了。</para>  
    636 641 <itemizedlist role="furtherreading">  
    637   <title>Further reading</title>  
    638   <listitem><para><ulink url="&url_unicode;">Unicode.org</ulink> is the home page of the unicode standard, including a brief <ulink url="&url_unicodetech;">technical introduction</ulink>.</para></listitem>  
    639   <listitem><para><ulink url="&url_unicodetutorial;">Unicode Tutorial</ulink> has some more examples of how to use &python;'s unicode functions, including how to force &python; to coerce unicode into &ascii; even when it doesn't really want to.</para></listitem>  
    640   <listitem><para><ulink url="http://www.python.org/peps/pep-0263.html">PEP 263</ulink> goes into more detail about how and when to define a character encoding in your <filename>.py</filename> files.</para></listitem>  
      642 <title>进一步阅读</title>  
      643 <listitem><para><ulink url="&url_unicode;">Unicode.org</ulink>是unicode标准的主页,包含了一个简要的<ulink url="&url_unicodetech;">技术简介</ulink>。</para></listitem>  
      644 <listitem><para><ulink url="&url_unicodetutorial;">Unicode教程</ulink>有更多关于如何使用 &python; unicode函数的例子,包括如何在 &python; 并不情愿的情况下强制它将unicode转换为 &ascii;。</para></listitem>  
      645 <listitem><para><ulink url="http://www.python.org/peps/pep-0263.html">PEP 263</ulink>更详细地介绍了何时、如何在你的<filename>.py</filename>文件中定义字符编码。</para></listitem>  
    641 646 </itemizedlist>  
    642 647 </section>  
    643 648 <section id="kgp.search">  
    644 649 <?dbhtml filename="xml_processing/searching.html"?>  
    645   <title>Searching for elements</title>  
      650 <title>搜索元素</title>  
    645 650 <abstract>  
    646 651 <title/>  
    647   <para>Traversing &xml; documents by stepping through each node can be tedious.  If you're looking for something in particular, buried deep within your &xml; document, there is a shortcut you can use to find it quickly: &getelementsbytagname_functionname;.</para>  
      652 <para>通过一步步访问每一个节点的方式遍历 &xml; 文档可能很乏味。如果你正在寻找些特别的东西,又恰恰它们深深埋入了你的 &xml; 文档,有个捷径让你可以快速找到它:&getelementsbytagname_functionname; 。</para>  
    647 652 </abstract>  
    648   <para>For this section, you'll be using the &binaryxml_filename; grammar file, which looks like this:</para>  
      653 <para>在这部分,将使用 &binaryxml_filename; 语法文件,它看上去是这样的:</para>  
    648 653 <example>  
    649 654 <title>&binaryxml_filename;</title>  
     
    669 674 &lt;/grammar></computeroutput></screen>  
    670 675 </example>  
    671   <para>It has two &refnode;s, <literal>'bit'</literal> and <literal>'byte'</literal>.  A <literal>bit</literal> is either a <literal>'0'</literal> or <literal>'1'</literal>, and a <literal>byte</literal> is 8 <literal>bit</literal>s.</para>  
      676 <para>它有两个 &refnode;,<literal>'bit'</literal>和<literal>'byte'</literal>。一个<literal>bit</literal>是<literal>'0'</literal>或者<literal>'1'</literal>,而一个<literal>byte</literal>是8个<literal>bit</literal>。</para>  
    671 676 <example>  
    672   <title>Introducing &getelementsbytagname_functionname;</title>  
      677 <title>&getelementsbytagname_functionname; 介绍</title>  
    672 677 <screen>  
    673 678 &prompt;<userinput>from xml.dom import minidom</userinput>  
     
    691 696 <calloutlist>  
    692 697 <callout arearefs="kgp.search.1.1">  
    693   <para>&getelementsbytagname_functionname; takes one argument, the name of the element you wish to find.  It returns a list of &element_classname; objects, corresponding to the &xml; elements that have that name.  In this case, you find two <literal>ref</literal> elements.</para>  
      698 <para>&getelementsbytagname_functionname; 接收一个参数,即要找的元素的名称。它返回一个 &element_classname;  对象的列表,列表中的对象都是有指定名称的 &xml; 元素。在本例中,你能找到两个<literal>ref</literal>元素。</para>  
    693 698 </callout>  
    694 699 </calloutlist>  
    695 700 </example>  
    696 701 <example>  
    697   <title>Every element is searchable</title>  
      702 <title>每个元素都是可搜索的</title>  
    697 702 <screen>  
    698 703 &prompt;<userinput>firstref = reflist[0]</userinput>                      <co id="kgp.search.2.1"/>  
     
    713 718 <calloutlist>  
    714 719 <callout arearefs="kgp.search.2.1">  
    715   <para>Continuing from the previous example, the first object in your <varname>reflist</varname> is the <literal>'bit'</literal> &refnode; element.</para>  
      720 <para>继续前面的例子,在<varname>reflist</varname>中的第一个对象是<literal>'bit'</literal> &refnode;元素。</para>  
    715 720 </callout>  
    716 721 <callout arearefs="kgp.search.2.2">  
    717   <para>You can use the same &getelementsbytagname_functionname; method on this &element_classname; to find all the <sgmltag>&lt;p></sgmltag> elements within the <literal>'bit'</literal> &refnode; element.</para>  
      722 <para>你可以在这个 &element_classname; 上使用相同的 &getelementsbytagname_functionname; 方法来寻找所有在<literal>'bit'</literal> &refnode; 元素中的<sgmltag>&lt;p></sgmltag>元素。</para>  
    717 722 </callout>  
    718 723 <callout arearefs="kgp.search.2.3">  
    719   <para>Just as before, the &getelementsbytagname_functionname; method returns a list of all the elements it found.  In this case, you have two, one for each bit.</para>  
      724 <para>和前面一样,&getelementsbytagname_functionname; 方法返回一个找到元素的列表。在本例中,你有两个,每“位”使用一个。</para>  
    719 724 </callout>  
    720 725 </calloutlist>  
    721 726 </example>  
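The two searches shown so far can be reproduced in a self-contained sketch. The grammar string below is a simplified stand-in for &binaryxml_filename; (an assumption: the DOCTYPE is dropped and the `byte` definition is shortened to two `xref` elements), parsed from memory with `minidom.parseString` under Python 3.

```python
from xml.dom import minidom

# Simplified stand-in for binary.xml; not the full grammar file.
grammar = """<grammar>
<ref id="bit">
  <p>0</p>
  <p>1</p>
</ref>
<ref id="byte">
  <p><xref id="bit"/><xref id="bit"/></p>
</ref>
</grammar>"""

xmldoc = minidom.parseString(grammar)

# Search the whole document for <ref> elements.
reflist = xmldoc.getElementsByTagName('ref')

# Every Element is searchable too: look for <p> inside the first <ref> only.
firstref = reflist[0]
plist = firstref.getElementsByTagName('p')
bits = [p.firstChild.data for p in plist]

# Searching from the document object reaches every <p>, however deeply nested.
all_p = xmldoc.getElementsByTagName('p')
```

The only difference between the last two searches is the node you start from: the search covers that node's descendants and nothing outside it.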
    722 727 <example>  
    723   <title>Searching is actually recursive</title>  
      728 <title>搜索实际上是递归的</title>  
    723 728 <screen>  
    724 729 &prompt;<userinput>plist = xmldoc.getElementsByTagName("p")</userinput> <co id="kgp.search.3.1"/>  
     
    738 743 <calloutlist>  
    739 744 <callout arearefs="kgp.search.3.1">  
    740   <para>Note carefully the difference between this and the previous example.  Previously, you were searching for &pnode; elements within <varname>firstref</varname>, but here you are searching for &pnode; elements within <varname>xmldoc</varname>, the root-level object that represents the entire &xml; document.  This <emphasis>does</emphasis> find the &pnode; elements nested within the &refnode; elements within the root &grammarnode; element.</para>  
      745 <para>仔细注意这个例子和前面例子之间的不同。前面,你是在<varname>firstref</varname>中搜索 &pnode; 元素,但是这里你是在<varname>xmldoc</varname>中搜索 &pnode; 元素,<varname>xmldoc</varname>是代表了整个 &xml; 文档的根层对象。这样<emphasis>就会</emphasis>找到嵌套在 &refnode; 元素(它嵌套在根 &grammarnode; 元素中)中的 &pnode; 元素。</para>  
    740 745 </callout>  
    741 746 <callout arearefs="kgp.search.3.2">  
    742   <para>The first two &pnode; elements are within the first &refnode; (the <literal>'bit'</literal> &refnode;).</para>  
      747 <para>前两个 &pnode; 元素在第一个 &refnode; 内(<literal>'bit'</literal> &refnode;)。</para>  
    742 747 </callout>  
    743 748 <callout arearefs="kgp.search.3.3">  
    744   <para>The last &pnode; element is the one within the second &refnode; (the <literal>'byte'</literal> &refnode;).</para>  
      749 <para>后一个 &pnode; 元素在第二个 &refnode; 中(<literal>'byte'</literal> &refnode;)。</para>  
    744 749 </callout>  
    745 750 </calloutlist>  
     
    751 756 <section id="kgp.attributes">  
    752 757 <?dbhtml filename="xml_processing/attributes.html"?>  
    753   <title>Accessing element attributes</title>  
      758 <title>访问元素属性</title>  
    753 758 <abstract>  
    754 759 <title/>  
    755   <para>&xml; elements can have one or more attributes, and it is incredibly simple to access them once you have parsed an &xml; document.</para>  
      760 <para>&xml; 元素可以有一个或者多个属性,一旦你已经解析了一个 &xml; 文档,访问它们就太简单了。</para>  
    755 760 </abstract>  
    756   <para>For this section, you'll be using the &binaryxml_filename; grammar file that you saw in the <link linkend="kgp.search">previous section</link>.</para>  
      761 <para>在这部分中,将使用 &binaryxml_filename; 语法文件,你在<link linkend="kgp.search">上一节</link>中已经看到过了。</para>  
    756 761 <note>  
    757   <title>&xml; attributes and &python; attributes</title>  
    758   <para>This section may be a little confusing, because of some overlapping terminology.  Elements in an &xml; document have attributes, and &python; objects also have attributes.  When you parse an &xml; document, you get a bunch of &python; objects that represent all the pieces of the &xml; document, and some of these &python; objects represent attributes of the &xml; elements.  But the (&python;) objects that represent the (&xml;) attributes also have (&python;) attributes, which are used to access various parts of the (&xml;) attribute that the object represents.  I told you it was confusing.  I am open to suggestions on how to distinguish these more clearly.</para>  
      762 <title>&xml; 属性和&python; 属性</title>  
      763 <para>这部分由于某个涵义重叠的术语可能让人有点糊涂。在一个 &xml; 文档中,元素可以有属性,而 &python; 对象也有属性。当你解析一个 &xml; 文档时,你得到了一组 &python; 对象,它们代表 &xml; 文档中的所有片段,同时有些 &python; 对象代表 &xml; 元素的属性。但是表示(&xml;)属性的(&python;)对象也有(&python;)属性,它们用于访问该对象所表示的(&xml;)属性的各个部分。我告诉过你它让人糊涂。我乐于接受关于如何更清楚地区分它们的建议。  
      764 </para>  
    759 765 </note>  
    760 766 <example>  
    761   <title>Accessing element attributes</title>  
      767 <title>访问元素属性</title>  
    761 767 <screen>  
    762 768 &prompt;<userinput>xmldoc = minidom.parse('binary.xml')</userinput>  
     
    782 788 <calloutlist>  
    783 789 <callout arearefs="kgp.attributes.1.1">  
    784   <para>Each &element_classname; object has an attribute called <literal>attributes</literal>, which is a &namednodemap_classname; object.  This sounds scary, but it's not, because a &namednodemap_classname; is an object that <link linkend="fileinfo.userdict">acts like a dictionary</link>, so you already know how to use it.</para>  
      790 <para>每个 &element_classname; 对象都有一个称为<literal>attributes</literal>的属性,它是一个 &namednodemap_classname;  对象。听上去挺吓人的,其实不然,因为 &namednodemap_classname; 是一个<link linkend="fileinfo.userdict">行为像字典</link>的对象,所以你已经知道怎么使用它了。</para>  
    784 790 </callout>  
    785 791 <callout arearefs="kgp.attributes.1.2">  
    786   <para>Treating the &namednodemap_classname; as a dictionary, you can get a list of the names of the attributes of this element by using <function>attributes.keys()</function>.  This element has only one attribute, <literal>'id'</literal>.</para>  
      792 <para>将 &namednodemap_classname; 视为一个字典,你可以通过<function>attributes.keys()</function>获得属性名称的一个列表。这个元素只有一个属性,<literal>'id'</literal>。</para>  
    786 792 </callout>  
    787 793 <callout arearefs="kgp.attributes.1.3">  
    788   <para>Attribute names, like all other text in an &xml; document, are stored in <link linkend="kgp.unicode">unicode</link>.</para>  
      794 <para>属性名称,像其它 &xml; 文档中的文本一样,都是以<link linkend="kgp.unicode">unicode</link>保存的。</para>  
    788 794 </callout>  
    789 795 <callout arearefs="kgp.attributes.1.4">  
    790   <para>Again treating the &namednodemap_classname; as a dictionary, you can get a list of the values of the attributes by using <function>attributes.values()</function>.  The values are themselves objects, of type &attr_classname;.  You'll see how to get useful information out of this object in the next example.</para>  
      796 <para>再次将 &namednodemap_classname; 视为一个字典,你可以通过<function>attributes.values()</function>获取属性值的一个列表。这些值本身是 &attr_classname; 类型的对象。你将在下一个例子中看到如何获取对象的有用信息。</para>  
    790 796 </callout>  
    791 797 <callout arearefs="kgp.attributes.1.5">  
    792   <para>Still treating the &namednodemap_classname; as a dictionary, you can access an individual attribute by name, using normal dictionary syntax.  (Readers who have been paying extra-close attention will already know how the &namednodemap_classname; class accomplishes this neat trick: by defining a <link linkend="fileinfo.specialmethods">&getitem; special method</link>.  Other readers can take comfort in the fact that they don't need to understand how it works in order to use it effectively.)</para>  
      798 <para>仍然把 &namednodemap_classname; 视为一个字典,你可以通过常用的字典语法和名称访问单个的属性。(那些非常认真的读者将已经知道 &namednodemap_classname; 类是如何实现这一技巧的:通过定义一个<link linkend="fileinfo.specialmethods">&getitem; 特殊方法</link>。其他读者可以欣慰地接受这一事实:他们不需要理解它是如何工作的就可以有效地使用它。)</para>  
    792 798 </callout>  
    793 799 </calloutlist>  
    794 800 </example>  
    795 801 <example>  
    796   <title>Accessing individual attributes</title>  
      802 <title>访问单个属性</title>  
    796 802 <screen>  
    797 803 &prompt;<userinput>a = bitref.attributes["id"]</userinput>  
     
    810 816 <calloutlist>  
    811 817 <callout arearefs="kgp.attributes.2.1">  
    812   <para>The &attr_classname; object completely represents a single &xml; attribute of a single &xml; element.  The name of the attribute (the same name as you used to find this object in the <literal>bitref.attributes</literal> &namednodemap_classname; pseudo-dictionary) is stored in <literal>a.name</literal>.</para>  
      818 <para>&attr_classname; 对象完整代表了单个 &xml; 元素的单个 &xml; 属性。属性的名称(也就是你在<literal>bitref.attributes</literal> &namednodemap_classname; 伪字典中用来查找这个对象的名称)保存在<literal>a.name</literal>中。</para>  
    812 818 </callout>  
    813 819 <callout arearefs="kgp.attributes.2.2">  
    814   <para>The actual text value of this &xml; attribute is stored in <literal>a.value</literal>.</para>  
      820 <para>这个 &xml; 属性的真实文本值保存在<literal>a.value</literal>中。</para>  
    814 820 </callout>  
    815 821 </calloutlist>  
    816 822 </example>  
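The attribute access walked through above can be sketched end to end. This assumes Python 3 and a one-element stand-in for &binaryxml_filename; (the inline document string is made up for the illustration); `attributes`, `keys()`, `name`, `value` and `getAttribute` are the minidom names, the rest are local variables.

```python
from xml.dom import minidom

# Minimal stand-in document with a single <ref id="bit"> element.
xmldoc = minidom.parseString('<grammar><ref id="bit"><p>0</p></ref></grammar>')
bitref = xmldoc.getElementsByTagName('ref')[0]

# The NamedNodeMap acts like a dictionary keyed by attribute name.
names = list(bitref.attributes.keys())

# Dictionary-style lookup returns an Attr object, not the text itself.
a = bitref.attributes['id']
attr_name = a.name    # the attribute's name
attr_value = a.value  # the attribute's text value

# getAttribute is a shortcut when only the text value is wanted.
shortcut = bitref.getAttribute('id')
```

Looking attributes up by name, as here, also sidesteps any reliance on the order in which they happen to be listed.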
    817 823 <note>  
    818   <title>Attributes have no order</title>  
    819   <para>Like a dictionary, attributes of an &xml; element have no ordering.  Attributes may <emphasis>happen to be</emphasis> listed in a certain order in the original &xml; document, and the &attr_classname; objects may <emphasis>happen to be</emphasis> listed in a certain order when the &xml; document is parsed into &python; objects, but these orders are arbitrary and should carry no special meaning.  You should always access individual attributes by name, like the keys of a dictionary.</para>  
      824 <title>属性没有顺序</title>  
      825 <para>类似于字典,一个 &xml; 元素的属性没有顺序。属性可以以某种顺序<emphasis>偶然</emphasis>列在最初的 &xml; 文档中,而在 &xml; 文档解析为 &python; 对象时,&attr_classname; 对象以某种顺序<emphasis>偶然</emphasis>列出,这些顺序都是任意的,没有任何特别的含义。你应该总是使用名称来访问单个属性,就像字典的键一样。</para>  
    820 826 </note>  
    821 827 </section>  
     
    827 833 <abstract>  
    828 834 <title/>  
    829   <para>OK, that's it for the hard-core XML stuff.  The next chapter will continue to use these same example programs, but focus on other aspects that make the program more flexible: using streams for input processing, using &getattr; for method dispatching, and using command-line flags to allow users to reconfigure the program without changing the code.</para>  
      835 <para>好了,这就是 XML 的核心内容。下一章将继续使用相同的示例程序,但是焦点在于能使程序更加灵活的其它方面:使用输入流处理,使用 &getattr; 进行方法分发,并使用命令行标识允许用户重新配置程序而无需修改代码。</para>  
    829 835 </abstract>  
    830   <para>Before moving on to the next chapter, you should be comfortable doing all of these things:</para>  
      836 <para>在进入下一章前,你应该能够轻松地完成这些事情:</para>  
    830 836 <itemizedlist>  
    831   <listitem><para><link linkend="kgp.parse">Parsing &xml; documents</link> using &minidom_modulename;, <link linkend="kgp.search">searching through the parsed document</link>, and accessing arbitrary <link linkend="kgp.attributes">element attributes</link> and <link linkend="kgp.child">element children</link></para></listitem>  
    832   <listitem><para>Organizing complex libraries into <link linkend="kgp.packages">packages</link></para></listitem>  
    833   <listitem><para><link linkend="kgp.unicode">Converting unicode strings</link> to different character encodings</para></listitem>  
      837 <listitem><para>使用 &minidom_modulename; <link linkend="kgp.parse">解析 &xml; 文档</link> ,<link linkend="kgp.search">搜索已解析文档</link>,并访问任意的<link linkend="kgp.attributes">元素属性</link>和<link linkend="kgp.child">子元素</link></para></listitem>  
      838 <listitem><para>将复杂的库组织为<link linkend="kgp.packages">包</link></para></listitem>  
      839 <listitem><para>将<link linkend="kgp.unicode">unicode字符串转换</link>为不同的字符编码</para></listitem>  
    834 840 </itemizedlist>  
    835 841 </section>  
     
    841 847 <!-- You can only enter the same stream once. -->  
    842 848 <title>Scripts and Streams</title>  
    843   <titleabbrev id="streams.numberonly">Chapter 10</titleabbrev>  
      849 <titleabbrev id="streams.numberonly">第十章</titleabbrev>  
    843 849 <section id="kgp.openanything">  
    844 850 <?dbhtml filename="scripts_and_streams/input_sources.html"?>  
    845   <title>Abstracting input sources</title>  
      851 <title>抽象输入源</title>  
    845 851 <abstract>  
    846 852 <title/>  
    847   <para>One of &python;'s greatest strengths is its dynamic binding, and one powerful use of dynamic binding is the <emphasis>file-like object</emphasis>.</para>  
      853 <para>&python; 最大的优势之一是它的动态绑定,而动态绑定最强大的用法之一就是<emphasis>类文件(file-like)对象</emphasis>。</para>
    847 853 </abstract>  
    848   <para>Many functions which require an input source could simply take a filename, go open the file for reading, read it, and close it when they're done.  But they don't.  Instead, they take a <emphasis>file-like object</emphasis>.</para>  
    849   <para>In the simplest case, a <emphasis>file-like object</emphasis> is any object with a &read; method with an optional <varname>size</varname> parameter, which returns a string.  When called with no <varname>size</varname> parameter, it reads everything there is to read from the input source and returns all the data as a single string.  When called with a <varname>size</varname> parameter, it reads that much from the input source and returns that much data; when called again, it picks up where it left off and returns the next chunk of data.</para>  
    850   <para>This is how <link linkend="fileinfo.files">reading from real files</link> works; the difference is that you're not limiting yourself to real files.  The input source could be anything: a file on disk, a web page, even a hard-coded string.  As long as you pass a file-like object to the function, and the function simply calls the object's &read; method, the function can handle any kind of input source without specific code to handle each kind.</para>  
    851   <para>In case you were wondering how this relates to &xml; processing, &minidomparse_functionname; is one such function which can take a file-like object.</para>  
      854 <para>  
      855 许多需要输入源的函数可以只接收一个文件名,并以读方式打开文件,读取文件,处理完成后关闭它。其实它们不是这样的,而是接收一个<emphasis>类文件对象</emphasis>。</para>  
      856 <para>在最简单的例子中,<emphasis>类文件对象</emphasis>是任意一个带有 &read; 方法的对象,这个方法带有一个可选的<varname>size</varname>参数,并返回一个字符串。调用时如果没有<varname>size</varname>参数,它从输入源中读取所有东西并将所有数据作为单个字符串返回。调用时如果指定了<varname>size</varname>参数,它将从输入源中读取<varname>size</varname>大小的数据并返回这些数据;再次调用的时候,它从余下的地方开始并返回下一块数据。</para>  
      857 <para>这就是<link linkend="fileinfo.files">从真实文件读取数据</link>的工作方式;区别在于你不必把自己局限于真实的文件。输入源可以是任何东西:磁盘上的文件、一个 web 页面,甚至是一个硬编码的字符串。只要你将一个类文件对象传递给函数,而函数只是调用这个对象的 &read; 方法,那么函数就可以处理任何类型的输入源,而不需要为每种类型编写专门的处理代码。
      858 </para>  
      859 <para>你可能纳闷过这和 &xml; 处理有什么关系,其实 &minidomparse_functionname; 就是一个可以接收类文件对象的函数。</para>  
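To make the idea of a file-like object concrete, here is a minimal sketch in Python 3 (the book's own listings are Python 2). The class name `HardCodedSource` is hypothetical and exists only for illustration; the sketch assumes nothing beyond a `read` method with an optional `size` argument, which is all that `minidom.parse` asks of its input.

```python
from xml.dom import minidom


class HardCodedSource:
    """Hypothetical file-like object: the only thing it offers is read()."""

    def __init__(self, text):
        self._text = text
        self._pos = 0  # how far previous reads have advanced

    def read(self, size=-1):
        # No size (or a negative one): return everything not yet read.
        if size is None or size < 0:
            size = len(self._text) - self._pos
        chunk = self._text[self._pos:self._pos + size]
        self._pos += len(chunk)  # the next call picks up where this one left off
        return chunk


source = HardCodedSource("<grammar><ref id='bit'><p>0</p><p>1</p></ref></grammar>")
xmldoc = minidom.parse(source)  # parse() only ever calls source.read()
root_name = xmldoc.documentElement.tagName
print(root_name)
```

The parser never checks what kind of object it was given; any object whose `read` behaves this way would do.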
    852 860 <example>  
    853   <title>Parsing &xml; from a file</title>  
      861 <title>从文件中解析 &xml; </title>  
    853 861 <screen>  
    854 862 &prompt;<userinput>from xml.dom import minidom</userinput>  
     
    874 882 <calloutlist>  
    875 883 <callout arearefs="kgp.openanything.1.1">  
    876   <para>First, you open the file on disk.  This gives you a <link linkend="fileinfo.files">file object</link>.</para>  
      884 <para>首先,你要打开一个磁盘上的文件。这会提供给你一个<link linkend="fileinfo.files">文件对象</link>。</para>  
    876 884 </callout>  
    877 885 <callout arearefs="kgp.openanything.1.2">  
    878   <para>You pass the file object to &minidomparse_functionname;, which calls the &read; method of <varname>fsock</varname> and reads the &xml; document from the file on disk.</para>  
      886 <para>将文件对象传递给 &minidomparse_functionname; ,它调用<varname>fsock</varname>的 &read; 方法并从磁盘上的文件读取 &xml; 文档。</para>  
    878 886 </callout>  
    879 887 <callout arearefs="kgp.openanything.1.3">  
    880   <para>Be sure to call the &close; method of the file object after you're done with it.  &minidomparse_functionname; will not do this for you.</para>  
      888 <para>处理完文件之后,务必调用文件对象的 &close; 方法。&minidomparse_functionname; 不会替你做这件事。</para>
    880 888 </callout>  
    881 889 <callout arearefs="kgp.openanything.1.4">  
    882   <para>Calling the <methodname>toxml()</methodname> method on the returned &xml; document prints out the entire thing.</para>  
      890 <para>在返回的 &xml; 文档上调用<methodname>toxml()</methodname>方法,打印出整个文档的内容。</para>  
    882 890 </callout>  
    883 891 </calloutlist>  
    884 892 </example>  
    885   <para>Well, that all seems like a colossal waste of time.  After all, you've already seen that &minidomparse_functionname; can simply take the filename and do all the opening and closing nonsense automatically.  And it's true that if you know you're just going to be parsing a local file, you can pass the filename and &minidomparse_functionname; is smart enough to <trademark>Do The Right Thing</trademark>.  But notice how similar -- and easy -- it is to parse an &xml; document straight from the Internet.</para>  
      893 <para>嗯,所有这些看上去像是在浪费大量的时间。毕竟,你已经看到 &minidomparse_functionname; 可以只接收文件名,并自动完成所有打开和关闭文件的琐事。不错,如果你知道要解析的只是一个本地文件,你可以传递文件名,而 &minidomparse_functionname; 足够聪明,会<trademark>做正确的事情</trademark>。但是请注意,直接解析来自 Internet 的 &xml; 文档是多么相似,又是多么容易。</para>
    885 893 <example id="kgp.openanything.urllib">  
    886   <title>Parsing &xml; from a &url;</title>  
      894 <title>解析来自 &url; 的 &xml; </title>  
    886 894 <screen>  
    887 895 &prompt;<userinput>import urllib</userinput>  
     
    920 928 <calloutlist>  
    921 929 <callout arearefs="kgp.openanything.2.1">  
    922   <para>As you saw <link linkend="dialect.extract.urllib">in a previous chapter</link>, &urlopen; takes a web page &url; and returns a file-like object.  Most importantly, this object has a &read; method which returns the &html; source of the web page.</para>  
      930 <para>正如你在<link linkend="dialect.extract.urllib">前面的章节</link>中看到的,&urlopen; 接收一个 web 页面的 &url; 作为参数并返回一个类文件对象。最重要的是,这个对象有一个 &read; 方法,它返回 web 页面的 &html; 源代码。</para>
    922 930 </callout>  
    923 931 <callout arearefs="kgp.openanything.2.2">  
    924   <para>Now you pass the file-like object to &minidomparse_functionname;, which obediently calls the &read; method of the object and parses the &xml; data that the &read; method returns.  The fact that this &xml; data is now coming straight from a web page is completely irrelevant.  &minidomparse_functionname; doesn't know about web pages, and it doesn't care about web pages; it just knows about file-like objects.</para>  
      932 <para>现在把类文件对象传递给 &minidomparse_functionname;,它顺从地调用对象的 &read; 方法并解析 &read; 方法返回的 &xml; 数据。这些 &xml; 数据现在直接来自 web 页面,但这一事实完全无关紧要。&minidomparse_functionname; 并不了解 web 页面,也不关心 web 页面;它只知道类文件对象。</para>
    924 932 </callout>  
    925 933 <callout arearefs="kgp.openanything.2.3">  
    926   <para>As soon as you're done with it, be sure to close the file-like object that &urlopen; gives you.</para>  
      934 <para>一旦处理完毕,务必关闭 &urlopen; 提供给你的类文件对象。</para>
    926 934 </callout>  
    927 935 <callout arearefs="kgp.openanything.2.4">  
    928   <para>By the way, this &url; is real, and it really is &xml;.  It's an &xml; representation of the current headlines on <ulink url="http://slashdot.org/">Slashdot</ulink>, a technical news and gossip site.</para>  
      936 <para>顺便提一句,这个 &url; 是真实的,它返回的也确实是 &xml;。它是 <ulink url="http://slashdot.org/">Slashdot</ulink>(一个技术新闻和八卦站点)上当前新闻标题的 &xml; 表示。</para>
    928 936 </callout>  
    929 937 </calloutlist>  
    930 938 </example>  
    931 939 <example>  
    932   <title>Parsing &xml; from a string (the easy but inflexible way)</title>  
      940 <title>解析字符串 &xml;  (容易但不灵活的方式)</title>  
    932 940 <screen>  
    933 941 &prompt;<userinput>contents = "&lt;grammar>&lt;ref id='bit'>&lt;p>0&lt;/p>&lt;p>1&lt;/p>&lt;/ref>&lt;/grammar>"</userinput>  
     
    943 951 <calloutlist>  
    944 952 <callout arearefs="kgp.openanything.3.1">  
    945   <para>&minidom_modulename; has a method, &parsestring_functionname;, which takes an entire &xml; document as a string and parses it.  You can use this instead of &minidomparse_functionname; if you know you already have your entire &xml; document in a string.</para>  
      953 <para>&minidom_modulename;  有一个方法,&parsestring_functionname;,它接收一个字符串形式的完整 &xml; 文档作为参数并解析这个参数。如果你已经将整个 &xml; 文档放入一个字符串,你可以使用它代替&minidomparse_functionname;。</para>  
    945 953 </callout>  
    946 954 </calloutlist>  
    947 955 </example>  
    948   <para>OK, so you can use the &minidomparse_functionname; function for parsing both local files and remote &url;s, but for parsing strings, you use... a different function.  That means that if you want to be able to take input from a file, a &url;, or a string, you'll need special logic to check whether it's a string, and call the &parsestring_functionname; function instead.  How unsatisfying.</para>  
    949   <para>If there were a way to turn a string into a file-like object, then you could simply pass this object to &minidomparse_functionname;.  And in fact, there is a module specifically designed for doing just that: &stringio_modulename;.</para>  
      956 <para>好,这样你可以使用 &minidomparse_functionname; 函数来解析本地文件和远程 &url;,但对于解析字符串,你却要使用……另一个函数。这就是说,如果你想要能够从文件、&url; 或者字符串接收输入,就需要特别的逻辑来判断输入是否是字符串,然后改为调用 &parsestring_functionname; 函数。真让人不满意。</para>
      957 <para>如果有一个方法可以把字符串转换成类文件对象,那么你可以只把这个对象传递给 &minidomparse_functionname; 就可以了。事实上,有一个模块专门设计用来做这件事:&stringio_modulename;。</para>  
    950 958 <example id="kgp.openanything.stringio.example">  
    951   <title>Introducing &stringio_modulename;</title>  
      959 <title>&stringio_modulename; 介绍</title>  
    951 959 <screen>  
    952 960 &prompt;<userinput>contents = "&lt;grammar>&lt;ref id='bit'>&lt;p>0&lt;/p>&lt;p>1&lt;/p>&lt;/ref>&lt;/grammar>"</userinput>  
     
    969 977 <calloutlist>  
    970 978 <callout arearefs="kgp.openanything.4.1">  
    971   <para>The &stringio_modulename; module contains a single class, also called &stringio_classname;, which allows you to turn a string into a file-like object.  The &stringio_classname; class takes the string as a parameter when creating an instance.</para>  
      979 <para>&stringio_modulename; 模块只包含一个类,也叫 &stringio_classname;,它允许你将一个字符串转换为一个类文件对象。这个 &stringio_classname; 类在创建实例的时候接收字符串作为参数。</para>
    971 979 </callout>  
    972 980 <callout arearefs="kgp.openanything.4.2">  
    973   <para>Now you have a file-like object, and you can do all sorts of file-like things with it.  Like &read;, which returns the original string.</para>  
      981 <para> 现在你有了一个类文件对象,你可用它做类文件的所有事情。比如 &read; 可以返回原始字符串。</para>  
    973 981 </callout>  
    974 982 <callout arearefs="kgp.openanything.4.3">  
    975   <para>Calling &read; again returns an empty string.  This is how real file objects work too; once you read the entire file, you can't read any more without explicitly seeking to the beginning of the file.  The &stringio_classname; object works the same way.</para>  
      983 <para>再次调用 &read; 返回空字符串。真实文件对象的工作方式也是这样的;一旦你读取了整个文件,如果不显式定位到文件的开始位置,就不可能读取到任何其他数据。&stringio_classname; 对象以相同的方式进行工作。</para>  
    975 983 </callout>  
    976 984 <callout arearefs="kgp.openanything.4.4">  
    977   <para>You can explicitly seek to the beginning of the string, just like seeking through a file, by using the &seek; method of the &stringio_classname; object.</para>  
      985 <para>使用 &stringio_classname; 对象的 &seek; 方法,你可以显式的定位到字符串的开始位置,就像在文件中定位一样。</para>  
    977 985 </callout>  
    978 986 <callout arearefs="kgp.openanything.4.5">  
    979   <para>You can also read the string in chunks, by passing a <varname>size</varname> parameter to the &read; method.</para>  
      987 <para>将一个<varname>size</varname>参数传递给 &read; 方法,你还可以以块的形式读取字符串。</para>  
    979 987 </callout>  
    980 988 <callout arearefs="kgp.openanything.4.6">  
    981   <para>At any time, &read; will return the rest of the string that you haven't read yet.  All of this is exactly how file objects work; hence the term <emphasis>file-like object</emphasis>.</para>  
      989 <para>任何时候,&read; 都将返回字符串中尚未读取的剩余部分。所有这些都与文件对象的工作方式完全一致;这就是术语<emphasis>类文件对象</emphasis>的来历。</para>
    981 989 </callout>  
    982 990 </calloutlist>  
    983 991 </example>  
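The same read, seek, and chunked-read sequence can be sketched in Python 3, where the equivalent class lives in the standard `io` module as `io.StringIO` (the `StringIO` module used in this chapter is Python 2). The variable names are chosen here for illustration.

```python
import io

contents = "<grammar><ref id='bit'><p>0</p><p>1</p></ref></grammar>"
ssock = io.StringIO(contents)   # wrap the string in a file-like object

first = ssock.read()            # no size: the whole string
second = ssock.read()           # position is at the end, so this is ''
ssock.seek(0)                   # explicitly rewind to the beginning
chunk = ssock.read(15)          # read the first 15 characters only
rest = ssock.read()             # whatever has not been read yet
ssock.close()

print(repr(second))
print(chunk)
```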
    984 992 <example>  
    985   <title>Parsing &xml; from a string (the file-like object way)</title>  
      993 <title>解析字符串 &xml; (类文件对象方式)</title>  
    985 993 <screen>  
    986 994 &prompt;<userinput>contents = "&lt;grammar>&lt;ref id='bit'>&lt;p>0&lt;/p>&lt;p>1&lt;/p>&lt;/ref>&lt;/grammar>"</userinput>  
     
    1000 1008 <calloutlist>  
    1001 1009 <callout arearefs="kgp.openanything.5.1">  
    1002   <para>Now you can pass the file-like object (really a &stringio_classname;) to &minidomparse_functionname;, which will call the object's &read; method and happily parse away, never knowing that its input came from a hard-coded string.</para>  
      1010 <para>现在你可以把类文件对象(实际是一个 &stringio_classname;)传递给 &minidomparse_functionname;,它将调用对象的 &read; 方法并高兴的开始解析,绝不会知道它的输入源源自一个硬编码的字符串。</para>  
    1002 1010 </callout>  
    1003 1011 </calloutlist>  
    1004 1012 </example>  
    1005   <para>So now you know how to use a single function, &minidomparse_functionname;, to parse an &xml; document stored on a web page, in a local file, or in a hard-coded string.  For a web page, you use &urlopen; to get a file-like object; for a local file, you use &open;; and for a string, you use &stringio_classname;.  Now let's take it one step further and generalize <emphasis>these</emphasis> differences as well.</para>  
      1013 <para>那么现在你知道了如何使用单个函数 &minidomparse_functionname; 来解析一个保存在 web 页面上、本地文件中或硬编码字符串中的 &xml; 文档。对于 web 页面,使用 &urlopen; 得到类文件对象;对于本地文件,使用 &open;;对于字符串,使用 &stringio_classname;。现在让我们更进一步,把<emphasis>这些</emphasis>差异也一般化。</para>
    1005 1013 <example id="kgp.openanything.example">  
    1006 1014 <title>&openanything_functionname;</title>  
     
    1027 1035 <calloutlist>  
    1028 1036 <callout arearefs="kgp.openanything.6.1">  
    1029   <para>The &openanything_functionname; function takes a single parameter, <varname>source</varname>, and returns a file-like object.  <varname>source</varname> is a string of some sort; it can either be a &url; (like <literal>'http://slashdot.org/slashdot.rdf'</literal>), a full or partial pathname to a local file (like <literal>'binary.xml'</literal>), or a string that contains actual &xml; data to be parsed.</para>  
      1037 <para>&openanything_functionname; 函数接收单个参数,<varname>source</varname>,并返回类文件对象。<varname>source</varname>是某种类型的字符串;它可能是一个 &url; (例如<literal>'http://slashdot.org/slashdot.rdf'</literal>),一个本地文件的完整或者部分路径名(例如<literal>'binary.xml'</literal>),或者是一个包含了需要解析 &xml; 数据的字符串。</para>  
    1029 1037 </callout>  
    1030 1038 <callout arearefs="kgp.openanything.6.2">  
    1031   <para>First, you see if <varname>source</varname> is a &url;.  You do this through brute force: you try to open it as a &url; and silently ignore errors caused by trying to open something which is not a &url;.  This is actually elegant in the sense that, if &urllib; ever supports new types of &url;s in the future, you will also support them without recoding.  If &urllib; is able to open <varname>source</varname>, then the &return; kicks you out of the function immediately and the following <literal>try</literal> statements never execute.</para>  
      1039 <para>首先,检查 <varname>source</varname> 是否是一个 &url;。这里用的是蛮力方式:尝试把它当作一个 &url; 打开,并静静地忽略因打开非 &url; 而引起的错误。这样做其实很优雅,因为如果 &urllib; 将来支持新的 &url; 类型,你不用重新编码就可以支持它们。如果 &urllib; 能够打开 <varname>source</varname>,那么 &return; 会立刻把你踢出函数,下面的 <literal>try</literal> 语句将不会执行。</para>
    1031 1039 </callout>  
    1032 1040 <callout arearefs="kgp.openanything.6.3">  
    1033   <para>On the other hand, if &urllib; yelled at you and told you that <varname>source</varname> wasn't a valid &url;, you assume it's a path to a file on disk and try to open it.  Again, you don't do anything fancy to check whether <varname>source</varname> is a valid filename or not (the rules for valid filenames vary wildly between different platforms anyway, so you'd probably get them wrong anyway).  Instead, you just blindly open the file, and silently trap any errors.</para>  
      1041 <para>另一方面,如果 &urllib; 向你大喊,告诉你 <varname>source</varname> 不是一个有效的 &url;,你就假设它是一个磁盘文件的路径并尝试打开它。同样,你不用做任何特别的事来检查 <varname>source</varname> 是否是一个有效的文件名(反正在不同的平台上,判断文件名有效性的规则差别很大,所以你很可能会判断错)。相反,你只要盲目地打开文件,并静静地捕获任何错误就可以了。</para>
    1033 1041 </callout>  
    1034 1042 <callout arearefs="kgp.openanything.6.4">  
    1035   <para>By this point, you need to assume that <varname>source</varname> is a string that has hard-coded data in it (since nothing else worked), so you use &stringio_classname; to create a file-like object out of it and return that.  (In fact, since you're using the &str; function, <varname>source</varname> doesn't even need to be a string; it could be any object, and you'll use its string representation, as defined by its &strspecial; <link linkend="fileinfo.morespecial">special method</link>.)</para>  
      1043 <para>到这里,你需要假设 <varname>source</varname> 是一个包含硬编码数据的字符串(因为其它方法都没有奏效),所以你使用 &stringio_classname; 从中创建一个类文件对象并将它返回。(实际上,由于使用了 &str; 函数,<varname>source</varname> 甚至不必是字符串;它可以是任何对象,你将使用它的字符串表示形式,即由它的 &strspecial; <link linkend="fileinfo.morespecial">特殊方法</link>所定义的形式。)</para>
    1035 1043 </callout>  
    1036 1044 </calloutlist>  
    1037 1045 </example>  
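The three-step fallback described in these callouts can be sketched in Python 3 as follows. This is an adaptation under stated assumptions, not the book's listing: the name `open_anything` is hypothetical, `urllib.request.urlopen` and `io.StringIO` stand in for the Python 2 `urllib.urlopen` and `StringIO.StringIO`, and `ValueError` is caught as well because that is what Python 3's `urlopen` raises for a string with no URL scheme.

```python
import io
import urllib.request
from xml.dom import minidom


def open_anything(source):
    """Return a file-like object for a URL, a local path, or literal data.

    Hypothetical Python 3 adaptation of the approach described above.
    """
    # 1. Brute force: try it as a URL and ignore the failure if it is not one.
    try:
        return urllib.request.urlopen(source)
    except (ValueError, OSError):
        pass
    # 2. Otherwise assume it is a path on disk and just try to open it.
    try:
        return open(source)
    except OSError:
        pass
    # 3. Nothing else worked, so treat the string itself as the data.
    return io.StringIO(source)


sock = open_anything("<grammar><ref id='bit'><p>0</p><p>1</p></ref></grammar>")
xmldoc = minidom.parse(sock)
sock.close()
print(xmldoc.documentElement.tagName)
```

One difference worth keeping in mind: in Python 3 the URL branch yields bytes while the other two branches yield text. `minidom.parse` accepts either, but other callers of such a helper may need to handle both.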
    1038   <para>Now you can use this &openanything_functionname; function in conjunction with &minidomparse_functionname; to make a function that takes a <varname>source</varname> that refers to an &xml; document somehow (either as a &url;, or a local filename, or a hard-coded &xml; document in a string) and parses it.</para>  
      1046 <para>现在你可以将这个 &openanything_functionname; 函数与 &minidomparse_functionname; 结合起来构造一个函数,它接收一个以某种方式指向 &xml; 文档的 <varname>source</varname>(可以是一个 &url;,或是一个本地文件名,或是字符串中硬编码的 &xml; 文档),并解析它。</para>
    1038 1046 <example>  
    1039   <title>Using &openanything_functionname; in &kgp_filename;</title>  
      1047 <title>在 &kgp_filename; 中使用 &openanything_functionname;</title>  
    1039 1047 <programlisting>  
    1040 1048 &kgp_classdef;  
     
    1054 1062 <section id="kgp.stdio">  
    1055 1063 <?dbhtml filename="scripts_and_streams/stdin_stdout_stderr.html"?>  
    1056   <title>Standard input, output, and error</title>  
      1064 <title>标准输入,输出和错误</title>  
    1056 1064 <abstract>  
    1057 1065 <title/>  
    1058   <para>&unix; users are already familiar with the concept of standard input, standard output, and standard error.  This section is for the rest of you.</para>  
      1066 <para>&unix; 用户已经对标准输入,标准输出和标准错误的概念非常熟悉了。这一节是为其他不熟悉的人准备的。</para>  
    1058 1066 </abstract>  
    1059   <para>Standard output and standard error (commonly abbreviated &stdout; and &stderr;) are pipes that are built into every &unix; system.  When you &print; something, it goes to the &stdout; pipe; when your program crashes and prints out debugging information (like a traceback in &python;), it goes to the &stderr; pipe.  Both of these pipes are ordinarily just connected to the terminal window where you are working, so when a program prints, you see the output, and when a program crashes, you see the debugging information.  (If you're working on a system with a window-based &python; &ide;, &stdout; and &stderr; default to your <quote>Interactive Window</quote>.)</para>  
      1067 <para>标准输出和标准错误(通常缩写为 &stdout; 和 &stderr;)是内建在每一个 &unix; 系统中的管道。当你 &print; 某些东西时,结果前往 &stdout; 管道;当你的程序崩溃并打印出调试信息(例如 &python; 中的错误跟踪)的时候,信息前往 &stderr; 管道。通常这两个管道都只是连接到你正在工作的终端窗口上,所以当一个程序打印时,你可以看到输出,而当一个程序崩溃时,你可以看到调试信息。(如果你正在一个带有基于窗口的 &python; &ide; 的系统上工作,&stdout; 和 &stderr; 缺省为你的<quote>交互窗口</quote>。)</para>
    1059 1067 <example>  
    1060   <title>Introducing &stdout; and &stderr;</title>  
      1068 <title>&stdout; 和 &stderr; 介绍</title>  
    1060 1068 <screen>  
    1061 1069 &prompt;<userinput>for i in range(3):</userinput>  
     
    1077 1085 <calloutlist>  
    1078 1086 <callout arearefs="kgp.stdio.1.1">  
    1079   <para>As you saw in <xref linkend="fileinfo.for.counter"/>, you can use &python;'s built-in &range; function to build simple counter loops that repeat something a set number of times.</para>  
      1087 <para>正如<xref linkend="fileinfo.for.counter"/>中看到的,你可以使用 &python; 内置的 &range; 函数来构造简单的计数循环,即重复某物一定的次数。</para>  
    1079 1087 </callout>  
    1080 1088 <callout arearefs="kgp.stdio.1.2">  
    1081   <para>&stdout; is a file-like object; calling its &write; function will print out whatever string you give it.  In fact, this is what the &print; function really does; it adds a carriage return to the end of the string you're printing, and calls <function>sys.stdout.write</function>.</para>  
      1089 <para>&stdout; 是一个类文件对象;调用它的 &write; 函数可以打印出你给定的任何字符串。实际上,这就是 &print; 函数真正做的事情;它在你打印的字符串后面加上一个硬回车,然后调用<function>sys.stdout.write</function>函数。</para>  
    1081 1089 </callout>  
    1082 1090 <callout arearefs="kgp.stdio.1.3">  
    1083   <para>In the simplest case, &stdout; and &stderr; send their output to the same place: the &python; &ide; (if you're in one), or the terminal (if you're running &python; from the command line).  Like &stdout;, &stderr; does not add carriage returns for you; if you want them, add them yourself.</para>  
      1091 <para>在最简单的情况下,&stdout; 和 &stderr; 把它们的输出发送到相同的地方:&python; &ide;(如果你在一个 &ide; 中的话),或者终端(如果你从命令行运行 &python; 的话)。和 &stdout; 一样,&stderr; 并不为你添加硬回车;如果需要,要自己加上。</para>
    1083 1091 </callout>  
    1084 1092 </calloutlist>  
    1085 1093 </example>  
    1086   <para>&stdout; and &stderr; are both file-like objects, like the ones you discussed in <xref linkend="kgp.openanything"/>, but they are both write-only.  They have no &read; method, only &write;.  Still, they are file-like objects, and you can assign any other file- or file-like object to them to redirect their output.</para>  
      1094 <para>&stdout; 和 &stderr; 都是类文件对象,就像在<xref linkend="kgp.openanything"/>中讨论的一样,但是它们都是只写的。它们都没有 &read; 方法,只有 &write; 方法。然而,它们仍然是类文件对象,并且你可以将其它任何文件或者类文件对象赋值给它们来重定向它们的输出。</para>  
    1086 1094 <example>  
    1087   <title>Redirecting output</title>  
      1095 <title>重定向输出</title>  
    1087 1095 <screen>  
    1088 1096 <prompt>[you@localhost kgp]$ </prompt><userinput>python stdout.py</userinput>  
     
    1095 1103 <prompt>[you@localhost kgp]$ </prompt><userinput>cat out.log</userinput>  
    1096 1104 <computeroutput>This message will be logged instead of displayed</computeroutput></screen>  
    1097   <para>(On Windows, you can use <literal>type</literal> instead of <literal>cat</literal> to display the contents of a file.)</para>  
      1105 <para>(在 Windows 上,你可以使用<literal>type</literal>来代替<literal>cat</literal>显示文件的内容。)</para>
    1097 1105 &para_download;  
    1098 1106 <programlisting>  
     
    1111 1119 <calloutlist>  
    1112 1120 <callout arearefs="kgp.stdio.2.1">  
    1113   <para>This will print to the &ide; <quote>Interactive Window</quote> (or the terminal, if running the script from the command line).</para>  
      1121 <para>打印输出到 &ide; <quote>交互窗口</quote>(或终端,如果从命令行运行脚本的话)。</para>  
    1113 1121 </callout>  
    1114 1122 <callout arearefs="kgp.stdio.2.2">  
    1115   <para>Always save &stdout; before redirecting it, so you can set it back to normal later.</para>  
      1123 <para>始终在重定向前保存 &stdout; ,这样的话之后你还可以将其设回正常。</para>  
    1115 1123 </callout>  
    1116 1124 <callout arearefs="kgp.stdio.2.3">  
    1117   <para>Open a file for writing.  If the file doesn't exist, it will be created.  If the file does exist, it will be overwritten.</para>  
      1125 <para>打开一个新文件用于写入。如果文件不存在,将会被创建。如果文件存在,将被覆盖。</para>  
    1117 1125 </callout>  
    1118 1126 <callout arearefs="kgp.stdio.2.4">  
    1119   <para>Redirect all further output to the new file you just opened.</para>  
      1127 <para>将所有后续的输出重定向到刚才打开的新文件上。</para>  
    1119 1127 </callout>  
    1120 1128 <callout arearefs="kgp.stdio.2.5">  
    1121   <para>This will be <quote>printed</quote> to the log file only; it will not be visible in the &ide; window or on the screen.</para>  
      1129 <para>这只会<quote>打印</quote>到日志文件中;在 &ide; 窗口中或在屏幕上都不会看到它。</para>
    1121 1129 </callout>  
    1122 1130 <callout arearefs="kgp.stdio.2.6">  
    1123   <para>Set &stdout; back to the way it was before you mucked with it.</para>  
      1131 <para>将 &stdout; 设回你改动它之前的样子。</para>
    1123 1131 </callout>  
    1124 1132 <callout arearefs="kgp.stdio.2.7">  
    1125   <para>Close the log file.</para>  
      1133 <para>关闭日志文件。</para>  
    1125 1133 </callout>  
    1126 1134 </calloutlist>  
    1127 1135 </example>  
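The save, redirect, restore sequence from these callouts can be sketched in Python 3 like this. To keep the sketch self-contained, an in-memory `io.StringIO` stands in for the `out.log` file; that substitution, and the `try`/`finally` guard, are additions made here rather than part of the original script.

```python
import io
import sys

print("Dive in")                 # goes to the real standard output

saveout = sys.stdout             # always save stdout before redirecting it
log = io.StringIO()              # stand-in for open('out.log', 'w')
sys.stdout = log                 # all further output goes to the log
try:
    print("This message will be logged instead of displayed")
finally:
    sys.stdout = saveout         # restore stdout even if printing failed

logged = log.getvalue()
log.close()
print("Captured:", logged.strip())
```

Python 3 also ships `contextlib.redirect_stdout`, which wraps the same save-and-restore pattern in a `with` block.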
    1128   <para>Redirecting &stderr; works exactly the same way, using <function>sys.stderr</function> instead of <function>sys.stdout</function>.</para>  
      1136 <para>重定向 &stderr; 完全以相同的方式进行,用 <function>sys.stderr</function> 代替 <function>sys.stdout</function>。</para>  
    1128 1136 <example>  
    1129   <title>Redirecting error information</title>  
      1137 <title>重定向错误信息</title>  
    1129 1137 <screen>  
    1130 1138 <prompt>[you@localhost kgp]$ </prompt><userinput>python stderr.py</userinput>  
     
    1154 1162 <calloutlist>  
    1155 1163 <callout arearefs="kgp.stdio.3.1">  
    1156   <para>Open the log file where you want to store debugging information.</para>  
      1164 <para>打开你要存储调试信息的日志文件。</para>  
    1156 1164 </callout>  
    1157 1165 <callout arearefs="kgp.stdio.3.2">  
    1158   <para>Redirect standard error by assigning the file object of the newly-opened log file to &stderr;.</para>  
      1166 <para>将新打开的日志文件的文件对象赋值给 &stderr; 以重定向标准错误。</para>  
    1158 1166 </callout>  
    1159 1167 <callout arearefs="kgp.stdio.3.3">  
    1160   <para>Raise an exception.  Note from the screen output that this does <emphasis>not</emphasis> print anything on screen.  All the normal traceback information has been written to <filename>error.log</filename>.</para>  
      1168 <para>引发一个异常。从屏幕输出上可以注意到这个行为<emphasis>没有</emphasis>在屏幕上打印出任何东西。所有正常的跟踪信息已经写进 <filename>error.log</filename>。</para>  
    1160 1168 </callout>  
    1161 1169 <callout arearefs="kgp.stdio.3.4">  
    1162   <para>Also note that you're not explicitly closing your log file, nor are you setting &stderr; back to its original value.  This is fine, since once the program crashes (because of the exception), &python; will clean up and close the file for us, and it doesn't make any difference that &stderr; is never restored, since, as I mentioned, the program crashes and &python; ends.  Restoring the original is more important for &stdout;, if you expect to go do other stuff within the same script afterwards.</para>  
      1170 <para>还要注意你既没有显式关闭日志文件,也没有将 &stderr; 设回最初的值。这样没有问题,因为一旦程序崩溃(由于引发的异常),&python; 将替我们清理并关闭文件;而 &stderr; 永远没有被恢复也不会造成任何影响,因为正如我提到过的,程序崩溃后 &python; 就结束了。如果你希望在同一个脚本的后面继续做其它的事情,那么恢复初始值对 &stdout; 来说更为重要。</para>
    1162 1170 </callout>  
    1163 1171 </calloutlist>  
    1164 1172 </example>  
    1165   <para>Since it is so common to write error messages to standard error, there is a shorthand syntax that can be used instead of going through the hassle of redirecting it outright.</para>  
      1173 <para>由于向标准错误写入错误信息是如此常见,所以有一种快捷语法可以使用,而不必费事地直接重定向它。</para>
    1165 1173 <example id="kgp.stdio.print.example">  
    1166   <title>Printing to &stderr;</title>  
      1174 <title>打印到 &stderr;</title>  
    1166 1174 <screen>  
    1167 1175 &prompt;<userinput>print 'entering function'</userinput>  
     
    1179 1187 <calloutlist>  
    1180 1188 <callout arearefs="kgp.stdio.6.1">  
    1181   <para>This shorthand syntax of the &print; statement can be used to write to any open file, or file-like object.  In this case, you can redirect a single &print; statement to &stderr; without affecting subsequent &print; statements.</para>  
      1189 <para>&print; 语句的快捷语法可以用于向任何打开的文件写入,或者是类文件对象。在这种情况下,你可以将单个&print; 语句重定向到&stderr; 而且不用影响后面的&print; 语句。</para>  
    1181 1189 </callout>  
    1182 1190 </calloutlist>  
    1183 1191 </example>  
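The shorthand shown above is Python 2 syntax. In Python 3, where `print` is a function, the same effect comes from its `file` keyword argument. A minimal sketch follows; the helper name `log_entry` is hypothetical.

```python
import io
import sys


def log_entry(message, stream=None):
    """Write one message to stream (standard error by default)."""
    if stream is None:
        stream = sys.stderr
    # file= redirects this single call only; later prints are unaffected.
    print(message, file=stream)


log_entry("entering function")   # goes to standard error
print("normal output")           # still goes to standard output

buf = io.StringIO()              # any file-like object with write() works
log_entry("entering function", buf)
captured = buf.getvalue()
```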
    1184   <para>Standard input, on the other hand, is a read-only file object, and it represents the data flowing into the program from some previous program.  This will likely not make much sense to classic &macos; users, or even &windows; users unless you were ever fluent on the &dos; command line.  The way it works is that you can construct a chain of commands in a single line, so that one program's output becomes the input for the next program in the chain.  The first program simply outputs to standard output (without doing any special redirecting itself, just doing normal &print; statements or whatever), and the next program reads from standard input, and the operating system takes care of connecting one program's output to the next program's input.</para>  
      1192 <para>另一方面,标准输入是一个只读文件对象,它表示从前一个程序流入这个程序的数据。这对于经典 &macos; 用户,甚至 &windows; 用户来说可能不太容易理解,除非你曾经熟练使用过 &dos; 命令行。它的工作方式是:你可以在一行中构造一个命令链,使得一个程序的输出成为链中下一个程序的输入。第一个程序只是简单地输出到标准输出(其本身没有做任何特别的重定向,只是执行了普通的 &print; 语句之类的操作),下一个程序从标准输入中读取,而操作系统负责将一个程序的输出连接到下一个程序的输入。</para>
    1184 1192 <example>  
    1185   <title>Chaining commands</title>  
      1193 <title>链接命令</title>  
    1185 1193 <screen>  
    1186 1194 <prompt>[you@localhost kgp]$ </prompt><userinput>python kgp.py -g binary.xml</userinput>         <co id="kgp.stdio.4.1"/>  
     
    1206 1214 <calloutlist>  
    1207 1215 <callout arearefs="kgp.stdio.4.1">  
    1208   <para>As you saw in <xref linkend="kgp.divein"/>, this will print a string of eight random bits, &zero; or &one;.</para>  
      1216 <para>正如你在<xref linkend="kgp.divein"/>中看到的,该命令将只打印一个随机的八位字符串,其中只有&zero; 或者 &one;。</para>  
    1208 1216 </callout>  
    1209 1217 <callout arearefs="kgp.stdio.4.2">  
    1210   <para>This simply prints out the entire contents of &binaryxml_filename;.  (&windows; users should use <literal>type</literal> instead of <literal>cat</literal>.)</para>  
      1218 <para>该处只是简单的打印出整个&binaryxml_filename;文档的内容。(&windows;用户应该用<literal>type</literal>代替<literal>cat</literal>。)</para>  
    1210 1218 </callout>  
    1211 1219 <callout arearefs="kgp.stdio.4.3">  
    1212   <para>This prints the contents of &binaryxml_filename;, but the <quote><literal>|</literal></quote> character, called the <quote>pipe</quote> character, means that the contents will not be printed to the screen.  Instead, they will become the standard input of the next command, which in this case calls your &python; script.</para>  
      1220 <para>该处打印 &binaryxml_filename; 的内容,但是<quote><literal>|</literal></quote>字符(称为<quote>管道</quote>符)意味着内容不会打印到屏幕上。相反,它们会成为下一个命令的标准输入,在这个例子中,下一个命令调用了你的 &python; 脚本。</para>
    1212 1220 </callout>  
    1213 1221 <callout arearefs="kgp.stdio.4.4">  
    1214   <para>Instead of specifying a module (like &binaryxml_filename;), you specify <quote><literal>-</literal></quote>, which causes your script to load the grammar from standard input instead of from a specific file on disk.  (More on how this happens in the next example.)  So the effect is the same as the first syntax, where you specified the grammar filename directly, but think of the expansion possibilities here.  Instead of simply doing <literal>cat binary.xml</literal>, you could run a script that dynamically generates the grammar, then you can pipe it into your script.  It could come from anywhere: a database, or some grammar-generating meta-script, or whatever.  The point is that you don't need to change your &kgp_filename; script at all to incorporate any of this functionality.  All you need to do is be able to take grammar files from standard input, and you can separate all the other logic into another program.</para>  
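      1222 <para>Instead of naming a grammar file (like &binaryxml_filename;), you specify <quote><literal>-</literal></quote>, which makes your script load the grammar from standard input rather than from a specific file on disk.  (The next example shows how this happens.)  The effect is the same as the first syntax, where you named the grammar file directly, but think of the possibilities for expansion.  Instead of running <literal>cat binary.xml</literal>, you could run a script that generates the grammar dynamically and pipe its output into your script.  The grammar could come from anywhere: a database, a grammar-generating meta-script, or whatever.  You don't need to change your &kgp_filename; script at all to gain this functionality.  All it has to do is accept grammar files from standard input, and you can move all the other logic into another program.</para>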
    1214 1222 </callout>  
    1215 1223 </calloutlist>  
    1216 1224 </example>  
    1217   <para>So how does the script <quote>know</quote> to read from standard input when the grammar file is <quote><literal>-</literal></quote>?  It's not magic; it's just code.</para>  
      1225 <para>So how does the script <quote>know</quote> to read from standard input when the grammar file is <quote><literal>-</literal></quote>?  It's not magic; it's just code.</para>
    1217 1225 <example>  
    1218   <title>Reading from standard input in &kgp_filename;</title>  
      1226 <title>Reading from standard input in &kgp_filename;</title>
    1218 1226 <programlisting>  
    1219 1227 def openAnything(source):  
     
    1235 1243 <calloutlist>  
    1236 1244 <callout arearefs="kgp.stdio.5.1">  
    1237   <para>This is the <function>openAnything</function> function from &toolbox_filename;, which you previously examined in <xref linkend="kgp.openanything"/>.  All you've done is add three lines of code at the beginning of the function to check if the source is <quote><literal>-</literal></quote>; if so, you return <literal>sys.stdin</literal>.  Really, that's it!  Remember, &stdin; is a file-like object with a &read; method, so the rest of the code (in &kgp_filename;, where you call <function>openAnything</function>) doesn't change a bit.</para>  
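      1245 <para>This is the <function>openAnything</function> function from &toolbox_filename;, which you examined earlier in <xref linkend="kgp.openanything"/>.  All you've done is add three lines of code at the start of the function to check whether the source is <quote><literal>-</literal></quote>; if it is, you return <literal>sys.stdin</literal>.  Really, that's all there is to it!  Remember, &stdin; is a file-like object with a &read; method, so the rest of the code (in &kgp_filename;, where you call <function>openAnything</function>) doesn't change at all.</para>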
    1237 1245 </callout>  
    1238 1246 </calloutlist>  
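The check described in the callout above can be sketched roughly as follows. This is a minimal illustration, not the listing from &toolbox_filename;: the names <literal>open_anything</literal> and <literal>load_text</literal> are hypothetical, and the sketch assumes that anything other than <literal>-</literal> is a local file path, leaving out the URL and string handling of the real helper.

```python
import sys


def open_anything(source):
    """Return a file-like object for source; "-" means standard input."""
    if source == "-":
        # sys.stdin already has a read() method, so callers need no changes.
        return sys.stdin
    # Assumption: every other source is a path on disk. The real helper
    # also accepts URLs and raw strings, which this sketch omits.
    return open(source)


def load_text(source):
    """Read everything from source, closing it unless it is standard input."""
    stream = open_anything(source)
    try:
        return stream.read()
    finally:
        if stream is not sys.stdin:
            stream.close()
```

Because the caller only ever uses <literal>read()</literal>, it cannot tell whether the data came from a file or from a pipe.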
     
    1242 1250 <section id="kgp.cache">  
    1243 1251 <?dbhtml filename="scripts_and_streams/caching.html"?>  
    1244   <title>Caching node lookups</title>  
      1252 <title>Caching node lookups</title>
    1244 1252 <abstract>  
    1245 1253 <title/>  
    1246   <para>&kgp_filename; employs several tricks which may or may not be useful to you in your &xml; processing.  The first one takes advantage of the consistent structure of the input documents to build a cache of nodes.</para>  
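      1254 <para>&kgp_filename; employs several tricks that may or may not be useful in your own &xml; processing.  The first takes advantage of the consistent structure of the input documents to build a cache of nodes.</para>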
    1246 1254 </abstract>  
    1247   <para>A grammar file defines a series of &refnode; elements.  Each &refnode; contains one or more &pnode; elements, which can contain a lot of different things, including &xrefnode;s.  Whenever you encounter an &xrefnode;, you look for a corresponding &refnode; element with the same &idattr; attribute, and choose one of the &refnode; element's children and parse it.  (You'll see how this random choice is made in the next section.)</para>  
    1248   <para>This is how you build up the grammar: define &refnode; elements for the smallest pieces, then define &refnode; elements which "include" the first &refnode; elements by using &xrefnode;, and so forth.  Then you parse the "largest" reference and follow each &xrefnode;, and eventually output real text.  The text you output depends on the (random) decisions you make each time you fill in an &xrefnode;, so the output is different each time.</para>  
    1249   <para>This is all very flexible, but there is one downside: performance.  When you find an &xrefnode; and need to find the corresponding &refnode; element, you have a problem.  The &xrefnode; has an &idattr; attribute, and you want to find the &refnode; element that has that same &idattr; attribute, but there is no easy way to do that.  The slow way to do it would be to get the entire list of &refnode; elements each time, then manually loop through and look at each &idattr; attribute.  The fast way is to do that once and build a cache, in the form of a dictionary.</para>  
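      1255 <para>A grammar file defines a series of &refnode; elements.  Each &refnode; contains one or more &pnode; elements, which can contain many different things, including &xrefnode;s.  Whenever you encounter an &xrefnode;, you find the &refnode; element with the same &idattr; attribute, choose one of that &refnode; element's children, and parse it.  (You'll see how this random choice is made in the next section.)</para>
      1256 <para>This is how you build up a grammar: define &refnode; elements for the smallest pieces, then define &refnode; elements that "include" those first &refnode; elements by way of &xrefnode;, and so on.  Then you parse the "largest" reference, follow each &xrefnode;, and eventually output real text.  The text depends on the (random) decisions you make each time you fill in an &xrefnode;, so the output is different every time.</para>
      1257 <para>This is all very flexible, but it has one downside: performance.  When you find an &xrefnode; and need the corresponding &refnode; element, you have a problem.  The &xrefnode; has an &idattr; attribute, and you want the &refnode; element with the same &idattr; attribute, but there is no easy way to find it.  The slow way is to fetch the entire list of &refnode; elements every time, then loop through it manually and check each &idattr; attribute.  The fast way is to do that once and build a cache in the form of a dictionary.</para>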
    1250 1258 <example>  
    1251 1259 <title><function>loadGrammar</function></title>  
     
    1260 1268 <calloutlist>  
    1261 1269 <callout arearefs="kgp.cache.1.1">  
    1262   <para>Start by creating an empty dictionary, &kgprefs;.</para>  
      1270 <para>Start by creating an empty dictionary, &kgprefs;.</para>
    1262 1270 </callout>  
    1263 1271 <callout arearefs="kgp.cache.1.2">  
    1264   <para>As you saw in <xref linkend="kgp.search"/>, &getelementsbytagname_functionname; returns a list of all the elements of a particular name.  You can easily get a list of all the &refnode; elements, then simply loop through that list.</para>
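      1272 <para>As you saw in <xref linkend="kgp.search"/>, &getelementsbytagname_functionname; returns a list of all the elements with a particular name.  You can easily get a list of all the &refnode; elements, then simply loop through that list.</para>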
    1264 1272 </callout>  
    1265 1273 <callout arearefs="kgp.cache.1.3">  
    1266   <para>As you saw in <xref linkend="kgp.attributes"/>, you can access individual attributes of an element by name, using standard dictionary syntax.  So the keys of the &kgprefs; dictionary will be the values of the &idattr; attribute of each &refnode; element.</para>  
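      1274 <para>As you saw in <xref linkend="kgp.attributes"/>, you can access individual attributes of an element by name, using standard dictionary syntax.  So the keys of the &kgprefs; dictionary will be the values of the &idattr; attribute of each &refnode; element.</para>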
    1266 1274 </callout>  
    1267 1275 <callout arearefs="kgp.cache.1.4">  
    1268   <para>The values of the &kgprefs; dictionary will be the &refnode; elements themselves.  As you saw in <xref linkend="kgp.parse"/>, each element, each node, each comment, each piece of text in a parsed &xml; document is an object.</para>  
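      1276 <para>The values of the &kgprefs; dictionary will be the &refnode; elements themselves.  As you saw in <xref linkend="kgp.parse"/>, each element, each node, each comment, and each piece of text in a parsed &xml; document is an object.</para>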
    1268 1276 </callout>  
    1269 1277 </calloutlist>  
    1270 1278 </example>  
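As a rough, self-contained illustration of the cache described above (not the <function>loadGrammar</function> listing itself), the following sketch builds the dictionary once and then resolves a cross-reference with a single lookup. It assumes the entities used in the text stand for <literal>ref</literal>, <literal>xref</literal>, <literal>p</literal>, and <literal>id</literal>; the tiny grammar string and the name <literal>build_ref_cache</literal> are made up for the example.

```python
from xml.dom import minidom

# Hypothetical miniature grammar, only for illustration.
GRAMMAR = """<grammar>
  <ref id="bit"><p>0</p><p>1</p></ref>
  <ref id="byte"><p><xref id="bit"/><xref id="bit"/></p></ref>
</grammar>"""


def build_ref_cache(document):
    """Map each ref element's id attribute value to the element itself."""
    refs = {}
    for ref in document.getElementsByTagName("ref"):
        refs[ref.attributes["id"].value] = ref
    return refs


doc = minidom.parseString(GRAMMAR)
refs = build_ref_cache(doc)

# Resolving an xref is now one dictionary lookup instead of a scan.
xref = doc.getElementsByTagName("xref")[0]
target = refs[xref.attributes["id"].value]
```

The scan over all <literal>ref</literal> elements happens once; every later lookup costs the same regardless of how large the grammar is.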
    1271   <para>Once you build this cache, whenever you come across an &xrefnode; and need to find the &refnode; element with the same &idattr; attribute, you can simply look it up in &kgprefs;.</para>  
      1279 <para>Once you have built this cache, whenever you come across an &xrefnode; and need the &refnode; element with the same &idattr; attribute, you simply look it up in &kgprefs;.</para>
    1271 1279 <example>  
    1272   <title>Using the &refnode; element cache</title>  
      1280 <title>Using the &refnode; element cache</title>
    1272 1280 <programlisting>  
    1273 1281 &kgp_doxrefdef;  
     
    1281 1289 &kgp_parsexref;</programlisting>  
    1282 1290 </example>  
    1283   <para>You'll explore the &randomchildelement_functionname; function in the next section.</para>  
      1291 <para>You'll explore the &randomchildelement_functionname; function in the next section.</para>
    1283 1291 </section>  
    1284 1292 <section id="kgp.child">  
    1285 1293 <?dbhtml filename="scripts_and_streams/child_nodes.html"?>  
    1286   <title>Finding direct children of a node</title>  
      1294 <title>Finding direct children of a node</title>
    1286 1294 <abstract>  
    1287 1295 <title/>  
    1288   <para>Another useful technique when parsing &xml; documents is finding all the direct child elements of a particular element.  For instance, in the grammar files, a &refnode; element can have several &pnode; elements, each of which can contain many things, including other &pnode; elements.  You want to find just the &pnode; elements that are children of the &refnode;, not &pnode; elements that are children of other &pnode; elements.</para>
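      1296 <para>Another useful technique when parsing &xml; documents is finding all the direct child elements of a particular element.  For instance, in the grammar files, a &refnode; element can have several &pnode; elements, each of which can contain many things, including other &pnode; elements.  You want only the &pnode; elements that are children of the &refnode;, not &pnode; elements nested inside other &pnode; elements.</para>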
    1288 1296 </abstract>  
    1289   <para>You might think you could simply use &getelementsbytagname_functionname; for this, but you can't.  &getelementsbytagname_functionname; searches recursively and returns a single list for all the elements it finds.  Since &pnode; elements can contain other &pnode; elements, you can't use &getelementsbytagname_functionname;, because it would return nested &pnode; elements that you don't want.  To find only direct child elements, you'll need to do it yourself.</para>  
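      1297 <para>You might think you could simply use &getelementsbytagname_functionname; for this, but you can't.  &getelementsbytagname_functionname; searches recursively and returns a single flat list of every element it finds.  Since &pnode; elements can contain other &pnode; elements, it would also return the nested &pnode; elements that you don't want.  To find only direct child elements, you have to do the filtering yourself.</para>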
    1289 1297 <example>  
    1290   <title>Finding direct child elements</title>  
      1298 <title>Finding direct child elements</title>
    1290 1298 <programlisting>  
    1291 1299 &kgp_randomchildelementdef;  
     
    1300 1308 <calloutlist>  
    1301 1309 <callout arearefs="kgp.child.1.1">  
    1302   <para>As you saw in <xref linkend="kgp.parse.gettingchildnodes.example"/>, the &childnodes_attr; attribute returns a list of all the child nodes of an element.</para>  
      1310 <para>As you saw in <xref linkend="kgp.parse.gettingchildnodes.example"/>, the &childnodes_attr; attribute returns a list of all the child nodes of an element.</para>
    1302 1310 </callout>  
    1303 1311 <callout arearefs="kgp.child.1.2">  
    1304   <para>However, as you saw in <xref linkend="kgp.parse.childnodescanbetext.example"/>, the list returned by &childnodes_attr; contains all different types of nodes, including text nodes.  That's not what you're looking for here.  You only want the children that are elements.</para>  
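      1312 <para>However, as you saw in <xref linkend="kgp.parse.childnodescanbetext.example"/>, the list returned by &childnodes_attr; contains all the different types of nodes, including text nodes.  That's not what you're looking for here.  You only want the children that are elements.</para>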
    1304 1312 </callout>  
    1305 1313 <callout arearefs="kgp.child.1.3">  
    1306   <para>Each node has a <varname>nodeType</varname> attribute, which can be <literal>ELEMENT_NODE</literal>, <literal>TEXT_NODE</literal>, <literal>COMMENT_NODE</literal>, or any number of other values.  The complete list of possible values is in the <filename>__init__.py</filename> file in the <classname>xml.dom</classname> package.  (See <xref linkend="kgp.packages"/> for more on packages.)  But you're just interested in nodes that are elements, so you can filter the list to only include those nodes whose <varname>nodeType</varname> is <literal>ELEMENT_NODE</literal>.</para>  
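      1314 <para>Each node has a <varname>nodeType</varname> attribute, which can be <literal>ELEMENT_NODE</literal>, <literal>TEXT_NODE</literal>, <literal>COMMENT_NODE</literal>, or any of a number of other values.  The complete list of possible values is in the <filename>__init__.py</filename> file in the <classname>xml.dom</classname> package.  (See <xref linkend="kgp.packages"/> for more on packages.)  But you're only interested in nodes that are elements, so you filter the list to include only those nodes whose <varname>nodeType</varname> is <literal>ELEMENT_NODE</literal>.</para>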
    1306 1314 </callout>  
    1307 1315 <callout arearefs="kgp.child.1.4">  
    1308   <para>Once you have a list of actual elements, choosing a random one is easy.  &python; comes with a module called &random_modulename; which includes several useful functions.  The <function>random.choice</function> function takes a list of any number of items and returns a random item.  For example, if the &refnode; element contains several &pnode; elements, then <varname>choices</varname> would be a list of &pnode; elements, and <varname>chosen</varname> would end up being assigned exactly one of them, selected at random.</para>
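      1316 <para>Once you have a list of actual elements, choosing a random one is easy.  &python; comes with a module called &random_modulename; which includes several useful functions.  The <function>random.choice</function> function takes a list of any number of items and returns one of them at random.  For example, if the &refnode; element contains several &pnode; elements, then <varname>choices</varname> is a list of &pnode; elements, and <varname>chosen</varname> ends up being assigned exactly one of them, selected at random.</para>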
    1308 1316 </callout>  
    1309 1317 </calloutlist>  
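The difference between a recursive search and a direct-children filter can be seen in a small sketch like the one below. The function names <literal>child_elements</literal> and <literal>random_child_element</literal> and the sample markup are assumptions made for the example; only <literal>childNodes</literal>, <literal>nodeType</literal>, <literal>ELEMENT_NODE</literal>, <literal>getElementsByTagName</literal>, and <function>random.choice</function> come from the standard library.

```python
import random
from xml.dom import minidom


def child_elements(node):
    """Return only the direct children of node that are elements."""
    return [child for child in node.childNodes
            if child.nodeType == child.ELEMENT_NODE]


def random_child_element(node):
    """Pick one direct child element of node at random."""
    return random.choice(child_elements(node))


# Hypothetical ref element: two direct p children, one of which nests a p.
doc = minidom.parseString(
    "<ref id='bit'>\n  <p>0</p>\n  <p>1<p>nested</p></p>\n</ref>")
ref = doc.documentElement

direct = child_elements(ref)            # skips whitespace text nodes
all_p = ref.getElementsByTagName("p")   # recursive, includes the nested p
```

Here <literal>direct</literal> holds two elements while <literal>all_p</literal> holds three, which is exactly why the recursive search is the wrong tool for this job.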
     
    1316 1324 <section id="kgp.handler">  
    1317 1325 <?dbhtml filename="scripts_and_streams/handlers_by_node_type.html"?>  
    1318   <title>Creating separate handlers by node type</title>  
      1326 <title>Creating separate handlers by node type</title>
    1318 1326 <abstract>  
    1319 1327 <title/>  
    1320   <para>The third useful &xml; processing tip involves separating your code into logical functions, based on node types and element names.  Parsed &xml; documents are made up of various types of nodes, each represented by a &python; object.  The root level of the document itself is represented by a <classname>Document</classname> object.  The <classname>Document</classname> then contains one or more <classname>Element</classname> objects (for actual &xml; tags), each of which may contain other <classname>Element</classname> objects, <classname>Text</classname> objects (for bits of text), or <classname>Comment</classname> objects (for embedded comments).  &python; makes it easy to write a dispatcher to separate the logic for each node type.</para>  
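      1328 <para>The third useful &xml; processing tip is to separate your code into logical functions, based on node types and element names.  Parsed &xml; documents are made up of various types of nodes, each represented by a &python; object.  The root level of the document itself is represented by a <classname>Document</classname> object.  The <classname>Document</classname> then contains one or more <classname>Element</classname> objects (for actual &xml; tags), each of which may contain other <classname>Element</classname> objects, <classname>Text</classname> objects (for bits of text), or <classname>Comment</classname> objects (for embedded comments).  &python; makes it easy to write a dispatcher that separates the logic for each node type.</para>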
    1320 1328 </abstract>  
    1321 1329 <example>  
    1322   <title>Class names of parsed &xml; objects</title>  
      1330 <title>Class names of parsed &xml; objects</title>
    1322 1330 <screen>  
    1323 1331 &prompt;<userinput>from xml.dom import minidom</userinput>  
     
    1334 1342 <calloutlist>  
    1335 1343 <callout arearefs="kgp.handler.1.1">  
    1336   <para>Assume for a moment that <filename>kant.xml</filename> is in the current directory.</para>  
      1344 <para>Assume for a moment that <filename>kant.xml</filename> is in the current directory.</para>
    1336 1344 </callout>  
    1337 1345 <callout arearefs="kgp.handler.1.2">  
    1338   <para>As you saw in <xref linkend="kgp.packages"/>, the object returned by parsing an &xml; document is a <classname>Document</classname> object, as defined in <filename>minidom.py</filename> in the <filename>xml.dom</filename> package.  As you saw in <xref linkend="fileinfo.create"/>, <literal>__class__</literal> is a built-in attribute of every &python; object.</para>
      1346 <para>As you saw in <xref linkend="kgp.packages"/>, the object returned by parsing an &xml; document is a <classname>Document</classname> object, as defined in <filename>minidom.py</filename> in the <filename>xml.dom</filename> package.  And as you saw in <xref linkend="fileinfo.create"/>, <literal>__class__</literal> is a built-in attribute of every &python; object.</para>
    1338 1346 </callout>  
    1339 1347 <callout arearefs="kgp.handler.1.3">  
    1340   <para>Furthermore, <literal>__name__</literal> is a built-in attribute of every &python; class, and it is a string.  This string is not mysterious; it's the same as the class name you type when you define a class yourself.  (See <xref linkend="fileinfo.class"/>.)</para>  
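      1348 <para>Furthermore, <literal>__name__</literal> is a built-in attribute of every &python; class, and it is a string.  This string is not mysterious; it's the same as the class name you type when you define a class yourself.  (See <xref linkend="fileinfo.class"/>.)</para>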
    1340 1348 </callout>  
    1341 1349 </calloutlist>  
    1342 1350 </example>  
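To see the class names without needing <filename>kant.xml</filename> on disk, a short sketch can parse a string instead. The markup below is invented for the example; the class names are the ones <literal>xml.dom.minidom</literal> defines.

```python
from xml.dom import minidom

# Hypothetical inline document standing in for a grammar file.
doc = minidom.parseString(
    "<grammar><!-- note --><ref id='bit'>0</ref></grammar>")
root = doc.documentElement

# __class__ gives the class object; __name__ gives its name as a string.
names = [doc.__class__.__name__, root.__class__.__name__]
names += [child.__class__.__name__ for child in root.childNodes]

# The text inside the ref element is a node of its own.
text_name = root.childNodes[1].firstChild.__class__.__name__
```

After this runs, <literal>names</literal> lists the class of the document, the root element, the comment, and the nested element in that order, and <literal>text_name</literal> is the class name of the text node.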
    1343   <para>Fine, so now you can get the class name of any particular &xml; node (since each &xml; node is represented as a &python; object).  How can you use this to your advantage to separate the logic of parsing each node type?  The answer is &getattr;, which you first saw in <xref linkend="apihelper.getattr"/>.</para>  
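      1351 <para>Fine, so now you can get the class name of any particular &xml; node (since each &xml; node is represented as a &python; object).  How can you use this to separate the logic of parsing each node type?  The answer is &getattr;, which you first saw in <xref linkend="apihelper.getattr"/>.</para>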
    1343 1351 <example>  
    1344   <title><function>parse</function>, a generic &xml; node dispatcher</title>  
      1352 <title><function>parse</function>, a generic &xml; node dispatcher</title>
    1344 1352 <programlisting>  
    1345 1353 &kgp_parsedef;  
     
    1353 1361 <calloutlist>  
    1354 1362 <callout arearefs="kgp.handler.2.1">  
    1355   <para>First off, notice that you're constructing a larger string based on the class name of the node you were passed (in the <varname>node</varname> argument).  So if you're passed a <classname>Document</classname> node, you're constructing the string <literal>'parse_Document'</literal>, and so forth.</para>  
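      1363 <para>First off, notice that you're constructing a larger string based on the class name of the node you were passed (in the <varname>node</varname> argument).  So if you're passed a <classname>Document</classname> node, you construct the string <literal>'parse_Document'</literal>, and so forth.</para>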
    1355 1363 </callout>  
    1356 1364 <callout arearefs="kgp.handler.2.2">  
    1357   <para>Now you can treat that string as a function name, and get a reference to the function itself using &getattr;.</para>
      1365 <para>Now you can treat that string as a function name, and get a reference to the function itself using &getattr;.</para>
    1357 1365 </callout>  
    1358 1366 <callout arearefs="kgp.handler.2.3">  
    1359   <para>Finally, you can call that function and pass the node itself as an argument.  The next example shows the definitions of each of these functions.</para>  
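      1367 <para>Finally, you can call that function and pass the node itself as an argument.  The next example shows the definitions of each of these functions.</para>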
    1359 1367 </callout>  
    1360 1368 </calloutlist>  
    1361 1369 </example>  
    1362 1370 <example>  
    1363   <title>Functions called by the <function>parse</function> dispatcher</title>  
      1371 <title>Functions called by the <function>parse</function> dispatcher</title>
    1363 1371 <programlisting>  
    1364 1372 &kgp_parsedocumentdef; <co id="kgp.handler.3.1"/>  
     
    1386 1394 <calloutlist>  
    1387 1395 <callout arearefs="kgp.handler.3.1">  
    1388   <para><function>parse_Document</function> is only ever called once, since there is only one <classname>Document</classname> node in an &xml; document, and only one <classname>Document</classname> object in the parsed &xml; representation.  It simply turns around and parses the root element of the grammar file.</para>  
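      1396 <para><function>parse_Document</function> is only ever called once, since there is only one <classname>Document</classname> node in an &xml; document, and only one <classname>Document</classname> object in the parsed &xml; representation.  It simply turns around and parses the root element of the grammar file.</para>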
    1388 1396 </callout>  
    1389 1397 <callout arearefs="kgp.handler.3.2">  
    1390   <para><function>parse_Text</function> is called on nodes that represent bits of text.  The function itself does some special processing to handle automatic capitalization of the first word of a sentence, but otherwise simply appends the represented text to a list.</para>  
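      1398 <para><function>parse_Text</function> is called on nodes that represent bits of text.  The function does some special processing to capitalize the first word of a sentence automatically, but otherwise it simply appends the represented text to a list.</para>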
    1390 1398 </callout>  
    1391 1399 <callout arearefs="kgp.handler.3.3">  
    1392   <para><function>parse_Comment</function> is just a &pass;, since you don't care about embedded comments in the grammar files.  Note, however, that you still need to define the function and explicitly make it do nothing.  If the function did not exist, the generic <function>parse</function> function would fail as soon as it stumbled on a comment, because it would try to find the non-existent <function>parse_Comment</function> function.  Defining a separate function for every node type, even ones you don't use, allows the generic <function>parse</function> function to stay simple and dumb.</para>  
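      1400 <para><function>parse_Comment</function> is just a &pass;, since you don't care about embedded comments in the grammar files.  Note, however, that you still need to define the function and explicitly make it do nothing.  If the function did not exist, the generic <function>parse</function> function would fail as soon as it stumbled on a comment, because it would try to find the non-existent <function>parse_Comment</function> function.  Defining a separate function for every node type, even the ones you don't use, lets the generic <function>parse</function> function stay simple and dumb.</para>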
    1392 1400 </callout>  
    1393 1401 <callout arearefs="kgp.handler.3.4">  
    1394   <para>The <function>parse_Element</function> method is actually itself a dispatcher, based on the name of the element's tag.  The basic idea is the same: take what distinguishes elements from each other (their tag names) and dispatch to a separate function for each of them.  You construct a string like <literal>'do_xref'</literal> (for an <sgmltag>&lt;xref&gt;</sgmltag> tag), find a function of that name, and call it.  And so forth for each of the other tag names that might be found in the course of parsing a grammar file (<sgmltag>&lt;p&gt;</sgmltag> tags, <sgmltag>&lt;choice&gt;</sgmltag> tags).</para>  
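      1402 <para>The <function>parse_Element</function> method is itself a dispatcher, based on the name of the element's tag.  The basic idea is the same: take what distinguishes elements from each other (their tag names) and dispatch to a separate function for each of them.  You construct a string like <literal>'do_xref'</literal> (for an <sgmltag>&lt;xref&gt;</sgmltag> tag), find a function of that name, and call it.  The same goes for each of the other tag names that might turn up while parsing a grammar file (<sgmltag>&lt;p&gt;</sgmltag> tags, <sgmltag>&lt;choice&gt;</sgmltag> tags).</para>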
    1394 1402 </callout>  
    1395 1403 </calloutlist>  
    1396 1404 </example>  
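The two levels of dispatch described in the callouts above can be sketched in a few lines. This is an illustrative reduction, not the &kgp_filename; code: the class name <literal>Collector</literal>, its <literal>pieces</literal> list, and the sample markup are assumptions, and the capitalization logic in <function>parse_Text</function> is left out.

```python
from xml.dom import minidom


class Collector:
    """Walk a parsed document, dispatching on node class and tag name."""

    def __init__(self):
        self.pieces = []

    def parse(self, node):
        # Build a method name such as 'parse_Text' and look it up by name.
        handler = getattr(self, "parse_%s" % node.__class__.__name__)
        handler(node)

    def parse_Document(self, node):
        self.parse(node.documentElement)

    def parse_Text(self, node):
        self.pieces.append(node.data)

    def parse_Comment(self, node):
        # Must exist, or parse() would raise AttributeError on a comment.
        pass

    def parse_Element(self, node):
        # Second-level dispatch: 'do_p' for <p>, 'do_xref' for <xref>, etc.
        handler = getattr(self, "do_%s" % node.tagName)
        handler(node)

    def do_p(self, node):
        for child in node.childNodes:
            self.parse(child)


collector = Collector()
collector.parse(minidom.parseString(
    "<p>Hello, <!-- skipped --><p>world</p></p>"))
output = "".join(collector.pieces)
```

Supporting a new tag in this sketch means adding one <literal>do_</literal> method; neither dispatcher needs to change.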
    1397   <para>In this example, the dispatch functions <function>parse</function> and <function>parse_Element</function> simply find other methods in the same class.  If your processing is very complex (or you have many different tag names), you could break up your code into separate modules, and use dynamic importing to import each module and call whatever functions you needed.  Dynamic importing will be discussed in <xref linkend="regression"/>.</para>  
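      1405 <para>In this example, the dispatch functions <function>parse</function> and <function>parse_Element</function> simply find other methods in the same class.  If your processing is very complex (or you have many different tag names), you could break your code up into separate modules, then use dynamic importing to import each module and call whatever functions you need.  Dynamic importing will be discussed in <xref linkend="regression"/>.</para>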
    1397 1405 </section>  
    1398 1406 <section id="kgp.commandline">  
    1399 1407 <?dbhtml filename="scripts_and_streams/command_line_arguments.html"?>  
    1400   <title>Handling command-line arguments</title>  
      1408 <title>Handling command-line arguments</title>
    1400 1408 <abstract>  
    1401 1409 <title/>  
    1402   <para>&python; fully supports creating programs that can be run on the command line, complete with command-line arguments and either short- or long-style flags to specify various options.  None of this is &xml;-specific, but this script makes good use of command-line processing, so it seemed like a good time to mention it.</para>  
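      1410 <para>&python; fully supports creating programs that can be run on the command line, complete with command-line arguments and either short- or long-style flags to specify various options.  None of this is &xml;-specific, but this script makes good use of command-line processing, so it seemed like a good time to mention it.</para>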
    1402 1410 </abstract>  
    1403   <para>It's difficult to talk about command-line processing without understanding how command-line arguments are exposed to your &python; program, so let's write a simple program to see them.</para>  
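      1411 <para>It's difficult to talk about command-line processing without understanding how command-line arguments are exposed to your &python; program, so let's write a simple program to see them.</para>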
    1403 1411 <example>  
    1404   <title>Introducing <varname>sys.argv</varname></title>  
      1412 <title>Introducing <varname>sys.argv</varname></title>
    1404 1412 &para_download;  
    1405 1413 <programlisting>  
     
    1420 1428 <calloutlist>  
    1421 1429 <callout arearefs="kgp.commandline.0.1">  
    1422   <para>Each command-line argument passed to the program will be in <varname>sys.argv</varname>, which is just a list.  Here you are printing each argument on a separate line.</para>  
      1430 <para>Each command-line argument passed to the program is in <varname>sys.argv</varname>, which is just a list.  Here you print each argument on a separate line.</para>
    1422 1430 </callout>  
    1423 1431 </calloutlist>  
    1424 1432 </example>  
    1425 1433 <example>  
    1426   <title>The contents of <varname>sys.argv</varname></title>  
      1434 <title>The contents of <varname>sys.argv</varname></title>
    1426 1434 <screen>  
    1427 1435 <prompt>[you@localhost py]$ </prompt><userinput>python argecho.py</userinput>             <co id="kgp.commandline.1.1"/>  
     
    1442 1450 <calloutlist>  
    1443 1451 <callout arearefs="kgp.commandline.1.1">  
    1444   <para>The first thing to know about <varname>sys.argv</varname> is that it contains the name of the script you're calling.  You will actually use this knowledge to your advantage later, in <xref linkend="regression"/>.  Don't worry about it for now.</para>  
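      1452 <para>The first thing to know about <varname>sys.argv</varname> is that it contains the name of the script you're calling.  You will put this knowledge to use later, in <xref linkend="regression"/>.  Don't worry about it for now.</para>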
    1444 1452 </callout>  
    1445 1453 <callout arearefs="kgp.commandline.1.2">  
    1446   <para>Command-line arguments are separated by spaces, and each shows up as a separate element in the <varname>sys.argv</varname> list.</para>  
      1454 <para>Command-line arguments are separated by spaces, and each shows up as a separate element in the <varname>sys.argv</varname> list.</para>
    1446 1454 </callout>  
    1447 1455 <callout arearefs="kgp.commandline.1.3">  
    1448   <para>Command-line flags, like <literal>--help</literal>, also show up as their own element in the <varname>sys.argv</varname> list.</para>  
      1456 <para>Command-line flags, like <literal>--help</literal>, also show up as their own element in the <varname>sys.argv</varname> list.</para>
    1448 1456 </callout>  
    1449 1457 <callout arearefs="kgp.commandline.1.4">  
    1450   <para>To make things even more interesting, some command-line flags themselves take arguments.  For instance, here you have a flag (<literal>-m</literal>) which takes an argument (<literal>kant.xml</literal>).  Both the flag itself and the flag's argument are simply sequential elements in the <varname>sys.argv</varname> list.  No attempt is made to associate one with the other; all you get is a list.</para>  
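      1458 <para>To make things even more interesting, some command-line flags themselves take arguments.  For instance, here you have a flag (<literal>-m</literal>) which takes an argument (<literal>kant.xml</literal>).  Both the flag and its argument are simply consecutive elements in the <varname>sys.argv</varname> list.  No attempt is made to associate one with the other; all you get is a list.</para>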
    1450 1458 </callout>  
    1451 1459 </calloutlist>  
    1452 1460 </example>  
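The layout of the list can also be checked without a shell. In the sketch below a hand-built list stands in for <varname>sys.argv</varname> (an assumption made so the example runs anywhere), and <literal>split_argv</literal> is a hypothetical helper name.

```python
def split_argv(argv):
    """Separate the script name (element 0) from the remaining arguments."""
    return argv[0], argv[1:]


# Assumption: this list mimics what sys.argv would hold after running
# "python argecho.py --help -m kant.xml".
script, args = split_argv(["argecho.py", "--help", "-m", "kant.xml"])

# The flag and its argument are just neighbours in the list; nothing
# records that kant.xml belongs to -m.
flag_position = args.index("-m")
flag_argument = args[flag_position + 1]
```

Pairing <literal>-m</literal> with <literal>kant.xml</literal> by position, as the last two lines do, is the bookkeeping that quickly becomes tedious once a program accepts more than one or two flags.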
    1453   <para>So as you can see, you certainly have all the information passed on the command line, but then again, it doesn't look like it's going to be all that easy to actually use it.  For simple programs that only take a single argument and have no flags, you can simply use <literal>sys.argv[1]</literal> to access the argument.  There's no shame in this; I do it all the time.  For more complex programs, you need the &getopt_modulename; module.</para>  
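      1461 <para>So as you can see, you certainly have all the information passed on the command line, but it doesn't look like it will be all that easy to actually use.  For simple programs that take only a single argument and have no flags, you can simply use <literal>sys.argv[1]</literal> to access the argument.  There's no shame in this; I do it all the time.  For more complex programs, you need the &getopt_modulename; module.</para>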
    1453 1461 <example>  
    1454   <title>Introducing &getopt_modulename;</title>  
      1462 <title>Introducing &getopt_modulename;</title>
    1454 1462 <programlisting>  
    1455 1463 &kgp_maindef;  
     
    1473 1481 <calloutlist>  
    1474 1482 <callout arearefs="kgp.commandline.2.1">  
    1475   <para>First off, look at the bottom of the example and notice that you're calling the <function>main</function> function with <literal>sys.argv[1:]</literal>.  Remember, <literal>sys.argv[0]</literal> is the name of the script that you're running; you don't care about that for command-line processing, so you chop it off and pass the rest of the list.</para>  
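      1483 <para>First off, look at the bottom of the example and notice that you're calling the <function>main</function> function with <literal>sys.argv[1:]</literal>.  Remember, <literal>sys.argv[0]</literal> is the name of the script that you're running; you don't care about that for command-line processing, so you chop it off and pass the rest of the list.</para>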
    1475 1483 </callout>  
    1476 1484 <callout arearefs="kgp.commandline.2.2">  
    1477   <para>This is where all the interesting processing happens.  The &getopt_functionname; function of the &getopt_modulename; module takes three parameters: the argument list (which you got from <literal>sys.argv[1:]</literal>), a string containing all the possible single-character command-line flags that this program accepts, and a list of longer command-line flags that are equivalent to the single-character versions.  This is quite confusing at first glance, and is explained in more detail below.</para>  
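      1485 <para>This is where all the interesting processing happens.  The &getopt_functionname; function of the &getopt_modulename; module takes three parameters: the argument list (which you got from <literal>sys.argv[1:]</literal>), a string containing all the single-character command-line flags that this program accepts, and a list of longer command-line flags that are equivalent to the single-character versions.  This is quite confusing at first glance, and is explained in more detail below.</para>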
    1477 1485 </callout>  
    1478 1486 <callout arearefs="kgp.commandline.2.3">  
    1479   <para>If anything goes wrong trying to parse these command-line flags, &getopt_modulename; will raise an exception, which you catch.  You told &getopt_modulename; all the flags you understand, so this probably means that the end user passed some command-line flag that you don't understand.</para>  
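      1487 <para>If anything goes wrong while parsing these command-line flags, &getopt_modulename; raises an exception, which you catch.  You told &getopt_modulename; all the flags you understand, so this probably means that the end user passed some command-line flag that you don't understand.</para>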
    1479 1487 </callout>  
    1480 1488 <callout arearefs="kgp.commandline.2.4">  
    1481   <para>As is standard practice in the &unix; world, when the script is passed flags it doesn't understand, you print out a summary of proper usage and exit gracefully.  Note that I haven't shown the <function>usage</function> function here.  You would still need to code that somewhere and have it print out the appropriate summary; it's not automatic.</para>  
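      1489 <para>As is standard practice in the &unix; world, when the script is passed flags it doesn't understand, you print a summary of proper usage and exit gracefully.  Note that I haven't shown the <function>usage</function> function here.  You would still need to write it somewhere and have it print the appropriate summary; it's not automatic.</para>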
    1481 1489 </callout>  
    1482 1490 </calloutlist>  
    1483 1491 </example>  
    1484   <para>So what are all those parameters you pass to the &getopt_functionname; function?  Well, the first one is simply the raw list of command-line flags and arguments (not including the first element, the script name, which you already chopped off before calling the <function>main</function> function).  The second is the list of short command-line flags that the script accepts.</para>  
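      1492 <para>So what are all those parameters you pass to the &getopt_functionname; function?  Well, the first one is simply the raw list of command-line flags and arguments (not including the first element, the script name, which you already chopped off before calling the <function>main</function> function).  The second is the list of short command-line flags that the script accepts.</para>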
    1484 1492 <variablelist>  
    1485 1493 <title><literal>"hg:d"</literal></title>  
     
    1502 1510 </varlistentry>  
    1503 1511 </variablelist>  
    1504   <para>The first and third flags are simply standalone flags; you specify them or you don't, and they do things (print help) or change state (turn on debugging).  However, the second flag (<literal>-g</literal>) <emphasis>must</emphasis> be followed by an argument, which is the name of the grammar file to read from.  In fact it can be a filename or a web address, and you don't know which yet (you'll figure it out later), but you know it has to be <emphasis>something</emphasis>.  So you tell &getopt_modulename; this by putting a colon after the <literal>g</literal> in that second parameter to the &getopt_functionname; function.</para>  
    1505   <para>To further complicate things, the script accepts either short flags (like <literal>-h</literal>) or long flags (like <literal>--help</literal>), and you want them to do the same thing.  This is what the third parameter to &getopt_functionname; is for, to specify a list of the long flags that correspond to the short flags you specified in the second parameter.</para>  
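      1512 <para>The first and third flags are simply standalone flags; you specify them or you don't, and they do things (print help) or change state (turn on debugging).  However, the second flag (<literal>-g</literal>) <emphasis>must</emphasis> be followed by an argument, which is the name of the grammar file to read from.  In fact it can be a filename or a web address, and you don't know which yet (you'll figure it out later), but you know it has to be <emphasis>something</emphasis>.  So you tell &getopt_modulename; this by putting a colon after the <literal>g</literal> in the second parameter to the &getopt_functionname; function.</para>
      1513 <para>To further complicate things, the script accepts either short flags (like <literal>-h</literal>) or long flags (like <literal>--help</literal>), and you want them to do the same thing.  This is what the third parameter to &getopt_functionname; is for: it specifies a list of the long flags that correspond to the short flags you specified in the second parameter.</para>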
    1506 1514 <variablelist>  
    1507 1515 <title><literal>["help", "grammar="]</literal></title>  
     
    1515 1523 </varlistentry>  
    1516 1524 </variablelist>  
    1517   <para>Three things of note here:</para>  
      1525 <para>Three things to note here:</para>
    1517 1525 <orderedlist>  
    1518   <listitem><para>All long flags are preceded by two dashes on the command line, but you don't include those dashes when calling &getopt_functionname;.  They are understood.</para></listitem>  
    1519   <listitem><para>The <literal>--grammar</literal> flag must always be followed by an additional argument, just like the <literal>-g</literal> flag.  This is notated by an equals sign, <literal>"grammar="</literal>.</para></listitem>  
    1520   <listitem><para>The list of long flags is shorter than the list of short flags, because the <literal>-d</literal> flag does not have a corresponding long version.  This is fine; only <literal>-d</literal> will turn on debugging.  But the order of short and long flags needs to be the same, so you'll need to specify all the short flags that <emphasis>do</emphasis> have corresponding long flags first, then all the rest of the short flags.</para></listitem>  
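      1526 <listitem><para>All long flags start with two dashes on the command line, but you don't include those dashes when calling &getopt_functionname;. They are understood.</para></listitem>
      1527 <listitem><para>The <literal>--grammar</literal> flag must always be followed by an additional argument, just like the <literal>-g</literal> flag. This is indicated by an equals sign: <literal>"grammar="</literal>.</para></listitem>
      1528 <listitem><para>The list of long flags is shorter than the list of short flags, because the <literal>-d</literal> flag has no corresponding long version. That is fine; only <literal>-d</literal> will turn on debugging. But the order of short and long flags must be the same, so specify the short flags that <emphasis>do</emphasis> have long versions first, followed by the remaining short flags.</para></listitem>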
    1521 1529 </orderedlist>  
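The two flag specifications described above can be tried in isolation. The following is a minimal sketch, not the code from the script itself: the sample command line (`argv`) and the file names in it are made up for illustration, and the syntax assumes Python 3.

```python
import getopt

# Hypothetical command line, with the script name already removed.
argv = ["-d", "--grammar=kant.xml", "-h", "source.xml"]

# "hg:d": -h and -d stand alone; the colon after g means -g needs an argument.
# ["help", "grammar="]: --help stands alone; the equals sign means
# --grammar needs an argument. No leading dashes are written here.
opts, args = getopt.getopt(argv, "hg:d", ["help", "grammar="])

print(opts)  # list of (flag, argument) tuples, in command-line order
print(args)  # whatever is left over after the flags
```

Here `opts` comes back as `[('-d', ''), ('--grammar', 'kant.xml'), ('-h', '')]` and `args` as `['source.xml']`. The returned flags keep their dashes even though the specifications omit them.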
    1522   <para>Confused yet?  Let's look at the actual code and see if it makes sense in context.</para>  
      1530 <para>Confused yet? Let's look at the actual code and see whether it makes sense in context.</para>
    1522 1530 <example>  
    1523   <title>Handling command-line arguments in &kgp_filename;</title>  
      1531 <title>Handling command-line arguments in &kgp_filename;</title>
    1523 1531 <programlisting>  
    1524 1532 &kgp_maindef; <co id="kgp.commandline.3.0"/>  
     
    1548 1556 <calloutlist>  
    1549 1557 <callout arearefs="kgp.commandline.3.0">  
    1550   <para>The <varname>grammar</varname> variable will keep track of the grammar file you're using.  You initialize it here in case it's not specified on the command line (using either the <literal>-g</literal> or the <literal>--grammar</literal> flag).</para>  
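      1558 <para>The <varname>grammar</varname> variable keeps track of the grammar file you are using. You initialize it here in case it is not specified on the command line (with either the <literal>-g</literal> or the <literal>--grammar</literal> flag).</para>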
    1550 1558 </callout>  
    1551 1559 <callout arearefs="kgp.commandline.3.1">  
    1552   <para>The <varname>opts</varname> variable that you get back from &getopt_functionname; contains a list of tuples: <varname>flag</varname> and <varname>argument</varname>.  If the flag doesn't take an argument, then <varname>arg</varname> will simply be an empty string.  This makes it easier to loop through the flags.</para>
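      1560 <para>The <varname>opts</varname> variable that you get back from &getopt_functionname; contains a list of tuples (<varname>flag</varname> and <varname>argument</varname>). If the flag doesn't take an argument, <varname>arg</varname> is simply an empty string. This makes it easier to loop through the flags.</para>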
    1552 1560 </callout>  
    1553 1561 <callout arearefs="kgp.commandline.3.2">  
    1554   <para>&getopt_functionname; validates that the command-line flags are acceptable, but it doesn't do any sort of conversion between short and long flags.  If you specify the <literal>-h</literal> flag, <varname>opt</varname> will contain <literal>"-h"</literal>; if you specify the <literal>--help</literal> flag, <varname>opt</varname> will contain <literal>"--help"</literal>.  So you need to check for both.</para>  
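      1562 <para>&getopt_functionname; validates that the command-line flags are acceptable, but it does not convert between short and long flags. If you specify the <literal>-h</literal> flag, <varname>opt</varname> will contain <literal>"-h"</literal>; if you specify the <literal>--help</literal> flag, <varname>opt</varname> will contain <literal>"--help"</literal>. So you need to check for both.</para>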
    1554 1562 </callout>  
    1555 1563 <callout arearefs="kgp.commandline.3.3">  
    1556   <para>Remember, the <literal>-d</literal> flag didn't have a corresponding long flag, so you only need to check for the short form.  If you find it, you set a global variable that you'll refer to later to print out debugging information.  (I used this during the development of the script.  What, you thought all these examples worked on the first try?)</para>  
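      1564 <para>Remember, the <literal>-d</literal> flag has no corresponding long flag, so you only need to check for the short form. If you find it, you set a global variable that you'll refer to later to print out debugging information. (I used this during the development of the script. What, you thought all these examples worked on the first try?)</para>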
    1556 1564 </callout>  
    1557 1565 <callout arearefs="kgp.commandline.3.4">  
    1558   <para>If you find a grammar file, either with a <literal>-g</literal> flag or a <literal>--grammar</literal> flag, you save the argument that followed it (stored in <varname>arg</varname>) into the <varname>grammar</varname> variable, overwriting the default that you initialized at the top of the <function>main</function> function.</para>  
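      1566 <para>If you find a grammar file, given with either a <literal>-g</literal> flag or a <literal>--grammar</literal> flag, you save the argument that followed it (stored in <varname>arg</varname>) into the <varname>grammar</varname> variable, overwriting the default that you initialized at the top of the <function>main</function> function.</para>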
    1558 1566 </callout>  
    1559 1567 <callout arearefs="kgp.commandline.3.5">  
    1560   <para>That's it.  You've looped through and dealt with all the command-line flags.  That means that anything left must be command-line arguments.  These come back from the &getopt_functionname; function in the <varname>args</varname> variable.  In this case, you're treating them as source material for the parser.  If there are no command-line arguments specified, <varname>args</varname> will be an empty list, and <varname>source</varname> will end up as the empty string.</para>  
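      1568 <para>That's it. You've looped through and dealt with all the command-line flags. That means anything left must be command-line arguments. These come back from the &getopt_functionname; function in the <varname>args</varname> variable. In this case, you're treating them as source material for the parser. If no command-line arguments are specified, <varname>args</varname> will be an empty list, and <varname>source</varname> will end up as the empty string.</para>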
    1560 1568 </callout>  
    1561 1569 </calloutlist>  
     
    1570 1578 <section id="kgp.alltogether">  
    1571 1579 <?dbhtml filename="scripts_and_streams/all_together.html"?>  
    1572   <title>Putting it all together</title>  
      1580 <title>Putting it all together</title>
    1572 1580 <abstract>  
    1573 1581 <title/>  
    1574   <para>You've covered a lot of ground.  Let's step back and see how all the pieces fit together.</para>  
      1582 <para>You've covered a lot of ground. Let's step back and see how all the pieces fit together.</para>
    1574 1582 </abstract>  
    1575   <para>To start with, this is a script that <link linkend="kgp.commandline">takes its arguments on the command line</link>, using the &getopt_modulename; module.</para>  
      1583 <para>To start with, this is a script that <link linkend="kgp.commandline">takes its arguments on the command line</link>, using the &getopt_modulename; module.</para>
    1575 1583 <informalexample>  
    1576 1584 <programlisting>  
     
    1587 1595 ...</programlisting>  
    1588 1596 </informalexample>  
    1589   <para>You create a new instance of the <classname>KantGenerator</classname> class, and pass it the grammar file and source that may or may not have been specified on the command line.</para>  
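      1597 <para>You create a new instance of the <classname>KantGenerator</classname> class and pass it the grammar file and the source, either of which may or may not have been specified on the command line.</para>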
    1589 1597 <informalexample>  
    1590 1598 <programlisting>  
    1591 1599 &kgp_createkantgenerator;</programlisting>  
    1592 1600 </informalexample>  
    1593   <para>The <classname>KantGenerator</classname> instance automatically loads the grammar, which is an &xml; file.  You use your custom &openanything_functionname; function to open the file (which <link linkend="kgp.openanything">could be stored in a local file or a remote web server</link>), then use the built-in &minidom_modulename; parsing functions to <link linkend="kgp.parse">parse the &xml; into a tree of &python; objects</link>.</para>  
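      1601 <para>The <classname>KantGenerator</classname> instance automatically loads the grammar, which is an &xml; file. You use your custom &openanything_functionname; function to open the file (which <link linkend="kgp.openanything">could be stored in a local file or on a remote web server</link>), then use the built-in &minidom_modulename; parsing functions to <link linkend="kgp.parse">parse the &xml; into a tree of &python; objects</link>.</para>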
    1593 1601 <informalexample>  
    1594 1602 <programlisting>  
     
    1600 1608 &kgp_loadclose;</programlisting>  
    1601 1609 </informalexample>  
    1602   <para>Oh, and along the way, you take advantage of your knowledge of the structure of the &xml; document to <link linkend="kgp.cache">set up a little cache of references</link>, which are just elements in the &xml; document.</para>  
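      1610 <para>Oh, and along the way, you use your knowledge of the structure of the &xml; document to <link linkend="kgp.cache">set up a small cache of references</link>, which are just elements in the &xml; document.</para>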
    1602 1610 <informalexample>  
    1603 1611 <programlisting>  
     
    1607 1615 &kgp_refid;</programlisting>  
    1608 1616 </informalexample>  
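As a rough illustration of the loading and caching steps, the sketch below parses a tiny made-up grammar with `xml.dom.minidom` and indexes its `ref` elements by `id`. The grammar text and the variable names are assumptions for this example only, and the real script reads the grammar through its own file-opening helper rather than from a string.

```python
from xml.dom import minidom

# A made-up miniature grammar; the real grammar file is much larger.
GRAMMAR = """<grammar>
  <ref id="greeting"><p>hello</p><p>hi</p></ref>
  <ref id="farewell"><p>bye</p></ref>
</grammar>"""

# Parse the XML into a tree of Python objects.
doc = minidom.parseString(GRAMMAR)

# Cache every <ref> element under its id so later lookups are a dict access.
refs = {}
for ref in doc.getElementsByTagName("ref"):
    refs[ref.getAttribute("id")] = ref

print(sorted(refs))
```

The cache holds the element objects themselves, so anything stored in it stays part of the parsed tree.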
    1609   <para>If you specified some source material on the command line, you use that; otherwise you rip through the grammar looking for the "top-level" reference (that isn't referenced by anything else) and use that as a starting point.</para>  
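      1617 <para>If you specified some source material on the command line, you use that; otherwise you scan the grammar for the "top-level" reference (the one that isn't referenced by anything else) and use it as the starting point.</para>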
    1609 1617 <informalexample>  
    1610 1618 <programlisting>  
     
    1618 1626 &kgp_returndefaultsource;</programlisting>  
    1619 1627 </informalexample>  
    1620   <para>Now you rip through the source material.  The source material is also &xml;, and you parse it one node at a time.  To keep the code separated and more maintainable, you use <link linkend="kgp.handler">separate handlers for each node type</link>.</para>  
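      1628 <para>Now you work through the source material. The source material is also &xml;, and you parse it one node at a time. To keep the code separated and more maintainable, you use <link linkend="kgp.handler">separate handlers for each node type</link>.</para>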
    1620 1628 <informalexample>  
    1621 1629 <programlisting>  
     
    1625 1633 &kgp_handlermethod;</programlisting>  
    1626 1634 </informalexample>  
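The per-node-type handlers can be sketched with `getattr`. The class below is a simplified stand-in, not the script's `KantGenerator`: the class name, the `pieces` list, and the choice to skip node types that have no handler are assumptions made for this illustration.

```python
from xml.dom import minidom


class NodeDispatcher:
    """Route each node to a parse_<ClassName> method, if one exists."""

    def __init__(self):
        self.pieces = []

    def parse(self, node):
        # minidom node classes are named Document, Element, Text, and so on.
        handler = getattr(self, "parse_%s" % node.__class__.__name__, None)
        if handler is not None:
            handler(node)

    def parse_Document(self, node):
        self.parse(node.documentElement)

    def parse_Element(self, node):
        for child in node.childNodes:
            self.parse(child)

    def parse_Text(self, node):
        self.pieces.append(node.data)


dispatcher = NodeDispatcher()
dispatcher.parse(minidom.parseString("<p>Hello, <b>world</b></p>"))
print("".join(dispatcher.pieces))
```

Supporting a new node type then means adding one method with the matching name, with no change to `parse` itself.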
    1627   <para>You bounce through the grammar, <link linkend="kgp.child">parsing all the children</link> of each &pnode; element,</para>  
      1635 <para>You bounce through the grammar, <link linkend="kgp.child">parsing all the children</link> of each &pnode; element,</para>
    1627 1635 <informalexample>  
    1628 1636 <programlisting>  
     
    1633 1641 &kgp_parsep;</programlisting>  
    1634 1642 </informalexample>  
    1635   <para>replacing &choicenode; elements with a random child,</para>  
      1643 <para>replacing &choicenode; elements with a random child,</para>
    1635 1643 <informalexample>  
    1636 1644 <programlisting>  
     
    1639 1647 &kgp_parsechoice;</programlisting>  
    1640 1648 </informalexample>  
    1641   <para>and replacing &xrefnode; elements with a random child of the corresponding &refnode; element, which you previously cached.</para>  
      1649 <para>and replacing &xrefnode; elements with a random child of the corresponding &refnode; element, which you previously cached.</para>
    1641 1649 <informalexample>  
    1642 1650 <programlisting>  
     
    1646 1654 &kgp_parsexref;</programlisting>  
    1647 1655 </informalexample>  
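Picking a random child, as both of these replacement steps do, might look like the following. The inline XML and the variable names are invented for the example; filtering on `ELEMENT_NODE` is one way to skip whitespace text nodes and is not necessarily how the script does it.

```python
import random
from xml.dom import minidom

# A made-up <choice> element with three alternatives.
doc = minidom.parseString(
    "<choice><p>one</p><p>two</p><p>three</p></choice>"
)
choice = doc.documentElement

# Keep only element children; whitespace between tags would show up as Text nodes.
elements = [c for c in choice.childNodes if c.nodeType == c.ELEMENT_NODE]

# Select one alternative at random and show its text.
picked = random.choice(elements)
print(picked.firstChild.data)
```

The same selection applies to a cached reference element: look the element up by `id`, then choose among its element children.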
    1648   <para>Eventually, you parse your way down to plain text,</para>  
      1656 <para>Eventually, you parse your way down to plain text,</para>
    1648 1656 <informalexample>  
    1649 1657 <programlisting>  
     
    1654 1662 &kgp_appendnormal;</programlisting>  
    1655 1663 </informalexample>  
    1656   <para>which you print out.</para>  
      1664 <para>which you print out.</para>
    1656 1664 <informalexample>  
    1657 1665 <programlisting>  
    1665 1673 <section id="kgp.summary">  
    1666 1674 <?dbhtml filename="scripts_and_streams/summary.html"?>  
    1667   <title>Summary</title>  
      1675 <title>Summary</title>
    1667 1675 <abstract>  
    1668 1676 <title/>  
    1669   <para>&python; comes with powerful libraries for parsing and manipulating &xml; documents.  The &minidom_modulename; takes an &xml; file and parses it into &python; objects, providing for random access to arbitrary elements.  Furthermore, this chapter shows how &python; can be used to create a "real" standalone command-line script, complete with command-line flags, command-line arguments, error handling, even the ability to take input from the piped result of a previous program.</para>  
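      1677 <para>&python; comes with powerful libraries for parsing and manipulating &xml; documents. The &minidom_modulename; module takes an &xml; file and parses it into &python; objects, providing random access to arbitrary elements. Furthermore, this chapter shows how &python; can be used to create a "real" standalone command-line script, complete with command-line flags, command-line arguments, error handling, and even the ability to take input piped from a previous program.</para>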
    1669 1677 </abstract>  
    1670   <para>Before moving on to the next chapter, you should be comfortable doing all of these things:</para>  
      1678 <para>Before moving on to the next chapter, you should be comfortable doing all of these things:</para>
    1670 1678 <itemizedlist>  
    1671   <listitem><para><link linkend="kgp.stdio">Chaining programs</link> with standard input and output</para></listitem>  
    1672   <listitem><para><link linkend="kgp.handler">Defining dynamic dispatchers</link> with &getattr;.</para></listitem>  
    1673   <listitem><para><link linkend="kgp.commandline">Using command-line flags</link> and validating them with &getopt_modulename;</para></listitem>  
      1679 <listitem><para><link linkend="kgp.stdio">Chaining programs</link> with standard input and output</para></listitem>
      1680 <listitem><para><link linkend="kgp.handler">Defining dynamic dispatchers</link> with &getattr;</para></listitem>
      1681 <listitem><para><link linkend="kgp.commandline">Using command-line flags</link> and validating them with &getopt_modulename;</para></listitem>
    1674 1682 </itemizedlist>  
    1675 1683 </section>