Contents of /web/ocaml.xml

Revision 1918 - (show annotations)
Tue Jul 10 19:31:33 2007 UTC (5 years, 10 months ago) by abate
File MIME type: text/xml
File size: 43588 byte(s)
[r2006-11-24 13:25:15 by afrisch] ocamlduce 3.09.2pl2

Original author: afrisch
Date: 2006-11-24 13:25:15+00:00
1 <?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
2 <page name="ocaml">
3
4 <title>OCamlDuce</title>
5
6 <left>
7 <local-links href="index,documentation"/>
8 <p>On this page:</p>
9 <boxes-toc/>
10 </left>
11
12 <box>
13
14 <p>
15 OCamlDuce is a merger between <a
16 href="http://caml.inria.fr/">OCaml</a> and
17 <local href="index">CDuce</local>. It comes as a modified
18 version of OCaml which integrates CDuce features: XML expressions,
19 regular expression types and patterns, iterators.
20 </p>
21
22 <p>
23 OCamlDuce is distributed under the Q Public License version 1.0.
24 </p>
25
26 <ul>
27 <li>A <a
28 href="papers/ocamlduce_icfp.pdf">technical
29 report</a> describes the theory behind OCamlDuce's type system (to be
30 presented in ICFP 2006).</li>
31 <li><local href="ocaml_install">How to get OCamlDuce:</local> download,
32 installation instructions, packages.</li>
33 <li><local href="ocaml_manual">User's manual</local>.</li>
34 <li><local href="ocaml_code">Code samples and
35 applications</local>.</li>
36 <li><local href="mailing">Mailing lists</local>.</li>
37 </ul>
38
39 </box>
40
41 <page name="ocaml_install">
42 <title>Getting OCamlDuce</title>
43
44 <box title="Download and installation" link="install">
45
46 <p>
47 Currently, OCamlDuce
48 is based on OCaml 3.09.2 and CDuce 0.4.0.
49 </p>
50
51 <ul>
52 <li><a
53 href="http://gallium.inria.fr/~frisch/ocamlcduce/download/ocamlduce-3.08.4pl5.tar.gz">Compiler,
54 version 3.08.4, patch level 5</a> (to be used with OCaml 3.08.4)</li>
55 <li><a
56 href="http://gallium.inria.fr/~frisch/ocamlcduce/download/ocamlduce-3.09.1pl1.tar.gz">Compiler,
57 version 3.09.1, patch level 1</a> (to be used with OCaml 3.09.1)</li>
58 <li><a
59 href="http://gallium.inria.fr/~frisch/ocamlcduce/download/ocamlduce-3.09.2pl2.tar.gz">Compiler,
60 version 3.09.2, patch level 2</a> (to be used with OCaml 3.09.2)</li>
61 </ul>
62
63 <p>
64 The following describes the installation procedure for the
65 3.09.2 release.
66 OCamlDuce is installed on top of an existing OCaml
67 installation (whose version number must match) and it requires
68 a recent version of findlib. The build procedure
69 is: <tt>make all &amp;&amp; make opt &amp;&amp; make
70 install</tt>. The configuration is taken from OCaml's
71 <tt>Makefile.config</tt>.
72 </p>
73
74 <p>
75 The tools are named <tt>ocamlduce, ocamlducec, ocamlduceopt,
76 ocamlducedep, ocamlducemktop, ocamlducefind</tt>.
77 They are installed in the same directory as the OCaml compiler itself.
78 </p>
79
80 <p>
81 In addition, a library called <tt>ocamlduce.cma/.cmxa</tt> is built.
82 It depends on the <tt>nums</tt> library. A findlib package named
83 <tt>ocamlduce</tt> is created by the <tt>make install</tt> target.
84 Normally, you don't need to care about the package unless you
85 insist on linking your modules with the regular OCaml compilers (not
86 OCamlDuce), but there is no good reason to do so.
87 </p>
88
89 <p>
90 To generate the ocamldoc documentation for the <tt>Ocamlduce</tt>
91 module: <tt>make htdoc</tt>.
92 </p>
93
94 <section title="Compiling, linking, calling the toplevel">
95
96 <p>Starting from OCamlDuce 3.09.2, you don't need to struggle with
97 extra command-line options. You must simply use the OCamlDuce tools:</p>
98
99 <sample>
100 {{Call the toplevel:}} ocamlduce
101 {{Compile:}} ocamlducec -c x.ml
102 {{Link:}} ocamlducec -o x x.cmo
103 {{Use ocamlfind:}} ocamlducefind ocamlc -o x -linkpkg -package pcre x.ml
104 </sample>
105
106 </section>
107
108
109 <section title="Building from the CVS">
110
111 <p>
112 The following commands will extract the current development version of
113 OCamlDuce (from OCaml and CDuce CVS repositories):
114 </p>
115
116 <sample>
117 cvs -f -d ":pserver:anoncvs@camlcvs.inria.fr:/caml" co -r cducetrunk ocaml
118 cvs -f -d ":pserver:anonymous@cvs.cduce.org:/cvsroot" co cduce
119 (cd ocaml/cduce; make link)
120 </sample>
121
122 </section>
123
124 </box>
125
126 <box title="Ports and packages" link="ports">
127
128 <section title="GODI">
129
130 <p>
131 There is a <tt>godi-ocamlduce</tt> package available in GODI
132 (sections 3.08 and 3.09).
133 </p>
134
135 </section>
136
137 <section title="DarwinPorts and OpenBSD">
138
139 <p>
140 Anil Madhavapeddy contributed two ports of OCamlDuce for DarwinPorts
141 (in dports/lang/ocamlduce) and for OpenBSD (in ports/lang/ocamlduce).
142 </p>
143
144 </section>
145
146 </box>
147
148 </page>
149
150 <page name="ocaml_manual">
151 <title>OCamlDuce: manual</title>
152
153 <box title="Overview" link="overview">
154
155 <p>
156 The goal of the OCamlDuce project is to extend the OCaml language with features
157 to make it easier to write safe and efficient complex applications
158 that need to deal with XML documents. In particular, it relies
159 on a notion of types and patterns to guarantee statically
160 that all the possible input documents are correctly processed, and
161 that only valid output documents are produced.
162 </p>
163
164 <p>
165 In a nutshell, OCamlDuce extends OCaml with a new kind of values
166 (<em>x-values</em>) to represent XML documents, fragments, tags, Unicode
167 strings. In order to describe these values, it also extends the type algebra
168 with so-called <em>x-types</em>. The philosophy behind these types is that they
169 represent <em>sets of x-values</em>. They can be very precise: indeed,
170 each value can be seen as a singleton type (a set with a single
171 value), and it is possible to form Boolean combinations of x-types
172 (intersection, union, difference).
173 </p>
174
175 <p>
176 OCamlDuce's type system can be understood as a refinement of OCaml.
177 For each sub-expression which is inferred to be of the x-kind (using
178 OCaml's unification-based type system), OCamlDuce will try to infer the
179 best possible sound x-type. Here, best means smallest for the natural
180 subtyping relation (set inclusion). The inference algorithm is
181 actually a data-flow analysis: the x-type will collect all the values
182 that can be produced by the expression, considering all the possible
183 data flows in the program. It is sometimes necessary to provide
184 explicit type annotations to help the type checker infer this type, in
185 particular when you define recursive functions or when you use
186 iterators.
187 </p>
188
189 <p>
190 Subtyping is implicit for x-types: if an expression is inferred to be
191 of x-type <code>t</code>, which is a subtype of <code>s</code>, then
192 it is possible to use this expression in any context which expects a
193 value of type <code>s</code>.
194 </p>
195
196 </box>
197
198 <box title="Getting started" link="start">
199
200 <p>
201 Most of the new language features are enclosed within double curly braces
202 <code>{{ON}}{{...}}</code>. For instance, the following code sample
203 defines a value <code>x</code> as an XML element (with tag
204 <code>a</code>, an attribute <code>href</code>, and a simple
205 string as content):
206 </p>
207
208 <sample><![CDATA[{{ON}}
209 # let x = {{ <a href="http://www.cduce.org">['CDuce'] }};;
210 val x : {{<a href=[ 'http://www.cduce.org' ]>[ 'CDuce' ]}} =
211 {{<a href="http://www.cduce.org">[ 'CDuce' ]}}
212 ]]></sample>
213
214 <p>
215 What appears between the curly braces is called an x-expression.
216 Similarly, there are x-types (as seen above), and also x-patterns.
217 The delimiters <code>{{ON}}{{...}}</code> are only used
218 for syntactical reasons, to avoid clashes between OCaml and CDuce
219 syntaxes and lexical conventions. As a matter of fact,
220 an OCaml expression need not be a syntactical x-expression
221 (delimited by double curly braces) to evaluate to an x-value.
222 For instance, once <code>x</code> has been declared as above,
223 the expression <code>x</code> evaluates to an x-value.
224 </p>
225
226
227 <p>
228 It is possible to use an arbitrary
229 OCaml expression as part of an x-expression: it must simply be
230 protected by a new pair of double curly braces. For instance, there is
231 no <code>if-then-else</code> construction for x-expressions, but you
232 can write:
233 </p>
234
235 <sample><![CDATA[{{ON}}
236 # {{ <a href={{if true then {{"a"}} else {{"z"}}}}>[] }};;
237 - : {{<a href=[ 'a' | 'z' ]>[ ]}} = {{<a href="a">[ ]}}
238 ]]></sample>
239
240 <p>
241 Only the highlighted parts are parsed as x-expressions. The
242 <code>if-then-else</code> sub-expression is parsed as an OCaml
243 expression, but its type is an x-type (namely <code>{{ON}}{{[ 'a' |
244 'z' ]}}</code>).
245 </p>
246
247 </box>
248
249 <box title="X-values" link="values">
250
251 <p>
252 X-values are intended to represent XML documents and fragments
253 thereof: elements, tags, text, sequences. In this section, we
254 present the x-value algebra, the syntax of the corresponding
255 x-expression constructors and the associated x-types.
256 </p>
257
258 <p>
259 There are three kinds of atomic x-values:
260 </p>
261 <ul>
262 <li>Unicode characters;</li>
263 <li>qualified names;</li>
264 <li>arbitrarily large integers.</li>
265 </ul>
266
267 <section title="Characters">
268
269 <p>
270 X-characters are different from OCaml characters. They can represent
271 the range of Unicode codepoints defined in the XML specification.
272 Character literals are delimited by single quotes. The escape
273 sequences \n, \r, \t, \b, \', \&quot;, \\ are recognized as usual. The
274 numerical escape sequences are written <code>\n;</code> where n is an integer
275 literal (note the extra semi-colon). The source code is interpreted as
276 being encoded in iso-8859-1. As a consequence, Unicode characters which are not
277 part of the Latin1 character set must be introduced with this
278 numerical escape mechanism. The x-types for x-characters are:
279 </p>
280 <ul>
281 <li>singletons;</li>
282 <li>intervals, written <code>c -- d</code>, where <code>c</code> and
283 <code>d</code> are literals (example: <code>{{ON}}type t = {{ 'a'--'z'
284 }}</code>);</li>
285 <li>the type of all x-characters, written <code>Char</code>;</li>
286 <li>the type of all Latin1 characters, written <code>Latin1Char</code>
287 (defined as <code>\0; -- \255;</code>).</li>
288 </ul>
289
290 </section>
291
292 <section title="Integers">
293
294 <p>
295 X-integers are arbitrarily large. Literals must be written in decimal.
296 Negative literals must be enclosed in parentheses, e.g. <code>(-3)</code>.
297 The x-types for x-integers are:
298 </p>
299 <ul>
300 <li>singletons;</li>
301 <li>intervals, written <code>i -- j</code>, where <code>i</code> and
302 <code>j</code> are literals (example: <code>{{ON}}type t = {{ 10--20
303 }}</code>); it is possible to replace <code>i</code> or <code>j</code>
304 with <code>**</code> to define open-ended intervals, e.g.
305 <code>{{ON}}type pos = {{ 1 -- ** }}</code>;
306 </li>
307 <li>the type of all x-integers, written <code>Int</code>;</li>
308 <li>the type of all the integers which can be represented by a
309 signed 32 (resp. 64) bit machine word, written <code>Int32</code> (resp.
310 <code>Int64</code>).</li>
311 </ul>
312
313 </section>
314
315 <section title="Qualified names">
316
317 <p>
318 Qualified names are intended to represent XML tag names. Conceptually,
319 they are made of a namespace URI and a local name. Since URIs tend
320 to be long, literals are of the form <code>`prefix:local</code>
321 where <code>local</code> is the local name and <code>prefix</code>
322 is a <em>namespace prefix</em> bound to some URI (in the scope of the
323 literal). The local name follows the definitions from
324 the XML Namespaces specification; a dot character must be protected
325 by a backslash and non-Latin1 characters are written as character
326 literals <code>\n;</code>. <a href="#ns">See below</a> for an
327 explanation of how to bind prefixes to URIs. To refer
328 to the default namespace (or the absence of namespace if no default
329 has been defined), the syntax is simply <code>`local</code>.
330 The x-types for qualified names are:
331 </p>
332 <ul>
333 <li>singletons;</li>
334 <li>the type of all qualified names, written <code>Atom</code>;</li>
335 <li>the type of all qualified names from a specified namespace,
336 written <code>`ns:*</code>.</li>
337 </ul>
338 </section>
339
340 <section title="Records">
341
342 <p>
343 X-records are mainly used to represent the set of attributes of an XML
344 element. An x-record is a binding from a finite set of <em>labels</em>
345 to x-values. Labels follow the same syntax as for qualified names
346 without the leading backquote. However, if the namespace prefix is not
347 given, the default namespace does not apply (the namespace URI is
348 empty). The syntax for record x-expressions is <code> { l1=e1
349 ... ln=en }</code> where the <code>li</code> are labels and the
350 <code>ei</code> are x-expressions. Fields can also be separated with a
351 semi-colon. It is legal to omit the expression for a field; the label is then
352 taken as the content of the field (a value with this name must be
353 defined in the current scope), e.g.: <code>{{ON}}let x = ... and y = ...
354 in {{ {x y z=3} }}</code> is equivalent to <code>{{ON}}let x = ... and
355 y = ... in {{ {x=x y=y z=3} }}</code>. The types for x-records specify
356 which labels are authorized/mandatory, and what the types of the
357 corresponding fields are. There are two kinds of record x-types:
358 </p>
359
360 <ul>
361 <li>
362 Closed record types, which only allow a finite number of fields:
363 <code>{ l1=t1 ... ln=tn }</code>;
364 </li>
365 <li>
366 Open record types, which allow additional fields (with arbitrary
367 type):
368 <code>{ l1=t1 ... ln=tn .. }</code> (the final two dots are
369 part of the syntax).
370 </li>
371 </ul>
372
373 <p>
374 In both cases, it is possible to make one of
375 the fields optional by changing = to =?.
376 </p>
377
378 <p>
379 The x-type of all x-records is thus <code>{ .. }</code>,
380 and the x-type of x-records with maybe a field <code>l</code>
381 of type <code>Int</code> and maybe arbitrary other fields is
382 <code>{ l=?Int .. }</code>.
383 </p>
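
<p>
As a minimal sketch of the record x-types described above, the
following declarations use only the syntax introduced in this section.
The names <code>attrs</code>, <code>a</code> and <code>b</code> are
hypothetical and chosen for illustration; the toplevel output is
deliberately omitted. Both values should be accepted because the
record type is open and its field <code>l</code> is optional.
</p>

<sample><![CDATA[{{ON}}
(* Hypothetical names: an open record type with an optional field l *)
type attrs = {{ { l=?Int .. } }}

(* The field l is present and holds an integer *)
let a : attrs = {{ { l=3 } }}

(* The field l is absent; the extra field m is allowed by the final .. *)
let b : attrs = {{ { m="abc" } }}
]]></sample>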
384
385 </section>
386
387 <section title="Sequences">
388
389 <p>
390 X-sequences are finite and ordered collections of x-values.
391 The syntax for a sequence x-expression is
392 <code>[ e1 ... en ]</code> (note that elements are <em>not</em> separated
393 by semi-colons as in OCaml lists). Each item <code>ei</code>
394 can either be:
395 </p>
396 <ul>
397 <li>an x-expression;</li>
398 <li><code>!e</code> where <code>e</code> is an x-expression which
399 evaluates to a sequence (whose content is inserted in the sequence
400 which is currently defined); e.g.
401 <code>let x = [ 2 3 ] in [ 1 !x 4 ]</code> is equivalent to
402 <code>[ 1 2 3 4 ]</code>;</li>
403 <li>a string literal delimited by single quotes; e.g.
404 <code>[ 'abc' ]</code> is equivalent to <code>[ 'a' 'b' 'c' ]</code>.</li>
405 </ul>
406
407 <p>
408 X-types for sequences are of the form <code>[R]</code>
409 where <code>R</code> is a regular expression over x-types which
410 describes the possible contents of the sequences. The possible
411 forms of regular expressions are:
412 </p>
413
414 <ul>
415 <li><code>t</code> (one single element of x-type <code>t</code>)</li>
416 <li><code>R*</code> (zero or more repetitions)</li>
417 <li><code>R+</code> (one or more repetitions)</li>
418 <li><code>R?</code> (zero or one repetition)</li>
419 <li><code>R1 R2</code> (sequence)</li>
420 <li><code>R1|R2</code> (alternation)</li>
421 <li><code>(R)</code></li>
422 <li><code>/t</code> (guard: the tail of the sequence must comply with
423 <code>t</code>).</li>
424 <li><code>PCDATA</code> (equivalent to Char*).</li>
425 </ul>
426
427 <note>Sequences are actually encoded with embedded pairs and a
428 terminator, and sequence types are encoded with product types and
429 recursive types. The encoding is available to the programmer
430 but not described in this manual.
431 </note>
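
<p>
The sequence constructors and regular expression types above can be
sketched together as follows. The names <code>x</code> and
<code>y</code> are hypothetical, and the annotation on <code>y</code>
is only there to show a regular expression type in use; this is an
illustration under those assumptions, not output copied from a session.
</p>

<sample><![CDATA[{{ON}}
(* A two-element sequence of integers *)
let x = {{ [ 2 3 ] }}

(* !x splices the content of x; 'ab' stands for the two characters 'a' 'b'.
   The resulting sequence is [ 1 2 3 'a' 'b' ], which is described by
   the regular expression type [ Int+ Char* ]. *)
let y : {{ [ Int+ Char* ] }} = {{ [ 1 !x 'ab' ] }}
]]></sample>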
432
433 </section>
434
435 <section title="Strings">
436
437 <p>
438 Strings are nothing but sequences of characters. There are two
439 predefined types <code>String</code> and <code>Latin1</code>
440 (defined as <code>[ Char* ]</code> and <code>[ Latin1Char* ]</code>).
441 </p>
442
443 <p>
444 A string literal <code>[ '...' ]</code> can also be written
445 <code>"..." </code> (without the square brackets). Note that single
446 (resp. double) quotes need to be escaped only when the string is
447 delimited with double (resp. single) quotes.
448 </p>
449
450 </section>
451
452 <section title="XML elements">
453
454 <p>
455 An XML element is a triple of x-values. The syntax for
456 the corresponding x-expression constructor is
457 <code><![CDATA[<(e1) (e2)>e3]]></code>. When <code>e1</code> is a
458 qualified name literal, it is possible to omit the leading
459 backquote and the surrounding parentheses. Similarly,
460 when <code>e2</code> is an x-record literal, it is possible
461 to omit the curly braces and the parentheses. For instance,
462 one can simply write <code><![CDATA[<a href="abc">['def']]]></code>
463 instead of <code><![CDATA[<(`a) ({href="abc"})>['def']]]></code>.
464 </p>
465
466 <p>
467 XML element x-types are written <code><![CDATA[<(t1) (t2)>t3]]></code>,
468 and the same simplifications apply. For instance, if
469 the namespace prefix <code>ns</code> has been defined,
470 the following is a legal x-type <code><![CDATA[<ns:* ..>[]]]></code>;
471 it describes XML elements whose tag is in the namespace bound to
472 <code>ns</code>, with an empty content, and with an arbitrary set of
473 attributes. An underscore in place of <code>(t1)</code> is
474 equivalent to <code>(Atom)</code> (any tag).
475 </p>
476
477 </section>
478
479 </box>
480
481 <box title="X-expressions" link="expr">
482
483 <p>
484 In the previous section, we have seen the syntax for x-value
485 constructors (constant literals, sequence, record, element constructors).
486 In this section, we describe the other kinds of x-expressions.
487 </p>
488
489 <section title="Binary infix operators">
490
491 <p>
492 The arithmetic operators on integers follow the usual precedence.
493 They are written <code>+,*,-,div,mod</code> (they are all infix).
494 </p>
495
496 <p>
497 Record concatenation: <code>e1 ++ e2</code>. The x-expressions
498 <code>e1</code> and <code>e2</code> must evaluate to x-records.
499 The result is obtained by concatenating them. If a field with the same
500 label is present in both records, the right-most one is selected.
501 </p>
502
503 <p>
504 Sequence concatenation: <code>e1 @ e2</code>, equivalent
505 to <code>[!e1 !e2]</code>.
506 </p>
507
508 </section>
509
510 <section title="Projections, filtering">
511
512 <p>
513 If the x-expression <code>e</code> evaluates to a record or an XML
514 element, the construction <code>e.l</code> will extract the value of
515 field or attribute <code>l</code>. Similarly, the construction
516 <code>e.?l</code> will extract the value of field or attribute
517 <code>l</code> if present, and return the empty sequence
518 <code>[]</code> otherwise.
519 </p>
520
521 <p>
522 If the x-expression <code>e</code> evaluates to a record,
523 the construction <code>e -. l</code> will produce a new record
524 where the field <code>l</code> has been removed (if present).
525 </p>
526
527 <p>
528 If the x-expression <code>e</code> evaluates to an x-sequence,
529 the construction <code>e/</code> will result in a new x-sequence
530 obtained by taking in order all the children of the XML elements
531 from the sequence <code>e</code>. For instance, the x-expression
532 <code><![CDATA[[<a>[ 1 2 3 ] 4 5 <b>[ 6 7 8 ] ]/]]></code>
533 evaluates to the x-value <code>[ 1 2 3 6 7 8 ]</code>.
534 </p>
535
536 <p>
537 If the x-expression <code>e</code> evaluates to an x-sequence,
538 the construction <code>e.(t)</code> (where <code>t</code> is an
539 x-type) will result in a new x-sequence
540 obtained by filtering <code>e</code> to keep only the elements
541 of type <code>t</code>. For instance, the x-expression
542 <code><![CDATA[[<a>[ 1 2 3 ] 4 5 <b>[ 6 7 8 ] ].(Int)]]></code>
543 evaluates to the x-value <code>[ 4 5 ]</code>.
544 </p>
545 </section>
546
547 <section title="Dynamic type checking">
548
549 <p>
550 If <code>e</code> is an x-expression and <code>t</code> is an x-type,
551 the construction <code>(e :? t)</code> returns the same
552 result as <code>e</code> if it has type <code>t</code>, and otherwise
553 raises a <code>Failure</code> exception whose argument explains
554 why this is not the case.
555 </p>
556
557 <sample><![CDATA[{{ON}}
558 # let f (x : {{ Any }}) = {{ (x :? <a>[ Int* ] ) }} in
559 f {{ <a>[ 1 2 '3' ] }};;
560 Exception:
561 Failure
562 "Value <a>[ 1 2 '3' ] does not match type <a>[ Int* ]\nValue '3' does not match type Int\n".
563 ]]></sample>
564 </section>
565
566 <section title="Pattern matching">
567
568 <p>
569 OCamlDuce comes with a powerful pattern matching operation.
570 X-patterns are described <a href="#patterns">below</a>.
571 The syntax for the pattern matching operation is:
572 <code>match e with p1 -> e1 | ... | pn -> en</code>.
573 The type system ensures exhaustivity of the pattern matching
574 and infers precise types for the capture variables.
575 It is also possible to use x-pattern matching as a regular
576 OCaml expression; x-patterns must be surrounded by {{..}}, e.g.:
577 match e with {{p1}} -> e1 | ... | {{pn}} -> en
578 function {{p1}} -> e1 | ... | {{pn}} -> en
579 </p>
580
581 <p>
582 Pattern matching follows a first-match policy. The first pattern
583 that succeeds triggers the corresponding branch.
584 </p>
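
<p>
The first-match policy can be illustrated with a small sketch written
as a regular OCaml <code>match</code> with x-patterns. The function
name <code>kind</code> and its result strings are hypothetical. The
last branch uses <code>_</code>, which matches any x-value, so it is
only reached when the earlier branches have failed; placing it first
would make the other branches unreachable.
</p>

<sample><![CDATA[{{ON}}
(* Hypothetical helper: classify an arbitrary x-value *)
let kind (v : {{ Any }}) : string =
  match v with
  | {{ Int }} -> "integer"        (* any x-integer *)
  | {{ <_ ..>_ }} -> "element"    (* any tag, any attributes, any content *)
  | {{ _ }} -> "other"            (* everything not matched above *)
]]></sample>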
585
586 <note>
587 currently it is impossible to mix normal OCaml patterns and x-patterns
588 in a single pattern matching.
589 </note>
590
591 </section>
592
593 <section title="Local binding">
594
595 <p>
596 The x-expression <code>let p=e1 in e2</code> is equivalent to
597 <code>match e1 with p -> e2</code>. There is also a local binding
598 with an x-pattern in OCaml expressions: <code>let {{p}}=e1 in
599 e2</code>.
600 </p>
601
602 </section>
603
604
605 <section title="Iterators">
606
607 <p>
608 OCamlDuce comes with a sequence iterator
609 <code>map e with p1 -> e1 | ... | pn -> en</code> and
610 a tree iterator
611 <code>map* e with p1 -> e1 | ... | pn -> en</code>.
612 </p>
613
614 <p>
615 For both constructions, the argument must evaluate to a sequence.
616 The <code>map</code> iterator applies the patterns to each element
617 of this sequence in turn and produces a new sequence by concatenating
618 all the results (all the right-hand sides must thus produce a
619 sequence). The set of patterns must be exhaustive for all the possible
620 elements of the input sequence.
621 </p>
622
623 <p>
624 The tree iterator is similar except that the patterns need not be
625 exhaustive. If some element of the input sequence is not matched,
626 it is simply copied into the result unless it is an XML element. In
627 this case, the transformation is applied recursively to its content.
628 </p>
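
<p>
A minimal sketch of the sequence iterator, assuming that the explicit
annotations shown are enough for the type checker (iterators are one
of the places where annotations may be needed). The function name
<code>double</code> and the capture variable <code>n</code> are
hypothetical. Each right-hand side produces a one-element sequence,
and <code>map</code> concatenates these results.
</p>

<sample><![CDATA[{{ON}}
(* Hypothetical helper: multiply every element of an integer sequence by 2 *)
let double (l : {{ [ Int* ] }}) : {{ [ Int* ] }} =
  {{ map l with n -> [ (n * 2) ] }}
]]></sample>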
629
630 </section>
631
632 <section title="OCaml constructions">
633
634 <p>
635 As a convenience, some of the OCaml expression constructors
636 are allowed as x-expressions (without a need to go back to OCaml
637 with double curly braces): (unqualified) value identifiers <b>without
638 apostrophes</b> and
639 function calls.
640 </p>
641
642 </section>
643
644 </box>
645
646 <box title="More on x-types" link="types">
647
648 <p>
649 We have seen how to write simple x-types. We can then combine
650 them with Boolean connectives:
651 </p>
652
653 <ul>
654 <li><code>t1 &amp; t2</code>: intersection;</li>
655 <li><code>t1 | t2</code>: union;</li>
656 <li><code>t1 - t2</code>: difference.</li>
657 </ul>
658
659 <p>
660 The empty x-type is written <code>Empty</code> (it contains no value),
661 and the universal x-type is written <code>Any</code> (it contains
662 all the x-values) or <code>_</code>.
663 </p>
664
665 <p>
666 When an x-type has been bound to some OCaml identifier
667 (<code>{{ON}}type t = {{...}}</code>), it is possible to use
668 this identifier in another x-type. Recursive definitions
669 are allowed:
670 </p>
671
672 <sample><![CDATA[{{ON}}
673 type t1 = {{ <a>[ t2* ] }}
674 and t2 = {{ <b>[ t1* ] }}
675 ]]></sample>
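
<p>
Boolean connectives and named x-types can be combined in the same way.
The following declarations are a sketch using hypothetical type names
(<code>digit</code>, <code>letter</code>, <code>alnum</code>,
<code>nonzero</code>); they rely only on the interval, union and
difference notations described in this manual.
</p>

<sample><![CDATA[{{ON}}
(* Character intervals and their union *)
type digit = {{ '0'--'9' }}
type letter = {{ 'a'--'z' | 'A'--'Z' }}
type alnum = {{ digit | letter }}

(* All x-integers except the singleton 0 *)
type nonzero = {{ Int - 0 }}
]]></sample>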
676
677 <p>
678 Note that x-values are always finite and acyclic. The type checker
679 detects type definitions which would yield empty types:
680 </p>
681
682 <sample><![CDATA[{{ON}}
683 # type t = {{ <a>[ t+ ] }};;
684 This definition yields an empty type
685 ]]></sample>
686
687 <p>
688 If <code>t1</code> and <code>t2</code> are record x-types,
689 we can combine them with the infix <code>++</code> operator, which
690 mimics the corresponding operator on expressions (record
691 concatenation). Similarly, we can use the infix <code>@</code>
692 concatenation operator on sequence x-types.
693 </p>
694
695 </box>
696
697 <box title="X-patterns" link="patterns">
698
699 <p>
700 X-patterns follow the same syntax as X-types. In particular,
701 any X-type is a valid X-pattern. In addition to X-type constructors,
702 X-patterns can have:
703 </p>
704
705 <ul>
706 <li>capture variables (lowercase OCaml identifiers <b>without apostrophes</b>);</li>
707 <li>constant bindings <code>(x := c)</code> where x is a capture
708 variable and c is
709 a literal x-constant (this pattern always succeeds and returns the
710 binding x->c).</li>
711 </ul>
712
713 <p>
714 An identifier in an X-pattern can be either a reference
715 to a named X-type (if such a type declaration is in scope)
716 or a capture variable (otherwise).
717 </p>
718
719 <p>
720 Here is a brief description of the semantics of patterns. Given
721 an input value, a pattern can either succeed or fail. If it succeeds,
722 it also produces a binding from the capture variables in the pattern
723 to x-values.
724 </p>
725
726 <ul>
727
728 <li>A pattern which is just a type (no capture variable) succeeds if
729 and only if the value has the type.</li>
730
731 <li>A pattern <code>p1 | p2</code> succeeds if either <code>p1</code>
732 or <code>p2</code> succeeds, and returns the corresponding binding; if
733 both patterns succeed, <code>p1</code> wins. It is required that
734 <code>p1</code> and <code>p2</code> have the same sets of capture
735 variables. </li>
736
737 <li>A pattern <code>p1 &amp; p2</code> succeeds if both <code>p1</code>
738 and <code>p2</code> succeed, and returns the concatenation of the two
739 bindings. It is required that <code>p1</code> and <code>p2</code> have
740 <em>disjoint</em> sets of capture variables. </li>
741
742 </ul>
743
744 <p>
745 In record x-patterns, it is possible to omit the <code>=p</code> part
746 of a field. The content is then replaced with the label name
747 considered as a capture variable (or as a previously defined type).
748 E.g. <code>{ x y=p }</code> is
749 equivalent to <code>{ x=x y=p }</code>.</p>
750
751 <p>It is also possible to add an "else" clause:
752 <code>{ x = (a,_)|(a:=3) }</code>
753 will accept any record with at most the field <code>x</code>. If the content
754 is a pair, the capture variable <code>a</code> will be bound to its first component;
755 otherwise, it is set to <code>3</code>.</p>
756
757 <p>
758 In regular expressions, it is possible to extract whole subsequences
759 with the notation <code>x::R</code>, e.g.: <code>[ _* x::Int+ _* ]</code>
760 </p>
761
762 <p>
763 If the same sequence capture variable appears several times (or below a
764 repetition) in a regexp, it is bound to the concatenation of all
765 matched subsequences. E.g.: <code>[ (x::Int | _)* ]</code> will
766 collect in <code>x</code> all the elements of type <code>Int</code> from
767 a sequence. It is not legal to have repeated simple capture variables.
768 </p>
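
<p>
As a sketch of sequence capture variables, the pattern given above can
be used in a regular OCaml <code>match</code>. The function name
<code>ints</code> is hypothetical. Because <code>x</code> appears below
a repetition, it is bound to the concatenation of all the matched
subsequences, that is, to the integers of the input in their original
order.
</p>

<sample><![CDATA[{{ON}}
(* Hypothetical helper: keep only the integers of an arbitrary sequence *)
let ints (l : {{ [ Any* ] }}) =
  match l with
  | {{ [ (x::Int | _)* ] }} -> x
]]></sample>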
769
770 <p>
771 The regexp operators <code>+,*,?</code> are greedy by default (they match as long
772 as possible). They admit non-greedy variants <code>+?,*?,??</code>.
773 </p>
774 </box>
775
776 <box title="Namespace bindings" link="ns">
777
778 <p>
779 The binding of namespace prefixes to URIs
780 can be done either by toplevel phrases (structure items) or
781 by local declarations:
782 </p>
783
784 <sample>{{ON}}
785 # {{ namespace ns = "http://..." }};;
786 # let x = {{ `ns: x }};;
787 val x : {{`ns:x}} = {{`ns:x}}
788 # let x = {{ let namespace ns = "http://..." in `ns:x }};;
789 val x : {{`ns:x}} = {{`ns:x}}
790 </sample>
791
792 <p>The toplevel definitions can also appear in module interfaces
793 (signatures). A toplevel prefix binding is not exported by a module: its scope
794 is limited to the current structure or signature. It is possible
795 to specify a default namespace, and to reset it:
796 </p>
797
798 <sample>{{ON}}
799 # {{ namespace "http://..." }};;
800 # {{ `x }};;
801 - : {{`ns1:x}} = {{`ns1:x}}
802 # {{ namespace "" }};;
803 # {{ `x }};;
804 - : {{`x}} = {{`x}}
805 </sample>
806
807 <p>
808 Note that the value pretty-printer invented some prefix
809 for the namespace URI. The default prefix declaration also has a
810 local form <code> let namespace "..." in ... </code>.
811 </p>
812
813 </box>
814
815 <box title="More on type-checking" link="typecheck">
816
817 <section title="Type inference">
818
819 <p>
820 As we said above, the programmer is sometimes required to provide type
821 annotations. To know where to put these annotations, it is necessary to
822 get a basic understanding of how type-checking works.
823 </p>
824
825 <p>
826 The OCaml type-checker is run first to detect which sub-expressions
827 are of the x-kind. A second ML type-checking pass is then done to
828 introduce subsumption (implicit subtyping) steps where allowed. After
829 these two passes, the OCamlDuce type checker obtains a data-flow summary of
830 x-values in the whole compilation unit. This is a directed graph,
831 whose edges represent either simple data flow or complex operations
832 on x-values. The nodes of the graph can be thought of as x-type
833 variables. A data-flow edge corresponds to a subtyping constraint,
834 and an operation edge corresponds to a symbolic constraint which
835 mimics the corresponding operation on values.
836 </p>
837
838 <p>
839 Some of the nodes are given an explicit type by the programmer,
840 through type annotations (on expressions or function arguments)
841 or the other usual mechanism in ML (data type declarations,
842 signatures, ...).
843 </p>
844
845 <p>
846 Also, if there is a loop with only subtyping edges in the graph,
847 all the nodes on the loop are merged together.
848 </p>
849
850 <p>
851 After this operation, the graph is required to be acyclic (assuming
852 that the nodes with an explicit type are removed from the graph). It
853 is the responsibility of the programmer to provide enough type
854 annotations to achieve this property. Otherwise, a type error
855 is issued.
856 </p>
857
858 <sample><![CDATA[{{ON}}
859 # let rec f x = match x with 0 -> {{ [] }} | n -> {{ f {{n-1}} @ ['.'] }};;
860 Cycle detected: cannot type-check
861 # let rec f x : {{ String }} = match x with 0 -> {{ [] }} | n -> {{ f {{n-1}} @ ['.'] }};;
862 val f : int -> {{String}} = <fun>]]>
863 </sample>
864
865 <p>
866 In the example above, there is a cycle between the result type for
867 <code>f</code> and the type for the sub-expression <code>{{ON}}f
868 {{n-1}}</code>. It is here broken with a type annotation on the result; it could
869 have been broken by a type annotation on the expression <code>{{ON}}f
870 {{n-1}}</code>, or on the function <code>f</code> itself, or by a
871 module signature.
872 </p>
873
874 <p>
875 Let us study another simple example:
876 </p>
877
878 <sample>{{ON}}
879 # let f x = {{ x + 1 }} in f {{ 2 }}, f {{ 3 }};;
880 - : {{3--4}} * {{3--4}} = ({{3}}, {{4}})
881 </sample>
882
883 <p>
884 The type checker detects that the two x-values <code>2</code> and
885 <code>3</code> can flow to the argument of <code>f</code>. Its body
886 is thus type-checked with the assumption that <code>x</code> has type
887 <code>2--3</code>. The computed result type is then <code>3--4</code>.
888 </p>
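
<p>
This computation can be mimicked in plain OCaml by modelling integer
interval types as pairs of bounds. This is a sketch under that simplifying
assumption: the names <code>interval</code>, <code>single</code>,
<code>join</code> and <code>add</code> are hypothetical and do not reflect
the type checker's real data structures. For adjacent values such as 2
and 3, the union of the two singleton types coincides with the interval.
</p>

<sample><![CDATA[
(* Hypothetical model of an integer interval type lo--hi *)
type interval = { lo : int; hi : int }

(* The singleton type of one integer constant *)
let single n = { lo = n; hi = n }

(* Smallest interval containing both arguments: models several
   x-values flowing to the same node of the graph *)
let join a b = { lo = min a.lo b.lo; hi = max a.hi b.hi }

(* Interval of all possible sums *)
let add a b = { lo = a.lo + b.lo; hi = a.hi + b.hi }

let arg = join (single 2) (single 3)   (* the argument of f: 2--3 *)
let result = add arg (single 1)        (* the body x + 1: 3--4 *)

let () =
  assert (arg = { lo = 2; hi = 3 });
  assert (result = { lo = 3; hi = 4 })
]]></sample>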
889
890
891 <p>
892 The type-inference process described above is global by nature. The
893 acyclicity condition is only imposed after a whole compilation unit
894 has been type-checked by OCaml (and the information from the module
895 interface has been integrated). When a type variable is inferred to
896 be of the x-kind, it is never generalized. As a consequence, there
897 is no parametric polymorphism on x-types.
898 </p>
899
900 <p>
901 In the toplevel, type-checking is done after each phrase. Consider
902 the following session:
903 </p>
904
905 <sample><![CDATA[{{ON}}
906 # let f x = {{ x + 1 }};;
907 val f : {{Empty}} -> {{Empty}} = <fun>
908 # let a = f {{ 2 }};;
909 Subtyping failed 2 <= Empty
910 Sample:
911 2
912 ]]></sample>
913
914 <p>
915 The function <code>f</code> is inferred to have type
916 <code>{{ON}}{{Empty}} -> {{Empty}}</code> because when the first
917 phrase is type-checked, the data-flow graph says that no value
918 can flow to <code>x</code>, and thus the input type is empty
919 (and similarly for the result type). If the two phrases
920 were type-checked together (which would be the case if they had
921 been compiled by the compiler, not in the toplevel), the type checker
922 would have correctly inferred that the input type for <code>f</code>
923 must contain <code>2</code>.
924 </p>
925
926 </section>
927
928 <section title="Implicit subtyping">
929
930 <p>
931 Coercion from an x-type to a supertype is automatic in OCamlDuce.
932 However, this automatic subsumption does not carry over to OCaml
933 type constructors, even if they are covariant. Consider:
934 </p>
935
936 <sample><![CDATA[{{ON}}
937 # let f (x : {{ Int }} * {{ Int }}) = 1;;
938 val f : {{Int}} * {{Int}} -> int = <fun>
939 # let g (x : {{ 0 }} * {{ 0 }}) = f x;;
940 This expression has type {{0}} * {{0}} but is here used with type
941 {{Int}} * {{Int}}
942 # let g (x : {{ 0 }} * {{ 0 }}) = let a,b = x in f (a,b);;
943 val g : {{0}} * {{0}} -> int = <fun>
944 # let g (x : {{ 0 }} * {{ 0 }}) = f (x :> {{ Int }} * {{ Int }});;
945 val g : {{0}} * {{0}} -> int = <fun>
946 ]]></sample>
947
948 <p>
949 The first attempt to define <code>g</code> fails because the type for
950 <code>x</code> is not an x-type and thus subsumption does not
951 apply. In the second attempt, we extract the two components of the
952 pair; since they are inferred to be x-values, subtyping applies to
953 both of them. Thus, when the pair <code>(a,b)</code> is reconstructed,
954 it is legal to unify its type with the input type of <code>f</code>.
955 The third definition for <code>g</code> gives an alternative solution:
956 using explicit OCaml type coercions.
957 </p>
958
959 </section>
960
961 </box>
962
963 <box title="Exchanging values" link="transl">
964
965 <p>
966 OCamlDuce strongly separates regular OCaml values from the new
967 x-values. They have different syntax, expressions, types, patterns,
968 and even type-checking algorithms. This strong segregation is the key point
969 that allowed a simple integration of two very different type
970 systems.
971 </p>
972
973 <p>
974 At some point, it is still necessary to cross the frontier and
975 translate OCaml values to x-values or the opposite.
976 </p>
977
978 <p>
979 Fortunately, OCamlDuce provides automatic translations in both
980 directions. Instead of double curly braces, you can
981 enclose x-expressions in curly brace+colon <code>{: ... :}</code>
982 (here, the <code>...</code> is an x-expression).
983 The effect is to translate the result of the x-expression
984 (which must be an x-value) to an OCaml value. Similarly,
985 in an x-expression, you can obtain the x-translation of
986 an OCaml value with the same syntax <code>{: ... :}</code>
987 (here, the <code>...</code> is an OCaml expression).
988 </p>
989
990 <p>
991 Here is how the translation works. To each OCaml type <code>t</code>,
992 we associate an x-type <code>T(t)</code> and a pair of translation
993 functions between <code>t</code> and <code>T(t)</code>.
994 However, not all OCaml type features are supported. For instance,
995 free type variables, abstract types, object types, non-regular
996 recursive types cannot be translated. In particular, since
997 type variables are not allowed, the OCaml type must be fully known.
998 </p>
999
1000 <p>
1001 The translation for an OCaml type <code>t</code> is defined by structural
1002 induction on <code>t</code>. Sum types are
1003 translated to union types: a constant constructor <code>A</code> is
1004 translated to the qualified name <code>`A</code>; a non-constant
1005 constructor <code>A of t1 * ... * tn</code> is translated to
1006 <code>&lt;A>[ T(t1) ... T(tn) ]</code>. Closed polymorphic variants
1007 have the same translation. Record types are translated to closed
1008 record x-types. Some other translations:
1009 </p>
1010
1011 <table border="1">
1012 <tr><th>Caml type t</th> <th>X-type T(t)</th></tr>
1013 <tr><td><code>int</code></td> <td><code>Int</code></td></tr>
1014 <tr><td><code>int32</code></td> <td><code>Int32</code></td></tr>
1015 <tr><td><code>int64</code></td> <td><code>Int64</code></td></tr>
1016 <tr><td><code>string</code></td> <td><code>Latin1</code></td></tr>
1017 <tr><td><code>t list</code></td> <td><code>[T(t)*]</code></td></tr>
1018 <tr><td><code>t array</code></td> <td><code>[T(t)*]</code></td></tr>
1019 <tr><td><code>unit</code></td> <td><code>[]</code></td></tr>
1020 <tr><td><code>char</code></td> <td><code>Latin1Char</code></td></tr>
1021 <tr><td><code>{{t}}</code></td> <td><code>t</code></td></tr>
1022 </table>
1023
1024 <p>
1025 Here is an example:
1026 </p>
1027
1028 <sample>{{ON}}
1029 # let f (x : {{ Int }}) = {{ x + 1 }} in List.map f {: [ 1 2 3 ] :};;
1030 - : {{Int}} list = [{{2}}; {{3}}; {{4}}]
1031 </sample>
1032
1033 <p>
1034 In this example, the result type of the translation is inferred
1035 to be <code>{{ON}}{{ Int }} list</code> (because the type for
1036 <code>f</code> is given). The corresponding x-type
1037 is <code>{{ON}}{{ [Int*] }}</code>.
1038 </p>
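
<p>
As a further illustration of the rules for sum types stated above, consider
a hypothetical OCaml type <code>shape</code> (it is not part of OCamlDuce).
Applying the rules by hand suggests the x-type written in the comment; this
is a reading of the rules, not output from the compiler.
</p>

<sample><![CDATA[
(* A hypothetical sum type with one constant constructor and two
   non-constant constructors *)
type shape =
  | Origin
  | Point of int * int
  | Label of string

(* Reading of the translation rules:
     T(shape) = `Origin | <Point>[ Int Int ] | <Label>[ Latin1 ] *)
]]></sample>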
1039
1040 </box>
1041
1042 <box title="The standard library" link="stdlib">
1043
1044 <p>
1045 In OCamlDuce, the Num library from OCaml is included in the standard
1046 library. In addition, there are two new modules called
1047 <code>Ocamlduce</code> and <code>Cduce_types</code> in the standard library.
1048 </p>
1049
1050 <p>
1051 The module <code>Cduce_types</code> gives access to the internal
1052 representation of x-values. It is currently undocumented.
1053 </p>
1054
1055 <p>
1056 The module <code>Ocamlduce</code> provides several useful
1057 functions for working with x-values. See the <a href="http://yquem.inria.fr/~frisch/ocamlcduce/doc/ocamlduce/Ocamlduce.html">ocamldoc</a> generated
1058 documentation for a description of its interface.
1059 </p>
1060
1061 </box>
1062
1063 <box title="Marshaling" link="marshal">
1064
1065 <p>
1066 OCamlDuce uses some tricks in its internal representation of x-values
1067 to reduce memory usage and improve performance. You need to pay
1068 special attention if you want to use OCaml serialization functions
1069 (module <code>Marshal</code>, functions
1070 <code>input_value/output_value</code>) on x-values. In addition to
1071 your values, you also need to save and restore some internal data
1072 using the functions <code>Cduce_types.Value.extract_all</code> and
1073 <code>Cduce_types.Value.intract_all</code>. Of course, this also
1074 applies if the value to be serialized contains deeply nested x-values.
1075 </p>
1076
1077 <p>
1078 Here are generic
1079 serialization/deserialization functions that illustrate how to do it:
1080 </p>
1081
1082 <sample>
1083 let my_output_value oc v =
1084   let p = Cduce_types.Value.extract_all () in  (* get the internal data *)
1085   output_value oc (p,v)                        (* write it alongside the value *)
1086
1087 let my_input_value ic =
1088   let (p,v) = input_value ic in
1089   Cduce_types.Value.intract_all p;             (* restore the internal data *)
1090   v
1091 </sample>
1092
1093 </box>
1094
1095 <box title="Performance" link="perf">
1096
1097 <section title="Strings">
1098
1099 <p>
1100 OCaml users might be surprised by the fact that x-strings are simply
1101 represented as sequences in OCamlDuce. Does this mean that they are
1102 actually stored in memory as linked lists? Certainly not! The internal
1103 representation of sequence values uses several tricks to improve
1104 performance and memory usage. In particular, a special form in the
1105 representation can store strings as byte buffers, as in OCaml.
1106 If an XML document is loaded, or if a Caml string is converted
1107 to an x-value, this compact representation will be used.
1108 </p>
1109
1110 </section>
1111
1112 <section title="Concatenation">
1113
1114 <p>
1115 Similarly, OCaml users might be reluctant to use the sequence
1116 concatenation <code>@</code> on sequences. In OCaml, the complexity
1117 of this operator is linear in the size of its first argument (which
1118 needs to be copied). OCamlDuce uses a special form in its internal
1119 representation to store concatenation in a lazy way. The concatenation
1120 will really be computed only when the value is accessed. This means
1121 that it's perfectly ok to build a long sequence by adding
1122 new elements at the end one by one, as long as you don't
1123 simultaneously inspect the sequence.
1124 </p>
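
<p>
For comparison, here is what the same habit costs on plain OCaml lists,
where <code>@</code> is eager. The function names are hypothetical; the
sketch only illustrates the linear cost mentioned above and the usual
OCaml workaround, which the lazy concatenation of x-sequences makes
unnecessary.
</p>

<sample><![CDATA[
(* Plain OCaml lists: each step copies the whole accumulated list,
   so building n elements this way takes quadratic time. *)
let build_by_append n =
  let rec go acc i =
    if i > n then acc else go (acc @ [i]) (i + 1)
  in
  go [] 1

(* Usual OCaml idiom: prepend in constant time, reverse once at the end. *)
let build_by_cons n =
  let rec go acc i =
    if i > n then List.rev acc else go (i :: acc) (i + 1)
  in
  go [] 1

let () =
  assert (build_by_append 5 = [1; 2; 3; 4; 5]);
  assert (build_by_cons 5 = build_by_append 5)
]]></sample>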
1125
1126 </section>
1127
1128 <section title="Pattern matching">
1129
1130 <p>
1131 Another point which is worth knowing when programming in OCamlDuce
1132 is that patterns can be written in a declarative style without
1133 affecting performance. The compiler uses static type information
1134 about matched values to produce efficient code for pattern matching.
1135 To illustrate this, consider the following sample:
1136 </p>
1137
1138 <sample><![CDATA[{{ON}}
1139 x.ml:
1140
1141 type a = {{ <a>[ a* ] }}
1142 type b = {{ <b>[ b* ] }}
1143
1144 let f : {{ a|b }} -> int = function {{ a }} -> 0 | {{ _ }} -> 1
1145 ]]></sample>
1146
1147 <sample><![CDATA[{{ON}}
1148 y.ml:
1149
1150 type a = {{ <a>[ a* ] }}
1151 type b = {{ <b>[ b* ] }}
1152
1153 let f : {{ a|b }} -> int = function {{ <a>_ }} -> 0 | {{ _ }} -> 1
1154 ]]></sample>
1155
1156 <p>
1157 The two functions have exactly the same semantics, but the first
1158 implementation is more declarative: it uses type checks to distinguish
1159 between <code>a</code> and <code>b</code> instead of saying
1160 <em>how</em> to distinguish between these two types. Imagine
1161 that the definitions of these types change to:
1162 </p>
1163
1164 <sample><![CDATA[{{ON}}
1165 type a = {{ <x kind="a">[ a* ] }}
1166 type b = {{ <x kind="b">[ b* ] }}
1167 ]]></sample>
1168
1169 <p>
1170 Then the first implementation still works as expected, but the
1171 second one needs to be rewritten.</p>
1172
1173 <p>Now one might believe that the second implementation is more
1174 efficient because it tells the compiler to check only the root tag,
1175 whereas the first implementation would force
1176 the compiler to produce code to check that all tags in the tree
1177 are <code>a</code>s. But this is not what happens! Actually,
1178 you can check that the compiler will produce exactly the same code
1179 for both implementations. It considers the static type information
1180 about the argument of the pattern matching (here, the input type
1181 of the function), and computes an efficient way to evaluate
1182 patterns for the values of this type.
1183 </p>
1184
1185 </section>
1186
1187 <section title="The map iterator">
1188
1189 <p>
1190 The <code>map ... with ...</code> iterator is implemented in a
1191 tail-recursive way. You can safely use it on very long sequences.
1192 </p>
1193
1194 </section>
1195
1196 </box>
1197
1198 <box title="OCaml and OCamlDuce" link="ocaml">
1199
1200 <p>
1201 Since the 3.08.4 release, OCamlDuce is binary compatible with the corresponding
1202 OCaml release. This means that OCamlDuce can use OCaml-generated
1203 <tt>.cmi</tt> files and that it produces an OCaml-compatible
1204 <tt>.cmi</tt> file if the interface does not use any x-type
1205 (this file is equal to what would have been obtained by using OCaml).
1206 </p>
1207
1208 <p>
1209 It is thus possible to use existing libraries which were compiled for
1210 OCaml. It is also possible to use OCamlDuce to compile
1211 some modules and use them in an OCaml project provided their interface
1212 is pure OCaml.
1213 </p>
1214
1215 </box>
1216
1217 </page>
1218
1219 <page name="ocaml_code">
1220 <title>OCamlDuce: code samples and applications</title>
1221
1222 <box title="Code samples" link="code">
1223
1224 <section title="Parsing XML files">
1225
1226 <p>
1227 OCamlDuce does not come with any built-in XML parser. However,
1228 the <a href="http://yquem.inria.fr/~frisch/ocamlcduce/doc/ocamlduce/Ocamlduce.Load.html"><code>Ocamlduce.Load</code></a> module in the standard library
1229 makes it easy to plug in existing XML parsers. Here is some
1230 code that demonstrates how to do that with three of
1231 the most popular OCaml XML parser libraries:
1232 </p>
1233
1234 <ul>
1235 <li><a
1236 href="http://yquem.inria.fr/~frisch/ocamlcduce/samples/pxp/">PXP</a></li>
1237 <li><a
1238 href="http://yquem.inria.fr/~frisch/ocamlcduce/samples/expat/">Expat</a></li>
1239 <li><a href="http://yquem.inria.fr/~frisch/ocamlcduce/samples/xmllight/">Xml-light</a></li>
1240 </ul>
1241
1242 </section>
1243
1244 <section title="Converting DTD to OCamlDuce types">
1245
1246 <p>
1247 This <a href="http://yquem.inria.fr/~frisch/ocamlcduce/samples/dtd2types/">tool</a> produces a set of OCamlDuce type declarations
1248 from a DTD. It requires PXP.
1249 </p>
1250
1251 <note>This application does not use any of the new features, but it
1252 can be useful in the development of OCamlDuce applications.
1253 </note>
1254
1255 </section>
1256
1257 <section title="Parsing XML Schema, producing valid XHTML output">
1258
1259 <p>
1260 This <a
1261 href="http://yquem.inria.fr/~frisch/ocamlcduce/samples/schema/">application</a>
1262 parses XML Schema Definitions (.xsd files), and produces summaries
1263 (toplevel declaration names) in XHTML. The OCamlDuce type system ensures
1264 that the parser is coherent with the input XML type (any valid XML
1265 Schema is accepted) and that the printer is coherent with the output
1266 XML type (it is necessarily a valid XHTML document).
1267 </p>
1268
1269 <p>
1270 Of course, for such a simple transformation, parsing the XML document
1271 into an internal representation is not necessary. A direct XML-to-XML
1272 transformation would be easy to write. We wanted to illustrate
1273 a complex parsing of XML.
1274 </p>
1275
1276 <p>
1277 It is interesting to introduce errors in the parser
1278 <code>schema_loader.ml</code> or the printer
1279 <code>dump_schema.ml</code> and see how the type system catches them.
1280 </p>
1281
1282 <note>
1283 The application uses XML Light to parse XML documents.
1284 </note>
1285
1286 <note>
1287 Some features of XML Schema are not parsed, such as
1288 <code>redefine</code> elements or substitution groups.
1289 </note>
1290
1291 <note>
1292 To compile the application with the provided Makefile,
1293 you must make the environment variable <code>OCAMLFIND_CONF</code>
1294 point to the <code>$GODI/etc/findlib-ocamlduce.conf</code> file.
1295 </note>
1296
1297 </section>
1298
1299 <section title="String regular expressions">
1300
1301 <p>
1302 OCamlDuce supports regular expression types and patterns, not only
1303 for sequences of XML elements, but also for strings. The following
1304 example shows how to use regular expressions to split a string
1305 of the form <code>name1=val1,...,namen=valn</code> with
1306 <code>n>0</code> into
1307 a list of pairs <code>[ (name1,val1); ...; (namen,valn) ]</code>.
1308 The <code>*?</code> operator in regular expressions means "non-greedy
1309 match" (match the shortest possible subsequence). The last
1310 pattern describes precisely strings which are not matched by
1311 the other cases. It would be possible to replace it with
1312 the wildcard <code>_</code>.
1313 </p>
1314
1315 <sample><![CDATA[{{ON}}
1316 let rec split (s : {{ String }}) =
1317 match s with
1318 | {{ [ n::_*? '=' v::_*? ',' rest::_* ] }} -> (n,v)::(split rest)
1319 | {{ [ n::_*? '=' v::_*? ] }} -> [ (n,v) ]
1320 | {{ Any - [ _* '=' _* ] }} -> failwith "split"
1321 ]]></sample>
1322
1323 </section>
1324
1325 </box>
1326
1327 <box title="Applications in OCamlDuce" link="appli">
1328
1329 <ul>
1330 <li><a
1331 href="http://anil.recoil.org/projects/review2atom.html">Review2Atom</a>
1332 by Anil Madhavapeddy: translates paper review files in XML format into
1333 an Atom feed suitable for aggregation.
1334 </li>
1335 </ul>
1336
1337 </box>
1338
1339 </page>
1340
1341 </page>
