Diff of /web/ocaml.xml

revision 1790 by abate, Tue Jul 10 19:23:06 2007 UTC revision 1894 by abate, Tue Jul 10 19:29:58 2007 UTC
# Line 15  Line 15 
15  OCamlDuce is a merger between <a  OCamlDuce is a merger between <a
16  href="http://caml.inria.fr/">OCaml</a> and  href="http://caml.inria.fr/">OCaml</a> and
17  <local href="index">CDuce</local>. It comes as a modified  <local href="index">CDuce</local>. It comes as a modified
18  version of OCaml which integrates CDuce features: expressions, types,  version of OCaml which integrates CDuce features: XML expressions,
19  patterns.  regular expression types and patterns, iterators.
20  </p>  </p>
21    
22  <p>  <p>
23  OCamlDuce is distributed under the same licenses as Objective Caml:  OCamlDuce is distributed under the Q Public License version 1.0.
 the Q Public License version 1.0 for the Compiler, and the LGPL  
 version 2 for the Library. The extension has been written by Alain  
 Frisch. Parts of the CDuce implementation, by the same author, have  
 been reused.  
24  </p>  </p>
25    
26    <ul>
27    <li>A <a
28    href="http://cristal.inria.fr/~frisch/ocamlcduce/ocamlduce.pdf">technical
29    report</a> describes the theory behind OCamlDuce's type system (to be
30    presented in PLAN-X 2006).</li>
31    <li><local href="ocaml_install">How to get OCamlDuce:</local> download,
32    installation instructions, packages.</li>
33    <li><local href="ocaml_manual">User's manual</local>.</li>
34    <li><local href="ocaml_code">Code samples and
35    applications</local>.</li>
36    <li><local href="mailing">Mailing lists</local>.</li>
37    </ul>
38    
39  </box>  </box>
40    
41    <page name="ocaml_install">
42    <title>Getting OCamlDuce</title>
43    
44  <box title="Download and installation" link="install">  <box title="Download and installation" link="install">
45    
46  <p>  <p>
47  The build procedure for OCamlDuce is exactly the same as for OCaml:  Currently, OCamlDuce
48  <tt>configure, make world, make install</tt>. The names of the tools  is based on OCaml 3.09.2 and CDuce 0.4.0.
 are unchanged: <tt>ocaml,ocamlc,ocamlopt</tt>. Currently, OCamlDuce  
 is based on CVS snapshots of OCaml (between 3.08.3 and the current  
 <tt>release308</tt> branch) and CDuce (between 0.3.91 and the head).  
49  </p>  </p>
50    
51  <ul>  <ul>
52  <li><a  <li><a
53  href="http://pauillac.inria.fr/~frisch/ocamlcduce/download/cduce-ocaml-0.0.5.tar.gz">Compiler,  href="http://pauillac.inria.fr/~frisch/ocamlcduce/download/ocamlduce-3.08.4pl5.tar.gz">Compiler,
54  version 0.0.5</a></li>  version 3.08.4, patch level 5</a> (to be used with OCaml 3.08.4)</li>
55  <!--<li><a  <li><a
56  href="http://pauillac.inria.fr/~frisch/ocamlcduce/download/xml-support-0.0.4.tar.gz">Support  href="http://pauillac.inria.fr/~frisch/ocamlcduce/download/ocamlduce-3.09.1pl1.tar.gz">Compiler,
57  library, version 0.0.4</a></li>-->  version 3.09.1</a> (to be used with OCaml 3.09.1)</li>
58    <li><a
59    href="http://pauillac.inria.fr/~frisch/ocamlcduce/download/ocamlduce-3.09.2.tar.gz">Compiler,
60    version 3.09.2</a> (to be used with OCaml 3.09.2)</li>
61  </ul>  </ul>
62    
63  <p>  <p>
64  GODI users can upgrade an existing installation by adding this    The following describes the installation procedure for the
65  line to their <tt>etc/godi.conf</tt> file:    3.09.2 release.
66      OCamlDuce is installed on top of an existing OCaml
67      installation (whose version number must match) and it requires
68      a recent version of findlib. The build procedure
69      is: <tt>make all &amp;&amp; make opt &amp;&amp; make
70      install</tt>. The configuration is taken from OCaml's
71      <tt>Makefile.config</tt>.
72    </p>
73    
74    <p>
75      The tools are named <tt>ocamlduce, ocamlducec, ocamlduceopt,
76      ocamlducedep, ocamlducemktop, ocamlducefind</tt>.
77      They are installed in the same directory as the ocaml compiler itself.
78    </p>
79    
80    <p>
81      In addition, a library called <tt>ocamlduce.cma/.cmxa</tt> is built.
82      It depends on the <tt>nums</tt> library. A findlib package named
83      <tt>ocamlduce</tt> is created by the <tt>make install</tt> target.
84      Normally, you don't need to care about the package unless you
85      insist on linking your modules with the regular OCaml compilers (not
86      OCamlDuce), but there is no good reason to do so.
87    </p>
88    
89    <p>
90      To generate the ocamldoc documentation for the <tt>Ocamlduce</tt>
91      module: <tt>make htdoc</tt>.
92    </p>
93    
94    <section title="Compiling, linking, calling the toplevel">
95    
96    <p>Starting from OCamlDuce 3.09.2, you don't need to struggle with
97    extra command-line options. You must simply use the OCamlDuce tools:</p>
98    
99    <sample>
100    {{Call the toplevel:}} ocamlduce
101    {{Compile:}}           ocamlducec -c x.ml
102    {{Link:}}              ocamlducec -o x x.cmo
103    {{Use ocamlfind:}}     ocamlducefind ocamlc -o x -linkpkg -package pcre x.ml
104    </sample>
105    
106    </section>
107    
108    
109    <section title="Building from the CVS">
110    
111    <p>
112    The following commands will extract the current development version of
113     OCamlDuce (from OCaml and CDuce CVS repositories):
114  </p>  </p>
115    
116  <sample>  <sample>
117  GODI_BUILD_SITES += http://pauillac.inria.fr/~frisch/ocamlcduce/godi          cvs -f -d ":pserver:anoncvs@camlcvs.inria.fr:/caml" co -r cducetrunk ocaml
118            cvs -f -d ":pserver:anonymous@cvs.cduce.org:/cvsroot" co cduce
119            (cd ocaml/cduce; make link)
120  </sample>  </sample>
121    
122    </section>
123    
124    </box>
125    
126    <box title="Ports and packages" link="ports">
127    
128    <section title="GODI">
129    
130  <p>  <p>
131  and by forcing a recompilation of the <tt>godi-ocaml-src</tt>    There is a <tt>godi-ocamlduce</tt> package available in GODI
132  and <tt>godi-ocaml</tt> packages. <!--They should also build    (sections 3.08 and 3.09).
 the <tt>godi-xml-support</tt> library.-->  
133  </p>  </p>
134    
135  <!--  </section>
136    
137    <section title="DarwinPorts and OpenBSD">
138    
139  <p>  <p>
140  Some simple examples can be found <a -->  Anil Madhavapeddy contributed two ports of OCamlDuce for DarwinPorts
141  <!--href="http://pauillac.inria.fr/~frisch/ocamlcduce/tests/">here</a>.</p>  (in dports/lang/ocamlduce) and for OpenBSD (in ports/lang/ocamlduce).
142  -->  </p>
143    
144    </section>
145    
146  </box>  </box>
147    
148    </page>
149    
150    <page name="ocaml_manual">
151    <title>OCamlDuce: manual</title>
152    
153  <box title="Overview" link="overview">  <box title="Overview" link="overview">
154    
155  <p>  <p>
156    The goal of the OCamlDuce project is to extend the OCaml language with features
157    to make it easier to write safe and efficient complex applications
158    that need to deal with XML documents. In particular, it relies
159    on a notion of types and patterns to guarantee statically
160    that all the possible input documents are correctly processed, and
161    that only valid output documents are produced.
162    </p>
163    
164    <p>
165  In a nutshell, OCamlDuce extends OCaml with a new kind of values  In a nutshell, OCamlDuce extends OCaml with a new kind of values
166  (<em>x-values</em>) to represent XML documents, fragments, tags, Unicode  (<em>x-values</em>) to represent XML documents, fragments, tags, Unicode
167  strings. In order to describe these values, it also extends the type algebra  strings. In order to describe these values, it also extends the type algebra
# Line 488  Line 578 
578     function {{p1}} -> e1 | ... | {{pn}} -> en     function {{p1}} -> e1 | ... | {{pn}} -> en
579  </p>  </p>
580    
581    <p>
582    Pattern matching follows a first-match policy. The first pattern
583    that succeeds triggers the corresponding branch.
584    </p>
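The first-match policy can be illustrated with a minimal sketch in plain OCaml (not OCamlDuce syntax), on the assumption that ordinary OCaml pattern matching behaves the same way in this respect. The function name `classify` is hypothetical.

```ocaml
(* Plain OCaml, not OCamlDuce: the first two arms both accept 0,
   and the first one that succeeds is the one that fires. *)
let classify (n : int) : string =
  match n with
  | 0 -> "zero"
  | n when n >= 0 -> "non-negative" (* also accepts 0, but is tried second *)
  | _ -> "negative"

let () =
  assert (classify 0 = "zero");
  assert (classify 5 = "non-negative");
  assert (classify (-1) = "negative")
```

Reordering the first two arms would change the result for 0, which is why the order of branches matters under this policy.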
585    
586  <note>  <note>
587  currently it is impossible to mix normal OCaml patterns and x-patterns  currently it is impossible to mix normal OCaml patterns and x-patterns
588  in a single pattern matching.  in a single pattern matching.
# Line 539  Line 634 
634  <p>  <p>
635  As a convenience, some of the OCaml expression constructors  As a convenience, some of the OCaml expression constructors
636  are allowed as x-expressions (without a need to go back to OCaml  are allowed as x-expressions (without a need to go back to OCaml
637  with double curly braces): (unqualified) value identifiers and  with double curly braces): (unqualified) value identifiers <b>without
638    apostrophes</b> and
639  function calls.  function calls.
640  </p>  </p>
641    
# Line 607  Line 703 
703  </p>  </p>
704    
705  <ul>  <ul>
706  <li>capture variables (lowercase OCaml identifiers);</li>  <li>capture variables (lowercase OCaml identifiers <b>without apostrophes</b>);</li>
707  <li>constant bindings <code>(x := c)</code> where x is a capture  <li>constant bindings <code>(x := c)</code> where x is a capture
708    variable and c is    variable and c is
709    a literal x-constant (this pattern always succeeds and returns the    a literal x-constant (this pattern always succeeds and returns the
# Line 615  Line 711 
711  </ul>  </ul>
712    
713  <p>  <p>
714  In record x-patterns, it is possible to omit the <code>=p</code> part of a field.  An identifier in an X-pattern can be either a reference
715  The content is then replaced with the label name considered as  to a named X-type (if such a type declaration is in scope)
716  a capture variable. E.g.  <code>{ x y=p }</code> is equivalent to  or a capture variable (otherwise).
717  <code>{ x=x y=p }</code>.</p>  </p>
718    
719    <p>
720    Here is a brief description of the semantics of patterns. Given
721    an input value, a pattern can either succeed or fail. If it succeeds,
722    it also produces a binding from the capture variables in the pattern
723    to x-values.
724    </p>
725    
726    <ul>
727    
728    <li>A pattern which is just a type (no capture variable) succeeds if
729    and only if the value has the type.</li>
730    
731    <li>A pattern <code>p1 | p2</code> succeeds if either <code>p1</code>
732    or <code>p2</code> succeeds, and returns the corresponding binding; if
733    both patterns succeed, <code>p1</code> wins. It is required that
734    <code>p1</code> and <code>p2</code> have the same sets of capture
735    variables. </li>
736    
737    <li>A pattern <code>p1 &amp; p2</code> succeeds if both <code>p1</code>
738    and <code>p2</code> succeed, and returns the concatenation of the two
739    bindings. It is required that <code>p1</code> and <code>p2</code> have
740    <em>disjoint</em> sets of capture variables. </li>
741    
742    </ul>
743    
744    <p>
745    In record x-patterns, it is possible to omit the <code>=p</code> part
746    of a field.  The content is then replaced with the label name
747    considered as a capture variable (or as a previously defined type).
748     E.g.  <code>{ x y=p }</code> is
749    equivalent to <code>{ x=x y=p }</code>.</p>
750    
751  <p>It is also possible to add an "else" clause:  <p>It is also possible to add an "else" clause:
752  <code>{ x = (a,_)|(a:=3) }</code>  <code>{ x = (a,_)|(a:=3) }</code>
# Line 636  Line 764 
764  repetition) in a regexp, it is bound to the concatenation of all  repetition) in a regexp, it is bound to the concatenation of all
765  matched subsequences. E.g.: <code>[ (x::Int | _)* ]</code> will  matched subsequences. E.g.: <code>[ (x::Int | _)* ]</code> will
766  collect in <code>x</code> all the elements of type <code>Int</code> from  collect in <code>x</code> all the elements of type <code>Int</code> from
767  a sequence.</p>  a sequence. It is not legal to have repeated simple capture variables.
768    </p>
769    
770  <p>  <p>
771  The regexp operators <code>+,*,?</code> are greedy by default (they match as long  The regexp operators <code>+,*,?</code> are greedy by default (they match as long
# Line 931  Line 1060 
1060    
1061  </box>  </box>
1062    
1063  <box title="Code samples" link="code">  <box title="Marshaling" link="marshal">
1064    
1065    <p>
1066    OCamlDuce uses some tricks on its internal representation of x-values
1067    to reduce memory usage and improve performance. You need to pay
1068    special attention if you want to use OCaml serialization functions
1069    (module <code>Marshal</code>, functions
1070    <code>input_value/output_value</code>) on x-values. In addition to
1071    your values, you also need to save and restore some piece of internal data
1072    using the functions <code>Cduce_types.Value.extract_all</code> and
1073    <code>Cduce_types.Value.intract_all</code>. Of course, this also
1074    applies if the value to be serialized contains deeply nested x-values.
1075    </p>
1076    
1077    <p>
1078    Here are generic
1079    serialization/deserialization functions that illustrate how to do it:
1080    </p>
1081    
1082    <sample>
1083    let my_output_value oc v =
1084      let p = Cduce_types.Value.extract_all () in
1085      output_value oc (p,v)
1086    
1087    let my_input_value ic =
1088      let (p,v) = input_value ic in
1089      Cduce_types.Value.intract_all p;
1090      v
1091    </sample>
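The same save-and-restore shape can be tried out in plain OCaml, without OCamlDuce installed. In the sketch below, `side_table` is a hypothetical stand-in for the internal data handled by `extract_all` and `intract_all`; it is not the real CDuce structure, and `save` and `load` are illustrative names.

```ocaml
(* Hypothetical stand-in for the internal data that must travel
   with the value; the real structure in CDuce is different. *)
let side_table : (int, string) Hashtbl.t = Hashtbl.create 16

(* Serialize the side data together with the value, as a pair. *)
let save (v : 'a) : string =
  let p = Hashtbl.fold (fun k s acc -> (k, s) :: acc) side_table [] in
  Marshal.to_string (p, v) []

(* Deserialize the pair, restore the side data, then return the value. *)
let load (s : string) : 'a =
  let (p, v) : (int * string) list * 'a = Marshal.from_string s 0 in
  Hashtbl.reset side_table;
  List.iter (fun (k, x) -> Hashtbl.replace side_table k x) p;
  v

let () =
  Hashtbl.replace side_table 1 "atom";
  let s = save [1; 2; 3] in
  Hashtbl.reset side_table; (* simulate a fresh process *)
  let (v : int list) = load s in
  assert (v = [1; 2; 3]);
  assert (Hashtbl.find side_table 1 = "atom")
```

As with `input_value`, the result of `load` is not type-checked against the stored data, so the caller must annotate the expected type.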
1092    
1093    </box>
1094    
1095    <box title="Performance" link="perf">
1096    
1097    <section title="Strings">
1098    
1099    <p>
1100    OCaml users might be surprised by the fact that x-strings are simply
1101    represented as sequences in OCamlDuce. Does this mean that they are
1102    actually stored in memory as linked lists? Certainly not!  The internal
1103    representation of sequence values uses several tricks to improve
1104    performance and memory usage. In particular, a special form in the
1105    representation can store strings as byte buffers, as in OCaml.
1106    If an XML document is loaded, or if a Caml string is converted
1107    to an x-value, this compact representation will be used.
1108    </p>
1109    
1110    </section>
1111    
1112    <section title="Concatenation">
1113    
1114    <p>
1115    Similarly, OCaml users might be reluctant to use the sequence
1116    concatenation <code>@</code> on sequences. In OCaml, the complexity
1117    of this operator is linear in the size of its first argument (which
1118    needs to be copied). OCamlDuce uses a special form in its internal
1119    representation to store concatenation in a lazy way. The concatenation
1120    will really be computed only when the value is accessed. This means
1121    that it's perfectly ok to build a long sequence by adding
1122    new elements at the end one by one, as long as you don't
1123    simultaneously inspect the sequence.
1124    </p>
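The idea of a lazily stored concatenation can be sketched in plain OCaml. This is an illustration only, under the assumption that a suspended concatenation node is flattened on access; it is not OCamlDuce's actual internal representation, and the names `seq`, `append` and `to_list` are hypothetical.

```ocaml
(* Illustration only: not OCamlDuce's real representation. *)
type 'a seq =
  | Nil
  | Cons of 'a * 'a seq
  | Concat of 'a seq * 'a seq (* suspended concatenation *)

(* Constant time: only records that the two sequences are joined. *)
let append (a : 'a seq) (b : 'a seq) : 'a seq = Concat (a, b)

(* Flattens the sequence when it is finally inspected. An explicit
   work stack keeps the traversal tail-recursive and visits each
   node once, so the cost is linear in the number of nodes. *)
let to_list (s : 'a seq) : 'a list =
  let rec go acc stack =
    match stack with
    | [] -> List.rev acc
    | Nil :: rest -> go acc rest
    | Cons (x, tl) :: rest -> go (x :: acc) (tl :: rest)
    | Concat (a, b) :: rest -> go acc (a :: b :: rest)
  in
  go [] [s]

let () =
  (* Build a sequence by adding elements at the end, one by one. *)
  let s = ref Nil in
  for i = 1 to 5 do
    s := append !s (Cons (i, Nil))
  done;
  assert (to_list !s = [1; 2; 3; 4; 5])
```

In this sketch each `append` costs a single allocation, and the linear work is paid once in `to_list`, which mirrors the advice above to avoid inspecting the sequence while it is still being built.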
1125    
1126    </section>
1127    
1128    <section title="Pattern matching">
1129    
1130    <p>
1131    Another point which is worth knowing when programming in OCamlDuce
1132    is that patterns can be written in a declarative style without
1133    affecting performance. The compiler uses static type information
1134    about matched values to produce efficient code for pattern matching.
1135    To illustrate this, consider the following sample:
1136    </p>
1137    
1138    <sample><![CDATA[{{ON}}
1139    x.ml:
1140    
1141    type a = {{ <a>[ a* ] }}
1142    type b = {{ <b>[ b* ] }}
1143    
1144    let f : {{ a|b }} -> int = function {{ a }} -> 0 | {{ _ }} -> 1
1145    ]]></sample>
1146    
1147    <sample><![CDATA[{{ON}}
1148    y.ml:
1149    
1150    type a = {{ <a>[ a* ] }}
1151    type b = {{ <b>[ b* ] }}
1152    
1153    let f : {{ a|b }} -> int = function {{ <a>_ }} -> 0 | {{ _ }} -> 1
1154    ]]></sample>
1155    
1156    <p>
1157    The two functions have exactly the same semantics, but the first
1158    implementation is more declarative: it uses type checks to distinguish
1159    between <code>a</code> and <code>b</code> instead of saying
1160    <em>how</em> to distinguish between these two types. Imagine
1161    that the definitions of these types change to:
1162    </p>
1163    
1164    <sample><![CDATA[{{ON}}
1165    type a = {{ <x kind="a">[ a* ] }}
1166    type b = {{ <x kind="b">[ b* ] }}
1167    ]]></sample>
1168    
1169    <p>
1170    Then the first implementation still works as expected, but the
1171    second one needs to be rewritten.</p>
1172    
1173    <p>Now one might believe that the second implementation is more
1174    efficient because it tells the compiler to check only the root tag,
1175    whereas the first implementation would force
1176    the compiler to produce code to check that all tags in the tree
1177    are <code>a</code>s. But this is not what happens! Actually,
1178    you can check that the compiler will produce exactly the same code
1179    for both implementations. It considers the static type information
1180    about the argument of the pattern matching (here, the input type
1181    of the function), and computes an efficient way to evaluate
1182    patterns for the values of this type.
1183    </p>
1184    
1185    </section>
1186    
1187    <section title="The map iterator">
1188    
1189    <p>
1190    The <code>map ... with ...</code> iterator is implemented in a
1191    tail-recursive way. You can safely use it on very long sequences.
1192    </p>
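For comparison, here is a minimal sketch of a tail-recursive map over ordinary OCaml lists. It only illustrates why tail recursion makes long inputs safe; it says nothing about how the `map ... with ...` iterator is implemented internally, and `map_tail` is a hypothetical name.

```ocaml
(* List.rev_map is tail-recursive, so composing it with List.rev
   maps a list in constant stack space. *)
let map_tail (f : 'a -> 'b) (l : 'a list) : 'b list =
  List.rev (List.rev_map f l)

let () =
  let big = List.init 1_000_000 (fun i -> i) in
  let r = map_tail (fun x -> x + 1) big in
  assert (List.length r = 1_000_000);
  assert (List.hd r = 1)
```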
1193    
1194    </section>
1195    
1196    </box>
1197    
1198    <box title="OCaml and OCamlDuce" link="ocaml">
1199    
1200    <p>
1201    Since the 3.08.4 release, OCamlDuce is binary compatible with the corresponding
1202    OCaml release. This means that OCamlDuce can use OCaml-generated
1203    <tt>.cmi</tt> files and that it produces an OCaml-compatible
1204    <tt>.cmi</tt> file if the interface does not use any x-type
1205    (this file is equal to what would have been obtained by using OCaml).
1206    </p>
1207    
1208    <p>
1209    It is thus possible to use existing libraries which were compiled for
1210    OCaml. It is also possible to use OCamlDuce to compile
1211    some modules and use them in an OCaml project provided their interface
1212    is pure OCaml.
1213    </p>
1214    
1215    </box>
1216    
1217    </page>
1218    
1219    <page name="ocaml_code">
1220    <title>OCamlDuce: code samples and applications</title>
1221    
1222    <box title="Code samples" link="code">
1223    
1224  <section title="Parsing XML files">  <section title="Parsing XML files">
1225    
# Line 989  Line 1276 
1276  <p>  <p>
1277  It is interesting to introduce errors in the parser  It is interesting to introduce errors in the parser
1278  <code>schema_loader.ml</code> or the printer  <code>schema_loader.ml</code> or the printer
1279  <code>dump_schema.ml</code> and see how the type system catch them.  <code>dump_schema.ml</code> and see how the type system catches them.
1280  </p>  </p>
1281    
1282  <note>  <note>
# Line 1001  Line 1288 
1288  <code>redefine</code> elements or substitution groups.  <code>redefine</code> elements or substitution groups.
1289  </note>  </note>
1290    
1291    <note>
1292    To compile the application with the provided Makefile,
1293    you must make the environment variable <code>OCAMLFIND_CONF</code>
1294    point to the <code>$GODI/etc/findlib-ocamlduce.conf</code> file.
1295    </note>
1296    
1297    </section>
1298    
1299    <section title="String regular expressions">
1300    
1301    <p>
1302    OCamlDuce supports regular expression types and patterns, not only
1303    for sequences of XML elements, but also for strings. The following
1304    example shows how to use regular expressions to split a string
1305    of the form <code>name1=val1,...,namen=valn</code> with
1306    <code>n>0</code> into
1307    a list of pairs <code>[ (name1,val1); ...; (namen,valn) ]</code>.
1308    The <code>*?</code> operator in regular expressions means ``ungreedy
1309    match'' (match the shortest possible subsequence). The last
1310    pattern describes precisely strings which are not matched by
1311    the other cases. It would be possible to replace it with
1312    the wildcard <code>_</code>.
1313    </p>
1314    
1315    <sample><![CDATA[{{ON}}
1316    let rec split (s : {{ String }}) =
1317      match s with
1318        | {{ [ n::_*? '=' v::_*? ',' rest::_* ] }} -> (n,v)::(split rest)
1319        | {{ [ n::_*? '=' v::_*? ] }} -> [ (n,v) ]
1320        | {{ Any - [ _* '=' _* ] }} -> failwith "split"
1321    ]]></sample>
1322    
1323  </section>  </section>
1324    
1325  </box>  </box>
1326    
1327    <box title="Applications in OCamlDuce" link="appli">
1328    
1329    <ul>
1330    <li><a
1331    href="http://anil.recoil.org/projects/review2atom.html">Review2Atom</a>
1332    by Anil Madhavapeddy: translates paper review files in XML format into
1333    an Atom feed suitable for aggregation.
1334    </li>
1335    </ul>
1336    
1337    </box>
1338    
1339    </page>
1340    
1341  </page>  </page>
