Diff of /web/ocaml.xml

revision 1791 by abate, Tue Jul 10 19:23:09 2007 UTC revision 1792 by abate, Tue Jul 10 19:23:12 2007 UTC
# Line 497  Line 497 
497     function {{p1}} -> e1 | ... | {{pn}} -> en     function {{p1}} -> e1 | ... | {{pn}} -> en
498  </p>  </p>
499    
500    <p>
501    Pattern matching follows a first-match policy. The first pattern
502    that succeeds triggers the corresponding branch.
503    </p>
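
The first-match policy described above behaves like ordinary OCaml matching. A minimal sketch in plain OCaml (not x-pattern syntax; `classify` is a hypothetical name used only for illustration) with two overlapping branches:

```ocaml
(* Plain OCaml illustration of first-match: branches are tried top to
   bottom, and the first pattern that succeeds selects the branch. *)
let classify : int * int -> string = function
  | (0, _) -> "first is zero"
  | (_, 0) -> "second is zero"
  | _ -> "neither is zero"

let () =
  (* (0, 0) matches both of the first two patterns; the earlier one wins. *)
  print_endline (classify (0, 0))
```

Swapping the first two branches would change the result for `(0, 0)`, which is why branch order matters whenever patterns overlap.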
504    
505  <note>  <note>
506  currently it is impossible to mix normal OCaml patterns and x-patterns  currently it is impossible to mix normal OCaml patterns and x-patterns
507  in a single pattern matching.  in a single pattern matching.
# Line 624  Line 629 
629  </ul>  </ul>
630    
631  <p>  <p>
632  In record x-patterns, it is possible to omit the <code>=p</code> part of a field.  Here is a brief description of the semantics of patterns. Given
633  The content is then replaced with the label name considered as  an input value, a pattern can either succeed or fail. If it succeeds,
634  a capture variable. E.g.  <code>{ x y=p }</code> is equivalent to  it also produces a binding from the capture variables in the pattern
635  <code>{ x=x y=p }</code>.</p>  to x-values.
636    </p>
637    
638    <ul>
639    
640    <li>A pattern which is just a type (no capture variable) succeeds if
641    and only if the value has the type.</li>
642    
643    <li>A pattern <code>p1 | p2</code> succeeds if either <code>p1</code>
644    or <code>p2</code> succeeds, and returns the corresponding binding; if
645    both patterns succeed, <code>p1</code> wins. It is required that
646    <code>p1</code> and <code>p2</code> have the same sets of capture
647    variables. </li>
648    
649    <li>A pattern <code>p1 &amp; p2</code> succeeds if both <code>p1</code>
650    and <code>p2</code> succeed, and returns the concatenation of the two
651    bindings. It is required that <code>p1</code> and <code>p2</code> have
652    <em>disjoint</em> sets of capture variables. </li>
653    
654    </ul>
655    
656    <p>
657    In record x-patterns, it is possible to omit the <code>=p</code> part
658    of a field.  The content is then replaced with the label name
659    considered as a capture variable. E.g.  <code>{ x y=p }</code> is
660    equivalent to <code>{ x=x y=p }</code>.</p>
661    
662  <p>It is also possible to add an "else" clause:  <p>It is also possible to add an "else" clause:
663  <code>{ x = (a,_)|(a:=3) }</code>  <code>{ x = (a,_)|(a:=3) }</code>
# Line 645  Line 675 
675  repetition) in a regexp, it is bound to the concatenation of all  repetition) in a regexp, it is bound to the concatenation of all
676  matched subsequences. E.g.: <code>[ (x::Int | _)* ]</code> will  matched subsequences. E.g.: <code>[ (x::Int | _)* ]</code> will
677  collect in <code>x</code> all the elements of type <code>Int</code> from  collect in <code>x</code> all the elements of type <code>Int</code> from
678  a sequence.</p>  a sequence. It is not legal to have repeated simple capture variables.
679    </p>
680    
681  <p>  <p>
682  The regexp operators <code>+,*,?</code> are greedy by default (they match as long  The regexp operators <code>+,*,?</code> are greedy by default (they match as long
# Line 940  Line 971 
971    
972  </box>  </box>
973    
974  <box title="Code samples" link="code">  <box title="Marshaling" link="marshal">
975    
976    <p>
977    OCamlDuce uses some tricks in its internal representation of x-values
978    to reduce memory usage and improve performance. You need to pay
979    special attention if you want to use the OCaml serialization functions
980    (module <code>Marshal</code>, functions
981    <code>input_value/output_value</code>) on x-values. In addition to
982    your values, you also need to save and restore some internal data
983    using the functions <code>Cduce_types.Value.extract_all</code> and
984    <code>Cduce_types.Value.intract_all</code>. Of course, this also
985    applies if the value to be serialized contains deeply nested x-values.
986    </p>
987    
988    <p>
989    Here are generic
990    serialization/deserialization functions that illustrate how to do it:
991    </p>
992    
993    <sample>
994    let my_output_value oc v =
995      let p = Cduce_types.Value.extract_all () in
996      output_value oc (p,v)
997    
998    let my_input_value ic =
999      let (p,v) = input_value ic in
1000      Cduce_types.Value.intract_all p;
1001      v
1002    </sample>
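
For reference, the plain OCaml half of this round trip can be sketched without any x-values. The helper below is hypothetical (`round_trip` is not part of OCamlDuce) and only shows how `output_value` and `input_value` are typically paired on binary channels; with x-values, the functions from the sample above would take the place of the two direct calls.

```ocaml
(* Write a value to a temporary file with output_value, then read it back
   with input_value. Channels are opened in binary mode because the
   marshalled representation is binary data. *)
let round_trip (v : 'a) : 'a =
  let path = Filename.temp_file "marshal_demo" ".bin" in
  let oc = open_out_bin path in
  output_value oc v;
  close_out oc;
  let ic = open_in_bin path in
  (* input_value is not type-safe: the annotation is the caller's promise. *)
  let (v' : 'a) = input_value ic in
  close_in ic;
  Sys.remove path;
  v'

let () =
  let original = (1, "two", [3.0; 4.0]) in
  assert (round_trip original = original)
```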
1003    
1004    </box>
1005    
1006    <box title="Performance" link="perf">
1007    
1008    <section title="Strings">
1009    
1010    <p>
1011    OCaml users might be surprised by the fact that x-strings are simply
1012    represented as sequences in OCamlDuce. Does this mean that they are
1013    actually stored in memory as linked lists? Certainly not!  The internal
1014    representation of sequence values uses several tricks to improve
1015    performance and memory usage. In particular, a special form in the
1016    representation can store strings as byte buffers, as in OCaml.
1017    If an XML document is loaded, or if a Caml string is converted
1018    to an x-value, this compact representation will be used.
1019    </p>
1020    
1021    </section>
1022    
1023    <section title="Concatenation">
1024    
1025    <p>
1026    Similarly, OCaml users might be reluctant to use the sequence
1027    concatenation <code>@</code> on sequences. In OCaml, the complexity
1028    of this operator is linear in the size of its first argument (which
1029    needs to be copied). OCamlDuce uses a special form in its internal
1030    representation to store concatenation in a lazy way. The concatenation
1031    will actually be computed only when the value is accessed. This means
1032    that it's perfectly ok to build a long sequence by adding
1033    new elements at the end one by one, as long as you don't
1034    simultaneously inspect the sequence.
1035    </p>
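
To make the contrast with plain OCaml lists concrete, here is a small sketch (plain OCaml, not OCamlDuce; both function names are hypothetical). It shows why appending at the end with `@` is usually avoided on ordinary lists, and the common workaround that the lazy concatenation described above makes unnecessary for x-sequences.

```ocaml
(* Plain OCaml lists: [acc @ [i]] copies [acc] on every step, so building
   a list of n elements this way takes time quadratic in n. *)
let build_by_append n =
  let rec go acc i = if i > n then acc else go (acc @ [i]) (i + 1) in
  go [] 1

(* Usual plain-OCaml workaround: cons at the front (constant time per
   element) and reverse once at the end. *)
let build_by_cons n =
  let rec go acc i = if i > n then List.rev acc else go (i :: acc) (i + 1) in
  go [] 1

let () =
  (* Both strategies produce the same list; only the cost differs. *)
  assert (build_by_append 100 = build_by_cons 100)
```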
1036    
1037    </section>
1038    
1039    <section title="Pattern matching">
1040    
1041    <p>
1042    Another point which is worth knowing when programming in OCamlDuce
1043    is that patterns can be written in a declarative style without
1044    affecting performance. The compiler uses static type information
1045    about matched values to produce efficient code for pattern matching.
1046    To illustrate this, consider the following sample:
1047    </p>
1048    
1049    <sample><![CDATA[{{ON}}
1050    x.ml:
1051    
1052    type a = {{ <a>[ a* ] }}
1053    type b = {{ <b>[ b* ] }}
1054    
1055    let f : {{ a|b }} -> int = function {{ a }} -> 0 | {{ _ }} -> 1
1056    ]]></sample>
1057    
1058    <sample><![CDATA[{{ON}}
1059    y.ml:
1060    
1061    type a = {{ <a>[ a* ] }}
1062    type b = {{ <b>[ b* ] }}
1063    
1064    let f : {{ a|b }} -> int = function {{ <a>_ }} -> 0 | {{ _ }} -> 1
1065    ]]></sample>
1066    
1067    <p>
1068    The two functions have exactly the same semantics, but the first
1069    implementation is more declarative: it uses type checks to distinguish
1070    between <code>a</code> and <code>b</code> instead of saying
1071    <em>how</em> to distinguish between these two types. Imagine
1072    that the definitions of these types change to:
1073    </p>
1074    
1075    <sample><![CDATA[{{ON}}
1076    type a = {{ <x kind="a">[ a* ] }}
1077    type b = {{ <x kind="b">[ b* ] }}
1078    ]]></sample>
1079    
1080    <p>
1081    Then the first implementation still works as expected, but the
1082    second one needs to be rewritten.</p>
1083    
1084    <p>Now one might believe that the second implementation is more
1085    efficient because it tells the compiler to check only the root tag,
1086    whereas the first implementation would force
1087    the compiler to produce code to check that all tags in the tree
1088    are <code>a</code>s. But this is not what happens! Actually,
1089    you can check that the compiler will produce exactly the same code
1090    for both implementations. It considers the static type information
1091    about the argument of the pattern matching (here, the input type
1092    of the function), and computes an efficient way to evaluate
1093    patterns for the values of this type.
1094    </p>
1095    
1096    </section>
1097    
1098    <section title="The map iterator">
1099    
1100    <p>
1101    The <code>map ... with ...</code> iterator is implemented in a
1102    tail-recursive way. You can safely use it on very long sequences.
1103    </p>
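
For comparison only, this is what a stack-safe map looks like on plain OCaml lists, where `List.map` has historically not been tail-recursive. The sketch uses only standard library functions; `safe_map` is a hypothetical name and says nothing about how OCamlDuce implements its own iterator.

```ocaml
(* List.rev_map is tail-recursive; reversing its result restores the
   original order, giving a map that runs in constant stack space. *)
let safe_map f l = List.rev (List.rev_map f l)

let () =
  let big = List.init 1_000_000 (fun i -> i) in
  let doubled = safe_map (fun x -> 2 * x) big in
  assert (List.length doubled = 1_000_000)
```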
1104    
1105    </section>
1106    
1107    </box>
1108    
1109    <box title="Code samples" link="code">
1110    
1111  <section title="Parsing XML files">  <section title="Parsing XML files">
1112    
# Line 998  Line 1163 
1163  <p>  <p>
1164  It is interesting to introduce errors in the parser  It is interesting to introduce errors in the parser
1165  <code>schema_loader.ml</code> or the printer  <code>schema_loader.ml</code> or the printer
1166  <code>dump_schema.ml</code> and see how the type system catch them.  <code>dump_schema.ml</code> and see how the type system catches them.
1167  </p>  </p>
1168    
1169  <note>  <note>
