lambda-calc-walkthrough.txt 15 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186
  1. "Woo hoo!" / Let's parse lambda expressions
  2. :s:"(^y.z y)" / To keep things ASCII, we'll use a caret to denote a lambda
  3. / spaces separate variables and for now everything will be a single character
  4. p:0 0 1 1 1 1 1 0 / We're trying to produce a "parent vector"
  5. / Here each value in this vector indicates the parent of the node at that index
  6. +(s i;i:(1;p[1])) / The parent of the lambda is the open parenthesis
  7. +(s i;i:(2;p[2])) / The parent of the bound variable y is the lambda, etc.
  8. / Let's generate this from the input string
  9. :t:"(hey(you)(whats(new))dude)" / But first let's parse characters at depth defined by parentheses
  10. {$["("=y;x+1;")"=y;x-1;x]}\[0;t] / A simple scan like this calculates the depth
  11. / But we want to build a parent vector ...
  12. / The parent of any node is the index of the most recent open paren
  13. / This means that when we reach a closed paren,
  14. / the following nodes have the previous open paren as a parent
  15. / Stack!
  16. / One more thing. At each node we grow the parent vector.
  17. / The index of the next added node is just the length
  18. / of the parent vector before it was added, or one less after.
  19. `0:`k'{(s;p):x;p,:*|s;$["("=y;s,:-1+#p;")"=y;s:-1_s;];(s;p)}\[(,0;!0);t]
  20. / We don't want a scan, but this shows how the result is constructed
  21. :p:*|{(s;p):x;p,:*|s;$["("=y;s,:-1+#p;")"=y;s:-1_s;];(s;p)}/[(,0;!0);t]
  22. / We've generated a parent vector for the tree indicated by
  23. / nested levels of parentheses
  24. p@!#p / This is the list of parents of all the indices
  25. p / .. of course this is just p
  26. p[p] / Here are all the grandparents of each index, etc.
  27. `0:`k'(p@)\!#p / Of course with a single tree all nodes lead back to the root
  28. `0:`k'1&(p@)\!#p / We can make a mask out of this picture
  29. `0:`k'(1+!#p)*/:1&(p@)\!#p / Multiply each row by the indices offset by one
  30. `0:(" ",t)@(1+!#p)*/:1&(p@)\!#p / And then prepend our string with a space to represent the zeros
  31. / And we get a nice little picture of the depth of each node
  32. / Add a top row with all 1's in the mask so we also get the original string
  33. `0:(" ",t)@(1+!#p)*/:(,1+&#p),1&(p@)\!#p
  34. / Notice that the closed paren is a child of its matching open paren.
  35. / Can you figure out why?
  36. / We can clean that up, but it won't matter that much.
  37. :t:"(hey)(you)" / Here's a different "tree" and its parent
  38. :p:*|{(s;p):x;p,:*|s;$["("=y;s,:-1+#p;")"=y;s:-1_s;];(s;p)}/[(,0;!0);t]
  39. +(t i;i:(5;p[5])) / Small problem here.. The second set of parens is not actually a child of the first.
  40. / This is actually a "forest" and not a tree. Each trees root should be it's own parent
  41. / The fix is first to not default to 0 as the initial root.
  42. / Instead we use null to represent "to be determined"
  43. :p:*|{(s;p):x;p,:*|s;$["("=y;s,:-1+#p;")"=y;s:-1_s;];(s;p)}/[(,0N;!0);t]
  44. / And then we replace the nulls with the index in which they occur
  45. / since root nodes should be their own parents
  46. :p:^'/|1(*/1(!#:)\^:)\*|{(s;p):x;p,:*|s;$["("=y;s,:-1+#p;")"=y;s:-1_s;];(s;p)}/[(,0N;!0);t]
  47. \l lambda-calculus.k
  48. dsp
  49. dsp[t;p]
  50. :s:"(^y.z y)" / Parsing lambdas!
  51. :p:0 0 1 1 1 1 1 0 / The plan is to add to paren parsing by putting everything after
  52. dsp[s;p] / a lambda as a child of the lambda, all as siblings
  53. / We can just toss the current index on the parent stack for "^" as well
  54. `0:`k'{(s;p):x;p,:*|s;$["("=y;s,:-1+#p;"^"=y;s,:-1+#p;")"=y;s:-1_s;];(s;p)}\[(,0;!0);s]
  55. / The trouble is that we only pop 1 element off the stack with each ")"
  56. / Here we've accumulated a bunch of items which should all be popped.
  57. / The idea is to count how many need popping and pop that count.
  58. prs / The code does exactly that.
  59. / Going back to our earlier example for parsing just parentheses
  60. (t;p:*|{(s;p):x;p,:*|s;$["("=y;s,:-1+#p;")"=y;s:-1_s;];(s;p)}/[(,0;!0);t:"(hey(you)(whats(new))dude)"])
  61. / We don't really need all the closed parens any more
  62. / The structure is clear without them.
  63. / But if we just remove them and the corresponding indices in the parent vector
  64. / then the indices in the parent vector will be off.
  65. / We need a permutation of the indices so that everything besides the
  66. / closed parens are packed together, so we can just chuck the unwanted stuff
  67. tblb[t;")"=t] / First we mark the closed parens for deletion
  68. tblb / (tblb just prints a string with pointers according to a boolean mask)
  69. :ro:<m:")"=t / Then we take the grade of that Boolean vector
  70. m@ro / Now the 1's are all at the back
  71. t@ro / And so are the parens. But what about the parent vector?
  72. p@ro / This puts the indices in the right place, but points to the old location
  73. (<ro)@p@ro / Picking these indices out of the reverse permutation does the trick
  74. :(t;p):(-#&m)_/:(t@ro;(<ro)p@ro) / Dropping the number of 1's we had from the back finishes it off
  75. p@t?/:"nd" / It may look like "newdude" is all at the same level
  76. dsp[t;p] / but they have different parents
  77. / All of the structure is now in the parent vector and not the list of nodes
  78. prn / prn takes a lambda expression, generates a parent vector and then prunes unneeded nodes
  79. srt
  80. prn t:"(^x.x)(^y.y)"
  81. dsp/prn t:"(^x.x)(^y.y)"
  82. :t:"(hey(you)(whats(new))dude)" / Going back to our simpler example...
  83. :p:^'/|1(*/1(!#:)\^:)\*|{(s;p):x;p,:*|s;$["("=y;s,:-1+#p;")"=y;s:-1_s;];(s;p)}/[(,0N;!0);t]
  84. :(t;p):(-#&m)_/:(t@ro;(<ro)p@ro)
  85. &"("=t / Let's try to find the scope of the tree starting at the third paren
  86. tbl[t;("("=t;p)] / That's at index 8. All of its immediate children have an 8 in the parent vector
  87. tbl / (tbl just lines up the columns of several lists with a row of indexes at the top)
  88. t@ \&p=8 / But the last one starts a new group, so everything in its scope counts too
  89. t@ \&p=14 / We can do it again, now there is no "(" starting a new group.
  90. &p=17 / Since none start a new group none are in the parent vector.
  91. lc / This is how lc finds the last leaf in the subtree under a given node
  92. lc[p;8]
  93. / Now for the fun part!! Beta reduction
  94. :s:"(^x.x y x y)(z z)(^y.z z)" / For beta reduction we want substitute into the lambda expression
  95. / Big picture:
  96. dsp/ \(n;p):prn[s:s/"()"] / We start with the pruned tree. (Forced to a tree by surrounding with parens)
  97. rdx[n;p] / Look for "redexes"
  98. rrp[n;p] / If we find none, we simply remove redundant parens in the expression and return
  99. / Otherwise we use the first redex to beta reduce the expression
  100. tblb[s;"^"=s] / A redex is a reducible expression, which means an application beginning with a "lambda"
  101. tblb[n]'(p=/:p@p@)l:&"^"=n; / This is simply a sibling of a lambda... (Here we show the siblings of each.)
  102. sb@'&'l<sb:(&'p=/:p@p@)l:&"^"=n / ... which comes after the lambda
  103. tblb[n]'@[&#p;;:;1]'sb@'&'l<sb:(&'p=/:p@p@)l:&"^"=n;
  104. tbl[n;,p]
  105. rdx[n;p] / We return a list of indices of lambdas paired with their first sibling
  106. rdx
  107. s0:"((((^x.((xx))))((^y.y))))" / Let's remove redundant parens from this expression
  108. dsp/(n0;p0):prn s0
  109. tblb[n0;@[&#p0;&2>#'g:=p0;:;1]] / We start by finding which nodes have only a single child
  110. / (i.e. have only one child in the parent vector)
  111. :rm:(,/"^"=n0@g@)_&2>#'g:=p0 / Then we drop those whose child is just a lambda
  112. / since these always have only one child but are necessary
  113. :rm:("("=n0@)#rm / Then we make sure that these nodes are actually parentheses
  114. / We'd like to simply drop them like before, but then we'll have nodes whose parents have been removed.
  115. / So we need to find the new parent before dropping the old parent.
  116. / In the easiest case we just lift the parent to its parent.
  117. / But this might be being removed as well, so we have to keep moving up.
  118. / Since the first child of any node is always one index greater than its parent,
  119. / We can simply backup the index until we find a parent not being removed.
  120. :w:(&|/0,rm=\:p0)^rm / we first find all children of nodes being removed (exclude those being removed)
  121. tblb[n0;@[&#p0;w;:;1]]
  122. (-/1(~^rm?p0@)\)/w / Then we iteratively test if its parent is being removed and stop when it isn't
  123. :np:p0@(-/1(~^rm?p0@)\)/w / The new parent is that final parent.
  124. tbl[n0;,p0]
  125. tbl[n0;,p0:@[p0;w;:;np:p0@(-/1(~^rm?p0@)\)/w]]
  126. / Then we remove the redundant parentheses
  127. :(n0;p0):(-#rm)_/:srt[n0;p0;<@[&#p0;rm;:;1]]
  128. $[2=+/~p0;1_/:(n0;0|p0-1);(n0;p0)] / The last niggling bit is that since the root is self parenting it will never have only one child
  129. / So when it's redundant the algorithm above won't find it. We just shift off the parent in this case.
  130. :rrp[n0;p0]
  131. rrp
  132. dsp[n;p] / Finally we're ready for beta reduction
  133. :(n;p):("(",n;0,(~p=!#p)*1+p) / We start by shifting on a root node just in case we're not given a tree
  134. :rd:rdx[n;p] / Then we see if we have any redexes, and if we do we call the main function with the first redex
  135. :off:*rd / Here rd contains the index of the lambda and of its next sibling
  136. :(l;e):0 1+lc[p]'off / We find the extent of each of these nodes (add one to the end of the end of the sibling to make math easier)
  137. :idx:off+!e-off:*off / Now we find all the indices in range for the substitution
  138. :idx:(2;1+idx?l)_idx / And split them at the end of the lambda expression
  139. n@idx / lopping off the lambda and the variable leaving only the lambda body
  140. tbl[n;,p] / Then we lift all immediate children of the lambda to its parent
  141. tblb[n;@[&#p;(off=p@)#*idx;:;1]] / Since we'll be removing the lambda, we want the each child to end up in the group
  142. :p:@[p;(off=p@)#*idx;:;p[off]]
  143. tbl[n;,p]
  144. :idx[0]:(n[off+1]=n@)#idx[0] / From here on we only need to track references to the bound variable in the body of the lambda
  145. :l:1 0+#'|idx / Now we take the lengths of each of these lists, reverse it and add one
  146. / to the length of the body of the sibling. i.e. argument to the lambda
  147. / This is because we're going to add a group to this to make sure it's a tree with a root
  148. `0:`k'@[&#p;*idx;,;(|l)#1] / We prepare for reshuffling by assembling a grade to insert the substitition after each
  149. / reference to a bound variable. (i.e. idx[0])
  150. :ro:<,/@[&#p;*idx;,;(|l)#1] / We take the inverse of this grade because we want to splice in nodes added at the back
  151. copy[n;p;e;l] / The copy function makes one copy of the substitution for each reference to the bound variable
  152. / This leaves us with a copy of the argument tree for each occurence of the bound variable
  153. / Each root is left with a null parent at its root to be filled in with where it is spliced in
  154. e / Remember that e is the end of the argument
  155. l / and l has the length of the argument + 1 and the count of bound variables
  156. (-*l-1)#/:e#/:(n;p) / This chops off the argument section of both the node and parent vectors
  157. :rr:("(";0),'(-*l-1)#/:e#/:(n;p) / And this prepends a root node for us to hook into where the corresponding bound variable was
  158. (*l;#'rr) / Now *l should match the length of this constructed substitution
  159. */l / We want *|l copies of this, which amounts to collection */l of each of n and p
  160. :(sn;sp):(*/l)#/:rr
  161. / But the parent vectors refer to the old location of this tree, this needs adjusting
  162. #/|l / This is the length of each copy
  163. &#/|l / This tags each character of each copy
  164. (*l)*&#/|l / This turns the tags into offsets for each character since each copy is *l long
  165. e-*l / This is the original index of substitution (the + 1 is already factored in)
  166. 0|sp-e-*l / This shifts down the parent vector values accordingly
  167. :rr:(#p)+(0|sp-e-*l)+(*l)*&#/|l / Finally we find their new position when we append to the original vectors
  168. (*l)*!*|l / This is the position of the start of each copy
  169. :sp:@[rr;(*l)*!*|l;:;0N] / We turn this roots into nulls to be replaced when we're splicing them in
  170. copy
  171. :(n;p):(n;p),'copy[n;p;e;l] / Now we append this onto the original vectors
  172. :p:@[p;&^p;:;p@*idx] / And swap in the indices of the parent each bound index to the root of the copies
  173. / i.e. we make each copy a sibling of its respective bound variable
  174. :(n;p):srt[n;p;<ro] / .. and use the previously constructed grade to move these copies into place
  175. tbl[n;,p]
  176. dsp[n;p]
  177. :l:(off+!2),,/idx / These are the *original* indices of all of the stuff no longer needed
  178. ro@l / And here's where they are after the shuffle
  179. (-#l)_/:srt[n;p;<@[&#p;ro@l;:;1]] / We then remove these indices as before
  180. beta0
  181. beta
  182. dsp/(n;p):prn s
  183. dsp/beta/(n;p)
  184. / FIN