p. 127 of www.cs.indiana.edu/~dyb/papers/3imp.pdf gives a pretty good overview, ...

p. 127 of www.cs.indiana.edu/~dyb/papers/3imp.pdf gives a pretty good overview, including a ton of references.

Roughly, graph reduction execution represents an expression as a node, with an edge to nodes of subexpressions. A value of a node is obtained by walking through it's children, setting the value of each node as the recursion winds down. String reduction simply modifies the expression itself. This does not eliminate common subexpressions (a() * a() will reduce a() twice), which is why graph reduction is favoured. In a parallel world, you often don't care about the extra work if it reduces the number of synchronisation.