Sunday, 6 May 2012

Using Mathematica's Dispatch


How do I construct a Dispatch table?
Say you have a graph (network) consisting of n vertices in a loop:
In[1] := rules =
           With[{n = 1000},
             Table[ToString@i -> ToString@Mod[i + 1, n],
               {i, 0, n - 1}]];
You want to traverse the graph by applying these rewrite rules to an initial vertex. A single step is v /. rules for a vertex v, but this performs a linear search over rules for the Rule whose lhs matches v. So applying the rules many times is slow:
In[2] := Nest[# /. rules &, "0", 10000] // AbsoluteTiming
Out[2] = {1.7880482, 0}
Mathematica's Dispatch allows us to precompute a hash table that turns the linear lookup into a constant-time lookup:
In[3] := dispatch = Dispatch[rules];
Applying the dispatch table many times obtains the same answer about 30× faster:
In[4] := Nest[# /. dispatch &, "0", 10000] // AbsoluteTiming
Out[4] = {0.0550031, 0}
When is this advisable?
When:
  • You are doing many rewrites with the same set of rewrite rules, and
  • The set of rewrite rules contains at least 30 rules with constant lhs patterns, i.e. composed only from symbols, sequences and literals.
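The rule-count threshold is easy to check on your own machine. The following is a rough sketch, not a definitive benchmark: timeBoth is a hypothetical helper name, it reuses the ring-of-strings rules from the first example, and the timings will vary with Mathematica version and hardware.

```mathematica
(* Hypothetical helper: for a ring of n vertices, time 10000 lookups
   with the plain rule list and with the Dispatch table. *)
timeBoth[n_Integer] :=
  Module[{rules, dispatch},
    rules = Table[ToString[i] -> ToString[Mod[i + 1, n]], {i, 0, n - 1}];
    dispatch = Dispatch[rules];
    {n,
     First@AbsoluteTiming[Nest[# /. rules &, "0", 10000];],
     First@AbsoluteTiming[Nest[# /. dispatch &, "0", 10000];]}]

(* Each result is {rule count, seconds without Dispatch, seconds with Dispatch}. *)
timeBoth /@ {10, 30, 100, 1000}
```

Comparing the second and third entries of each result shows roughly where Dispatch starts to pay off for this particular workload.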
How does it work?
It just builds a hash table with the constant patterns as keys.
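To make the idea concrete, here is a minimal sketch that emulates a keyed lookup by hand instead of using Dispatch. It assumes, as is commonly reported, that Mathematica looks up definitions with literal arguments without scanning them one by one. The name next is hypothetical, and this is an illustration of the idea rather than a description of how Dispatch is implemented internally.

```mathematica
(* Same ring of 1000 string vertices as in the first example. *)
rules = With[{n = 1000},
   Table[ToString[i] -> ToString[Mod[i + 1, n]], {i, 0, n - 1}]];

(* One definition per rule, keyed on the literal lhs.
   Set evaluates the argument of next, so First[#] becomes the key string. *)
ClearAll[next];
Scan[(next[First[#]] = Last[#]) &, rules];

(* Walk 10000 steps around the ring, starting from vertex "0". *)
Nest[next, "0", 10000]
```

Because 10000 is a multiple of the ring size, the walk ends back at the vertex "0", the same answer that the rule list and the Dispatch table give.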
Are there alternative methods?
The most effective general approach is to rewrite the rules in another language. In particular, languages of the ML family (SML, OCaml and F#) have very efficient pattern match compilers and garbage collectors so they are able to rewrite terms much faster than Mathematica's general purpose rewriter does.

Tree data structure in Mathematica


I have used Mathematica mostly as a mathematics workbench and for writing relatively small ad-hoc programs.
Mathematica really excels at this.
What approach have you used with regard to data structures? Gradually developing your own util package?
I avoid creating my own data structures in Mathematica because it cannot handle them efficiently. Specifically, general data structures tend to be 10 to 1,000× slower in Mathematica than in other languages, which greatly limits their practical usefulness. For example, Mathematica is 100× slower than F# at computing the range of depths in a red-black tree.
Logic programming with lists is one example where Mathematica is typically orders of magnitude slower than compiled languages. The following Mathematica program uses linked lists to solve the n-queens problem:
safe[{x0_, y0_}][{x1_, y1_}] := 
 x0 != x1 && y0 != y1 && x0 - y0 != x1 - y1 && x0 + y0 != x1 + y1

filter[_, {}] := {}
filter[p_, {h_, t_}] := If[p[h], {h, filter[p, t]}, filter[p, t]]

search[n_, nqs_, qs_, {}, a_] := If[nqs == n, a + 1, a]
search[n_, nqs_, qs_, {q_, ps_}, a_] := 
 search[n, nqs, qs, ps, 
  search[n, nqs + 1, {q, qs}, filter[safe[q], ps], a]]

ps[n_] := 
 Fold[{#2, #1} &, {}, Flatten[Table[{i, j}, {i, n}, {j, n}], 1]]

solve[n_] := search[n, 0, {}, ps[n], 0]
Here is the equivalent F#:
let safe (x0, y0) (x1, y1) =
  x0<>x1 && y0<>y1 && x0-y0<>x1-y1 && x0+y0<>x1+y1

let rec filter f = function
  | [] -> []
  | x::xs -> if f x then x::filter f xs else filter f xs

let rec search n nqs qs ps a =
  match ps with
  | [] -> if nqs=n then a+1 else a
  | q::ps ->
      search n (nqs+1) (q::qs) (filter (safe q) ps) a
      |> search n nqs qs ps

let ps n =
  [ for i in 1..n do
      for j in 1..n do
        yield i, j ]

let solve n = search n 0 [] (ps n) 0

solve 8
Solving the 8-queens problem takes 10.5s with Mathematica and 0.07s with F#. So F# is 150× faster than Mathematica in this case.
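The Mathematica program above encodes a linked list as nested pairs {head, tail}, with {} as the empty list, which is why filter and search match on {h_, t_} and {q_, ps_}. A minimal sketch of that encoding follows. The helpers toLinked and fromLinked are hypothetical names introduced only for illustration and are not part of the original program.

```mathematica
(* Build {a, {b, {c, {}}}} from {a, b, c}: fold from the right so that
   the first element ends up as the outermost head. *)
toLinked[xs_List] := Fold[{#2, #1} &, {}, Reverse[xs]]

(* Walk the tails until the empty list {} is reached, collecting each head. *)
fromLinked[ll_List] := First /@ Most[NestWhileList[Last, ll, # =!= {} &]]

toLinked[{a, b, c}]              (* {a, {b, {c, {}}}} *)
fromLinked[toLinked[{a, b, c}]]  (* {a, b, c} *)
```

Note that ps in the program above folds without Reverse, so its linked list holds the board positions in reverse order, which does not affect the solution count.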
The Stack Overflow question Mathematica "linked lists" and performance gives a more extreme example. A naive translation of that Mathematica code into F# gives an equivalent program that runs between 4,000 and 200,000× faster than the Mathematica original:
let rand = System.Random()
let xs = List.init 10000 (fun _ -> rand.Next 100)
Array.init 100 (fun _ ->
  let t = System.Diagnostics.Stopwatch.StartNew()
  ignore(List.length xs)
  t.Elapsed.TotalSeconds)
Specifically, Mathematica takes 0.156s to 16s to perform a single iteration whereas the F# takes 42µs to 86µs.
If I really want to stay in Mathematica then I shoehorn everything I'm doing into Mathematica's handful of built-in data structures, e.g. Dispatch.