🏠 back to Observable

Mixed Integer Programming

I find Mixed Integer Programming extremely useful for many tasks, so it seemed like a weird omission there is not much in JS land. Certainly none that do the symbolic rearrangement for you into canonical form.

3 Likes

This is great and good timing for me as I was looking for an LP library I could use with Observable. Thanks very much for creating it.

One question: Is there a way of providing a vector of values for the objective function without having to name each of the variables? Something like the ‘low level interface’ of SCPSolver perhaps. I need to set up an LP with potentially several thousand variables and am looking for a reasonably efficient way of specifying them. I guess I could programmatically generate variable names for a single string expression, but this seems like a potentially unnecessary round-trip for the solver.

You can go direct to the GitHub - jvail/glpk.js: JavaScript port of GLPK library, thats a fairly direct wrapper of the low level GLPK C API. The way you have to express your problem is fairly restrictive though. Another one is GitHub - JWally/jsLPSolver: Simple OOP javaScript library to solve linear programs, and mixed integer but again, very algebraically restrictive. Also I think they stil need named variables but they are good for setting coefficients without suprises.

The main thing I was aiming for with my wrapper was symbolic expressivity by bundling a expression parser and rearranger. If you send it a massive string you run the risk of it spinning when rearranging it, things like x1 * 4 vs. 4 * x1 could have performance implications. Though hard to know whether its more inefficient than the LP solver itself. You will have to try things out, I only tested some small toy problems so far.

1 Like

Thanks very much. Yes the symbolic expressivity is a really nice feature of your wrapper and certainly works well for exploring how to use LP to specify optimisation problems. And it sits nicely with the approach of explanatory Observable pages.

I will investigate the relative performance of some of the options you mention an report back…

1 Like

The more I use your LP parser the more I like it :grinning:. Partly for my own benefit, but also for others, I’ve created an introductory LP page with some toy and applied examples: Hello Linear Programming / Jo Wood / Observable

I have narrowed down where there is a significant bottleneck though. The time complexity of your canonicalization routine is high (polynomial?) with respect to the number of variables, even when they are already in canonical form. For example, the following objective function expression with 20 variables takes around 25 seconds to evaluate on my machine:

mip().objective(Math.min,"x1+x2+x3+x4+x5+x6+x7+x8+x9+x10+x11+x12+x13+x14+x15+x16+x17+x18+x19+x20")

I think this can be traced back in part to your cannonicalize function as the following takes around 10 seconds to evaluate:

cannonicalize("x1+x2+x3+x4+x5+x6+x7+x8+x9+x10+x11+x12+x13+x14+x15+x16+x17+x18+x19+x20 < 100")

As I routinely need to be able to create objective functions with hundreds or thousands of variables, this approach is currently infeasible, even though I am able to create expressions already in canonical form. Assuming it is not easy to reduce the time complexity of the rearranger, perhaps allow .objectiveCanonical() and subjectToCanoncal() functions that accept expressions already in canonical form?

I checked whether LPs with hundreds of variables were slow to solve in case the expression parser was not the actual bottleneck, but passing the LP directly to glpk via a JSON can solve LPs with over 1000 variables in a fraction of a second (see the London grid example in my page above).

Happy to do further testing if it would help.

1 Like

Wow amazing examples I cross linked them. :clap: Thanks for your words of encouragement.

Yeah I will see if I can fix that performance issue. I don’t want to put a backdoor straight into glpk as it makes it less pluggable and undermines the licencing argument.

BRB…

yeah I traced it to math.js, expr.equals(other) of all places!

equalsIsSlow = {
  const n = 20;
  const termsA = Array.from({ length: n })
    .map((_, i) => `x${i}`)
    .join(" + ");

  const termsB = Array.from({ length: n })
    .map((_, i) => `x${i}`)
    .join(" + ");

  const a = math.parse(termsA);
  const b = math.parse(termsB);

  // return a.equals(b) // Very slow, uses home rolled deep equals https://github.com/josdejong/mathjs/blob/c6cbf5538915c8964b70a1af086f47c2c0be33df/src/expression/node/Node.js#L221
  return _.isEqual(a, b); // much faster
}

substitution for lodash deep equals seems to work.

Give it a try! I ran it upto 100 terms and it seems to scale better now.

Thanks so much for the detailed reports! It helps a lot.

Thanks for the fix. It’s certainly faster, but I think there are still some significant performance issues with larger numbers of variables. If I test just the objective function with the simplest possible expression (x1+x2+x3\ldots x_n) for various sizes of n I get the following parsing times (seconds):

n time (seconds)
1 0.005
16 0.014
81 0.55
256 14.8
625 did not complete

This is without specifying constraints or solving the LP.

For context, for a typical grid layout LP (Example 5 in Hello Linear Programming / Jo Wood / Observable) , here are the numbers of variables and complete solve time if I supply the expressions directly to glpk.js:

n time (seconds)
1 <1 millisecond
16 0.001
81 0.001
256 0.002
625 0.007
1,296 0.02
2,401 0.04
4,096 0.08
6,561 0.2
10,000 0.3
14,641 0.6
20,736 1.0
28,561 1.9
38,416 2.5
50,625 3.8
65,536 5.3
83,521 8.0
104,976 13.5
: :
c.1m 15 minutes

Arguably, the kinds of LPs with tens of thousands of variables may not be the use-case for which your parser/rearranger is designed, but I provide the above just in case it is a useful benchmark for further optimisations.

So equals issue is to be fixed upstream Fix performance problem with deepStrictEqual by tomlarkworthy · Pull Request #2301 · josdejong/mathjs · GitHub

I traced the next low performance to math.js / simplify and the rule matching in particular.

I don’t see any easy wins so the performance trail goes cold. I think it’s possible to build a better performing rearranger by not using math.js but it’s more work than I would bite off at this time.

At least we have a large number of test cases the replacement would need to satisfy in Cannonicalization with Math.js / Tom Larkworthy / Observable

I guess I could still short circuit an already canonical expression…

1 Like

ok now 1000 vars is instant for me too.

actually the fix was not too ugly, I already had a checker for canonical form in the final extraction process, so I could just run that first and see if it blew up or not. If not, we don’t need to engage the rearranger. makes sense.

it catches things like x0 - 0.25 * x1 as in canonical form too.

Great to see such rapid improvements!

It looks like we are getting performance at roughly similar rates as generating the glpk.js JSON directly now (e.g. 0.4 seconds for solving layout LP with 10,000 variables).

Beyond about 10,000 variables I get a RangeError: Maximum call stack size exceeded error when calling .objective() with the large string of canonical variables. I don’t know if this suggests replacing a recursive algorithm with an iterative one, although that might be buried in math.js somewhere, so possibly not practical.

Sorry for keep throwing more problems your way - it really is great to see what you’ve done here.

Luckily it’s my crappy programming that is throwing stack overflows! Fixed by eliminating the recursion.

Now I begin to see your pattern emerging. You just keep adding a zero! So I tested up to…

1M

which takes 9 seconds on my laptop.

3 Likes

Also its very interesting you are using it for layout as that was also the reason why I programmed this. Though I was thinking more about label placement but this optimal gridding is a great idea too.

Super. Yes, works for a million variables for me too now. And is even faster than my implementation that builds the JSON objects. What a productive Sunday! Thanks for your work on this.

And as for 10 million…

1 Like

My next attempt at using this for ‘serious’ work will be to interpolate flows between sets of checkpoints. It uses earth mover’s distance which is amenable to solving via LP.

2 Likes

For anyone interested, I have created a dedicated page for grid map layouts that uses the LP approach. Should allow anyone to lay out an arbitrary set of points in a grid with some parametrisable control over how points are allocated.

2 Likes

With a lot of helpful bug reports from @jwoLondon, and eventually fuzz testing (Writing a Simple Expression Fuzzer / Tom Larkworthy / Observable), I think I finally fixed the fragility of the rearranger! It seems to be able to rearrange any + - * ( ) expression now and pass it to the solver.

1 Like