Mixed Integer Programming

tomlarkworthy · August 4, 2021, 6:02am

I find Mixed Integer Programming extremely useful for many tasks, so it seemed like a weird omission there is not much in JS land. Certainly none that do the symbolic rearrangement for you into canonical form.

jwoLondon · August 11, 2021, 6:19pm

This is great and good timing for me as I was looking for an LP library I could use with Observable. Thanks very much for creating it.

One question: Is there a way of providing a vector of values for the objective function without having to name each of the variables? Something like the ‘low level interface’ of SCPSolver perhaps. I need to set up an LP with potentially several thousand variables and am looking for a reasonably efficient way of specifying them. I guess I could programmatically generate variable names for a single string expression, but this seems like a potentially unnecessary round-trip for the solver.

tomlarkworthy · August 11, 2021, 7:54pm

You can go direct to the GitHub - jvail/glpk.js: JavaScript port of GLPK library, thats a fairly direct wrapper of the low level GLPK C API. The way you have to express your problem is fairly restrictive though. Another one is GitHub - JWally/jsLPSolver: Simple OOP javaScript library to solve linear programs, and mixed integer but again, very algebraically restrictive. Also I think they stil need named variables but they are good for setting coefficients without suprises.

The main thing I was aiming for with my wrapper was symbolic expressivity by bundling a expression parser and rearranger. If you send it a massive string you run the risk of it spinning when rearranging it, things like x1 * 4 vs. 4 * x1 could have performance implications. Though hard to know whether its more inefficient than the LP solver itself. You will have to try things out, I only tested some small toy problems so far.

jwoLondon · August 11, 2021, 8:22pm

Thanks very much. Yes the symbolic expressivity is a really nice feature of your wrapper and certainly works well for exploring how to use LP to specify optimisation problems. And it sits nicely with the approach of explanatory Observable pages.

I will investigate the relative performance of some of the options you mention an report back…

jwoLondon · August 14, 2021, 6:56am

The more I use your LP parser the more I like it . Partly for my own benefit, but also for others, I’ve created an introductory LP page with some toy and applied examples: Hello Linear Programming / Jo Wood / Observable

I have narrowed down where there is a significant bottleneck though. The time complexity of your canonicalization routine is high (polynomial?) with respect to the number of variables, even when they are already in canonical form. For example, the following objective function expression with 20 variables takes around 25 seconds to evaluate on my machine:

mip().objective(Math.min,"x1+x2+x3+x4+x5+x6+x7+x8+x9+x10+x11+x12+x13+x14+x15+x16+x17+x18+x19+x20")

I think this can be traced back in part to your cannonicalize function as the following takes around 10 seconds to evaluate:

cannonicalize("x1+x2+x3+x4+x5+x6+x7+x8+x9+x10+x11+x12+x13+x14+x15+x16+x17+x18+x19+x20 < 100")

As I routinely need to be able to create objective functions with hundreds or thousands of variables, this approach is currently infeasible, even though I am able to create expressions already in canonical form. Assuming it is not easy to reduce the time complexity of the rearranger, perhaps allow .objectiveCanonical() and subjectToCanoncal() functions that accept expressions already in canonical form?

I checked whether LPs with hundreds of variables were slow to solve in case the expression parser was not the actual bottleneck, but passing the LP directly to glpk via a JSON can solve LPs with over 1000 variables in a fraction of a second (see the London grid example in my page above).

Happy to do further testing if it would help.

tomlarkworthy · August 14, 2021, 8:42am

Wow amazing examples I cross linked them. Thanks for your words of encouragement.

Yeah I will see if I can fix that performance issue. I don’t want to put a backdoor straight into glpk as it makes it less pluggable and undermines the licencing argument.

BRB…

tomlarkworthy · August 14, 2021, 9:16am

yeah I traced it to math.js, expr.equals(other) of all places!

equalsIsSlow = {
  const n = 20;
  const termsA = Array.from({ length: n })
    .map((_, i) => `x${i}`)
    .join(" + ");

  const termsB = Array.from({ length: n })
    .map((_, i) => `x${i}`)
    .join(" + ");

  const a = math.parse(termsA);
  const b = math.parse(termsB);

  // return a.equals(b) // Very slow, uses home rolled deep equals https://github.com/josdejong/mathjs/blob/c6cbf5538915c8964b70a1af086f47c2c0be33df/src/expression/node/Node.js#L221
  return _.isEqual(a, b); // much faster
}

substitution for lodash deep equals seems to work.

Give it a try! I ran it upto 100 terms and it seems to scale better now.

Thanks so much for the detailed reports! It helps a lot.

jwoLondon · August 15, 2021, 1:01pm

Thanks for the fix. It’s certainly faster, but I think there are still some significant performance issues with larger numbers of variables. If I test just the objective function with the simplest possible expression (x1+x2+x3\ldots x_n) for various sizes of n I get the following parsing times (seconds):

n	time (seconds)
1	0.005
16	0.014
81	0.55
256	14.8
625	did not complete

This is without specifying constraints or solving the LP.

For context, for a typical grid layout LP (Example 5 in Hello Linear Programming / Jo Wood / Observable) , here are the numbers of variables and complete solve time if I supply the expressions directly to glpk.js:

n	time (seconds)
1	<1 millisecond
16	0.001
81	0.001
256	0.002
625	0.007
1,296	0.02
2,401	0.04
4,096	0.08
6,561	0.2
10,000	0.3
14,641	0.6
20,736	1.0
28,561	1.9
38,416	2.5
50,625	3.8
65,536	5.3
83,521	8.0
104,976	13.5
:	:
c.1m	15 minutes

Arguably, the kinds of LPs with tens of thousands of variables may not be the use-case for which your parser/rearranger is designed, but I provide the above just in case it is a useful benchmark for further optimisations.

tomlarkworthy · August 15, 2021, 2:45pm

So equals issue is to be fixed upstream Fix performance problem with deepStrictEqual by tomlarkworthy · Pull Request #2301 · josdejong/mathjs · GitHub

I traced the next low performance to math.js / simplify and the rule matching in particular.

I don’t see any easy wins so the performance trail goes cold. I think it’s possible to build a better performing rearranger by not using math.js but it’s more work than I would bite off at this time.

At least we have a large number of test cases the replacement would need to satisfy in Cannonicalization with Math.js / Tom Larkworthy / Observable

I guess I could still short circuit an already canonical expression…

tomlarkworthy · August 15, 2021, 3:09pm

ok now 1000 vars is instant for me too.

actually the fix was not too ugly, I already had a checker for canonical form in the final extraction process, so I could just run that first and see if it blew up or not. If not, we don’t need to engage the rearranger. makes sense.

it catches things like x0 - 0.25 * x1 as in canonical form too.

jwoLondon · August 15, 2021, 4:38pm

Great to see such rapid improvements!

It looks like we are getting performance at roughly similar rates as generating the glpk.js JSON directly now (e.g. 0.4 seconds for solving layout LP with 10,000 variables).

Beyond about 10,000 variables I get a RangeError: Maximum call stack size exceeded error when calling .objective() with the large string of canonical variables. I don’t know if this suggests replacing a recursive algorithm with an iterative one, although that might be buried in math.js somewhere, so possibly not practical.

Sorry for keep throwing more problems your way - it really is great to see what you’ve done here.

tomlarkworthy · August 15, 2021, 5:41pm

Luckily it’s my crappy programming that is throwing stack overflows! Fixed by eliminating the recursion.

Now I begin to see your pattern emerging. You just keep adding a zero! So I tested up to…

which takes 9 seconds on my laptop.

tomlarkworthy · August 15, 2021, 5:43pm

Also its very interesting you are using it for layout as that was also the reason why I programmed this. Though I was thinking more about label placement but this optimal gridding is a great idea too.

jwoLondon · August 15, 2021, 6:57pm

Super. Yes, works for a million variables for me too now. And is even faster than my implementation that builds the JSON objects. What a productive Sunday! Thanks for your work on this.

And as for 10 million…

jwoLondon · August 15, 2021, 7:06pm

My next attempt at using this for ‘serious’ work will be to interpolate flows between sets of checkpoints. It uses earth mover’s distance which is amenable to solving via LP.

jwoLondon · August 18, 2021, 4:32pm

For anyone interested, I have created a dedicated page for grid map layouts that uses the LP approach. Should allow anyone to lay out an arbitrary set of points in a grid with some parametrisable control over how points are allocated.

tomlarkworthy · September 5, 2021, 8:20pm

With a lot of helpful bug reports from @jwoLondon, and eventually fuzz testing (Writing a Simple Expression Fuzzer / Tom Larkworthy / Observable), I think I finally fixed the fragility of the rearranger! It seems to be able to rearrange any + - * ( ) expression now and pass it to the solver.

Topic		Replies	Views
glpk.js and variable Sets - restrict to "either or" Community Help	3	317	August 3, 2023
Enabling other languages? Feedback	13	3377	January 31, 2020
Designated input vs normal (output/computed) variables Feedback	2	954	February 8, 2018
"Recompute with...", or: a way to "climb the ladder of abstraction" Community Help	22	2040	May 30, 2019
approaches to building a "library" of Observable functions Community Help	6	813	December 12, 2020

Mixed Integer Programming

Related topics