Questions (and their answers) about Linear Programming
	and related topics.
Keywords: FAQ, LP, Linear Programming

Archive-name: linear-programming-faq
Last-modified: 1992/10/19
Version: 1.1


           Linear Programming - Frequently Asked Questions List
                                 (LP-FAQ)
                   Most recent update: October 19, 1992

--------------------------------------------------------------------------

1.  "What is Linear Programming?"

A:  A linear program (LP) is a problem that can be put into the form

    minimize   cx
    subject to Ax=b
                x>=0

where x is a vector to be solved for, A is a matrix of known coefficients,
and c and b are vectors  of known coefficients.  All these entities must
have consistent dimensions, of course, and you can add "transposes" to
taste.  The matrix A is generally not square (hence just doing x=inv(A)*b
is not a solution), and usually has more columns than rows.  Ax=b is
therefore underdetermined, leaving (usually) great latitude in the choice
of x with which to minimize cx.

Other formulations can be used in this framework.  For instance, if you
want to maximize instead of minimize, multiply the c vector by -1.  If
you have constraints that are inequalities rather than equations, you
can introduce one new variable (a "slack") for each inequality and treat
the augmented row of the matrix as an equation.  LP codes will often
take care of such "bookkeeping" for you.

LP problems are usually solved by a technique known as the Simplex Method,
developed in the 1940's and after.  Briefly stated, this method works by
taking a sequence of square submatrices of A and solving for x, in such a
way that successive solutions always improve, until a point is reached
where no improvement is possible.  A family of LP algorithms known as
Interior Point methods has been developed starting in the 1980's, that
can be faster for many (but so far not all) problems.

LP has a variety of uses, in such areas as petroleum, finance, trans-
portation, forestry, and military.

--------------------------------------------------------------------------

2.  "Where is there a good code, preferably public domain, to solve
Linear Programming problems?"

A:  It depends on the difficulty of your models.  LP technology and
computer technology have both made such great leaps that models that
were previously considered "large" are now routinely solved.  Nowadays,
models with a few thousand constraints, and several thousand variables,
can be tackled with a PC.  Good LP codes on workstations can often
handle models with variables in the tens of thousands, or even greater.
It's hard to be specific about sizes and speed, a priori, due to the
wide variation in things like model structure and difficulty in fac-
torizing the basis matrices.

There is a recently released public domain code called "lp_solve" that
is available on Usenet in the "comp.sources.reviewed" newsgroup.  Its
author claims to have solved models with up to 30,000 variables and
50,000 constraints.  My own experience with this code is not quite so
uniformly optimistic; still, for someone who isn't sure just what kind
of LP code is needed, it represents a very reasonable first try, and the
price is right.  That program is archived at anonymous ftp site
"ftp.uu.net", in directory "/usenet/comp.sources.reviewed/volume02/lp_solve".

For MS-DOS PC users, Prof. Timo Salmi at the University of Vaasa in
Finland offers a code called "tslin".  You should be able to access it by
ftp at garbo.uwasa.fi in directory /pc/ts, or else I suggest contacting
Prof. Salmi at ts@uwasa.fi.

The consensus is that the LP code published in Numerical Recipes is not at
all strong, and should be avoided for heavy-duty use.  If your requirement
is for a solver that can handle 100-variable models, it might be okay.

There is an ACM TOMS routine for LP, #552, available from the netlib server,
in directory /netlib/toms.  See the section on test models for detail on
how to use this server.

If your models prove to be too difficult for free software to handle,
then you can consider acquiring a commercial LP code.  There are dozens
of such codes on the market. I have my own opinions, but for reasons of
space, generality and fairness, I will not attempt even to list the codes
I know of here.  Instead I refer you to the annual survey of LP software
published in "OR/MS Today", a joint publication of ORSA (Operations
Research Society of America) and TIMS (The Institute of Management
Science).  I think it's likely that you can find a copy of the June, 1992
issue, either through a library, or by contacting a member of these two
organizations (most universities probably have several members among the
faculty and student body).  The survey lists almost fifty actively marketed
products.  This publication also carries advertisements for many of these
products, which may give you additional information to help make a decision.

If you have access to one of the math libraries, such as IMSL or NAG, you
may be able to use an LP routine from there.

There are many considerations in selecting an LP code.  Speed is important,
but LP is complex enough that different codes go faster on different models;
you won't find a "Consumer Reports" 8v)  article to say with certainty which
code is THE fastest.  I usually suggest getting benchmark results for your
particular type of model if speed is paramount to you.  Benchmarking may
also help determine whether a given code has sufficient numerical stability
for your kind of models.

Other questions you should answer: Can you use a stand-alone code, or do
you need a code that can be used as a callable library, or do you require
source code?  Do you want the flexibility of a code that runs on many
platforms and/or operating systems, or do you want code that's tuned to
your particular hardware architecture (in which case your hardware vendor
may have suggestions)?  Is the choice of algorithm (Simplex, Interior
Point) important to you?  Do you need an interface to a spreadsheet
program?  Is the purchase price an overriding concern?  Is the software
offered at an academic discount (assuming you are at a university)?  How
much hotline support do you think you'll need?

It may not always be true that "you get what you pay for," but it is rare
that you get more than you pay for.  8v)   There is usually a large
difference in LP codes, in performance (speed, numerical stability,
adaptability to computer architectures) and features, as you climb the
price scale.  If a code seems overpriced to you, you may not yet
understand all of its features.

--------------------------------------------------------------------------

3.  "Oh, and we also want to solve it as an integer program. I think
there will be only a few thousand variables or so."

A:  Hmmmm.  You want
    - Nontrivial model size
    - Integer solutions
    - Public domain code
Pick one or maybe two of the above.  You can't have all three.  8v)

Integer LP models are ones where the answers must not take fractional
values.  It may not be obvious that this is a VERY much harder problem
than ordinary LP, but it is nonetheless true.  Integer models may be
ones where only some of the variables are to be integer and others may
be real-valued (termed "Mixed Integer LP" or MILP, or "Mixed Integer
Programs" or MIP), or they may be ones where all the variables must be
integral (termed "Integer LP" or ILP).  The class of ILP is often further
subdivided into problems where the only legal values are {0,1} ("Binary"
or "Zero-One" ILP), and general integer problems.  For the sake of
generality, the Integer LP problem will be referred to here as MIP,
since the other classes can be viewed as special cases of MIP.

You should be prepared to solve far smaller MIP models than the
corresponding LP model, given a certain amount of time you wish to
allow (unless you and your model happen to be very lucky). There exist
models that are considered challenging, with mere hundreds of variables.
Conversely, some models with tens of thousands of variables solve
readily.  It all depends, and the best explanations of "why" always
seem to happen after the fact.  8v)

One exception to this gloomy outlook is that there are certain models
whose LP solution always turns out to be integer.  Best known of these
are the so-called Transportation Problem, Assignment Problem, and
Network-Flow Problem.  It turns out that these problems are best solved
by specialized routines that take major shortcuts in the Simplex Method,
and as a result are relatively quick running.  See the section on
references for a book by Kennington and Helgason, which contains some
source code for Netflo.  Netflo is available by anonymous ftp at
dimacs.rutgers.edu, in directory /pub/netflow/mincost/solver-1, but
I don't know the copyright situation (I used to think you had to buy
the book to get the code).

People are sometimes surprised to learn that integer problems are solved
using floating point arithmetic.  Although various algorithms for MIP
have been studied, most if not all available general purpose large-scale
LP codes use a method called "Branch and Bound" to try to find an optimal
solution.  In a nutshell, B&B solves MIP by solving a sequence of related
LP models.  Good codes for MIP distinguish themselves more by solving
shorter sequences of LP's, than by solving the individual LP's faster.
Even moreso than with regular LP, a costly commercial code may prove its
value to you if your MIP model is difficult.

As a point of interest, the Simplex Method currently retains an advantage
over the newer Interior Point methods for solving these sequences of LP's.

The public domain code "lp_solve", mentioned above, accepts MIP models,
as do a large proportion of the commercial LP codes in the OR/MS Today
survey.  I have seen mention made of algorithm 333 in the Collected
Algorithms from CACM, though I'd be surprised if it was robust enough
to solve large models.


The better MIP codes have numerous parameters and options to give the user
control over the solution strategy. Most have the capability of stopping
before an optimum is proved, printing the best answer obtained so far.
For many MIP models, stopping early is a practical necessity. Fortunately,
a solution that has been proved by the algorithm to be within, say, 1% of
optimality often turns out to be the true optimum, and the bulk of the
computation time is spent proving the optimality. For many modeling
situations, a near-optimal solution is acceptable anyway.

Once one accepts that large MIP models are not typically solved to a
proved optimal solution, that opens up a broad area of approximate
methods, probabilistic methods and heuristics, as well as modifications
to B&B.  Claims have been made for Genetic Algorithms and Simulated
Annealing, though (IMHO) these successes have been problem dependent
and difficult to generalize.  (A reference for GA is David Goldberg,
"Genetic Algorithms in Machine Learning.")  When trying to solve a
difficult MIP model, it is usually crucial to understand the workings
of whatever it is you are modeling, and try to find some insight that
will assist your chosen algorithm to work better.  A related observation
is that the way you formulate your model can be as important as the actual
choice of solver.  You should consider getting some assistance if this
is your first time trying to solve a large (>100 variable) problem.

--------------------------------------------------------------------------

4.  "I've written my own optimization code.  How can I test it?"

A:  In light of the comments above, I hope your aims are fairly modest,
for there are already a lot of good codes out there.  I hope your LP
code makes use of sparse matrix techniques, rather than using a tableau
form of the Simplex method, because the latter usually ends up being
numerically unstable and very slow.

If you want to try out your code on some real-world LP models, there is
a very nice collection of small-to-medium-size ones on netlib.  If you
have ftp access, you can try "ftp research.att.com", using "netlib"
as the Name, and your email address as the password. Do a "cd lp/data"
and look around. There should be a "readme" file, which you would
want to look at first.  Alternatively, you can reach an e-mail
server via "netlib@ornl.gov", to which you can send a message saying
"send index from lp/data"; follow the instructions you receive.

The Netlib LP files (after you uncompress them) are in a format called
MPS, which is described in the section below.

There is a collection of MIP models, housed at Rice University.  Send
an email message containing "send catalog" to softlib@rice.edu, to get
started.

There is a collection of network-flow codes and models at DIMACS (Rutgers
University).  Use anonymous FTP at dimacs.rutgers.edu.  Start looking in
/pub/netflow.  Another network generator is called NETGEN and is available
on netlib (lp/generators).

John Beasley has posted information on his OR-Lib, which contains various
optimization test problems.  Send e-mail to umtsk99@vaxa.cc.imperial.ac.uk
to get started.  Or have a look in the Journal of the Operational Research
Society, Volume 41, Number 11, Pages 1069-72.

There are various test sets for NLP (Non-Linear Programming).  Among
those I've seen mentioned are
  - ACM TOMS (Transactions on Mathematical Software), V13 No3 P272
  - Publications (listed in another section of this list) by Schittkowski;
    Hock & Schittkowski;  Floudas & Pardalos; Torn; Hughes & Grawiog.
Many of the other references also contain various problems that you
could use to test a code.

The modelling language GAMS comes with about 100 test models, which
you might be able to test your code with.

--------------------------------------------------------------------------

5.  "What is MPS format?"

A:  MPS format was named after an early IBM LP product and has emerged
as a de facto standard ASCII medium among most of the various commercial
LP codes.  You will need to write your own reader routine for this, but
it's not too hard. The main things to know about MPS format are that it
is column oriented (as opposed to entering the model as equations), and
everything (variables, rows, etc.) gets a name.  MPS format is described
in more detail in Murtagh's book, referenced in another section.  Here
is a little sample model, explained in more detail below:

NAME          METALS
ROWS
  N VALUE
  E YIELD
  L FE
  L MN
  L CU
  L MG
  G AL
  L SI
COLUMNS
    BIN1      VALUE       0.03         YIELD           1.
    BIN1      FE          0.15         CU             .03
    BIN1      MN          0.02         MG             .02
    BIN1      AL          0.7          SI             .02
    BIN2      VALUE       0.08         YIELD           1.
    BIN2      FE           .04         CU             .05
    BIN2      MN           .04         MG             .03
    BIN2      AL           .75         SI             .06
    BIN3      VALUE       0.17         YIELD         1.
    BIN3      FE           .02         CU             .08
    BIN3      MN           .01         AL             .8
    BIN3      SI           .08
    BIN4      VALUE       0.12         YIELD         1.
    BIN4      FE           .04         CU             .02
    BIN4      MN           .02         AL             .75
    BIN4      SI          0.12
    BIN5      VALUE       0.15         YIELD         1.
    BIN5      FE           .02         CU             .06
    BIN5      MN           .02         MG             .01
    BIN5      SI           .02         AL             .8
    ALUM      VALUE       0.21         YIELD         1.
    ALUM      FE           .01         CU             .01
    ALUM      AL           .97         SI             .01
    SILCON    VALUE       0.38         YIELD         1.
    SILCON    FE           .03         SI             .97
RHS
    ALOY1     YIELD        2000.       FE              60.
    ALOY1     CU            100.       MN              40.
    ALOY1     MG             30.       AL            1500.
    ALOY1     SI            300.
BOUNDS
 UP PROD1     BIN1            200.00
 UP PROD1     BIN2            750.00
 LO PROD1     BIN3            400.00
 UP PROD1     BIN3            800.00
 LO PROD1     BIN4            100.00
 UP PROD1     BIN4            700.00
 UP PROD1     BIN5           1500.00
ENDATA


MPS is an old format, so it is set up as though you were using
punch cards, and is not free format. Fields start in column 1,
5, 15, 25, 40 and 50.  Sections of an MPS file are marked by
so-called header cards, which are distinguished by their starting
in column 1.  Although it is typical to use upper-case throughout
the file (like I said, MPS has long historical roots), many
MPS-readers will accept mixed-case for anything except the
header cards, and some allow mixed-case anywhere.

The NAME card can be anything you want.  The ROWS section defines
the names of all the constraints; entries in column 2 or 3 are E
for equality rows, L for less-than ( <= ) rows, G for greater-than
( >= ) rows, and N for non- constraining rows (the first of which
would be interpreted as the objective function).

The largest part of the file is in the COLUMNS section, which is
the place where the entries of the A-matrix are put.  All entries
for a given column must be placed consecutively, although within
a column the order of the entries (rows) is irrelevant. Rows not
mentioned for a column are assumed to have a coefficient of zero.

The RHS section allows one or more right-hand-side vectors to be
defined; most people don't bother having more than one.  In the
above example, its name is ALOY1, and has non-zero values in all
7 of the constraint rows of the problem.  Rows not mentioned are
assumed to have a right-hand-side of zero.

The optional BOUNDS section lets you put lower and upper bounds on
individual variables (no * wildcards, unfortunately), instead of
having to define extra rows in the matrix.  All the bounds that have
a given name in column 5 are taken together as a set.  Variables
not mentioned in a BOUNDS set are taken to be non-negative.  There
is another optional section called RANGES that I won't go into
here. The final card must be the ENDATA, and yes, it is spelled
funny.

For comparison, here is the same model written out in equation
format.

Minimize
 VALUE: 0.03 BIN1 + 0.08 BIN2 + 0.17 BIN3 + 0.12 BIN4 + 0.15 BIN5
 + 0.21 ALUM + 0.38 SILCON
Subject To
 YIELD: BIN1 + BIN2 + BIN3 + BIN4 + BIN5 + ALUM + SILCON  = 2000
 FE:    0.15 BIN1 + 0.04 BIN2 + 0.02 BIN3 + 0.04 BIN4 + 0.02 BIN5
        + 0.01 ALUM + 0.03 SILCON <= 60
 MN:    0.02 BIN1 + 0.04 BIN2 + 0.01 BIN3 + 0.02 BIN4 + 0.02 BIN5
        <= 40
 CU:    0.03 BIN1 + 0.05 BIN2 + 0.08 BIN3 + 0.02 BIN4 + 0.06 BIN5
        + 0.01 ALUM <= 100
 MG:    0.02 BIN1 + 0.03 BIN2 + 0.01 BIN5 <= 30
 AL:    0.7 BIN1 + 0.75 BIN2 + 0.8 BIN3 + 0.75 BIN4 + 0.8 BIN5
        + 0.97 ALUM >= 1500
 SI:    0.02 BIN1 + 0.06 BIN2 + 0.08 BIN3 + 0.12 BIN4 + 0.02 BIN5

        + 0.01 ALUM + 0.97 SILCON <= 300
and
        0 <= BIN1 <= 200
        0 <= BIN2 <= 750
        400 <= BIN3 <= 800
        100 <= BIN4 <= 700
        0 <= BIN5 <= 1500

--------------------------------------------------------------------------

6.  "What software is there for non-linear optimization?"

A:  I don't claim as much expertise in this area, but the question is
frequent enough to be worth addressing.  If I get enough feedback, this
section might grow large enough to need to be split off as a separate
FAQ list.

It's unrealistic to expect to find one general NLP code that's going to
work for every kind of nonlinear model.  Instead, you should try to find
a code that fits the problem you are solving.  Nonlinear solution techniques
can be divided into various categories, such as unconstrained, linearly
constrained, convexly constrained, or general.  If your problem doesn't
fit in any category except "general", you should be prepared to have to
use a method that boils down to exhaustive search, i.e., you have an
intractable problem (see the comments in the MIP section on Simulated
Annealing and Genetic Algorithms).

Several of the commercial LP codes referenced above have specialized
routines, particularly for Quadratic Programming (linear constraints
with a quadratic objective function).  Many nonlinear problems can be
solved by application of a sequence of LP or QP approximations.

There is an ACM TOMS routine for QP, #587, available from the netlib server,
in directory /netlib/toms.  See the section on test models for detail on
how to use this server.

Here is a summary of codes mentioned in newsgroups in the past year, not
sorted into categories.

NPSOL - Stanford University, Office of Technology Licensing, 415-723-0651.

MINOS - Stanford University, Office of Technology Licensing, 415-723-0651.

NAG Library routine E04UCF.

IMSL routine UMINF and UMIDH.

Harwell Library routine VF04.

Hooke and Jeeves algorithm - reference?

MINPAK I and II - Contact Steve Wright, wright@mcs.anl.gov

GENOCOP - Zbigniew Michalewicz, zbyszek@mosaic.uncc.edu

DFPMIN - Numerical Recipes  (Davidon-Fletcher-Powell)

Amoeba - Numerical Recipes

Brent's Method - Numerical Recipes

FSQP -  Contact Andre Tits, andre@src.umd.edu

CONMIN - Vanderplaats and Associates, Goleta CA

NOVA - DOT Products, Houston TX

GRG2 - Leon Lasdon, University of Texas, Austin TX

--------------------------------------------------------------------------

7.  "What references are there on optimization?"

A:  Too many to count.  Here are a few that I like or have been
recommended on the net (I have *not* reviewed them all):

Bazaraa & Shetty, Nonlinear Programming, Theory & Applications.

Coleman & Li, Large Scale Numerical Optimization, SIAM Books.

Dennis & Schnabel, Numerical Methods for Unconstrained Optimization
and Nonlinear Equations, Prentice Hall.

Fiacco & McCormick, Sequential Unconstrained Minimization Techniques,
SIAM Books.

Fletcher, R., Practical Methods of Optimization, Wiley 1987.

Floudas & Pardalos, A collection of test problems for constrained
optimization algorithms, Springer-Verlag, 1990.

Forsythe, Malcolm & Moler, Computer Methods for Mathematical
Computations, Prentice-Hall.

Gill, Murray & Wright, Practical Optimization, Academic Press.

Hock & Schittkowski, Test Examples for Nonlinear Programming Codes,
Springer-Verlag, 1981.

Hughes & Grawiog, Linear Programming: An Emphasis on Decision Making,
Addison-Wesley, 1973.

Kahaner, Moler & Nash, Numerical Methods and Software, Prentice-Hall.

Kennington & Helgason, Algorithms for Network Programming, Wiley, 1980.

Kirkpatrick, Gelatt & Vecchi, Optimization by Simulated Annealing,
Science, 220 (4598) 671-680 (1983).

Luenberger, Linear and Nonlinear Programming, Addison Wesley.

More, "Numerical Solution of Bound Constrained Problems", in Computational
Techniques & Applications, CTAC-87, Noye & Fletcher (eds), North-Holland,
29-37 (1988).

More & Toraldo, Algorithms for Bound Constrained Quadratic Programming
Problems, Numerische Mathematik 55, 377-400, 1989.

Murtagh, B., Advanced Linear Programming, McGraw-Hill, 1981.

Nemhauser, Rinnooy Kan, & Todd (eds), Optimization, North-Holland, 1989.

Press, Flannery, Teukolsky & Vetterling , Numerical Recipes, Cambridge,
1986.    (Comment: use with care.)

Schittkowski, Nonlinear Programming Codes, Springer-Verlag, 1980.

Torn & Zilinskas, Global Optimization, Springer-Verlag, 1989.

Williams, H.P., Model Building in Mathematical Programming, Wiley 1985.

--------------------------------------------------------------------------
8.  "Who maintains this list?"

A:  John W. Gregory
    LP Specialist  (it says that on my business card, it must be true!)
    Applications Department
    Cray Research, Inc.
    Eagan, MN 55121   USA
    jwg@cray.com
    612-683-3673

I suppose I should say something here to the effect that "the material
in this document does not reflect any official position taken by Cray
Research, Inc."  Also, "use at your own risk", "no endorsement of products
mentioned", etc., etc.  I probably should have scattered more "IMHO"s
around in the text, but that acronym seems weaselly and once you start it's
hard to know where to stop.  I should have put in a few more smilies here
and there too, to assist the humor impaired - be on your toes.  8v)

I've tried to keep my own biases (primarily, toward the high end of
computing) from dominating what I write here, and other viewpoints that
I've missed are welcome.  Suggestions, corrections, topics you'd like to
see covered, and additional material (particularly on NLP) are solicited.
Comments to the effect "who died and made *you* optimal?" will likely
result in you getting stuck with maintaining the FAQL instead of me.  8v)

I regret that when I started this list I didn't keep careful track of all
the contributors whose knowledge I tapped.  In several instances, the
information herein comes from postings on the net, which I assumed to be
fair use and in the public domain.  It being too late now to make individual
acknowledgements, I offer a blanket THANKS to all who have posted on these
topics to the net.

This FAQL is also being posted to news.answers, which is archived
in the periodic posting archive on pit-manager.mit.edu [18.172.1.27],
in the anonymous ftp directory /pub/usenet/news.answers.

Copies of this FAQL may be made freely, as long as it is distributed at
no charge and with this disclaimer included.

--------------------------------------------------------------------------