





         MINIX: A CHEAP UNIX CLONE WITH SOURCE CODE

                    Andrew S. Tanenbaum

         Dept. of Mathematics and Computer Science
                     Vrije Universiteit
                 Amsterdam, The Netherlands





1.  OVERVIEW OF THE MINIX SYSTEM ARCHITECTURE
     UNIX- is organized as a single executable program  that
is  loaded  into  memory  at  system boot time and then run.
MINIX is structured in a much more modular way, as a collec-
tion  of processes that communicate with each other and with
user processes by sending and receiving messages.  There are
separate  processes for the memory manager, the file system,
for each device driver, and for certain other  system  func-
tions.   This  structure enforces a better interface between
the pieces.  The file system cannot, for  example,  acciden-
tally  change  the  memory manager's tables because the file
system and  memory  manager  each  have  their  own  private
address spaces.
     These system processes are each full-fledged processes,
with  their  own  memory allocation, process table entry and
state.  They can be run, blocked, and send messages, just as
the  user  processes.   In fact, the memory manager and file
system each run in user space as  ordinary  processes.   The
device  drivers are all linked together with the kernel into
the same binary program,  but  they  communicate  with  each
other and with the other processes by message passing.
     When the system is compiled, four binary  programs  are
independently  created:  the  kernel  (including  the driver
processes), the memory manager, the file  system,  and  init
(which  reads  /etc/ttys and forks off the login processes).
In other words, compiling the system results  in  four  dis-
tinct  a.out  files.  When the system is booted, all four of
these are read into memory from the boot diskette.
     It is possible, and in fact, normal, to modify,  recom-
pile,  and  relink,  say, the file system, without having to
relink the other three pieces.  This design provides a  high
degree of modularity by dividing the system up into indepen-
dent pieces, each with a well-defined function and interface
to  the other pieces.  The pieces communicate by sending and
receiving messages.
     The various processes are structured in four layers:

   4. The user processes (top layer).
   3. The server processes (memory manager and file system).
_________________________
- UNIX is a trademark of AT&T Bell Laboratories.













   2. The device drivers, one process per device.
   1. Process and message handling (bottom layer).

Let us now briefly summarize the function of each layer.
     Layer 1 is  concerned  with  doing  process  management
including  CPU  scheduling  and  interprocess communication.
When a process does a SEND or RECEIVE, it traps to the  ker-
nel,  which  then tries to execute the command.  If the com-
mand cannot be executed (e.g., a process does a RECEIVE  and
there are no messages waiting for it), the caller is blocked
until the command can be executed, at which time the process
is  reactivated.  When an interrupt occurs, layer 1 converts
it into a message to the appropriate  device  driver,  which
will normally be blocked waiting for it.  The decision about
which process to run when is also made in layer 1.  A prior-
ity algorithm is used, giving device drivers higher priority
over ordinary user processes, for example.
     Layer 2 contains the device drivers,  one  process  per
major  device.   These  processes  are  part of the kernel's
address space because they must run in kernel mode to access
I/O device registers and execute I/O instructions.  Although
the IBM PC does not have user mode/kernel mode,  most  other
machines  do,  so  this  decision  has been made with an eye
toward the future.  To distinguish the processes within  the
kernel  from  those  in user space, the kernel processes are
called tasks.
     Layer 3 contains only two processes, the memory manager
and  the  file system.  They are both structured as servers,
with the user processes as clients.   When  a  user  process
(i.e.,  a  client) wants to execute a system call, it calls,
for example,  the  library  procedure  read  with  the  file
descriptor, buffer, and count.  The library procedure builds
a message containing the system call number and the  parame-
ters  and  sends  it  to  the  file system.  The client then
blocks waiting for a reply.  When the file  system  receives
the  message,  it carries it out and sends back a reply con-
taining the number of bytes read or  the  error  code.   The
library  procedure  gets the reply and returns the result to
the caller in the usual way.  The user is completely unaware
of  what  is  going  on  here, making it easy to replace the
local file system with a remote one.
     Layer 4 contains the user programs.   When  the  system
comes  up,  init  forks off login processes, which then wait
for input.  On a successful login, the  shell  is  executed.
Processes  can  fork, resulting in a tree of processes, with
init at the root.  When CTRL-D  is  typed  to  a  shell,  it
exits,  and  init replaces the shell with another login pro-
cess.

2.  LAYER 1 - PROCESSES AND MESSAGES
     The two basic concepts on  which  MINIX  is  built  are
processes  and  messages.   A  process  is  an independently
schedulable entity with its own process table entry.  A mes-
sage  is a structure containing the sender's process number,









                           - 2 -


a message type field, and a variable part (a union) contain-
ing  the  parameters or reply codes of the message.  Message
size is fixed, depending on how big the union happens to  be
on the machine in question.  On the IBM PC it is 24 bytes.
     Three kernel calls are provided:

   - RECEIVE(source, &message)
   - SEND(destination, &message)
   - SENDREC(process, &message)

These are the only true system calls  (i.e.,  traps  to  the
kernel).  RECEIVE announces the willingness of the caller to
accept a message from a specified process, or  ANY,  if  the
RECEIVER  will  accept  any  message.  (From here on, ``pro-
cess'' also includes the tasks.) If no message is available,
the receiving process is blocked.  SEND attempts to transmit
a message to the destination process.   If  the  destination
process  is  currently  blocked  trying  to receive from the
sender, the kernel copies  the  message  from  the  sender's
buffer to the receiver's buffer, and then marks them both as
runnable.  If the receiver is not waiting for a message from
the sender, the sender is blocked.
     The SENDREC primitive combines  the  functions  of  the
other two.  It sends a message to the indicated process, and
then blocks until a reply  has  been  received.   The  reply
overwrites the original message.  User processes use SENDREC
to execute system calls by sending messages to  the  servers
and then blocking until the reply arrives.
     There are two ways to enter the kernel.  One way is  by
the  trap  resulting  from  a  process'  attempt  to send or
receive a message.  The other way is by an interrupt.   When
an  interrupt occurs, the registers and machine state of the
currently running process are saved  in  its  process  table
entry.   Then a general interrupt handler is called with the
interrupt number as parameter.  This procedure builds a mes-
sage of type INTERRUPT, copies it to the buffer of the wait-
ing task, marks that task as runnable (unblocked), and  then
calls the scheduler to see who to run next.
     The scheduler maintains three queues, corresponding  to
layers  2, 3, and 4, respectively.  The driver queue has the
highest priority, the server queue has middle priority,  and
the  user  queue  has lowest priority.  The scheduling algo-
rithm is simple: find the highest priority queue that has at
least  one  process on it, and run the first process on that
queue.  In this way, a clock interrupt will cause a  process
switch  if  the file system was running, but not if the disk
driver was running.  If the disk  driver  was  running,  the
clock  task  will  be put at the end of the highest priority
queue, and run when its turn comes.
     In addition to this rule,  once  every  100  msec,  the
clock  task  checks  to see if the current process is a user
process that has been running for at least 100 msec.  If so,
that  user  is  removed from the front of the user queue and
put on the back.  In effect, compute  bound  user  processes









                           - 3 -


are run using a round robin scheduler.  Once started, a user
process runs until  either  it  blocks  trying  to  send  or
receive a message, or it has had 100 msec of CPU time.  This
algorithm is simple, fair, and easy to implement.

3.  LAYER 2 - DEVICE DRIVERS
     Like all versions of UNIX for the IBM  PC,  MINIX  does
not  use  the  ROM BIOS for input or output because the BIOS
does not support interrupts.  Suppose a  user  forks  off  a
compilation in the background and then calls the editor.  If
the editor tried to read from the terminal using  the  BIOS,
the  compilation  (and  any  other  background  jobs such as
printing) would be stopped dead in their tracks waiting  for
the  the  next  character to be typed.  Such behavior may be
acceptable in the MS-DOS world, but it certainly is  not  in
the  UNIX world.  As a result, MINIX contains a complete set
of drivers that duplicate the functions of the  BIOS.   Like
the  rest  of  MINIX,  these  drivers  are written in C, not
assembly language.
     This design  has  important  implications  for  running
MINIX  on PC clones.  A clone whose hardware is not compati-
ble with the PC down to the chip level, but which  tries  to
hide  the  differences by making the BIOS calls functionally
identical to IBM's will not run an unmodified MINIX  because
MINIX does not use the BIOS.
     Each device driver is a separate process in MINIX.   At
present,  the  drivers  include  the  clock driver, terminal
driver, various disk drivers (e.g., RAM disk, floppy  disk),
and  printer driver.  Each driver has a main loop consisting
of three actions:

   1. Wait for an incoming message.
   2. Perform the request contained in the message.
   3. Send a reply message.

Request messages have  a  standard  format,  containing  the
opcode  (e.g.,  READ,  WRITE,  or  IOCTL),  the minor device
number, the position (e.g., disk block number),  the  buffer
address,  the  byte  count, and the number of the process on
whose behalf the work is being done.
     As an example of where device drivers fit in,  consider
what  happens  when  a  user wants to read from a file.  The
user sends a message to the file system.  If the file system
has  the  needed  data  in its buffer cache, they are copied
back to the user.  Otherwise, the file system sends  a  mes-
sage to the disk task requesting that the block be read into
a buffer within the file  system's  address  space  (in  its
cache).   Users may not send messages to the tasks directly.
Only the servers may do this.
     MINIX supports a RAM disk.  In fact, the  RAM  disk  is
always  used  to  hold  the root device.  When the system is
booted, after the operating system has been loaded, the user
is  instructed to insert the root file system diskette.  The
file system then sees how big it is, allocates the necessary









                           - 4 -


memory, and copies the diskette to the RAM disk.  Other file
systems can then be mounted on the root device.
     This organization puts important  directories  such  as
/bin  and /tmp on the fastest device, and also makes it easy
to work with either floppy disks or hard disks or a  mixture
of  the  two by mounting them on /usr or /user or elsewhere.
In any event, the root device is always in the same place.
     In the standard distribution, the  RAM  disk  is  about
240K,  most of which is full of parts of the C compiler.  In
the 256K system, a much smaller RAM disk  has  to  be  used,
which  explains why this version has no C compiler: there is
no place to put it.  (The /usr diskette is  completely  full
with  the other utility programs and one of the design goals
was to make the system run on a 256K PC with 1 floppy disk.)
Users  with  an unusual configuration such as 256K and three
hard disks are free to juggle things around as they see fit.
     The terminal driver is compatible with the standard  V7
terminal  driver.   It  supports  cooked mode, raw mode, and
cbreak mode.  It also  supports  several  escape  sequences,
such as cursor positioning and reverse scrolling because the
screen editor needs them.
     The printer driver copies  its  input  to  the  printer
character  for  character without modification.  It does not
even convert line feed to carriage return + line feed.  This
makes  it  possible  to  send  escape  sequences to graphics
printers without the driver messing things up.   MINIX  does
not  spool  output  because  floppy disk systems rarely have
enough spare disk space for the spooling directory.  Instead
one normally would print a file f by saying

lpr <f &

to do the printing  in  the  background.   The  lpr  program
insert  carriage  returns,  expands  tabs,  and so on, so it
should only be used for straight ASCII files.  On hard  disk
systems, a spooler would not be difficult to write.

4.  LAYER 3 - SERVERS
     Layer 3  contains  two  server  processes:  the  memory
manager  and  the  file system.  They are both structured in
the same way as the device drivers, that is a main loop that
accepts  requests, performs them, and then replies.  We will
now look at each of these in turn.
     The memory manager's job  is  to  handle  those  system
calls  that  affect memory allocation, as well as a few oth-
ers.  These include FORK, EXEC, WAIT, KILL,  and  BRK.   The
memory  model used by MINIX is exceptionally simple in order
to  accommodate  computers  without  any  memory  management
hardware.  When the shell forks off a process, a copy of the
shell is made in memory.  When the child does an  EXEC,  the
new  core image is placed in memory.  Thereafter it is never
moved.  MINIX does not swap or page.
     The amount of memory allocated to the process is deter-
mined  by  a  field in the header of the executable file.  A









                           - 5 -


program, chmem, has been provided to manipulate this  field.
When  a  process  is started, the text segment is set at the
very bottom of the allocated memory area,  followed  by  the
data  and bss.  The stack starts at the top of the allocated
memory and grows downward.  The space between the bottom  of
the  stack  and the top of the data segment is available for
both segments to grow into as needed.  If the  two  segments
meet, the process is killed.
     In the past, before paging  was  invented,  all  memory
allocation  schemes  worked  like this.  In the future, when
even small microcomputers will use 32-bit CPUs and  1M  x  1
bit  memory  chips,  the  minimum  feasible memory will be 4
megabytes and this allocation scheme  will  probably  become
popular  again  due  to  its  inherent simplicity.  Thus the
MINIX scheme can be regarded as either  hopelessly  outdated
or amazingly futuristic, as you prefer.
     The memory manager keeps track of memory using  a  list
of holes.  When new memory is needed, either for FORK or for
EXEC, it searches the hole list and  takes  the  first  hole
that  is big enough (first fit).  When a process terminates,
if it is adjacent to a hole on  either  side,  the  process'
memory and the hole are merged into a bigger hole.
     The file system is really a  remote  file  server  that
happens  to be running on the user's machine.  However it is
straightforward to convert  it  into  a  true  network  file
server.   All  that  needs  to be done is change the message
interface and provide some way of  authenticating  requests.
(In  MINIX,  the  source  field  in  the incoming message is
trustworthy because it is filled in  by  the  kernel.)  When
running remote, the MINIX file server maintains state infor-
mation, like RFS and unlike NFS.
     The MINIX file system is similar to that  of  V7  UNIX.
The  i-node  is  slightly  different, containing only 9 disk
addresses instead of 13, and  only  1  time  instead  of  3.
These  changes  reduce the i-node from 64 bytes to 32 bytes,
to store more i-nodes per disk block and reduce the size  of
the in-core i-node table.
     Free disk blocks and free  inodes  are  kept  track  of
using bit maps rather than free lists.  The bit maps for the
root device and all mounted file systems are kept in memory.
When  a  file  grows,  the system makes a definite effort to
allocate the new block as close as possible to the old ones,
to  minimize  arm  motion.   Disk storage is not necessarily
allocated one block at a time.  A minor device can  be  con-
figured  to  allocate 2, 4 (or more) contiguous blocks when-
ever a block is allocated.  Although this wastes disk space,
these  multiblock  zones improve disk performance by keeping
file blocks close together.   The  standard  parameters  for
MINIX  as distributed are 1K blocks and 1K zones (i.e., just
1 block per zone).
     MINIX maintains a buffer cache of recently used blocks.
A  hashing algorithm is used to look up blocks in the cache.
When an i-node block, directory  block,  or  other  critical
block  is  modified, it is written back to disk immediately.









                           - 6 -


Data blocks are only written back at the next SYNC  or  when
the buffer is needed for something else.
     The MINIX directory system and format is  identical  to
that of V7 UNIX.  File names are strings of up to 14 charac-
ters, and directories can be arbitrarily long.

5.  LAYER 4 - USER PROCESSES
     This layer contains init, the shell,  the  editor,  the
compiler,  the utilities, and all the user processes.  These
processes may only send messages to the memory  manager  and
the  file system, and these servers only accept valid system
call requests.  Thus the  user  processes  do  not  perceive
MINIX  to be a general-purpose message passing system.  How-
ever, removing the one line of code that checks if the  mes-
sage  destination is valid would convert it into a much more
general system (but less UNIX-like).