| The concept of process
| A process is a program in execution. |
| Each process has a unique identifier, which we refer to as
"process id". |
| Some process identifiers are reserved for special purposes.
| 0 for the scheduler; it represents kernel activity rather than a user program.
| The scheduler decides which processes to run and which should
wait. |
|
| 1 for init, created by the kernel after booting.
| init brings UNIX to a working condition, and may refer to rc
scripts in the /etc directory. |
| The init program file is in either /etc or /sbin. |
| Now let's take a look at what Linux will do. |
|
| 2 for page daemon.
| Sometimes called pager. |
| A kernel process that supports virtual memory. |
|
|
|
| Process information
| getpid
| Get the process id. |
|
| getppid
| Get the parent process id. |
|
| getuid
| Get the user id of a process. |
|
| geteuid
| Get the effective user id of a process. |
|
| getgid
| Get the group id of a process. |
|
| getegid
| Get the effective group id of a process. |
|
| Now let's write a program to print out this information. |
|
| Process creation
| fork function
| The fork function is the only way to create a process in a UNIX
environment. |
| The fork function is called once, but returns twice. |
| A child process is created by calling fork. The two processes
are identical except for the return value.
| In the parent process, fork returns the child's process id. |
| In the child process, fork returns 0. |
|
| The two processes continue execution after calling fork. |
| The two processes DO NOT share data -- they are two copies of the
same program. |
| Now let's try the textbook example. fork1.c
| The variables are different copies. |
| The process ids are different. |
| When we run it interactively, "before fork" appears
only once. Standard output to a terminal is line buffered, so the
newline flushes the buffer before the fork. |
| When the output is redirected to a disk file, "before
fork" appears twice. The reason is that standard I/O to a disk file
is fully buffered, so the line is still in the standard I/O buffer
when fork is called, and the buffer is copied from the parent to the
child. |
| The output of write appears only once, since the write system call
is not buffered. |
|
|
| Information sharing
| Some information is shared between the parent and the child.
| Open file descriptors and offsets. In that case two processes
can share an open file. |
| Various user and group ids. |
| Working directory. |
| Environment. |
| Resource limits. |
| Refer to table on page 192 for a complete list of entries that
the child inherits from the parent. |
|
| Some data are different between parent and child.
| Process id. |
| Return value from fork. |
| Refer to table on page 192 for a complete list of entries that
the child does not inherit from the parent. |
|
|
| Purpose of process creation
| To create a duplicate copy.
| This is done by placing different sections of code after
checking the return value of the fork. |
|
| To run a different program.
| This is usually done by running a "exec" after the
fork. |
| The fork and the exec can be combined (sometimes called spawn)
to improve efficiency. |
|
|
| vfork function
| To run a program using the child. Notice that the child runs in
the address space of the parent, so no memory copying is
necessary. |
| The child runs first, and the parent waits until the child calls
exec or _exit. |
| The textbook example
vfork1.c
| Notice the values the parent prints out. |
| What will happen if we replace the _exit with exit? |
|
|
|
| Process termination
| exit function
| To terminate a process with an exit code. |
| Notice that exit is a library function, whereas _exit is a system call. |
|
| Normal termination
| The main program returns. |
| The program calls exit. |
| The program calls _exit. |
|
| Abnormal termination
| The program calls abort. |
| The process receives a signal that terminates it. |
|
| No matter how a process terminates, the same code in the kernel is
executed. It does the following.
| Close all open file descriptors. |
| Release memory. |
| Release process table entry. |
|
| The exit code
| The exit code lets the child notify the parent about its
execution status. |
| The process reports an exit status, to which the kernel might add
extra information; the result is called the termination status. |
| The parent can obtain the termination status with the wait family
of functions. |
|
| Anomaly
| When the parent terminates before the child, the init process
becomes the parent of this orphan process. |
| When the child terminates before the parent, the child becomes a
zombie.
| A zombie is a dead entity, but not completely dead. :-) |
| The child must leave sufficient information in the process
table so that later, when its parent wants to fetch its status,
it is able to do so. |
| The information a zombie keeps in the process table includes
process id, termination status, and accounting information. |
| One can use ps to find out the status of all processes,
including zombies. |
|
|
|
| Process synchronization
| wait families
| wait
| A process can call wait to wait for the child process to
complete. |
| The wait function takes a pointer to an integer buffer that receives
the termination status. |
| The wait function blocks if the caller has children but none of
them has terminated yet. |
| The return value is the process id of the terminated child process. |
|
| waitpid
| A process can call waitpid to wait for a particular child
process to complete. |
| waitpid can be made non-blocking with the WNOHANG option. |
|
| Termination status
| There is a set of macros to retrieve information from the
termination status.
| WIFEXITED
| true if the child terminated normally. |
| WEXITSTATUS tells us the actual exit code. |
|
| WIFSIGNALED
| true if the child process was terminated by a signal. |
| WTERMSIG |
| WCOREDUMP |
|
| WIFSTOPPED
| true if the process is stopped. |
| WSTOPSIG |
|
|
|
| Now we try the textbook example wait1.c
| Can we generate the coredump file? |
| Three cases are tested in this example.
| normal termination with exit. |
| abnormal termination with abort. |
| abnormal termination with arithmetic exception. |
|
|
| Another textbook example fork2.c.
| The first child exits before the second, so that init will
adopt the second child. |
| This is useful when the parent does not want to wait for the
child and we do not want the child to become a zombie either. |
|
|
|
| Race condition
| Multiple processes running simultaneously could result in very
strange errors. |
| If the correctness of a program depends on the execution order of
the processes involved, then we have a race condition. |
| The race condition is difficult to debug since the error may not
appear when we want it to. |
| The (proc/fork2.c) example has a race condition.
| We cannot guarantee that the first child will exit before the
second child runs. |
| If the second child runs first, init will not yet have adopted it. |
|
| Textbook example tellwait1.c
| The stdout is specifically changed to be unbuffered. |
| The outputs from parent and child are mingled together. |
|
| Process synchronization
| To avoid race condition, we need to synchronize the processes. |
| tellwait1.c gives an example of process synchronization. We will
discuss its implementation when we cover IPC and signals. |
|
|
| Exec family
| The exec family runs a specific program, which replaces the image of
the calling process. |
| In Linux only execve is a system call, and all the others are
library functions built on top of execve. See the figure on page 211 for
details. |
| Here is a list of all the functions.
|
| Here is the trick (page 209).
| program filename
| nothing means pathname |
| p means the file is searched for in the directories listed in PATH. |
|
| argument passing
| l means the arguments are passed as argument list. |
| v means the arguments are passed as a pointer array. |
|
| environment
| nothing means the environment is from the environ variable. |
| e means the environment is from the environment pointer array. |
|
|
| When the program is given as a filename
| No slash found
| Search for it in the PATH directories. |
|
| Slash found
| Treated as a pathname. |
|
|
| Try the textbook example exec1.c
| The echoall.c echoes all command line arguments. |
| The main program runs the echoall program. |
|
|
| Interpreter file
| An interpreter file is the input to an interpreter. |
| An interpreter file is a text file, not a binary executable. |
| An interpreter file starts with "#!", followed by the
pathname of the interpreter, then by the optional arguments. |
| When the interpreter is a shell (in most cases it is), the interpreter
file is usually called a shell script -- namely a script that will be
run by the shell. |
| Try the textbook example.
| We write an interpreter file testinterp, which will execute the
echoarg program.
| Notice that the kernel inserts the pathname of the interpreter file
after the optional argument from the "#!" line, and passes it to the
interpreter (in this case, echoarg). |
|
| Now we compile and run exec2.c.
| The exec2 executes the testinterp, which executes the echoarg. |
| Notice that the prompt disappeared! |
|
| Now try to use awk as the interpreter.
| awk is a very useful script
interpreter. The basic syntax is "awk -f file", where
file is the name of the awk script. |
| Now we write an awk script to print the first word of every
line. |
| Now we try the textbook awk script that prints all the
arguments.
| There are several files here.
| The interpreter program awk. |
| The interpreter file awkexample. |
|
| When awkexample is invoked from a shell, the shell
creates a process and executes the interpreter file. |
| The kernel then executes the interpreter (awk), and the '-f'
option passes the pathname of the interpreter file as an
argument, along with the other command line arguments from
the shell. |
| Notice that awk is given five arguments: the '-f'
from the interpreter file, the pathname of the
interpreter file supplied by the kernel, and the three
arguments from the command line that were passed to the
interpreter file awkexample. |
| See page 220 for a complete illustration. |
|
|
| Reasons for using interpreter files.
| Hide the fact that a command is a script, not a binary
executable. |
| Efficiency gain. |
| Write scripts for shells other than sh. |
|
|
|
| system functions
| A simple way to utilize system facility -- just like typing into a
shell. |
| The return value tells whether the command is executed successfully or
not. |
| system takes a command string and passes it to /bin/sh
for execution. Now examine the following implementation (system.c).
| First we create a new process by fork. |
| Then we use execl to execute /bin/sh and ask it to run the command
string for us. The command string is passed to /bin/sh by way of -c
option. This option directs /bin/sh to take the command from the
string immediately after '-c' option. |
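| Finally the process that called system waits for sh to complete. |
| Note that _exit is used instead of exit when execl fails in the
child, so that standard I/O buffers copied from the parent are not
flushed a second time. |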
|
| Now we try the textbook example (systest1.c). |
| system and setuid programs
| A setuid program should never use system, since the effective uid
can be carried into the child process. |
| Consider two files tsys and printuids.
| tsys is a program that uses system to run the command given to
it. |
| printuids is a program that prints the real and effective uids. |
|
| If tsys has the setuid bit on, then the printuids process also runs
with the effective uid that the setuid bit gave to tsys. |
|
|
| Process accounting
| The superuser can turn the accounting on and record the accounting
information into a file. |
| This information can be retrieved from the file by simple file I/O. |
|
| Process times
| The command time reports the time usage. |
| The system call times retrieves the
time usage of a process and its children. |
| The function returns the wall clock time each time it is called. |
| Now try the textbook example (times1.c).
| The function pr_times computes the difference between two tms
records and reports the values. |
|
|