 | Interconnection network
 | An interconnection network connects the processors together, and
transports data among them. |
 | An interconnection network has its own topology. In this lecture we
will focus on one dimensional array and multi-dimension networks. |
 | An interconnection network can be viewed as a graph, in which every
node is a processor, and every link is a communication channel. |
 | The topology of a network is essential to the communication
performance.
 | Diameter
 | The diameter is the maximum distance between two nodes in the network.
The distance is measured in number links from one node to the
other. |
 | The larger the diameter, the longer it will take to send
message from one end to the other end of the network. |
|
 | Bisection
 | The bisection of a network is the minimum number of links we
need to cut in order to divide the network in to two
halves of roughly equal size. |
 | The larger the bisection, the less often the messages will
contend with one another while going from one half of the
network to the other. |
|
|
|
 | Array
 | One dimensional array is the "least connected" topology to
connect processors together, and also the least expensive to build |
 | "Compare and select" sorting
 | We pump the data from one end of the array one at a time. |
 | The processor does the following after receiving a data.
 | If it has no data, then it keeps the incoming data. |
 | Otherwise it compares the one it has with the incoming data,
and selects the larger one to be sent to its neighbor. |
|
 | Finally all the data will fall into the correct position. |
 |  |
 | All the figures, unless stated otherwise, are taken from "Introduction to Parallel Algorithms and Architectures: Arrays, Trees,
hypercubes," by F. T. Leighton, Morgan Kaufmann, ISBN 1-55860-117-1 |
 | Analysis
 | Both time and the number of processors used are n (n is the
number of data), so the cost is n2 |
|
|
 | Bubble sort
 | A simple algorithm to sort the data in an array. |
 | Notice that the latter stage can be moved "earlier" to
reduce the time. |
 |  |
|
 | Even-odd transportation sort
 | Even-odd transportation sort works in phases. In even phases, the
even numbered processors compare and exchange data with its
right-hand side neighbors. In odd phases the odd numbered processors
do the same. |
 | After n even phases and n odd phases (n is the number of data),
the array is sorted. |
 |  |
 | Proof of correctness
 | We use 0-1 principle to prove the correctness of the
algorithm. |
 | Consider the rightmost 1. This 1 will move either in the first
or the second phase, and once it is moving, it will not stop
until it hits the end of the array. |
 | Consider the second rightmost 1. The latest time step This 1
will start moving is third phase, and similarly, once it starts
moving, it will not stop until it his the second-to-the-right
cell. |
 | In general the ith rightmost 1 starts moving at time step i +
1, and will reach the n - i + 1 cell. |
 | Everyone reaches his destination in n steps. |
 |  |
|
 | Analysis
 | Both time and the number of processors used are n (n is the
number of data), so the cost is n2 |
|
|
 | Inner product
 | Two vectors are fed into both ends of the array for inner product. |
 | The timing is everything. |
 |  |
|
|
 | Two dimensional mesh
 | Snake sort
 | The algorithm
 | The snake sort proceeds in phases. In the first phase each row
is sorted so that even numbered rows have the largest number to
the right, and the odd numbered rows have the largest number to
the left. |
 | In the second phase each column is sorted so that the smallest
numbers appearing at the top of the columns. |
 | After 2log n phases, where n is the number of rows, the data
becomes sorted. |
|
 |  |
 | Proof of correctness
 | Again we use 0-1 principle. |
 | Let's call a row full of 0's (or 1's) clean, otherwise the row
is dirty. |
 | Two consecutive dirty rows will produce at least one clean row
since they have opposite orientation of 0's and 1's. |
 | The number of dirty rows decreases by half after one row sort
and one column sort. |
 | There will be at most one dirty row after 2log n phases. |
|
 | Analysis
 | The number of phases is 2log n, so the total time is O(n log
n). |
|
|
 | A better snake sort
|
 | A matching lower bound for sortinf
 | Here we want to establish a better lower bound than the obvious
one -- 2N1/2 |
 |  |
 | 2N1/2 unknown values are in the upper left-hand corner, and the
numbers from 1 to N - 2N1/2 are stored elsewhere. |
 | Let x be the element at the lower right-hand corner at step
2 N1/2 - 2 N1/4 -3, and x is independent from
those unknowns. |
 | Assume that the unknown area has m 0's and 2N1/2 - m
1's. If m varies, x must go to different columns (i.e., it will be
pushed by those 0's to the correct column). |
 | We just m so that x must go to the first column and the result
follows. |
|
 | Routing on a two dimensional mesh
 | It is never a problem in routing messages for a array (using a
naive greedy algorithm). |
 | No contention occurs. |
 |  |
 | We use the following simple greedy algorithm to route messages in
a mesh.
 | First all the messages are routed to the correct columns. |
 | Then the messages go to the correct rows. The message going
the longest distance is given the highest priority when
contention occurs. |
 |  |
 | The proof is similar to the one in even-odd transportation
sort. |
|
|
|