US 5638516 Parallel processor that routes messages around blocked or faulty nodes by selecting an output port to a subsequent node from a port vector and transmitting a route ready signal back to a previous node
ABSTRACT – A parallel processor network comprised of a plurality of nodes, each node including a processor containing a number of I/O ports, and a local memory. A communication path is established through a node by comparing a target node address in a first address packet with a processor ID of the node. If node address is equal to the target node address a receive channel is allocated to the input port and a route ready command is sent over an output port paired with the input port. If the node address is not equal to the target node address, then a first unallocated output port is selected from a port vector and the address packet is forwarded to a next node over the selected output port.
FIELD OF THE INVENTION
The invention relates to data-processing systems, and more particularly, to a communication mechanism for use in a high-performance, parallel-processing system.
BACKGROUND OF THE INVENTION
U.S. Pat. No. 5,113,523 describes a parallel processor comprised of a plurality of processing nodes, each node including a processor and a memory. Each processor includes means for executing instructions, logic connected to the memory for interfacing the processor with the memory and an internode communication mechanism. The internode communication mechanism connects the nodes to form a first array of order n having a hypercube topology. A second array of order n having nodes connected together in a hypercube topology is interconnected with the first array to form an order n+1 array. The order n+1 array is made up of the first and second arrays of order n, such that a parallel processor system may be structured with any number of processors that is a power of two. A set of I/O processors are connected to the nodes of the arrays by means of I/O channels. The internode communication comprises a serial data channel driven by a clock that is common to all of the nodes.
The above-referenced U.S. Pat. No. 5,367,636, describes a fixed-routing communication system in which each of the processors in the network described in U.S. Pat. No. 5,113,523 is assigned a unique processor identification (ID). The processor IDs of two processors connected to each other through port number n, vary only in the nth bit. A plurality of input ports and a plurality of output ports are provided at each node. Control means at one of the input ports of the node receives address packets related to a current message from an output port of another of the nodes. A data bus connects the input and output ports of the node together such that a message received on any one input port is routed to any other output port. A compare logic compares a node address in a first address packet with the processor ID of the node to determine the bit position of the first difference between the node address in the first address packet and the processor ID of the node. The compare logic includes means for activating for transmission of the message packet placed on the data bus by the input port, the one of the plurality of output ports whose port number corresponds to the bit position of the first difference, starting at bit n+1, where n is the number of the port on which the message was received.
In the fixed routing scheme described in the above-referenced U.S. Pat. No. 5,367,636, a message from a given source to a given destination can take exactly one routing path, unless it is forwarded, cutting through intermediate nodes and blocking on busy channels until the path is established. The path taken is the dimension-order minimum-length path. While this scheme is deadlock-free, it will not reroute messages around blocked or faulty nodes.
It is desirable to provide a new communication system using reliable messaging mechanisms, whereby both the sender and receiver of a message can know quickly whether the message was delivered reliably, and the receiver may deliver status information back to the sender before an established path is broken.
It is also desirable that the communication system be able to route messages around blocked or faulty nodes and hot spots in a parallel processor, by implementing unique adaptive routing methods that make use of the reliable messaging mechanisms.
It is also desirable that the communication mechanism provide a more efficient utilization of the bandwidth of a hypercube communication network, by duplicating (folding) network links or otherwise unused communications ports, and by avoiding extended network blockages through the adaptive routing methods.
SUMMARY OF THE INVENTION
The reliable messaging mechanisms, whereby both the sender and receiver of a message can know quickly whether that the message was delivered reliably, and the receiver may deliver status information back to the sender before an established path is broken, is accomplished in accordance with an embodiment of the present invention by providing an end-to-end reporting network. When end-to-end is enabled and a message transmission along an established path from node A to node B is fully received at a receiver or “target” node B, hardware error status or a programmed receive code is sent, by the hardware, from node B back to node A along a parallel “back-track” path. Thus the communications architecture provides a transmission network and a corresponding back-track reporting network. These two networks are implemented as virtual networks that share the same physical communications network of internodal links. This has the advantage that a back-track network is added to an existing transmission network without a requirement for additional internodal communications links or signals.
In accordance with an aspect of the invention, end-to-end status packets are delivered via the back-track network to the send channel, at the sending node, that initiated the corresponding transmission. There is provided at each send channel an end-to-and status queue at which the programmed/central processing unit (CPU) is notified of and extracts the status.
The end-to-end hardware has the advantage of providing reliable messaging without additional message transmissions and the corresponding CPU and software overhead. Also, status is returned more reliably and much quicker by these dedicated end-to-end mechanisms than would be the case if a separate message had to be delivered from the receiver to the sender. Therefore, operating system resources dedicated to a given transmission can be released much sooner. In addition, these end-to-end mechanisms provide the back-track network upon which an adaptive routing protocol is built.
A communication system able to route messages around blocked or faulty nodes and hot spots in a parallel processor, is accomplished by implementing a unique maze adaptive routing mechanism that makes use of the back-track mechanism described above.
In a maze adaptive routing scheme for a transmission from node A to node B, all minimum-length paths between the two nodes are searched by a single-packet scout that attempts to find a free path.
One minimum path at a time is scouted, starting with the lowest-order uphill path and doing a depth-first, helical traversal of the minimum-path graph until a free path to the destination is found. If no free minimum-length path is found, other, non-minimum-length paths may be searched; or the central processing unit may be interrupted so that software can restart the search or implement some other policy.
The maze router also exhibits superior bandwidth usage and latency for most message mixes. This is attributed to its exhaustive yet sequential approach to route searching. The maze router eliminates the blockage of the fixed routing wormhole scheme, yet keeps route-search traffic to a minimum.
In accordance with an aspect of the invention, a node address packet, used for finding and establishing a route to a destination node, is provided with a destination-node address, plus other necessary control bits and parameters. The address packet, in conjunction with the back-track routing network, provides the means by which a suitable transmission path can be found adaptively by “scouting-out” various possible routes and reporting routing status back to the sending node as appropriate.
The invention has the advantage that the mechanism automatically routes around blocked or disabled nodes.
A more efficient utilization of the bandwidth of a hypercube communication network, by duplicating network links or otherwise unused communications ports, and by avoiding extended network blockages through the adaptive routing methods, is accomplished in accordance with the present invention by a network folding mechanism. For system configurations in which the hypercube is of a smaller dimension than the maximum supported by available port-links, the unused hypercube links can be used to duplicate the required internodal connections and thus provide additional routing paths and message data bandwidth. These “folded” connections are configured in the processor with a fold-enable mask register. The route-selection logic and node addressing logic is modified to allow transmission paths to be established through these folded/duplicated links.
The folding mechanisms provide the advantage of improved network bandwidth and message latency for systems configured as less than the maximum hypercube. This is accomplished in two ways. First, more links are available to paths within the folded cube and therefore more message data can be directed from one node to another in a given amount of time, thus increasing bandwidth. Second, more routing paths are available so that there is less chance of a blockage or a failure to route, and therefore a higher network bandwidth usage. Also, the average time to establish a route is shorter, thus reducing overall latency. These advantages apply to both fixed and adaptive routing methods, but are most efficiently exploited by the adaptive router.