US 6697325 System, device, and method for expediting reconvergence in a communication network

ABSTRACT – In a system, device, and method for expediting reconvergence in a communication network, a first indication of a communication link failure prompts a node to compute new routes. Upon receiving the first indication of the communication link failure, the node determines the nodes that are associated with the failed communication link. The node disassociates the failed communication link from all such nodes, and computes new routes. Subsequent indications of the same communication link failure are ignored.

FIELD OF THE INVENTION

The present invention relates generally to communication systems, and more particularly to expediting reconvergence in a communication system.

BACKGROUND OF THE INVENTION

In today’s information age, communication networks are often used for interconnecting computers and computer peripherals. A communication network typically includes a number of nodes that interoperate to route protocol messages. The various nodes in the communication network utilize various routing protocols in order to determine the routes that are used to route the protocol messages.

One type of routing protocol, known as a “link state” routing protocol, determines routes based upon the status of communication links between the various nodes. A link state routing protocol, such as OSPF and IS-IS, requires each node to have complete topology information. Each node maintains a topology database that indicates all nodes in the communication network and lists the communication links that are associated with each node.

The various nodes in the communication network exchange link status information using link state advertisement (LSA) protocol messages. Specifically, each node periodically tests the communication links to each of its neighbors and sends a LSA protocol message including the link status information to all of the other nodes. Each node computes the routes based upon the link status information received from the other nodes.

When a node receives a LSA protocol message, the node updates its topology database based upon the link status information received in the LSA protocol message, and runs a special algorithm to determine the routes based upon the updated topology information. One well-known algorithm for determining routes is the Dijkstra shortest path algorithm. The Dijkstra shortest path algorithm computes the shortest paths to all destinations from a single source.

When a communication link fails, the various nodes in the communication network interoperate to route protocol messages around the failed communication link. This is often referred to as “reconvergence.” Each node that supports the failed communication link (referred to hereinafter as a “supporting” node) sends an LSA protocol message to the other nodes in the communication network identifying the failed communication link. Each supporting node may detect the communication link failure at a different time, and therefore each supporting node may send the LSA protocol message at a different time. Each node updates its topology database to reflect the failed communication link, based upon the link status information from the LSA protocol messages, and uses the Dijkstra shortest path algorithm in order to compute new routes.

In one prior art embodiment, a node computes new routes each time it receives an LSA protocol message. Specifically, upon receiving an LSA protocol message, a node updates its topology database based upon the link status information in the LSA protocol message, and computes new routes based upon the updated topology database.

In the case of a communication link failure, each supporting node sends an LSA protocol message identifying the failed communication link. When a node receives such an LSA protocol message from a supporting node, the node updates its topology database, specifically by removing the failed communication link from the list of communication links associated with the supporting node, and then computes new routes based upon the updated topology database. Because each node may receive multiple LSA protocol messages relating to the same communication link failure, each node may update its topology database and compute new routes multiple times for the same communication link failure. Reconvergence is not complete until all LSA protocol messages are processed, since each LSA protocol message identifies the communication link failure from the perspective of one particular supporting node. Furthermore, the Dijkstra shortest path algorithm is computationally intensive and can take a relatively long time to complete, especially in communication networks having many nodes, so it is desirable to reduce the number of times new routes are computed, preferably once per communication link failure.

In another prior art embodiment, each node computes new routes based upon link status information received over a period of time. Specifically, upon receiving a first LSA protocol message, a node updates its topology database based upon the link status information in the LSA protocol message, and starts a timer. If the node receives additional LSA protocol messages while the timer is running, the node updates its topology database based upon the link status information in the additional LSA protocol messages. This allows the node to receive multiple LSA protocol messages relating to the same communication link failure before computing new routes. When the timer expires, the node computes new routes based upon all link status information received during the timeout period. This reduces the number of times the routes are computed, but delays reconvergence.

In the case of a communication link failure, each supporting node sends an LSA protocol message identifying the failed communication link. When a node receives a first such LSA protocol message from a supporting node, the node updates its topology database, specifically by removing the failed communication link from the list of communication links associated with the supporting node, and starts the timer. During the timeout period, the node receives additional LSA protocol messages from the other supporting nodes. For all such additional LSA protocol messages, the node updates its topology database, specifically by removing the failed communication link from the lists of communication links associated with the other supporting nodes. When the timer expires, the node computes new routes based upon the updated topology database, which includes all link status information received during the timeout period. The node only computes new routes once, although, as before, reconvergence is not complete until all LSA protocol messages are processed.

In order for the communication network to operate efficiently, it is important for reconvergence to occur as quickly as possible following a communication link failure. Unfortunately, in both prior art embodiments described above, reconvergence does not occur until each node processes LSA protocol messages from all supporting nodes. This may take a substantial amount of time, during which the nodes may continue routing protocol messages to the failed communication link.

Thus, a technique for expediting reconvergence following a communication link failure is needed.

SUMMARY OF THE INVENTION

In accordance with one aspect of the invention, a first indication of a communication link failure prompts a node to compute new routes. Upon receiving the first indication of the communication link failure, the node determines the nodes that are associated with the failed communication link. The node disassociates the failed communication link from all such nodes, and computes new routes. Subsequent indications of the same communication link failure are ignored.

More particularly, upon receiving a LSA protocol message from a supporting node indicating a communication link failure, the node updates its topology database by removing the failed communication link from the list of communication links associated with the supporting node. The node then uses its topology database to determine other nodes that are associated with the failed communication link, and updates its topology database by removing the failed communication link from the lists of communication links associated with the other nodes. The node then computes new routes based upon the updated topology database.

The node ignores any subsequent LSA protocol messages relating to the same communication link failure. A caching mechanism is preferably used to determine whether a particular LSA protocol message is related to a previous communication link failure.

One advantage of the present invention is that the node only computes new routes once for each communication link failure.

Another advantage of the present invention is that reconvergence is achieved upon processing the first LSA protocol message indicating the communication link failure.

 

Related Posts