[Nfd-dev] NFD RIB Daemon - trigger FIB updates from RIB

Lan Wang (lanwang) lanwang at memphis.EDU
Tue Jun 24 20:25:12 PDT 2014


Just realized one issue: if we really want to be consistent, then we also need to reverse all FIB updates that belong to the same RIB update if any of them fails.  This would require keeping the state of the original FIB.  Also, batching RIB updates may make this reversal difficult if the FIB updates for several RIB updates are merged into one optimal set.  There may be other complexities related to this. Just want to raise awareness.
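
For illustration, the state-keeping mentioned above could take the form of an undo log: before each FIB update is sent, record the update that would reverse it, and replay the log backwards on failure. The following is a minimal Python sketch, not NFD code. The dict-based FIB, the ("add" | "remove", prefix, face) tuples, and the names apply_fib_update, inverse, and apply_with_rollback are all assumptions made for this example.

```python
class FibUpdateError(Exception):
    """Raised by the hypothetical FIB when an update cannot be applied."""


def apply_fib_update(fib, update, bad_faces=frozenset()):
    """Apply one ("add" | "remove", prefix, face) update to a dict-based FIB."""
    action, prefix, face = update
    if face in bad_faces:
        # Simulated failure; raised before the FIB is touched.
        raise FibUpdateError(f"face {face} does not exist")
    nexthops = fib.setdefault(prefix, set())
    if action == "add":
        nexthops.add(face)
    else:
        nexthops.discard(face)
    if not nexthops:
        del fib[prefix]


def inverse(fib, update):
    """Return the update that undoes `update`, or None if it changes nothing."""
    action, prefix, face = update
    present = face in fib.get(prefix, set())
    if action == "add":
        return None if present else ("remove", prefix, face)
    return ("add", prefix, face) if present else None


def apply_with_rollback(fib, updates, bad_faces=frozenset()):
    """Apply all FIB updates of one RIB update, or none of them."""
    undo_log = []
    try:
        for update in updates:
            undo = inverse(fib, update)  # capture the original state first
            apply_fib_update(fib, update, bad_faces)
            if undo is not None:
                undo_log.append(undo)
    except FibUpdateError:
        for undo in reversed(undo_log):  # reverse in last-in, first-out order
            apply_fib_update(fib, undo)
        return False
    return True


fib = {"/a": {1}}
ok = apply_with_rollback(
    fib, [("add", "/a", 2), ("add", "/b", 2), ("add", "/c", 9)], bad_faces={9})
print(ok, fib)
```

In this sketch the undo log covers exactly the list of updates passed in. If the FIB updates for several RIB updates were merged into one set, the log could only reverse that whole set, which is the difficulty with batching noted above.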

Lan



-------- Original message --------
From: Beichuan Zhang <bzhang at cs.arizona.edu>
Date: 06/24/2014 9:13 PM (GMT-06:00)
To: "Lan Wang (lanwang)" <lanwang at memphis.edu>
Cc: Syed Obaid Amin <obaidasyed at gmail.com>,"<nfd-dev at lists.cs.ucla.edu>" <nfd-dev at lists.cs.ucla.edu>
Subject: Re: [Nfd-dev] NFD RIB Daemon - trigger FIB updates from RIB


I just finished a call with Alex/Junxiao/Steve. Junxiao’s estimate is that this task (http://redmine.named-data.net/issues/1326) shouldn’t take too long, at least less than the time for finishing the RIB/FIB update code (http://redmine.named-data.net/issues/1325). So we suggest making these changes, but to make clear what changes we’re talking about, here’s my understanding:

- RibManager doesn’t issue FIB updates. They are issued by the module implemented in task #1325.

- RibManager doesn’t set/maintain route expiration timers. They are set/maintained by the RIB using EventEmitter.

- In all cases, FIB should be updated first, then the RIB.
  + what is the current code doing? I guess to generate FIB updates, the code may need to update the RIB first, or at least traverse the RIB trie. Not sure how to consolidate these two.

- One RIB update can generate multiple FIB updates, which may take time to complete.

For a prefix registration request from application, the RibManager should pass a callback to Rib::beginApplyUpdate, which is called after all FIB updates are successful and the RIB is updated. The called function can respond to the prefix registration request.

For routing updates, the RibManager can respond with "100 continue" or similar code right away without waiting for FIB updates to complete.

- If a RIB update triggers multiple FIB updates and one of them fails, take action according to the reason for the failure:
  (a) Signing key of RIB Daemon is not trusted: the RIB Daemon should stop.
  (b) The face doesn’t exist: the RIB update is abandoned, thus the RIB is not changed.
  (c) Timeout (e.g., NFD is too busy): retry the remaining FIB updates.

- Junxiao’s proposal includes bundling some RIB updates together as a batch. All RIB updates in the same batch must have the same nexthop face. Therefore, when a FIB update fails due to a non-existent face, all RIB updates in the same batch can be abandoned at once. This is meant to improve performance, but I’m not sure how much it helps. We don’t need this code for this release. If you agree with the design, you may implement the interface.
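
On the open question under "FIB should be updated first, then the RIB": one possible way to generate FIB updates without changing the RIB is to apply the RIB update to a tentative copy and compare the FIB implied by each version. The Python sketch below only illustrates that idea and is not what the current code does. The dict-based RIB and the names fib_view, apply_rib_update, and compute_fib_updates are hypothetical, and route inheritance is ignored entirely.

```python
import copy


def fib_view(rib):
    """FIB implied by a RIB: prefix -> set of nexthop faces (no inheritance)."""
    return {prefix: set(faces) for prefix, faces in rib.items() if faces}


def apply_rib_update(rib, update):
    """Apply one ("register" | "unregister", prefix, face) update in place."""
    action, prefix, face = update
    faces = rib.setdefault(prefix, set())
    if action == "register":
        faces.add(face)
    else:
        faces.discard(face)


def compute_fib_updates(rib, rib_update):
    """Return (fib_updates, tentative_rib) without modifying `rib`."""
    tentative = copy.deepcopy(rib)
    apply_rib_update(tentative, rib_update)
    before, after = fib_view(rib), fib_view(tentative)
    updates = []
    for prefix in sorted(set(before) | set(after)):
        old = before.get(prefix, set())
        new = after.get(prefix, set())
        updates += [("add", prefix, face) for face in sorted(new - old)]
        updates += [("remove", prefix, face) for face in sorted(old - new)]
    return updates, tentative


rib = {"/a": {1}}
updates, tentative = compute_fib_updates(rib, ("register", "/a", 2))
print(updates)  # FIB updates to send first
print(rib)      # the real RIB is still untouched at this point
```

Once every FIB update has succeeded, the tentative copy could replace the real RIB. Copying the whole RIB is only for brevity here; a real implementation would presumably touch just the affected part of the trie.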

Beichuan


On Jun 25, 2014, at 5:33 AM, Lan Wang (lanwang) <lanwang at memphis.edu> wrote:

I agree with keeping the current code for v0.2.  But for a later release, we need to agree on what specific approach to follow.  Do you have any comments on Junxiao's proposal?

Lan

On Jun 24, 2014, at 2:08 PM, Syed Obaid Amin <obaidasyed at gmail.com> wrote:

Lan,

I left a comment on gerrit last night regarding a problem in using EventEmitter. This proposal of Junxiao is one of the possible solutions for that problem. Please see the following link for the details:

http://gerrit.named-data.net/#/c/911/4/rib/rib-manager.cpp

If we use EventEmitter then we need to make these changes as well, which I think would take time. Therefore I suggested that for v0.2 we use the code that we have under review for task #1326 (http://redmine.named-data.net/issues/1326). For a later release we can use the EventEmitter approach.

Regards,
Obaid


On Tue, Jun 24, 2014 at 1:07 PM, Lan Wang (lanwang) <lanwang at memphis.edu> wrote:
Junxiao,

I've asked Vince and Obaid to look at this proposal closely.  But before you sent this proposal, Obaid and I already talked about the route expiration issue you raised, and he will implement the EventEmitter approach you suggested -- the RIB maintains the time-out and sends a signal to RibManager when the timer expires.  Just FYI.

Lan

On Jun 24, 2014, at 8:22 AM, Junxiao Shi <shijunxiao at email.arizona.edu> wrote:

This is a design proposal for NFD RIB Daemon.

Background
RIB Management functionality is implemented as a separate process: NFD RIB Daemon.
RIB Daemon processes RIB updates from local applications and routing protocols, and then synchronizes the RIB to the FIB in main NFD process via FIB Management protocol.
Currently, RIB Daemon has a RibManager component that is responsible for processing RIB updates, and a RIB data structure which is exclusively accessed by the RibManager.
Historically, before the RIB was implemented, RIB Daemon was a pass-through protocol translator that translated each RIB update into one FIB update, ignoring route inheritance flags.
In the protocol translator, FIB updates were sent from RibManager, and a RIB update command was answered after the corresponding FIB update was complete.

Problem
Task 1326 implements Route expiration in RIB: when a Route (prefix registration) expires, the FaceEntry is removed from the RIB, which shall trigger FIB updates.
During the code review http://gerrit.named-data.net/911 (patchset 4 rib-manager.cpp), a question was raised: should RibManager set the timer to remove the FaceEntry and trigger FIB updates, or should the RIB set the timer and notify RibManager?
Reviewer opinion is: expiration period is an attribute of a Route, so the timer should be set in RIB.
This leads to another problem: after removing the FaceEntry, if the FIB update(s) fails, RIB and FIB will be out-of-sync.
A related problem is: if a RIB update triggers multiple FIB updates and some FIB updates succeed but others fail, RIB and FIB will also be out-of-sync.
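
To make the "timer in the RIB" option concrete, here is a small Python sketch in which the RIB keeps the expiration times and notifies subscribers when a route expires. It is only an illustration under stated assumptions: ExpiringRib, on_expire, and advance_clock are hypothetical names, time is passed in explicitly instead of using a real scheduler, and NFD's EventEmitter is not modeled.

```python
import heapq


class ExpiringRib:
    """Toy RIB that owns route expiration and notifies subscribers."""

    def __init__(self):
        self.routes = {}        # (prefix, face) -> expiry time
        self._expiry_heap = []  # (expiry time, prefix, face)
        self._listeners = []    # callbacks invoked with (prefix, face)

    def on_expire(self, callback):
        """Subscribe to route expiration, e.g. from a RibManager."""
        self._listeners.append(callback)

    def register(self, prefix, face, expires_at):
        """Add a route, or refresh the expiry time of an existing one."""
        self.routes[(prefix, face)] = expires_at
        heapq.heappush(self._expiry_heap, (expires_at, prefix, face))

    def advance_clock(self, now):
        """Expire every route whose lifetime ended at or before `now`."""
        while self._expiry_heap and self._expiry_heap[0][0] <= now:
            expires_at, prefix, face = heapq.heappop(self._expiry_heap)
            # Skip stale heap entries left behind by a re-registration.
            if self.routes.get((prefix, face)) != expires_at:
                continue
            del self.routes[(prefix, face)]
            for callback in self._listeners:
                callback(prefix, face)


rib = ExpiringRib()
rib.on_expire(lambda prefix, face: print("expired:", prefix, face))
rib.register("/a", 1, expires_at=10)
rib.register("/b", 2, expires_at=20)
rib.advance_clock(15)
```

Note that this sketch deletes the route before any FIB update has been attempted, so it has exactly the out-of-sync hazard described above if a later FIB update fails.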

Proposed Design
RibManager updates the RIB only; it does not directly update the FIB.

RIB updates are batched. A batch of RIB updates is packaged into a RibUpdateBatch object. All RIB updates in the same RibUpdateBatch object must refer to the same face.
When RibManager receives one or more RIB updates, it generates a RibUpdateBatch, and passes it to Rib::beginApplyUpdate() to begin the RIB update procedure.
When one or more Routes expire, the RIB itself generates a RibUpdateBatch and begins the RIB update procedure.
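
As a rough picture of the batching rule, the Python sketch below models a batch that rejects any update for a different face. Only the name RibUpdateBatch comes from the proposal; the RibUpdate fields, the add method, and the make_batches helper (which splits a mixed list of updates into one batch per face) are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RibUpdate:
    action: str   # "register" or "unregister"
    prefix: str
    face_id: int


@dataclass
class RibUpdateBatch:
    """RIB updates that all refer to the same face."""
    face_id: int
    updates: list = field(default_factory=list)

    def add(self, update):
        if update.face_id != self.face_id:
            raise ValueError(
                f"update is for face {update.face_id}, "
                f"batch is for face {self.face_id}")
        self.updates.append(update)


def make_batches(updates):
    """Group updates into one batch per face, keeping per-face arrival order."""
    batches = {}
    for update in updates:
        batch = batches.setdefault(
            update.face_id, RibUpdateBatch(update.face_id))
        batch.add(update)
    return list(batches.values())


batches = make_batches([
    RibUpdate("register", "/a", 1),
    RibUpdate("register", "/b", 2),
    RibUpdate("unregister", "/c", 1),
])
for batch in batches:
    print(batch.face_id, [u.prefix for u in batch.updates])
```

Because every update in a batch shares one face, a non-existent face error on any FIB update is enough to abandon the whole batch.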

The RIB update procedure has admission control: if a previous RibUpdateBatch is in progress, the new batch will be put in a queue until the previous batch is complete.
To apply a batch of RIB updates, the current collection of RIB entries and the RibUpdateBatch are passed to FibUpdater component to compute the optimal set of FIB updates, and the FibUpdater component shall send these FIB updates.
If any FIB update fails, (a) in case of a non-recoverable error (e.g., signing key of RIB Daemon is not trusted), the RIB Daemon shall crash; (b) in case of a non-existent face error, the RibUpdateBatch is abandoned; (c) in other cases, the FIB update is retried.
If the RibUpdateBatch is abandoned, the RIB is unchanged; otherwise, after all FIB updates are successful, the RIB entries are updated.
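
The update procedure described above can be sketched as follows. This is a simplified, synchronous Python model and not the NFD implementation. The FibResult values, FatalRibError, naive_fib_updates (a stand-in for the FibUpdater that does no optimization), the tuple-based batch format, and the max_retries bound are all assumptions; the proposal itself does not say how many times a FIB update is retried.

```python
from collections import deque
from enum import Enum


class FibResult(Enum):
    OK = "ok"
    TIMEOUT = "timeout"            # case (c): retried
    NO_SUCH_FACE = "no such face"  # case (b): batch abandoned
    UNTRUSTED_KEY = "untrusted"    # case (a): non-recoverable


class FatalRibError(Exception):
    """Stands in for the RIB Daemon crashing."""


def naive_fib_updates(entries, batch):
    """Stand-in FibUpdater: one FIB update per RIB update, no optimization."""
    return [("add" if action == "register" else "remove", prefix, face)
            for action, prefix, face in batch]


class Rib:
    def __init__(self, compute_fib_updates, send_fib_update, max_retries=3):
        self.entries = {}        # prefix -> set of face ids
        self._compute = compute_fib_updates
        self._send = send_fib_update
        self._max_retries = max_retries  # assumed bound, not in the proposal
        self._queue = deque()
        self._busy = False

    def begin_apply_update(self, batch, on_done=None):
        """Queue a batch; it runs once every earlier batch is complete."""
        self._queue.append((batch, on_done))
        if self._busy:
            return               # admission control: a batch is in progress
        self._busy = True
        try:
            while self._queue:
                next_batch, callback = self._queue.popleft()
                applied = self._apply(next_batch)
                if callback is not None:
                    callback(applied)
        finally:
            self._busy = False

    def _apply(self, batch):
        for fib_update in self._compute(self.entries, batch):
            if not self._send_with_retry(fib_update):
                return False     # abandoned: the RIB is left unchanged
        for action, prefix, face in batch:  # all FIB updates succeeded
            faces = self.entries.setdefault(prefix, set())
            if action == "register":
                faces.add(face)
            else:
                faces.discard(face)
        return True

    def _send_with_retry(self, fib_update):
        for _ in range(self._max_retries):
            result = self._send(fib_update)
            if result is FibResult.OK:
                return True
            if result is FibResult.NO_SUCH_FACE:
                return False
            if result is FibResult.UNTRUSTED_KEY:
                raise FatalRibError("signing key of RIB Daemon is not trusted")
        raise FatalRibError("FIB update still failing after retries")


def send(fib_update):
    # Pretend face 9 does not exist in NFD.
    return FibResult.NO_SUCH_FACE if fib_update[2] == 9 else FibResult.OK


rib = Rib(naive_fib_updates, send)
rib.begin_apply_update([("register", "/a", 1)], lambda ok: print("batch 1:", ok))
rib.begin_apply_update([("register", "/b", 9)], lambda ok: print("batch 2:", ok))
print(rib.entries)
```

FIB updates that were already sent before a batch is abandoned are not reversed in this sketch.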

For a prefix registration request from application, the RibManager should pass a callback to Rib::beginApplyUpdate, which is called after all FIB updates are successful and the RIB is updated. The called function can respond to the prefix registration request.
For routing updates, the RibManager can respond with "100 continue" or similar code right away without waiting for FIB updates to complete.
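
The two response paths can be sketched as below. This is a hypothetical Python model: DeferredRib is a test stand-in whose batches finish only when finish_next is called, the handler names are invented for the example, and the 200 and 500 status codes are placeholders. Only the "100 continue" code is taken from the text above.

```python
STATUS_CONTINUE = 100  # from the proposal: "100 continue" or similar
STATUS_OK = 200        # placeholder success code (assumption)
STATUS_FAILED = 500    # placeholder failure code (assumption)


class DeferredRib:
    """Stand-in RIB whose batches complete only when finish_next is called."""

    def __init__(self):
        self._pending = []

    def begin_apply_update(self, batch, on_done=None):
        self._pending.append((batch, on_done))

    def finish_next(self, success=True):
        batch, on_done = self._pending.pop(0)
        if on_done is not None:
            on_done(success)


class RibManager:
    def __init__(self, rib, reply):
        self._rib = rib
        self._reply = reply  # reply(request_id, status_code)

    def on_app_registration(self, request_id, batch):
        # Answer only after the FIB updates succeeded and the RIB is updated.
        def done(success):
            self._reply(request_id, STATUS_OK if success else STATUS_FAILED)
        self._rib.begin_apply_update(batch, done)

    def on_routing_update(self, request_id, batch):
        # Answer right away; FIB updates complete in the background.
        self._rib.begin_apply_update(batch)
        self._reply(request_id, STATUS_CONTINUE)


replies = []
rib = DeferredRib()
manager = RibManager(rib, lambda rid, code: replies.append((rid, code)))
manager.on_app_registration("app-1", [("register", "/app", 1)])
manager.on_routing_update("routing-1", [("register", "/net", 2)])
rib.finish_next()  # the application's batch completes
print(replies)
```

The routing update is answered before the application request even though it arrived later, because only the application path waits on the callback.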

Benefits
The proposed design decouples FIB updates from RIB updates.
A RIB update, regardless of source, can trigger FIB updates.
If a RIB update command needs to wait for the RIB update and FIB updates to complete, RibManager can wait on a callback from the RIB.



Yours, Junxiao



_______________________________________________
Nfd-dev mailing list
Nfd-dev at lists.cs.ucla.edu
http://www.lists.cs.ucla.edu/mailman/listinfo/nfd-dev
