[Nfd-dev] Help needed with debugging duplicate Nonce

Ashlesh Gawande (agawande) agawande at memphis.edu
Tue Aug 2 14:19:39 PDT 2016


Hi Junxiao


I am trying to test multicast v2 (https://gerrit.named-data.net/#/c/3013/1) before I push the code review ready patches to resolve our issue.

(Issue 3176 and issue 2602).


The number of duplicate NACKs has increased.

For a 3-node topology:

NACKs increase from 58 (v1) to 132 (v2) over a 30-second convergence time.


For a 58-node topology:

In multicast v2, NACKs increase from 913073 (v1) to 5624592 (v2), and the topology does not converge in 30 seconds as it does with v1 (most of the NACKs are duplicates). The topology does not converge even in 3 minutes. The increased NACK processing appears to be responsible for the increased convergence time.


I have attached the 3-node topology test with ndndump output and logs.

Maybe I am doing something wrong in the patch. Could you please take a look?


Thanks

Ashlesh

________________________________
From: Lan Wang (lanwang)
Sent: Wednesday, July 13, 2016 2:37:19 PM
To: Junxiao Shi
Cc: Ashlesh Gawande (agawande); nfd-dev at lists.cs.ucla.edu
Subject: Re: [Nfd-dev] Help needed with debugging duplicate Nonce

Junxiao,

Ashlesh can work on these two issues if no one else volunteers.

Lan

On Jul 13, 2016, at 8:54 AM, Junxiao Shi <shijunxiao at email.arizona.edu<mailto:shijunxiao at email.arizona.edu>> wrote:

Hi Lan

Multicast strategy v1 does not have any of the desired behaviors.
It neither allows consumer retransmission, nor processes incoming Nacks.

ChronoSync developers could pick up the work on multicast strategy:
Feature #2063: permit consumer retransmission
Feature #3176: Nack processing

Yours, Junxiao

From: Lan Wang (lanwang)<mailto:lanwang at memphis.edu>
Sent: Wednesday, July 13, 2016 06:46
To: Junxiao Shi<mailto:shijunxiao at email.arizona.edu>
Cc: Ashlesh Gawande (agawande)<mailto:agawande at memphis.edu>; nfd-dev at lists.cs.ucla.edu<mailto:nfd-dev at lists.cs.ucla.edu>
Subject: Re: [Nfd-dev] Help needed with debugging duplicate Nonce

Junxiao,

On Jul 12, 2016, at 12:58 PM, Junxiao Shi <shijunxiao at email.arizona.edu<mailto:shijunxiao at email.arizona.edu>> wrote:

Hi Lan

Look at this topology:
A---B
 \ /
  C
Assume every link has the same delay, and the processing delay at each node is zero.

When C sends the same Interest to both A and B, A and B would forward the Interest to each other, and get Nack-Duplicate from each other.
Per ChronoSync protocol, the applications on A and B should express Interests with the same name. Since A and B are symmetric, let’s look at A.
There could be three different timings:

TIMING0 AppA expresses the Interest before C’s Interest arrives at A.
If we relabel A->C, B->A, C->B, it becomes either TIMING1 or TIMING2.

TIMING1 AppA expresses the Interest after C’s Interest has been forwarded to B, but before B responds Nack-Duplicate.
The duration between A forwarding C’s Interest to B and A receiving Nack-Duplicate from B is most likely shorter than the suppression interval, so appA’s Interest would be suppressed.
When Nack-Duplicate against C’s Interest arrives from B, strategy at A should send appA’s Interest to B.

TIMING2 AppA expresses the Interest after B responds Nack-Duplicate.
When Nack-Duplicate against C’s Interest arrives from B, strategy at A should not return the Nack to C, because appA is another upstream which has not Nacked.
When appA expresses the Interest, strategy at A should not suppress this Interest, because the sole pending upstream (appA) is now a downstream. The strategy should forward the Interest to both B and C.


When the strategy behaves as described above, everyone should get Data.
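(For concreteness, the per-PIT-entry decision in TIMING1/TIMING2 can be sketched as toy Python, not actual NFD code; the face names, record shapes, and return values below are my simplifications.)

```python
# Toy model of the Nack-Duplicate handling described above. One PIT entry;
# in-records map downstream faces to nonces, out-records map upstreams to
# the nonce sent. This is an illustration, not NFD's strategy API.

from dataclasses import dataclass, field

@dataclass
class PitEntry:
    in_records: dict = field(default_factory=dict)   # downstream face -> nonce
    out_records: dict = field(default_factory=dict)  # upstream face -> nonce sent
    nacked: set = field(default_factory=set)         # upstreams that returned Nack

def on_nack_duplicate(pit, upstream, nacked_nonce):
    """React to a Nack-Duplicate from `upstream` against `nacked_nonce`."""
    pit.nacked.add(upstream)
    # TIMING1: if some downstream (e.g. the local app) holds a different
    # nonce, retry that Interest toward the Nacking upstream.
    for face, nonce in pit.in_records.items():
        if nonce != nacked_nonce:
            return ("retry", upstream, nonce)
    # TIMING2 rule: only return Nack downstream once every upstream has
    # Nacked; otherwise keep waiting (someone may still answer with Data).
    if pit.nacked >= set(pit.out_records):
        return ("nack-downstream",)
    return ("wait",)

# TIMING1 at node A: C's Interest (nonce 4168) was forwarded to B and got
# Nack-Duplicate; appA's suppressed Interest (nonce 7385) is retried to B.
pit = PitEntry(in_records={"C": 4168, "app": 7385}, out_records={"B": 4168})
print(on_nack_duplicate(pit, "B", 4168))   # -> ('retry', 'B', 7385)
```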

The question is whether the multicast strategy ChronoSync uses has the above behavior (e.g., "When Nack-Duplicate against C’s Interest arrives from B, strategy at A should send appA’s Interest to B.").  I doubt that it does, based on the following.

Below is from your earlier email:

============== below is from Junxiao’s email ========
From A’s point of view, face261 is the local app, face262 is B, face263 is C.
Incoming+outgoing Interests, and PIT entry contents are as follows:

52.39 from=C=4168 to=app
      PIT in=[C=4168@52.39] out=[app@52.39]
52.72 from=B=6605
      PIT in=[C=4168@52.39,B=6605@52.72] out=[app@52.39]
52.74 from=app=7385 to=B,C
      PIT in=[C=4168@52.39,B=6605@52.72,app=7385@52.74] out=[app@52.39,B@52.74,C@52.74]
56.39 PIT in=[B=6605@52.72,app=7385@52.74]+expired[C=4168@52.39] out=[B@52.74,C@52.74]+expired[app@52.39]
56.72 PIT in=[app=7385@52.74]+expired[C=4168@52.39,B=6605@52.72] out=[B@52.74,C@52.74]+expired[app@52.39]
56.74 Interest unsatisfied, PIT erased

56.89 from=C=0641 to=app
      PIT in=[C=0641@56.89] out=[app@56.89]
56.90 from=app=7567 to=B,C
      PIT in=[C=0641@56.89,app=7567@56.90] out=[app@56.89,B@56.90,C@56.90]
57.06 from=B=1547
      PIT in=[C=0641@56.89,app=7567@56.90,B=1547@57.06] out=[app@56.89,B@56.90,C@56.90]
60.89 PIT in=[app=7567@56.90,B=1547@57.06]+expired[C=0641@56.89] out=[B@56.90,C@56.90]+expired[app@56.89]
60.90 PIT in=[B=1547@57.06]+expired[C=0641@56.89,app=7567@56.90] out=[]+expired[app@56.89,B@56.90,C@56.90]

61.05 from=C=6600 to=app,B,C
      PIT in=[B=1547@57.06,C=6600@61.05]+expired[app=7567@56.90] out=[app@61.05,B@61.05,C@61.05]

Time is represented with last two digits of integral part and first two digits of fractional part. Nonce is represented with last four digits. PIT in-record is represented as <downstream>=<nonce>@<time>. PIT out-record is represented as <upstream>@<time>.

The PIT entry has been erased at 56.74, so anything before that does not affect the later part.

At 61.05, an Interest with nonce 6600 is received from C. Before processing this Interest, the PIT entry has one unexpired in-record from B which has nonce 1547 and was created at 57.06; all out-records have been expired.
This Interest is forwarded with nonce 6600 to both app and B.
Since the PIT entry indicates B still wants the Data, the strategy also forwards the Interest with nonce 1547 (which came from B at 57.06) back to C.

While this may not be the best idea for multicast strategy, and using a nonce from non-local-app may be problematic with /localhop, this forwarding is valid in general: nonce 1547 has never been forwarded earlier, and downstream B still wants the Data, so the strategy can certainly forward the Interest to C using this nonce.
============ above is from Junxiao’s email =========
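(The validity argument at the end of the quoted trace can be sketched as a toy check, not NFD code; the function name and record shapes below are my own simplifications.)

```python
# A small sketch of the check Junxiao applies at 61.05: a nonce taken from
# an unexpired in-record may be forwarded upstream, provided that nonce was
# never sent out in an out-record before.

def may_forward_with_nonce(in_records, sent_nonces, nonce):
    """in_records: unexpired downstream records as {face: nonce};
    sent_nonces: nonces already used in out-records of this PIT entry."""
    return nonce in in_records.values() and nonce not in sent_nonces

# At 61.05: B's in-record (nonce 1547) is unexpired, and A never forwarded
# 1547 before, so forwarding to C with nonce 1547 is valid.
print(may_forward_with_nonce({"B": 1547, "C": 6600}, {6600}, 1547))  # True
```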

The above trace did not include some important information, i.e., (1) the Nack-Duplicate messages and (2) what happens after receiving the NACKs.  So I searched A’s nfd log from Ashlesh’s email and found the following, which I hope are the relevant messages.

1467225861.054557 DEBUG: [Forwarder] onIncomingInterest face=263 interest=/localhop/NLSR/sync/e4c858598a526f7f25d42f4da49cb51b7950cf352e26d5092bd35126b509101b (A received sync interest from C with nonce 6600 — the nonce is inferred from your email, I didn’t check the ndndump log)

1467225861.054705 DEBUG: [Forwarder] onOutgoingInterest face=262 interest=/localhop/NLSR/sync/e4c858598a526f7f25d42f4da49cb51b7950cf352e26d5092bd35126b509101b
 (A sent sync interest to B with the nonce 6600 - again nonce value is inferred from your email)

1467225861.054736 DEBUG: [Forwarder] onOutgoingInterest face=263 interest=/localhop/NLSR/sync/e4c858598a526f7f25d42f4da49cb51b7950cf352e26d5092bd35126b509101b
 (A sent sync interest to C with the nonce 1547 - again nonce value is inferred from your email)

— the above three lines correspond to the last line in your trace —

1467225861.075402 DEBUG: [Forwarder] onIncomingNack face=263 nack=/localhop/NLSR/sync/e4c858598a526f7f25d42f4da49cb51b7950cf352e26d5092bd35126b509101b~Duplicate OK

1467225861.075601 DEBUG: [Forwarder] onIncomingNack face=262 nack=/localhop/NLSR/sync/e4c858598a526f7f25d42f4da49cb51b7950cf352e26d5092bd35126b509101b~Duplicate OK

— the above two messages show that both B and C sent Nack-Duplicate back to A. C has previously received nonce 1547 from A, I think (A has expired the corresponding out-record, but C has not deleted the in-record, I guess).  I’m not sure why B would also send a Nack-Duplicate.  Did it previously receive one with nonce 6600 from A?

— according to your earlier email, these two NACKs would erase the corresponding in-records at B and C.

1467225861.224289 DEBUG: [Forwarder] onIncomingInterest face=262 interest=/localhop/NLSR/sync/e4c858598a526f7f25d42f4da49cb51b7950cf352e26d5092bd35126b509101b (B sent another sync interest to A)

1467225861.378008 DEBUG: [Forwarder] onIncomingInterest face=261 interest=/localhop/NLSR/sync/e4c858598a526f7f25d42f4da49cb51b7950cf352e26d5092bd35126b509101b (A’s app sent a sync interest to A’s nfd)

1467225861.378105 DEBUG: [Forwarder] onContentStoreMiss interest=/localhop/NLSR/sync/e4c858598a526f7f25d42f4da49cb51b7950cf352e26d5092bd35126b509101b
1467225865.378406 DEBUG: [Forwarder] onInterestUnsatisfied interest=/localhop/NLSR/sync/e4c858598a526f7f25d42f4da49cb51b7950cf352e26d5092bd35126b509101b
1467225865.378471 DEBUG: [Forwarder] onInterestFinalize interest=/localhop/NLSR/sync/e4c858598a526f7f25d42f4da49cb51b7950cf352e26d5092bd35126b509101b unsatisfied
1467225865.528040 DEBUG: [Forwarder] onIncomingInterest face=263 interest=/localhop/NLSR/sync/e4c858598a526f7f25d42f4da49cb51b7950cf352e26d5092bd35126b509101b
1467225865.528083 DEBUG: [Forwarder] onContentStoreMiss interest=/localhop/NLSR/sync/e4c858598a526f7f25d42f4da49cb51b7950cf352e26d5092bd35126b509101b
1467225865.528126 DEBUG: [Forwarder] onOutgoingInterest face=261 interest=/localhop/NLSR/sync/e4c858598a526f7f25d42f4da49cb51b7950cf352e26d5092bd35126b509101b

— the above shows that A did not send a sync Interest to B and C after A’s app sent a sync Interest to A’s nfd (there were no such messages between 61.3 and 65), which means that B and C won’t have the corresponding in-records (but for ChronoSync to work there needs to be a pending Interest in each direction on every link).

Lan

Yours, Junxiao



From: Lan Wang (lanwang)<mailto:lanwang at memphis.edu>
Sent: Tuesday, July 12, 2016 09:28
To: Junxiao Shi<mailto:shijunxiao at email.arizona.edu>
Cc: Ashlesh Gawande (agawande)<mailto:agawande at memphis.edu>; nfd-dev at lists.cs.ucla.edu<mailto:nfd-dev at lists.cs.ucla.edu>
Subject: Re: [Nfd-dev] Help needed with debugging duplicate Nonce

When strategy instructs forwarder to send a Nack to face B, it removes the PIT in-record of face B. In case the PIT entry is later satisfied by Data, face B would not receive a copy of this Data.

The highlighted part above is the problem.  Since the in-record of face B is removed after sending the Nack, any Sync Data packet generated by ChronoSync cannot be delivered to the neighbor on face B.  The neighbor would have to wait for the sync period to expire and send the next Sync Interest in order to receive the Data.  This is the major problem caused by the duplicate Nacks (not just pure message overhead).
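(The consequence described above can be illustrated with a toy snippet, not NFD code; the function and face names are my own.)

```python
# Toy illustration of the problem: sending a Nack to a face erases its PIT
# in-record, so Data that arrives later is no longer delivered to that face.

def deliver_data(in_records):
    """Data satisfying a PIT entry goes to every face with an in-record."""
    return set(in_records)

in_records = {"B": 1547, "C": 6600}
in_records.pop("B")              # strategy Nacks face B -> in-record removed
print(deliver_data(in_records))  # {'C'}: B misses the Sync Data and must
                                 # wait for the sync timeout to retransmit
```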

-------------- next part --------------
A non-text attachment was scrubbed...
Name: multicastv2-test.tar.gz
Type: application/gzip
Size: 301072 bytes
Desc: multicastv2-test.tar.gz
URL: <http://www.lists.cs.ucla.edu/pipermail/nfd-dev/attachments/20160802/e90cb0ae/attachment.bin>

