[Nfd-dev] Help needed with ASF measurement

Ashlesh Gawande (agawande) agawande at memphis.edu
Wed Oct 26 08:55:01 PDT 2016


I believe I have caught the bug. The code was calling getLastRenewed() on non-existent outRecord.

After this line we must check if the outRecord exist:
https://github.com/named-data/NFD/blob/master/daemon/fw/asf-measurements.cpp#L116

like in the access strategy:
https://github.com/named-data/NFD/blob/3d1eb091c0d6545df8c12aabaef12fafc97f836c/daemon/fw/access-strategy.cpp#L192

I applied the fix:

void
FaceInfo::recordRtt(const shared_ptr<pit::Entry>& pitEntry, const Face& inFace)
{
  //NFD_LOG_ERROR("Recording RTT for inFace: " << inFace.getId() << " name: " << pitEntry->getName());
  // Calculate RTT
  pit::OutRecordCollection::const_iterator outRecord = pitEntry->getOutRecord(inFace);

  if (outRecord == pitEntry->out_end()) { // no out-record
    NFD_LOG_DEBUG(pitEntry->getInterest() << " dataFrom " << inFace.getId() << " no-out-record");
    return;
  }

  NFD_LOG_ERROR("steady clock now: " << time::steady_clock::now() << " outRecordLastRenewed: " << outRecord->getLastRenewed());

  time::steady_clock::Duration steadyRtt = time::steady_clock::now() - outRecord->getLastRenewed();

  NFD_LOG_ERROR("steadyRtt: " << steadyRtt);

  if( steadyRtt.count() <= 0 || steadyRtt.count() > 100000000000){
    NFD_LOG_ERROR("Wrong measurement");
    return;
  }

So if an out record does not exist, we will not try to use getLastRenewed() and return instead.

After the fix it doesn't have any weird measurements. It never enters the second "if statement" above where I check if the steadyRTT is less than zero or some big number.

But now the question is why would the outRecord not exist (for an interest sent to a direct neighbor)?


Ashlesh

________________________________
From: Lan Wang (lanwang)
Sent: Monday, October 24, 2016 1:44:45 PM
To: Ashlesh Gawande (agawande)
Cc: Junxiao Shi; <nfd-dev at lists.cs.ucla.edu>
Subject: Re: Help needed with ASF measurement

Would this have anything to do with virtualization (since you are using min-NDN)? Another possibility: the time was not converted properly to nanoseconds?

Lan

On Oct 24, 2016, at 1:40 PM, Ashlesh Gawande (agawande) <agawande at memphis.edu<mailto:agawande at memphis.edu>> wrote:

I haven't found anything suspicious. There doesn't seem to be a code path that generates an out record without updating last renewed field.

Even if there was a bug in the forwarding pipeline and it is not updated somewhere, I think we should not get a negative value:


1477331706.109322 ERROR: [AsfMeasurements] Recording RTT for inFace: 311

1477331706.109335 ERROR: [AsfMeasurements] steady clock now: 2148492683226381 nanoseconds since boot outRecordLastRenewed: 7018100696742452000 nanoseconds since boot

1477331706.109351 ERROR: [AsfMeasurements] steadyRtt: -7015952204059211821 nanoseconds
1477331706.109368 ERROR: [AsfMeasurements] Wrong measurement

This means time::steady_clock::now() is smaller time::steady_clock::TimePoint::min().

(Machine is Ubuntu 14.04, gcc 4.9.3.
32 NFDs are running on this machine at the same time. Such wrong measurements happen randomly).

Ashlesh
________________________________
From: Lan Wang (lanwang)
Sent: Monday, October 24, 2016 1:21:49 PM
To: Junxiao Shi
Cc: Ashlesh Gawande (agawande); <nfd-dev at lists.cs.ucla.edu<mailto:nfd-dev at lists.cs.ucla.edu>>
Subject: Re: Help needed with ASF measurement

Junxiao,

Thank you very much for the quick feedback.

Ashlesh: have you found anything suspicious?

Lan

On Oct 24, 2016, at 11:00 AM, Junxiao Shi <shijunxiao at email.arizona.edu<mailto:shijunxiao at email.arizona.edu>> wrote:

Hi Ashlesh
This is probably a forwarding pipelines bug, but I won't have time to diagnose this until at least Thursday.
Things to look at:
1. Find every code line that updates last renewed field.
2. Find which forwarding pipelines calls the function that updates last renewed field.
3. Determine if there's a code path that generates an out record without updating last renewed field.
Yours, Junxiao

On Oct 24, 2016 08:52, "Ashlesh Gawande (agawande)" <agawande at memphis.edu<mailto:agawande at memphis.edu>> wrote:
Hi Junxiao

I am testing ASF in Mini-NDN.

The following line calculates the steady RTT by subtracting outRecord->getLastRenewed()
from time::steady_clock::now()
https://github.com/named-data/NFD/blob/master/daemon/fw/asf-measurements.cpp#L117
(which is called from beforeSatisfyInterest https://github.com/named-data/NFD/blob/master/daemon/fw/asf-strategy.cpp#L115)

Sometimes the value of this steadyRTT is huge (ex: 200000 seconds) or negative.
This makes ASF change faces to a higher RTT face.

So what could be the reason that this happening? I see much better and expected results if I ignore these high/negative values.

(I see that https://github.com/named-data/NFD/blob/master/daemon/table/pit-face-record.cpp#L34
initializes getLastRenewed to time::steady_clock::TimePoint::min(). So maybe for some reason getLastRenewed does not get updated?)

Thanks
Ashlesh

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.lists.cs.ucla.edu/pipermail/nfd-dev/attachments/20161026/4330ccb0/attachment.html>


More information about the Nfd-dev mailing list