[Nfd-dev] Help needed with ASF measurement

Davide Pesavento davide.pesavento at lip6.fr
Wed Oct 26 15:30:31 PDT 2016


This is very curious... If the program is dereferencing an "end"
iterator, it should be crashing, especially if it was built with ASan.

On Wed, Oct 26, 2016 at 4:55 PM, Ashlesh Gawande (agawande)
<agawande at memphis.edu> wrote:
> I believe I have caught the bug. The code was calling getLastRenewed() on
> non-existent outRecord.
>
> After this line we must check if the outRecord exist:
> https://github.com/named-data/NFD/blob/master/daemon/fw/asf-measurements.cpp#L116
>
> like in the access strategy:
> https://github.com/named-data/NFD/blob/3d1eb091c0d6545df8c12aabaef12fafc97f836c/daemon/fw/access-strategy.cpp#L192
>
> I applied the fix:
>
> void
> FaceInfo::recordRtt(const shared_ptr<pit::Entry>& pitEntry, const Face&
> inFace)
> {
>   //NFD_LOG_ERROR("Recording RTT for inFace: " << inFace.getId() << " name:
> " << pitEntry->getName());
>   // Calculate RTT
>   pit::OutRecordCollection::const_iterator outRecord =
> pitEntry->getOutRecord(inFace);
>
>   if (outRecord == pitEntry->out_end()) { // no out-record
>     NFD_LOG_DEBUG(pitEntry->getInterest() << " dataFrom " << inFace.getId()
> << " no-out-record");
>     return;
>   }
>
>   NFD_LOG_ERROR("steady clock now: " << time::steady_clock::now() << "
> outRecordLastRenewed: " << outRecord->getLastRenewed());
>
>   time::steady_clock::Duration steadyRtt = time::steady_clock::now() -
> outRecord->getLastRenewed();
>
>   NFD_LOG_ERROR("steadyRtt: " << steadyRtt);
>
>   if( steadyRtt.count() <= 0 || steadyRtt.count() > 100000000000){
>     NFD_LOG_ERROR("Wrong measurement");
>     return;
>   }
>
> So if an out record does not exist, we will not try to use getLastRenewed()
> and return instead.
>
> After the fix it doesn't have any weird measurements. It never enters the
> second "if statement" above where I check if the steadyRTT is less than zero
> or some big number.
>
> But now the question is why would the outRecord not exist (for an interest
> sent to a direct neighbor)?
>
>
> Ashlesh
>
> ________________________________
> From: Lan Wang (lanwang)
> Sent: Monday, October 24, 2016 1:44:45 PM
> To: Ashlesh Gawande (agawande)
> Cc: Junxiao Shi; <nfd-dev at lists.cs.ucla.edu>
>
> Subject: Re: Help needed with ASF measurement
>
> Would this have anything to do with virtualization (since you are using
> min-NDN)? Another possibility: the time was not converted properly to
> nanoseconds?
>
> Lan
>
> On Oct 24, 2016, at 1:40 PM, Ashlesh Gawande (agawande)
> <agawande at memphis.edu> wrote:
>
> I haven't found anything suspicious. There doesn't seem to be a code path
> that generates an out record without updating last renewed field.
>
> Even if there was a bug in the forwarding pipeline and it is not updated
> somewhere, I think we should not get a negative value:
>
> 1477331706.109322 ERROR: [AsfMeasurements] Recording RTT for inFace: 311
>
> 1477331706.109335 ERROR: [AsfMeasurements] steady clock now:
> 2148492683226381 nanoseconds since boot outRecordLastRenewed:
> 7018100696742452000 nanoseconds since boot
>
> 1477331706.109351 ERROR: [AsfMeasurements] steadyRtt: -7015952204059211821
> nanoseconds
> 1477331706.109368 ERROR: [AsfMeasurements] Wrong measurement
>
> This means time::steady_clock::now() is smaller
> time::steady_clock::TimePoint::min().
>
>
> (Machine is Ubuntu 14.04, gcc 4.9.3.
> 32 NFDs are running on this machine at the same time. Such wrong
> measurements happen randomly).
>
> Ashlesh
> ________________________________
> From: Lan Wang (lanwang)
> Sent: Monday, October 24, 2016 1:21:49 PM
> To: Junxiao Shi
> Cc: Ashlesh Gawande (agawande); <nfd-dev at lists.cs.ucla.edu>
> Subject: Re: Help needed with ASF measurement
>
> Junxiao,
>
> Thank you very much for the quick feedback.
>
> Ashlesh: have you found anything suspicious?
>
> Lan
>
> On Oct 24, 2016, at 11:00 AM, Junxiao Shi <shijunxiao at email.arizona.edu>
> wrote:
>
> Hi Ashlesh
> This is probably a forwarding pipelines bug, but I won't have time to
> diagnose this until at least Thursday.
> Things to look at:
> 1. Find every code line that updates last renewed field.
> 2. Find which forwarding pipelines calls the function that updates last
> renewed field.
> 3. Determine if there's a code path that generates an out record without
> updating last renewed field.
> Yours, Junxiao
>
> On Oct 24, 2016 08:52, "Ashlesh Gawande (agawande)" <agawande at memphis.edu>
> wrote:
>
> Hi Junxiao
>
> I am testing ASF in Mini-NDN.
>
> The following line calculates the steady RTT by subtracting
> outRecord->getLastRenewed()
> from time::steady_clock::now()
> https://github.com/named-data/NFD/blob/master/daemon/fw/asf-measurements.cpp#L117
> (which is called from beforeSatisfyInterest
> https://github.com/named-data/NFD/blob/master/daemon/fw/asf-strategy.cpp#L115)
>
> Sometimes the value of this steadyRTT is huge (ex: 200000 seconds) or
> negative.
> This makes ASF change faces to a higher RTT face.
>
> So what could be the reason that this happening? I see much better and
> expected results if I ignore these high/negative values.
>
> (I see that
> https://github.com/named-data/NFD/blob/master/daemon/table/pit-face-record.cpp#L34
> initializes getLastRenewed to time::steady_clock::TimePoint::min(). So maybe
> for some reason getLastRenewed does not get updated?)
>
> Thanks
> Ashlesh
>
>
>
> _______________________________________________
> Nfd-dev mailing list
> Nfd-dev at lists.cs.ucla.edu
> http://www.lists.cs.ucla.edu/mailman/listinfo/nfd-dev
>



More information about the Nfd-dev mailing list