[Nfd-dev] ndn-cxx Bug 4808

Fri Jan 18 15:31:48 PST 2019

Hi Fred

There are two RNGs involved:
1. C++ STL’s mt19937 seeded with 256 bits of entropy; a developer wanted to
conserve OS entropy source, so that it did not use more entropy.
2. OpenSSL’s RAND_bytes function.

The original statistics method is a K-S test for goodness of fit. The
implementation checks lower 5 bits of the random number over 35 samples.
The test rewritten in attempted fix D is also K-S test. It checks upper 8
bits of the random number over 50 samples.

My vote for short term would be E: revert it to soft failure, so that it
does not block other commits; but the warning is kept as a reminder of this
problem.
For longer term, I could prefer F or G over D, but this needs someone that
actually understands statistics. Developer who wrote D is a beginner in
statistics.

Yours, Junxiao

On Fri, Jan 18, 2019 at 18:11 Fred Douglis <fdouglis at perspectalabs.com>
wrote:

> Can I vote "none of the above"?
>
> I think checking the RNG was a good idea, and simply removing the test
> sounds bad.  If in fact it's not really conforming to the expected
> distribution, this is a problem.  You haven't really said much about the
> statistical method that declared it was failing, other than that you could
> repeat the test a few times and dramatically bring down the failure rate.
> If you trust the test, it sounds like the RNG is in fact broken, but maybe
> it's more a bad test than a bad RNG.  If the test is good, then it seems
> like you missed opinion F, which is to keep it as a hard failure and *force
> a fix or change of libraries*.  If the test is bad, then opinion G is, *find
> a better test*.
>
> Not that my opinion counts much, I'm new to the list, and so far simply a
> lurker.
>
> Fred
> On 1/18/2019 6:00 PM, Junxiao Shi wrote:
>
> Dear folks
>
> There is an urgent developer disagreement during code review related to
> issue 4808. It seems that this could not wait until the next NFD call, so
> I’ll explain the facts here.
>
> The technical problem: ndn-cxx has a random number generator implemented
> by calling into third party libraries. There was a unit test using
> statistics method to check that the generated random number conforms to a
> uniform distribution. Given it’s a randomized test, the unit test fails
> “softly”: it creates a warning when fails, not an error.
> Recent changes: developer A made a commit changing the soft failure to
> hard failure. As a result, many Jenkins builds are failing. An independent
> test indicates the failure rate is 14% or more.
>
> Attempted fix A: delete the unit test outright because it’s “semi-broken”.
> Afterwards, there would be no unit test to check the numbers are random.
> Opinion B: “testing random number generator is not ndn-cxx’s business”, so
> the unit test can be deleted.
> Opinion C: every line of code requires at least one failing test.
> Therefore, without any unit test on the random number generator, one could
> just make “return 0;” the random number generator.
> Attempted fix D: rewrite the “goodness of fit” unit test, loosening
> numerical requirements. Execute the test five times, and declare a hard
> failure only if the test fails three or more times out of five. The failure
> rate of this method is 0.01%.
> Opinion E: revert to soft failure.
>
> I am one of the parties involved but I tried to summerize the facts. I
> hope everyone (including myself) can calm down and make a decision at next
> NFD call, and don’t rush with merging one of the changes. Please keep all
> discussions on the mailing list by using reply-all only.
>
> Yours, Junxiao
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.lists.cs.ucla.edu/pipermail/nfd-dev/attachments/20190118/5c925532/attachment-0001.html>