[Nfd-dev] Fwd: NFD Performance testing

Beichuan Zhang bzhang at cs.ARIZONA.EDU
Thu Aug 7 08:01:16 PDT 2014


A task, http://redmine.named-data.net/issues/1819, was created yesterday to get these numbers. Chengyu from CSU will do the measurement before John comes back.

Beichuan

On Aug 7, 2014, at 6:50 AM, Burke, Jeff <jburke at remap.ucla.edu> wrote:

> Thanks. Were there any min/avg/max latency and/or packet drop numbers?  High latency (>200ms) or drops is the most likely cause of what we are seeing.) 
> Jeff
> 
> 
> From: "bzhang at cs.arizona.edu" <bzhang at cs.arizona.edu>
> Date: Tue, 5 Aug 2014 13:54:55 -0700
> To: "nfd-dev at lists.cs.ucla.edu" <nfd-dev at lists.cs.ucla.edu>
> Subject: [Nfd-dev] Fwd: NFD Performance testing
> 
>> This was John DeHart’s performance profiling results back in April. It was NFD 0.1 tested on ONL.
>> 
>> Beichuan
>> 
>> Begin forwarded message:
>> 
>>> From: John DeHart <jdd at seas.wustl.edu>
>>> Subject: Re: NFD Performance testing
>>> Date: April 11, 2014 at 8:28:36 AM MST
>>> To: Alex Afanasyev <alexander.afanasyev at ucla.edu>
>>> Cc: Beichuan Zhang <bzhang at cs.arizona.edu>, Junxiao Shi <shijunxiao at email.arizona.edu>, Patrick Crowley <pcrowley at wustl.edu>, Haowei Yuan <hyuan at wustl.edu>, "Ben Abraham, Hila" <hila at wustl.edu>
>>> 
>>> 
>>> Alex,
>>> 
>>> I just re-ran my tests for forwarding interests and content for unique names
>>> of short, medium and long lengths.
>>> Looks like performance improved  between 8% and 16% depending on length.
>>> Before:
>>>    short: 13700 Interests/sec
>>>    medium: 9500 Interests/sec
>>>    long:       4300 Interests/sec
>>> 
>>> Now:
>>>    short: 14900 Interests/sec
>>>    medium: 10500 Interests/sec
>>>    long:       5000 Interests/sec
>>> 
>>> John
>>> 
>>> On 4/10/14 8:08 PM, Alex Afanasyev wrote:
>>>> John, we just merged a commit that optimizes hash computation.  When you have time, can you try to rerun some of the evaluations and check if we have better numbers (I hope).
>>>> 
>>>> Commit f52dac7f03ac9ba996769cf620badeeb147b6d43 (current master) includes CS fix as well, so you can just get latest code from master branch of github or gerrit.
>>>> 
>>>> Thanks!
>>>> 
>>>> ---
>>>> Alex
>>>> 
>>>> On Apr 10, 2014, at 12:37 PM, John DeHart <jdd at seas.wustl.edu> wrote:
>>>> 
>>>>> Beichuan,
>>>>> 
>>>>> Yes, I have the full gprof output available. I just didn't want to email that
>>>>> to everyone. There are some processing tools I have not tried out yet
>>>>> that might make things easier to read but for now, here is a sample:
>>>>> 
>>>>> http://www.arl.wustl.edu/~jdd/NDN_GPROF_RESULTS/
>>>>> 
>>>>> There you should see this file:  gprof.out.no_content_long_names.txt
>>>>> 
>>>>> If you search in there for "Call graph" you will get to the hierarchical view.
>>>>> 
>>>>> If that is useful, I can put the rest of my files there also.
>>>>> 
>>>>> John
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On 4/10/14 2:26 PM, Beichuan Zhang wrote:
>>>>>> Hi John,
>>>>>> 
>>>>>> Thanks a lot! These are very informative and useful.
>>>>>> 
>>>>>> To correlate these numbers with the code better, is it possible to get a hierarchical view of the functions and time? That'll make our analysis much easier. (Even the current flat form is already very useful.)
>>>>>> 
>>>>>> Looking forward to the CS test results too.
>>>>>> 
>>>>>> Beichuan
>>>>>> 
>>>>>> 
>>>>>> On Apr 10, 2014, at 12:08 PM, John DeHart <jdd at seas.wustl.edu> wrote:
>>>>>> 
>>>>>>> All:
>>>>>>> 
>>>>>>> Here is another set of data. This set takes content out of the picture.
>>>>>>> I don't run any content servers in these tests so the Interests are never fulfilled.
>>>>>>> I again vary the length of the names used to see that impact.
>>>>>>> 
>>>>>>>  Approximate rate nfd was able to handle:
>>>>>>>    short:     37500 Interests/sec
>>>>>>>    medium: 22400 Interests/sec
>>>>>>>    long:         7100 Interests/sec
>>>>>>> 
>>>>>>> For the short name case I find it interesting that nfd::Cs::find() still consumes 1.67% of
>>>>>>> our processing time. Remember there is no content ever returned so the CS should
>>>>>>> always be empty. Not that we want to optimize anything for the case of an empty
>>>>>>> CS, but that seems kind of high for a case where it doesn't have to do any searching.
>>>>>>> 
>>>>>>> Again lots of usage for Block::Block(). For my amusement, for the long name case,
>>>>>>> I gathered all the usage of all the different Block::Block() signatures at the end of this
>>>>>>> note and tried to match them up to see which ones get used the most. Not sure it tells
>>>>>>> us anything but since I'm not familiar with the code I was curious.
>>>>>>> 
>>>>>>> ndn::Block::Block(ndn::Block const&) on average gets invoked 970 per Interest
>>>>>>> for the long names.
>>>>>>> 
>>>>>>> I've asked Jeff Burke's group for some actual names they use so I can see
>>>>>>> if my names are of reasonable lengths.
>>>>>>> 
>>>>>>> Next I plan to do some tests where I load the CS and then bombard it with
>>>>>>> Interests that will always match something stored.
>>>>>>> 
>>>>>>> Also, I should note that I am NOT seeing any signs of memory growth.
>>>>>>> 
>>>>>>> John
>>>>>>> 
>>>>>>> 
>>>>>>> --------------------------------------------------------------------------------
>>>>>>> 
>>>>>>> These tests are for an applied load of unique interests with no content returned.
>>>>>>> This should put a load on the PIT with no load on the CS.
>>>>>>> 
>>>>>>> Two sets of tests were run.
>>>>>>> 1. Optimized nfd
>>>>>>> 2. Profiled nfd
>>>>>>> 
>>>>>>> 1. Optimized nfd tests
>>>>>>>  For these tests nfd was built with the standard default compilation options, defaultFlags = ['-O2', '-g', '-Wall']
>>>>>>> 
>>>>>>>  The following tests were run with 128 client all routing through one central nfd router.  There are no
>>>>>>>  servers to provide content for the supplied interests. There are hosts running nfd that the interests are routed to.
>>>>>>> 
>>>>>>>  16 client hosts running 8 ndn-traffic processes and one nfd.
>>>>>>>  16 server hosts running one nfd.
>>>>>>>  1  router host running 1 nfd as the central router.
>>>>>>> 
>>>>>>>  Three test cases for name length:
>>>>>>>    short:  /example/000
>>>>>>>    medium: /example/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/000
>>>>>>>    long: /example/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/000
>>>>>>> 
>>>>>>>  128 different base names: end of name ranged from 000 to 127.
>>>>>>>  A sequence number is appended by ndn-traffic to each Interest to force every name to be unique.
>>>>>>> 
>>>>>>>  Applied load was approximately 42000 interests/sec.
>>>>>>> 
>>>>>>>  Approximate rate nfd was able to handle:
>>>>>>>    short:  37500 Interests/sec
>>>>>>>    medium: 22400 Interests/sec
>>>>>>>    long:    7100 Interests/sec
>>>>>>> 
>>>>>>> 2. Profiled nfd test
>>>>>>>   In order to generate gprof output, nfd is built with profile enabled, defaultFlags = ['-O2', '-pg', '-g', '-Wall'].
>>>>>>>   This obviously slows nfd down and the performance is not nearly what the optimized case shows. But what we
>>>>>>>   are interested in here is what gprof can tell us about which functions are consuming time.
>>>>>>> 
>>>>>>>  The following tests were run with 128 client/server pairs all routing through one central nfd router.
>>>>>>> 
>>>>>>>  Tests were run for 2,000,000 Interests received by nfd. Counter in pit code added to trigger call
>>>>>>>  to exit() so gmon.out could be generated.
>>>>>>> 
>>>>>>>  16 client hosts running 8 ndn-traffic process and 1 nfd.
>>>>>>>  16 server hosts running 8 ndn-traffic-server process and 1 nfd.
>>>>>>>  1  router host running 1 nfd as the central router.
>>>>>>> 
>>>>>>>  Three test cases for name length:
>>>>>>>    short:  /example/000
>>>>>>>    medium: /example/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/000
>>>>>>>    long: /example/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/000
>>>>>>> 
>>>>>>>  128 different base names: end of name ranged from 000 to 127.
>>>>>>>  A sequence number is appended by ndn-traffic to each Interest to force every name to be unique.
>>>>>>> 
>>>>>>>  Applied load was approximately 25500 interests/sec.
>>>>>>>  In the short, medium and long test case, the central router nfd was not able to keep up with
>>>>>>>  the applied load.
>>>>>>> 
>>>>>>>  The gprof data shown below is from the Flat profile given by gprof. I'm only showing the top consumers
>>>>>>>  that consume at least 1% of the cpu time used.
>>>>>>> 
>>>>>>> short:
>>>>>>>  %   cumulative   self              self     total
>>>>>>> time   seconds   seconds    calls   s/call   s/call  name
>>>>>>> 19.76      5.68     5.68 741559024     0.00     0.00 boost::detail::shared_count::~shared_count()
>>>>>>> 13.24      9.49     3.81 150091648     0.00     0.00 ndn::Block::Block(ndn::Block const&)
>>>>>>>  7.39     11.61     2.13 156984996     0.00     0.00 ndn::Block::~Block()
>>>>>>>  4.21     12.82     1.21  1914454     0.00     0.00 nfd::NameTree::eraseEntryIfEmpty(boost::shared_ptr<nfd::name_tree::Entry>)
>>>>>>>  4.18     14.02     1.20  8001344     0.00     0.00 nfd::NameTree::insert(ndn::Name const&)
>>>>>>>  3.93     15.15     1.13  9919137     0.00     0.00 ndn::Name::toUri() const
>>>>>>>  2.99     16.01     0.86  2000193     0.00     0.00 ndn::Scheduler::schedulePeriodicEvent(boost::chrono::duration<long, boost::ratio<1l, 1000000000l> > const&, boost::chrono::duration<long, boost::ratio<1l, 1000000000l> > const&, boost::function<void ()> const&)
>>>>>>>  1.81     16.53     0.52  1916756     0.00     0.00 nfd::NameTree::findExactMatch(ndn::Name const&) const
>>>>>>>  1.79     17.05     0.52  5914452     0.00     0.00 nfd::NameTree::findLongestPrefixMatch(boost::shared_ptr<nfd::name_tree::Entry>, boost::function<bool (nfd::name_tree::Entry const&)> const&) const
>>>>>>>  1.67     17.53     0.48  1999999     0.00     0.00 nfd::Cs::find(ndn::Interest const&) const
>>>>>>>  1.67     18.01     0.48        1     0.48    27.91 boost::asio::detail::task_io_service::run(boost::system::error_code&)
>>>>>>>  1.46     18.43     0.42  2000129     0.00     0.00 nfd::NameTree::lookup(ndn::Name const&)
>>>>>>>  1.04     18.73     0.30 13828583     0.00     0.00 boost::detail::function::functor_manager<boost::_bi::bind_t<void, boost::_mfi::mf1<void, nfd::Forwarder, boost::shared_ptr<nfd::pit::Entry> >, boost::_bi::list2<boost::_bi::value<nfd::Forwarder*>, boost::_bi::value<boost::shared_ptr<nfd::pit::Entry> > > > >::manage(boost::detail::function::function_buffer const&, boost::detail::function::function_buffer&, boost::detail::function::functor_manager_operation_type)
>>>>>>> 
>>>>>>> 
>>>>>>> medium:
>>>>>>>  %   cumulative   self              self     total
>>>>>>> time   seconds   seconds    calls   s/call   s/call  name
>>>>>>> 20.50      8.60     8.60 1266365703     0.00     0.00 boost::detail::shared_count::~shared_count()
>>>>>>> 18.09     16.19     7.59 348085329     0.00     0.00 ndn::Block::Block(ndn::Block const&)
>>>>>>>  7.08     19.16     2.97 351410680     0.00     0.00 ndn::Block::~Block()
>>>>>>>  6.42     21.86     2.70 20001146     0.00     0.00 nfd::NameTree::insert(ndn::Name const&)
>>>>>>>  5.72     24.26     2.40 21952166     0.00     0.00 ndn::Name::toUri() const
>>>>>>>  3.13     25.57     1.32 21951129     0.00     0.00 nfd::name_tree::hashName(ndn::Name const&)
>>>>>>>  2.86     26.77     1.20  1947681     0.00     0.00 nfd::NameTree::eraseEntryIfEmpty(boost::shared_ptr<nfd::name_tree::Entry>)
>>>>>>>  2.29     27.73     0.96  2000129     0.00     0.00 nfd::NameTree::lookup(ndn::Name const&)
>>>>>>>  2.24     28.67     0.94  2000193     0.00     0.00 ndn::Scheduler::schedulePeriodicEvent(boost::chrono::duration<long, boost::ratio<1l, 1000000000l> > const&, boost::chrono::duration<long, boost::ratio<1l, 1000000000l> > const&, boost::function<void ()> const&)
>>>>>>>  1.99     29.51     0.84  5947679     0.00     0.00 nfd::NameTree::findLongestPrefixMatch(boost::shared_ptr<nfd::name_tree::Entry>, boost::function<bool (nfd::name_tree::Entry const&)> const&) const
>>>>>>>  1.50     30.14     0.63  1949983     0.00     0.00 nfd::NameTree::findExactMatch(ndn::Name const&) const
>>>>>>>  1.49     30.76     0.63 78018430     0.00     0.00 std::vector<ndn::Block, std::allocator<ndn::Block> >::_M_insert_aux(__gnu_cxx::__normal_iterator<ndn::Block*, std::vector<ndn::Block, std::allocator<ndn::Block> > >, ndn::Block const&)
>>>>>>> 
>>>>>>> 
>>>>>>> long:
>>>>>>>  %   cumulative   self              self     total
>>>>>>> time   seconds   seconds    calls  ms/call  ms/call  name
>>>>>>> 25.38     26.82    26.82 1940055896     0.00     0.00 ndn::Block::Block(ndn::Block const&)
>>>>>>> 22.66     50.75    23.94 4280104948     0.00     0.00 boost::detail::shared_count::~shared_count()
>>>>>>>  8.42     59.65     8.90 57984812     0.00     0.00 ndn::Name::toUri() const
>>>>>>>  6.59     66.61     6.97 56000552     0.00     0.00 nfd::NameTree::insert(ndn::Name const&)
>>>>>>>  6.56     73.54     6.93 57983775     0.00     0.00 nfd::name_tree::hashName(ndn::Name const&)
>>>>>>>  6.48     80.39     6.85 1941813258     0.00     0.00 ndn::Block::~Block()
>>>>>>>  3.27     83.84     3.45  2000129     0.00     0.04 nfd::NameTree::lookup(ndn::Name const&)
>>>>>>>  2.40     86.37     2.54 282015065     0.00     0.00 std::vector<ndn::Block, std::allocator<ndn::Block> >::_M_insert_aux(__gnu_cxx::__normal_iterator<ndn::Block*, std::vector<ndn::Block, std::allocator<ndn::Block> > >, ndn::Block const&)
>>>>>>>  2.06     88.55     2.18  5980919     0.00     0.00 nfd::NameTree::findLongestPrefixMatch(boost::shared_ptr<nfd::name_tree::Entry>, boost::function<bool (nfd::name_tree::Entry const&)> const&) const
>>>>>>>  1.45     90.08     1.53 564030243     0.00     0.00  __tcf_1
>>>>>>>  1.00     91.14     1.06  1983223     0.00     0.00 nfd::NameTree::findExactMatch(ndn::Name const&) const
>>>>>>> 
>>>>>>> Block::Block Analysis:
>>>>>>> Here are all the gprof listings for Block::Block from the flat profile for the long name case above.
>>>>>>> The numbers in parens (#) are my addition to try to match this with the actual code.
>>>>>>> 
>>>>>>> 25.38     26.82    26.82 1940055896     0.00     0.00  (1) ndn::Block::Block(ndn::Block const&)
>>>>>>>  0.37     96.51     0.40   64016585     0.00     0.00  (2) ndn::Block::Block(boost::shared_ptr<ndn::Buffer const> const&, unsigned int, __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > > const&, __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > > const&, __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > > const&, __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > > const&)
>>>>>>>  0.05    102.92     0.06 62072947     0.00     0.00  (3) ndn::Block::Block(unsigned int)
>>>>>>>  0.02    104.68     0.02 16006475     0.00     0.00  (4) ndn::Block::Block()
>>>>>>>  0.02    105.14     0.02                             (5) ndn::Block::Block(std::istream&)
>>>>>>>  0.00    105.64     0.00     2457     0.00     0.00  (6) ndn::Block::Block(boost::shared_ptr<ndn::Buffer const> const&, __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > > const&, __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > > const&, bool)
>>>>>>>  0.00    105.64     0.00     1744     0.00     0.00  (7) ndn::Block::Block(unsigned int, boost::shared_ptr<ndn::Buffer const> const&)
>>>>>>>  0.00    105.64     0.00     1196     0.00     0.00  (8) ndn::Block::Block(boost::shared_ptr<ndn::Buffer const> const&)
>>>>>>>  0.00    105.64     0.00      517     0.00     0.00  (9) ndn::Block::Block(unsigned int, ndn::Block const&)
>>>>>>>  0.00    105.64     0.00        1     0.00    20.00 (10) ndn::Block::Block(unsigned char const*, unsigned long)
>>>>>>> 
>>>>>>> And here are all the constructors' signatures from block.cpp:
>>>>>>> ndn-cpp-dev/src/encoding$ grep "Block::Block" block.cpp
>>>>>>> (4) Block::Block()
>>>>>>> (1) Block::Block(const EncodingBuffer& buffer)
>>>>>>> (2) Block::Block(const ConstBufferPtr& wire,
>>>>>>>             uint32_t type,
>>>>>>>             const Buffer::const_iterator& begin, const Buffer::const_iterator& end,
>>>>>>>             const Buffer::const_iterator& valueBegin, const Buffer::const_iterator& valueEnd)
>>>>>>> (8) Block::Block(const ConstBufferPtr& buffer)
>>>>>>> (6) Block::Block(const ConstBufferPtr& buffer,
>>>>>>>             const Buffer::const_iterator& begin, const Buffer::const_iterator& end,
>>>>>>>             bool verifyLength/* = true*/)
>>>>>>> (5) Block::Block(std::istream& is)
>>>>>>> (10) Block::Block(const uint8_t* buffer, size_t maxlength)
>>>>>>>     Block::Block(const void* bufferX, size_t maxlength)
>>>>>>> (3) Block::Block(uint32_t type)
>>>>>>> (7) Block::Block(uint32_t type, const ConstBufferPtr& value)
>>>>>>> (9) Block::Block(uint32_t type, const Block& value)
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>> 
>> 
>> _______________________________________________ Nfd-dev mailing list Nfd-dev at lists.cs.ucla.edu http://www.lists.cs.ucla.edu/mailman/listinfo/nfd-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.lists.cs.ucla.edu/pipermail/nfd-dev/attachments/20140807/d6d30244/attachment.html>


More information about the Nfd-dev mailing list