[Nfd-dev] Fwd: NFD Performance testing

Beichuan Zhang bzhang at cs.arizona.edu
Tue Aug 5 13:54:55 PDT 2014


This was John DeHart’s performance profiling results back in April. It was NFD 0.1 tested on ONL.

Beichuan

Begin forwarded message:

> From: John DeHart <jdd at seas.wustl.edu>
> Subject: Re: NFD Performance testing
> Date: April 11, 2014 at 8:28:36 AM MST
> To: Alex Afanasyev <alexander.afanasyev at ucla.edu>
> Cc: Beichuan Zhang <bzhang at cs.arizona.edu>, Junxiao Shi <shijunxiao at email.arizona.edu>, Patrick Crowley <pcrowley at wustl.edu>, Haowei Yuan <hyuan at wustl.edu>, "Ben Abraham, Hila" <hila at wustl.edu>
> 
> 
> Alex,
> 
> I just re-ran my tests for forwarding interests and content for unique names
> of short, medium and long lengths.
> Looks like performance improved  between 8% and 16% depending on length.
> Before:
>    short: 13700 Interests/sec
>    medium: 9500 Interests/sec
>    long:       4300 Interests/sec
> 
> Now:
>    short: 14900 Interests/sec
>    medium: 10500 Interests/sec
>    long:       5000 Interests/sec
> 
> John
> 
> On 4/10/14 8:08 PM, Alex Afanasyev wrote:
>> John, we just merged a commit that optimizes hash computation.  When you have time, can you try to rerun some of the evaluations and check if we have better numbers (I hope).
>> 
>> Commit f52dac7f03ac9ba996769cf620badeeb147b6d43 (current master) includes CS fix as well, so you can just get latest code from master branch of github or gerrit.
>> 
>> Thanks!
>> 
>> ---
>> Alex
>> 
>> On Apr 10, 2014, at 12:37 PM, John DeHart <jdd at seas.wustl.edu> wrote:
>> 
>>> Beichuan,
>>> 
>>> Yes, I have the full gprof output available. I just didn't want to email that
>>> to everyone. There are some processing tools I have not tried out yet
>>> that might make things easier to read but for now, here is a sample:
>>> 
>>> http://www.arl.wustl.edu/~jdd/NDN_GPROF_RESULTS/
>>> 
>>> There you should see this file:  gprof.out.no_content_long_names.txt
>>> 
>>> If you search in there for "Call graph" you will get to the hierarchical view.
>>> 
>>> If that is useful, I can put the rest of my files there also.
>>> 
>>> John
>>> 
>>> 
>>> 
>>> 
>>> On 4/10/14 2:26 PM, Beichuan Zhang wrote:
>>>> Hi John,
>>>> 
>>>> Thanks a lot! These are very informative and useful.
>>>> 
>>>> To correlate these numbers with the code better, is it possible to get a hierarchical view of the functions and time? That'll make our analysis much easier. (Even the current flat form is already very useful.)
>>>> 
>>>> Looking forward to the CS test results too.
>>>> 
>>>> Beichuan
>>>> 
>>>> 
>>>> On Apr 10, 2014, at 12:08 PM, John DeHart <jdd at seas.wustl.edu> wrote:
>>>> 
>>>>> All:
>>>>> 
>>>>> Here is another set of data. This set takes content out of the picture.
>>>>> I don't run any content servers in these tests so the Interests are never fulfilled.
>>>>> I again vary the length of the names used to see that impact.
>>>>> 
>>>>>  Approximate rate nfd was able to handle:
>>>>>    short:     37500 Interests/sec
>>>>>    medium: 22400 Interests/sec
>>>>>    long:         7100 Interests/sec
>>>>> 
>>>>> For the short name case I find it interesting that nfd::Cs::find() still consumes 1.67% of
>>>>> our processing time. Remember there is no content ever returned so the CS should
>>>>> always be empty. Not that we want to optimize anything for the case of an empty
>>>>> CS, but that seems kind of high for a case where it doesn't have to do any searching.
>>>>> 
>>>>> Again lots of usage for Block::Block(). For my amusement, for the long name case,
>>>>> I gathered all the usage of all the different Block::Block() signatures at the end of this
>>>>> note and tried to match them up to see which ones get used the most. Not sure it tells
>>>>> us anything but since I'm not familiar with the code I was curious.
>>>>> 
>>>>> ndn::Block::Block(ndn::Block const&) on average gets invoked 970 per Interest
>>>>> for the long names.
>>>>> 
>>>>> I've asked Jeff Burke's group for some actual names they use so I can see
>>>>> if my names are of reasonable lengths.
>>>>> 
>>>>> Next I plan to do some tests where I load the CS and then bombard it with
>>>>> Interests that will always match something stored.
>>>>> 
>>>>> Also, I should note that I am NOT seeing any signs of memory growth.
>>>>> 
>>>>> John
>>>>> 
>>>>> 
>>>>> --------------------------------------------------------------------------------
>>>>> 
>>>>> These tests are for an applied load of unique interests with no content returned.
>>>>> This should put a load on the PIT with no load on the CS.
>>>>> 
>>>>> Two sets of tests were run.
>>>>> 1. Optimized nfd
>>>>> 2. Profiled nfd
>>>>> 
>>>>> 1. Optimized nfd tests
>>>>>  For these tests nfd was built with the standard default compilation options, defaultFlags = ['-O2', '-g', '-Wall']
>>>>> 
>>>>>  The following tests were run with 128 client all routing through one central nfd router.  There are no
>>>>>  servers to provide content for the supplied interests. There are hosts running nfd that the interests are routed to.
>>>>> 
>>>>>  16 client hosts running 8 ndn-traffic processes and one nfd.
>>>>>  16 server hosts running one nfd.
>>>>>  1  router host running 1 nfd as the central router.
>>>>> 
>>>>>  Three test cases for name length:
>>>>>    short:  /example/000
>>>>>    medium: /example/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/000
>>>>>    long: /example/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/000
>>>>> 
>>>>>  128 different base names: end of name ranged from 000 to 127.
>>>>>  A sequence number is appended by ndn-traffic to each Interest to force every name to be unique.
>>>>> 
>>>>>  Applied load was approximately 42000 interests/sec.
>>>>> 
>>>>>  Approximate rate nfd was able to handle:
>>>>>    short:  37500 Interests/sec
>>>>>    medium: 22400 Interests/sec
>>>>>    long:    7100 Interests/sec
>>>>> 
>>>>> 2. Profiled nfd test
>>>>>   In order to generate gprof output, nfd is built with profile enabled, defaultFlags = ['-O2', '-pg', '-g', '-Wall'].
>>>>>   This obviously slows nfd down and the performance is not nearly what the optimized case shows. But what we
>>>>>   are interested in here is what gprof can tell us about which functions are consuming time.
>>>>> 
>>>>>  The following tests were run with 128 client/server pairs all routing through one central nfd router.
>>>>> 
>>>>>  Tests were run for 2,000,000 Interests received by nfd. Counter in pit code added to trigger call
>>>>>  to exit() so gmon.out could be generated.
>>>>> 
>>>>>  16 client hosts running 8 ndn-traffic process and 1 nfd.
>>>>>  16 server hosts running 8 ndn-traffic-server process and 1 nfd.
>>>>>  1  router host running 1 nfd as the central router.
>>>>> 
>>>>>  Three test cases for name length:
>>>>>    short:  /example/000
>>>>>    medium: /example/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/000
>>>>>    long: /example/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/000
>>>>> 
>>>>>  128 different base names: end of name ranged from 000 to 127.
>>>>>  A sequence number is appended by ndn-traffic to each Interest to force every name to be unique.
>>>>> 
>>>>>  Applied load was approximately 25500 interests/sec.
>>>>>  In the short, medium and long test case, the central router nfd was not able to keep up with
>>>>>  the applied load.
>>>>> 
>>>>>  The gprof data shown below is from the Flat profile given by gprof. I'm only showing the top consumers
>>>>>  that consume at least 1% of the cpu time used.
>>>>> 
>>>>> short:
>>>>>  %   cumulative   self              self     total
>>>>> time   seconds   seconds    calls   s/call   s/call  name
>>>>> 19.76      5.68     5.68 741559024     0.00     0.00 boost::detail::shared_count::~shared_count()
>>>>> 13.24      9.49     3.81 150091648     0.00     0.00 ndn::Block::Block(ndn::Block const&)
>>>>>  7.39     11.61     2.13 156984996     0.00     0.00 ndn::Block::~Block()
>>>>>  4.21     12.82     1.21  1914454     0.00     0.00 nfd::NameTree::eraseEntryIfEmpty(boost::shared_ptr<nfd::name_tree::Entry>)
>>>>>  4.18     14.02     1.20  8001344     0.00     0.00 nfd::NameTree::insert(ndn::Name const&)
>>>>>  3.93     15.15     1.13  9919137     0.00     0.00 ndn::Name::toUri() const
>>>>>  2.99     16.01     0.86  2000193     0.00     0.00 ndn::Scheduler::schedulePeriodicEvent(boost::chrono::duration<long, boost::ratio<1l, 1000000000l> > const&, boost::chrono::duration<long, boost::ratio<1l, 1000000000l> > const&, boost::function<void ()> const&)
>>>>>  1.81     16.53     0.52  1916756     0.00     0.00 nfd::NameTree::findExactMatch(ndn::Name const&) const
>>>>>  1.79     17.05     0.52  5914452     0.00     0.00 nfd::NameTree::findLongestPrefixMatch(boost::shared_ptr<nfd::name_tree::Entry>, boost::function<bool (nfd::name_tree::Entry const&)> const&) const
>>>>>  1.67     17.53     0.48  1999999     0.00     0.00 nfd::Cs::find(ndn::Interest const&) const
>>>>>  1.67     18.01     0.48        1     0.48    27.91 boost::asio::detail::task_io_service::run(boost::system::error_code&)
>>>>>  1.46     18.43     0.42  2000129     0.00     0.00 nfd::NameTree::lookup(ndn::Name const&)
>>>>>  1.04     18.73     0.30 13828583     0.00     0.00 boost::detail::function::functor_manager<boost::_bi::bind_t<void, boost::_mfi::mf1<void, nfd::Forwarder, boost::shared_ptr<nfd::pit::Entry> >, boost::_bi::list2<boost::_bi::value<nfd::Forwarder*>, boost::_bi::value<boost::shared_ptr<nfd::pit::Entry> > > > >::manage(boost::detail::function::function_buffer const&, boost::detail::function::function_buffer&, boost::detail::function::functor_manager_operation_type)
>>>>> 
>>>>> 
>>>>> medium:
>>>>>  %   cumulative   self              self     total
>>>>> time   seconds   seconds    calls   s/call   s/call  name
>>>>> 20.50      8.60     8.60 1266365703     0.00     0.00 boost::detail::shared_count::~shared_count()
>>>>> 18.09     16.19     7.59 348085329     0.00     0.00 ndn::Block::Block(ndn::Block const&)
>>>>>  7.08     19.16     2.97 351410680     0.00     0.00 ndn::Block::~Block()
>>>>>  6.42     21.86     2.70 20001146     0.00     0.00 nfd::NameTree::insert(ndn::Name const&)
>>>>>  5.72     24.26     2.40 21952166     0.00     0.00 ndn::Name::toUri() const
>>>>>  3.13     25.57     1.32 21951129     0.00     0.00 nfd::name_tree::hashName(ndn::Name const&)
>>>>>  2.86     26.77     1.20  1947681     0.00     0.00 nfd::NameTree::eraseEntryIfEmpty(boost::shared_ptr<nfd::name_tree::Entry>)
>>>>>  2.29     27.73     0.96  2000129     0.00     0.00 nfd::NameTree::lookup(ndn::Name const&)
>>>>>  2.24     28.67     0.94  2000193     0.00     0.00 ndn::Scheduler::schedulePeriodicEvent(boost::chrono::duration<long, boost::ratio<1l, 1000000000l> > const&, boost::chrono::duration<long, boost::ratio<1l, 1000000000l> > const&, boost::function<void ()> const&)
>>>>>  1.99     29.51     0.84  5947679     0.00     0.00 nfd::NameTree::findLongestPrefixMatch(boost::shared_ptr<nfd::name_tree::Entry>, boost::function<bool (nfd::name_tree::Entry const&)> const&) const
>>>>>  1.50     30.14     0.63  1949983     0.00     0.00 nfd::NameTree::findExactMatch(ndn::Name const&) const
>>>>>  1.49     30.76     0.63 78018430     0.00     0.00 std::vector<ndn::Block, std::allocator<ndn::Block> >::_M_insert_aux(__gnu_cxx::__normal_iterator<ndn::Block*, std::vector<ndn::Block, std::allocator<ndn::Block> > >, ndn::Block const&)
>>>>> 
>>>>> 
>>>>> long:
>>>>>  %   cumulative   self              self     total
>>>>> time   seconds   seconds    calls  ms/call  ms/call  name
>>>>> 25.38     26.82    26.82 1940055896     0.00     0.00 ndn::Block::Block(ndn::Block const&)
>>>>> 22.66     50.75    23.94 4280104948     0.00     0.00 boost::detail::shared_count::~shared_count()
>>>>>  8.42     59.65     8.90 57984812     0.00     0.00 ndn::Name::toUri() const
>>>>>  6.59     66.61     6.97 56000552     0.00     0.00 nfd::NameTree::insert(ndn::Name const&)
>>>>>  6.56     73.54     6.93 57983775     0.00     0.00 nfd::name_tree::hashName(ndn::Name const&)
>>>>>  6.48     80.39     6.85 1941813258     0.00     0.00 ndn::Block::~Block()
>>>>>  3.27     83.84     3.45  2000129     0.00     0.04 nfd::NameTree::lookup(ndn::Name const&)
>>>>>  2.40     86.37     2.54 282015065     0.00     0.00 std::vector<ndn::Block, std::allocator<ndn::Block> >::_M_insert_aux(__gnu_cxx::__normal_iterator<ndn::Block*, std::vector<ndn::Block, std::allocator<ndn::Block> > >, ndn::Block const&)
>>>>>  2.06     88.55     2.18  5980919     0.00     0.00 nfd::NameTree::findLongestPrefixMatch(boost::shared_ptr<nfd::name_tree::Entry>, boost::function<bool (nfd::name_tree::Entry const&)> const&) const
>>>>>  1.45     90.08     1.53 564030243     0.00     0.00  __tcf_1
>>>>>  1.00     91.14     1.06  1983223     0.00     0.00 nfd::NameTree::findExactMatch(ndn::Name const&) const
>>>>> 
>>>>> Block::Block Analysis:
>>>>> Here are all the gprof listings for Block::Block from the flat profile for the long name case above.
>>>>> The numbers in parens (#) are my addition to try to match this with the actual code.
>>>>> 
>>>>> 25.38     26.82    26.82 1940055896     0.00     0.00  (1) ndn::Block::Block(ndn::Block const&)
>>>>>  0.37     96.51     0.40   64016585     0.00     0.00  (2) ndn::Block::Block(boost::shared_ptr<ndn::Buffer const> const&, unsigned int, __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > > const&, __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > > const&, __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > > const&, __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > > const&)
>>>>>  0.05    102.92     0.06 62072947     0.00     0.00  (3) ndn::Block::Block(unsigned int)
>>>>>  0.02    104.68     0.02 16006475     0.00     0.00  (4) ndn::Block::Block()
>>>>>  0.02    105.14     0.02                             (5) ndn::Block::Block(std::istream&)
>>>>>  0.00    105.64     0.00     2457     0.00     0.00  (6) ndn::Block::Block(boost::shared_ptr<ndn::Buffer const> const&, __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > > const&, __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > > const&, bool)
>>>>>  0.00    105.64     0.00     1744     0.00     0.00  (7) ndn::Block::Block(unsigned int, boost::shared_ptr<ndn::Buffer const> const&)
>>>>>  0.00    105.64     0.00     1196     0.00     0.00  (8) ndn::Block::Block(boost::shared_ptr<ndn::Buffer const> const&)
>>>>>  0.00    105.64     0.00      517     0.00     0.00  (9) ndn::Block::Block(unsigned int, ndn::Block const&)
>>>>>  0.00    105.64     0.00        1     0.00    20.00 (10) ndn::Block::Block(unsigned char const*, unsigned long)
>>>>> 
>>>>> And here are all the constructors' signatures from block.cpp:
>>>>> ndn-cpp-dev/src/encoding$ grep "Block::Block" block.cpp
>>>>> (4) Block::Block()
>>>>> (1) Block::Block(const EncodingBuffer& buffer)
>>>>> (2) Block::Block(const ConstBufferPtr& wire,
>>>>>             uint32_t type,
>>>>>             const Buffer::const_iterator& begin, const Buffer::const_iterator& end,
>>>>>             const Buffer::const_iterator& valueBegin, const Buffer::const_iterator& valueEnd)
>>>>> (8) Block::Block(const ConstBufferPtr& buffer)
>>>>> (6) Block::Block(const ConstBufferPtr& buffer,
>>>>>             const Buffer::const_iterator& begin, const Buffer::const_iterator& end,
>>>>>             bool verifyLength/* = true*/)
>>>>> (5) Block::Block(std::istream& is)
>>>>> (10) Block::Block(const uint8_t* buffer, size_t maxlength)
>>>>>     Block::Block(const void* bufferX, size_t maxlength)
>>>>> (3) Block::Block(uint32_t type)
>>>>> (7) Block::Block(uint32_t type, const ConstBufferPtr& value)
>>>>> (9) Block::Block(uint32_t type, const Block& value)
>>>>> 
>>>>> 
>>>>> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.lists.cs.ucla.edu/pipermail/nfd-dev/attachments/20140805/64998f98/attachment.html>


More information about the Nfd-dev mailing list