[Nfd-dev] Fwd: NFD Performance testing

Burke, Jeff jburke at remap.UCLA.EDU
Thu Aug 7 06:50:27 PDT 2014


Thanks. Were there any min/avg/max latency and/or packet drop numbers?  High latency (>200ms) or drops is the most likely cause of what we are seeing.)
Jeff


From: "bzhang at cs.arizona.edu<mailto:bzhang at cs.arizona.edu>" <bzhang at cs.arizona.edu<mailto:bzhang at cs.arizona.edu>>
Date: Tue, 5 Aug 2014 13:54:55 -0700
To: "nfd-dev at lists.cs.ucla.edu<mailto:nfd-dev at lists.cs.ucla.edu>" <nfd-dev at lists.cs.ucla.edu<mailto:nfd-dev at lists.cs.ucla.edu>>
Subject: [Nfd-dev] Fwd: NFD Performance testing

This was John DeHart’s performance profiling results back in April. It was NFD 0.1 tested on ONL.

Beichuan

Begin forwarded message:

From: John DeHart <jdd at seas.wustl.edu<mailto:jdd at seas.wustl.edu>>
Subject: Re: NFD Performance testing
Date: April 11, 2014 at 8:28:36 AM MST
To: Alex Afanasyev <alexander.afanasyev at ucla.edu<mailto:alexander.afanasyev at ucla.edu>>
Cc: Beichuan Zhang <bzhang at cs.arizona.edu<mailto:bzhang at cs.arizona.edu>>, Junxiao Shi <shijunxiao at email.arizona.edu<mailto:shijunxiao at email.arizona.edu>>, Patrick Crowley <pcrowley at wustl.edu<mailto:pcrowley at wustl.edu>>, Haowei Yuan <hyuan at wustl.edu<mailto:hyuan at wustl.edu>>, "Ben Abraham, Hila" <hila at wustl.edu<mailto:hila at wustl.edu>>


Alex,

I just re-ran my tests for forwarding interests and content for unique names
of short, medium and long lengths.
Looks like performance improved  between 8% and 16% depending on length.
Before:
   short: 13700 Interests/sec
   medium: 9500 Interests/sec
   long:       4300 Interests/sec

Now:
   short: 14900 Interests/sec
   medium: 10500 Interests/sec
   long:       5000 Interests/sec

John

On 4/10/14 8:08 PM, Alex Afanasyev wrote:
John, we just merged a commit that optimizes hash computation.  When you have time, can you try to rerun some of the evaluations and check if we have better numbers (I hope).

Commit f52dac7f03ac9ba996769cf620badeeb147b6d43 (current master) includes CS fix as well, so you can just get latest code from master branch of github or gerrit.

Thanks!

---
Alex

On Apr 10, 2014, at 12:37 PM, John DeHart <jdd at seas.wustl.edu<mailto:jdd at seas.wustl.edu>> wrote:

Beichuan,

Yes, I have the full gprof output available. I just didn't want to email that
to everyone. There are some processing tools I have not tried out yet
that might make things easier to read but for now, here is a sample:

http://www.arl.wustl.edu/~jdd/NDN_GPROF_RESULTS/

There you should see this file:  gprof.out.no_content_long_names.txt

If you search in there for "Call graph" you will get to the hierarchical view.

If that is useful, I can put the rest of my files there also.

John




On 4/10/14 2:26 PM, Beichuan Zhang wrote:
Hi John,

Thanks a lot! These are very informative and useful.

To correlate these numbers with the code better, is it possible to get a hierarchical view of the functions and time? That'll make our analysis much easier. (Even the current flat form is already very useful.)

Looking forward to the CS test results too.

Beichuan


On Apr 10, 2014, at 12:08 PM, John DeHart <jdd at seas.wustl.edu<mailto:jdd at seas.wustl.edu>> wrote:

All:

Here is another set of data. This set takes content out of the picture.
I don't run any content servers in these tests so the Interests are never fulfilled.
I again vary the length of the names used to see that impact.

 Approximate rate nfd was able to handle:
   short:     37500 Interests/sec
   medium: 22400 Interests/sec
   long:         7100 Interests/sec

For the short name case I find it interesting that nfd::Cs::find() still consumes 1.67% of
our processing time. Remember there is no content ever returned so the CS should
always be empty. Not that we want to optimize anything for the case of an empty
CS, but that seems kind of high for a case where it doesn't have to do any searching.

Again lots of usage for Block::Block(). For my amusement, for the long name case,
I gathered all the usage of all the different Block::Block() signatures at the end of this
note and tried to match them up to see which ones get used the most. Not sure it tells
us anything but since I'm not familiar with the code I was curious.

ndn::Block::Block(ndn::Block const&) on average gets invoked 970 per Interest
for the long names.

I've asked Jeff Burke's group for some actual names they use so I can see
if my names are of reasonable lengths.

Next I plan to do some tests where I load the CS and then bombard it with
Interests that will always match something stored.

Also, I should note that I am NOT seeing any signs of memory growth.

John


--------------------------------------------------------------------------------

These tests are for an applied load of unique interests with no content returned.
This should put a load on the PIT with no load on the CS.

Two sets of tests were run.
1. Optimized nfd
2. Profiled nfd

1. Optimized nfd tests
 For these tests nfd was built with the standard default compilation options, defaultFlags = ['-O2', '-g', '-Wall']

 The following tests were run with 128 client all routing through one central nfd router.  There are no
 servers to provide content for the supplied interests. There are hosts running nfd that the interests are routed to.

 16 client hosts running 8 ndn-traffic processes and one nfd.
 16 server hosts running one nfd.
 1  router host running 1 nfd as the central router.

 Three test cases for name length:
   short:  /example/000
   medium: /example/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/000
   long: /example/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/000

 128 different base names: end of name ranged from 000 to 127.
 A sequence number is appended by ndn-traffic to each Interest to force every name to be unique.

 Applied load was approximately 42000 interests/sec.

 Approximate rate nfd was able to handle:
   short:  37500 Interests/sec
   medium: 22400 Interests/sec
   long:    7100 Interests/sec

2. Profiled nfd test
  In order to generate gprof output, nfd is built with profile enabled, defaultFlags = ['-O2', '-pg', '-g', '-Wall'].
  This obviously slows nfd down and the performance is not nearly what the optimized case shows. But what we
  are interested in here is what gprof can tell us about which functions are consuming time.

 The following tests were run with 128 client/server pairs all routing through one central nfd router.

 Tests were run for 2,000,000 Interests received by nfd. Counter in pit code added to trigger call
 to exit() so gmon.out could be generated.

 16 client hosts running 8 ndn-traffic process and 1 nfd.
 16 server hosts running 8 ndn-traffic-server process and 1 nfd.
 1  router host running 1 nfd as the central router.

 Three test cases for name length:
   short:  /example/000
   medium: /example/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/000
   long: /example/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/ABCDE/FGHIJ/KLMNO/PQRST/UVWXY/Z/000

 128 different base names: end of name ranged from 000 to 127.
 A sequence number is appended by ndn-traffic to each Interest to force every name to be unique.

 Applied load was approximately 25500 interests/sec.
 In the short, medium and long test case, the central router nfd was not able to keep up with
 the applied load.

 The gprof data shown below is from the Flat profile given by gprof. I'm only showing the top consumers
 that consume at least 1% of the cpu time used.

short:
 %   cumulative   self              self     total
time   seconds   seconds    calls   s/call   s/call  name
19.76      5.68     5.68 741559024     0.00     0.00 boost::detail::shared_count::~shared_count()
13.24      9.49     3.81 150091648     0.00     0.00 ndn::Block::Block(ndn::Block const&)
 7.39     11.61     2.13 156984996     0.00     0.00 ndn::Block::~Block()
 4.21     12.82     1.21  1914454     0.00     0.00 nfd::NameTree::eraseEntryIfEmpty(boost::shared_ptr<nfd::name_tree::Entry>)
 4.18     14.02     1.20  8001344     0.00     0.00 nfd::NameTree::insert(ndn::Name const&)
 3.93     15.15     1.13  9919137     0.00     0.00 ndn::Name::toUri() const
 2.99     16.01     0.86  2000193     0.00     0.00 ndn::Scheduler::schedulePeriodicEvent(boost::chrono::duration<long, boost::ratio<1l, 1000000000l> > const&, boost::chrono::duration<long, boost::ratio<1l, 1000000000l> > const&, boost::function<void ()> const&)
 1.81     16.53     0.52  1916756     0.00     0.00 nfd::NameTree::findExactMatch(ndn::Name const&) const
 1.79     17.05     0.52  5914452     0.00     0.00 nfd::NameTree::findLongestPrefixMatch(boost::shared_ptr<nfd::name_tree::Entry>, boost::function<bool (nfd::name_tree::Entry const&)> const&) const
 1.67     17.53     0.48  1999999     0.00     0.00 nfd::Cs::find(ndn::Interest const&) const
 1.67     18.01     0.48        1     0.48    27.91 boost::asio::detail::task_io_service::run(boost::system::error_code&)
 1.46     18.43     0.42  2000129     0.00     0.00 nfd::NameTree::lookup(ndn::Name const&)
 1.04     18.73     0.30 13828583     0.00     0.00 boost::detail::function::functor_manager<boost::_bi::bind_t<void, boost::_mfi::mf1<void, nfd::Forwarder, boost::shared_ptr<nfd::pit::Entry> >, boost::_bi::list2<boost::_bi::value<nfd::Forwarder*>, boost::_bi::value<boost::shared_ptr<nfd::pit::Entry> > > > >::manage(boost::detail::function::function_buffer const&, boost::detail::function::function_buffer&, boost::detail::function::functor_manager_operation_type)


medium:
 %   cumulative   self              self     total
time   seconds   seconds    calls   s/call   s/call  name
20.50      8.60     8.60 1266365703     0.00     0.00 boost::detail::shared_count::~shared_count()
18.09     16.19     7.59 348085329     0.00     0.00 ndn::Block::Block(ndn::Block const&)
 7.08     19.16     2.97 351410680     0.00     0.00 ndn::Block::~Block()
 6.42     21.86     2.70 20001146     0.00     0.00 nfd::NameTree::insert(ndn::Name const&)
 5.72     24.26     2.40 21952166     0.00     0.00 ndn::Name::toUri() const
 3.13     25.57     1.32 21951129     0.00     0.00 nfd::name_tree::hashName(ndn::Name const&)
 2.86     26.77     1.20  1947681     0.00     0.00 nfd::NameTree::eraseEntryIfEmpty(boost::shared_ptr<nfd::name_tree::Entry>)
 2.29     27.73     0.96  2000129     0.00     0.00 nfd::NameTree::lookup(ndn::Name const&)
 2.24     28.67     0.94  2000193     0.00     0.00 ndn::Scheduler::schedulePeriodicEvent(boost::chrono::duration<long, boost::ratio<1l, 1000000000l> > const&, boost::chrono::duration<long, boost::ratio<1l, 1000000000l> > const&, boost::function<void ()> const&)
 1.99     29.51     0.84  5947679     0.00     0.00 nfd::NameTree::findLongestPrefixMatch(boost::shared_ptr<nfd::name_tree::Entry>, boost::function<bool (nfd::name_tree::Entry const&)> const&) const
 1.50     30.14     0.63  1949983     0.00     0.00 nfd::NameTree::findExactMatch(ndn::Name const&) const
 1.49     30.76     0.63 78018430     0.00     0.00 std::vector<ndn::Block, std::allocator<ndn::Block> >::_M_insert_aux(__gnu_cxx::__normal_iterator<ndn::Block*, std::vector<ndn::Block, std::allocator<ndn::Block> > >, ndn::Block const&)


long:
 %   cumulative   self              self     total
time   seconds   seconds    calls  ms/call  ms/call  name
25.38     26.82    26.82 1940055896     0.00     0.00 ndn::Block::Block(ndn::Block const&)
22.66     50.75    23.94 4280104948     0.00     0.00 boost::detail::shared_count::~shared_count()
 8.42     59.65     8.90 57984812     0.00     0.00 ndn::Name::toUri() const
 6.59     66.61     6.97 56000552     0.00     0.00 nfd::NameTree::insert(ndn::Name const&)
 6.56     73.54     6.93 57983775     0.00     0.00 nfd::name_tree::hashName(ndn::Name const&)
 6.48     80.39     6.85 1941813258     0.00     0.00 ndn::Block::~Block()
 3.27     83.84     3.45  2000129     0.00     0.04 nfd::NameTree::lookup(ndn::Name const&)
 2.40     86.37     2.54 282015065     0.00     0.00 std::vector<ndn::Block, std::allocator<ndn::Block> >::_M_insert_aux(__gnu_cxx::__normal_iterator<ndn::Block*, std::vector<ndn::Block, std::allocator<ndn::Block> > >, ndn::Block const&)
 2.06     88.55     2.18  5980919     0.00     0.00 nfd::NameTree::findLongestPrefixMatch(boost::shared_ptr<nfd::name_tree::Entry>, boost::function<bool (nfd::name_tree::Entry const&)> const&) const
 1.45     90.08     1.53 564030243     0.00     0.00  __tcf_1
 1.00     91.14     1.06  1983223     0.00     0.00 nfd::NameTree::findExactMatch(ndn::Name const&) const

Block::Block Analysis:
Here are all the gprof listings for Block::Block from the flat profile for the long name case above.
The numbers in parens (#) are my addition to try to match this with the actual code.

25.38     26.82    26.82 1940055896     0.00     0.00  (1) ndn::Block::Block(ndn::Block const&)
 0.37     96.51     0.40   64016585     0.00     0.00  (2) ndn::Block::Block(boost::shared_ptr<ndn::Buffer const> const&, unsigned int, __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > > const&, __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > > const&, __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > > const&, __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > > const&)
 0.05    102.92     0.06 62072947     0.00     0.00  (3) ndn::Block::Block(unsigned int)
 0.02    104.68     0.02 16006475     0.00     0.00  (4) ndn::Block::Block()
 0.02    105.14     0.02                             (5) ndn::Block::Block(std::istream&)
 0.00    105.64     0.00     2457     0.00     0.00  (6) ndn::Block::Block(boost::shared_ptr<ndn::Buffer const> const&, __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > > const&, __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > > const&, bool)
 0.00    105.64     0.00     1744     0.00     0.00  (7) ndn::Block::Block(unsigned int, boost::shared_ptr<ndn::Buffer const> const&)
 0.00    105.64     0.00     1196     0.00     0.00  (8) ndn::Block::Block(boost::shared_ptr<ndn::Buffer const> const&)
 0.00    105.64     0.00      517     0.00     0.00  (9) ndn::Block::Block(unsigned int, ndn::Block const&)
 0.00    105.64     0.00        1     0.00    20.00 (10) ndn::Block::Block(unsigned char const*, unsigned long)

And here are all the constructors' signatures from block.cpp:
ndn-cpp-dev/src/encoding$ grep "Block::Block" block.cpp
(4) Block::Block()
(1) Block::Block(const EncodingBuffer& buffer)
(2) Block::Block(const ConstBufferPtr& wire,
            uint32_t type,
            const Buffer::const_iterator& begin, const Buffer::const_iterator& end,
            const Buffer::const_iterator& valueBegin, const Buffer::const_iterator& valueEnd)
(8) Block::Block(const ConstBufferPtr& buffer)
(6) Block::Block(const ConstBufferPtr& buffer,
            const Buffer::const_iterator& begin, const Buffer::const_iterator& end,
            bool verifyLength/* = true*/)
(5) Block::Block(std::istream& is)
(10) Block::Block(const uint8_t* buffer, size_t maxlength)
    Block::Block(const void* bufferX, size_t maxlength)
(3) Block::Block(uint32_t type)
(7) Block::Block(uint32_t type, const ConstBufferPtr& value)
(9) Block::Block(uint32_t type, const Block& value)





_______________________________________________ Nfd-dev mailing list Nfd-dev at lists.cs.ucla.edu<mailto:Nfd-dev at lists.cs.ucla.edu> http://www.lists.cs.ucla.edu/mailman/listinfo/nfd-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.lists.cs.ucla.edu/pipermail/nfd-dev/attachments/20140807/7b1812e0/attachment.html>


More information about the Nfd-dev mailing list