[ndnSIM] Simulation terminated with signal SIGSEGV - possible bug

John Baugh jpbaugh at umich.edu
Sat May 19 02:37:47 PDT 2018


Thiago,

I keep coming back to this when I get free time, and it's still a bit
perplexing.  Is the ndn-debug-scenario.cc that you have at
https://gist.github.com/thiteixeira/8c4fd1deb884b548d1f071ebb1bee043#file-valgrind-leak-check-full-txt
up to date?

The Valgrind output points to line 195 as the culprit, but that line just
sets a prefix (producerHelper.SetPrefix(ndnPrefix);).

Some things we can at least look into:


   1. What version of the C standard library is installed on your system?
       (ldd --version)

   2. When you said "ndnSIM 2.4 didn't work for us", did you mean you
   couldn't install ndnSIM 2.4, or you did install it and it still doesn't
   work?

   3. Deep in the belly of ns-3 (
   https://www.nsnam.org/doxygen/simple-ref-count_8h_source.html#l00105) -
   it appears there is an m_count variable, but it's an unsigned 32 bit
   integer... not 64 bit.  So I'm wondering if the number of references it's
   tracking exceeds 4,294,967,295.

   4. Alternatively, the problem most clearly (unclearly?) says:

   Address 0x8 is not stack'd, malloc'd or (recently) free'd

   and

==1647== Process terminating with default action of signal 11 (SIGSEGV)
==1647== Access not within mapped region at address 0x8


So I'm wondering...  where is this 0x8 address having a value assigned or
being accessed?  0x8 doesn't seem to be a likely memory address to be
allocated...  so a pointer is being set to the value "8" somewhere....  I
don't see you doing that anywhere.  So I'm wondering if there's a buffer
overflow somewhere.


Thanks,

John

On Wed, May 16, 2018 at 5:10 PM, Thiago Teixeira <tteixeira at umass.edu>
wrote:

> Hi John,
>
>
>
> I added more memory when I upgraded to the latest Valgrind version. This
> test was run with 4GB of memory on the VM. My colleague also performed the
> same test on a VM with 8GB, same result.
>
>
>
> Is there any other test that we can run? (ndnSIM 2.4 didn’t work for us)
>
>
>
> Thanks,
>
> Thiago
>
>
>
>
>
> *From:* John Baugh [mailto:jpbaugh at umich.edu]
> *Sent:* Wednesday, May 16, 2018 3:53 PM
> *To:* Thiago Teixeira <tteixeira at umass.edu>
>
> *Cc:* Junxiao Shi <shijunxiao at email.arizona.edu>; ndnsim <
> ndnsim at lists.cs.ucla.edu>
> *Subject:* Re: [ndnSIM] Simulation terminated with signal SIGSEGV -
> possible bug
>
>
>
> Thiago,
>
>
>
> My best estimate, based on the Valgrind output, is this: your simulation
> is using almost 900 MB of memory, and your system only has 2.0 GB.  I think
> by the time it gets to 2,000 or so seconds, you are simply running out of
> memory, which ends in a SIGSEGV.  You'll probably need more memory in this
> system, or a more powerful system.  Under HEAP SUMMARY, it says
>
>   in use at exit: *897,122,422* bytes in 7,147,817 blocks
>
> ==1647==   total heap usage: 40,377,629,822 allocs, 40,370,482,008 frees, 4,349,761,010,258 bytes allocated
>
>
>
> I assume with the OS, other apps running, and the simulator, it's just too
> much for the device you're on.
>
>
>
> Thanks!
>
>
>
> John
>
>
>
> On Wed, May 16, 2018 at 10:18 AM, Thiago Teixeira <tteixeira at umass.edu>
> wrote:
>
> Hi John,
>
>
>
> The Valgrind (version 3.13) test ended. Please see output attached. I also
> posted here <https://umass.box.com/s/s9qql8vo1kjy170qm8ijxcm6181ioip4>
> for future reference.
>
>
>
> Please let me know if there’s something else I can do.
>
>
>
> Best,
>
> Thiago
>
>
>
>
>
> *From:* John Baugh [mailto:jpbaugh at umich.edu]
> *Sent:* Monday, May 7, 2018 4:02 AM
>
>
> *To:* Thiago Teixeira <tteixeira at umass.edu>
> *Cc:* Junxiao Shi <shijunxiao at email.arizona.edu>; ndnsim <
> ndnsim at lists.cs.ucla.edu>
> *Subject:* Re: [ndnSIM] Simulation terminated with signal SIGSEGV -
> possible bug
>
>
>
> Thiago,
>
>
>
> I am sorry to ask you to do this, but could you perhaps upgrade your
> Valgrind and run again if possible?  I've been looking into your issue and
> found a few reports that Valgrind 3.11.0 doesn't recognize an instruction
> used by random_device::_M_getval(), so Valgrind is exiting before it
> reaches the actual problem, in my estimation.  _M_getval() is called
> several levels deep from some of the ndnSIM code, so the run terminates
> without giving much useful information.
>
>
>
> Thanks,
>
>
>
> John
>
>
>
>
>
>
>
> On Sun, May 6, 2018 at 12:58 PM, Thiago Teixeira <tteixeira at umass.edu>
> wrote:
>
> Hi John,
>
>
>
> Please find the Valgrind output below. I ran Valgrind with both
> --leak-check=yes and --leak-check=full:
>
>
>
> https://gist.github.com/thiteixeira/8c4fd1deb884b548d1f071ebb1bee043#file-valgrind-leak-check-full-txt
>
>
>
> https://gist.github.com/thiteixeira/8c4fd1deb884b548d1f071ebb1bee043#file-valgrind-leak-check-yes-txt
>
>
>
>
>
> Thanks,
>
> TT
>
>
>
> *From:* John Baugh [mailto:jpbaugh at umich.edu]
> *Sent:* Sunday, May 6, 2018 8:40 AM
> *To:* Thiago Teixeira <tteixeira at umass.edu>
> *Cc:* Junxiao Shi <shijunxiao at email.arizona.edu>; ndnsim <
> ndnsim at lists.cs.ucla.edu>
>
>
> *Subject:* Re: [ndnSIM] Simulation terminated with signal SIGSEGV -
> possible bug
>
>
>
> Looks like a heap corruption of some sort.
>
>
>
> Can we try valgrind?
>
>
>
> This may help:
> http://www.lists.cs.ucla.edu/pipermail/ndnsim/2017-July/003991.html
>
>
>
> Thanks
>
>
>
> John
>
>
>
> On Sun, May 6, 2018, 7:35 AM Thiago Teixeira <tteixeira at umass.edu> wrote:
>
> Hi,
>
>
>
> Thanks for your answers. I ran the simulation again and posted the full
> gdb backtrace on the Gist:
> https://gist.github.com/thiteixeira/8c4fd1deb884b548d1f071ebb1bee043#file-gbd_bt_full_output-txt
>
>
>
> @John, I will try ndnSIM 2.4, thanks.
>
>
>
> Best,
>
> Thiago
>
>
>
> *From:* John Baugh [mailto:jpbaugh at umich.edu]
> *Sent:* Saturday, May 5, 2018 6:16 PM
> *To:* Thiago Teixeira <tteixeira at umass.edu>
> *Cc:* ndnsim at lists.cs.ucla.edu
> *Subject:* Re: [ndnSIM] Simulation terminated with signal SIGSEGV -
> possible bug
>
>
>
> Thiago,
>
>
>
> I was able to run your scenario in ndnSIM 2.4.  So I suspect this is
> either an introduced bug in the simulator itself, or perhaps there's
> something not correctly configured in your environment.
>
>
>
> I can't reproduce the error, so as Dr. Shi suggested, it would be useful
> to see your GDB and/or Valgrind output.
>
>
>
> Thanks!
>
>
>
> John
>
>
>
> On Sat, May 5, 2018 at 9:23 AM, Thiago Teixeira <tteixeira at umass.edu>
> wrote:
>
> Hi all,
>
>
>
> We have a scenario where nodes have two interfaces, one wireless and one
> wired. Nodes are positioned in a grid, and the producer is located at
> the center of the grid.
>
> We run this scenario for 4,000 seconds, but the simulation ends at 2,099
> seconds with the code
>
>     Command ['/home/vagrant/ndnSIM/ns-3/build/scratch/ndn-debug-scenario']
> terminated with signal SIGSEGV. Run it under a debugger to get more
> information (./waf --run <program> --command-template="gdb --args %s
> <args>").
>
>
>
> Running with GDB didn’t offer any other insights. Here are the steps to
> reproduce the issue:
>
>
>
> #### Expected behavior
>
> Simulations run until the time specified by Simulator::Stop(Seconds(4000.0));
>
>
>
> #### Actual behavior
>
> Simulations crash (see error message below) before
> Simulator::Stop(Seconds(4000.0));
>
>
>
> `Command ['/home/vagrant/ndnSIM/ns-3/build/scratch/ndn-debug-scenario']
> terminated with signal SIGSEGV. Run it under a debugger to get more
> information (./waf --run <program> --command-template="gdb --args %s
> <args>").`
>
>
>
> #### Code to reproduce the problem
>
> See Gist:
>
> https://gist.github.com/thiteixeira/8c4fd1deb884b548d1f071ebb1bee043
>
>
>
> #### ndnSIM version
>
> ndnSIM-2.5-2-ge674a01
>
>
>
> #### Operating system and version
>
> Ubuntu 16.04.4 LTS
>
> memory size: 2000MiB
>
> 1 cpu: Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
>
>
>
> #### Other relevant information
>
> compiled ndnSIM using the optimized version:
>
> ./waf configure -d optimized
>
>
>
> Increasing the number of nodes makes the simulation crash at an earlier
> time.
>
>
> _______________________________________________
> ndnSIM mailing list
> ndnSIM at lists.cs.ucla.edu
> http://www.lists.cs.ucla.edu/mailman/listinfo/ndnsim
>
>
>
>
>
>
>