[ndnSIM] Simulation terminated with signal SIGSEGV - possible bug

Thiago Teixeira tteixeira at umass.edu
Wed May 16 14:10:11 PDT 2018


Hi John,

I added more memory when I upgraded to the latest Valgrind version. This test was run with 4 GB of memory on the VM. My colleague ran the same test on a VM with 8 GB and got the same result.

Is there any other test that we can run? (ndnSIM 2.4 didn’t work for us)

Thanks,
Thiago


From: John Baugh [mailto:jpbaugh at umich.edu]
Sent: Wednesday, May 16, 2018 3:53 PM
To: Thiago Teixeira <tteixeira at umass.edu>
Cc: Junxiao Shi <shijunxiao at email.arizona.edu>; ndnsim <ndnsim at lists.cs.ucla.edu>
Subject: Re: [ndnSIM] Simulation terminated with signal SIGSEGV - possible bug

Thiago,

My best estimate, based on the Valgrind output, is this: your simulation is using almost 900 MB of memory, and your system has only 2.0 GB. I think that by the time it reaches 2,000 or so seconds, you are simply running out of memory and the result is a SIGSEGV. You'll probably need more memory on this system, or a more powerful system. Under HEAP SUMMARY, it says:

  in use at exit: 897,122,422 bytes in 7,147,817 blocks

==1647==   total heap usage: 40,377,629,822 allocs, 40,370,482,008 frees, 4,349,761,010,258 bytes allocated

I assume with the OS, other apps running, and the simulator, it's just too much for the device you're on.
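The HEAP SUMMARY figures quoted above are easier to compare with the VM size once converted to binary units. The following is a minimal sketch in Python that uses only the two lines quoted in this message; the helper name `parse_heap_summary` is hypothetical and not part of Valgrind or ndnSIM.

```python
import re

# The two HEAP SUMMARY lines quoted in this message (PID prefix omitted).
SUMMARY = """\
in use at exit: 897,122,422 bytes in 7,147,817 blocks
total heap usage: 40,377,629,822 allocs, 40,370,482,008 frees, 4,349,761,010,258 bytes allocated
"""


def to_int(field):
    """Convert a comma-grouped number such as '897,122,422' to an int."""
    return int(field.replace(",", ""))


def parse_heap_summary(text):
    """Hypothetical helper: extract the integer fields of a HEAP SUMMARY."""
    in_use = re.search(r"in use at exit: ([\d,]+) bytes in ([\d,]+) blocks", text)
    total = re.search(
        r"total heap usage: ([\d,]+) allocs, ([\d,]+) frees, ([\d,]+) bytes allocated",
        text,
    )
    if in_use is None or total is None:
        raise ValueError("HEAP SUMMARY lines not found")
    return {
        "in_use_bytes": to_int(in_use.group(1)),
        "in_use_blocks": to_int(in_use.group(2)),
        "allocs": to_int(total.group(1)),
        "frees": to_int(total.group(2)),
        "total_bytes": to_int(total.group(3)),
    }


stats = parse_heap_summary(SUMMARY)
MIB = 1024 ** 2
TIB = 1024 ** 4

in_use_mib = stats["in_use_bytes"] / MIB
unfreed = stats["allocs"] - stats["frees"]

print(f"in use at exit:       {in_use_mib:.1f} MiB")
print(f"allocs minus frees:   {unfreed:,}")
print(f"cumulative allocated: {stats['total_bytes'] / TIB:.2f} TiB")
```

The cumulative figure counts every allocation made during the run, so it is not a measure of memory held at any one time. The in-use-at-exit figure is only a lower bound on peak heap usage, and it does not include the overhead of running under Valgrind itself.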

Thanks!

John

On Wed, May 16, 2018 at 10:18 AM, Thiago Teixeira <tteixeira at umass.edu> wrote:
Hi John,

The Valgrind (version 3.13) run has finished. Please see the attached output. I also posted it here for future reference: https://umass.box.com/s/s9qql8vo1kjy170qm8ijxcm6181ioip4

Please let me know if there’s something else I can do.

Best,
Thiago


From: John Baugh [mailto:jpbaugh at umich.edu]
Sent: Monday, May 7, 2018 4:02 AM
To: Thiago Teixeira <tteixeira at umass.edu>
Cc: Junxiao Shi <shijunxiao at email.arizona.edu>; ndnsim <ndnsim at lists.cs.ucla.edu>
Subject: Re: [ndnSIM] Simulation terminated with signal SIGSEGV - possible bug

Thiago,

I am sorry to ask you to do this, but could you upgrade your Valgrind and run it again, if possible? I've been looking into your issue and found a few reports that Valgrind 3.11.0 doesn't recognize an instruction in random_device::_M_getval(), so in my estimation Valgrind is exiting before it reaches the actual problem. _M_getval is called several levels deep from some of the ndnSIM code; because Valgrind doesn't recognize it, it terminates without giving much useful information.
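Before starting another long run, the installed Valgrind version can be checked by parsing the output of `valgrind --version`, which prints a string such as `valgrind-3.11.0`. The following is a minimal sketch in Python; the function names are hypothetical, and the 3.13 threshold is only an assumption based on the version that later ran to completion in this thread, not a documented fix version.

```python
import re
import shutil
import subprocess

# Assumption: 3.13 is the version that completed the run in this thread;
# it is not a documented minimum for the _M_getval problem.
MIN_VERSION = (3, 13, 0)


def parse_valgrind_version(text):
    """Hypothetical helper: turn 'valgrind-3.11.0' into the tuple (3, 11, 0)."""
    match = re.search(r"valgrind-(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def installed_valgrind_version():
    """Return the installed version tuple, or None if valgrind is not on PATH."""
    if shutil.which("valgrind") is None:
        return None
    result = subprocess.run(
        ["valgrind", "--version"], capture_output=True, text=True, check=True
    )
    return parse_valgrind_version(result.stdout)


version = installed_valgrind_version()
if version is None:
    print("valgrind not found on PATH")
elif version < MIN_VERSION:
    print("valgrind %d.%d.%d is older than the assumed threshold" % version)
else:
    print("valgrind %d.%d.%d meets the assumed threshold" % version)
```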

Thanks,

John



On Sun, May 6, 2018 at 12:58 PM, Thiago Teixeira <tteixeira at umass.edu> wrote:
Hi John,

Please find the Valgrind output below. I ran Valgrind with both --leak-check=yes and --leak-check=full:

https://gist.github.com/thiteixeira/8c4fd1deb884b548d1f071ebb1bee043#file-valgrind-leak-check-full-txt

https://gist.github.com/thiteixeira/8c4fd1deb884b548d1f071ebb1bee043#file-valgrind-leak-check-yes-txt


Thanks,
TT

From: John Baugh [mailto:jpbaugh at umich.edu]
Sent: Sunday, May 6, 2018 8:40 AM
To: Thiago Teixeira <tteixeira at umass.edu>
Cc: Junxiao Shi <shijunxiao at email.arizona.edu>; ndnsim <ndnsim at lists.cs.ucla.edu>
Subject: Re: [ndnSIM] Simulation terminated with signal SIGSEGV - possible bug

Looks like a heap corruption of some sort.

Can we try valgrind?

This may help: http://www.lists.cs.ucla.edu/pipermail/ndnsim/2017-July/003991.html

Thanks

John

On Sun, May 6, 2018, 7:35 AM Thiago Teixeira <tteixeira at umass.edu> wrote:
Hi,

Thanks for your answers. I ran the simulation again and posted the full gdb backtrace (bt full) on the Gist (https://gist.github.com/thiteixeira/8c4fd1deb884b548d1f071ebb1bee043#file-gbd_bt_full_output-txt).

@John, I will try ndnSIM 2.4, thanks.

Best,
Thiago

From: John Baugh [mailto:jpbaugh at umich.edu]
Sent: Saturday, May 5, 2018 6:16 PM
To: Thiago Teixeira <tteixeira at umass.edu>
Cc: ndnsim at lists.cs.ucla.edu
Subject: Re: [ndnSIM] Simulation terminated with signal SIGSEGV - possible bug

Thiago,

I was able to run your scenario in ndnSIM 2.4. So I suspect this is either a newly introduced bug in the simulator itself, or something that is not correctly configured in your environment.

I can't reproduce the error, so as Dr. Shi suggested, it would be useful to see your GDB and/or Valgrind output.

Thanks!

John

On Sat, May 5, 2018 at 9:23 AM, Thiago Teixeira <tteixeira at umass.edu> wrote:
Hi all,

We have a scenario where nodes have two interfaces, one wireless and one wired. Nodes are positioned in a grid, and the producer is located at the center of the grid.
We run this scenario for 4,000 seconds, but the simulation ends at 2,099 seconds with the error
    Command ['/home/vagrant/ndnSIM/ns-3/build/scratch/ndn-debug-scenario'] terminated with signal SIGSEGV. Run it under a debugger to get more information (./waf --run <program> --command-template="gdb --args %s <args>").

Running with GDB didn’t offer any other insights. Here are the steps to reproduce the issue:

#### Expected behavior
Simulations run until the time specified by `Simulator::Stop(Seconds(4000.0));`

#### Actual behavior
Simulations crash (see error message below) before the time specified by `Simulator::Stop(Seconds(4000.0));`

`Command ['/home/vagrant/ndnSIM/ns-3/build/scratch/ndn-debug-scenario'] terminated with signal SIGSEGV. Run it under a debugger to get more information (./waf --run <program> --command-template="gdb --args %s <args>").`

#### Code to reproduce the problem
See Gist:
https://gist.github.com/thiteixeira/8c4fd1deb884b548d1f071ebb1bee043

#### ndnSIM version
ndnSIM-2.5-2-ge674a01

#### Operating system and version
Ubuntu 16.04.4 LTS
memory size: 2000MiB
1 cpu: Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz

#### Other relevant information
Compiled ndnSIM in optimized mode:
./waf configure -d optimized

Increasing the number of nodes makes the simulation crash at an earlier time.
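
The debugger hint in the error message above uses waf's `--command-template` option, where `%s` stands for the built program. The same pattern can wrap the scenario in other tools, such as Valgrind. The following is a minimal sketch in Python that only builds and prints the command lines; the helper name `waf_run_command` is hypothetical, the program name is taken from the path in the error message, and the `<args>` placeholder is left out.

```python
import shlex


def waf_run_command(program, wrapper):
    """Hypothetical helper: build a ./waf call that runs `program` under `wrapper`.

    waf replaces %s in the command template with the path of the built program.
    """
    return ["./waf", "--run", program, f"--command-template={wrapper} %s"]


# Program name as it appears in the error message (scratch/ndn-debug-scenario).
PROGRAM = "ndn-debug-scenario"

gdb_cmd = waf_run_command(PROGRAM, "gdb --args")
valgrind_cmd = waf_run_command(PROGRAM, "valgrind --leak-check=full")

# Print shell-quoted command lines that can be pasted into a terminal
# from the ns-3 directory.
print(shlex.join(gdb_cmd))
print(shlex.join(valgrind_cmd))
```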



