the module got coredump #7

Open
vvhungy opened this issue Feb 5, 2018 · 14 comments

vvhungy commented Feb 5, 2018

Our Redis instance running the tdigest module dumps core every 4-5 days.
gdb gives the backtrace below.
Could you please take a look?
Thanks.

(gdb) bt
#0 0x00007f015ab41495 in raise () from /lib64/libc.so.6
#1 0x00007f015ab42c75 in abort () from /lib64/libc.so.6
#2 0x00007f015ab7f3a7 in __libc_message () from /lib64/libc.so.6
#3 0x00007f015ab84dee in malloc_printerr () from /lib64/libc.so.6
#4 0x00007f015ab87c80 in _int_free () from /lib64/libc.so.6
#5 0x00007f0151ffa3f5 in tdigestCompress (t=0x7f0130086750) at src/tdigest.c:176
#6 0x00007f0151ff953c in TDigestTypeAdd_RedisCommand (ctx=0x7ffccde4c350, argv=, argc=) at src/command.c:110
#7 0x0000000000490c90 in RedisModuleCommandDispatcher (c=0x7f0150a2be40) at module.c:466
#8 0x0000000000429337 in call (c=0x7f0150a2be40, flags=15) at server.c:2224
#9 0x00000000004299a5 in processCommand (c=0x7f0150a2be40) at server.c:2505
#10 0x0000000000439b2d in processInputBuffer (c=0x7f0150a2be40) at networking.c:1330
#11 0x0000000000424aed in aeProcessEvents (eventLoop=0x7f015463b0a0, flags=11) at ae.c:421
#12 0x0000000000424e0b in aeMain (eventLoop=0x7f015463b0a0) at ae.c:464
#13 0x000000000042da22 in main (argc=, argv=0x7ffccde4c648) at server.c:3885

(gdb) bt full
#0 0x00007f015ab41495 in raise () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007f015ab42c75 in abort () from /lib64/libc.so.6
No symbol table info available.
#2 0x00007f015ab7f3a7 in __libc_message () from /lib64/libc.so.6
No symbol table info available.
#3 0x00007f015ab84dee in malloc_printerr () from /lib64/libc.so.6
No symbol table info available.
#4 0x00007f015ab87c80 in _int_free () from /lib64/libc.so.6
No symbol table info available.
#5 0x00007f0151ffa3f5 in tdigestCompress (t=0x7f0130086750) at src/tdigest.c:176
unmerged_centroids = 0x1f20a30
unmerged_weight =
num_unmerged =
old_num_centroids = 630
i =
j = 630
args = {t = 0x7f0130086750, centroids = 0x1f21200, idx = 630, weight_so_far = 393204, k1 = 399.2966162507218, min = 0.14999999999999999, max = 511.43741628850984}
#6 0x00007f0151ff953c in TDigestTypeAdd_RedisCommand (ctx=0x7ffccde4c350, argv=, argc=) at src/command.c:110
key = 0x7f0154623000
type = 6
num_added = 1
values = 0x7f0130bc6810
counts = 0x7f0130bc6818
i =
t = 0x7f0130086750
total_count =
#7 0x0000000000490c90 in RedisModuleCommandDispatcher (c=0x7f0150a2be40) at module.c:466
cp =
ctx = {getapifuncptr = 0x491320, module = 0x7f015461b0c0, client = 0x7f0150a2be40, blocked_client = 0x0, amqueue = 0x7f013e59c300, amqueue_len = 16, amqueue_used = 1, flags = 2,
postponed_arrays = 0x0, postponed_arrays_count = 0, blocked_privdata = 0x0, keys_pos = 0x0, keys_count = 0, pa_head = 0x7f0130bc6800}
#8 0x0000000000429337 in call (c=0x7f0150a2be40, flags=15) at server.c:2224
dirty = 23916351
start = 1517577099955319
duration =
client_old_flags = 0
prev_also_propagate = {ops = 0x0, numops = 0}
#9 0x00000000004299a5 in processCommand (c=0x7f0150a2be40) at server.c:2505
No locals.
#10 0x0000000000439b2d in processInputBuffer (c=0x7f0150a2be40) at networking.c:1330
No locals.
#11 0x0000000000424aed in aeProcessEvents (eventLoop=0x7f015463b0a0, flags=11) at ae.c:421
fe = 0x7f01542024a0
mask = 1
fd = 293
rfired = 1
j =
shortest =
tvp =
processed =
numevents = 1
#12 0x0000000000424e0b in aeMain (eventLoop=0x7f015463b0a0) at ae.c:464
No locals.
#13 0x000000000042da22 in main (argc=, argv=0x7ffccde4c648) at server.c:3885
tv = {tv_sec = 1517061481, tv_usec = 930087}
j =
hashseed = "1a86649ef203608a"
background =

Owner

usmanm commented Feb 6, 2018

Would it be possible for you to share a script that reproduces the issue? Or share the core dump?

usmanm added the bug label Feb 6, 2018
Author

vvhungy commented Feb 6, 2018

It seems to be a race-condition bug. The core dump file is around 640 MB, so I put it on mega.nz. Download link:
CORE-DUMP

The binary was compiled from my fork of the source code (I added a TDIGEST.CENTROIDS command myself). You can find it at: https://github.com/vvhungy/redis-tdigest.

Some more info:

Server

redis_version:4.0.6
redis_build_id:8ff1ddc2d25bbf03
redis_mode:standalone
os:Linux 2.6.32-696.16.1.el6.x86_64 x86_64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:sync-builtin
gcc_version:4.4.7

Owner

usmanm commented Feb 8, 2018

Great, thanks so much! I'll dig into it sometime this week.

As a side note, I'm curious why you had to implement the tdigest.centroids command. Was tdigest.debug not good enough for your use case?

Author

vvhungy commented Feb 8, 2018

Ah yes, the tdigest.debug result doesn't work well with phpredis rawCommand, so I created tdigest.centroids to reformat the result. The command name also didn't fit my usage :)

Owner

usmanm commented Feb 8, 2018

Can you also share your compiled binaries?

Author

vvhungy commented Feb 8, 2018

The redis-server and tdigest.so binaries (running on CentOS 6.9 x86_64): download HERE

Owner

usmanm commented Feb 8, 2018

I'm on Ubuntu, so it seems I can't load the core dump properly.

usmanm@usmanm-puget:~/Downloads $ gdb ./redis-server core.18554 
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./redis-server...done.

warning: exec file is newer than core file.
[New LWP 18554]
[New LWP 18556]
[New LWP 18557]
[New LWP 18558]

warning: .dynamic section for "/lib64/ld-linux-x86-64.so.2" is not at the expected address (wrong library or version mismatch?)

warning: Could not load shared library symbols for 6 libraries, e.g. /lib64/libm.so.6.
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
Core was generated by `/abserver/redis/redis-server 0.0.0.0:6392                   '.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f015ab41495 in ?? ()
[Current thread is 1 (LWP 18554)]

I wonder if it'll work in a CentOS VM?

Author

vvhungy commented Feb 8, 2018

I think it should work on a CentOS VM; make sure you use CentOS 6.9.

Author

vvhungy commented Mar 6, 2018

Hi @usmanm, any progress on this?

Owner

usmanm commented Apr 4, 2018

Hey @vvhungy, sorry, I have not been able to spend time on this. I will try to look into it this weekend.

Author

vvhungy commented Apr 6, 2018

Yes, I hope you can fix this soon.

@rkarthick

Is the error before the core dump free(): invalid next size (normal)?
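
For context on that message: glibc reports "free(): invalid next size" when, while freeing a chunk, it finds that the size field of the chunk right after it is implausible. That usually means something wrote past the end of a heap allocation. Whether that is what happens at src/tdigest.c:176 is not established in this thread. Below is a minimal sketch in C of the kind of bounds check that prevents this class of bug. All names in it (centroid_t, centroid_buf_t, buf_init, buf_push, buf_free) are hypothetical and are not taken from the module's source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical types for illustration; NOT the module's real structs. */
typedef struct {
    double mean;
    double weight;
} centroid_t;

typedef struct {
    centroid_t *items;
    size_t len; /* slots in use */
    size_t cap; /* slots allocated */
} centroid_buf_t;

/* Allocate room for `cap` centroids. Returns 0 on success, -1 on failure. */
int buf_init(centroid_buf_t *b, size_t cap) {
    assert(b != NULL);
    b->items = malloc(cap * sizeof *b->items);
    b->len = 0;
    b->cap = (b->items != NULL) ? cap : 0;
    return (b->items != NULL) ? 0 : -1;
}

/* Append one centroid. When the buffer is full this returns -1 instead of
 * storing to items[cap], the out-of-bounds write that would overwrite the
 * header of the next heap chunk and make a later free() abort. */
int buf_push(centroid_buf_t *b, double mean, double weight) {
    assert(b != NULL);
    if (b->len >= b->cap)
        return -1;
    b->items[b->len].mean = mean;
    b->items[b->len].weight = weight;
    b->len++;
    return 0;
}

/* Release the storage and reset the bookkeeping fields. */
void buf_free(centroid_buf_t *b) {
    assert(b != NULL);
    free(b->items);
    b->items = NULL;
    b->len = 0;
    b->cap = 0;
}
```

With a capacity of 4, the first four calls to buf_push succeed and the fifth is rejected, so the allocation is never overrun. A caller that needs more room would have to grow the buffer explicitly before retrying.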

Author

vvhungy commented May 6, 2022

@rkarthick I'm not sure where to find the message printed before the core dump. But judging from the backtrace, the line that causes the crash is src/tdigest.c:176 (please see my first comment).


rkarthick commented May 9, 2022 via email
