I am trying to run STAR-2.7.11b with --soloType CB_UMI_Complex and four barcodes. The whitelist files contain between 96 and 768 sequences of about 10 bases each. The program appears to hang while its memory usage grows beyond 256 GB, and then it dies. It never begins reading the genome index files.
I compiled STAR with debug symbols and interrupted it periodically. Each time it was stalled inside the vector resize call (frame #6 below), which is called from ParametersSolo::complexWLstrings:
#0 0x00007ffff7d3fe30 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#1 0x0000555555595b2b in std::_Construct<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) (
__p=0x7fee5add4050) at /usr/include/c++/12/bits/stl_construct.h:119
#2 0x0000555555595a7e in std::__uninitialized_default_n_1<false>::__uninit_default_n<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, unsigned long> (
__first=0x7febb7a1e010, __n=2363892990) at /usr/include/c++/12/bits/stl_uninitialized.h:638
#3 0x00005555555959eb in std::__uninitialized_default_n<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, unsigned long> (__first=0x7febb7a1e010, __n=2717908992)
at /usr/include/c++/12/bits/stl_uninitialized.h:701
#4 0x0000555555595990 in std::__uninitialized_default_n_a<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (__first=0x7febb7a1e010, __n=2717908992) at /usr/include/c++/12/bits/stl_uninitialized.h:766
#5 0x000055555559573b in std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::_M_default_append (this=0x7fffffffbbb8, __n=2717908992) at /usr/include/c++/12/bits/vector.tcc:655
#6 0x000055555559554b in std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::resize (this=0x7fffffffbbb8, __new_size=2717908992) at /usr/include/c++/12/bits/stl_vector.h:1011
#7 0x00005555555ddded in ParametersSolo::complexWLstrings (this=0x7fffffffb9a8) at ParametersSolo.cpp:513
#8 0x00005555555dc82c in ParametersSolo::initialize (this=0x7fffffffb9a8, pPin=0x7fffffffa5f0) at ParametersSolo.cpp:401
#9 0x00005555556695de in Parameters::inputParameters (this=0x7fffffffa5f0, argInN=57, argIn=0x7fffffffe2a8) at Parameters.cpp:962
#10 0x0000555555652cfb in main (argInN=57, argIn=0x7fffffffe2a8) at STAR.cpp:75
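For scale, here is a rough sketch of what that resize call asks for. The element count is the __new_size value from frame #6. The 32-byte element size is sizeof(std::string) for the x86-64 libstdc++ __cxx11 string seen in the trace. The helper name bytesForElements is made up for illustration and is not part of STAR.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not STAR code): bytes needed just to hold n
// default-constructed elements of elemSize bytes each, before any
// characters are stored in them.
constexpr std::uint64_t bytesForElements(std::uint64_t n, std::uint64_t elemSize) {
    return n * elemSize;
}

// __new_size from frame #6 of the stack trace.
constexpr std::uint64_t kNewSize = 2717908992ULL;

// sizeof(std::string) on x86-64 libstdc++ with the __cxx11 ABI.
constexpr std::uint64_t kStringBytes = 32;

constexpr std::uint64_t kGiB = 1ULL << 30;

// 2717908992 * 32 bytes = 86973087744 bytes = 81 GiB for the empty strings alone.
constexpr std::uint64_t kEmptyVectorGiB = bytesForElements(kNewSize, kStringBytes) / kGiB;

// The same estimate using whatever sizeof(std::string) is on the local toolchain.
constexpr std::uint64_t kLocalBytes = bytesForElements(kNewSize, sizeof(std::string));
```

So the vector needs about 81 GiB before a single barcode character is written. If each combined barcode string is longer than the 15 characters that fit in the string's internal buffer, every element also needs its own heap allocation, which would be consistent with the process growing past 256 GB.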
I found that the function void ParametersSolo::initialize(Parameters *pPin) is on the stack at each of these interrupts, so I added some diagnostics (the fprintf calls) to the code block that starts with
} else if (type==SoloTypes::CB_UMI_Complex) {//complex barcodes: multiple whitelist (one for each CB), varying CB length
    cbWLsize=1;
    for (uint32 icb=0; icb<cbV.size(); icb++) {//cycle over WL files
        fprintf(stderr, "whitelist file n: %u\n", icb);//added diagnostic; icb is unsigned, so %u
        cbV[icb].adapterLength=adapterSeq.size();//one adapter for all
        ifstream & cbWlStream = ifstrOpen(soloCBwhitelist[icb], ERROR_OUT, "SOLUTION: check the path and permissions of the CB whitelist file: " + soloCBwhitelist[icb], *pP);
        string seq1;
        while (cbWlStream >> seq1) {//cycle over one WL file
            uint64 cb1;
            if (!convertNuclStrToInt64(seq1,cb1)) {//convert to 2-bit format
                pP->inOut->logMain << "WARNING: CB whitelist sequence contains non-ACGT base and is ignored: " << seq1 <<endl;
                continue;
            };
            uint32 len1=seq1.size();
            if (len1>=cbV[icb].wl.size())
                cbV[icb].wl.resize(len1+1);//add new possible lengths to this CB
            cbV[icb].wl.at(len1).push_back(cb1);
        };
        cbV[icb].sortWhiteList(this);
        cbV[icb].wlFactor=cbWLsize;
        cbWLsize *= cbV[icb].totalSize;
        fprintf(stderr, "cbV_total_size: %u cbWLsize: %lld\n", cbV[icb].totalSize, cbWLsize);//added diagnostic
    };
    complexWLstrings();
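Since the loop multiplies the whitelist sizes together, a guard along these lines could report the problem before complexWLstrings() tries to allocate. This is only a sketch under assumptions: fitsWithin and kExampleSizes are hypothetical names that do not exist in STAR, and the sizes are example values inside the 96 to 768 range whose product equals the __new_size seen in the stack trace.

```cpp
#include <cstdint>

// Hypothetical example sizes (not read from any real whitelist file).
// Their product is 2717908992, the __new_size in the stack trace.
static constexpr std::uint64_t kExampleSizes[4] = {96, 96, 384, 768};

// Hypothetical guard (not STAR code): returns true if the product of the
// n whitelist sizes stays at or below maxCombined. The check divides
// instead of multiplying so the running product can never overflow.
constexpr bool fitsWithin(const std::uint64_t* sizes, int n, std::uint64_t maxCombined) {
    std::uint64_t product = 1;
    for (int i = 0; i < n; ++i) {
        if (sizes[i] == 0) {
            return true;  // an empty whitelist makes the product zero
        }
        if (product > maxCombined / sizes[i]) {
            return false; // multiplying would exceed the cap
        }
        product *= sizes[i];
    }
    return true;
}
```

With a cap of, say, 100 million combined barcodes, fitsWithin(kExampleSizes, 4, 100000000) is false, so the program could stop with a readable error instead of running out of memory.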
As the whitelist files are read, the resulting memory usage looks like
I am guessing that cbWLsize is much too large, and I suspect that the expression marked below
        while (cbWlStream >> seq1) {//cycle over one WL file
            uint64 cb1;
            if (!convertNuclStrToInt64(seq1,cb1)) {//convert to 2-bit format
                pP->inOut->logMain << "WARNING: CB whitelist sequence contains non-ACGT base and is ignored: " << seq1 <<endl;
                continue;
            };
            uint32 len1=seq1.size();
            if (len1>=cbV[icb].wl.size())
                cbV[icb].wl.resize(len1+1);//add new possible lengths to this CB
            cbV[icb].wl.at(len1).push_back(cb1);
        };
        cbV[icb].sortWhiteList(this);
        cbV[icb].wlFactor=cbWLsize;
        cbWLsize *= cbV[icb].totalSize; <------
    };
is the reason why the value is so large. I naively hoped that using cbWLsize += cbV[icb].totalSize; instead might fix the problem. It does get past initialization, but it then gives me a segmentation fault later in the program:
Oct 01 16:05:44 ..... finished mapping
Oct 01 16:05:45 ..... started Solo counting
Thread 1 "STAR" received signal SIGSEGV, Segmentation fault.
SoloReadFeature::inputRecords (this=0x5555592d54b0, cbP=0x555565e94170, cbPstride=3, cbReadCountTotal=std::vector of length 1345, capacity 1345 = {...},
readInfo=std::vector of length 18867, capacity 18867 = {...}, readFlagCounts=..., nReadPerCBunique1=std::vector of length 1345, capacity 1345 = {...},
nReadPerCBmulti1=std::vector of length 1345, capacity 1345 = {...}) at SoloReadFeature_inputRecords.cpp:53
53 cbP[cb][0]=feature;
(gdb) where
#0 SoloReadFeature::inputRecords (this=0x5555592d54b0, cbP=0x555565e94170, cbPstride=3, cbReadCountTotal=std::vector of length 1345, capacity 1345 = {...},
readInfo=std::vector of length 18867, capacity 18867 = {...}, readFlagCounts=..., nReadPerCBunique1=std::vector of length 1345, capacity 1345 = {...},
nReadPerCBmulti1=std::vector of length 1345, capacity 1345 = {...}) at SoloReadFeature_inputRecords.cpp:53
#1 0x00005555555c0776 in SoloFeature::countCBgeneUMI (this=0x555565e91e90) at SoloFeature_countCBgeneUMI.cpp:46
#2 0x00005555555fa204 in SoloFeature::processRecords (this=0x555565e91e90) at SoloFeature_processRecords.cpp:54
#3 0x00005555555f3b3d in Solo::processAndOutput (this=0x7fffffffa150) at Solo.cpp:82
#4 0x0000555555653a66 in main (argInN=57, argIn=0x7fffffffe2a8) at STAR.cpp:256
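My best guess at why the += change crashes, written out as a sketch. It assumes that STAR combines the per-barcode whitelist positions into one index as the sum of position times wlFactor, which is what the name wlFactor suggests but which I have not confirmed in the source. The sizes in kSizes are hypothetical: I picked them because their product matches the __new_size of 2717908992 in the first trace and one plus their sum matches the vector length of 1345 in the second. The function names are made up.

```cpp
#include <cstdint>

constexpr int kNumBarcodes = 4;

// Hypothetical whitelist sizes (assumption, see the text above).
constexpr std::uint64_t kSizes[kNumBarcodes] = {96, 96, 384, 768};

// Final value of the running total, mirroring cbWLsize in the loop:
// multiply=true is the original "*=", multiply=false is my "+=" change.
constexpr std::uint64_t totalSlots(bool multiply) {
    std::uint64_t total = 1;
    for (int i = 0; i < kNumBarcodes; ++i) {
        total = multiply ? total * kSizes[i] : total + kSizes[i];
    }
    return total;
}

// Largest combined index, ASSUMING index = sum(position_i * factor_i),
// where factor_i is the running total before barcode i (wlFactor=cbWLsize).
constexpr std::uint64_t maxIndex(bool multiply) {
    std::uint64_t total = 1;
    std::uint64_t maxIdx = 0;
    for (int i = 0; i < kNumBarcodes; ++i) {
        const std::uint64_t factor = total;
        maxIdx += (kSizes[i] - 1) * factor; // last entry of whitelist i
        total = multiply ? total * kSizes[i] : total + kSizes[i];
    }
    return maxIdx;
}
```

Under that assumption the original *= gives 2717908992 slots with a largest index of 2717908991, so every index fits. With += there are only 1345 slots, but the largest index is 525788, so cbP[cb] at SoloReadFeature_inputRecords.cpp:53 would be written far out of bounds. If that is right, the product is needed for the indexing scheme and cannot simply be swapped for a sum.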
I am now in way over my head with this code, so I am writing for assistance and guidance.
I appreciate your patience with me.
Thank you,
Ever grateful,
Brent