STARsolo run with --soloType CB_UMI_Complex uses too much memory. #2222

Open
brgew opened this issue Oct 1, 2024 · 1 comment

brgew commented Oct 1, 2024

Hi,

I am trying to run STAR-2.7.11b using --soloType CB_UMI_Complex with four barcodes. The whitelist files have between 96 and 768 sequences of about 10 bases each. The program appears to hang while its memory usage grows beyond 256 GB, and then it dies. It never begins reading the genome index files.

I compiled STAR with debug symbols, interrupted it periodically, and found that it stalls at the resize call marked below:

void ParametersSolo::complexWLstrings() {

    fprintf(stderr, "ParametersSolo::complexWLstrings: start...\n");
          
    cbWLstr.resize(cbWLsize);  // <---- stalls here

    fprintf(stderr, "ParametersSolo::complexWLstrings: continue...\n");

The stack at that point is:

#0  0x00007ffff7d3fe30 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#1  0x0000555555595b2b in std::_Construct<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) (
    __p=0x7fee5add4050) at /usr/include/c++/12/bits/stl_construct.h:119
#2  0x0000555555595a7e in std::__uninitialized_default_n_1<false>::__uninit_default_n<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, unsigned long> (
    __first=0x7febb7a1e010, __n=2363892990) at /usr/include/c++/12/bits/stl_uninitialized.h:638
#3  0x00005555555959eb in std::__uninitialized_default_n<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, unsigned long> (__first=0x7febb7a1e010, __n=2717908992)
    at /usr/include/c++/12/bits/stl_uninitialized.h:701
#4  0x0000555555595990 in std::__uninitialized_default_n_a<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (__first=0x7febb7a1e010, __n=2717908992) at /usr/include/c++/12/bits/stl_uninitialized.h:766
#5  0x000055555559573b in std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::_M_default_append (this=0x7fffffffbbb8, __n=2717908992) at /usr/include/c++/12/bits/vector.tcc:655
#6  0x000055555559554b in std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::resize (this=0x7fffffffbbb8, __new_size=2717908992) at /usr/include/c++/12/bits/stl_vector.h:1011
#7  0x00005555555ddded in ParametersSolo::complexWLstrings (this=0x7fffffffb9a8) at ParametersSolo.cpp:513
#8  0x00005555555dc82c in ParametersSolo::initialize (this=0x7fffffffb9a8, pPin=0x7fffffffa5f0) at ParametersSolo.cpp:401
#9  0x00005555556695de in Parameters::inputParameters (this=0x7fffffffa5f0, argInN=57, argIn=0x7fffffffe2a8) at Parameters.cpp:962
#10 0x0000555555652cfb in main (argInN=57, argIn=0x7fffffffe2a8) at STAR.cpp:75
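
For a sense of scale, frame #6 above shows resize being asked for 2717908992 std::string elements. The following is a minimal sketch, not STAR code, that estimates the memory taken by that many default-constructed strings. The function names are hypothetical, and sizeof(std::string) is implementation-defined, so the result depends on the standard library in use.

```cpp
#include <cstdint>
#include <string>

// Bytes occupied by n default-constructed std::string objects stored in a
// vector, NOT counting heap buffers the strings may allocate once filled.
// sizeof(std::string) is implementation-defined (commonly 32 bytes with
// 64-bit libstdc++, which is an assumption about the build in this issue).
std::uint64_t bytesForEmptyStrings(std::uint64_t n) {
    return n * static_cast<std::uint64_t>(sizeof(std::string));
}

// The same figure expressed in GiB for easier reading.
double gibForEmptyStrings(std::uint64_t n) {
    return static_cast<double>(bytesForEmptyStrings(n)) /
           (1024.0 * 1024.0 * 1024.0);
}
```

If sizeof(std::string) is 32, gibForEmptyStrings(2717908992ULL) comes to 81 GiB for the empty elements alone. Filling each element with a concatenated barcode longer than the small-string buffer would likely add one heap allocation per element on top of that.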

I found that the function void ParametersSolo::initialize(Parameters *pPin) is on the stack at these interrupts, so I added some diagnostics (the fprintf lines) to the code block that starts with:

} else if (type==SoloTypes::CB_UMI_Complex) {//complex barcodes: multiple whitelist (one for each CB), varying CB length

        cbWLsize=1;
        for (uint32 icb=0; icb<cbV.size(); icb++) {//cycle over WL files
fprintf(stderr, "whitelist file n: %d\n", icb);
            cbV[icb].adapterLength=adapterSeq.size();//one adapter for all
            
            ifstream & cbWlStream = ifstrOpen(soloCBwhitelist[icb], ERROR_OUT, "SOLUTION: check the path and permissions of the CB whitelist file: " + soloCBwhitelist[icb], *pP);
            
            string seq1;
            while (cbWlStream >> seq1) {//cycle over one WL file
                uint64 cb1;
                if (!convertNuclStrToInt64(seq1,cb1)) {//convert to 2-bit format
                    pP->inOut->logMain << "WARNING: CB whitelist sequence contains non-ACGT base and is ignored: " << seq1 <<endl;
                    continue;
                };
                
                uint32 len1=seq1.size();
                if (len1>=cbV[icb].wl.size())
                    cbV[icb].wl.resize(len1+1);//add new possible lengths to this CB
                cbV[icb].wl.at(len1).push_back(cb1);
            };
            
            cbV[icb].sortWhiteList(this);
            cbV[icb].wlFactor=cbWLsize;
            cbWLsize *= cbV[icb].totalSize;
fprintf(stderr, "cbV_total_size: %u  cbWLsize: %lld\n", cbV[icb].totalSize, cbWLsize);
        };

        complexWLstrings();

As the whitelist files are read, the diagnostic output looks like:

whitelist file n: 0
cbV_total_size: 384  cbWLsize: 384
whitelist file n: 1
cbV_total_size: 768  cbWLsize: 294912
whitelist file n: 2
cbV_total_size: 96  cbWLsize: 28311552
whitelist file n: 3
cbV_total_size: 96  cbWLsize: 2717908992
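
The final value is simply the product of the four whitelist sizes, 384 × 768 × 96 × 96 = 2717908992. Below is a small sketch of that arithmetic with an overflow guard. The function name combinedWhitelistSize is hypothetical and is not part of STAR; it only mirrors the multiplication in the loop above.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Multiply per-whitelist sizes the way the loop in the issue does:
// the running size starts at 1 and is multiplied by each whitelist's size.
// Returns false (leaving a partial product) if 64 bits would overflow.
bool combinedWhitelistSize(const std::vector<std::uint64_t>& sizes,
                           std::uint64_t& product) {
    product = 1;
    for (std::uint64_t n : sizes) {
        if (n != 0 && product > std::numeric_limits<std::uint64_t>::max() / n) {
            return false;  // next multiplication would overflow
        }
        product *= n;
    }
    return true;
}
```

Calling it with the sizes {384, 768, 96, 96} reproduces the 2717908992 printed by the diagnostics, so the number is not a read error: it is the count of every possible combination of the four barcodes.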

I am guessing that cbWLsize is much too large, and I suspect that the expression marked below

            while (cbWlStream >> seq1) {//cycle over one WL file
                uint64 cb1;
                if (!convertNuclStrToInt64(seq1,cb1)) {//convert to 2-bit format
                    pP->inOut->logMain << "WARNING: CB whitelist sequence contains non-ACGT base and is ignored: " << seq1 <<endl;
                    continue;
                };
                
                uint32 len1=seq1.size();
                if (len1>=cbV[icb].wl.size())
                    cbV[icb].wl.resize(len1+1);//add new possible lengths to this CB
                cbV[icb].wl.at(len1).push_back(cb1);
            };
            
            cbV[icb].sortWhiteList(this);
            cbV[icb].wlFactor=cbWLsize;
            cbWLsize *= cbV[icb].totalSize;  // <------ suspected expression
        };

is the reason the value is so large. I naively hoped that using cbWLsize += cbV[icb].totalSize; instead might fix the problem; however, this gives me a segmentation fault later in the program:

Oct 01 16:05:44 ..... finished mapping
Oct 01 16:05:45 ..... started Solo counting

Thread 1 "STAR" received signal SIGSEGV, Segmentation fault.
SoloReadFeature::inputRecords (this=0x5555592d54b0, cbP=0x555565e94170, cbPstride=3, cbReadCountTotal=std::vector of length 1345, capacity 1345 = {...}, 
    readInfo=std::vector of length 18867, capacity 18867 = {...}, readFlagCounts=..., nReadPerCBunique1=std::vector of length 1345, capacity 1345 = {...}, 
    nReadPerCBmulti1=std::vector of length 1345, capacity 1345 = {...}) at SoloReadFeature_inputRecords.cpp:53
53                          cbP[cb][0]=feature;
(gdb) where
#0  SoloReadFeature::inputRecords (this=0x5555592d54b0, cbP=0x555565e94170, cbPstride=3, cbReadCountTotal=std::vector of length 1345, capacity 1345 = {...}, 
    readInfo=std::vector of length 18867, capacity 18867 = {...}, readFlagCounts=..., nReadPerCBunique1=std::vector of length 1345, capacity 1345 = {...}, 
    nReadPerCBmulti1=std::vector of length 1345, capacity 1345 = {...}) at SoloReadFeature_inputRecords.cpp:53
#1  0x00005555555c0776 in SoloFeature::countCBgeneUMI (this=0x555565e91e90) at SoloFeature_countCBgeneUMI.cpp:46
#2  0x00005555555fa204 in SoloFeature::processRecords (this=0x555565e91e90) at SoloFeature_processRecords.cpp:54
#3  0x00005555555f3b3d in Solo::processAndOutput (this=0x7fffffffa150) at Solo.cpp:82
#4  0x0000555555653a66 in main (argInN=57, argIn=0x7fffffffe2a8) at STAR.cpp:256
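
One possible reading of this crash: the vectors in the trace have length 1345, which equals 1 + 384 + 768 + 96 + 96, the value cbWLsize reaches with += when it starts at 1. Because wlFactor is set from the running cbWLsize, it looks like a positional weight for combining the per-whitelist indices into one number. The sketch below assumes the combined index is the sum of index × wlFactor over the barcodes. That is an assumption, not something verified against the STAR source, and maxCombinedIndex is a hypothetical name.

```cpp
#include <cstdint>
#include <vector>

// Largest combined barcode index under an ASSUMED mixed-radix scheme:
//   combined = sum over whitelists of (index_i * factor_i),
// where factor_i is the running size before whitelist i is folded in,
// as with "wlFactor = cbWLsize" in the issue. 'total' receives the final
// running size. multiply=true models "*=", multiply=false models "+=".
std::uint64_t maxCombinedIndex(const std::vector<std::uint64_t>& sizes,
                               bool multiply, std::uint64_t& total) {
    total = 1;  // same starting value as cbWLsize
    std::uint64_t maxIndex = 0;
    for (std::uint64_t n : sizes) {
        if (n == 0) continue;                 // empty whitelist adds nothing
        const std::uint64_t factor = total;   // wlFactor for this whitelist
        maxIndex += (n - 1) * factor;         // largest index in a list is n-1
        total = multiply ? total * n : total + n;
    }
    return maxIndex;
}
```

Under that assumption, with *= the largest index is always total - 1, so arrays of length cbWLsize are large enough. With += the factors become 1, 385, 1153, and 1249, and the largest index for these whitelists far exceeds the 1345 allocated entries, which would be consistent with an out-of-bounds write at cbP[cb][0]=feature.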

Now I am in way over my head in this code, so I'm writing for assistance or guidance.

I appreciate your patience with me.

Thank you,

Ever grateful,
Brent

wrb2012 commented Oct 16, 2024

If you don't have 384*768*96*96 barcodes, try using a merged whitelist, like wl1: 1152, wl2: 192
