BUG: mistralai/mamba-codestral-7B-v0.1 AttributeError: 'Mamba2' object has no attribute 'dconv' #196

Open
s-natsubori opened this issue Jul 19, 2024 · 0 comments
Labels
bug Something isn't working

s-natsubori commented Jul 19, 2024

Python -VV

Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]

Pip Freeze

absl-py==2.0.0
accelerate==0.28.0
aiohttp @ file:///rapids/aiohttp-3.8.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=df72ac063b97837a80d80dec8d54c241af059cc9bb42c4de68bd5b61ceb37caa
aiorwlock==1.3.0
aiosignal @ file:///rapids/aiosignal-1.3.1-py3-none-any.whl#sha256=f8376fb07dd1e86a584e4fcdec80b36b7f81aac666ebc724e2c090300dd83b17
annotated-types==0.5.0
antlr4-python3-runtime==4.9.3
anyio==4.4.0
apex @ file:///opt/pytorch/apex
argilla==1.24.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
asttokens==2.4.0
astunparse==1.6.3
async-timeout @ file:///rapids/async_timeout-4.0.3-py3-none-any.whl#sha256=7405140ff1230c310e51dc27b3145b9092d659ce68ff733fb0cefe3ee42be028
asyncio==3.4.3
attrs==23.1.0
audioread==3.0.1
av==12.2.0
backcall==0.2.0
backoff==2.2.1
beautifulsoup4==4.12.2
bleach==6.0.0
blis==0.7.11
cachetools==5.3.1
catalogue==2.0.10
causal-conv1d==1.4.0
certifi==2023.7.22
cffi==1.16.0
charset-normalizer @ file:///rapids/charset_normalizer-3.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=193cbc708ea3aca45e7221ae58f0fd63f933753a9bfb498a3b474878f12caaad
click @ file:///rapids/click-8.1.6-py3-none-any.whl#sha256=fa244bb30b3b5ee2cae3da8f55c9e5e0c0e86093306301fb418eb9dc40fbded5
cloudpathlib==0.15.1
cloudpickle @ file:///rapids/cloudpickle-2.2.1-py3-none-any.whl#sha256=61f594d1f4c295fa5cd9014ceb3a1fc4a70b0de1164b94fbc2d854ccba056f9f
cmake==3.27.6
coloredlogs==15.0.1
comm==0.1.4
compel==2.0.2
confection==0.1.3
contourpy==1.1.1
controlnet_aux==0.0.7
cssselect==1.2.0
ctranslate2==4.3.1
cubinlinker @ file:///rapids/cubinlinker-0.3.0%2B2.gce0680b-cp310-cp310-linux_x86_64.whl#sha256=8cff93be2d63d7db8f1d15fc72cf813abe3d8fd31c35be439e3fb6b7b4c89f76
cuda-python @ file:///rapids/cuda_python-12.2.0rc5%2B5.g84845d1-cp310-cp310-linux_x86_64.whl#sha256=19bb8c6dd62e976182ff183aab18d2c9f0a698add93a1037f2cbaa5d0f739d9d
cudf @ file:///rapids/cudf-23.8.0-cp310-cp310-linux_x86_64.whl#sha256=12228d0949a6be3a7a383262f77c37372d48e02e57c4d0b8ed3763ced4d26ccb
cugraph @ file:///rapids/cugraph-23.8.0-cp310-cp310-linux_x86_64.whl#sha256=209757e66f1ef51a5bace52774f9fc5575cdc6a00e11287ca8f0be78f57a9661
cugraph-dgl @ file:///rapids/cugraph_dgl-23.8.0-py3-none-any.whl#sha256=ef49cc4464b39aa686b97faa50186bd104cf965a7b7215c7ffb7b94011b6bcea
cugraph-service-client @ file:///rapids/cugraph_service_client-23.8.0-py3-none-any.whl#sha256=54d3f0367285be37ed4166483e4402e71e6a4747fb55e5a32a6ca9abfe264cb5
cugraph-service-server @ file:///rapids/cugraph_service_server-23.8.0-py3-none-any.whl#sha256=1fd5d70166ff9023c2b451f63e1a4a25c0e55e018811fc1549f52dffb7a422f6
cuml @ file:///rapids/cuml-23.8.0-cp310-cp310-linux_x86_64.whl#sha256=f9209e5d1e2c765a4bc0b2955e4bc29016b9c4186b7e0512553f3fff879bf697
cupy-cuda12x @ file:///rapids/cupy_cuda12x-12.1.0-cp310-cp310-linux_x86_64.whl#sha256=840d1f4560436be5aaa9b6071d4947a391ab8c7b4810f035fc7815d43c29ed6d
cycler==0.12.1
cymem==2.0.8
Cython==3.0.3
dask @ file:///rapids/dask-2023.7.1-py3-none-any.whl#sha256=8ca3969805dd1cceee66f1138f103fba6fbaf22ba488f15b2382b4579ee39f02
dask-cuda @ file:///rapids/dask_cuda-23.8.0-py3-none-any.whl#sha256=68d2bef0df1307a28a0306e3501d63e6d19994d8bbe5e5dccd8b0967bcca8d30
dask-cudf @ file:///rapids/dask_cudf-23.8.0-py3-none-any.whl#sha256=8783c9089041462b8a4418d8645db2a7b2bc32c4c4b1800512f387d466ee1f16
dataclasses-json==0.6.7
datasets==2.19.2
debugpy==1.8.0
decorator==5.1.1
defusedxml==0.7.1
Deprecated==1.2.14
diffusers==0.29.0
dill==0.3.8
diskcache==5.6.3
distributed @ file:///rapids/distributed-2023.7.1-py3-none-any.whl#sha256=1237f8ae11baa9f80070329a33f9d5af32da5c272a98bab088c9b0578c2d816e
distro==1.9.0
dm-tree==0.1.8
docstring_parser==0.16
einops==0.7.0
exceptiongroup==1.1.3
execnet==2.0.2
executing==2.0.0
expecttest==0.1.3
fastapi==0.110.0
faster-whisper==1.0.2
fastjsonschema==2.18.1
fastrlock @ file:///rapids/fastrlock-0.8.1-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_24_x86_64.whl#sha256=d6c53abeae3f9a55b5c65824cec9df59159fa50e8fa800a5c6e8de42b2219c28
feedfinder2==0.0.4
feedparser==6.0.11
ffmpeg-python==0.2.0
filelock==3.12.4
fire==0.6.0
flash-attn @ https://github.com/Dao-AILab/flash-attention/releases/download/v2.5.9.post1/flash_attn-2.5.9.post1+cu122torch2.3cxx11abiFALSE-cp310-cp310-linux_x86_64.whl#sha256=5022ba11d48bf74926da9c16260f4ea2b9bb7f4e29bdb4bd6e1383ad1c55d16f
flatbuffers==24.3.25
fonttools==4.43.1
frozenlist @ file:///rapids/frozenlist-1.4.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=6918d49b1f90821e93069682c06ffde41829c346c66b721e65a5c62b4bab0300
fsspec @ file:///rapids/fsspec-2023.6.0-py3-none-any.whl#sha256=1cbad1faef3e391fba6dc005ae9b5bdcbf43005c9167ce78c915549c352c869a
fugashi==1.3.1
future==1.0.0
gast==0.5.4
google-auth==2.23.2
google-auth-oauthlib==0.4.6
graphsurgeon @ file:///workspace/TensorRT-8.6.1.6/graphsurgeon/graphsurgeon-0.4.6-py2.py3-none-any.whl#sha256=0fbadaefbbe6e9920b9f814ae961c4a279be602812edf3ed7fb9cc6f8f4809fe
greenlet==3.0.3
grpcio==1.59.0
h11==0.14.0
httpcore==1.0.5
httptools==0.6.1
httpx==0.26.0
huggingface-hub==0.24.0
humanfriendly==10.0
hypothesis==5.35.1
idna==3.4
imageio==2.34.2
importlib-metadata @ file:///rapids/importlib_metadata-6.8.0-py3-none-any.whl#sha256=3ebb78df84a805d7698245025b975d9d67053cd94c79245ba4b3eb694abe68bb
iniconfig==2.0.0
intel-openmp==2021.4.0
interegular==0.3.3
ipykernel==6.25.2
ipython==8.16.1
ipython-genutils==0.2.0
ja-sentence-segmenter==0.0.2
jedi==0.19.1
jieba3k==0.35.1
Jinja2==3.1.2
joblib==1.3.2
json5==0.9.14
jsonpatch==1.33
jsonpointer==3.0.0
jsonschema==4.21.1
jsonschema-specifications==2023.7.1
jupyter-tensorboard @ git+https://github.com/cliffwoolley/jupyter_tensorboard.git@ffa7e26138b82549453306e06b535a9ac36db17a
jupyter_client==8.3.1
jupyter_core==5.3.2
jupyterlab==2.3.2
jupyterlab-pygments==0.2.2
jupyterlab-server==1.2.0
jupytext==1.15.2
kiwisolver==1.4.5
langchain==0.2.3
langchain-community==0.2.4
langchain-core==0.2.5
langchain-openai==0.1.8
langchain-text-splitters==0.2.1
langcodes==3.3.0
langsmith==0.1.92
lark==1.1.9
lazy_loader==0.4
librosa==0.9.2
llvmlite @ file:///rapids/llvmlite-0.40.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=bbd5e82cc990e5a3e343a3bf855c26fdfe3bfae55225f00efd01c05bbda79918
lm-format-enforcer==0.10.1
locket @ file:///rapids/locket-1.0.0-py2.py3-none-any.whl#sha256=b6c819a722f7b6bd955b80781788e4a66a55628b858d347536b7e81325a3a5e3
lxml==5.2.1
lxml_html_clean==0.1.1
-e git+https://github.com/state-spaces/mamba@c0a00bd1808881831ddf43206c69362d4df90cf7#egg=mamba_ssm
Markdown==3.4.4
markdown-it-py==3.0.0
MarkupSafe==2.1.3
marshmallow==3.21.3
matplotlib==3.8.0
matplotlib-inline==0.1.6
mdit-py-plugins==0.4.0
mdurl==0.1.2
mediapipe==0.10.8
mistral_common==1.3.1
mistral_inference==1.3.0
mistune==3.0.2
mkl==2021.1.1
mkl-devel==2021.1.1
mkl-include==2021.1.1
mock==5.1.0
monotonic==1.6
mpmath==1.3.0
msgpack @ file:///rapids/msgpack-1.0.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=e42b9594cc3bf4d838d67d6ed62b9e59e201862a25e9a157019e171fbe672dd3
multidict @ file:///rapids/multidict-6.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=36c63aaa167f6c6b04ef2c85704e93af16c11d20de1d133e39de6a0e84582a93
multiprocess==0.70.16
murmurhash==1.0.10
mypy-extensions==1.0.0
nbclient==0.8.0
nbconvert==7.9.2
nbformat==5.9.2
nest-asyncio==1.5.8
networkx==3.3
newspaper3k==0.2.8
ninja==1.11.1.1
nltk==3.8.1
notebook==6.4.10
numba @ file:///rapids/numba-0.57.1%2B1.g5fba9aa8f-cp310-cp310-linux_x86_64.whl#sha256=348d18dbb5ce363133fa7d033ae804b5440bf51778395f08b337a9ca6ac98e53
numpy==1.23.5
nvfuser==0.0.20+gitunknown
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-dali-cuda120==1.30.0
nvidia-ml-py==12.555.43
nvidia-nccl-cu12==2.20.5
nvidia-nvjitlink-cu12==12.5.82
nvidia-nvtx-cu12==12.1.105
nvidia-pyindex==1.0.9
nvtx @ file:///rapids/nvtx-0.2.5-cp310-cp310-linux_x86_64.whl#sha256=b8024910cace4d07e6c9677eaf3be1b3e626fa1923ec6e3c7e5d3fdca053c9c9
oauthlib==3.2.2
omegaconf==2.3.0
onnx @ file:///opt/pytorch/pytorch/third_party/onnx
onnxruntime==1.18.1
openai==1.35.15
opencv @ file:///opencv-4.7.0/modules/python/package
opencv-contrib-python==4.10.0.84
opencv-python==4.10.0.84
orjson==3.10.6
outlines==0.0.46
packaging==23.2
pandas @ file:///rapids/pandas-1.5.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=7a0a56cef15fd1586726dace5616db75ebcfec9179a3a55e78f72c5639fa2a23
pandocfilters==1.5.0
parso==0.8.3
partd @ file:///rapids/partd-1.4.0-py3-none-any.whl#sha256=7a63529348cf0dff14b986db641cd1b83c16b5cb9fc647c2851779db03282ef8
pathy==0.10.2
pexpect==4.8.0
pickleshare==0.7.5
Pillow==10.1.0
platformdirs==3.11.0
pluggy==1.3.0
ply @ file:///rapids/ply-3.11-py2.py3-none-any.whl#sha256=096f9b8350b65ebd2fd1346b12452efe5b9607f7482813ffca50c22722a807ce
polygraphy==0.49.0
pooch==1.7.0
preshed==3.0.9
prettytable==3.9.0
prometheus-fastapi-instrumentator==7.0.0
prometheus_client==0.20.0
prompt-toolkit==3.0.39
protobuf==3.20.3
psutil @ file:///rapids/psutil-5.9.4-cp310-abi3-linux_x86_64.whl#sha256=e711cfad802fd4061d559d17e9f175e866551434c3418af2925881a3e5f3440e
ptxcompiler @ file:///rapids/ptxcompiler-0.8.1%2B1.g2cb1b35-cp310-cp310-linux_x86_64.whl#sha256=461049ad74511c8d923967e1826861a0d9a2bcee0cfcf3ebc338fc48b3ecc724
ptyprocess==0.7.0
pure-eval==0.2.2
py-cpuinfo==9.0.0
pyairports==2.1.1
pyarrow==17.0.0
pyarrow-hotfix==0.6
pyasn1==0.5.0
pyasn1-modules==0.3.0
pybind11==2.11.1
pybind11-global==2.11.1
pycocotools @ git+https://github.com/nvidia/cocoapi.git@fa44301f7a8b3f95a9f2751d19bfd735b0f6c65d#subdirectory=PythonAPI
pycountry==24.6.1
pycparser==2.21
pydantic==2.6.1
pydantic_core==2.16.2
Pygments==2.16.1
pylibcugraph @ file:///rapids/pylibcugraph-23.8.0-cp310-cp310-linux_x86_64.whl#sha256=8327053f864ed56bf0d0d8fb69a2291ca1e044fa1f447e63b85b29bf72102c74
pylibcugraphops @ file:///rapids/pylibcugraphops-23.8.0-cp310-cp310-linux_x86_64.whl#sha256=17364a79cda63c9f6c62ef6f2bd37151a9e70539f6d60e43fb26ab40e163bba2
pylibraft @ file:///rapids/pylibraft-23.8.0-cp310-cp310-linux_x86_64.whl#sha256=f74580fec4d0e1603f9b3027da33d915ce07a37d2790c28b1d784d133e90a6d2
pynvml @ file:///rapids/pynvml-11.4.1-py3-none-any.whl#sha256=d27be542cd9d06558de18e2deffc8022ccd7355bc7382255d477038e7e424c6c
pyparsing==3.1.1
pytest==7.4.2
pytest-flakefinder==1.1.0
pytest-rerunfailures==12.0
pytest-shard==0.1.2
pytest-xdist==3.3.1
python-dateutil==2.8.2
python-dotenv==1.0.1
python-hostlist==1.23.0
python-multipart==0.0.9
pytorch-quantization==2.1.2
pytz @ file:///rapids/pytz-2023.3-py2.py3-none-any.whl#sha256=a151b3abb88eda1d4e34a9814df37de2a80e301e68ba0fd856fb9b46bfbbbffb
PyYAML==6.0
pyzmq==25.1.1
raft-dask @ file:///rapids/raft_dask-23.8.0-cp310-cp310-linux_x86_64.whl#sha256=9464bd2889aff217d63f2ff804f06328123119e72745399900315fc85f4d6b7e
ray==2.32.0
redis==5.0.3
referencing==0.30.2
regex==2023.10.3
requests==2.32.3
requests-file==2.1.0
requests-oauthlib==1.3.1
resampy==0.4.2
rich==13.7.1
rmm @ file:///rapids/rmm-23.8.0-cp310-cp310-linux_x86_64.whl#sha256=11e3bc42ddfa51f8293ddb37fb006e4dd59fc20534e8f027b5453c8d00fa089f
rpds-py==0.10.4
rsa==4.9
safetensors==0.4.3
scikit-image==0.24.0
scikit-learn @ file:///rapids/scikit_learn-1.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=184a42842a4e698ffa4d849b6019de50a77a0aa24d26afa28fa49c9190bb144b
scipy @ file:///rapids/scipy-1.11.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=366a6a937110d80dca4f63b3f5b00cc89d36f678b2d124a01067b154e692bab1
Send2Trash==1.8.2
sentence-transformers==3.0.1
sentencepiece==0.2.0
sgmllib3k==1.0.0
simple_parsing==0.1.5
six==1.16.0
smart-open==6.4.0
sniffio==1.3.1
sortedcontainers==2.4.0
sounddevice==0.4.7
soundfile==0.12.1
soupsieve==2.5
spacy==3.7.1
spacy-legacy==3.0.12
spacy-loggers==1.0.5
sphinx-glpi-theme==0.3
SQLAlchemy==2.0.31
srsly==2.4.8
stack-data==0.6.3
starlette==0.36.3
sympy==1.12
tabulate==0.9.0
tbb==2021.10.0
tblib @ file:///rapids/tblib-2.0.0-py3-none-any.whl#sha256=9100bfa016b047d5b980d66e7efed952fbd20bd85b56110aaf473cb97d18709a
tenacity==8.5.0
tensorboard==2.9.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorrt @ file:///workspace/TensorRT-8.6.1.6/python/tensorrt-8.6.1-cp310-none-linux_x86_64.whl#sha256=2684b4772cb16088184266728a0668f5dac14e66f088c4ccff2096ccb222d74c
termcolor==2.4.0
terminado==0.17.1
thinc==8.2.1
thread6==0.2.0
threadpoolctl==3.2.0
thriftpy2 @ file:///rapids/thriftpy2-0.4.16-cp310-cp310-linux_x86_64.whl#sha256=3b41ffe57f0a10ee592e06b4843e37ae1bc7f0309a2478f0bf1368ede2ad4ed4
tifffile==2024.7.2
tiktoken==0.7.0
timm==1.0.7
tinycss2==1.2.1
tinysegmenter==0.3
tldextract==5.1.2
tokenizers==0.19.1
toml==0.10.2
tomli==2.0.1
toolz @ file:///rapids/toolz-0.12.0-py3-none-any.whl#sha256=2059bd4148deb1884bb0eb770a3cde70e7f954cfbbdc2285f1f2de01fd21eb6f
torch==2.3.0
torch-tensorrt @ file:///opt/pytorch/torch_tensorrt/dist/torch_tensorrt-0.0.0-cp310-cp310-linux_x86_64.whl#sha256=239cc59958283c8fd764ec360b93adf63db94d231c6dbae3212736187d1c1f21
torchdata @ file:///opt/pytorch/data
torchtext @ file:///opt/pytorch/text
torchvision==0.18.0
tornado==6.3.3
tqdm==4.66.2
traitlets==5.9.0
transformers==4.42.4
treelite @ file:///rapids/treelite-3.2.0-cp310-cp310-linux_x86_64.whl#sha256=7627a3fed44ce1dda4c35ce707cca4b6108d74a661997c0451be59d03f2155ca
treelite-runtime @ file:///rapids/treelite_runtime-3.2.0-cp310-cp310-linux_x86_64.whl#sha256=085ec1ba71007d357ecebb493c490133c20778cd51d8662a0a10d1dc56b1623e
triton==2.3.0
typer==0.9.0
types-dataclasses==0.6.6
typing==3.7.4.3
typing-inspect==0.9.0
typing_extensions==4.12.2
ucx-py @ file:///rapids/ucx_py-0.33.0-cp310-cp310-linux_x86_64.whl#sha256=55d9f5f80627ba1f00577fca41ecd6ab8c72cc518e392a078d108b7dbd809c1e
uff @ file:///workspace/TensorRT-8.6.1.6/uff/uff-0.6.9-py2.py3-none-any.whl#sha256=618a3f812d491f0d3c4f2e38b99e03217ca37b206db14cee079f2bf681eb4fe3
unidic-lite==1.0.8
urllib3 @ file:///rapids/urllib3-1.26.16-py2.py3-none-any.whl#sha256=8d36afa7616d8ab714608411b4a3b13e58f463aee519024578e062e141dce20f
uvicorn==0.30.1
uvloop==0.19.0
vllm==0.5.0
vllm-flash-attn==2.5.9
wasabi==1.1.2
watchfiles==0.22.0
wcwidth==0.2.8
weasel==0.3.2
webencodings==0.5.1
websockets==12.0
Werkzeug==3.0.0
wrapt==1.14.1
xdoctest==1.0.2
xformers==0.0.26.post1
xgboost @ file:///rapids/xgboost-1.7.5-cp310-cp310-linux_x86_64.whl#sha256=56f29fb999f8272bf8498ecbaf0659de4becf693b96a545f0e52f627270cf80d
xxhash==3.4.1
yarl @ file:///rapids/yarl-1.9.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=891c0e3ec5ec881541f6c5113d8df0315ce5440e244a716b95f2525b7b9f3608
zict @ file:///rapids/zict-3.0.0-py2.py3-none-any.whl#sha256=5796e36bd0e0cc8cf0fbc1ace6a68912611c1dbd74750a3f3026b9b9d6a327ae
zipp @ file:///rapids/zipp-3.16.2-py3-none-any.whl#sha256=679e51dd4403591b2d6838a48de3d283f3d188412a9782faadf845f298736ba0

Reproduction Steps

mistral-chat $LLM_MODEL --instruct --max_tokens 256
where $LLM_MODEL is the path to the mamba-codestral-7B-v0.1 folder.

Traceback (most recent call last):
  File "/usr/local/bin/mistral-chat", line 8, in <module>
    sys.exit(mistral_chat())
  File "/usr/local/lib/python3.10/dist-packages/mistral_inference/main.py", line 203, in mistral_chat
    fire.Fire(interactive)
  File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 143, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 477, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 693, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/mistral_inference/main.py", line 117, in interactive
    generated_tokens, _ = generate_fn(  # type: ignore[operator]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/mistral_inference/generate.py", line 21, in generate_mamba
    output = model.model.generate(
  File "/usr/local/setup/mamba/mamba_ssm/utils/generation.py", line 260, in generate
    output = decode(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/setup/mamba/mamba_ssm/utils/generation.py", line 221, in decode
    scores.append(get_logits(sequences[-1], inference_params))
  File "/usr/local/setup/mamba/mamba_ssm/utils/generation.py", line 184, in get_logits
    logits = model(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/setup/mamba/mamba_ssm/models/mixer_seq_simple.py", line 279, in forward
    hidden_states = self.backbone(input_ids, inference_params=inference_params, **mixer_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/setup/mamba/mamba_ssm/models/mixer_seq_simple.py", line 194, in forward
    hidden_states, residual = layer(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/setup/mamba/mamba_ssm/modules/block.py", line 67, in forward
    hidden_states = self.mixer(hidden_states, inference_params=inference_params, **mixer_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/setup/mamba/mamba_ssm/modules/mamba2.py", line 233, in forward
    self.conv1d(xBC.transpose(1, 2)).transpose(1, 2)[:, -(self.dconv - 1):]
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1709, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'Mamba2' object has no attribute 'dconv'. Did you mean: 'd_conv'?
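
The traceback comes down to one attribute name: `mamba2.py` line 233 reads `self.dconv`, while the error message suggests the module only defines `d_conv`. The sketch below reproduces that failure mode in plain Python and shows one possible stopgap, a read-only alias on the class. `FakeMamba2`, `tail_width`, and `add_dconv_alias` are hypothetical names used for illustration only and are not part of mamba_ssm; the value 4 is arbitrary. This has not been run against the real `Mamba2` class.

```python
class FakeMamba2:
    """Hypothetical stand-in for mamba_ssm's Mamba2 (illustration only)."""

    def __init__(self, d_conv=4):
        # Only the correctly spelled attribute is defined.
        self.d_conv = d_conv

    def tail_width(self):
        # Mirrors the failing expression: it reads the misspelled name.
        return self.dconv - 1


def add_dconv_alias(cls):
    """Expose `dconv` as a read-only alias of `d_conv` on `cls`."""
    cls.dconv = property(lambda self: self.d_conv)
    return cls


layer = FakeMamba2(d_conv=4)

try:
    layer.tail_width()
    failed_before_alias = False
except AttributeError as exc:
    # Same class of error as in the traceback above.
    failed_before_alias = True
    print("before alias:", exc)

# Stopgap: a class-level property is found by normal attribute lookup,
# so the misspelled read now resolves to d_conv.
add_dconv_alias(FakeMamba2)
width_after_alias = layer.tail_width()
print("after alias:", width_after_alias)
```

An alias like this only hides the typo. The more direct fix is presumably to change `self.dconv` to `self.d_conv` in the local mamba checkout, or to move to a revision where the line is already corrected, if one exists.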

Expected Behavior

chat output

Additional Context

I installed mistral-inference and causal-conv1d from pip.
I built mamba-ssm (2.2.2) from the GitHub source,
because the pip-installed version raised an Undefined Symbol error.
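
Since this environment mixes pip-installed packages with a source build, it can help to confirm which versions Python actually sees. A minimal sketch using only the standard library follows; the package names are taken from the pip freeze above, and `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata

# Distribution names as they appear in the pip freeze above.
PACKAGES = ("mistral_inference", "mamba_ssm", "causal-conv1d", "torch")


def installed_versions(names=PACKAGES):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version if version is not None else 'not installed'}")
```

This only reports distribution metadata. It does not check whether the compiled extensions in mamba-ssm or causal-conv1d were built against the installed torch and CUDA versions.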

Suggested Solutions

No response

@s-natsubori s-natsubori added the bug Something isn't working label Jul 19, 2024