Installing Mamba2 on a Linux server and problems running the examples

久爱 · posted on 2024-9-3 22:04:57

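Mamba has been out for a while now. The earlier Mamba code was incomplete in places, or needed source changes before it would compile. I will probably need Mamba soon, and I found that the VisionMamba block I implemented earlier had some of the source commented out, which may slow training down. The official authors have since released Mamba2, so here I try installing it and keep some notes in case the server gets reset 🫤

Contents
CUDA installation
Creating the Conda environment
PyTorch installation
causal-conv1d installation
Mamba installation
Troubleshooting

CUDA installation

CUDA has to be installed before anything else. I followed the VisionMamba environment and installed cuda-11.8; just look up the matching version on the CUDA website. Run these two lines on the server to install CUDA. There are plenty of tutorials online and my server already had it installed, so I won't go into detail.

    wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
    sudo sh cuda_11.8.0_520.61.05_linux.run

Creating the Conda environment

Create a virtual environment for Mamba with Anaconda and activate it. As with CUDA, I followed the same setup and used Python 3.10.

    conda create -n your_env_name python=3.10.13
    conda activate your_env_name

PyTorch installation

I installed PyTorch 2.1.1. The matching command can be found on the PyTorch website. I installed with pip, which is a bit slow; switching to the Tsinghua mirror can help.

    pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118

Tsinghua pip mirror

For one-off use:

    pip install -i https://pypi.tuna.tsinghua.edu.cn/simple some-package

To switch permanently:

    python -m pip install --upgrade pip
    pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

causal-conv1d installation

Mamba depends on the causal-conv1d library, which can be installed directly with pip:

    pip install causal-conv1d

You can pin a version here if you need to. Some people get errors at this step; in that case, download a prebuilt wheel from the causal-conv1d GitHub releases and install it locally. Find the wheel that matches your setup, then either download it and upload it to the server, or copy the link and fetch it with wget, and install it with pip. The example below uses version 1.4.0 for CUDA 11.8, torch 2.1.1 and Python 3.10.

    wget https://github.com/Dao-AILab/causal-conv1d/releases/download/v1.4.0/causal_conv1d-1.4.0+cu118torch2.1cxx11abiTRUE-cp310-cp310-linux_x86_64.whl
    pip install causal_conv1d-1.4.0+cu118torch2.1cxx11abiTRUE-cp310-cp310-linux_x86_64.whl

You can also clone the GitHub project and build it locally:

    git clone https://github.com/Dao-AILab/causal-conv1d.git
    cd causal-conv1d
    CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .

This is well covered online, so I won't go into detail.

Mamba installation

Mamba can be installed in the same ways as causal-conv1d, so I won't write them out again. A plain pip install worked for me. The install has to compile some code, so expect a long wait. Running it, however, gave me some trouble. (Mamba GitHub)

Troubleshooting

The Mamba GitHub page gives two code snippets for checking that the installation works.

Mamba

    import torch
    from mamba_ssm import Mamba

    batch, length, dim = 2, 64, 16
    x = torch.randn(batch, length, dim).to("cuda")
    model = Mamba(
        # This module uses roughly 3 * expand * d_model^2 parameters
        d_model=dim,  # Model dimension d_model
        d_state=16,   # SSM state expansion factor
        d_conv=4,     # Local convolution width
        expand=2,     # Block expansion factor
    ).to("cuda")
    y = model(x)
    assert y.shape == x.shape

Mamba2

    from mamba_ssm import Mamba2

    model = Mamba2(
        # This module uses roughly 3 * expand * d_model^2 parameters
        d_model=dim,  # Model dimension d_model
        d_state=64,   # SSM state expansion factor, typically 64 or 128
        d_conv=4,     # Local convolution width
        expand=2,     # Block expansion factor
    ).to("cuda")
    y = model(x)
    assert y.shape == x.shape

The first snippet ran straight away, but the Mamba2 test failed with:

    AttributeError: 'NoneType' object has no attribute 'causal_conv1d_fwd'

I then tried the other test scripts that Mamba provides. They download a pretrained model, which takes a while; you can change the --model-name argument as needed.

Mamba

    python benchmarks/benchmark_generation_mamba_simple.py --model-name "state-spaces/mamba-2.8b" --prompt "My cat wrote all this CUDA code for a new language model and" --topp 0.9 --temperature 0.7 --repetition-penalty 1.2
    python benchmarks/benchmark_generation_mamba_simple.py --model-name "state-spaces/mamba-2.8b" --prompt "My cat wrote all this CUDA code for a new language model and" --minp 0.05 --topk 0 --temperature 0.7 --repetition-penalty 1.2

Mamba2

    python benchmarks/benchmark_generation_mamba_simple.py --model-name "state-spaces/mamba2-2.7b" --prompt "My cat wrote all this CUDA code for a new language model and" --topp 0.9 --temperature 0.7 --repetition-penalty 1.2

The Mamba2 benchmark also had problems: an error involving dconv versus d_conv. I tried patching the source, but it still failed, so I reinstalled causal-conv1d. Someone in a Mamba GitHub issue pointed out that causal-conv1d>=1.2.0 is required. I already had 1.4.0 installed and it still failed, so I guessed the problem was the 1.4.0 build I had downloaded from GitHub and reinstalled it. After that, the following test worked:

    python benchmarks/benchmark_generation_mamba_simple.py --model-name "state-spaces/mamba2-2.7b" --prompt "My cat wrote all this CUDA code for a new language model and" --topp 0.9 --temperature 0.7 --repetition-penalty 1.2

The Mamba2 snippet above still did not pass, though, and raised this error:

    RuntimeError: causal_conv1d with channel last layout requires strides (x.stride(0) and x.stride(2)) to be multiples of 8

In other words, causal_conv1d requires the strides x.stride(0) and x.stride(2) to be multiples of 8. The fixes I found online mostly say that d_model * expand / headdim must be a multiple of 8. With dim=16 this value is 16 * 2 / 64 = 0.5; with dim=512 it is 512 * 2 / 64 = 16. After changing the code as follows, it ran successfully:

    import torch
    from mamba_ssm import Mamba

    batch, length, dim = 2, 64, 16
    x = torch.randn(batch, length, dim).to("cuda")
    model = Mamba(
        # This module uses roughly 3 * expand * d_model^2 parameters
        d_model=dim,  # Model dimension d_model
        d_state=16,   # SSM state expansion factor
        d_conv=4,     # Local convolution width
        expand=2,     # Block expansion factor
    ).to("cuda")
    y = model(x)
    print("Mamba result", y.shape)
    assert y.shape == x.shape

    import torch
    from mamba_ssm import Mamba2

    batch, length, dim = 2, 64, 512
    x = torch.randn(batch, length, dim).to("cuda")
    model = Mamba2(
        # This module uses roughly 3 * expand * d_model^2 parameters
        # make sure d_model * expand / headdim = multiple of 8
        d_model=dim,  # Model dimension d_model
        d_state=64,   # SSM state expansion factor, typically 64 or 128
        d_conv=4,     # Local convolution width
        expand=2,     # Block expansion factor
        headdim=64,   # default 64
    ).to("cuda")
    y = model(x)
    print("Mamba2 result", y.shape)
    assert y.shape == x.shape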
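To avoid hitting the stride error only after the model is already on the GPU, the rule of thumb quoted above (d_model * expand / headdim should be a multiple of 8) can be checked in plain Python first. This is only a rough sketch of that rule as stated in this post. The helper names `mamba2_head_count` and `passes_rule_of_thumb` are made up for illustration and are not part of mamba_ssm, and the check does not reproduce whatever causal_conv1d validates internally.

```python
def mamba2_head_count(d_model: int, expand: int = 2, headdim: int = 64) -> float:
    """Return d_model * expand / headdim, the quantity the rule of thumb refers to."""
    return d_model * expand / headdim


def passes_rule_of_thumb(d_model: int, expand: int = 2,
                         headdim: int = 64, multiple: int = 8) -> bool:
    """Hypothetical pre-check: True if d_model * expand / headdim is a whole
    number and a multiple of `multiple` (8 in the error message)."""
    d_inner = d_model * expand
    if d_inner % headdim != 0:
        # The ratio is not even an integer, e.g. d_model=16 gives 0.5
        return False
    return (d_inner // headdim) % multiple == 0


if __name__ == "__main__":
    # dim=16 is the value from the failing snippet, dim=512 from the working one
    for dim in (16, 128, 256, 512):
        print(dim, mamba2_head_count(dim), passes_rule_of_thumb(dim))
```

With the defaults expand=2 and headdim=64, the smallest d_model that satisfies this rule is 256. If a smaller model is needed, lowering headdim is one knob to try, though I have not verified that against the error above.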
