Deploying mindie-service with Docker in an Ascend environment

Posted on 2024-9-12 13:43:49
MindIE is a high-performance deep learning inference framework for Ascend hardware, covering runtime acceleration, debugging and tuning, and fast migration and deployment. It includes components such as MindIE-Service, MindIE-Torch and MindIE-RT. I mainly use MindIE-Service, which is comparable to large language model inference frameworks such as vLLM.

Starting the docker container

First pull the image (check the official site for the latest image version):

    docker pull swr.cn-central-221.ovaijisuan.com/dxy/mindie:1.0.RC1-800I-A2-aarch64

Then start the container. Here I map the first two NPU accelerator cards into the container:

    docker run --name my_mindie -it -d --net=host --shm-size=500g \
        --device=/dev/davinci0 \
        --device=/dev/davinci1 \
        -w /home \
        --device=/dev/davinci_manager \
        --device=/dev/hisi_hdc \
        --device=/dev/devmm_svm \
        --entrypoint=bash \
        -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
        -v /usr/local/dcmi:/usr/local/dcmi \
        -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
        -v /usr/local/sbin:/usr/local/sbin \
        -v /root/xxx/mindformer_share/:/home/xxx_share \
        -v /tmp:/tmp \
        -v /etc/hccn.conf:/etc/hccn.conf \
        -v /usr/share/zoneinfo/Asia/Shanghai:/etc/localtime \
        -e http_proxy=$http_proxy \
        -e https_proxy=$https_proxy \
        swr.cn-central-221.ovaijisuan.com/dxy/mindie:1.0.RC1-800I-A2-aarch64

The option -v /root/xxx/mindformer_share/:/home/xxx_share above mounts my own disk into the container; change it to match your environment.

Enter the container:

    docker exec -it my_mindie bash

Once inside, set up the environment:

    source /usr/local/Ascend/ascend-toolkit/set_env.sh
    source /usr/local/Ascend/mindie/set_env.sh

Modifying the service configuration

After the steps above, you can edit the mindie-service configuration file, located at /usr/local/Ascend/mindie/latest/mindie-service/conf/config.json. The relevant excerpt:

    "ipAddress": "0.0.0.0",
    "port": 1025,
    "ModelDeployParam": {
        "maxSeqLen": 4096,
        "npuDeviceIds": [[0,1]],
        "ModelParam": [
            {
                "modelName": "baichuan2",
                "modelWeightPath": "/home/xxxx/baichuan-inc/Baichuan2-13B-Chat/",
                "worldSize": 2,
                "cpuMemSize": 5,
                "npuMemSize": 10,
                "backendType": "atb"
            }
        ]
    },

These are the fields I care about:

ipAddress and port: the address and port the service listens on.
modelName: the name used in Triton-style request URLs; note it down.
npuDeviceIds: which NPU cards to use.
worldSize: the number of NPUs used; it must equal the number of cards listed in npuDeviceIds.
modelWeightPath: the path to the model weights.
maxSeqLen: the maximum sequence length.

Starting the service

    cd /usr/local/Ascend/mindie/latest/mindie-service/
    bin/mindieservice_daemon

How to use the service

You can call the HTTP service with Postman or from Python.

    POST http://223.106.234.6:2250/generate

    {
        "prompt": "你是谁?\n",
        "max_tokens": 1024,
        "repetition_penalty": 1.03,
        "presence_penalty": 1.2,
        "frequency_penalty": 1.2,
        "temperature": 0.5,
        "top_k": 10,
        "top_p": 0.95,
        "stream": false
    }

The prompt "你是谁?" means "Who are you?". mindie supports the openai, triton and vllm interfaces, among others. See the MindIE documentation for details.

References

What is MindIE
Ascend docker image repository
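
The post says the service can be called from Python but only shows the raw POST request. As a minimal sketch, the same /generate call could be made with the standard library as shown below. The helper names build_generate_request and generate are hypothetical, the base URL is a placeholder for your own ipAddress and port, and the code assumes the server answers a non-streaming request with a JSON body, which the post does not confirm.

```python
import json
import urllib.request

# Sampling parameters copied from the example request in the post.
DEFAULT_PARAMS = {
    "max_tokens": 1024,
    "repetition_penalty": 1.03,
    "presence_penalty": 1.2,
    "frequency_penalty": 1.2,
    "temperature": 0.5,
    "top_k": 10,
    "top_p": 0.95,
    "stream": False,
}


def build_generate_request(base_url, prompt, **overrides):
    """Build (but do not send) a POST request for the /generate endpoint.

    Hypothetical helper: keyword arguments override the default parameters.
    """
    payload = {"prompt": prompt, **DEFAULT_PARAMS, **overrides}
    # ensure_ascii=False keeps non-ASCII prompts readable in the request body.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url, prompt, timeout=60, **overrides):
    """Send the request and return the decoded response.

    Assumption: the response body is JSON when "stream" is false. The
    response schema is not described in the post, so no fields are read here.
    """
    request = build_generate_request(base_url, prompt, **overrides)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running, a call such as generate("http://127.0.0.1:1025", "你是谁?\n") would send the same payload as the Postman example; the address here is a placeholder and should be replaced with the one your service actually listens on. Building the request separately from sending it makes it easy to inspect the payload before any network traffic happens.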