|
目录前言一、TensorRT简介1.1TensorRT的主要特点1.2TensorRT的工作流程二、具体示例2.1代码2.2代码结构2.3打印结果前言TensorRT是NVIDIA开发的一款高性能深度学习推理引擎,旨在优化神经网络模型并加速其在NVIDIAGPU上的推理性能。它支持多种深度学习框架,并提供一系列优化技术,以实现更高的吞吐量和更低的延迟。一、TensorRT简介TensorRT(NVIDIATensorRuntime)是由NVIDIA开发的一款高性能深度学习推理库,用于在NVIDIAGPU上进行高效的深度学习推理。它可以优化深度学习模型并将其部署在生产环境中,以实现低延迟和高吞吐量的推理任务。1.1TensorRT的主要特点模型优化:层融合:将多个层融合为一个层以减少内存访问和计算开销。权重量化:将浮点数权重转换为低精度(如INT8或FP16)以减少模型大小和加快计算速度。内存优化:优化内存使用以减少内存带宽和提高数据传输效率。高效推理:异构计算:利用GPU的高并行计算能力进行高效推理。批处理推理:支持批处理输入,提高GPU使用效率。动态输入形状:支持动态输入形状,灵活处理不同大小的输入。易于集成:支持多种深度学习框架:如TensorFlow、PyTorch、ONNX等。多语言支持:提供C++和PythonAPI,方便开发和集成。插件机制:支持自定义层和操作,通过插件机制扩展TensorRT的功能。灵活性和可扩展性:网络定义API:允许用户通过API手动构建和调整深度学习网络。混合精度推理:支持FP32、FP16和INT8的混合精度计算,兼顾性能和精度。1.2TensorRT的工作流程导出模型:将训练好的模型从深度学习框架(如PyTorch、TensorFlow)导出为ONNX格式。解析和构建引擎:使用TensorRT解析ONNX模型,创建网络定义,并进行优化(如层融合、权重量化)。构建TensorRT引擎,这是一个高度优化的二进制文件,可以在GPU上高效运行。推理:加载TensorRT引擎并创建执行上下文。为输入和输出分配内存缓冲区。将输入数据复制到GPU,执行推理,并将输出结果从GPU复制回主机。二、具体示例这个示例展示了如何使用TensorRT将一个预训练的ResNet-18模型从PyTorch导出为ONNX格式,并将其转换为TensorRT引擎,最后在GPU上进行高效的推理。2.1代码importtorchimporttorchvision.modelsasmodelsimportonnximporttensorrtastrtimportpycuda.driverascudaimportpycuda.autoinitimportnumpyasnp#1.导出ResNet-18模型为ONNX格式model=models.resnet18(pretrained=True).eval()#加载预训练的ResNet-18模型,并设置为评估模式dummy_input=torch.randn(1,3,224,224)#创建一个随机输入张量,形状为(1,3,224,224)torch.onnx.export(model,dummy_input,"resnet18.onnx",verbose=True)#导出模型为ONNX格式#2.使用TensorRT将ONNX模型转换为TensorRT引擎TRT_LOGGER=trt.Logger(trt.Logger.WARNING)#创建TensorRT日志记录器defbuild_engine(onnx_file_path,shape=(1,3,224,224)):withtrt.Builder(TRT_LOGGER)asbuilder,builder.create_network()asnetwork,trt.OnnxParser(network,TRT_LOGGER)asparser:config=builder.create_builder_config()config.max_workspace_size=1<< 30 # 设置最大工作空间大小为 1 GB builder.max_batch_size = 1 # 设置最大批处理大小为 1 # 读取 ONNX 模型文件 with open(onnx_file_path, 'rb') as model: if not parser.parse(model.read()): print('Failed parsing ONNX file.') for error in range(parser.num_errors): print(parser.get_error(error)) return None network.get_input(0).shape = shape # 设置输入形状 engine = builder.build_engine(network, config) # 构建 TensorRT 引擎 return engine engine = build_engine("resnet18.onnx") # 构建 TensorRT 引擎 # 3. 加载 TensorRT 引擎并进行推理def allocate_buffers(engine): inputs = [] outputs = [] bindings = [] stream = cuda.Stream() # 创建 CUDA 流 for binding in engine: size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size # 计算绑定形状的体积 dtype = trt.nptype(engine.get_binding_dtype(binding)) # 获取绑定数据类型 host_mem = cuda.pagelocked_empty(size, dtype) # 分配页锁定内存 dev_mem = cuda.mem_alloc(host_mem.nbytes) # 分配设备内存 bindings.append(int(dev_mem)) if engine.binding_is_input(binding): inputs.append((host_mem, dev_mem)) # 添加输入绑定 else: outputs.append((host_mem, dev_mem)) # 添加输出绑定 return inputs, outputs, bindings, stream inputs, outputs, bindings, stream = allocate_buffers(engine) # 分配缓冲区context = engine.create_execution_context() # 创建执行上下文 # 4. 推理函数def infer(context, bindings, inputs, outputs, stream): [cuda.memcpy_htod_async(inp[1], inp[0], stream) for inp in inputs] # 将输入数据从主机复制到设备 context.execute_async(bindings=bindings, stream_handle=stream.handle) # 异步执行推理 [cuda.memcpy_dtoh_async(out[0], out[1], stream) for out in outputs] # 将输出数据从设备复制到主机 stream.synchronize() # 同步流 return [out[0] for out in outputs] # 5. 准备输入数据并进行推理input_data = np.random.random_sample((1, 3, 224, 224)).astype(np.float32) # 创建随机输入数据np.copyto(inputs[0][0], input_data.ravel()) # 将输入数据复制到输入缓冲区 trt_outputs = infer(context, bindings=bindings, inputs=inputs, outputs=outputs, stream=stream) # 进行推理print(trt_outputs) # 打印推理结果
2.2代码结构
导出 ResNet-18 模型为 ONNX 格式:
使用 torchvision.models 加载 ResNet-18 预训练模型,并将其设置为评估模式。创建一个形状为 (1, 3, 224, 224) 的随机输入张量。使用 torch.onnx.export 将模型导出为名为 resnet18.onnx 的 ONNX 文件。 将 ONNX 模型转换为 TensorRT 引擎:
初始化 TensorRT 日志记录器。定义一个函数 build_engine,该函数接收 ONNX 文件路径和输入形状作为参数。在函数内部,创建 TensorRT 构建器、网络和解析器。读取并解析 ONNX 文件。设置网络输入形状并构建 TensorRT 引擎。 分配内存缓冲区:
定义一个函数 allocate_buffers,该函数分配输入和输出的主机和设备内存。通过遍历引擎的所有绑定,计算每个绑定的内存大小,并分配相应的主机和设备内存。 创建执行上下文:
使用 TensorRT 引擎创建一个执行上下文,用于执行推理任务。 推理函数:
定义一个函数 infer,该函数接收执行上下文、绑定、输入、输出和流作为参数。将输入数据从主机复制到设备,并异步执行推理。将输出数据从设备复制回主机,并同步 CUDA 流以确保推理完成。 准备输入数据并进行推理:
创建一个随机的输入数据,并将其复制到输入缓冲区。调用 infer 函数执行推理,并打印输出结果。
2.3打印结果
Exported graph: graph(%input.1 : Float(1, 3, 224, 224, strides=[150528, 50176, 224, 1], requires_grad=0, device=cpu), %fc.weight : Float(1000, 512, strides=[512, 1], requires_grad=1, device=cpu), %fc.bias : Float(1000, strides=[1], requires_grad=1, device=cpu), %onnx::Conv_193 : Float(64, 3, 7, 7, strides=[147, 49, 7, 1], requires_grad=0, device=cpu), %onnx::Conv_194 : Float(64, strides=[1], requires_grad=0, device=cpu), %onnx::Conv_196 : Float(64, 64, 3, 3, strides=[576, 9, 3, 1], requires_grad=0, device=cpu), %onnx::Conv_197 : Float(64, strides=[1], requires_grad=0, device=cpu), %onnx::Conv_199 : Float(64, 64, 3, 3, strides=[576, 9, 3, 1], requires_grad=0, device=cpu), %onnx::Conv_200 : Float(64, strides=[1], requires_grad=0, device=cpu), %onnx::Conv_202 : Float(64, 64, 3, 3, strides=[576, 9, 3, 1], requires_grad=0, device=cpu), %onnx::Conv_203 : Float(64, strides=[1], requires_grad=0, device=cpu), %onnx::Conv_205 : Float(64, 64, 3, 3, strides=[576, 9, 3, 1], requires_grad=0, device=cpu), %onnx::Conv_206 : Float(64, strides=[1], requires_grad=0, device=cpu), %onnx::Conv_208 : Float(128, 64, 3, 3, strides=[576, 9, 3, 1], requires_grad=0, device=cpu), %onnx::Conv_209 : Float(128, strides=[1], requires_grad=0, device=cpu), %onnx::Conv_211 : Float(128, 128, 3, 3, strides=[1152, 9, 3, 1], requires_grad=0, device=cpu), %onnx::Conv_212 : Float(128, strides=[1], requires_grad=0, device=cpu), %onnx::Conv_214 : Float(128, 64, 1, 1, strides=[64, 1, 1, 1], requires_grad=0, device=cpu), %onnx::Conv_215 : Float(128, strides=[1], requires_grad=0, device=cpu), %onnx::Conv_217 : Float(128, 128, 3, 3, strides=[1152, 9, 3, 1], requires_grad=0, device=cpu), %onnx::Conv_218 : Float(128, strides=[1], requires_grad=0, device=cpu), %onnx::Conv_220 : Float(128, 128, 3, 3, strides=[1152, 9, 3, 1], requires_grad=0, device=cpu), %onnx::Conv_221 : Float(128, strides=[1], requires_grad=0, device=cpu), %onnx::Conv_223 : Float(256, 128, 3, 3, strides=[1152, 9, 3, 1], requires_grad=0, device=cpu), %onnx::Conv_224 : Float(256, strides=[1], requires_grad=0, device=cpu), %onnx::Conv_226 : Float(256, 256, 3, 3, strides=[2304, 9, 3, 1], requires_grad=0, device=cpu), %onnx::Conv_227 : Float(256, strides=[1], requires_grad=0, device=cpu), %onnx::Conv_229 : Float(256, 128, 1, 1, strides=[128, 1, 1, 1], requires_grad=0, device=cpu), %onnx::Conv_230 : Float(256, strides=[1], requires_grad=0, device=cpu), %onnx::Conv_232 : Float(256, 256, 3, 3, strides=[2304, 9, 3, 1], requires_grad=0, device=cpu), %onnx::Conv_233 : Float(256, strides=[1], requires_grad=0, device=cpu), %onnx::Conv_235 : Float(256, 256, 3, 3, strides=[2304, 9, 3, 1], requires_grad=0, device=cpu), %onnx::Conv_236 : Float(256, strides=[1], requires_grad=0, device=cpu), %onnx::Conv_238 : Float(512, 256, 3, 3, strides=[2304, 9, 3, 1], requires_grad=0, device=cpu), %onnx::Conv_239 : Float(512, strides=[1], requires_grad=0, device=cpu), %onnx::Conv_241 : Float(512, 512, 3, 3, strides=[4608, 9, 3, 1], requires_grad=0, device=cpu), %onnx::Conv_242 : Float(512, strides=[1], requires_grad=0, device=cpu), %onnx::Conv_244 : Float(512, 256, 1, 1, strides=[256, 1, 1, 1], requires_grad=0, device=cpu), %onnx::Conv_245 : Float(512, strides=[1], requires_grad=0, device=cpu), %onnx::Conv_247 : Float(512, 512, 3, 3, strides=[4608, 9, 3, 1], requires_grad=0, device=cpu), %onnx::Conv_248 : Float(512, strides=[1], requires_grad=0, device=cpu), %onnx::Conv_250 : Float(512, 512, 3, 3, strides=[4608, 9, 3, 1], requires_grad=0, device=cpu), %onnx::Conv_251 : Float(512, strides=[1], requires_grad=0, device=cpu)): %input.4 : Float(1, 64, 112, 112, strides=[802816, 12544, 112, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[7, 7], pads=[3, 3, 3, 3], strides=[2, 2], onnx_name="Conv_0"](%input.1, %onnx::Conv_193, %onnx::Conv_194) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::MaxPool_125 : Float(1, 64, 112, 112, strides=[802816, 12544, 112, 1], requires_grad=1, device=cpu) = onnx::Relu[onnx_name="Relu_1"](%input.4) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\functional.py:1455:0 %input.8 : Float(1, 64, 56, 56, strides=[200704, 3136, 56, 1], requires_grad=1, device=cpu) = onnx::MaxPool[ceil_mode=0, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[2, 2], onnx_name="MaxPool_2"](%onnx::MaxPool_125) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\functional.py:782:0 %input.16 : Float(1, 64, 56, 56, strides=[200704, 3136, 56, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1], onnx_name="Conv_3"](%input.8, %onnx::Conv_196, %onnx::Conv_197) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::Conv_129 : Float(1, 64, 56, 56, strides=[200704, 3136, 56, 1], requires_grad=1, device=cpu) = onnx::Relu[onnx_name="Relu_4"](%input.16) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\functional.py:1455:0 %onnx::Add_198 : Float(1, 64, 56, 56, strides=[200704, 3136, 56, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1], onnx_name="Conv_5"](%onnx::Conv_129, %onnx::Conv_199, %onnx::Conv_200) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::Relu_132 : Float(1, 64, 56, 56, strides=[200704, 3136, 56, 1], requires_grad=1, device=cpu) = onnx::Add[onnx_name="Add_6"](%onnx::Add_198, %input.8) # D:\anaconda3\envs\ystorch\lib\site-packages\torchvision\models\resnet.py:102:0 %input.24 : Float(1, 64, 56, 56, strides=[200704, 3136, 56, 1], requires_grad=1, device=cpu) = onnx::Relu[onnx_name="Relu_7"](%onnx::Relu_132) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\functional.py:1455:0 %input.32 : Float(1, 64, 56, 56, strides=[200704, 3136, 56, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1], onnx_name="Conv_8"](%input.24, %onnx::Conv_202, %onnx::Conv_203) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::Conv_136 : Float(1, 64, 56, 56, strides=[200704, 3136, 56, 1], requires_grad=1, device=cpu) = onnx::Relu[onnx_name="Relu_9"](%input.32) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\functional.py:1455:0 %onnx::Add_204 : Float(1, 64, 56, 56, strides=[200704, 3136, 56, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1], onnx_name="Conv_10"](%onnx::Conv_136, %onnx::Conv_205, %onnx::Conv_206) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::Relu_139 : Float(1, 64, 56, 56, strides=[200704, 3136, 56, 1], requires_grad=1, device=cpu) = onnx::Add[onnx_name="Add_11"](%onnx::Add_204, %input.24) # D:\anaconda3\envs\ystorch\lib\site-packages\torchvision\models\resnet.py:102:0 %input.40 : Float(1, 64, 56, 56, strides=[200704, 3136, 56, 1], requires_grad=1, device=cpu) = onnx::Relu[onnx_name="Relu_12"](%onnx::Relu_139) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\functional.py:1455:0 %input.48 : Float(1, 128, 28, 28, strides=[100352, 784, 28, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[2, 2], onnx_name="Conv_13"](%input.40, %onnx::Conv_208, %onnx::Conv_209) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::Conv_143 : Float(1, 128, 28, 28, strides=[100352, 784, 28, 1], requires_grad=1, device=cpu) = onnx::Relu[onnx_name="Relu_14"](%input.48) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\functional.py:1455:0 %onnx::Add_210 : Float(1, 128, 28, 28, strides=[100352, 784, 28, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1], onnx_name="Conv_15"](%onnx::Conv_143, %onnx::Conv_211, %onnx::Conv_212) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::Add_213 : Float(1, 128, 28, 28, strides=[100352, 784, 28, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[2, 2], onnx_name="Conv_16"](%input.40, %onnx::Conv_214, %onnx::Conv_215) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::Relu_148 : Float(1, 128, 28, 28, strides=[100352, 784, 28, 1], requires_grad=1, device=cpu) = onnx::Add[onnx_name="Add_17"](%onnx::Add_210, %onnx::Add_213) # D:\anaconda3\envs\ystorch\lib\site-packages\torchvision\models\resnet.py:102:0 %input.60 : Float(1, 128, 28, 28, strides=[100352, 784, 28, 1], requires_grad=1, device=cpu) = onnx::Relu[onnx_name="Relu_18"](%onnx::Relu_148) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\functional.py:1455:0 %input.68 : Float(1, 128, 28, 28, strides=[100352, 784, 28, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1], onnx_name="Conv_19"](%input.60, %onnx::Conv_217, %onnx::Conv_218) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::Conv_152 : Float(1, 128, 28, 28, strides=[100352, 784, 28, 1], requires_grad=1, device=cpu) = onnx::Relu[onnx_name="Relu_20"](%input.68) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\functional.py:1455:0 %onnx::Add_219 : Float(1, 128, 28, 28, strides=[100352, 784, 28, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1], onnx_name="Conv_21"](%onnx::Conv_152, %onnx::Conv_220, %onnx::Conv_221) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::Relu_155 : Float(1, 128, 28, 28, strides=[100352, 784, 28, 1], requires_grad=1, device=cpu) = onnx::Add[onnx_name="Add_22"](%onnx::Add_219, %input.60) # D:\anaconda3\envs\ystorch\lib\site-packages\torchvision\models\resnet.py:102:0 %input.76 : Float(1, 128, 28, 28, strides=[100352, 784, 28, 1], requires_grad=1, device=cpu) = onnx::Relu[onnx_name="Relu_23"](%onnx::Relu_155) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\functional.py:1455:0 %input.84 : Float(1, 256, 14, 14, strides=[50176, 196, 14, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[2, 2], onnx_name="Conv_24"](%input.76, %onnx::Conv_223, %onnx::Conv_224) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::Conv_159 : Float(1, 256, 14, 14, strides=[50176, 196, 14, 1], requires_grad=1, device=cpu) = onnx::Relu[onnx_name="Relu_25"](%input.84) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\functional.py:1455:0 %onnx::Add_225 : Float(1, 256, 14, 14, strides=[50176, 196, 14, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1], onnx_name="Conv_26"](%onnx::Conv_159, %onnx::Conv_226, %onnx::Conv_227) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::Add_228 : Float(1, 256, 14, 14, strides=[50176, 196, 14, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[2, 2], onnx_name="Conv_27"](%input.76, %onnx::Conv_229, %onnx::Conv_230) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::Relu_164 : Float(1, 256, 14, 14, strides=[50176, 196, 14, 1], requires_grad=1, device=cpu) = onnx::Add[onnx_name="Add_28"](%onnx::Add_225, %onnx::Add_228) # D:\anaconda3\envs\ystorch\lib\site-packages\torchvision\models\resnet.py:102:0 %input.96 : Float(1, 256, 14, 14, strides=[50176, 196, 14, 1], requires_grad=1, device=cpu) = onnx::Relu[onnx_name="Relu_29"](%onnx::Relu_164) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\functional.py:1455:0 %input.104 : Float(1, 256, 14, 14, strides=[50176, 196, 14, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1], onnx_name="Conv_30"](%input.96, %onnx::Conv_232, %onnx::Conv_233) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::Conv_168 : Float(1, 256, 14, 14, strides=[50176, 196, 14, 1], requires_grad=1, device=cpu) = onnx::Relu[onnx_name="Relu_31"](%input.104) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\functional.py:1455:0 %onnx::Add_234 : Float(1, 256, 14, 14, strides=[50176, 196, 14, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1], onnx_name="Conv_32"](%onnx::Conv_168, %onnx::Conv_235, %onnx::Conv_236) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::Relu_171 : Float(1, 256, 14, 14, strides=[50176, 196, 14, 1], requires_grad=1, device=cpu) = onnx::Add[onnx_name="Add_33"](%onnx::Add_234, %input.96) # D:\anaconda3\envs\ystorch\lib\site-packages\torchvision\models\resnet.py:102:0 %input.112 : Float(1, 256, 14, 14, strides=[50176, 196, 14, 1], requires_grad=1, device=cpu) = onnx::Relu[onnx_name="Relu_34"](%onnx::Relu_171) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\functional.py:1455:0 %input.120 : Float(1, 512, 7, 7, strides=[25088, 49, 7, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[2, 2], onnx_name="Conv_35"](%input.112, %onnx::Conv_238, %onnx::Conv_239) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::Conv_175 : Float(1, 512, 7, 7, strides=[25088, 49, 7, 1], requires_grad=1, device=cpu) = onnx::Relu[onnx_name="Relu_36"](%input.120) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\functional.py:1455:0 %onnx::Add_240 : Float(1, 512, 7, 7, strides=[25088, 49, 7, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1], onnx_name="Conv_37"](%onnx::Conv_175, %onnx::Conv_241, %onnx::Conv_242) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::Add_243 : Float(1, 512, 7, 7, strides=[25088, 49, 7, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[2, 2], onnx_name="Conv_38"](%input.112, %onnx::Conv_244, %onnx::Conv_245) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::Relu_180 : Float(1, 512, 7, 7, strides=[25088, 49, 7, 1], requires_grad=1, device=cpu) = onnx::Add[onnx_name="Add_39"](%onnx::Add_240, %onnx::Add_243) # D:\anaconda3\envs\ystorch\lib\site-packages\torchvision\models\resnet.py:102:0 %input.132 : Float(1, 512, 7, 7, strides=[25088, 49, 7, 1], requires_grad=1, device=cpu) = onnx::Relu[onnx_name="Relu_40"](%onnx::Relu_180) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\functional.py:1455:0 %input.140 : Float(1, 512, 7, 7, strides=[25088, 49, 7, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1], onnx_name="Conv_41"](%input.132, %onnx::Conv_247, %onnx::Conv_248) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::Conv_184 : Float(1, 512, 7, 7, strides=[25088, 49, 7, 1], requires_grad=1, device=cpu) = onnx::Relu[onnx_name="Relu_42"](%input.140) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\functional.py:1455:0 %onnx::Add_249 : Float(1, 512, 7, 7, strides=[25088, 49, 7, 1], requires_grad=1, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1], onnx_name="Conv_43"](%onnx::Conv_184, %onnx::Conv_250, %onnx::Conv_251) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\conv.py:453:0 %onnx::Relu_187 : Float(1, 512, 7, 7, strides=[25088, 49, 7, 1], requires_grad=1, device=cpu) = onnx::Add[onnx_name="Add_44"](%onnx::Add_249, %input.132) # D:\anaconda3\envs\ystorch\lib\site-packages\torchvision\models\resnet.py:102:0 %input.148 : Float(1, 512, 7, 7, strides=[25088, 49, 7, 1], requires_grad=1, device=cpu) = onnx::Relu[onnx_name="Relu_45"](%onnx::Relu_187) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\functional.py:1455:0 %onnx::Flatten_189 : Float(1, 512, 1, 1, strides=[512, 1, 1, 1], requires_grad=1, device=cpu) = onnx::GlobalAveragePool[onnx_name="GlobalAveragePool_46"](%input.148) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\functional.py:1214:0 %onnx::Gemm_190 : Float(1, 512, strides=[512, 1], requires_grad=1, device=cpu) = onnx::Flatten[axis=1, onnx_name="Flatten_47"](%onnx::Flatten_189) # D:\anaconda3\envs\ystorch\lib\site-packages\torchvision\models\resnet.py:279:0 %191 : Float(1, 1000, strides=[1000, 1], requires_grad=1, device=cpu) = onnx::Gemm[alpha=1., beta=1., transB=1, onnx_name="Gemm_48"](%onnx::Gemm_190, %fc.weight, %fc.bias) # D:\anaconda3\envs\ystorch\lib\site-packages\torch\nn\modules\linear.py:114:0 return (%191)
|
|