nndocs:infiniband
Differences
This shows you the differences between two versions of the page.
nndocs:infiniband [2024/03/26 18:15] – [Partitions: Host configuration] add example commands naptastic | nndocs:infiniband [2025/01/21 14:38] (current) – [Networking] correct a thing naptastic | ||
---|---|---|---|
Line 7: | Line 7: | ||
For hardware support, Mellanox provides MLNX_OFED, an overlay for several distributions. Unfortunately, | For hardware support, Mellanox provides MLNX_OFED, an overlay for several distributions. Unfortunately, | ||
- | ^ MLNX_OFED version | + | ^ Version |
| Inbox | | All | All | 3.3.23-2 | | | Inbox | | All | All | 3.3.23-2 | | ||
- | | 4.9-x | ConnectX-2 | ≤ 11 | ≤ 20.04 | 5.7.2 | | + | | MLNX_OFED |
- | | 5.8-x | ConnectX-4 | ≥ 9 | ≥ 18.04 | 5.17.0 | | + | | MLNX_OFED |
===How I'm Getting Around It=== | ===How I'm Getting Around It=== | ||
Line 29: | Line 29: | ||
====The MLNX part==== | ====The MLNX part==== | ||
- | It's worth investigating other tools provided with MLNX_OFED to see if they offer compelling advantages over inbox versions. I'm not doing that right now because I suspect | + | Old OpenSM has this annoying problem where, |
- | MLNX_OFED_LINUX-4.9-7.1.0.0-ubuntu20.04-x86_64/ | + | MLNX_OFED_LINUX-24.07-0.6.1.0-debian12.5-x86_64/ |
- | + | ||
- | Newer versions of MLNX_OFED have newer versions of OpenSM. I haven' | + | |
There' | There' | ||
- | # dpkg -i ibdump_6.0.0-1.49710_amd64.deb | + | # dpkg -i ibdump_6.0.0-1.2407061_amd64.deb |
=====The Subnet Manager: OpenSM===== | =====The Subnet Manager: OpenSM===== | ||
Line 80: | Line 78: | ||
Here's a block for my ATA over Ethernet experiments. Subject to change. IP addresses are necessary for setting up VXLAN tunnels. Checking if IPv6 tunnels perform differently from IPv4 tunnels is on the to-do list. I suspect they perform better. Needs testing. | Here's a block for my ATA over Ethernet experiments. Subject to change. IP addresses are necessary for setting up VXLAN tunnels. Checking if IPv6 tunnels perform differently from IPv4 tunnels is on the to-do list. I suspect they perform better. Needs testing. | ||
- | | + | |
mgid=ff12: | mgid=ff12: | ||
mgid=ff12: | mgid=ff12: | ||
Line 87: | Line 85: | ||
====Partitions: | ====Partitions: | ||
There' | There' | ||
- | # echo 0xb129 | + | # echo 0xb128 |
- | + | ||
- | The sysfs interface for deleting child interfaces doesn' | + | |
- | # ip link del ib0.b129 | + | |
Resist the temptation to rename the interface to something descriptive. **It's already self-descriptive**. Creative naming is for VXLAN tunnels and bridges, e.g.: | Resist the temptation to rename the interface to something descriptive. **It's already self-descriptive**. Creative naming is for VXLAN tunnels and bridges, e.g.: | ||
- | # ip link add vx129 type vxlan id 129 local 172.20.129.9 group 225.172.20.129 | + | # ip link add vx128 type vxlan id 128 local 172.20.128.13 group 225.172.20.128 |
- | # ip link set master | + | # ip link set master |
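The naming convention in the example above (interface `vx128`, VXLAN id 128, multicast group `225.172.20.128`) can be derived from a single network number. This is a minimal sketch, assuming bash. The helper names `vxlan_name`, `vxlan_group`, and `vxlan_id_ok` are hypothetical and not part of iproute2, and the `225.172.20.` prefix is copied from the example rather than being any kind of standard. It only prints the derived values and does not create anything.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: derive VXLAN names from a network number.

# Interface name: "vx" followed by the network number, e.g. vx128.
vxlan_name() {
    printf 'vx%d\n' "$1"
}

# Multicast group: prefix copied from the example above,
# last octet = network number (so this only makes sense for 0-255).
vxlan_group() {
    printf '225.172.20.%d\n' "$1"
}

# A VXLAN id is 24 bits wide: 0 through 16777215 inclusive.
vxlan_id_ok() {
    [ "$1" -ge 0 ] && [ "$1" -le 16777215 ]
}

net=128
if vxlan_id_ok "$net"; then
    echo "name=$(vxlan_name "$net") id=$net group=$(vxlan_group "$net")"
fi
```

Keeping the id, the interface name, and the group's last octet identical means any one of them is enough to identify the network at a glance.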
- | If you unset the high bit on the partition number (0x3129 | + | The sysfs interface for deleting child interfaces doesn' |
+ | # ip link del ib0.b128 | ||
+ | |||
+ | If you unset the high bit on the partition number (0x3128 | ||
It's worth finding out if Netplan can manage IB child interfaces. | It's worth finding out if Netplan can manage IB child interfaces. | ||
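For the partition numbers used in this section: in InfiniBand the top bit of the 16-bit P_Key is the membership bit (set means full member), which is why 0xb128 and 0x3128 refer to the same partition. A minimal bash sketch of that arithmetic follows. The helper names `pkey_is_full`, `pkey_base`, and `child_name` are hypothetical, and the naming rule in `child_name` is inferred from the `ib0.b128` example above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for reasoning about 16-bit P_Key values.

# Succeeds if the membership (top) bit, 0x8000, is set: full member.
pkey_is_full() {
    [ $(( $1 & 0x8000 )) -ne 0 ]
}

# Clear the membership bit, leaving the 15-bit partition number.
pkey_base() {
    printf '0x%04x\n' $(( $1 & 0x7fff ))
}

# Child interface name as seen above: parent, a dot,
# then the P_Key in hex without the "0x" prefix.
child_name() {
    printf '%s.%04x\n' "$1" $(( $2 ))
}

pkey=0xb128
pkey_is_full "$pkey" && echo "$pkey is a full-member key"
echo "base: $(pkey_base "$pkey")"
echo "child: $(child_name ib0 "$pkey")"
```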
+ | |||
====Connected vs. Datagram==== | ====Connected vs. Datagram==== | ||
IPoIB can run in one of three modes: | IPoIB can run in one of three modes: | ||
Line 121: | Line 120: | ||
Since I don't have any newer hardware, I don't have any information about Enhanced IPoIB. | Since I don't have any newer hardware, I don't have any information about Enhanced IPoIB. | ||
+ | |||
=====SR-IOV===== | =====SR-IOV===== | ||
====Hardware Settings==== | ====Hardware Settings==== | ||
Line 131: | Line 131: | ||
SRIOV_EN | SRIOV_EN | ||
- | FPP_EN (Flow Priority something | + | FPP_EN (Function Per Port ENable) controls whether the card appears as two PCI devices, or as a single device with two ports. Under mlx4, every VF on a dual-port HCA has both ports, and NUM_OF_VFS sets how many dual-port devices to create. Under mlx5, each port gets its own pool of VFs and NUM_OF_VFS is per-port.
I haven' | I haven' | ||
Line 182: | Line 182: | ||
GUIDs need to be set before attaching a VF to a VM. It should be possible to change state (simulating unplugging the cable) while a VM is using a VF but I haven' | GUIDs need to be set before attaching a VF to a VM. It should be possible to change state (simulating unplugging the cable) while a VM is using a VF but I haven' | ||
- | Lazy copy-pasta for southpark. Note this only sets up VFs for port 1, but that's the only port plugged | + | Configuration is managed |
- | ip link set dev ib0 vf 0 node_guid 58: | + | |
- | ip link set dev ib0 vf 0 port_guid 58: | + | |
- | ip link set dev ib0 vf 0 state enable | + | |
- | ip link set dev ib0 vf 1 node_guid 58: | + | |
- | ip link set dev ib0 vf 1 port_guid 58: | + | |
- | ip link set dev ib0 vf 1 state enable | + | |
- | ip link set dev ib0 vf 2 node_guid 58: | + | |
- | ip link set dev ib0 vf 2 port_guid 58: | + | |
- | ip link set dev ib0 vf 2 state enable | + | |
- | ip link set dev ib0 vf 3 node_guid 58: | + | |
- | ip link set dev ib0 vf 3 port_guid 58: | + | |
- | ip link set dev ib0 vf 3 state enable | + | |
- | ip link set dev ib0 vf 4 node_guid 58: | + | |
- | ip link set dev ib0 vf 4 port_guid 58: | + | |
- | ip link set dev ib0 vf 4 state enable | + | |
- | ip link set dev ib0 vf 5 node_guid 58: | + | |
- | ip link set dev ib0 vf 5 port_guid 58: | + | |
- | ip link set dev ib0 vf 5 state enable | + | |
- | ip link set dev ib0 vf 6 node_guid 58: | + | |
- | ip link set dev ib0 vf 6 port_guid 58: | + | |
- | ip link set dev ib0 vf 6 state enable | + | |
- | + | ||
- | Lazy copy-pasta for sadness: | + | |
- | ip link set dev ib0 vf 0 node_guid 58: | + | |
- | ip link set dev ib0 vf 0 port_guid 58: | + | |
- | ip link set dev ib0 vf 0 state enable | + | |
- | ip link set dev ib0 vf 1 node_guid 58: | + | |
- | ip link set dev ib0 vf 1 port_guid 58: | + | |
- | ip link set dev ib0 vf 1 state enable | + | |
- | ip link set dev ib0 vf 2 node_guid 58: | + | |
- | ip link set dev ib0 vf 2 port_guid 58: | + | |
- | ip link set dev ib0 vf 2 state enable | + | |
- | ip link set dev ib0 vf 3 node_guid 58: | + | |
- | ip link set dev ib0 vf 3 port_guid 58: | + | |
- | ip link set dev ib0 vf 3 state enable | + | |
- | ip link set dev ib0 vf 4 node_guid 58: | + | |
- | ip link set dev ib0 vf 4 port_guid 58: | + | |
- | ip link set dev ib0 vf 4 state enable | + | |
- | ip link set dev ib0 vf 5 node_guid 58: | + | |
- | ip link set dev ib0 vf 5 port_guid 58: | + | |
- | ip link set dev ib0 vf 5 state enable | + | |
- | ip link set dev ib0 vf 6 node_guid 58: | + | |
- | ip link set dev ib0 vf 6 port_guid 58: | + | |
- | ip link set dev ib0 vf 6 state enable | + | |
- | + | ||
- | Lazy (and incomplete) copy-pasta for shark: | + | |
- | ip link set dev ib0 vf 0 node_guid 58: | + | |
- | ip link set dev ib0 vf 0 port_guid 58: | + | |
- | ip link set dev ib0 vf 0 state enable | + | |
- | + | ||
- | These should really go on their own page. Or better yet, figure out how to configure them on the host! | + | |
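The repeated per-VF commands above follow a fixed pattern (node GUID, port GUID, state), so they can be generated by a loop. This is a dry-run sketch, assuming bash: it only prints the commands. `GUID_PREFIX` and the way the last two bytes are chosen are made-up placeholders, not the GUIDs or the numbering scheme used on any of these hosts, and `vf_commands` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the per-VF setup commands instead of running them.
# GUID_PREFIX is a made-up placeholder, NOT a real GUID from any host.
GUID_PREFIX="02:00:00:00:00:00"

# Print the three commands for one VF.
# $1 = device, $2 = VF index. In this sketch the node and port GUIDs
# differ only in the second-to-last byte (an arbitrary choice).
vf_commands() {
    local dev=$1 vf=$2
    local idx
    idx=$(printf '%02x' "$vf")
    echo "ip link set dev $dev vf $vf node_guid $GUID_PREFIX:00:$idx"
    echo "ip link set dev $dev vf $vf port_guid $GUID_PREFIX:01:$idx"
    echo "ip link set dev $dev vf $vf state enable"
}

# VFs 0 through 6 on ib0, matching the lists above.
for vf in $(seq 0 6); do
    vf_commands ib0 "$vf"
done
```

Review the printed commands, with real GUIDs substituted, before running any of them as root.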
=====Upper-Layer Protocols (ULPs)===== | =====Upper-Layer Protocols (ULPs)===== | ||
RDMA opens all kinds of possibilities for RDMA-aware protocols to be amazing and fast. They probably all deserve their own pages. | RDMA opens all kinds of possibilities for RDMA-aware protocols to be amazing and fast. They probably all deserve their own pages. | ||
Line 260: | Line 208: | ||
====Networking==== | ====Networking==== | ||
===VXLAN=== | ===VXLAN=== | ||
- | VXLAN is not the only way to get an Ethernet device on Infiniband, but as far as I can tell it's the only decent one. Neither ConnectX-3 nor Connect-IB | + | VXLAN is not the only way to get an Ethernet device on Infiniband, but as far as I can tell it's the only decent one. None of my hardware |
* VXLAN id can be anything from 0-16777215 inclusive. I make it match the network number. | * VXLAN id can be anything from 0-16777215 inclusive. I make it match the network number. | ||
Line 288: | Line 236: | ||
I also want to throw audio frames around with "no latency added" | I also want to throw audio frames around with "no latency added" | ||
+ | |||
+ | =====GUIDs===== | ||
+ | * 5849560e59150301 - shark Connect-IB | ||
+ | * 5849560e53b70b01 - southpark Connect-IB | ||
+ | * 5849560e53660101 - duckling Connect-IB | ||
+ | * 7cfe900300a0a080 - uninstalled Connect-IB | ||
+ | * (there are several more uninstalled Connect-IB cards) | ||
+ | * f4521403002c18b0 - uninstalled ConnectX-3 2014-01-29 | ||
+ | * 0002c90300b37f10 - uninstalled ConnectX-3 with no date on the label | ||
+ | * 001175000079b560 - uninstalled qib | ||
+ | * 001175000079b856 - uninstalled qib | ||
+ |
nndocs/infiniband.1711476957.txt.gz · Last modified: 2024/03/26 18:15 by naptastic