nndocs:ata-over-ethernet
Differences
This shows you the differences between two versions of the page.
nndocs:ata-over-ethernet [2024/03/10 20:17] – [Perfect vs. Good: A Fight to the Death] add info about the problem with using aoe naptastic | nndocs:ata-over-ethernet [2024/08/23 16:02] (current) – remove bug; no longer able to reproduce. naptastic | ||
---|---|---|---|
Line 3: | Line 3: | ||
====Perfect vs. Good: A Fight to the Death==== | ====Perfect vs. Good: A Fight to the Death==== | ||
- | ---- | ||
===Preface (maybe doesn' | ===Preface (maybe doesn' | ||
A few months ago I had to move suddenly and put my lab into storage. Where I moved, there was basic WiFi, and nowhere to set up a desktop. My web services were offline for weeks and I got pretty discouraged. Now I've got an opportunity to set it all up again, and since enough people have expressed interest, I'm going to document and publish the whole process, or try anyway. | A few months ago I had to move suddenly and put my lab into storage. Where I moved, there was basic WiFi, and nowhere to set up a desktop. My web services were offline for weeks and I got pretty discouraged. Now I've got an opportunity to set it all up again, and since enough people have expressed interest, I'm going to document and publish the whole process, or try anyway. | ||
- | Follow-through | + | The first set of videos |
- | The first set of videos is going to be details on how my SAN is set up, along with a comparison of all the things I've tried. The format consists of a description of each technology, when I do and don't use it and why, and then a little bit of actual how-to in case that technology appeals to you. I hope any instruction I provide is helpful. | + | ===Introduction |
- | + | You will almost certainly never see ATA over Ethernet used in production. It was used in a few SAN products but eventually lost out to iSCSI and Fibre Channel. I'm covering it anyway, and covering it first, mainly because it's a good teaching tool. It's easy to get started, and easy to show off different concepts that will become relevant with the more popular technologies. It' | |
- | ---- | + | |
- | + | ||
- | You will almost certainly never see ATA over Ethernet used in production. It was used in a few SAN products but eventually lost out to iSCSI and Fibre Channel. I'm covering it anyway, and first mainly because it's a good teaching tool. It's easy to get started, and easy to show off different concepts that will become relevant with the more popular technologies. It's a really handy tool to have in your toolbox for moving data if all you have is Ethernet. | + | |
- | + | ||
- | Right now, it has a bug that can cause systems on the network not to shut down or reboot if there' | + | |
- | + | ||
- | rmmod aoe | + | |
- | + | ||
- | If my testing is right, the only things necessary for a host to crash on shutdown are (1) there is an ATA-over-Ethernet device in a broadcast domain your host is part of, and (2) the aoe module is loaded. | + | |
For full support (initiator and target) you just need two packages: | For full support (initiator and target) you just need two packages: | ||
- | + | | |
- | | + | |
To export a block device to the network, you use a program called vblade. A daemonized version, vbladed, works with the same options. It starts a server that listens on layer 2 for ATA commands and responds to them. Here is (basically) how you use vblade: | To export a block device to the network, you use a program called vblade. A daemonized version, vbladed, works with the same options. It starts a server that listens on layer 2 for ATA commands and responds to them. Here is (basically) how you use vblade: | ||
- | + | | |
- | | + | |
Other options include sharing only part of a file, SYNC and DIRECT I/O modes, and buffer count. I/O modes and buffer counts require testing. Partial file sharing is there so the operator can logically divide a disk or file but in my opinion that's a bad enough idea I'm not even going to try it. Splitting a device for export is a concern that belongs to a filesystem, or a controller, or something that provides thin provisioning behind a strong layer of abstraction, | Other options include sharing only part of a file, SYNC and DIRECT I/O modes, and buffer count. I/O modes and buffer counts require testing. Partial file sharing is there so the operator can logically divide a disk or file but in my opinion that's a bad enough idea I'm not even going to try it. Splitting a device for export is a concern that belongs to a filesystem, or a controller, or something that provides thin provisioning behind a strong layer of abstraction, | ||
Line 34: | Line 22: | ||
On BTRFS, if a directory has the +C attribute, you can preallocate a file of a given size and (is it contiguous? | On BTRFS, if a directory has the +C attribute, you can preallocate a file of a given size and (is it contiguous? | ||
- | ATA over Ethernet organizes disks by " | + | ATA over Ethernet organizes disks by " |
+ | * Shelf can be any value from 0-65534 except 4095. | ||
+ | * Slot can be any value from 0-254. | ||
+ | * Ethdev must be an **Ethernet** device. Bridges, VLAN interfaces, and VXLAN tunnels are all as good as gigabit Ethernet. | ||
+ | |||
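The shelf and slot ranges listed above are easy to get wrong when scripting exports, so here is a minimal sketch of a sanity check. The function name `aoe_addr_ok` is made up for this page; it is not part of vblade or aoetools, and it only encodes the ranges stated in the list.

```shell
# Hypothetical helper (not part of vblade or aoetools): returns 0 if the
# shelf/slot pair falls inside the ranges listed above, non-zero otherwise.
aoe_addr_ok() {
    shelf=$1
    slot=$2
    # Reject anything that is empty or not a plain non-negative integer.
    case $shelf in ''|*[!0-9]*) return 1 ;; esac
    case $slot in ''|*[!0-9]*) return 1 ;; esac
    # Shelf: 0-65534 except 4095. Slot: 0-254.
    [ "$shelf" -le 65534 ] && [ "$shelf" -ne 4095 ] && [ "$slot" -le 254 ]
}

# Example use before starting an export:
if aoe_addr_ok 10 3; then
    echo "shelf 10, slot 3 is usable"
fi
```

It does not check the Ethernet device; that part still has to be verified by hand.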
+ | Unfortunately, | ||
So, if you want to export a raw VM image from your current directory, you'd do this: | So, if you want to export a raw VM image from your current directory, you'd do this: | ||
- | + | | |
- | | + | |
...then on the initiator machine, run aoe-discover, | ...then on the initiator machine, run aoe-discover, | ||
- | Once the remote device is in /dev, you can use it like any other device. If it has partitions, Linux will find them automatically. | + | Once the remote device is in /dev, you can use it almost |
- | ---- | + | The one downside is that, for reasons I haven' |
+ | ===Aside: Why " | ||
Difference between server/ | Difference between server/ | ||
- server/ | - server/ | ||
- | - high performance, | + | - target/ |
- | - requires low latency and delivery | + | - storage protocols require guarantees about reliability, |
- | - Most importantly: | + | - Most importantly: |
+ | |||
+ | ===Multipath=== | ||
+ | ATA over Ethernet supports multipath natively and automatically. If AoE discovers a new link to the same slot and shelf on a different Ethernet interface, it will start sending commands and responses on both links round-robin, | ||
Section 1.1 of the ATA over Ethernet standard: | Section 1.1 of the ATA over Ethernet standard: | ||
Line 57: | Line 53: | ||
https:// | https:// | ||
- | When something goes wrong such as a link disappearing, | + | When something goes wrong such as a link disappearing, |
- | ---- | + | ===Persistent Configuration=== |
+ | vblade and vbladed do not maintain state between or across instances. If you need an ATA over Ethernet export to come back after a reboot, you will need your OS to manage vblade processes. On Debian, that's done by putting shell script fragments in / | ||
- | So I glossed over security and VLANs earlier. ATA over Ethernet | + | # This is a POSIX shell fragment |
+ | |||
+ | # configuration | ||
+ | |||
+ | # Supported variables: | ||
+ | |||
+ | # shelf address. Mandatory | ||
+ | # shelf= | ||
+ | |||
+ | # slot address. Mandatory | ||
+ | # slot= | ||
+ | |||
+ | # Network interface name. Mandatory | ||
+ | # netif= | ||
+ | |||
+ | # The name of the regular file or block device to export. Mandatory | ||
+ | # filename= | ||
+ | |||
+ | # Other options, see vblade(8) | ||
+ | # options= | ||
+ | |||
+ | # ionice= | ||
+ | # Set the I/O scheduling class and priority. | ||
+ | # Must be understood by ionice(1) | ||
+ | |||
+ | # Example: | ||
+ | # shelf=10 | ||
+ | # slot=3 | ||
+ | # netif=em3 | ||
+ | # filename=/ | ||
+ | # options=' | ||
+ | # ionice=' | ||
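To make the relationship between those variables and the vblade command line concrete, here is a rough sketch that sources one fragment and prints the command it describes. This is an illustration only, not how Debian's own scripts do it: the function name `vblade_cmd_from_conf` and the example file path are invented here, the `ionice` variable is ignored, and nothing is actually started. It assumes the usual vblade argument order of options first, then shelf, slot, interface, and file.

```shell
# Hypothetical helper: source one config fragment and print the vblade
# command line it describes. Nothing is executed.
# Pass a path containing a slash; a bare name would be searched in $PATH.
vblade_cmd_from_conf() {
    (
        # Run in a subshell so the fragment cannot leak variables.
        shelf= slot= netif= filename= options=
        . "$1" || exit 1
        # These four are marked Mandatory in the template above.
        for v in shelf slot netif filename; do
            eval "val=\$$v"
            if [ -z "$val" ]; then
                echo "$1: $v is not set" >&2
                exit 1
            fi
        done
        # $options is left unquoted on purpose so it splits into words.
        echo vblade $options "$shelf" "$slot" "$netif" "$filename"
    )
}

# Example with a throwaway fragment (the image path is a placeholder):
conf=$(mktemp)
printf '%s\n' 'shelf=10' 'slot=3' 'netif=em3' 'filename=/tmp/example.img' > "$conf"
vblade_cmd_from_conf "$conf"
rm -f "$conf"
```

Printing the command instead of running it makes it easy to eyeball what a fragment will do before handing it to the init system.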
- | ---- | + | ===Security=== |
+ | ATA over Ethernet is intended to run inside of trusted networks. By default, it runs wide open: any host in the same layer 2 broadcast domain has full access to any exported volume. There is no distinction between read-only and read-write access. Preventing unwanted access has to be done by dividing broadcast domains. Originally that meant physical separation--different network adapters, cables, and switches. Now, that separation is more likely to be implemented inside the switch using VLANs or VXLAN tunnels. | ||
- | AoE can also restrict access | + | SAN technologies generally have some kind of ACL mechanism. This has benefits for security and discoverability. As a configuration or command-line option, vblade |
As you put these values into these configuration files, imagine that you are actually plugging different hard drives into different computers. It's not about moving data to a different drive anymore; it's about moving the drive to where the user needs it to be, and doing so in a completely virtual way. | As you put these values into these configuration files, imagine that you are actually plugging different hard drives into different computers. It's not about moving data to a different drive anymore; it's about moving the drive to where the user needs it to be, and doing so in a completely virtual way. | ||
- | -> ACLs | + | ===Boot=== |
- | -> VLANs | + | And that brings us neatly to maybe the most useful thing about a SAN: It makes local storage unnecessary. iPXE supports ATA over Ethernet directly. The DHCP server has to provide a suitable root-path option. For isc-dhcp-server, telling a host to boot from shelf 12, slot 9 looks like this: |
- | ---- | + | option root-path " |
- | ATA over Ethernet supports multipath natively | + | The DHCP server must not also provide a TFTP next-server |
+ | FIXME As far as I can tell, there' |
nndocs/ata-over-ethernet.1710101821.txt.gz · Last modified: 2024/03/10 20:17 by naptastic