Is it RDMA? Is it a modification of SR-IOV?
I’m having trouble finding out more about this, since the RDMA definition just says “remote access to device memory”, and I’d like to confirm whether that includes virtual instances of PCIe devices over the network.
Essentially, I’m looking for a way to share virtual instances of supported PCIe devices over IP. For example, if you have a GPU, you can create virtual slices of it with SR-IOV on KVM-based hypervisors. I’m looking for something that will take those slices and make them available over IP.
I have come across InfiniBand, QLogic, Mellanox, HP, IBM, RDMA support on Debian and all of that. I just need someone to ELI5 this to me so I know where and what to search, and whether what I want is even possible with FOSS.
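For anyone wanting to check the SR-IOV side locally first, here is a minimal sketch, assuming a Linux host that exposes the standard `sriov_totalvfs` and `sriov_numvfs` sysfs attributes. The function name `list_sriov_devices` and its `sysfs_root` parameter are made up for illustration. It only reports which PCI devices advertise virtual functions; it does nothing about exporting them over a network.

```python
from pathlib import Path


def list_sriov_devices(sysfs_root="/sys/bus/pci/devices"):
    """Return (pci_address, total_vfs, enabled_vfs) for each SR-IOV capable device.

    Hypothetical helper: it reads the sriov_totalvfs / sriov_numvfs files
    that Linux exposes for PCI devices supporting SR-IOV.
    """
    results = []
    root = Path(sysfs_root)
    if not root.is_dir():
        # Not a Linux host, or sysfs is not mounted at this path.
        return results
    for dev in sorted(root.iterdir()):
        total_file = dev / "sriov_totalvfs"
        if not total_file.is_file():
            # Device does not advertise SR-IOV.
            continue
        total = int(total_file.read_text().strip())
        num_file = dev / "sriov_numvfs"
        enabled = int(num_file.read_text().strip()) if num_file.is_file() else 0
        results.append((dev.name, total, enabled))
    return results


if __name__ == "__main__":
    for addr, total, enabled in list_sriov_devices():
        print(f"{addr}: {enabled}/{total} virtual functions enabled")
```

If the list comes back empty, either the hardware has no SR-IOV capability or it is disabled in firmware.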
I know that Nutanix allows one to serve PCIe hardware over IP on their hypervisor, but I plan to stick with FOSS as far as possible.
Thanks!
Edit: Please let me know what makes my post so hard to grasp. The answer was simply RoCE/iWARP. RDMA is definitely the underlying technology that offers access to the memory of the device whilst bypassing the kernel for good performance. Security considerations aside, this is a very good idea, since RoCE v2 runs over UDP/IP and iWARP runs over TCP/IP, making them routable.
Apologies if my post didn’t make the most sense; I tried to describe it the best I could. Thanks.
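To see whether a Debian box already has an RDMA-capable NIC registered (RoCE, iWARP or InfiniBand), something like the following may help. It is a rough sketch, assuming the kernel RDMA drivers are loaded and devices appear under `/sys/class/infiniband`. The function name `list_rdma_ports` is hypothetical. A `link_layer` of `Ethernet` suggests an Ethernet-based RDMA device such as a RoCE NIC, whereas `InfiniBand` indicates a native InfiniBand port.

```python
from pathlib import Path


def list_rdma_ports(sysfs_root="/sys/class/infiniband"):
    """Return (device, port, link_layer) for each RDMA port found in sysfs.

    Hypothetical helper: it only reads what the kernel exposes and does
    not open any RDMA connection.
    """
    ports = []
    root = Path(sysfs_root)
    if not root.is_dir():
        # No RDMA drivers loaded, or no RDMA hardware present.
        return ports
    for dev in sorted(root.iterdir()):
        ports_dir = dev / "ports"
        if not ports_dir.is_dir():
            continue
        for port in sorted(ports_dir.iterdir()):
            ll_file = port / "link_layer"
            link_layer = ll_file.read_text().strip() if ll_file.is_file() else "unknown"
            ports.append((dev.name, port.name, link_layer))
    return ports


if __name__ == "__main__":
    for dev, port, link_layer in list_rdma_ports():
        print(f"{dev} port {port}: link layer {link_layer}")
```

This only confirms that an RDMA device is present; it says nothing about whether a remote PCIe device or GPU slice can be reached through it.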
So it is RDMA.
Indeed, I have come across RoCE, and support seems to be quite active on Debian. I was looking at QLogic hardware for this, and whilst I know that firmware for such hardware is really difficult to find, I’m fine with just FOSS support on Debian.
I think I misunderstood what exactly you wanted. I don’t think you’re getting remote GPU passthrough to virtual machines over RoCE without an absolute fuckton of custom work. The only people who can probably do this are Google or Microsoft, and they probably just use proprietary Nvidia implementations.
Well, I’m not a systems engineer, so I probably don’t understand the scale of something like this.
With that said, is it really hard to slap TCP/IP on top of SR-IOV? That is literally what I wanted to know, and I thought RDMA could do that. Can it not?