So, I got persuaded to switch from a “server that is going to do everything” to “compute server + storage server”.
The two are connected via a DAC on an Intel X520 network card.
Compute is 10.0.0.1, Storage is 10.255.255.254, and I left the usable hosts in the middle for future expansion.
Before I start to use it, I’m wondering if I chose the right protocols to share data between them.
I set up NFS and iSCSI.
With iSCSI I create an image, share that image to the compute server, format it as btrfs, and use it as a native drive. The files are not accessible anywhere else.
With NFS I just mount the share, and the files can be accessed from another computer.
Speed:
I tried to time how long it takes to fill a dummy file with zeroes.
/iscsi# time sh -c "dd if=/dev/zero of=ddfile bs=8k count=250000 && sync"
250000+0 records in
250000+0 records out
2048000000 bytes (2.0 GB, 1.9 GiB) copied, 0.88393 s, 2.3 GB/s
real 0m2.796s
user 0m0.051s
sys 0m0.915s
/nfs# time sh -c "dd if=/dev/zero of=ddfile bs=8k count=250000 && sync"
250000+0 records in
250000+0 records out
2048000000 bytes (2.0 GB, 1.9 GiB) copied, 2.41414 s, 848 MB/s
real 0m3.539s
user 0m0.038s
sys 0m1.453s
/sata-smr-wd-green-drive-for-fun# time sh -c "dd if=/dev/zero of=ddfile bs=8k count=250000 && sync"
250000+0 records in
250000+0 records out
2048000000 bytes (2.0 GB, 1.9 GiB) copied, 10.1339 s, 202 MB/s
real 0m46.885s
user 0m0.132s
sys 0m2.423s
What I see from these results:
The slow SATA drive goes at 1.6 Gbit/s, but then for some reason the computer needs much more time to acknowledge the operation.
NFS transferred it at 6.8 Gbit/s, which is what I expected from an NVMe array. The same command run on the storage server gives a similar speed.
iSCSI transfers at 18.4 Gbit/s, which is not possible with my drives and the 10 Gbit/s link. It is probably using some native file system trickery to detect “it’s just a file full of zeroes, just tell the user it’s done”.
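One hedged way to read these numbers: dd prints its rate when it exits, before the sync that follows, so the dd figure can include data that is still sitting in the page cache. Dividing the byte count by the "real" time instead gives a more conservative estimate. Below is a minimal Python sketch using only the figures printed above. The dictionary keys and function names are made up for illustration, and since sync flushes everything dirty on the system (not just this file), the result is a rough estimate only.

```python
# Figures copied from the three dd runs above; "real" includes the trailing sync.
BYTES_WRITTEN = 2_048_000_000

runs = {
    "iscsi": {"dd_seconds": 0.88393, "real_seconds": 2.796},
    "nfs": {"dd_seconds": 2.41414, "real_seconds": 3.539},
    "sata": {"dd_seconds": 10.1339, "real_seconds": 46.885},
}


def gbit_per_s(num_bytes, seconds):
    """Convert a byte count and a duration into decimal gigabits per second."""
    return num_bytes * 8 / seconds / 1e9


def summarize(all_runs):
    """Return the dd-reported and sync-inclusive rate for every run."""
    result = {}
    for name, timing in all_runs.items():
        result[name] = {
            "dd_gbit": gbit_per_s(BYTES_WRITTEN, timing["dd_seconds"]),
            "synced_gbit": gbit_per_s(BYTES_WRITTEN, timing["real_seconds"]),
        }
    return result


if __name__ == "__main__":
    for name, rates in summarize(runs).items():
        print(f"{name:6s} dd-reported {rates['dd_gbit']:5.2f} Gbit/s, "
              f"including sync {rates['synced_gbit']:5.2f} Gbit/s")
```

With the sync included, the three runs come out at roughly 5.9, 4.6 and 0.35 Gbit/s, so the iSCSI figure drops below what the 10 Gbit/s link can carry.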
The biggest advantage of NFS is that I can share a whole directory and get direct access. Also, sharing another disk image via iSCSI requires a service restart, which means I have to take down the compute server.
But with iSCSI I am the owner of the disk, so I can do whatever I want. I don’t need to worry about permissions: I am root and can chown all the stuff.
So… after this long introduction and explanation, what protocol would you use for…:
- /var/lib/mysql - a database. Inside a disk image shared via iSCSI, or via NFS?
- Virtual machine images. Copy them inside another image that’s then shared via iSCSI? Maybe NFS is much better for this case. Otherwise, with iSCSI I would have a single giant disk image that contains other disk images…
- Lots of small files, like WordPress. Maybe NFS would add too much overhead? But it would be much easier to back up if it was an NFS share instead of a disk image.
For a DB, iSCSI all day long. At work we have some of the fastest storage available on the market today (SCM drives in front of super-fast E1.L SSDs, all connected with 100GbE Mellanox switches and NICs) and it can barely keep up with the MySQL DB. NFS would absolutely work for the other use cases. But these are enterprise-class systems for an entire university, so yeah, lots of load.
For a home system, I’m not sure it matters if the load isn’t much. I prefer iSCSI because I know how it works. I use it for VMware datastores even at home. For anything that is shared storage between web servers or the like, NFS is almost required, since shared iSCSI storage is very difficult to set up outside of something like VMware datastores, where the file system is set up for multiple writers. Other filesystems like Red Hat’s GFS2 or Oracle’s OCFS can do it, but it’s not cheap or easy.
TL;DR: my suggestion is iSCSI for the DB and NFS for the other stuff.
dd won’t test random writes, so it’s not a good test for DB performance. iSCSI is the best connection for file storage of VMs.
That’s a great question. Sounds like iscsi is less flexible but more performant, but potentially only in particular situations you may not encounter in a homelab, while nfs is more flexible and not as performant, depending on what you use.
From what I’ve learned just reading this thread, you should use iSCSI for the DB and VMs, and NFS for stuff like Linux ISOs and other shared media. That said, with iSCSI, I believe it’s possible to resize the disk pretty easily. For the DB, you can probably have it be its own little container with an iSCSI drive, and expose it over TCP for applications that need access.
As for your last question about WordPress, you could archive it before transfer and either store it as an archive or extract it after it reaches its destination. Would be a simple script.
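The "simple script" could look something like the Python sketch below, which packs a directory into one gzip-compressed tar file using the standard tarfile module. The function name and the example paths are hypothetical, and this is only one way to do it, not a tested backup procedure.

```python
import tarfile
from pathlib import Path


def archive_directory(src_dir, archive_path):
    """Pack src_dir into a gzip-compressed tar file and return the entry count."""
    src = Path(src_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname stores paths relative to the directory name, not absolute ones.
        tar.add(str(src), arcname=src.name)
    # Re-open the archive to confirm it is readable and count what was stored.
    with tarfile.open(archive_path, "r:gz") as tar:
        return len(tar.getnames())


# Example call (hypothetical paths, adjust to your own layout):
# archive_directory("/var/www/wordpress", "/mnt/nfs/backups/wordpress.tar.gz")
```

Keep the archive outside the directory being archived, otherwise the archive would try to include itself. Moving one large file also avoids the per-file overhead of copying thousands of small ones.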
I haven’t ever run an iSCSI setup, but…
I don’t know what your application is, but if you’re planning on running a MySQL database on this, I can imagine that a throughput test isn’t going to be representative of your performance, since latency may matter a lot and throughput not so much. You may want to specifically test that.
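A latency-oriented test could be sketched roughly like this in Python: write small blocks at random offsets and call fsync after each one, then look at the median and the tail instead of the overall throughput. The 16 KiB block size is picked to match InnoDB's default page size. The function names, the file size and the write count are arbitrary assumptions, and this is not a substitute for testing the real database workload.

```python
import os
import random
import statistics
import time


def fsync_latencies(path, writes=200, block=16 * 1024, file_size=64 * 1024 * 1024):
    """Time `writes` random-offset block writes, each followed by an fsync."""
    payload = os.urandom(block)  # random data, so all-zero shortcuts cannot help
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, file_size)
        slots = file_size // block
        samples = []
        for _ in range(writes):
            offset = random.randrange(slots) * block
            start = time.perf_counter()
            os.pwrite(fd, payload, offset)
            os.fsync(fd)  # block until the storage reports the write as durable
            samples.append(time.perf_counter() - start)
        return samples
    finally:
        os.close(fd)


def latency_summary(samples):
    """Return median, 99th percentile and maximum latency in milliseconds."""
    ordered = sorted(samples)
    p99_index = min(len(ordered) - 1, int(len(ordered) * 0.99))
    return {
        "median_ms": statistics.median(ordered) * 1000,
        "p99_ms": ordered[p99_index] * 1000,
        "max_ms": ordered[-1] * 1000,
    }


# Example call (hypothetical mount point):
# print(latency_summary(fsync_latencies("/nfs/latency-probe.bin")))
```

Running the same function against a file on the iSCSI-backed filesystem and on the NFS mount would give two sets of numbers that are directly comparable.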
ponders
I would guess that iSCSI probably exposes write barriers. That is, btrfs can say “all writes prior to this point must become durable before writes subsequent to this point”, without actually requiring that any data is committed to the disk at the time that the write barrier is issued.
But I believe that the Linux file API has a more-limited set of ways in which it can provide ordering without durability. There’s no fwritebarrier(), just fsync(), and that forces a change to become durable. Depending upon how MySQL works, that might have a significant impact on performance.
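To illustrate the point, here is a minimal Python sketch of how an application typically gets "A before B" ordering from user space: it writes A, waits for fsync to return, and only then writes B. The function name, file names and marker format are invented for the example, and a fully robust version would also fsync the containing directory, which is left out here.

```python
import os


def ordered_commit(data_path, marker_path, payload):
    """Make `payload` durable before writing the marker that refers to it."""
    with open(data_path, "wb") as data_file:
        data_file.write(payload)
        data_file.flush()             # push Python's buffer to the kernel
        os.fsync(data_file.fileno())  # wait until the kernel reports it durable

    # Only after the fsync above has returned is the marker written, so the
    # marker can never reach stable storage ahead of the data it describes.
    with open(marker_path, "wb") as marker_file:
        marker_file.write(b"committed %d bytes\n" % len(payload))
        marker_file.flush()
        os.fsync(marker_file.fileno())
```

Each fsync here is a full round trip to the storage, which is why the latency of the link and the backing device matters more for this pattern than raw throughput.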
Also, NFSv3, which I assume you are using, has behavior around locking and caching that differs from NFSv4 and I don’t know for sure how it will interact with something like MySQL, which may care a lot about precise write ordering behavior.
Disk images will also rely on write ordering to avoid corruption on power loss.
googles
Yeah.
https://dev.mysql.com/doc/refman/8.2/en/disk-issues.html
Using NFS with MySQL
You should be cautious when considering whether to use NFS with MySQL. Potential issues, which vary by operating system and NFS version, include the following:
- MySQL data and log files placed on NFS volumes becoming locked and unavailable for use. Locking issues may occur in cases where multiple instances of MySQL access the same data directory or where MySQL is shut down improperly, due to a power outage, for example. NFS version 4 addresses underlying locking issues with the introduction of advisory and lease-based locking. However, sharing a data directory among MySQL instances is not recommended.
- Data inconsistencies introduced due to messages received out of order or lost network traffic. To avoid this issue, use TCP with hard and intr mount options.
- Maximum file size limitations. NFS Version 2 clients can only access the lowest 2GB of a file (signed 32 bit offset). NFS Version 3 clients support larger files (up to 64 bit offsets). The maximum supported file size also depends on the local file system of the NFS server.
Using NFS within a professional SAN environment or other storage system tends to offer greater reliability than using NFS outside of such an environment. However, NFS within a SAN environment may be slower than directly attached or bus-attached non-rotational storage.
If you choose to use NFS, NFS Version 4 or later is recommended, as is testing your NFS setup thoroughly before deploying into a production environment.
That’s kind of hand-wavy, but it does reinforce my concern about sticking a MySQL database on the thing.
I don’t have an answer for you as to which to use – it’s been a while since I’ve worked on network filesystem stuff, and I’m kinda shaking loose rusty bits trying to recall this – but in general I would be a little concerned about data integrity of both disk images and MySQL databases stored over a network. One can build a system that does it correctly, but I would try to do what I can to research potential issues there.
I would also probably test your actual workload if you’re concerned about performance, because it may differ a lot from what a simple throughput test might suggest for those uses.
Yes, after more thought, the database is much better on iSCSI. I can just create a 10 GB image and share that, and get backups from daily ZFS snapshots.
MySQL over NFS can have serious latency issues. Some software also can’t lock files correctly over NFS, which will cause write latency or just full-blown errors.
iSCSI drops, however, can be really, really bad and cause full filesystem corruption. Backing up iSCSI volumes can also be tricky. Software will likely work better and feel happier, but underlying issues may be masked until they’re unfixable. (I had an iSCSI volume attached to VMware silently corrupt for months before it failed and lost the data, even though all scrubs/checksums were good until the very last moment.)
You can make your situation work with either technology; both are just as correct. You would get a touch more throughput on iSCSI, simply because write confirmation is drive-based rather than based on filesystem locks / the OS.
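A quick sanity check for the locking part could look like the Python sketch below, which tries to take and release a non-blocking exclusive lock on a file using the standard fcntl module. The function name and the example path are hypothetical. A passing result only shows that lock calls succeed on that mount, not that locking behaves correctly across several clients.

```python
import fcntl
import os


def can_lock(path):
    """Try to take, then release, a non-blocking exclusive lock on `path`."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            # Either another process holds the lock, or the mount refuses locking.
            return False
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True
    finally:
        os.close(fd)


# Example call (hypothetical mount point):
# print(can_lock("/nfs/lock-probe"))
```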
YMMV
Are we just gonna not talk about OP using 10/8? 😂
I mean, there is the whole 127/8 for localhost, kinda hard to beat that with crazy allocations. And OP still has the other /12 and /16 private networks available even if they refuse to further divide them.
What someone does with their 16,777,216 private IPv4 addresses is none of our business…
Now just connect all of that with dumb L2 switches and watch those broadcasts fly!
“future expansion” - if OP adds an average of 10 servers every day for the next ~4600 years they’ll run out of address space.
Just in time to move to IPv6!
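For anyone who wants to check the arithmetic, here is a small Python sketch using the standard ipaddress module. It assumes OP's netmask really is /8, which the original post never states, and the variable names are just for illustration.

```python
import ipaddress

# Assumption: the link is configured as 10.0.0.0/8 (not confirmed by OP).
net = ipaddress.ip_network("10.0.0.0/8")

total = net.num_addresses                  # every address in the block
usable = total - 2                         # minus network and broadcast addresses
first_host = net.network_address + 1       # lowest usable host address
last_host = net.broadcast_address - 1      # highest usable host address

servers_per_day = 10
years_to_fill = usable / servers_per_day / 365.25

if __name__ == "__main__":
    print(f"{total} addresses, {usable} usable hosts")
    print(f"first host {first_host}, last host {last_host}")
    print(f"about {years_to_fill:.0f} years at {servers_per_day} servers per day")
```

Under that assumption, 10.0.0.1 and 10.255.255.254 are exactly the first and last usable hosts of the block, and the "~4600 years" estimate holds up.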
For the SATA drive behavior, it’s probably finishing the writes from the buffer. I like to use the iotop utility to watch storage I/O activity on my systems. You could try running it on both systems to get a better picture of what’s going on.
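Alongside iotop, the amount of data still waiting in the buffer can be read from /proc/meminfo on Linux, which has Dirty and Writeback fields. A rough Python sketch follows. The function names are made up, and it only reports system-wide totals, not per-device figures.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number} (most fields are in kB)."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def pending_writeback_kb(meminfo_path="/proc/meminfo"):
    """Return the Dirty and Writeback counters, in kB, from the given file."""
    with open(meminfo_path) as handle:
        info = parse_meminfo(handle.read())
    return {key: info.get(key, 0) for key in ("Dirty", "Writeback")}


# Example: call pending_writeback_kb() once per second while the dd test runs.
```

If Dirty stays large after dd has printed its summary and only shrinks during the sync, that would fit the explanation that the long "real" time is the cache being flushed to the slow drive.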
I currently use NFS and CIFS but have used iSCSI in the past. I like the simplicity of NFS & CIFS, and they meet my needs. iSCSI has its strengths, as others have stated.
- /var/lib/mysql - I would say iSCSI in its own image + LUN. It should get lower latency as well as higher transfer rates compared to NFS for a DB, but it depends on the kind and amount of usage.
- virtual machine images - I prefer NFS mounts for the same reason: it is easier to work with the files directly. If you do go with iSCSI, you can have different disk images for different kinds of VMs. You should be able to use both at the same time on most hypervisors if you want to play with them too.
- lots of small files - NFS should work without issue.