commit 74437717e9772bcf5f4c14e4febfb2e88d0dd04a
Author: Marcin
Date:   Wed May 6 09:03:46 2020 +0200

    Migration saga begins

diff --git a/NAS_data_migration.md b/NAS_data_migration.md
new file mode 100644

# The Story

I was trying to migrate my data from one WD MyCloud EX2 Ultra to another. This is my story of frustration, unanswered questions, and a journey through bash and ssh.

| Parameter  | Source               | Target               |
|------------|----------------------|----------------------|
| Device     | WD MyCloud EX2 Ultra | WD MyCloud EX2 Ultra |
| Firmware   | 2.31.204             | 2.31.204             |
| Drives     | 2 x 3 TB             | 2 x 4 TB             |
| RAID       | RAID 1               | RAID 0               |
| Encrypted  | No                   | Yes                  |
| IP address | 192.168.1.54         | 192.168.1.53         |

First, I enabled SSH access on both NASes via the web UI:

```Settings > Network > SSH```

The default user for the WD MyCloud EX2 Ultra is `sshd`, so to connect to the NAS I used:

```ssh sshd@192.168.1.53```

To avoid transferring the data twice, I logged into the target over SSH and pulled directly from the source using `scp`:

```
scp -rp sshd@192.168.1.54:/mnt/HD/HD_a2/Marcin /mnt/HD/HD_a2/
```

However, the transfer speed oscillated between 20 and 25 MB/s. This is well below the expected 70 MB/s.
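To put that gap in perspective, a rough back-of-the-envelope estimate (assuming the full ~3 TB source volume has to move; the figure is illustrative) shows what the slowdown costs in wall-clock time:

```shell
# Approximate copy time for 3 TB (~3,000,000 MB) at each rate,
# using integer shell arithmetic; results are in whole hours.
for rate in 25 70; do
  echo "$rate MB/s -> $(( 3000000 / rate / 3600 )) hours"
done
```

At 25 MB/s the migration takes roughly 33 hours; at the expected 70 MB/s it would finish in well under half that time.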
First I checked that the network cards were indeed connected at 1000 Mb/s full duplex, using `ethtool`:

```
# ethtool egiga0
Settings for egiga0:
        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supported pause frame use: No
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Half 1000baseT/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: No
        Link partner advertised link modes:  10baseT/Half 10baseT/Full
                                             100baseT/Half 100baseT/Full
                                             1000baseT/Full
        Link partner advertised pause frame use: No
        Link partner advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: MII
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Link detected: yes
```

I also verified that neither I/O nor CPU was saturated (both are already visible in the web UI, but I used `iostat`):

```
# iostat
Linux 3.10.39 (KlinkierChmurka)    05/05/20    _armv7l_    (2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          29.23    6.16   28.23    1.36    0.00   35.02
```

```
# iostat -dx /dev/sda 5
Linux 3.10.39 (KlinkierChmurka)    05/05/20    _armv7l_    (2 CPU)

Device:  rrqm/s  wrqm/s   r/s    w/s   rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sda        0.55    5.39  4.01  95.83  212.68  10755.40    219.70      0.45   4.53     4.82     4.52   1.19  11.90
```

I then checked with `traceroute` that the traffic goes straight through the switch, with no extra hops:

```
# traceroute 192.168.1.54
traceroute to 192.168.1.54 (192.168.1.54), 30 hops max, 38 byte packets
 1  192.168.1.54 (192.168.1.54)  0.428 ms  0.428 ms  0.391 ms
```

Then I enabled jumbo frames in the web UI and verified that they work using `ping` (8972 bytes of payload plus 28 bytes of ICMP and IP headers exactly fills a 9000-byte jumbo frame):

```
# ping -s 8972 192.168.1.54
PING 192.168.1.54 (192.168.1.54): 8972 data bytes
8980 bytes from 192.168.1.54: seq=0 ttl=64 time=0.898 ms
8980 bytes from 192.168.1.54: seq=1 ttl=64 time=2.904 ms
8980 bytes from 192.168.1.54: seq=2 ttl=64 time=0.786 ms
```

To run a throughput test, I suspended the `scp` by pressing `Ctrl-Z` (so I could resume it later with `bg` or `fg`) and measured raw network throughput using `iperf`. The server runs on the target NAS (192.168.1.53) and the client on the source (192.168.1.54), matching the direction of the real transfer.

Target (server):
```
# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.1.53 port 5001 connected with 192.168.1.54 port 41261
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.16 GBytes   991 Mbits/sec
```

Source (client):
```
# iperf -c 192.168.1.53
------------------------------------------------------------
Client connecting to 192.168.1.53, TCP port 5001
TCP window size: 93.3 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.54 port 41261 connected with 192.168.1.53 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.16 GBytes   992 Mbits/sec
```

That ruled out a network bottleneck. I decided to test with a single large file, using both `scp` and `rsync` (which can sometimes outperform `scp`).

```
# cd /mnt/HD/HD_a2/Marcin/
# dd if=/dev/zero of=1GB_TEST_FILE bs=1G count=1
1+0 records in
1+0 records out
1073741824 bytes (1.0GB) copied, 37.631790 seconds, 27.2MB/s
```

Starting with `scp`. The speed was consistent with what I observed with the real data:

```
# scp -rp sshd@192.168.1.54:/mnt/HD/HD_a2/Marcin/1GB_TEST_FILE /mnt/HD/HD_a2/Marcin/1GB_TEST_FILE
sshd@192.168.1.54's password:
1GB_TEST_FILE                                 100% 1024MB  22.2MB/s   00:46
```

Now `rsync`.
Unfortunately, no improvement there:

```
# rsync -a sshd@192.168.1.54:/mnt/HD/HD_a2/Marcin/1GB_TEST_FILE /mnt/HD/HD_a2/Marcin/1GB_TEST_FILE --progress
sshd@192.168.1.54's password:
receiving incremental file list
1GB_TEST_FILE
  1073741824 100%   21.57MB/s    0:00:47 (xfer#1, to-check=0/1)

sent 30 bytes  received 1073872980 bytes  20851903.11 bytes/sec
total size is 1073741824  speedup is 1.00
```

But wait: didn't `dd` create the test file on the source NAS at only about 27 MB/s? Time to benchmark the drives themselves.

Source read speed:

```
# hdparm -t /dev/sda

/dev/sda:
 Timing buffered disk reads: 538 MB in 3.01 seconds = 178.73 MB/sec
```

Target write speed:

```
# dd if=/dev/zero of=1GB_TEST_FILE bs=1G count=1
1+0 records in
1+0 records out
1073741824 bytes (1.0GB) copied, 41.367580 seconds, 24.8MB/s
```

OK, so it would seem the drive write speed is at fault. But why?
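One caveat about benchmarking writes with `dd`: without `conv=fdatasync`, `dd` can report the speed of writing into the page cache rather than to the disk, which tends to overstate the result. The numbers above are already low, so caching is not flattering them, but a more trustworthy benchmark includes the flush in the timing. A minimal sketch (the `/tmp/dd_bench` path and 256 MiB size are illustrative; on the NAS you would write to the data volume, e.g. under `/mnt/HD/HD_a2/`):

```shell
# Write 256 MiB of zeros and include the final fdatasync() in the
# timing, so page-cache writeback cannot inflate the reported speed.
dd if=/dev/zero of=/tmp/dd_bench bs=1M count=256 conv=fdatasync

# Remove the benchmark file when done.
rm /tmp/dd_bench
```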