How to get Speed out of Amazon's EBS volumes: Software RAID it!
mdadm --create /dev/md1 -v --raid-devices=8 --chunk=256 --level=raid10 /dev/xvdk /dev/xvdl /dev/xvdm /dev/xvdn /dev/xvdo /dev/xvdp /dev/xvdq /dev/xvdr
Take eight 125 GB EBS volumes and create a raid10 array with a 256 KB chunk size. After various mind-numbing benchmarks I found that 256K is a good sweet spot. Feel free to run your own benchmarks. The results need careful interpretation, because EBS is a shared resource.
What I end up with is a 500 GB partition, and I get roughly 22-25 MB per second of random I/O from 20 threads. For comparison, on an 8-disk 15K RPM 2.5" SAS system behind a PERC 6 I get around 44 MB per second at a constant 1-2 ms response time for the same physical space. EBS response time per I/O operation ranges from 6 ms to 200 ms. This sucks. Note: these numbers are for RANDOM I/O with a 16 KB page size (4 iops per block write), which is what InnoDB uses, not sequential I/O.
Here are some iostat numbers from a live box with this configuration:
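As a quick sanity check on the sizing, here is a minimal sketch of the capacity arithmetic. It assumes mdadm's default raid10 layout, which keeps two copies of every block, and it ignores metadata overhead. The function name raid10_usable_gb is made up for this example.

```shell
#!/bin/bash
# Usable capacity of a raid10 array: total raw space divided by the
# number of copies kept of each block (2 is assumed as the default).
raid10_usable_gb() {
  local devices=$1 size_gb=$2 copies=${3:-2}
  echo $(( devices * size_gb / copies ))
}

# Eight 125 GB volumes, two copies of each block
raid10_usable_gb 8 125   # prints 500
```

So half of the 1000 GB of raw EBS space goes to mirroring.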
avg-cpu:    %user    %nice  %system  %iowait   %steal    %idle
             1.83     0.00     1.75    22.32     0.08    74.01

Device:    rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await    svctm    %util
xvdap1       0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
xvdh         0.00     0.00     0.00     1.00     0.00     8.00     8.00     0.00     0.00     0.00     0.00
xvdk         0.00     0.00    34.40    26.40  1100.80  1503.20    42.83     0.49     8.01     6.39    38.88
xvdl         0.00     0.00    13.20    26.40   422.40  1503.20    48.63     0.27     6.71     4.38    17.36
xvdm         0.00     0.20    32.40    27.00  1036.80  1524.20    43.11     0.30     5.13     4.19    24.88
xvdn         0.00     0.20     9.40    27.00   300.80  1524.20    50.14     0.15     4.11     2.48     9.04
xvdo         0.00     0.00    30.20    27.40   968.00  1496.80    42.79     0.45     7.76     6.56    37.76
xvdp         0.00     0.00    14.60    27.40   478.40  1496.80    47.03     0.22     5.26     3.92    16.48
xvdq         0.00     0.00    31.20    25.60   998.40  1501.60    44.01     0.38     6.73     5.32    30.24
xvdr         0.00     0.00     9.80    25.60   313.60  1501.60    51.28     0.16     4.50     2.35     8.32
md1          0.00     0.00   174.80    98.60  5606.40  6009.80    42.49     0.00     0.00     0.00     0.00
So, now that I have acceptable speed, what is the drawback? A weekly cron job that runs a check across the raid array. On Amazon’s EBS system it cuts my throughput in half.
On my Amazon Linux system the cron job is located at:
-rwxr-xr-x 1 root root 2770 Jan 16 2011 /etc/cron.weekly/99-raid-check
It essentially runs
echo check > /sys/block/md1/md/sync_action
Yet the check lasts for around 9000 minutes, or 6.25 days! Thus I only get 0.75 days of full throughput each week.
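While a check is running, /proc/mdstat shows a progress line with a finish=...min estimate. The sketch below pulls that estimate out and converts minutes to days. The two function names are invented for this example, and the sed pattern assumes the usual finish=NNNN.Nmin form of that line.

```shell
#!/bin/bash
# Read mdstat-style text on stdin and print the estimated minutes remaining.
mdstat_finish_minutes() {
  sed -n 's/.*finish=\([0-9.]*\)min.*/\1/p'
}

# Convert a number of minutes to days, with two decimals.
minutes_to_days() {
  awk -v m="$1" 'BEGIN { printf "%.2f\n", m / 1440 }'
}

# On a live box: mdstat_finish_minutes < /proc/mdstat
minutes_to_days 9000   # prints 6.25
```

With 1440 minutes in a day, 9000 minutes comes to 6.25 days, which leaves 0.75 days of the week without a check running.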
So to stop this I must run
echo idle > /sys/block/md1/md/sync_action
I do not recommend turning off the check; it's needed. Now to find a way to make this check happen faster.
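One knob worth trying is the md resync speed floor. The kernel exposes speed_limit_min and speed_limit_max under /proc/sys/dev/raid, in KB/s per device, and when the array is busy with other I/O the check is throttled toward the minimum. The sketch below is untested on EBS. RAID_SYSCTL_DIR and both function names are introduced here only so the script can be dry-run against a scratch directory, and 50000 is an arbitrary example value. Raising the floor should shorten the check, but it takes more I/O away from the database while the check runs.

```shell
#!/bin/bash
# Directory holding the md speed limits; override it to dry-run the sketch.
RAID_SYSCTL_DIR=${RAID_SYSCTL_DIR:-/proc/sys/dev/raid}

# Set the minimum check/resync speed, in KB/s per device (needs root).
set_md_speed_floor() {
  local kbps=$1
  echo "$kbps" > "$RAID_SYSCTL_DIR/speed_limit_min"
}

# Print the current floor and ceiling.
show_md_speed_limits() {
  echo "min=$(cat "$RAID_SYSCTL_DIR/speed_limit_min") max=$(cat "$RAID_SYSCTL_DIR/speed_limit_max")"
}

# Example (as root), before kicking off the check:
#   show_md_speed_limits
#   set_md_speed_floor 50000
```

The setting does not survive a reboot, so the weekly cron script would be the place to apply it if it turns out to help.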