Hi!

For the record, posting some ha_cassandra benchmarks:

<contents>
Run #1: re-run previous benchmarks
Run #2: evaluate max. network performance
Run #3: Load two dbt3 databases at once
Run #4: load two datasets into different nodes
Results
Conclusions
</contents>

=== Run #1: re-run previous benchmarks ===

My results with
- m1.large mysqld node (mysqld uses optimized binary)
- Cassandra on 2*m1.large nodes
- Loading data with the standard LOAD DATA INFILE command.

- %CPU mysqld = 14

sar -n DEV 10:
03:04:41 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
03:04:51 PM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
03:04:51 PM      eth0   1575.20    156.60     81.47   4458.13      0.00      0.00      0.00

On cassandra node:
- jsvc %CPU 70-170...

02:51:00 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
02:51:05 PM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
02:51:05 PM      eth0   8325.20   7333.80   3376.25   1696.74      0.00      0.00      0.00

Total load time: 19 min.

=== Run #2: evaluate max. network performance ===

For comparison, sending /dev/zero on one host to /dev/null on another with nc:

03:39:24 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
03:39:34 PM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
03:39:34 PM      eth0  16756.70   3688.50    851.91  87477.61      0.00      0.00      0.00

Sending in both directions:

SQL:
03:45:36 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
03:45:46 PM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
03:45:46 PM      eth0  63308.20   4473.50  73002.80  81484.77      0.00      0.00      0.00

Cassandra:
03:47:32 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
03:47:42 PM        lo      0.20      0.20      0.01      0.01      0.00      0.00      0.00
03:47:42 PM      eth0  60741.50  15241.70  84110.07  70712.32      0.00      0.00      0.00

=== Run #3: Load two dbt3 databases at once ===

- Loading two DBT-3 databases at once.
- The databases have their own key spaces, and mysql tables
- Still, both loaders connect to the same cassandra host.

Cassandra host:
04:27:57 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
04:28:07 PM      eth0   9979.10   8984.00   6006.77   2621.08      0.00      0.00      0.00
04:29:07 PM      eth0   9181.80   7315.00   6416.25   2686.83      0.00      0.00      0.00
04:29:17 PM      eth0  10277.00   9629.60   6047.00   2672.02      0.00      0.00      0.00

04:30:23 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
04:30:33 PM       all     34.56     24.80     24.71      0.18      0.05     15.69
04:30:53 PM       all     32.08     11.20     24.25      1.31      0.08     31.07
04:31:03 PM       all     35.59     22.55     22.28      0.27      0.00     19.31

It has periodic slow-downs like this (1-second samples, same columns as above):
04:32:38 PM       all     53.25      1.30     19.05      3.03      0.00     23.38
04:32:39 PM       all     65.84     10.40     21.78      0.00      0.00      1.98
04:32:40 PM       all     37.14     33.81     18.10      0.95      0.00     10.00
04:32:41 PM       all     42.65     25.98     26.96      0.00      0.00      4.41
04:32:42 PM       all     33.17     39.42     19.23      0.00      0.00      8.17
04:32:43 PM       all     36.89     36.89     19.90      0.00      0.00      6.31
04:32:44 PM       all     39.81     35.55     13.74      0.00      0.00     10.90
04:32:45 PM       all     34.45     33.97     22.01      0.00      0.00      9.57
04:32:46 PM       all     35.58     36.06     21.15      0.00      0.00      7.21
04:32:47 PM       all     31.05     35.16     15.07      0.00      0.00     18.72
04:32:48 PM       all     41.31     26.29     20.66      0.00      0.00     11.74
04:32:49 PM       all     39.02     32.20     25.37      0.00      0.00      3.41
04:32:50 PM       all     26.64     34.50     13.97      0.00      0.00     24.89
04:32:51 PM       all     56.93     27.72     12.87      0.00      0.00      2.48
04:32:52 PM       all     71.14     15.42     11.44      0.00      0.00      1.99
04:32:53 PM       all     64.22      8.82     23.04      0.00      0.00      3.92
04:32:54 PM       all     76.73      1.98     19.31      0.00      0.00      1.98
04:32:55 PM       all     61.24      0.00     30.14      0.96      0.00      7.66
04:32:56 PM       all     29.20      0.00     32.40      0.00      0.00     38.40
04:32:57 PM       all     30.29      0.00     29.88      0.00      0.00     39.83
04:32:58 PM       all     23.19      0.00     20.29      0.00      0.36     56.16
04:32:59 PM       all     23.34      0.00     16.38      0.00      0.35     59.93
04:33:00 PM       all     20.07      0.00     21.90      0.00      0.00     58.03
04:33:01 PM       all     24.20      0.00     18.86      0.00      0.00     56.94
04:33:02 PM       all     20.07      0.00     22.18      0.00      0.70     57.04

Is this a compaction?

mysqld host:
04:29:01 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
04:29:01 PM      eth0   1416.20    247.90     75.43   6578.11      0.00      0.00      0.00
04:29:11 PM      eth0   1175.00    199.60     62.49   5292.67      0.00      0.00      0.00
04:29:21 PM      eth0   1074.60    186.90     57.21   4952.19      0.00      0.00      0.00

04:30:30 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
04:30:40 PM       all      7.77      0.00      0.56      0.00      0.05     91.62
04:30:50 PM       all      8.20      0.00      0.60      0.00      0.30     90.89
04:31:00 PM       all      8.01      0.00      0.66      0.00      0.05     91.28

Conclusions:
- mysqld is still idle most of the time.
- Cassandra is busier. Sometimes, it is 100% busy.
- Network use between mysqld and Cassandra is about 2x higher, which is expected.

Run time: 24 minutes for both (compare to 19 minutes).

=== Run #4: load two datasets into different nodes ===

==== Network ====

SQL node:
05:44:21 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
05:44:41 PM      eth0   1178.10    210.10     62.81   5510.03      0.00      0.00      0.00
05:44:51 PM      eth0   1062.10    200.90     57.53   5152.63      0.00      0.00      0.00
05:45:01 PM      eth0    965.80    194.00     52.53   5012.62      0.00      0.00      0.00

Cassandra 1:
05:44:19 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
05:44:49 PM      eth0  10159.00   8659.20   4443.26   1798.75      0.00      0.00      0.00
05:44:59 PM      eth0   9018.20   7750.10   3982.21   1578.30      0.00      0.00      0.00
05:45:09 PM      eth0   8741.30   7521.70   3921.69   1541.55      0.00      0.00      0.00

Cassandra 2:
05:44:56 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
05:45:06 PM      eth0   9305.60   7906.00   4121.10   1622.92      0.00      0.00      0.00
05:45:16 PM      eth0   9186.50   7828.80   4258.12   1681.78      0.00      0.00      0.00
05:45:26 PM      eth0   9675.70   8283.10   4306.89   1698.37      0.00      0.00      0.00

==== CPU ====

SQL node:
05:46:45 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
05:46:55 PM       all      8.39      0.00      0.70      0.00      0.40     90.51
05:47:05 PM       all      7.65      0.00      0.74      0.00      0.70     90.91
05:47:15 PM       all      8.76      0.00      0.64      0.00      0.89     89.70

Cassandra 1:
05:47:01 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
05:47:11 PM       all     27.28     28.24     27.55      0.00      0.05     16.89
05:47:21 PM       all     31.45     24.44     27.56      1.38      0.05     15.13
05:47:31 PM       all     29.46     19.60     23.19      2.35      0.04     25.36
05:47:41 PM       all     38.47      0.04     14.21      0.15      0.07     47.06
05:47:51 PM       all     31.04      9.92     17.97      0.39      0.08     40.61
05:48:01 PM       all     24.73     28.07     21.77      0.26      0.04     25.12
05:48:11 PM       all     19.79     11.67     19.23      1.83      0.07     47.41
05:48:21 PM       all     20.29      0.00     11.07      0.06      0.13     68.44
05:48:31 PM       all     24.08      0.00     11.96      0.13      0.13     63.69

Cassandra 2:
05:46:56 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
05:47:06 PM       all     31.97     27.14     26.91      0.09      0.05     13.85
05:47:16 PM       all     32.79     20.43     23.39      2.34      0.04     21.01
05:47:26 PM       all     26.43     27.07     27.29      0.14      0.09     18.98
05:47:36 PM       all     17.72      8.59     12.00      2.97      0.21     58.52
05:47:46 PM       all     13.63      0.00      6.77      0.06      0.30     79.23
05:47:56 PM       all     27.19      3.92      8.52      0.11      0.25     60.01
05:48:06 PM       all     19.94     27.83     18.47      0.13      0.08     33.54
05:48:16 PM       all     13.69     16.01     14.24      2.69      0.15     53.22
05:48:26 PM       all     18.13      0.00      9.57      0.25      0.38     71.66

=== Results ===

The first client failed with

ERROR 1928 (HY000) at line 145: Internal error: 'TimedOutException: Default TException.'

real 4m3.463s
user 0m0.008s
sys 0m0.000s

The second completed after:

real 19m57.875s

(which matches the load time for loading one dataset).

=== Conclusions ===

- Need to fix the timeout error (filed as MDEV-535).
- CPU time spent by mysqld is negligible now (I don't know whether this comes
  from using a faster CPU for the SQL node, or using a release build of
  mariadb, or both).
- A confirmation of previous findings: if we use only one connection, the
  speed at which we can insert the data is very limited. In Run #3, two
  connections to the *same* node inserted 2x the data in about 26% more time
  (24 vs. 19 minutes).
- I am not sure whether I should just start using multiple connections, or
  also automatically route the data (both writes and reads) to the right node?
- Need to check out the Hector client. It seems to be able to connect to
  multiple nodes of a Cassandra cluster.
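
As a sanity check on the arithmetic behind these conclusions, here is a minimal Python sketch. The function and constant names are illustrative only (they are not part of ha_cassandra or any tool used here); the inputs are copied from the run times and sar output in this mail, and treating the one-directional nc figure from Run #2 as the link ceiling is an assumption.

```python
# Back-of-the-envelope check of the benchmark numbers quoted in this mail.
# All names below are hypothetical helpers, not part of any real tool.

RUN1_MINUTES = 19.0          # Run #1: one dataset, one connection
RUN3_MINUTES = 24.0          # Run #3: two datasets, two connections, same node
RUN1_MYSQLD_TX_KBS = 4458.13 # Run #1: mysqld eth0 txkB/s during the load
NC_MAX_TX_KBS = 87477.61     # Run #2: nc /dev/zero -> /dev/null, one direction


def extra_time_pct(single_minutes, combined_minutes):
    """How much longer the combined run took, as a percentage."""
    return (combined_minutes / single_minutes - 1.0) * 100.0


def throughput_gain(single_minutes, combined_minutes, datasets):
    """Aggregate load rate of the combined run relative to a single run."""
    return datasets * single_minutes / combined_minutes


def link_utilization_pct(observed_kbs, max_kbs):
    """Share of the measured network ceiling used by the load."""
    return observed_kbs / max_kbs * 100.0


if __name__ == "__main__":
    print("extra time:       %.1f%%" % extra_time_pct(RUN1_MINUTES, RUN3_MINUTES))
    print("throughput gain:  %.2fx" % throughput_gain(RUN1_MINUTES, RUN3_MINUTES, 2))
    print("link utilization: %.1f%%" % link_utilization_pct(RUN1_MYSQLD_TX_KBS, NC_MAX_TX_KBS))
```

With these inputs the two-connection run takes roughly 26% longer while moving twice the data (about 1.58x the aggregate rate), and the single-connection load uses only about 5% of the bandwidth that nc achieved, which is consistent with the network not being the bottleneck.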

BR
 Sergei
--
Sergei Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog