mySQL DBA, Architecture, Dev, Scale, HA, Code
Dathan Pattishall (http://www.blogger.com/profile/00356367514107959723)
Comments feed (Blogger), updated 2023-10-30T08:23:12.960-07:00, entries 1 to 25 of 424

2010-06-25T16:20:13.624-07:00
I told Fino about mk-table-checksum so many times... he always just did it by hand.
Streeter (http://www.chrisstreeter.com)

2010-04-20T13:03:19.547-07:00
This is a production environment.
On every API call and page load the client does the following:

1. Make a new connection to the F5 load balancer
2. The F5 load balancer distributes load to the least-connected node
3. The server makes the call to its own data structure or to the server that has the hash
4. The server responds to the request
5. The client returns the data and closes the connection

From connect to query takes 3 ms.
From connect to data return takes around 100 ms.
Most of the time is spent on the read and sort.

The article shows you how to enable a lot of frequent short requests.
Dathan Pattishall (https://www.blogger.com/profile/00356367514107959723)

2010-04-14T07:16:42.084-07:00
Thanks a lot for this differentiated review. Just to be clear, I do not want to argue about anything, just provide some background on why things in the book are the way they are. In the end, everyone has to decide for herself whether a book serves their needs or not :-)

The publisher asked us to do lots of screen shots to illustrate the book and make it feel "lighter weight" than it would have been with lots and lots of text. I agree with you that command line stuff tends to be more stable over time; however, I'd say that in most cases slightly changed GUIs over newer versions of software do not make them generally unsuitable. As for your Eclipse example, even though there are some changes to the chrome, the general concepts (perspectives, views, tabs) are still applicable and have been from the earliest versions.
Nevertheless, in general I see your point.

I also agree with you in that this is not a book for application or SQL optimization. This however was never the focus.
Instead, the concept was to provide readers with a means to look at the TOC, scan through the recipe titles and see if one matches their needs of the moment. Getting the task done quickly was the priority; that's why in all the recipes we have the "How to do it" section first, and only after that, if readers care, they can read an explanation of what they just did. In general this seemed a little suspicious to me at first, too, because before I apply any recipe to my servers, I definitely want to understand what the stuff is going to do and how and why. But at this point we had to adhere to the general style of the Packt Cookbook series.

Also I would have liked more background information on most topics, but we already overshot the initial 300 page limit the publisher had set, so we were reluctant, but needed to cut down on some background material.

Cheers,
Daniel Schneller
Daniel Schneller (https://www.blogger.com/profile/10703859800169283952)

2010-03-27T06:36:44.236-07:00
Presumably this is in a test environment, right?

How are you generating the test loads? Can we see the load test client code or some ideas of how the client works?
Mark Robson (https://www.blogger.com/profile/15864507044869250062)

2010-03-26T12:21:17.667-07:00
An F5 load balancer is in front of it. I can turn on persistent connections, but the connect itself is less than 3 ms. Plus, with a persistent connection I don't get an even distribution of requests among each node. I'll give it a try, yet connect-fetch-close seems to work really well. With a 20 ms overhead of range reads I'm really not going to see a reduction in R(t).
Dathan Pattishall (https://www.blogger.com/profile/00356367514107959723)

2010-03-26T12:04:16.164-07:00
Are you pooling your Thrift connections client-side? It sounds like you're not. Connection-per-request is going to give you really bad performance.
Jonathan Ellis (https://www.blogger.com/profile/11003648392946638242)

2010-03-25T22:39:11.042-07:00
Thanks for the version info Dathan.
If you've got the inclination it might be worthwhile to test with 0.6, which adds a row cache as per
https://issues.apache.org/jira/browse/CASSANDRA-678

Depending on your workload that might be of significant help; then again, some workloads don't get much from a cache.
Kapil Thangavelu (https://www.blogger.com/profile/07074376054583992994)

2010-03-25T17:40:38.250-07:00
@Kapil Thangavelu using latest 5

@Gustavo Niemeyer - if you look at the graph, Cassandra reads are in the 20-60 ms range, the same as I see.
Dathan Pattishall (https://www.blogger.com/profile/00356367514107959723)

2010-03-25T06:55:07.300-07:00
You neglected to mention which version of Cassandra you're using. Cassandra 0.6 (currently in beta) has significantly improved read performance via better caches.
Kapil Thangavelu (https://www.blogger.com/profile/07074376054583992994)

2010-03-25T03:34:04.666-07:00
I don't think Cassandra reads are as bad as you make them look in the post. For a very thorough benchmark, check out the following whitepaper by Yahoo! Research:

http://www.brianfrankcooper.net/pubs/ycsb.pdf

Note that the whitepaper is based on an old version of Cassandra, and there were specific improvements in the read area in recent releases, which means the figure is even less significant.
Gustavo Niemeyer (http://niemeyer.net)

2010-03-24T11:59:16.956-07:00
@mark sorry you said read only my assumption was that for write only read only I do not have a good test yet.
Dathan Pattishall (https://www.blogger.com/profile/00356367514107959723)

2010-03-24T11:51:54.664-07:00
@jules
On #cassandra all the time. jellis helped me get rid of a crash bug.

@anon about partitions - yes, I thought about it and use it in other places, but since this is high-write, high-concurrency throughput, blocking for a few 100 ms is not ideal.

@mark
Calculating the raw disk IOPS seems to be on par with MyISAM (since append only). I'm still building benches to get a good test that passes muster as a good profile of this service. I'll update this soon.

@drift - this is not a super column

so maybe your get slice bug that you mentioned also affects regular ColumnTypes?
Dathan Pattishall (https://www.blogger.com/profile/00356367514107959723)

2010-03-24T11:39:58.934-07:00
Since you're using supercolumns, you need to be aware of CASSANDRA-598 (https://issues.apache.org/jira/browse/CASSANDRA-598). If you're inserting 10,000 subcolumns but only asking for the last 10, the entire 10,000 will need to be deserialized until that ticket is closed. If you instead used a simple CF, or fewer subcolumns, get_slice would be much faster.
Unknown (https://www.blogger.com/profile/05500725751374846581)

2010-03-24T09:26:55.026-07:00
Interesting post, but I'm not sure the goal is incredibly fast reads so much as consistent read times as data grows and infrastructure scales.
Anonymous

2010-03-24T06:40:36.927-07:00
Nice post.

I am not sure you should trust all of the anon advice you get here (InfiniDB for OLTP?). However, TokuDB would have been good for your workload.

What is the max IOPS rate you can drive using Cassandra? I can get ~20k from InnoDB, ~40k from PBXT and ~80k from MyISAM for a read-only workload. I ask because one server that provides 10k IOPS is cheaper to run than 10 that do 100. I have yet to find results for HBase and Cassandra.
Mark Callaghan (https://www.blogger.com/profile/09590445221922043181)

2010-03-24T04:49:31.262-07:00
You should also look at MySQL column DBs, which may be a good fit for you (e.g. Calpont/InfiniDB, others). Best of both worlds for some use cases.
Anonymous

2010-03-24T03:30:12.143-07:00
Have you tried partitioning with MySQL? Like partitioning by date. Selects over a range that falls within one partition won't touch the other partitions. You can set InnoDB to use file per table, and in this case every partition will be another file. It is another table because partitioning is performed at the MySQL level, not the storage engine. Dropping a partition is quite a fast operation.
Anonymous

2010-03-23T21:24:37.728-07:00
Dathan,

Feel free to join us in the #cassandra irc channel on freenode for some help in getting better performance out of Cassandra. I think you can probably redesign your column families in such a way that you can use the ordered partitioner and still get the data you need.

James (jbathgate)
Unknown (https://www.blogger.com/profile/04254946063067265784)

2010-03-11T19:58:51.203-08:00
Thanks for the presentation (and swag) :-)
John D

2010-03-10T16:34:31.760-08:00
I'd like more details, please... if there's still time to attend.
Mark (https://www.blogger.com/profile/14668153232723718277)

2010-02-04T14:14:44.621-08:00
A little scary, as you are deleting the files directly; you should instead drop them (or at least truncate and then drop). If you just remove them, the tablespace still has entries for them.

That's always a problem until InnoDB supports an easy way to purge and reclaim the space without re-creation of the table. It's easy to introduce a command that can purge this without affecting the online activity on the table.

Normally, I create a new table, copy over the data and flip at the end...
(provided you have an easy way to identify new/old records)
venu (http://venublog.com/)

2010-02-04T13:48:34.532-08:00
I haven't administered a production MySQL box in a few years, but this reminds me of Oracle's ALTER TABLE MOVE TABLESPACE X where X is the tablespace in which it already resides.

From MySQL 5.0 Reference Manual ... Defragmenting a Table (http://dev.mysql.com/doc/refman/5.0/en/innodb-file-defragmenting.html):

It can speed up index scans if you periodically perform a “null” ALTER TABLE operation, which causes MySQL to rebuild the table:

ALTER TABLE tbl_name ENGINE=INNODB;
Mark (https://www.blogger.com/profile/14668153232723718277)

2010-01-08T18:11:19.702-08:00
Post the slides here, much better :)
Venu (http://venublog.com/)

2009-11-10T12:47:58.903-08:00
That's awesome Eric. I will give it a try.
Dathan Pattishall (https://www.blogger.com/profile/00356367514107959723)

2009-11-10T01:15:41.881-08:00
... the link http://blog.ulf-wendel.de/?p=201 points you to "PHP: How mysqlnd async queries help you with sharding!" from about a year ago.
Anonymous