The image above graphs all the exceptions produced by Cassandra. The two big lines are:
Transport Exceptions (te) - Cassandra could not answer the request at all; think of these like max-connection errors in MySQL.
Unavailable Exceptions (ue) - Cassandra could accept the request, but the "storage engine" cannot do anything with it because it's busy with something like communicating with other nodes, or maintenance such as a node cleanup.
So how did I get the graph to drop to 0? After looking at the error logs, I saw that Cassandra was getting flooded with SYN requests, and the kernel decided it was a SYN flood and logged this:
possible SYN flooding on port 9160. Sending cookies.
To stop this, the Puppet profile was changed to set:
sysctl -w net.ipv4.tcp_max_syn_backlog=4096
sysctl -w net.ipv4.tcp_syncookies=0
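One thing to note (a sketch, assuming the Puppet profile can also manage /etc/sysctl.conf): those sysctl commands only change the running kernel, so to survive a reboot the same values can be persisted and reloaded:

net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.tcp_syncookies = 0

Then apply the file with:

sysctl -p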
Next, I looked into the Cassandra log, which I configured to live at /var/log/cassandra/system.log:
WARN [TCP Selector Manager] 2010-03-26 02:46:31,619 TcpConnectionHandler.java (line 53) Exception was generated at : 03/26/2010 02:
Too many open files
java.io.IOException: Too many open files
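A quick way to confirm the descriptor limit is really what's being hit (a sketch; it assumes a single Cassandra JVM whose command line matches CassandraDaemon, so adjust the pgrep pattern to your setup) is to compare the process's open file count against its limit:

lsof -p $(pgrep -f CassandraDaemon) | wc -l
grep 'open files' /proc/$(pgrep -f CassandraDaemon)/limits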
Then I noticed that
ulimit -n == 1024
so I changed /etc/security/limits.conf to a server-appropriate setting by adding this:
* - nofile 8000
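Keep in mind that limits.conf is applied by PAM at session start, so the new limit only takes effect for processes started after the change. A quick sanity check (assuming Cassandra runs as a user named cassandra; substitute whatever account your init script uses):

su - cassandra -c 'ulimit -n'

That should now print 8000 instead of 1024.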
Now my Transport Exceptions and Unavailable Exceptions are gone, and data is being written consistently.
There are many other ways of doing the same thing; I could have modified my init script or taken another approach, but I chose this one (see the sketch below). Default distros set these kernel and limits values too low; they are tuned for desktop use, not servers.
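For completeness, the init-script route would look something like this (a sketch only; the path and placement are assumptions, not what I actually run):

# in /etc/init.d/cassandra, just before the daemon is launched
ulimit -n 8000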
4 comments:
Are you pooling your Thrift connections client-side? It sounds like you're not. Connection-per-request is going to give you really bad performance.
An F5 load balancer is in front of it. I can turn on persistent connections, but the connect itself is less than 3 ms. Plus, with a persistent connection I don't get an even distribution of requests among the nodes. I'll give it a try, but connect-fetch-close seems to work really well. With a 20 ms overhead on range reads, I'm really not going to see a reduction in R(t).
Presumably this is in a test environment, right?
How are you generating the test loads? Can we see the load test client code, or get some idea of how the client works?
This is a production environment.
On every API call and page load, the client does the following:
1. Make a new connection to the F5 load balancer
2. The F5 load balancer distributes the request to the least-connected node
3. That node either answers from its own data structure or makes the call to the node that owns the hash
4. The server responds to the request
5. The client returns the data and closes the connection.
From connect to query takes 3 ms
From connect to data return takes around 100ms
Most of the time is spent on the read and sort.
The changes described in this article are what make that volume of frequent, short requests possible.