With IPC or threads we can speed up common O(N) problems to reduce the N with parallelism, its still functionally a O(N) yet from a single instance point of view N is much less because threads (IPC) takes that Serial computing component and executes the code in parallel. To better explain what I am talking about lets look at some PHP code:
foreach($friends as $friend){
$data[] = getMySQLData(“SELECT * FROM AccountData WHERE userid = $friend);
}
The Primary key for the AccountData table is userid. Assuming that you have 5000 friends, the query has to be executed 5000 times.
We can reduce the O(N) and change it to a O(nlogn) (Binary Tree - doesn't take into account other factors) by switching the query to
$data = getMYSQLData(“SELECT * FROM AccountData WHERE userid IN (….)”);
We just sped up the retrieval of the data significantly, yet we just introduced a bottle neck on the datalayer. Our architecture requires that the data is located in a single location.
What if AccountData’s data is spread across many servers federated by userid? This means that userid belongs to a server, so the server contains a shard of the AccountData’s Data.
Now we are back to a O(N) where each query needs to be executed on the corresponding shard. The logical next step is to group queries per shard and run across them all. For instance
$multiShardIDs = $genericShard->getMultipleShardIDs($objIds);
foreach ($multiShardIDs as $shardID => $shardUserIDs) {
if (stripos($orgQuery, " WHERE ") !== false){
$query = $orgQuery." AND {$column} IN (".implode(',', $shardUserIDs).") ";
}
else{
$query = $orgQuery." WHERE {$column} IN (".implode(',', $shardUserIDs).") ";
}
$shard_to_sql[$shardID] = $query;
.... more stuff ....
Yet this is still a O(N) its just that N is smaller. Each query is still executed serially.
Let’s look at some stats of synchronous queries of SELECT 1; This query is executed across 35 shards and the timings are from PHP point of view.
Field | End Value | Start Value | Delta |
---|---|---|---|
ru_oublock | 0 | 0 | 0 |
ru_inblock | 0 | 0 | 0 |
ru_msgsnd | 0 | 0 | 0 |
ru_msgrcv | 0 | 0 | 0 |
ru_maxrss | 0 | 0 | 0 |
ru_ixrss | 0 | 0 | 0 |
ru_idrss | 0 | 0 | 0 |
ru_minflt | 9872 | 9865 | 7 |
ru_majflt | 0 | 0 | 0 |
ru_nsignals | 0 | 0 | 0 |
ru_nvcsw | 11344 | 11114 | 230 |
ru_nivcsw | 977 | 968 | 9 |
ru_nswap | 0 | 0 | 0 |
ru_utime.tv_usec | 865054 | 849053 | 16001 |
ru_utime.tv_sec | 16 | 16 | 0 |
ru_stime.tv_usec | 556097 | 552097 | 4000 |
ru_stime.tv_sec | 1 | 1 | 0 |
Total Execution Time | 0.18323707580566 |
As you can see, to execute this from PHP it took 100 ms, 100s pages reclaimed and 200s voluntary context switches to query 35 servers.
Now let’s look at Asynchronous execution of SELECT 1; // the query generation is from PHP yet the execution is performed on a server that executes the query in parallel
Field | End Value | Start Value | Delta |
---|---|---|---|
ru_oublock | 0 | 0 | 0 |
ru_inblock | 0 | 0 | 0 |
ru_msgsnd | 0 | 0 | 0 |
ru_msgrcv | 0 | 0 | 0 |
ru_maxrss | 0 | 0 | 0 |
ru_ixrss | 0 | 0 | 0 |
ru_idrss | 0 | 0 | 0 |
ru_minflt | 9131 | 9121 | 10 |
ru_majflt | 0 | 0 | 0 |
ru_nsignals | 0 | 0 | 0 |
ru_nvcsw | 3891 | 3889 | 2 |
ru_nivcsw | 290 | 290 | 0 |
ru_nswap | 0 | 0 | 0 |
ru_utime.tv_usec | 596287 | 596287 | 0 |
ru_utime.tv_sec | 4 | 4 | 0 |
ru_stime.tv_usec | 460028 | 460028 | 0 |
ru_stime.tv_sec | 0 | 0 | 0 |
Total Execution Time | 0.019363880157471 |
As you can see from the table above executing the query asynchronously produced results with less context switching, less pages reclaimed and almost 10 times execution improvement over the synchronous query counterpart.
How is the asynchronous query executed? Lets take a look at the figure below.
So A user comes through the firewall / load balancer with a HTTP Request to the www pool that runs PHP. PHP now makes a CURL request to the Async Shard Servers (through a LB same LB different PORT). The HTTP Request to the Async Shard Server contains the SQL we wish to execute. The Async Shard Servers has a thread per shard and executes the request in parallel. The results are merged and sent to the calling CURL process via JSON. The returned JSON is then converted into a PHP object. This is a typical three-tier environment.
When having to query multiple servers using an Asynchronous Tier is dramatically faster; in fact its as fast as the slowest server. This is the main sticking point of why asynchronous queries are faster then synchronous queries (in this context) since the total execution time for serial queries is the SUM of all the query execution.
The current version of the server is used for Friend Query execution across the datalayer. Its been solid for a few months now, and I'm currently getting permission to release it as an Open Source Product. The features this server contains:
- Lightweight
- CPU bounded
- Scales Linearly
- A Timer Thread to keep the database config up to date in memory and fetching the config from PHP so if PHP changes connections to the shards so does Java
- Uses Java-6 Executor Service
- Merges the result set prior to sending it to the calling process
- Communicates via JSON
- Uses MySQL Connector/J
- Supports a high concurrency
- Optimized thread usage
4 comments:
Nice post, the power of async queries is yet to be realized by most, mostly due to the APIs not being there. Rather than introducing an entire new layer into the mix, I would suggest using non-blocking I/O directly on the first layer PHP machines. This is what I added into libdrizzle and the corresponding PHP extension. For example, you can queue N queries and have them run concurrently, possibly against multiple servers, all within PHP.
http://cvs.php.net/viewvc.cgi/pecl/drizzle/drizzle.php?view=markup
Towards the end of this is a "concurrent query interface" example.
-Eric
http://blog.ulf-wendel.de/?p=201
... the link http://blog.ulf-wendel.de/?p=201 points you "PHP: How mysqlnd async queries help you with sharding!" from about a year ago.
Thats awesome Eric. I will give it a try
Post a Comment