Wednesday, January 30, 2008

Flickr is Hiring for a DBA position!!

We're looking for an experienced and motivated MySQL DBA to help make things go at Flickr.

We're looking for a DBA with strong LAMP background.

Specific Duties:
• Work with engineers on performance tuning, query optimization, index tuning.
• Monitor databases for problems and to diagnose where those problems are.
• Work with Senior DBA to maintain a scalable, reliable, and robust database environment.
• Build database tools and scripts to automate where possible.
• Support MySQL databases for production and development.
• Provide 24x7 escalated on-call support on a pager rotation.

Requirements & Education:
• 3-4+ years MySQL experience.
• 2+ years of experience as a MySQL DBA in a high traffic, transactional environment.
• 2+ years working in a LAMP environment, particularly PHP/MySQL combination.
• Proficient with database performance strategies.
• Proficient tuning MySQL processes and queries.
• Experience in administration of InnoDB
• Experience with MySQL Replication, with both Master-Slave and Master-Master replication.
• Ability to work cooperatively with software engineers and system administrators.
• Excellent communication skills
• Exceptional problem-solving expertise and attention to detail.
• BS in Computer Science or equivalent.

Bonus Points For:
• Experience with Data Sharding and federated architectures.
• Experience with multi-datacenter MySQL replication.
• Experience working in a social media environment.

Send me an email with your resume.

Thursday, January 24, 2008

filefrag a DBA's Best Friend

EXT3 has performance problems as the filesystem gets fragmented-although this is counter intuitive to the design of EXT3-fragmentation really happens.

Really there is no safe de-fragmenter tool out there for ext3. There is this one but I will not use it.

Fragmentation can get so bad that performance from EXT-3 can drop 7 fold!!

Here is some good info on fragmentation and a comparison of various other file systems.

Now for filefrag, I use this to see how bad an innodb file is fragmented.

TABLE.ibd: 4020 extents found, perfection would be 298 extents

Our wiz of a system administrator Kevin M., who I am teaching to become a mySQL DBA got me hooked on this utility: and he came up with a good method to fix this.

cp TABLE.ibd to a new location
rm TABLE.ibd
cp TABLE.ibd from new location back to the Database Directory

Tada fragmentation is nearly gone. FSCK will help as well.

Friday, January 18, 2008

O_DIRECT + EXT3 Update

A few days ago I wrote about O_DIRECT + EXT3 not working.

This is not a wide-spread problem, and may be isolated to 2.6.9. So, it makes sense that others who run O_DIRECT with EXT3 do not see the issue. I will use this post for future updates.

Ok here is the research that I did, and found the cause of my O_DIRECT problem

RHEL Bug ID Description
161985O_DIRECT on RHEL v4 may not return correct number of bytes when concurrent I/O
178084Last AIO read of a file opened with O_DIRECT returns wrong length
178720O_DIRECT bug when reading last block of sparse file
191736CVE-2004-2660 O_DIRECT write sometimes leaks memory

So, I can't use O_DIRECT

Tuesday, January 15, 2008

MySQL support == AWSOME

Well, none of my issues have been fixed yet but MySQL support is on top of it.

I've ran into many S1 bugs: all at the same time. Support has been able to help me identify them. Some of them have proposed fixes, some fixes are being tested in

MySQL is by far the best Open source Database on the planet-support reflects that fact. I highly recommend getting a support contract to trouble shoot issues that make it into production, less learning the entire mysql code base and doing it yourself. (I know alot about the code-base but the 5 issues I am tracking was to much for me to debug alone. On top of that I don't know enough of the code base to make fixes to some of the bugs.)

If you do more then 30K selects per second across all your servers, get piece of mind that someone will do there best to address any issues that you can't figure out. Get a MySQL support contract today.

Monday, January 14, 2008

MAJOR Problems in mysql-5.0.51

mysql-5.0.51 causes huge blocking locks under high load.

It also causes relay-log corruptions. For instance, how can a syntax error make it into replication, the reason is mysql-5.0.51 is truncating the stream.

The only work around is to rebuild the relay log.

The BUG is listed here for replication problems. I suspected IO_CACHE corruption as the cause.

update: this is the cause of the huge blocks

InnoDB: Warning: a long semaphore wait:
--Thread 1173899616 has waited at btr0cur.c line 424 for 292.00 seconds the semaphore:
S-lock on RW-latch at 0x2df5159f58 created in file buf0buf.c line 497
a writer (thread id 1173899616) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0cur.c line 424
Last time write locked in file buf0buf.c line 1768
InnoDB: Warning: a long semaphore wait:
--Thread 1172302176 has waited at btr0cur.c line 424 for 295.00 seconds the semaphore:
S-lock on RW-latch at 0x2df50706a8 created in file buf0buf.c line 497
a writer (thread id 1172302176) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0cur.c line 424
Last time write locked in file buf0buf.c line 1768
InnoDB: Warning: a long semaphore wait:
--Thread 1182951776 has waited at trx0trx.c line 1627 for 283.00 seconds the semaphore:
Mutex at 0x2a9eac52b8 created file srv0srv.c line 872, lock var 0
waiters flag 0
InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:
InnoDB: Pending preads 1, pwrites 0

Additionally mySQL is putting self in swap for a memory config that has not changed from my previous version.

I'm thinking that there is some memory leak in 5.0.51


O_DIRECT under high load causes these issues

This is a symptom

Losing some ticks... checking if CPU frequency changed.
warning: many lost ticks.
Your time source seems to be instable or some driver is hogging interrupts
rip __do_softirq+0x4d/0xd0
ttyS1: 1 input overrun(s)
ttyS1: 1 input overrun(s)
ttyS1: 1 input overrun(s)
ttyS1: 1 input overrun(s)
ttyS1: 1 input overrun(s)

BUT THE REAL PROBLEM is it Locks up the partition that the ibdata file is on.

Systems where the server locked up.

2.6.9-34.ELsmp #1 SMP Fri Feb 24 16:56:28 EST 2006 x86_64 x86_64 x86_64 GNU/Linux

total used free shared buffers cached
Mem: 16412760 16389976 22784 0 82440 686368
-/+ buffers/cache: 15621168 791592
Swap: 8393952 144 8393808

I had O_DIRECT running in production for over a month on some pretty loaded servers, but once I put it on some older servers, all hell broke loose.

Pretty Loaded is defined as

1000 qps mainly selects mixed with large ranges at a high concurrency of 30 threads.

CPU WIO is around 10-15% (acceptable thresholds)

If you insist on running O_DIRECT I recommend

1. Test O_DIRECT on ever OS version in your farm
2. Test O_DIRECT by producing so much load that it's unrealistic.