Friday, August 17, 2007

Oooh really bad bug in 4.1.23, 4.1.24b INNODB only

Bug: 30485


[miguel@skybr 4.1]$ bin/mysql -uroot db77
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2 to server version: 4.1.24-debug

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> CREATE TABLE `GiftCodes` (
-> `code` varchar(32) collate utf8_bin NOT NULL default '',
-> `used_by_id` bigint(10) unsigned NOT NULL default '0',
-> PRIMARY KEY (`code`)
-> ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
Query OK, 0 rows affected (0.00 sec)

mysql>
mysql> insert into GiftCodes values ('foo',7);
Query OK, 1 row affected (0.00 sec)

mysql> select * from GiftCodes where code='foo';
+------+------------+
| code | used_by_id |
+------+------------+
| foo | 7 |
+------+------------+
1 row in set (0.00 sec)

mysql> update GiftCodes set used_by_id='1' where code='foo';
Query OK, 0 rows affected (0.00 sec)
Rows matched: 0 Changed: 0 Warnings: 0

mysql> select * from GiftCodes where code='foo';
+------+------------+
| code | used_by_id |
+------+------------+
| foo | 7 |
+------+------------+
1 row in set (0.00 sec)

12 comments:

burtonator said...

Well THAT's bad!

Anonymous said...

used_by_id is a bigint. So you're telling me that you expect mysql to do a string-to-number conversion for you? forget it. This is not behaviour you should count on, no matter what the db, or that it works for some particular engine.

Further, MySQL is being nice and telling you it updated 0 rows, so getting the original data back is not a surprise.

Your statement should read:

update GiftCodes set used_by_id=1 where code='foo';

buggy sql will result in buggy results. or, in other words, garbage in garbage out.

--Raj.

Dathan Pattishall said...

Raj has no idea what he is talking about

Anonymous said...

For Dathan:
Do you?

Dathan Pattishall said...

Yup, I do. To stop sql injection all rvalues should be escaped. The drawback is a type conversion internally in mysql which is not that bad at all.

Anonymous said...

For Dathan:
I thought the best way to avoid sql injection attack is to use prepared statements and positional/bind parameters, NOT using the escaping parameters process.
As it turns out, MySQL used to have (I don't know if it still has) problems with real_escape_chars() function.

On another note, Raj is right. You cannot do that kind of update providing invalid data type for the filtering/set clauses. And counting on someone implementing some kind of auto converting process, it's plain dum. You wanna select integers, use "...used_by_id=1"; wanna select character data types, use "...used_by_id='1'". That's why quoting was implemented in a RDBMS, to differentiate between numeric data types and character data types. Cheers dude!

Dathan Pattishall said...

I don't know if I agree with this, it's been an established feature for quite some time.

Raj doesn't believe that it works or should work

"So you're telling me that you expect mysql to do a string-to-number conversion for you? forget it" -- Yes I do and it does :)

He's wrong.

Escaping this way is safe and reliable.

Using prepared statements is not an option for many folks since in many cases there is no performance gain from setting up a prepared statement executed from a webpage.


Anyway this has nothing to do with the bug listed above. It's an update bug due to a UTF-8 padding issue caused by a blind commit which was rolled back in 5.0

Anonymous said...

Ok, I think the whole conversion is on the wrong track. The point is I don't trust MySQL to do right data type conversions, especially from char to int, without getting weird results. In this case, MySQL acted half way correctly, notifying the user that no data was changed, like it was supposed to, given the data type conversion thing, BUT it didn't throw an error like any serious database would do.

Using this conversion, coupled with the escaping process you're promoting, is a time bomb.

"Using prepared statements is not an option for many folks since in many cases there is no performance gain from setting up a prepared statement executed"
Well, overlooking the possibility that you could use prepared statements in your application before writing a single line of code, that's the real problem.
In my mind set, I'd chase first reliability and THEN performance, especially if I'd have to tackle a more serious application.

Dathan Pattishall said...

Yea, I guess that would be a good method to do in retrospect. Don't you find the whole notion of using prepared statements so cumbersome to debug when trying to find out why a query is taking so long? (I guess this is a mute point when mysql releases the query log toggle).

Anonymous said...

Probably yeah, it is cumbersome but no more cumbersome than grepping through the whole darn code file to look for the statement where is it generated.
In this case, the best thing you could do is to attach a SQL comment that will inform you about the path/filename/line number where that prepared statement is called. In this way whenever you encounter performance problems, you'll see the statement + the SQL comment either in mytop either in the slow log/general log. Definitely it will help you.

Dathan Pattishall said...

Yup, I do that already. Many of my talks show this as a great tool for graphing load based on query.

Without rvalues tracking down ranges that span many rows is a bit difficult.

I.E. this statement is taking X seconds, but why?

Anonymous said...

By the way, I forgot to mention that one of my preferred mysql versions are Oracle 9i/10g and PostgreSQL 8.2.xx onwards.