Given a table
CREATE TABLE `hmm` (
`a` int(10) unsigned NOT NULL default '0',
`b` text NOT NULL,
`c` int(10) unsigned NOT NULL default '0',
PRIMARY KEY (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
INSERT INTO hmm VALUES (1, 'hmm8',1) ON DUPLICATE KEY UPDATE b = IF(FIND_IN_SET('hmm8', b)>0, b, CONCAT_WS(',',b, 'hmm8')), c = c+1;
SELECT * FROM hmm;
+---+--------------------------+---+
| a | b | c |
+---+--------------------------+---+
| 1 | hmm4,hmm5,hmm6,hmm7,hmm8 | 7 |
+---+--------------------------+---+
Let's break apart the INSERT statement, if column 'a' produces a duplicate ON DUPLICATE KEY UPDATE will issue an IF statement
FIND_IN_SET will return the position of the found string. The logic of IF say if FIND_IN_SET is is greater then 0 i.e. the string was found, update column b with b itself (do nothing) else concat the column with the column contents and the new string separated by a comma. Additionally in all cases increment column c by 1.
Why do it this way? Well I removed the need to do a SELECT and provided a solution to store distinct text values instead of the raw value, thus saving space and resources on network transfers etc.
A little quick statement that might be useful for others.
3 comments:
Why are you sacrificing normalization that way? Wouldn't it be more appropriate to use a second table which contains all the strings for a given entry in the master table?
sure, why not. This is an example of how to store unique strings in a text field.
Personally I'm not a fan of normalization, since I can get faster lookups when I don't normalize in many cases.
This is a bad solution to what isn't really a problem; If your normalization is slower than this, then your normalization is wrong.
Offloading the given processing onto the database does not scale well. It also makes indexing on the values, well, not really possible. Placing an index across multiple fields will simplify processing, speed up the search, and is nicely scalable.
In a world where TB drives are commonplace and 10Mb connections are considered slow, there's no reason to do this sort of thing. Unless, of course, you're developing on an embedded platform with a very small footprint. Which then begs the question, why use MySQL in the first place???
Post a Comment