Concatenate Two Fields And Get A Smaller Answer !?!?

Had a “strange” problem last week that really stumped me for longer than it should have.

My colleague had run a query to list some data and was getting approximately 200,000 rows back. We were investigating why a unique constraint was being violated so I took his query and did a count distinct to try and find the duplicates. I got 30,000 distinct values.

This really confused me, I was expecting only a handfull of duplicates, if any. Then I noticed that the first query already WAS a distinct list !

select distinct col1_id, col2_id from tablea

select count(distinct col1_id + ‘-‘ + col2_id) from tablea

I concatenated the fields together so I am counting a single field, I included the separator so that I could differentiate 1-11 from 11-1.

See if you can spot what the problem is before you read on.

Datatype Conversion !

As an ex-SQL Trainer my mantra always was, and always has been ALWAYS do explicit datatype conversion.

SQLServer tries to derive the datatype from the first field, in this case both col1_id and col2_id are integers so it tries to do an arithmetic operation rather than the character conversion that I was expecting.

How on earth, can it convert ‘-‘ to an integer? ‘-0’ of course ! So what it is doing is col1_id + -0 + col2_id. I proved this with:

select distinct col1_id, col2_id, col1_id + ‘-‘ + col2_id

Which gave me:

10, 3, 7

Of course what I should have done was:

select convert(varchar, col1_id) + ‘-‘ + convert(varchar, col2_id) from tablea

Lesson learned AGAIN ! Always explicitly convert your datatypes !

One thought on “Concatenate Two Fields And Get A Smaller Answer !?!?”

Leave a Reply

To respond on your own website, enter the URL of your response which should contain a link to this post's permalink URL. Your response will then appear (possibly after moderation) on this page. Want to update or remove your response? Update or delete your post and re-enter your post's URL again. (Learn More)