Concatenate Two Fields And Get A Smaller Answer !?!?

Had a “strange” problem last week that really stumped me for longer than it should have.

My colleague had run a query to list some data and was getting approximately 200,000 rows back. We were investigating why a unique constraint was being violated so I took his query and did a count distinct to try and find the duplicates. I got 30,000 distinct values.

This really confused me, I was expecting only a handfull of duplicates, if any. Then I noticed that the first query already WAS a distinct list !

select distinct col1_id, col2_id from tablea

select count(distinct col1_id + ‘-‘ + col2_id) from tablea

I concatenated the fields together so I am counting a single field, I included the separator so that I could differentiate 1-11 from 11-1.

See if you can spot what the problem is before you read on.

Datatype Conversion !

As an ex-SQL Trainer my mantra always was, and always has been ALWAYS do explicit datatype conversion.

SQLServer tries to derive the datatype from the first field, in this case both col1_id and col2_id are integers so it tries to do an arithmetic operation rather than the character conversion that I was expecting.

How on earth, can it convert ‘-‘ to an integer? ‘-0’ of course ! So what it is doing is col1_id + -0 + col2_id. I proved this with:

select distinct col1_id, col2_id, col1_id + ‘-‘ + col2_id

Which gave me:

10, 3, 7

Of course what I should have done was:

select convert(varchar, col1_id) + ‘-‘ + convert(varchar, col2_id) from tablea

Lesson learned AGAIN ! Always explicitly convert your datatypes !

One thought on “Concatenate Two Fields And Get A Smaller Answer !?!?”

Leave a Reply