MySQL Union is column name and data-type agnostic

I stumbled upon a rare care while working on one of my projects. The data in one of my variables in PHP was being jumbled up. It took an hour of debugging my PHP code till I realized the the issue was in the MySQL query and that MySQL was mixing up the data in the columns for a particular query.

The issue was in the way that MySQL parses executes the queries with UNION. When you do a UNION of two SELECT queries, MySQL just appends the second data-set to the first data set without matching the column names or warning if the data types of corresponding columns are not same. If you mix up the order of the column names, the resulting data you get will be incorrect or at least not what you would be expecting in your code.

Although the official documentation does talk about being data-type agnostic, it doesn’t mention about column name being in same order in all the queries.

Example

Lets take this example of a table.

CREATE TABLE users (
	id SMALLINT NULL,
	first_name VARCHAR(50) NULL,
	last_name VARCHAR(50) NULL,
	points SMALLINT NULL,
	level SMALLINT NULL
);
+------+------------+-----------+--------+-------+
| id   | first_name | last_name | points | level |
+------+------------+-----------+--------+-------+
|    1 | Lalit      | Patel     |      5 |    50 |
|    2 | Chuck      | Norris    |     10 |   100 |
+------+------------+-----------+--------+-------+
2 rows in set (0.00 sec)

The following query runs without any error and it gives the result set given below. Note that it takes the column names in the order they were specified in the first query, but mixes the values up.

(SELECT id, first_name, last_name, level, points FROM users WHERE id = 1)
UNION
(SELECT id, last_name, first_name, points, level FROM users WHERE id = 2);
+------+------------+-----------+-------+--------+
| id   | first_name | last_name | level | points |
+------+------------+-----------+-------+--------+
|    1 | Lalit      | Patel     |    50 |      5 |
|    2 | Norris     | Chuck     |    10 |    100 |
+------+------------+-----------+-------+--------+
2 rows in set (0.00 sec)

It doesn’t even throw an error if the data type of corresponding columns dont match. See the query and the result set below. MySQL doesn’t even check the column data types of the columns. It just appends the result of 2nd query below the result of the first.

(SELECT id, first_name, last_name, level, points FROM users WHERE id = 1)
UNION
(SELECT id, points, level, first_name, last_name FROM users WHERE id = 2);
+------+------------+-----------+-------+--------+
| id   | first_name | last_name | level | points |
+------+------------+-----------+-------+--------+
|    1 | Lalit      | Patel     | 50    | 5      |
|    2 | 10         | 100       | Chuck | Norris |
+------+------------+-----------+-------+--------+
2 rows in set (0.00 sec)

MySQL will return the result as long as the number of columns are same in both the queries. A bit too much forgiving.

• • •

4 Responses to MySQL Union is column name and data-type agnostic

  1. Susam Pal says:

    It makes sense since we can insert a value of type SMALLINT into a column of type VARCHAR and vice versa. MySQL allows implicit conversions between various data types.

    Most implementations of SQL require the corresponding columns in a UNION statement to be of compatible types only. When the types are different but compatible, type conversions are carried out.

    • Lalit says:

      Yes, ignoring data-type and typecasting between columns makes sense and is also documented in the manual.
      But ignoring column names and matching columns just by order of columns is bit unexpected and more often then not will catch the developer off guard.

      • Susam Pal says:

        Actually, ignoring column names makes more sense to me. It is possible for two different tables to use two different column names for similar data and you might want to use UNION on them. So, it becomes very necessary to ignore column names.

        I would say that ignoring the data type of columns could be a bit surprising at first.

      • Lalit says:

        MySQL provides the AS syntax to change column names, which I was using the the query where I got stuck.

        So I think both these features will surprise the developer although both of them makes sense if you give it a thought but you wouldn’t be expecting them.