Data from multiple rows in 1 column

I was building database tables for a project and came across a problem that I’ve tried to ignore in the past: When I have a main table, a second table, and a table linking the two, and the link-table can link a single record in the main table with multiple records in the second table, doing a join would return multiple rows for a single entry in my main table — a row for each link between the main and second tables. But I wanted to retrieve the data in such a way that only a single row is returned for each row in the main table, which contains all the data from the second table. A mouth-full, I know! Keep reading to see examples as well as the answer I’ve found.

GROUP_CONCAT to the rescue

The GROUP_CONCAT function for MySQL allows you to group multiple entries into a single column with a specified delimiter/separator.

The Query

SELECT main_table.*,
GROUP_CONCAT(second_table.name ORDER BY second_table.name ASC SEPARATOR ', ') AS link_table_column
FROM main_table
INNER JOIN link_table ON link_table.main_id = main_table.id
INNER JOIN second_table ON link_table.second_id = second_table.id
GROUP BY main_table.id

What happens when you run the query is the following:

MySQL would find the multiple entries from the second_table that is linked to the main_table via the link_table and group all the names (the column I selected in the example) into a single column (which I called link_table_column), separating them with a comma.

The results would look something like this:

main_table
column 1
main_table
column 2
main_table
column 3
link_table_column
data data data Second Table Entry 1 Name,
Second Table Entry 2 Name,
etc…

Example

Let’s say we have a list of people and every person plays multiple sports, and this data was stored in a database in 3 tables.

The first table is the list of people:

id Name
1 Foo
2 Bar

The second table is a list of sports:

id Name
1 Rugby
2 Football
3 Tennis
4 Karate
5 Cricket
6 Polo
7 Hockey

The third table is the link-table connecting each person with the sports they play:

id people_id sport_id
1 1 1
2 1 4
3 1 6
4 2 1
5 2 2
6 2 7

Normally, if you run a query like :

SELECT p.name, s.name AS sport FROM people AS p
INNER JOIN link_table AS lt ON lt.people_id = p.id
INNER JOIN sport AS s ON lt.sport_id = s.id

Your results would look as follows:

name sport
Foo Rugby
Foo Karate
Foo Polo
Bar Rugby
Bar Football
Bar Hockey

Now if you use the following query which uses the GROUP_CONCAT function :

SELECT people.name
GROUP_CONCAT(sport.name ORDER BY sport.name ASC SEPARATOR ', ') AS sports
FROM people AS p
INNER JOIN link_table AS lt ON lt.people_id = p.id
INNER JOIN sport AS s ON s.id = lt.sport_id
GROUP BY people.name

Your results would look as follows:

name sport
Foo Karate, Polo, Rugby
Bar Football, Hockey, Rugby

Optimization

The query does everything it needs to, but it’s not optimized. To make it a little faster we create two indexes: One for the link-table that contains both the foreign key ids and one for the field being sorted by in the GROUP_CONCAT.

The generic format is as follows:

CREATE INDEX link ON link_table(link_table.main_id, link_table.second_id);
CREATE INDEX sort_name ON second_table(name);

You should create these indexes before you attempt to use the query to improve the speed.

For the people/sport example above the create script for the indexes would look as follows:

CREATE INDEX link ON link_table(people_id,sport_id);
CREATE INDEX sport_name ON sport(name);

In Conclusion

As you can see this saves you the hassle of manually filtering through the extra rows of results returned by a standard join query. The optimization also makes the query faster helping in your overall project performance.

2 Responses to “Data from multiple rows in 1 column”

  1. Jesus Trujillo says:

    Thank you for share this nice article, I was dealing with a similar query, but now I have other issue because I have other 2 tables. For example, I have these tables: restaurants restaurants_amenities amenities These 3 tables match with your example query. But now I would like to show the categories too. I have 2 tables more: restaurants_categories categories Any help will be very appreciated, Thanks Jesus

  2. Luis says:

    Hello, how should be the query if i want to have the column “sport” in 3 (sport1, sport2, sport3)?

Leave a Reply