Hello, and thank you very much.

Actually I thought to that one already.

But I have reasons to think this doesn't have a lot of chances to be more efficient than the “LIKE” method against TINYTEXT.

How could I expect MariaDB to use any index with JSON_TABLE()?

Regards,

Gingko
De : Markus Mäkelä via discuss [mailto:discuss@lists.mariadb.org]
Envoyé : samedi 14 septembre 2024 à 9:19 AM
Pour : discuss@lists.mariadb.org
Objet : [MariaDB discuss] Re: Joining two tables on a list of values

Hi,

You could use JSON_TABLE() if you store the values as JSON. The nice thing about JSON is that MariaDB already has a lot of functions to work with it.

The syntax for converting each list in a row into its own row is IMO a bit tricky to get right the first time but it seems to work. Here’s an example that I tested it with:

CREATE TABLE main_table(id INT PRIMARY KEY NOT NULL, to_join INT NOT NULL);
CREATE TABLE joined_table(id INT PRIMARY KEY NOT NULL, int_join JSON NOT NULL);
INSERT INTO main_table VALUES (1, 1), (2, 2), (3, 3), (4, 4), (5, 5);
INSERT INTO joined_table VALUES (123, '[2,4]'), (456, '[3,5]'), (789, '[1, 5, 3]');


-- This one splits the join table’s values into individual rows
SELECT jt.*, js_t.*
  FROM joined_table AS jt
  CROSS JOIN JSON_TABLE(
    jt.int_join, '$[*]' COLUMNS(
      id INT PATH '$'
    )
  ) js_t;


-- This one joins it to the main table
SELECT mt.id AS main_table_id, jt.id AS joined_table_id, js_t.id AS joined_on
  FROM joined_table AS jt
  CROSS JOIN JSON_TABLE(
    jt.int_join, '$[*]' COLUMNS(
      id INT PATH '$'
    )
  ) js_t
  JOIN main_table mt ON (js_t.id = mt.to_join)
  ORDER BY jt.id;

The SQLFiddle for it can be found here: https://sqlfiddle.com/mariadb/online-compiler?id=5f6d5772-815f-48a4-977c-9903a00d683c

Markus Mäkelä

On 9/13/24 17:09, Gingko via discuss wrote:

Hello,

I would like to know if is there some (efficient!) way to join two tables on a list of values
contained in (likely a single column) of the joined table, whatever form could have
this list of values ?

For example, I have a main table containing:

main_table:
+------------+------------------+------+-----+------------+----------------+
| Field      | Type             | Null | Key | Default    | Extra          |
+------------+------------------+------+-----+------------+----------------+
| index      | int(11)          | NO   | PRI | 0          |                |
| to_join    | int(11)          | NO   |     | 0          |                |
| <other fields, irrelevant>    |      |     |            |                |
+------------+------------------+------+-----+------------+----------------+

… and a secondary table containing:

joined_table:
+------------+------------------+------+-----+------------+----------------+
| Field      | Type             | Null | Key | Default    | Extra          |
+------------+------------------+------+-----+------------+----------------+
| index      | int(11)          | NO   | PRI |            | auto_increment |
| int_list   | <list_of_ints>   | YES  |     | (empty)    |                |
| <other fields, irrelevant>    |      |     |            |                |
+------------+------------------+------+-----+------------+----------------+


We could suppose, here, as a simple test case, that main_table has five rows
where to_join contains respectively 1, 2, 3, 4, and 5,
and the joined_table has a single row with the single column int_list combining 2 and 4
in any form (but of course, this list could actually have any length).

I’d like to join these tables using a statement like:

SELECT
   <anything>
FROM main_table AS M
LEFT JOIN
joined_table AS J
   ON M.to_join <some_operator to> J.int_list;

… in such a way that main_table would be joined to joined_table using any value
contained in int_list, and the statement is expected to give two rows as a result.

I tried to set int_list as a TINYTEXT containing "2,4" and use:

SELECT
   <anything>
FROM main_table AS M
LEFT JOIN joined_table AS J
   ON M.to_join IN (J.int_list);

… but this doesn’t work, of course, my comma separated list is seen as a whole single value.

I also tried, still using TINYTEXT, the list as "[2][4]" and use:

SELECT
   <anything>
FROM main_table AS M
LEFT JOIN joined_table AS J
   ON J.int_list LIKE CONCAT('%[', M.to_join , ']%');

This works (more or less), but it is excruciatingly slow, especially knowing that main_table
can actually contain millions of rows, and joined_table several thousands.

Is there a better solution for doing that?

As the final goal implies, somewhere later, associating to_join in main_table, via  int_list,
to the index auto increment primary key of the joined_table, using multiple rows there
is not an option.

Regards,

Gingko
_______________________________________________

discuss mailing list -- discuss@lists.mariadb.org
To unsubscribe send an email to discuss-leave@lists.mariadb.org
-- 
Markus Mäkelä, Senior Software Engineer
MariaDB Corporation

_______________________________________________
discuss mailing list -- discuss@lists.mariadb.org
To unsubscribe send an email to discuss-leave@lists.mariadb.org