[Maria-discuss] Looking for advice where to start looking at a slow query case
Hey, I have tried to run a local service which claims to be a YouTube alike which is named AVideo(used to be PHPTube). I loaded the service with 220k+ videos from YouTube and have used MariaDB as the DB backend. The service moved slowly as the size of the videos DB growed from 100k to 150k. I got a recommendation from the developer to use MySQL which didn’t made any sense to me. I tracked the issue to the SQL queries that the service runs. Currently I am upgrading my local servers so the DB cannot be downloaded for testing however I have seen a very interesting thing. I will first put the query here: ```sql SELECT u.*, v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND u.status = 'a' AND v.status IN ('a','k','f') ORDER BY likes DESC LIMIT 36, 12; ``` The ANALYZE FORMAT=JSON for that query is: ```json { "query_block": { "select_id": 1, "r_loops": 1, "r_total_time_ms": 9643.807074, "filesort": { "sort_key": "v.likes desc", "r_loops": 1, "r_total_time_ms": 756.223544, "r_limit": 48, "r_used_priority_queue": true, "r_output_rows": 49, "r_sort_mode": "sort_key,rowid", "temporary_table": { "table": { "table_name": "u", "access_type": "ref", "possible_keys": ["PRIMARY", "users_status_IDX"], "key": "users_status_IDX", "key_length": "1", "used_key_parts": ["status"], "ref": ["const"], "r_loops": 1, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.034852389, "r_other_time_ms": 0.020715696, "filtered": 100, "r_filtered": 100, "index_condition": "u.`status` = 'a'" }, "table": { "table_name": "v", "access_type": "ref", "possible_keys": [ "fk_videos_users_idx", "video_status_idx", "videos_status_index" ], "key": "fk_videos_users_idx", "key_length": "4", "used_key_parts": ["users_id"], "ref": ["AVideo.u.id"], "r_loops": 1, "rows": 391299, "r_rows": 990067, "r_table_time_ms": 2691.791742, "r_other_time_ms": 6129.887127, "filtered": 50.00025558, "r_filtered": 100, "attached_condition": "v.`status` in ('a','k','f')" }, "table": { "table_name": "c", "access_type": "eq_ref", "possible_keys": ["PRIMARY"], "key": "PRIMARY", "key_length": "4", "used_key_parts": ["id"], "ref": ["AVideo.v.categories_id"], "r_loops": 990067, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.00546745, "r_other_time_ms": 0.569425897, "filtered": 100, "r_filtered": 100 } } } } } ``` While for the next: ```sql SELECT * FROM (SELECT u.id as uid , u.user as uuser , u.name as uname , u.email as uemail , u.password as upassword , u.created as ucreated , u.modified as umodified , u.isAdmin as uisAdmin , u.status as ustatus , u.photoURL as uphotoURL , u.lastLogin as ulastLogin , u.recoverPass as urecoverPass , u.backgroundURL as ubackgroundURL , u.canStream as ucanStream , u.canUpload as ucanUpload , u.canCreateMeet as ucanCreateMeet , u.canViewChart as ucanViewChart , u.about as uabout , u.channelName as uchannelName , u.emailVerified as uemailVerified , u.analyticsCode as uanalyticsCode , u.externalOptions as uexternalOptions , u.first_name as ufirst_name , u.last_name as ulast_name , u.address as uaddress , u.zip_code as uzip_code , u.country as ucountry , u.region as uregion , u.city as ucity , u.donationLink as udonationLink , u.extra_info as uextra_info , u.phone as uphone , u.is_company as uis_company , v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND v.status IN ('a','k','f') ORDER BY likes DESC ) a WHERE a.ustatus IN ('a') LIMIT 1000, 900; ``` The ANALYZE FORMAT=JSON is the next: ```json { "query_block": { "select_id": 1, "r_loops": 1, "r_total_time_ms": 8.047324959, "table": { "table_name": "u", "access_type": "ref", "possible_keys": ["PRIMARY", "users_status_IDX"], "key": "users_status_IDX", "key_length": "1", "used_key_parts": ["status"], "ref": ["const"], "r_loops": 1, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.045761237, "r_other_time_ms": 0.0201165, "filtered": 100, "r_filtered": 100, "index_condition": "u.`status` = 'a'" }, "table": { "table_name": "v", "access_type": "ref", "possible_keys": [ "fk_videos_users_idx", "video_status_idx", "videos_status_index" ], "key": "fk_videos_users_idx", "key_length": "4", "used_key_parts": ["users_id"], "ref": ["AVideo.u.id"], "r_loops": 1, "rows": 391299, "r_rows": 1900, "r_table_time_ms": 6.939235389, "r_other_time_ms": 1.010973728, "filtered": 50.00025558, "r_filtered": 100, "attached_condition": "v.`status` in ('a','k','f')" }, "table": { "table_name": "c", "access_type": "eq_ref", "possible_keys": ["PRIMARY"], "key": "PRIMARY", "key_length": "4", "used_key_parts": ["id"], "ref": ["AVideo.v.categories_id"], "r_loops": 1900, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.007669714, "r_other_time_ms": 0.001959112, "filtered": 100, "r_filtered": 100 } } } ``` And what I am trying to understand is, what's the difference? Why would the same exact query will result with this issue? Just to notice that the next query will not run at all due to fields naming conflict between u and v: ```sql SELECT * FROM (SELECT u.* , v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND v.status IN ('a','k','f') ORDER BY likes DESC ) a WHERE a.ustatus IN ('a') LIMIT 1000, 900; ``` I am not new to the DB world that much but it's the first time I am getting my hands dirty about such an issue. I will share the full DB later on today so it could be used as a learning material. Also, are there any big public databases that can be used for learning and testing purposes? Thanks, Eliezer ---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/
It isn't the same query. Sub-selects tend to execute by fully constructing the data set, putting it into a temporary table, and then selecting from that for the wrapping query. That means you get no benefit from early stopping, and often the optimizer doesn't figure out the appropriate indexing so your temporary table ends up with a full table scan. If you care about performance, avoid sub-selects by re-writing the query without it. If sub-selects are unavoidable, use an appropriately indexed materialized view (not a regular view, a regular view is just a sub-select in disguise in the vast majority of cases) that you can either refresh periodically via a scheduled event, or if necessary in real-time using triggers on the underlying tables. On Wed, Jul 13, 2022 at 9:25 AM <ngtech1ltd@gmail.com> wrote:
The full DB can be downloaded from: https://www.ngtech.co.il/static/AVideo.sql.gz
Thanks for any advice. Eliezer
---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/
-----Original Message----- From: ngtech1ltd@gmail.com <ngtech1ltd@gmail.com> Sent: Wednesday, 13 July 2022 8:46 To: 'maria-discuss@lists.launchpad.net' <maria-discuss@lists.launchpad.net> Subject: Looking for advice where to start looking at a slow query case
Hey,
I have tried to run a local service which claims to be a YouTube alike which is named AVideo(used to be PHPTube). I loaded the service with 220k+ videos from YouTube and have used MariaDB as the DB backend. The service moved slowly as the size of the videos DB growed from 100k to 150k. I got a recommendation from the developer to use MySQL which didn’t made any sense to me. I tracked the issue to the SQL queries that the service runs. Currently I am upgrading my local servers so the DB cannot be downloaded for testing however I have seen a very interesting thing. I will first put the query here: ```sql SELECT u.*, v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND u.status = 'a' AND v.status IN ('a','k','f') ORDER BY likes DESC LIMIT 36, 12; ```
The ANALYZE FORMAT=JSON for that query is: ```json { "query_block": { "select_id": 1, "r_loops": 1, "r_total_time_ms": 9643.807074, "filesort": { "sort_key": "v.likes desc", "r_loops": 1, "r_total_time_ms": 756.223544, "r_limit": 48, "r_used_priority_queue": true, "r_output_rows": 49, "r_sort_mode": "sort_key,rowid", "temporary_table": { "table": { "table_name": "u", "access_type": "ref", "possible_keys": ["PRIMARY", "users_status_IDX"], "key": "users_status_IDX", "key_length": "1", "used_key_parts": ["status"], "ref": ["const"], "r_loops": 1, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.034852389, "r_other_time_ms": 0.020715696, "filtered": 100, "r_filtered": 100, "index_condition": "u.`status` = 'a'" }, "table": { "table_name": "v", "access_type": "ref", "possible_keys": [ "fk_videos_users_idx", "video_status_idx", "videos_status_index" ], "key": "fk_videos_users_idx", "key_length": "4", "used_key_parts": ["users_id"], "ref": ["AVideo.u.id"], "r_loops": 1, "rows": 391299, "r_rows": 990067, "r_table_time_ms": 2691.791742, "r_other_time_ms": 6129.887127, "filtered": 50.00025558, "r_filtered": 100, "attached_condition": "v.`status` in ('a','k','f')" }, "table": { "table_name": "c", "access_type": "eq_ref", "possible_keys": ["PRIMARY"], "key": "PRIMARY", "key_length": "4", "used_key_parts": ["id"], "ref": ["AVideo.v.categories_id"], "r_loops": 990067, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.00546745, "r_other_time_ms": 0.569425897, "filtered": 100, "r_filtered": 100 } } } } } ``` While for the next: ```sql SELECT * FROM (SELECT u.id as uid , u.user as uuser , u.name as uname , u.email as uemail , u.password as upassword , u.created as ucreated , u.modified as umodified , u.isAdmin as uisAdmin , u.status as ustatus , u.photoURL as uphotoURL , u.lastLogin as ulastLogin , u.recoverPass as urecoverPass , u.backgroundURL as ubackgroundURL , u.canStream as ucanStream , u.canUpload as ucanUpload , u.canCreateMeet as ucanCreateMeet , u.canViewChart as ucanViewChart , u.about as uabout , u.channelName as uchannelName , u.emailVerified as uemailVerified , u.analyticsCode as uanalyticsCode , u.externalOptions as uexternalOptions , u.first_name as ufirst_name , u.last_name as ulast_name , u.address as uaddress , u.zip_code as uzip_code , u.country as ucountry , u.region as uregion , u.city as ucity , u.donationLink as udonationLink , u.extra_info as uextra_info , u.phone as uphone , u.is_company as uis_company , v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND v.status IN ('a','k','f') ORDER BY likes DESC ) a WHERE a.ustatus IN ('a') LIMIT 1000, 900; ```
The ANALYZE FORMAT=JSON is the next: ```json { "query_block": { "select_id": 1, "r_loops": 1, "r_total_time_ms": 8.047324959, "table": { "table_name": "u", "access_type": "ref", "possible_keys": ["PRIMARY", "users_status_IDX"], "key": "users_status_IDX", "key_length": "1", "used_key_parts": ["status"], "ref": ["const"], "r_loops": 1, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.045761237, "r_other_time_ms": 0.0201165, "filtered": 100, "r_filtered": 100, "index_condition": "u.`status` = 'a'" }, "table": { "table_name": "v", "access_type": "ref", "possible_keys": [ "fk_videos_users_idx", "video_status_idx", "videos_status_index" ], "key": "fk_videos_users_idx", "key_length": "4", "used_key_parts": ["users_id"], "ref": ["AVideo.u.id"], "r_loops": 1, "rows": 391299, "r_rows": 1900, "r_table_time_ms": 6.939235389, "r_other_time_ms": 1.010973728, "filtered": 50.00025558, "r_filtered": 100, "attached_condition": "v.`status` in ('a','k','f')" }, "table": { "table_name": "c", "access_type": "eq_ref", "possible_keys": ["PRIMARY"], "key": "PRIMARY", "key_length": "4", "used_key_parts": ["id"], "ref": ["AVideo.v.categories_id"], "r_loops": 1900, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.007669714, "r_other_time_ms": 0.001959112, "filtered": 100, "r_filtered": 100 } } } ```
And what I am trying to understand is, what's the difference? Why would the same exact query will result with this issue? Just to notice that the next query will not run at all due to fields naming conflict between u and v: ```sql SELECT * FROM (SELECT u.* , v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND v.status IN ('a','k','f') ORDER BY likes DESC ) a WHERE a.ustatus IN ('a') LIMIT 1000, 900; ```
I am not new to the DB world that much but it's the first time I am getting my hands dirty about such an issue.
I will share the full DB later on today so it could be used as a learning material. Also, are there any big public databases that can be used for learning and testing purposes?
Thanks, Eliezer
---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
Sorry, I didn't got why what works faster is faster. Also when I am removing the Order by from the original query it runs faster. There is index to the relevant fields. The big table is the videos so I assumed that the db engine would be smart enough to first run the search in an optimized fashion and then compare to the other tables by the id. also why the difference between the same exact query when the u.* is used it's slow but when I am using the AS to define each and every field renaming it would run faster. It doesn't make any sense to me as a programmer. If I would have written the code to do this query it would be in such a way that it will run in couple ms... I have started re-learning my DBA materials with hope to JOIN my sysadmin, programming and dba skills to make sense in the future what doesn't make sense now. Thanks! בתאריך יום ד׳, 13 ביולי 2022, 9:45, מאת Gordan Bobic < gordan.bobic@gmail.com>:
It isn't the same query. Sub-selects tend to execute by fully constructing the data set, putting it into a temporary table, and then selecting from that for the wrapping query. That means you get no benefit from early stopping, and often the optimizer doesn't figure out the appropriate indexing so your temporary table ends up with a full table scan. If you care about performance, avoid sub-selects by re-writing the query without it. If sub-selects are unavoidable, use an appropriately indexed materialized view (not a regular view, a regular view is just a sub-select in disguise in the vast majority of cases) that you can either refresh periodically via a scheduled event, or if necessary in real-time using triggers on the underlying tables.
On Wed, Jul 13, 2022 at 9:25 AM <ngtech1ltd@gmail.com> wrote:
The full DB can be downloaded from: https://www.ngtech.co.il/static/AVideo.sql.gz
Thanks for any advice. Eliezer
---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/
-----Original Message----- From: ngtech1ltd@gmail.com <ngtech1ltd@gmail.com> Sent: Wednesday, 13 July 2022 8:46 To: 'maria-discuss@lists.launchpad.net' <
Subject: Looking for advice where to start looking at a slow query case
Hey,
I have tried to run a local service which claims to be a YouTube alike which is named AVideo(used to be PHPTube). I loaded the service with 220k+ videos from YouTube and have used MariaDB as the DB backend. The service moved slowly as the size of the videos DB growed from 100k to 150k. I got a recommendation from the developer to use MySQL which didn’t made any sense to me. I tracked the issue to the SQL queries that the service runs. Currently I am upgrading my local servers so the DB cannot be downloaded for testing however I have seen a very interesting thing. I will first put the query here: ```sql SELECT u.*, v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND u.status = 'a' AND v.status IN ('a','k','f') ORDER BY likes DESC LIMIT 36, 12; ```
The ANALYZE FORMAT=JSON for that query is: ```json { "query_block": { "select_id": 1, "r_loops": 1, "r_total_time_ms": 9643.807074, "filesort": { "sort_key": "v.likes desc", "r_loops": 1, "r_total_time_ms": 756.223544, "r_limit": 48, "r_used_priority_queue": true, "r_output_rows": 49, "r_sort_mode": "sort_key,rowid", "temporary_table": { "table": { "table_name": "u", "access_type": "ref", "possible_keys": ["PRIMARY", "users_status_IDX"], "key": "users_status_IDX", "key_length": "1", "used_key_parts": ["status"], "ref": ["const"], "r_loops": 1, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.034852389, "r_other_time_ms": 0.020715696, "filtered": 100, "r_filtered": 100, "index_condition": "u.`status` = 'a'" }, "table": { "table_name": "v", "access_type": "ref", "possible_keys": [ "fk_videos_users_idx", "video_status_idx", "videos_status_index" ], "key": "fk_videos_users_idx", "key_length": "4", "used_key_parts": ["users_id"], "ref": ["AVideo.u.id"], "r_loops": 1, "rows": 391299, "r_rows": 990067, "r_table_time_ms": 2691.791742, "r_other_time_ms": 6129.887127, "filtered": 50.00025558, "r_filtered": 100, "attached_condition": "v.`status` in ('a','k','f')" }, "table": { "table_name": "c", "access_type": "eq_ref", "possible_keys": ["PRIMARY"], "key": "PRIMARY", "key_length": "4", "used_key_parts": ["id"], "ref": ["AVideo.v.categories_id"], "r_loops": 990067, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.00546745, "r_other_time_ms": 0.569425897, "filtered": 100, "r_filtered": 100 } } } } } ``` While for the next: ```sql SELECT * FROM (SELECT u.id as uid , u.user as uuser , u.name as uname , u.email as uemail , u.password as upassword , u.created as ucreated , u.modified as umodified , u.isAdmin as uisAdmin , u.status as ustatus , u.photoURL as uphotoURL , u.lastLogin as ulastLogin , u.recoverPass as urecoverPass , u.backgroundURL as ubackgroundURL , u.canStream as ucanStream , u.canUpload as ucanUpload , u.canCreateMeet as ucanCreateMeet , u.canViewChart as ucanViewChart , u.about as uabout , u.channelName as uchannelName , u.emailVerified as uemailVerified , u.analyticsCode as uanalyticsCode , u.externalOptions as uexternalOptions , u.first_name as ufirst_name , u.last_name as ulast_name , u.address as uaddress , u.zip_code as uzip_code , u.country as ucountry , u.region as uregion , u.city as ucity , u.donationLink as udonationLink , u.extra_info as uextra_info , u.phone as uphone , u.is_company as uis_company , v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND v.status IN ('a','k','f') ORDER BY likes DESC ) a WHERE a.ustatus IN ('a') LIMIT 1000, 900; ```
The ANALYZE FORMAT=JSON is the next: ```json { "query_block": { "select_id": 1, "r_loops": 1, "r_total_time_ms": 8.047324959, "table": { "table_name": "u", "access_type": "ref", "possible_keys": ["PRIMARY", "users_status_IDX"], "key": "users_status_IDX", "key_length": "1", "used_key_parts": ["status"], "ref": ["const"], "r_loops": 1, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.045761237, "r_other_time_ms": 0.0201165, "filtered": 100, "r_filtered": 100, "index_condition": "u.`status` = 'a'" }, "table": { "table_name": "v", "access_type": "ref", "possible_keys": [ "fk_videos_users_idx", "video_status_idx", "videos_status_index" ], "key": "fk_videos_users_idx", "key_length": "4", "used_key_parts": ["users_id"], "ref": ["AVideo.u.id"], "r_loops": 1, "rows": 391299, "r_rows": 1900, "r_table_time_ms": 6.939235389, "r_other_time_ms": 1.010973728, "filtered": 50.00025558, "r_filtered": 100, "attached_condition": "v.`status` in ('a','k','f')" }, "table": { "table_name": "c", "access_type": "eq_ref", "possible_keys": ["PRIMARY"], "key": "PRIMARY", "key_length": "4", "used_key_parts": ["id"], "ref": ["AVideo.v.categories_id"], "r_loops": 1900, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.007669714, "r_other_time_ms": 0.001959112, "filtered": 100, "r_filtered": 100 } } } ```
And what I am trying to understand is, what's the difference? Why would
maria-discuss@lists.launchpad.net> the same exact query will result with this issue?
Just to notice that the next query will not run at all due to fields naming conflict between u and v: ```sql SELECT * FROM (SELECT u.* , v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND v.status IN ('a','k','f') ORDER BY likes DESC ) a WHERE a.ustatus IN ('a') LIMIT 1000, 900; ```
I am not new to the DB world that much but it's the first time I am getting my hands dirty about such an issue.
I will share the full DB later on today so it could be used as a learning material. Also, are there any big public databases that can be used for learning and testing purposes?
Thanks, Eliezer
---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
Just had a look with the data set you provided, this looks like a classical case of query optimizer getting it wrong. In your first query, do this modification: s/SELECT/SELECT STRAIGHT_JOIN/ and the execution time goes from seconds to milliseconds. The optimizer wasn't smart enough to figure out that the optimal plan is to start with the videos table and take advantage of early stopping with limit. It seems reasonably reproducible so you might want to file an optimizer bug on MariaDB jira. On Wed, Jul 13, 2022 at 9:59 AM NgTech LTD <ngtech1ltd@gmail.com> wrote:
Sorry, I didn't got why what works faster is faster. Also when I am removing the Order by from the original query it runs faster. There is index to the relevant fields. The big table is the videos so I assumed that the db engine would be smart enough to first run the search in an optimized fashion and then compare to the other tables by the id.
also why the difference between the same exact query when the u.* is used it's slow but when I am using the AS to define each and every field renaming it would run faster. It doesn't make any sense to me as a programmer. If I would have written the code to do this query it would be in such a way that it will run in couple ms...
I have started re-learning my DBA materials with hope to JOIN my sysadmin, programming and dba skills to make sense in the future what doesn't make sense now.
Thanks!
בתאריך יום ד׳, 13 ביולי 2022, 9:45, מאת Gordan Bobic <gordan.bobic@gmail.com>:
It isn't the same query. Sub-selects tend to execute by fully constructing the data set, putting it into a temporary table, and then selecting from that for the wrapping query. That means you get no benefit from early stopping, and often the optimizer doesn't figure out the appropriate indexing so your temporary table ends up with a full table scan. If you care about performance, avoid sub-selects by re-writing the query without it. If sub-selects are unavoidable, use an appropriately indexed materialized view (not a regular view, a regular view is just a sub-select in disguise in the vast majority of cases) that you can either refresh periodically via a scheduled event, or if necessary in real-time using triggers on the underlying tables.
On Wed, Jul 13, 2022 at 9:25 AM <ngtech1ltd@gmail.com> wrote:
The full DB can be downloaded from: https://www.ngtech.co.il/static/AVideo.sql.gz
Thanks for any advice. Eliezer
---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/
-----Original Message----- From: ngtech1ltd@gmail.com <ngtech1ltd@gmail.com> Sent: Wednesday, 13 July 2022 8:46 To: 'maria-discuss@lists.launchpad.net' <maria-discuss@lists.launchpad.net> Subject: Looking for advice where to start looking at a slow query case
Hey,
I have tried to run a local service which claims to be a YouTube alike which is named AVideo(used to be PHPTube). I loaded the service with 220k+ videos from YouTube and have used MariaDB as the DB backend. The service moved slowly as the size of the videos DB growed from 100k to 150k. I got a recommendation from the developer to use MySQL which didn’t made any sense to me. I tracked the issue to the SQL queries that the service runs. Currently I am upgrading my local servers so the DB cannot be downloaded for testing however I have seen a very interesting thing. I will first put the query here: ```sql SELECT u.*, v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND u.status = 'a' AND v.status IN ('a','k','f') ORDER BY likes DESC LIMIT 36, 12; ```
The ANALYZE FORMAT=JSON for that query is: ```json { "query_block": { "select_id": 1, "r_loops": 1, "r_total_time_ms": 9643.807074, "filesort": { "sort_key": "v.likes desc", "r_loops": 1, "r_total_time_ms": 756.223544, "r_limit": 48, "r_used_priority_queue": true, "r_output_rows": 49, "r_sort_mode": "sort_key,rowid", "temporary_table": { "table": { "table_name": "u", "access_type": "ref", "possible_keys": ["PRIMARY", "users_status_IDX"], "key": "users_status_IDX", "key_length": "1", "used_key_parts": ["status"], "ref": ["const"], "r_loops": 1, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.034852389, "r_other_time_ms": 0.020715696, "filtered": 100, "r_filtered": 100, "index_condition": "u.`status` = 'a'" }, "table": { "table_name": "v", "access_type": "ref", "possible_keys": [ "fk_videos_users_idx", "video_status_idx", "videos_status_index" ], "key": "fk_videos_users_idx", "key_length": "4", "used_key_parts": ["users_id"], "ref": ["AVideo.u.id"], "r_loops": 1, "rows": 391299, "r_rows": 990067, "r_table_time_ms": 2691.791742, "r_other_time_ms": 6129.887127, "filtered": 50.00025558, "r_filtered": 100, "attached_condition": "v.`status` in ('a','k','f')" }, "table": { "table_name": "c", "access_type": "eq_ref", "possible_keys": ["PRIMARY"], "key": "PRIMARY", "key_length": "4", "used_key_parts": ["id"], "ref": ["AVideo.v.categories_id"], "r_loops": 990067, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.00546745, "r_other_time_ms": 0.569425897, "filtered": 100, "r_filtered": 100 } } } } } ``` While for the next: ```sql SELECT * FROM (SELECT u.id as uid , u.user as uuser , u.name as uname , u.email as uemail , u.password as upassword , u.created as ucreated , u.modified as umodified , u.isAdmin as uisAdmin , u.status as ustatus , u.photoURL as uphotoURL , u.lastLogin as ulastLogin , u.recoverPass as urecoverPass , u.backgroundURL as ubackgroundURL , u.canStream as ucanStream , u.canUpload as ucanUpload , u.canCreateMeet as ucanCreateMeet , u.canViewChart as ucanViewChart , u.about as uabout , u.channelName as uchannelName , u.emailVerified as uemailVerified , u.analyticsCode as uanalyticsCode , u.externalOptions as uexternalOptions , u.first_name as ufirst_name , u.last_name as ulast_name , u.address as uaddress , u.zip_code as uzip_code , u.country as ucountry , u.region as uregion , u.city as ucity , u.donationLink as udonationLink , u.extra_info as uextra_info , u.phone as uphone , u.is_company as uis_company , v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND v.status IN ('a','k','f') ORDER BY likes DESC ) a WHERE a.ustatus IN ('a') LIMIT 1000, 900; ```
The ANALYZE FORMAT=JSON is the next: ```json { "query_block": { "select_id": 1, "r_loops": 1, "r_total_time_ms": 8.047324959, "table": { "table_name": "u", "access_type": "ref", "possible_keys": ["PRIMARY", "users_status_IDX"], "key": "users_status_IDX", "key_length": "1", "used_key_parts": ["status"], "ref": ["const"], "r_loops": 1, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.045761237, "r_other_time_ms": 0.0201165, "filtered": 100, "r_filtered": 100, "index_condition": "u.`status` = 'a'" }, "table": { "table_name": "v", "access_type": "ref", "possible_keys": [ "fk_videos_users_idx", "video_status_idx", "videos_status_index" ], "key": "fk_videos_users_idx", "key_length": "4", "used_key_parts": ["users_id"], "ref": ["AVideo.u.id"], "r_loops": 1, "rows": 391299, "r_rows": 1900, "r_table_time_ms": 6.939235389, "r_other_time_ms": 1.010973728, "filtered": 50.00025558, "r_filtered": 100, "attached_condition": "v.`status` in ('a','k','f')" }, "table": { "table_name": "c", "access_type": "eq_ref", "possible_keys": ["PRIMARY"], "key": "PRIMARY", "key_length": "4", "used_key_parts": ["id"], "ref": ["AVideo.v.categories_id"], "r_loops": 1900, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.007669714, "r_other_time_ms": 0.001959112, "filtered": 100, "r_filtered": 100 } } } ```
And what I am trying to understand is, what's the difference? Why would the same exact query will result with this issue? Just to notice that the next query will not run at all due to fields naming conflict between u and v: ```sql SELECT * FROM (SELECT u.* , v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND v.status IN ('a','k','f') ORDER BY likes DESC ) a WHERE a.ustatus IN ('a') LIMIT 1000, 900; ```
I am not new to the DB world that much but it's the first time I am getting my hands dirty about such an issue.
I will share the full DB later on today so it could be used as a learning material. Also, are there any big public databases that can be used for learning and testing purposes?
Thanks, Eliezer
---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
Ok, thanks. בתאריך יום ד׳, 13 ביולי 2022, 10:42, מאת Gordan Bobic < gordan.bobic@gmail.com>:
Just had a look with the data set you provided, this looks like a classical case of query optimizer getting it wrong.
In your first query, do this modification: s/SELECT/SELECT STRAIGHT_JOIN/ and the execution time goes from seconds to milliseconds. The optimizer wasn't smart enough to figure out that the optimal plan is to start with the videos table and take advantage of early stopping with limit.
It seems reasonably reproducible so you might want to file an optimizer bug on MariaDB jira.
On Wed, Jul 13, 2022 at 9:59 AM NgTech LTD <ngtech1ltd@gmail.com> wrote:
Sorry, I didn't got why what works faster is faster. Also when I am removing the Order by from the original query it runs
faster.
There is index to the relevant fields. The big table is the videos so I assumed that the db engine would be smart enough to first run the search in an optimized fashion and then compare to the other tables by the id.
also why the difference between the same exact query when the u.* is used it's slow but when I am using the AS to define each and every field renaming it would run faster. It doesn't make any sense to me as a programmer. If I would have written the code to do this query it would be in such a way that it will run in couple ms...
I have started re-learning my DBA materials with hope to JOIN my sysadmin, programming and dba skills to make sense in the future what doesn't make sense now.
Thanks!
בתאריך יום ד׳, 13 ביולי 2022, 9:45, מאת Gordan Bobic < gordan.bobic@gmail.com>:
It isn't the same query. Sub-selects tend to execute by fully constructing the data set, putting it into a temporary table, and then selecting from that for the wrapping query. That means you get no benefit from early stopping, and often the optimizer doesn't figure out the appropriate indexing so your temporary table ends up with a full table scan. If you care about performance, avoid sub-selects by re-writing the query without it. If sub-selects are unavoidable, use an appropriately indexed materialized view (not a regular view, a regular view is just a sub-select in disguise in the vast majority of cases) that you can either refresh periodically via a scheduled event, or if necessary in real-time using triggers on the underlying tables.
On Wed, Jul 13, 2022 at 9:25 AM <ngtech1ltd@gmail.com> wrote:
The full DB can be downloaded from: https://www.ngtech.co.il/static/AVideo.sql.gz
Thanks for any advice. Eliezer
---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/
-----Original Message----- From: ngtech1ltd@gmail.com <ngtech1ltd@gmail.com> Sent: Wednesday, 13 July 2022 8:46 To: 'maria-discuss@lists.launchpad.net' <
maria-discuss@lists.launchpad.net>
Subject: Looking for advice where to start looking at a slow query case
Hey,
I have tried to run a local service which claims to be a YouTube alike which is named AVideo(used to be PHPTube). I loaded the service with 220k+ videos from YouTube and have used MariaDB as the DB backend. The service moved slowly as the size of the videos DB growed from 100k to 150k. I got a recommendation from the developer to use MySQL which didn’t made any sense to me. I tracked the issue to the SQL queries that the service runs. Currently I am upgrading my local servers so the DB cannot be downloaded for testing however I have seen a very interesting thing. I will first put the query here: ```sql SELECT u.*, v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND u.status = 'a' AND v.status IN ('a','k','f') ORDER BY likes DESC LIMIT 36, 12; ```
The ANALYZE FORMAT=JSON for that query is: ```json { "query_block": { "select_id": 1, "r_loops": 1, "r_total_time_ms": 9643.807074, "filesort": { "sort_key": "v.likes desc", "r_loops": 1, "r_total_time_ms": 756.223544, "r_limit": 48, "r_used_priority_queue": true, "r_output_rows": 49, "r_sort_mode": "sort_key,rowid", "temporary_table": { "table": { "table_name": "u", "access_type": "ref", "possible_keys": ["PRIMARY", "users_status_IDX"], "key": "users_status_IDX", "key_length": "1", "used_key_parts": ["status"], "ref": ["const"], "r_loops": 1, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.034852389, "r_other_time_ms": 0.020715696, "filtered": 100, "r_filtered": 100, "index_condition": "u.`status` = 'a'" }, "table": { "table_name": "v", "access_type": "ref", "possible_keys": [ "fk_videos_users_idx", "video_status_idx", "videos_status_index" ], "key": "fk_videos_users_idx", "key_length": "4", "used_key_parts": ["users_id"], "ref": ["AVideo.u.id"], "r_loops": 1, "rows": 391299, "r_rows": 990067, "r_table_time_ms": 2691.791742, "r_other_time_ms": 6129.887127, "filtered": 50.00025558, "r_filtered": 100, "attached_condition": "v.`status` in ('a','k','f')" }, "table": { "table_name": "c", "access_type": "eq_ref", "possible_keys": ["PRIMARY"], "key": "PRIMARY", "key_length": "4", "used_key_parts": ["id"], "ref": ["AVideo.v.categories_id"], "r_loops": 990067, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.00546745, "r_other_time_ms": 0.569425897, "filtered": 100, "r_filtered": 100 } } } } } ``` While for the next: ```sql SELECT * FROM (SELECT u.id as uid , u.user as uuser , u.name as uname , u.email as uemail , u.password as upassword , u.created as ucreated , u.modified as umodified , u.isAdmin as uisAdmin , u.status as ustatus , u.photoURL as uphotoURL , u.lastLogin as ulastLogin , u.recoverPass as urecoverPass , u.backgroundURL as ubackgroundURL , u.canStream as ucanStream , u.canUpload as ucanUpload , u.canCreateMeet as ucanCreateMeet , u.canViewChart as ucanViewChart , u.about as uabout , u.channelName as uchannelName , u.emailVerified as uemailVerified , u.analyticsCode as uanalyticsCode , u.externalOptions as uexternalOptions , u.first_name as ufirst_name , u.last_name as ulast_name , u.address as uaddress , u.zip_code as uzip_code , u.country as ucountry , u.region as uregion , u.city as ucity , u.donationLink as udonationLink , u.extra_info as uextra_info , u.phone as uphone , u.is_company as uis_company , v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND v.status IN ('a','k','f') ORDER BY likes DESC ) a WHERE a.ustatus IN ('a') LIMIT 1000, 900; ```
The ANALYZE FORMAT=JSON is the next: ```json { "query_block": { "select_id": 1, "r_loops": 1, "r_total_time_ms": 8.047324959, "table": { "table_name": "u", "access_type": "ref", "possible_keys": ["PRIMARY", "users_status_IDX"], "key": "users_status_IDX", "key_length": "1", "used_key_parts": ["status"], "ref": ["const"], "r_loops": 1, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.045761237, "r_other_time_ms": 0.0201165, "filtered": 100, "r_filtered": 100, "index_condition": "u.`status` = 'a'" }, "table": { "table_name": "v", "access_type": "ref", "possible_keys": [ "fk_videos_users_idx", "video_status_idx", "videos_status_index" ], "key": "fk_videos_users_idx", "key_length": "4", "used_key_parts": ["users_id"], "ref": ["AVideo.u.id"], "r_loops": 1, "rows": 391299, "r_rows": 1900, "r_table_time_ms": 6.939235389, "r_other_time_ms": 1.010973728, "filtered": 50.00025558, "r_filtered": 100, "attached_condition": "v.`status` in ('a','k','f')" }, "table": { "table_name": "c", "access_type": "eq_ref", "possible_keys": ["PRIMARY"], "key": "PRIMARY", "key_length": "4", "used_key_parts": ["id"], "ref": ["AVideo.v.categories_id"], "r_loops": 1900, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.007669714, "r_other_time_ms": 0.001959112, "filtered": 100, "r_filtered": 100 } } } ```
And what I am trying to understand is, what's the difference? Why would the same exact query will result with this issue? Just to notice that the next query will not run at all due to fields naming conflict between u and v: ```sql SELECT * FROM (SELECT u.* , v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND v.status IN ('a','k','f') ORDER BY likes DESC ) a WHERE a.ustatus IN ('a') LIMIT 1000, 900; ```
I am not new to the DB world that much but it's the first time I am getting my hands dirty about such an issue.
I will share the full DB later on today so it could be used as a learning material. Also, are there any big public databases that can be used for learning and testing purposes?
Thanks, Eliezer
---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
Thanks!! Just to point out that this exact same issue exist on MySQL. I haven't tried yet with MSSQL and Oracle DB yet but from my experience MSSQL will probably figure it our right enough. ---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/ -----Original Message----- From: Gordan Bobic <gordan.bobic@gmail.com> Sent: Wednesday, 13 July 2022 10:42 To: NgTech LTD <ngtech1ltd@gmail.com> Cc: Mailing-List mariadb <maria-discuss@lists.launchpad.net> Subject: Re: [Maria-discuss] Looking for advice where to start looking at a slow query case Just had a look with the data set you provided, this looks like a classical case of query optimizer getting it wrong. In your first query, do this modification: s/SELECT/SELECT STRAIGHT_JOIN/ and the execution time goes from seconds to milliseconds. The optimizer wasn't smart enough to figure out that the optimal plan is to start with the videos table and take advantage of early stopping with limit. It seems reasonably reproducible so you might want to file an optimizer bug on MariaDB jira. On Wed, Jul 13, 2022 at 9:59 AM NgTech LTD <ngtech1ltd@gmail.com> wrote:
Sorry, I didn't got why what works faster is faster. Also when I am removing the Order by from the original query it runs faster. There is index to the relevant fields. The big table is the videos so I assumed that the db engine would be smart enough to first run the search in an optimized fashion and then compare to the other tables by the id.
also why the difference between the same exact query when the u.* is used it's slow but when I am using the AS to define each and every field renaming it would run faster. It doesn't make any sense to me as a programmer. If I would have written the code to do this query it would be in such a way that it will run in couple ms...
I have started re-learning my DBA materials with hope to JOIN my sysadmin, programming and dba skills to make sense in the future what doesn't make sense now.
Thanks!
בתאריך יום ד׳, 13 ביולי 2022, 9:45, מאת Gordan Bobic <gordan.bobic@gmail.com>:
It isn't the same query. Sub-selects tend to execute by fully constructing the data set, putting it into a temporary table, and then selecting from that for the wrapping query. That means you get no benefit from early stopping, and often the optimizer doesn't figure out the appropriate indexing so your temporary table ends up with a full table scan. If you care about performance, avoid sub-selects by re-writing the query without it. If sub-selects are unavoidable, use an appropriately indexed materialized view (not a regular view, a regular view is just a sub-select in disguise in the vast majority of cases) that you can either refresh periodically via a scheduled event, or if necessary in real-time using triggers on the underlying tables.
On Wed, Jul 13, 2022 at 9:25 AM <ngtech1ltd@gmail.com> wrote:
The full DB can be downloaded from: https://www.ngtech.co.il/static/AVideo.sql.gz
Thanks for any advice. Eliezer
---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/
-----Original Message----- From: ngtech1ltd@gmail.com <ngtech1ltd@gmail.com> Sent: Wednesday, 13 July 2022 8:46 To: 'maria-discuss@lists.launchpad.net' <maria-discuss@lists.launchpad.net> Subject: Looking for advice where to start looking at a slow query case
Hey,
I have tried to run a local service which claims to be a YouTube alike which is named AVideo(used to be PHPTube). I loaded the service with 220k+ videos from YouTube and have used MariaDB as the DB backend. The service moved slowly as the size of the videos DB growed from 100k to 150k. I got a recommendation from the developer to use MySQL which didn’t made any sense to me. I tracked the issue to the SQL queries that the service runs. Currently I am upgrading my local servers so the DB cannot be downloaded for testing however I have seen a very interesting thing. I will first put the query here: ```sql SELECT u.*, v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND u.status = 'a' AND v.status IN ('a','k','f') ORDER BY likes DESC LIMIT 36, 12; ```
The ANALYZE FORMAT=JSON for that query is: ```json { "query_block": { "select_id": 1, "r_loops": 1, "r_total_time_ms": 9643.807074, "filesort": { "sort_key": "v.likes desc", "r_loops": 1, "r_total_time_ms": 756.223544, "r_limit": 48, "r_used_priority_queue": true, "r_output_rows": 49, "r_sort_mode": "sort_key,rowid", "temporary_table": { "table": { "table_name": "u", "access_type": "ref", "possible_keys": ["PRIMARY", "users_status_IDX"], "key": "users_status_IDX", "key_length": "1", "used_key_parts": ["status"], "ref": ["const"], "r_loops": 1, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.034852389, "r_other_time_ms": 0.020715696, "filtered": 100, "r_filtered": 100, "index_condition": "u.`status` = 'a'" }, "table": { "table_name": "v", "access_type": "ref", "possible_keys": [ "fk_videos_users_idx", "video_status_idx", "videos_status_index" ], "key": "fk_videos_users_idx", "key_length": "4", "used_key_parts": ["users_id"], "ref": ["AVideo.u.id"], "r_loops": 1, "rows": 391299, "r_rows": 990067, "r_table_time_ms": 2691.791742, "r_other_time_ms": 6129.887127, "filtered": 50.00025558, "r_filtered": 100, "attached_condition": "v.`status` in ('a','k','f')" }, "table": { "table_name": "c", "access_type": "eq_ref", "possible_keys": ["PRIMARY"], "key": "PRIMARY", "key_length": "4", "used_key_parts": ["id"], "ref": ["AVideo.v.categories_id"], "r_loops": 990067, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.00546745, "r_other_time_ms": 0.569425897, "filtered": 100, "r_filtered": 100 } } } } } ``` While for the next: ```sql SELECT * FROM (SELECT u.id as uid , u.user as uuser , u.name as uname , u.email as uemail , u.password as upassword , u.created as ucreated , u.modified as umodified , u.isAdmin as uisAdmin , u.status as ustatus , u.photoURL as uphotoURL , u.lastLogin as ulastLogin , u.recoverPass as urecoverPass , u.backgroundURL as ubackgroundURL , u.canStream as ucanStream , u.canUpload as ucanUpload , u.canCreateMeet as ucanCreateMeet , u.canViewChart as ucanViewChart , u.about as uabout , u.channelName as uchannelName , u.emailVerified as uemailVerified , u.analyticsCode as uanalyticsCode , u.externalOptions as uexternalOptions , u.first_name as ufirst_name , u.last_name as ulast_name , u.address as uaddress , u.zip_code as uzip_code , u.country as ucountry , u.region as uregion , u.city as ucity , u.donationLink as udonationLink , u.extra_info as uextra_info , u.phone as uphone , u.is_company as uis_company , v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND v.status IN ('a','k','f') ORDER BY likes DESC ) a WHERE a.ustatus IN ('a') LIMIT 1000, 900; ```
The ANALYZE FORMAT=JSON is the next: ```json { "query_block": { "select_id": 1, "r_loops": 1, "r_total_time_ms": 8.047324959, "table": { "table_name": "u", "access_type": "ref", "possible_keys": ["PRIMARY", "users_status_IDX"], "key": "users_status_IDX", "key_length": "1", "used_key_parts": ["status"], "ref": ["const"], "r_loops": 1, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.045761237, "r_other_time_ms": 0.0201165, "filtered": 100, "r_filtered": 100, "index_condition": "u.`status` = 'a'" }, "table": { "table_name": "v", "access_type": "ref", "possible_keys": [ "fk_videos_users_idx", "video_status_idx", "videos_status_index" ], "key": "fk_videos_users_idx", "key_length": "4", "used_key_parts": ["users_id"], "ref": ["AVideo.u.id"], "r_loops": 1, "rows": 391299, "r_rows": 1900, "r_table_time_ms": 6.939235389, "r_other_time_ms": 1.010973728, "filtered": 50.00025558, "r_filtered": 100, "attached_condition": "v.`status` in ('a','k','f')" }, "table": { "table_name": "c", "access_type": "eq_ref", "possible_keys": ["PRIMARY"], "key": "PRIMARY", "key_length": "4", "used_key_parts": ["id"], "ref": ["AVideo.v.categories_id"], "r_loops": 1900, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.007669714, "r_other_time_ms": 0.001959112, "filtered": 100, "r_filtered": 100 } } } ```
And what I am trying to understand is, what's the difference? Why would the same exact query will result with this issue? Just to notice that the next query will not run at all due to fields naming conflict between u and v: ```sql SELECT * FROM (SELECT u.* , v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND v.status IN ('a','k','f') ORDER BY likes DESC ) a WHERE a.ustatus IN ('a') LIMIT 1000, 900; ```
I am not new to the DB world that much but it's the first time I am getting my hands dirty about such an issue.
I will share the full DB later on today so it could be used as a learning material. Also, are there any big public databases that can be used for learning and testing purposes?
Thanks, Eliezer
---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
MariaDB and MySQL share the same roots, so it is unsurprising their optimizers are similar. IMO MySQL optimizer started to get worse since 5.7, and MariaDB's since 10.5. All databases get it wrong some time, I've yet to work on a database system that didn't at some time or another come up with a bad execution plan for a query. On Wed, Jul 13, 2022 at 12:38 PM <ngtech1ltd@gmail.com> wrote:
Thanks!!
Just to point out that this exact same issue exist on MySQL. I haven't tried yet with MSSQL and Oracle DB yet but from my experience MSSQL will probably figure it our right enough.
---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/
-----Original Message----- From: Gordan Bobic <gordan.bobic@gmail.com> Sent: Wednesday, 13 July 2022 10:42 To: NgTech LTD <ngtech1ltd@gmail.com> Cc: Mailing-List mariadb <maria-discuss@lists.launchpad.net> Subject: Re: [Maria-discuss] Looking for advice where to start looking at a slow query case
Just had a look with the data set you provided, this looks like a classical case of query optimizer getting it wrong.
In your first query, do this modification: s/SELECT/SELECT STRAIGHT_JOIN/ and the execution time goes from seconds to milliseconds. The optimizer wasn't smart enough to figure out that the optimal plan is to start with the videos table and take advantage of early stopping with limit.
It seems reasonably reproducible so you might want to file an optimizer bug on MariaDB jira.
On Wed, Jul 13, 2022 at 9:59 AM NgTech LTD <ngtech1ltd@gmail.com> wrote:
Sorry, I didn't got why what works faster is faster. Also when I am removing the Order by from the original query it runs faster. There is index to the relevant fields. The big table is the videos so I assumed that the db engine would be smart enough to first run the search in an optimized fashion and then compare to the other tables by the id.
also why the difference between the same exact query when the u.* is used it's slow but when I am using the AS to define each and every field renaming it would run faster. It doesn't make any sense to me as a programmer. If I would have written the code to do this query it would be in such a way that it will run in couple ms...
I have started re-learning my DBA materials with hope to JOIN my sysadmin, programming and dba skills to make sense in the future what doesn't make sense now.
Thanks!
בתאריך יום ד׳, 13 ביולי 2022, 9:45, מאת Gordan Bobic <gordan.bobic@gmail.com>:
It isn't the same query. Sub-selects tend to execute by fully constructing the data set, putting it into a temporary table, and then selecting from that for the wrapping query. That means you get no benefit from early stopping, and often the optimizer doesn't figure out the appropriate indexing so your temporary table ends up with a full table scan. If you care about performance, avoid sub-selects by re-writing the query without it. If sub-selects are unavoidable, use an appropriately indexed materialized view (not a regular view, a regular view is just a sub-select in disguise in the vast majority of cases) that you can either refresh periodically via a scheduled event, or if necessary in real-time using triggers on the underlying tables.
On Wed, Jul 13, 2022 at 9:25 AM <ngtech1ltd@gmail.com> wrote:
The full DB can be downloaded from: https://www.ngtech.co.il/static/AVideo.sql.gz
Thanks for any advice. Eliezer
---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/
-----Original Message----- From: ngtech1ltd@gmail.com <ngtech1ltd@gmail.com> Sent: Wednesday, 13 July 2022 8:46 To: 'maria-discuss@lists.launchpad.net' <maria-discuss@lists.launchpad.net> Subject: Looking for advice where to start looking at a slow query case
Hey,
I have tried to run a local service which claims to be a YouTube alike which is named AVideo(used to be PHPTube). I loaded the service with 220k+ videos from YouTube and have used MariaDB as the DB backend. The service moved slowly as the size of the videos DB growed from 100k to 150k. I got a recommendation from the developer to use MySQL which didn’t made any sense to me. I tracked the issue to the SQL queries that the service runs. Currently I am upgrading my local servers so the DB cannot be downloaded for testing however I have seen a very interesting thing. I will first put the query here: ```sql SELECT u.*, v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND u.status = 'a' AND v.status IN ('a','k','f') ORDER BY likes DESC LIMIT 36, 12; ```
The ANALYZE FORMAT=JSON for that query is: ```json { "query_block": { "select_id": 1, "r_loops": 1, "r_total_time_ms": 9643.807074, "filesort": { "sort_key": "v.likes desc", "r_loops": 1, "r_total_time_ms": 756.223544, "r_limit": 48, "r_used_priority_queue": true, "r_output_rows": 49, "r_sort_mode": "sort_key,rowid", "temporary_table": { "table": { "table_name": "u", "access_type": "ref", "possible_keys": ["PRIMARY", "users_status_IDX"], "key": "users_status_IDX", "key_length": "1", "used_key_parts": ["status"], "ref": ["const"], "r_loops": 1, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.034852389, "r_other_time_ms": 0.020715696, "filtered": 100, "r_filtered": 100, "index_condition": "u.`status` = 'a'" }, "table": { "table_name": "v", "access_type": "ref", "possible_keys": [ "fk_videos_users_idx", "video_status_idx", "videos_status_index" ], "key": "fk_videos_users_idx", "key_length": "4", "used_key_parts": ["users_id"], "ref": ["AVideo.u.id"], "r_loops": 1, "rows": 391299, "r_rows": 990067, "r_table_time_ms": 2691.791742, "r_other_time_ms": 6129.887127, "filtered": 50.00025558, "r_filtered": 100, "attached_condition": "v.`status` in ('a','k','f')" }, "table": { "table_name": "c", "access_type": "eq_ref", "possible_keys": ["PRIMARY"], "key": "PRIMARY", "key_length": "4", "used_key_parts": ["id"], "ref": ["AVideo.v.categories_id"], "r_loops": 990067, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.00546745, "r_other_time_ms": 0.569425897, "filtered": 100, "r_filtered": 100 } } } } } ``` While for the next: ```sql SELECT * FROM (SELECT u.id as uid , u.user as uuser , u.name as uname , u.email as uemail , u.password as upassword , u.created as ucreated , u.modified as umodified , u.isAdmin as uisAdmin , u.status as ustatus , u.photoURL as uphotoURL , u.lastLogin as ulastLogin , u.recoverPass as urecoverPass , u.backgroundURL as ubackgroundURL , u.canStream as ucanStream , u.canUpload as ucanUpload , u.canCreateMeet as ucanCreateMeet , u.canViewChart as ucanViewChart , u.about as uabout , u.channelName as uchannelName , u.emailVerified as uemailVerified , u.analyticsCode as uanalyticsCode , u.externalOptions as uexternalOptions , u.first_name as ufirst_name , u.last_name as ulast_name , u.address as uaddress , u.zip_code as uzip_code , u.country as ucountry , u.region as uregion , u.city as ucity , u.donationLink as udonationLink , u.extra_info as uextra_info , u.phone as uphone , u.is_company as uis_company , v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND v.status IN ('a','k','f') ORDER BY likes DESC ) a WHERE a.ustatus IN ('a') LIMIT 1000, 900; ```
The ANALYZE FORMAT=JSON is the next: ```json { "query_block": { "select_id": 1, "r_loops": 1, "r_total_time_ms": 8.047324959, "table": { "table_name": "u", "access_type": "ref", "possible_keys": ["PRIMARY", "users_status_IDX"], "key": "users_status_IDX", "key_length": "1", "used_key_parts": ["status"], "ref": ["const"], "r_loops": 1, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.045761237, "r_other_time_ms": 0.0201165, "filtered": 100, "r_filtered": 100, "index_condition": "u.`status` = 'a'" }, "table": { "table_name": "v", "access_type": "ref", "possible_keys": [ "fk_videos_users_idx", "video_status_idx", "videos_status_index" ], "key": "fk_videos_users_idx", "key_length": "4", "used_key_parts": ["users_id"], "ref": ["AVideo.u.id"], "r_loops": 1, "rows": 391299, "r_rows": 1900, "r_table_time_ms": 6.939235389, "r_other_time_ms": 1.010973728, "filtered": 50.00025558, "r_filtered": 100, "attached_condition": "v.`status` in ('a','k','f')" }, "table": { "table_name": "c", "access_type": "eq_ref", "possible_keys": ["PRIMARY"], "key": "PRIMARY", "key_length": "4", "used_key_parts": ["id"], "ref": ["AVideo.v.categories_id"], "r_loops": 1900, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.007669714, "r_other_time_ms": 0.001959112, "filtered": 100, "r_filtered": 100 } } } ```
And what I am trying to understand is, what's the difference? Why would the same exact query will result with this issue? Just to notice that the next query will not run at all due to fields naming conflict between u and v: ```sql SELECT * FROM (SELECT u.* , v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND v.status IN ('a','k','f') ORDER BY likes DESC ) a WHERE a.ustatus IN ('a') LIMIT 1000, 900; ```
I am not new to the DB world that much but it's the first time I am getting my hands dirty about such an issue.
I will share the full DB later on today so it could be used as a learning material. Also, are there any big public databases that can be used for learning and testing purposes?
Thanks, Eliezer
---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
Hey Gordon, Before I am filing a jira bug report I would like to verify how the same queries are running on other DB ie: * MSSQL * PostgreSQL * Oracle DB Are there any recommended tools to migrate the data from MariaDB to PostgreSQL or MSSQL? Specifically MSSQL has a migration tool but I have yet to use it so I am looking for whatever recommendations. Any MariaDB to PostgreSQL transferring tool known to anyone? I can try to re-create the scenario just with the relevant tables and it would be pretty simple to do eventually but time consuming if there is any automated tools for that. Any Recommendation is more then welcome! Thanks, Eliezer ---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/ -----Original Message----- From: Gordan Bobic <gordan.bobic@gmail.com> Sent: Wednesday, 13 July 2022 12:46 To: Eliezer Croitoru <ngtech1ltd@gmail.com> Cc: Mailing-List mariadb <maria-discuss@lists.launchpad.net> Subject: Re: [Maria-discuss] Looking for advice where to start looking at a slow query case MariaDB and MySQL share the same roots, so it is unsurprising their optimizers are similar. IMO MySQL optimizer started to get worse since 5.7, and MariaDB's since 10.5. All databases get it wrong some time, I've yet to work on a database system that didn't at some time or another come up with a bad execution plan for a query. On Wed, Jul 13, 2022 at 12:38 PM <ngtech1ltd@gmail.com> wrote:
Thanks!!
Just to point out that this exact same issue exist on MySQL. I haven't tried yet with MSSQL and Oracle DB yet but from my experience MSSQL will probably figure it our right enough.
---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/
-----Original Message----- From: Gordan Bobic <gordan.bobic@gmail.com> Sent: Wednesday, 13 July 2022 10:42 To: NgTech LTD <ngtech1ltd@gmail.com> Cc: Mailing-List mariadb <maria-discuss@lists.launchpad.net> Subject: Re: [Maria-discuss] Looking for advice where to start looking at a slow query case
Just had a look with the data set you provided, this looks like a classical case of query optimizer getting it wrong.
In your first query, do this modification: s/SELECT/SELECT STRAIGHT_JOIN/ and the execution time goes from seconds to milliseconds. The optimizer wasn't smart enough to figure out that the optimal plan is to start with the videos table and take advantage of early stopping with limit.
It seems reasonably reproducible so you might want to file an optimizer bug on MariaDB jira.
On Wed, Jul 13, 2022 at 9:59 AM NgTech LTD <ngtech1ltd@gmail.com> wrote:
Sorry, I didn't got why what works faster is faster. Also when I am removing the Order by from the original query it runs faster. There is index to the relevant fields. The big table is the videos so I assumed that the db engine would be smart enough to first run the search in an optimized fashion and then compare to the other tables by the id.
also why the difference between the same exact query when the u.* is used it's slow but when I am using the AS to define each and every field renaming it would run faster. It doesn't make any sense to me as a programmer. If I would have written the code to do this query it would be in such a way that it will run in couple ms...
I have started re-learning my DBA materials with hope to JOIN my sysadmin, programming and dba skills to make sense in the future what doesn't make sense now.
Thanks!
בתאריך יום ד׳, 13 ביולי 2022, 9:45, מאת Gordan Bobic <gordan.bobic@gmail.com>:
It isn't the same query. Sub-selects tend to execute by fully constructing the data set, putting it into a temporary table, and then selecting from that for the wrapping query. That means you get no benefit from early stopping, and often the optimizer doesn't figure out the appropriate indexing so your temporary table ends up with a full table scan. If you care about performance, avoid sub-selects by re-writing the query without it. If sub-selects are unavoidable, use an appropriately indexed materialized view (not a regular view, a regular view is just a sub-select in disguise in the vast majority of cases) that you can either refresh periodically via a scheduled event, or if necessary in real-time using triggers on the underlying tables.
On Wed, Jul 13, 2022 at 9:25 AM <ngtech1ltd@gmail.com> wrote:
The full DB can be downloaded from: https://www.ngtech.co.il/static/AVideo.sql.gz
Thanks for any advice. Eliezer
---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/
-----Original Message----- From: ngtech1ltd@gmail.com <ngtech1ltd@gmail.com> Sent: Wednesday, 13 July 2022 8:46 To: 'maria-discuss@lists.launchpad.net' <maria-discuss@lists.launchpad.net> Subject: Looking for advice where to start looking at a slow query case
Hey,
I have tried to run a local service which claims to be a YouTube alike which is named AVideo(used to be PHPTube). I loaded the service with 220k+ videos from YouTube and have used MariaDB as the DB backend. The service moved slowly as the size of the videos DB growed from 100k to 150k. I got a recommendation from the developer to use MySQL which didn’t made any sense to me. I tracked the issue to the SQL queries that the service runs. Currently I am upgrading my local servers so the DB cannot be downloaded for testing however I have seen a very interesting thing. I will first put the query here: ```sql SELECT u.*, v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND u.status = 'a' AND v.status IN ('a','k','f') ORDER BY likes DESC LIMIT 36, 12; ```
The ANALYZE FORMAT=JSON for that query is: ```json { "query_block": { "select_id": 1, "r_loops": 1, "r_total_time_ms": 9643.807074, "filesort": { "sort_key": "v.likes desc", "r_loops": 1, "r_total_time_ms": 756.223544, "r_limit": 48, "r_used_priority_queue": true, "r_output_rows": 49, "r_sort_mode": "sort_key,rowid", "temporary_table": { "table": { "table_name": "u", "access_type": "ref", "possible_keys": ["PRIMARY", "users_status_IDX"], "key": "users_status_IDX", "key_length": "1", "used_key_parts": ["status"], "ref": ["const"], "r_loops": 1, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.034852389, "r_other_time_ms": 0.020715696, "filtered": 100, "r_filtered": 100, "index_condition": "u.`status` = 'a'" }, "table": { "table_name": "v", "access_type": "ref", "possible_keys": [ "fk_videos_users_idx", "video_status_idx", "videos_status_index" ], "key": "fk_videos_users_idx", "key_length": "4", "used_key_parts": ["users_id"], "ref": ["AVideo.u.id"], "r_loops": 1, "rows": 391299, "r_rows": 990067, "r_table_time_ms": 2691.791742, "r_other_time_ms": 6129.887127, "filtered": 50.00025558, "r_filtered": 100, "attached_condition": "v.`status` in ('a','k','f')" }, "table": { "table_name": "c", "access_type": "eq_ref", "possible_keys": ["PRIMARY"], "key": "PRIMARY", "key_length": "4", "used_key_parts": ["id"], "ref": ["AVideo.v.categories_id"], "r_loops": 990067, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.00546745, "r_other_time_ms": 0.569425897, "filtered": 100, "r_filtered": 100 } } } } } ``` While for the next: ```sql SELECT * FROM (SELECT u.id as uid , u.user as uuser , u.name as uname , u.email as uemail , u.password as upassword , u.created as ucreated , u.modified as umodified , u.isAdmin as uisAdmin , u.status as ustatus , u.photoURL as uphotoURL , u.lastLogin as ulastLogin , u.recoverPass as urecoverPass , u.backgroundURL as ubackgroundURL , u.canStream as ucanStream , u.canUpload as ucanUpload , u.canCreateMeet as ucanCreateMeet , u.canViewChart as ucanViewChart , u.about as uabout , u.channelName as uchannelName , u.emailVerified as uemailVerified , u.analyticsCode as uanalyticsCode , u.externalOptions as uexternalOptions , u.first_name as ufirst_name , u.last_name as ulast_name , u.address as uaddress , u.zip_code as uzip_code , u.country as ucountry , u.region as uregion , u.city as ucity , u.donationLink as udonationLink , u.extra_info as uextra_info , u.phone as uphone , u.is_company as uis_company , v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND v.status IN ('a','k','f') ORDER BY likes DESC ) a WHERE a.ustatus IN ('a') LIMIT 1000, 900; ```
The ANALYZE FORMAT=JSON is the next: ```json { "query_block": { "select_id": 1, "r_loops": 1, "r_total_time_ms": 8.047324959, "table": { "table_name": "u", "access_type": "ref", "possible_keys": ["PRIMARY", "users_status_IDX"], "key": "users_status_IDX", "key_length": "1", "used_key_parts": ["status"], "ref": ["const"], "r_loops": 1, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.045761237, "r_other_time_ms": 0.0201165, "filtered": 100, "r_filtered": 100, "index_condition": "u.`status` = 'a'" }, "table": { "table_name": "v", "access_type": "ref", "possible_keys": [ "fk_videos_users_idx", "video_status_idx", "videos_status_index" ], "key": "fk_videos_users_idx", "key_length": "4", "used_key_parts": ["users_id"], "ref": ["AVideo.u.id"], "r_loops": 1, "rows": 391299, "r_rows": 1900, "r_table_time_ms": 6.939235389, "r_other_time_ms": 1.010973728, "filtered": 50.00025558, "r_filtered": 100, "attached_condition": "v.`status` in ('a','k','f')" }, "table": { "table_name": "c", "access_type": "eq_ref", "possible_keys": ["PRIMARY"], "key": "PRIMARY", "key_length": "4", "used_key_parts": ["id"], "ref": ["AVideo.v.categories_id"], "r_loops": 1900, "rows": 1, "r_rows": 1, "r_table_time_ms": 0.007669714, "r_other_time_ms": 0.001959112, "filtered": 100, "r_filtered": 100 } } } ```
And what I am trying to understand is, what's the difference? Why would the same exact query will result with this issue? Just to notice that the next query will not run at all due to fields naming conflict between u and v: ```sql SELECT * FROM (SELECT u.* , v.*, c.iconClass, c.name as category, c.clean_name as clean_category,c.description as category_description, v.created as videoCreation, v.modified as videoModified FROM videos as v LEFT JOIN categories c ON categories_id = c.id LEFT JOIN users u ON v.users_id = u.id WHERE 2=2 AND v.status IN ('a','k','f') ORDER BY likes DESC ) a WHERE a.ustatus IN ('a') LIMIT 1000, 900; ```
I am not new to the DB world that much but it's the first time I am getting my hands dirty about such an issue.
I will share the full DB later on today so it could be used as a learning material. Also, are there any big public databases that can be used for learning and testing purposes?
Thanks, Eliezer
---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
On Tue, Jul 26, 2022 at 11:04 PM <ngtech1ltd@gmail.com> wrote:
Hey Gordon,
Before I am filing a jira bug report I would like to verify how the same queries are running on other DB ie: * MSSQL * PostgreSQL * Oracle DB
You are probably going to have to ask that question on forums where the primary interest are those respective databases.
Are there any recommended tools to migrate the data from MariaDB to PostgreSQL or MSSQL?
Migrating to completely different databases is not something you can do without a non-trivial amount of effort, unless your application and requirements are pretty trivial. You are going to have to at the very least make changes to your application to address various incompatibilities in the SQL dialects and features.
Specifically MSSQL has a migration tool but I have yet to use it so I am looking for whatever recommendations. Any MariaDB to PostgreSQL transferring tool known to anyone?
In theory, mysqldump has an option for formatting the output so that it is more easy to ingest into other databases. In reality, unless your data and structures are very trivial, you will still have to thump the output to make the ingestion process work.
I can try to re-create the scenario just with the relevant tables and it would be pretty simple to do eventually but time consuming if there is any automated tools for that.
Any Recommendation is more then welcome!
The recommendation I can give you is that if you are looking at migrating to a completely different database just to avoid using an index hint, you probably haven't thought through the complexities of migrating between databases.
Hey Gordan, Got you.. I actually have a script that can translate the relevant tables for the benchmark test In mind but have yet to sit on it. I really do not have any intention of replacing the DB but merely try to understand if this similar issue in the query optimizer present in most of the popular DB systems. I do assume that Oracle DB do not have an issue since I have seen it does a much more complex queries on a very large data sets pretty easily, however there is a difference between what I believe to what I know when I am trying to state that there is a bug in any DB what so ever and whatever the depth of this bug. Since MySQL and MaraiDB have the same roots and both have this exact same optimization bug I just wanted to verify. Thanks, Eliezer * I will try to update here on whatever steps I will take in the near future ---- Eliezer Croitoru NgTech, Tech Support Mobile: +972-5-28704261 Email: ngtech1ltd@gmail.com Web: https://ngtech.co.il/ My-Tube: https://tube.ngtech.co.il/ -----Original Message----- From: Gordan Bobic <gordan.bobic@gmail.com> Sent: Tuesday, 26 July 2022 23:16 To: Eliezer Croitoru <ngtech1ltd@gmail.com> Cc: Mailing-List mariadb <maria-discuss@lists.launchpad.net> Subject: Re: [Maria-discuss] Looking for advice where to start looking at a slow query case On Tue, Jul 26, 2022 at 11:04 PM <ngtech1ltd@gmail.com> wrote:
Hey Gordon,
Before I am filing a jira bug report I would like to verify how the same queries are running on other DB ie: * MSSQL * PostgreSQL * Oracle DB
You are probably going to have to ask that question on forums where the primary interest are those respective databases.
Are there any recommended tools to migrate the data from MariaDB to PostgreSQL or MSSQL?
Migrating to completely different databases is not something you can do without a non-trivial amount of effort, unless your application and requirements are pretty trivial. You are going to have to at the very least make changes to your application to address various incompatibilities in the SQL dialects and features.
Specifically MSSQL has a migration tool but I have yet to use it so I am looking for whatever recommendations. Any MariaDB to PostgreSQL transferring tool known to anyone?
In theory, mysqldump has an option for formatting the output so that it is more easy to ingest into other databases. In reality, unless your data and structures are very trivial, you will still have to thump the output to make the ingestion process work.
I can try to re-create the scenario just with the relevant tables and it would be pretty simple to do eventually but time consuming if there is any automated tools for that.
Any Recommendation is more then welcome!
The recommendation I can give you is that if you are looking at migrating to a completely different database just to avoid using an index hint, you probably haven't thought through the complexities of migrating between databases.
On Wed, 27 Jul 2022, 00:11 , <ngtech1ltd@gmail.com> wrote:
Hey Gordan,
Got you..
I actually have a script that can translate the relevant tables for the benchmark test In mind but have yet to sit on it. I really do not have any intention of replacing the DB but merely try to understand if this similar issue in the query optimizer present in most of the popular DB systems. I do assume that Oracle DB do not have an issue since I have seen it does a much more complex queries on a very large data sets pretty easily, however there is a difference between what I believe to what I know when I am trying to state that there is a bug in any DB what so ever and whatever the depth of this bug.
On an extensive enough test - you will find optimizer anomalies like this in all RDBMS-es. They won't be in the same place (same combination of queries and data), but there is a reason why most of them have index hinting features (and in those that don't, it is a common feature request). No optimizer is ever perfect, it's all based in heuristics to make educated guesses.
participants (3)
-
Gordan Bobic
-
NgTech LTD
-
ngtech1ltd@gmail.com