
Most database monitoring systems, such as Newrelic, Datadog, Dolphie, Ottertune, Releem, Solarwinds, use data from the performance_schema database from the events_statements_summary_by_digest and events_statements_* tables to get query analytics for databases. This information allows to calculate system performance metrics, in particular Latency. Also, the data from these tables allows you to get a complete picture of the database operation - what queries are executed, their execution statistics, to get query templates based on digest_text, and based on SQL_TEXT field from events_statements_* or QUERY_SAMPLE_TEXT from events_statements_summary_by_digest (for MySQL >= 8) to get a sample query for digest and optimize it. The prepared statements analytics are contained in a single table prepared_statements_instances, which is managed by Performance_schema as follows: - Statement preparation If the statement is successfully instrumented, a new row is added to the prepared_statements_instances table. If the statement cannot be instrumented, Performance_schema_prepared_statements_lost status variable is incremented. - Prepared statement execution Execution of a EXECUTE command for an instrumented prepared statement instance updates the corresponding prepared_statements_instances table row. - Prepared statement deallocation Execution of a CLOSE command for an instrumented prepared statement instance removes the corresponding prepared_statements_instances table row. To avoid resource leaks, removal occurs even if the prepared statement instruments described previously are disabled. In this regard, there are the following problems with analytics for prepared statements compared to regular queries: 1) Most applications work with prepared statements using the following method when processing a single query - creating a prepared statement, executing the prepared statement, and then dropping the prepared statement. This leads to the fact that query statistics exist for a very short period of time, during which it may not be possible to fetch them. 2) The second problem of prepared statement analytics in comparison with regular queries is the absence of the query hash (digest) and information about the schema_name in which the prepared statement is executed, which makes it difficult to identify unique queries, since a separate record in the Prepared_Statements_instances table is created for each creation of a prepared statement. 3) Execution of prepared statements is possible in two protocols: Binary and text. Execution of a prepared statement in the binary protocol does not create an entry in the events_statements_* tables, but execution in the text protocol creates an entry in the specified tables, which leads to different behaviors of the same prepared statement but in different protocols. 4) For prepared statements, there is no option in getting an example request without a ? placeholder. The SQL_TEXT field in Prepared_Statements_instances contains The prepared statement text, with ? placeholder markers, and in some cases (when executing in the text protocol) it is possible to find only the execute command of this statement, which further requires mapping ? placeholder markers with parameters from the execute request. But there are limitations that the binary protocol does not store EXECUTE requests for prepared statements, and most of the interactions with the database occur through it, through drivers for different programming languages, and another limitation may be the use of variables in the EXECUTE command, the value of which still needs to be obtained. Proposed solution: 1) Records should not be deleted from the Prepared_Statements_instances table after dropping prepared statements. If the table is full, the counter Performance_schema_prepared_statements_lost and If no row has the statement digest value for the statement that just completed, and the table is full, the statistics for the statement that just completed are added to a special “catch-all” row with DIGEST = NULL, which is created if necessary. 2) Each statement must be unique by the fields schema_name, in which the given statement is executed and its hash, digest, which will be the same for the same prepared statement. 3) In the tables events_statements_* must be recorded information about the fact of PREPARE, EXECUTE and DEALLOCATE execution for prepared statement, where in the DIGEST_TEXT field is stored the call of the execute command, and in SQL_TEXT - the normalized prepared query, in which ? placeholder markers are replaced by concrete values.

Hi everyone, Sorry, I realized I jumped straight into the topic. I’d appreciate hearing what you think about the prepared statements problems I mentioned. Any feedback or ideas would be awesome. Thanks a lot, Dmitrii releem.com
participants (1)
-
Dmitrii Kochetov