Yes, I accept.

On Mon, Jan 23, 2012 at 4:12 AM, Timour Katchaounov <timour@montyprogram.com> wrote:
Mark,
The semantics are good for us. The potential overhead from the implementation isn't a concern for our uses. What was the feedback from the MPAB code review?
I haven't submitted the code for review because there are a few more small items to complete. However, the current solution was discussed with Monty and Igor, and we all agreed that this is the best solution.
Speaking of overhead: if the LIMIT_ROWS_EXAMINED clause is not present, the overhead is one additional IF statement in handler::increment_statistics():
  inline void handler::increment_statistics(ulong SSV::*offset) const
  {
    status_var_increment(table->in_use->status_var.*offset);
+   if (table->in_use->examined_rows_limit_cnt < HA_POS_ERROR)
+     table->in_use->check_examined_rows_limit();
  }
If the LIMIT_ROWS_EXAMINED clause is present, the overhead is this inline method:
  /**
    Check if the number of rows accessed by a statement exceeded
    LIMIT_ROWS_EXAMINED. If so, stop execution.
  */
  void check_examined_rows_limit()
  {
    if (++status_var.accessed_rows_and_keys > examined_rows_limit_cnt)
    {
      push_warning_printf(this, MYSQL_ERROR::WARN_LEVEL_WARN,
                          ER_QUERY_EXCEEDED_EXAMINED_ROWS_LIMIT,
                          ER(ER_QUERY_EXCEEDED_EXAMINED_ROWS_LIMIT),
                          status_var.accessed_rows_and_keys);
      killed= ABORT_QUERY;
    }
  }
I believe this overhead for each handler call is negligible. If the overhead turns out to be measurable, we could eliminate it by making handler::increment_statistics a function pointer and selecting the right implementation before execution begins. This can easily be done if needed.
OK, so I will take your reply to mean that you accept this approach, and I will complete the feature ASAP.
Timour
On Fri, Jan 20, 2012 at 7:37 AM, MARK CALLAGHAN<mdcallag@gmail.com> wrote:
OK, will have feedback in a few days.
On Fri, Jan 20, 2012 at 6:45 AM, Timour Katchaounov <timour@montyprogram.com> wrote:
Hello Mark,
I want a variant of the LIMIT clause that limits the number of rows examined during SELECT processing. This should return OK when the limit is reached -- something like NESTED_LOOP_QUERY_LIMIT. While LIMIT can be LIMIT x or LIMIT x,y, this only supports one argument -- LIMIT_ROWS_EXAMINED x. Are you interested in implementing this feature for me? We will sponsor the work. The purpose of the feature is to prevent queries from taking too long when there isn't a great index and many rows might be filtered.
After examining a few alternatives, I implemented a pretty complete prototype of LIMIT_ROWS_EXAMINED. Please have a look at the description of the task here: https://mariadb.atlassian.net/browse/MDEV-28
It is important to note that the current implementation stops execution when LIMIT_ROWS_EXAMINED is reached irrespective of the query execution strategy used (thus not only for JOINs, but also for writing/reading temp tables, sorting, and single-table access methods that scan too many rows). Because the counting is done at the handler level, this approach is also future-proof with respect to adding new query processing algorithms.
The current approach should also make it fairly easy to backport the feature to almost any MariaDB/MySQL version.
Please let me know if this is what you want. If so, then I will proceed with the remaining polishing that is in the TODO list and will get an internal review.
You can also have a look at the small example in the comments below the description to get a feel for the behavior.
I can also send you a patch of the current version in case you want to play with the feature before it is pushed.
Timour
-- Mark Callaghan mdcallag@gmail.com