Re: [Maria-developers] eb9fab7: MDEV-9531: GROUP_CONCAT with ORDER BY inside takes a lot of memory while it's executed
Hi, Sanja! On May 04, Olaksandr Byslkin wrote: still haven't fixed the typo? :)
revision-id: eb9fab76fb4322a7821e6502e6303ed381624429 (mariadb-10.1.13-18-geb9fab7) parent(s): 732adec0a4c75d99389230feeb0deca0ad668de7 committer: Oleksandr Byelkin timestamp: 2016-05-04 19:15:31 +0200 message:
MDEV-9531: GROUP_CONCAT with ORDER BY inside takes a lot of memory while it's executed
Limitation added to Red-Black tree.
diff --git a/include/my_tree.h b/include/my_tree.h index f8be55f..f1916b9 100644 --- a/include/my_tree.h +++ b/include/my_tree.h @@ -57,11 +57,14 @@ typedef struct st_tree_element { } TREE_ELEMENT;
#define ELEMENT_CHILD(element, offs) (*(TREE_ELEMENT**)((char*)element + offs)) +#define R_ELEMENT_CHILD(element, offs) ((TREE_ELEMENT**)((char*)element + offs))
what does "R_ELEMENT_CHILD" mean? What's "R"?
typedef struct st_tree { TREE_ELEMENT *root,null_element; TREE_ELEMENT **parents[MAX_TREE_HEIGHT]; + TREE_ELEMENT *free_element;
why do you need a free_element pointer?
uint offset_to_key,elements_in_tree,size_of_element; + uint elements_limit, del_direction;
why do you need to support different directions here?
size_t memory_limit, allocated; qsort_cmp2 compare; void *custom_arg; diff --git a/mysql-test/r/mysqld--help.result b/mysql-test/r/mysqld--help.result index bef0390..3699b18 100644 --- a/mysql-test/r/mysqld--help.result +++ b/mysql-test/r/mysqld--help.result @@ -240,6 +240,9 @@ The following options may be given as the first argument: --group-concat-max-len=# The maximum length of the result of function GROUP_CONCAT() + --group-concat-max-mem=# + The maximum memory used to calculate result of function + GROUP_CONCAT()
No, please. There is group_concat_max_len already, it's all that user needs. There is no reason why one might need to configure a second limit on for the GROUP_CONCAT.
--gtid-domain-id=# Used with global transaction ID to identify logically independent replication streams. When events can propagate through multiple parallel paths (for example
Regards, Sergei Chief Architect MariaDB and security@mariadb.org
On 27.06.2016 13:07, Sergei Golubchik wrote:
Hi, Sanja!
On May 04, Olaksandr Byslkin wrote:
still haven't fixed the typo? :)
revision-id: eb9fab76fb4322a7821e6502e6303ed381624429 (mariadb-10.1.13-18-geb9fab7) parent(s): 732adec0a4c75d99389230feeb0deca0ad668de7 committer: Oleksandr Byelkin timestamp: 2016-05-04 19:15:31 +0200 message:
MDEV-9531: GROUP_CONCAT with ORDER BY inside takes a lot of memory while it's executed
Limitation added to Red-Black tree.
diff --git a/include/my_tree.h b/include/my_tree.h index f8be55f..f1916b9 100644 --- a/include/my_tree.h +++ b/include/my_tree.h @@ -57,11 +57,14 @@ typedef struct st_tree_element { } TREE_ELEMENT;
#define ELEMENT_CHILD(element, offs) (*(TREE_ELEMENT**)((char*)element + offs)) +#define R_ELEMENT_CHILD(element, offs) ((TREE_ELEMENT**)((char*)element + offs)) what does "R_ELEMENT_CHILD" mean? What's "R"? reference
typedef struct st_tree { TREE_ELEMENT *root,null_element; TREE_ELEMENT **parents[MAX_TREE_HEIGHT]; + TREE_ELEMENT *free_element;
why do you need a free_element pointer?
We can free element not when tree grows enough but only when we are going to insert new element, but then we will need to go back and make path jet another time tho the place of insertion (because of possible rebalancing). Instead we check in the beginning if tree is too big to insert and delete element, then probably we will use it to insert new, but if when we went down to the element and see that we do not need to insert, the element will be left as free for the next time.
uint offset_to_key,elements_in_tree,size_of_element; + uint elements_limit, del_direction;
why do you need to support different directions here?
to make the routine more general.
size_t memory_limit, allocated; qsort_cmp2 compare; void *custom_arg; diff --git a/mysql-test/r/mysqld--help.result b/mysql-test/r/mysqld--help.result index bef0390..3699b18 100644 --- a/mysql-test/r/mysqld--help.result +++ b/mysql-test/r/mysqld--help.result @@ -240,6 +240,9 @@ The following options may be given as the first argument: --group-concat-max-len=# The maximum length of the result of function GROUP_CONCAT() + --group-concat-max-mem=# + The maximum memory used to calculate result of function + GROUP_CONCAT()
No, please. There is group_concat_max_len already, it's all that user needs. There is no reason why one might need to configure a second limit on for the GROUP_CONCAT.
OK, the problem is that even by key length we can not predict string output of the values, the only what we can say is that it will be at least one character (coma) because empty lines possible. Do you have ideas how predict minimal possible string size by value?
--gtid-domain-id=# Used with global transaction ID to identify logically independent replication streams. When events can propagate through multiple parallel paths (for example Regards, Sergei Chief Architect MariaDB and security@mariadb.org
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
Hi, Oleksandr! On Jun 28, Oleksandr Byelkin wrote:
diff --git a/include/my_tree.h b/include/my_tree.h index f8be55f..f1916b9 100644 --- a/include/my_tree.h +++ b/include/my_tree.h @@ -57,11 +57,14 @@ typedef struct st_tree_element { } TREE_ELEMENT;
#define ELEMENT_CHILD(element, offs) (*(TREE_ELEMENT**)((char*)element + offs)) +#define R_ELEMENT_CHILD(element, offs) ((TREE_ELEMENT**)((char*)element + offs)) what does "R_ELEMENT_CHILD" mean? What's "R"? reference
Not intuitive at all. if you need that, call it ELEMENT_CHILD_PTR (and then, redefine ELEMENT_CHILD to be *ELEMENT_CHILD_PTR(element, offs))
typedef struct st_tree { TREE_ELEMENT *root,null_element; TREE_ELEMENT **parents[MAX_TREE_HEIGHT]; + TREE_ELEMENT *free_element; why do you need a free_element pointer?
We can free element not when tree grows enough but only when we are going to insert new element, but then we will need to go back and make path jet another time tho the place of insertion (because of possible rebalancing). Instead we check in the beginning if tree is too big to insert and delete element, then probably we will use it to insert new, but if when we went down to the element and see that we do not need to insert, the element will be left as free for the next time.
uint offset_to_key,elements_in_tree,size_of_element; + uint elements_limit, del_direction;
why do you need to support different directions here?
to make the routine more general.
you don't use it. let's add functionality when we need it, not just in case.
qsort_cmp2 compare; void *custom_arg; diff --git a/mysql-test/r/mysqld--help.result b/mysql-test/r/mysqld--help.result index bef0390..3699b18 100644 --- a/mysql-test/r/mysqld--help.result +++ b/mysql-test/r/mysqld--help.result @@ -240,6 +240,9 @@ The following options may be given as the first argument: --group-concat-max-len=# The maximum length of the result of function GROUP_CONCAT() + --group-concat-max-mem=# + The maximum memory used to calculate result of function + GROUP_CONCAT()
No, please. There is group_concat_max_len already, it's all that user needs. There is no reason why one might need to configure a second limit on for the GROUP_CONCAT.
OK, the problem is that even by key length we can not predict string output of the values, the only what we can say is that it will be at least one character (coma) because empty lines possible. Do you have ideas how predict minimal possible string size by value?
That, probably, means that limiting the total size of a TREE does not work very well for GROUP_CONCAT... You can fix it on the upper level, in the GROUP_CONCAT itself, by calculating the result length. Like tree_insert(tree, cur_row); res_length+= cur_row_length; if (res_length > group_concat_max_len + threshold) { remove last tree elements as long as res_length stays larger than group_concat_max_len } That won't allow some tricks, like reusing the TREE_ELEMENT, but with a sufficiently large threshold, and every rebalancing is O(log n) anyway. An interesting version of that would be to know the length of the left subtree and when it grows over group_concat_max_len, replace the tree with its left subtree. I'm not sure it will be easy to maintain, though. And while it's not O(log n), it's a fixed cost for every operation, while only a fraction of TREE usages will actually need it. Perhaps calculating it could be an option, it's O(N), so if done rarely it might be ok. Regards, Sergei Chief Architect MariaDB and security@mariadb.org
On 28.06.2016 12:44, Sergei Golubchik wrote:
Hi, Oleksandr!
On Jun 28, Oleksandr Byelkin wrote:
diff --git a/include/my_tree.h b/include/my_tree.h index f8be55f..f1916b9 100644 --- a/include/my_tree.h +++ b/include/my_tree.h @@ -57,11 +57,14 @@ typedef struct st_tree_element { } TREE_ELEMENT;
#define ELEMENT_CHILD(element, offs) (*(TREE_ELEMENT**)((char*)element + offs)) +#define R_ELEMENT_CHILD(element, offs) ((TREE_ELEMENT**)((char*)element + offs)) what does "R_ELEMENT_CHILD" mean? What's "R"? reference Not intuitive at all. if you need that, call it ELEMENT_CHILD_PTR (and then, redefine ELEMENT_CHILD to be *ELEMENT_CHILD_PTR(element, offs)) OK
typedef struct st_tree { TREE_ELEMENT *root,null_element; TREE_ELEMENT **parents[MAX_TREE_HEIGHT]; + TREE_ELEMENT *free_element; why do you need a free_element pointer? We can free element not when tree grows enough but only when we are going to insert new element, but then we will need to go back and make path jet another time tho the place of insertion (because of possible rebalancing). Instead we check in the beginning if tree is too big to insert and delete element, then probably we will use it to insert new, but if when we went down to the element and see that we do not need to insert, the element will be left as free for the next time. uint offset_to_key,elements_in_tree,size_of_element; + uint elements_limit, del_direction; why do you need to support different directions here? to make the routine more general. you don't use it. let's add functionality when we need it, not just in case. OK qsort_cmp2 compare; void *custom_arg; diff --git a/mysql-test/r/mysqld--help.result b/mysql-test/r/mysqld--help.result index bef0390..3699b18 100644 --- a/mysql-test/r/mysqld--help.result +++ b/mysql-test/r/mysqld--help.result @@ -240,6 +240,9 @@ The following options may be given as the first argument: --group-concat-max-len=# The maximum length of the result of function GROUP_CONCAT() + --group-concat-max-mem=# + The maximum memory used to calculate result of function + GROUP_CONCAT() No, please. There is group_concat_max_len already, it's all that user needs. There is no reason why one might need to configure a second limit on for the GROUP_CONCAT. OK, the problem is that even by key length we can not predict string output of the values, the only what we can say is that it will be at least one character (coma) because empty lines possible. Do you have ideas how predict minimal possible string size by value? That, probably, means that limiting the total size of a TREE does not work very well for GROUP_CONCAT...
You can fix it on the upper level, in the GROUP_CONCAT itself, by calculating the result length. Like
tree_insert(tree, cur_row); res_length+= cur_row_length; if (res_length > group_concat_max_len + threshold) { remove last tree elements as long as res_length stays larger than group_concat_max_len }
That won't allow some tricks, like reusing the TREE_ELEMENT, but with a sufficiently large threshold, and every rebalancing is O(log n) anyway.
An interesting version of that would be to know the length of the left subtree and when it grows over group_concat_max_len, replace the tree with its left subtree. I'm not sure it will be easy to maintain, though. And while it's not O(log n), it's a fixed cost for every operation, while only a fraction of TREE usages will actually need it.
Perhaps calculating it could be an option, it's O(N), so if done rarely it might be ok.
But might be not, I see the record length also inside the tree, but it do not give so much (how 8 bytes of integer will be represented?) I made some approach to calculate the length in the other commit, please check it. (there is not addressed this review changes but if it is not OK it has no sens then - everything should be rewritten from scratch).
Regards, Sergei Chief Architect MariaDB and security@mariadb.org
participants (2)
-
Oleksandr Byelkin
-
Sergei Golubchik