On 26.09.2016 19:03, Sergei Golubchik wrote:
Hi, Oleksandr!
On Sep 26, Oleksandr Byelkin wrote:
A hackish workaround could be to adjust tree->elements_limit (in Item_func_group_concat::add) after each insertion. But in this case it would be simpler to limit the tree by size (in bytes) and adjust the tree size after each insertion. What do you think about it?
I think that even for characters there is no direct correspondence between bytes and number of characters, so a limit in bytes is possible only when we have nothing but strings, and it should be set to <length> * <maximum bytes per symbol for the given charset>. (Here it is better to ask Bar whether there are pitfalls such as differences between client/server/item charsets; I think there should not be.)
Also we have to take into account that the tree holds the key representation, so probably +1 byte for null (I am not sure).
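To make the arithmetic concrete, here is a minimal sketch of the byte budget described above. The helper name and its parameters are invented for illustration and are not part of the server code; the per-key overhead is the unconfirmed "+1 byte" allowance, so it is left as a parameter rather than hard-coded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a server function): upper bound, in bytes, for a
   value limited to max_chars characters in a charset whose widest character
   takes max_bytes_per_char bytes. per_key_overhead is an assumption that
   models extra bytes in the key representation (e.g. the possible +1). */
static size_t byte_limit_for_chars(size_t max_chars,
                                   unsigned max_bytes_per_char,
                                   size_t per_key_overhead)
{
  assert(max_bytes_per_char > 0);
  return max_chars * max_bytes_per_char + per_key_overhead;
}
```

With a 3-byte charset, for instance, a 1024-character limit becomes a 3072-byte budget before any per-key overhead.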
A worse problem (IMHO) is that it will mean constantly allocating and freeing memory (with a limit by number of elements, the tree more or less freezes once it reaches the limit). I am not sure whether this is a serious problem.
I do not know of any obstacles other than the above. And a limit by size is already present in the tree, but it just frees the whole tree and starts from the beginning (so that should be changed).
Yup. I thought about something like that: besides TREE::memory_limit, we add, say, TREE::memory_reset_to.
And when tree size reaches memory_limit, the tree is shrunk to memory_reset_to. When memory_reset_to is 0, we get the old behavior, full tree reset. If it's not 0, elements are removed one by one until the tree is shrunk appropriately. This provides the backward-compatible interface and solves the constant malloc/free issue that you've mentioned above. I'd think memory_limit should be at least 2x memory_reset_to.
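The policy can be sketched with a toy container. This is only an illustration under stated assumptions: a sorted array stands in for the real TREE, all names (toy_tree, toy_init, toy_insert) are hypothetical, and the elements dropped are the ones at the end of the sort order, which the thread does not actually specify.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_CAPACITY 64

/* Toy stand-in for TREE: keys kept sorted in an array, with byte accounting. */
typedef struct {
  int    keys[TOY_CAPACITY];
  size_t sizes[TOY_CAPACITY];   /* byte size charged for each element */
  size_t count;
  size_t allocated;             /* sum of sizes currently held */
  size_t memory_limit;          /* 0 means "no limit" */
  size_t memory_reset_to;       /* 0 means "full reset" (old behavior) */
} toy_tree;

static void toy_init(toy_tree *t, size_t memory_limit, size_t memory_reset_to)
{
  assert(memory_limit == 0 || memory_reset_to <= memory_limit);
  memset(t, 0, sizeof(*t));
  t->memory_limit = memory_limit;
  t->memory_reset_to = memory_reset_to;
}

/* Insert in sorted position, then apply the limit policy.
   Returns 0 on success, -1 if the toy array is full. */
static int toy_insert(toy_tree *t, int key, size_t size)
{
  size_t pos = 0;
  if (t->count == TOY_CAPACITY)
    return -1;
  while (pos < t->count && t->keys[pos] <= key)
    pos++;
  memmove(&t->keys[pos + 1], &t->keys[pos],
          (t->count - pos) * sizeof(t->keys[0]));
  memmove(&t->sizes[pos + 1], &t->sizes[pos],
          (t->count - pos) * sizeof(t->sizes[0]));
  t->keys[pos] = key;
  t->sizes[pos] = size;
  t->count++;
  t->allocated += size;

  if (t->memory_limit && t->allocated > t->memory_limit)
  {
    if (t->memory_reset_to == 0)
    {
      /* Old behavior: drop everything and start over. */
      t->count = 0;
      t->allocated = 0;
    }
    else
    {
      /* New behavior: drop trailing elements one by one until the
         accounted size is at or below memory_reset_to. */
      while (t->count > 0 && t->allocated > t->memory_reset_to)
        t->allocated -= t->sizes[--t->count];
    }
  }
  return 0;
}
```

With memory_limit = 100 and memory_reset_to = 50, eleven 10-byte elements trigger a single shrink down to five elements, after which another five insertions are needed before the next shrink. That gap is what avoids the constant malloc/free pattern.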
Problems:
* removing elements one by one is not very fast; if memory_reset_to is much lower than memory_limit, it's faster to create a new tree and copy the first N elements into it. I suppose with memory_limit being more than 3x memory_reset_to, a new tree is already faster.
* element size is not the same as the Item string value length (a tree element is a record image with varchar columns in it). To use the correct string value lengths, the upper code needs to correct the tree's allocated size after each insertion, and after each deletion (if elements are deleted to reduce the tree size).
Also, letting the tree grow more than needed takes its toll, because the tree has to be rebalanced while it grows.
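The size correction from the second point could look roughly like this. It is a sketch only: toy_account and every function here are hypothetical, and the real tree's accounting may differ. The toy container charges the full record image, and the caller then swaps that charge for the actual string length. On deletion the correction is applied before the release, so that the unsigned counter never goes below zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical byte counter standing in for the tree's allocated size. */
typedef struct {
  size_t allocated;
} toy_account;

/* What the toy container does on its own: charge the full record image. */
static void toy_charge_record(toy_account *a, size_t record_size)
{
  a->allocated += record_size;
}

/* What the toy container does on its own when an element is removed. */
static void toy_release_record(toy_account *a, size_t record_size)
{
  assert(a->allocated >= record_size);
  a->allocated -= record_size;
}

/* Caller-side insert: let the container charge the record image, then
   replace that charge with the real value length. */
static void insert_value(toy_account *a, size_t record_size, size_t value_len)
{
  assert(value_len <= record_size);
  toy_charge_record(a, record_size);
  a->allocated -= record_size - value_len;
}

/* Caller-side delete: restore the record-image charge, then let the
   container release it, leaving the counter reduced by value_len. */
static void delete_value(toy_account *a, size_t record_size, size_t value_len)
{
  assert(value_len <= record_size && a->allocated >= value_len);
  a->allocated += record_size - value_len;
  toy_release_record(a, record_size);
}
```

For example, with a 258-byte record image, inserting values of 5 and 7 bytes leaves the counter at 12 rather than 516, and deleting the 5-byte value brings it back to 7.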