Hi,One other thing, in that example I sent, there is $state->winfunc_sql. For those window functions that take column input, it will be the column to input into the function, and is always of form "expr$[0-9]+ as wf[0-9]+"For example, for the following query it is "expr$0 as wf0 "SELECT salary, sum(salary) OVER (ORDER BY salary) as ss FROM empsalary;The SQL looks like this:1610 Query SELECT NULL as wf_rownum, salary AS expr$0,NULL as wf0,SHA1(CONCAT_WS('#','ONE_PARTITION')) as wf0_hash,salary,SHA1(CONCAT_WS('#',salary)) as wf0_obhashFROM empsalary AS `empsalary` WHERE 1=11611 Query INSERT INTO `aggregation_tmp_86564985` VALUES (NULL,5200,NULL,'3dce9641066f316f334c8008fd3e364274470068',5200,'2fad4efb8e6aaaa53ecdf638137d390e002f9783'),(NULL,4200,NULL,'3dce9641066f316f334c8008fd3e364274470068',4200,'c79fce75b1583ddd36a96178757e0d8d0ac91228'),(NULL,4500,NULL,'3dce9641066f316f334c8008fd3e364274470068',4500,'97a87b470fe9ed5ff51ff9b8543e937e6016d48c'),(NULL,6000,NULL,'3dce9641066f316f334c8008fd3e364274470068',6000,'31d9ddeaa80bc88c1f3117b9724726ebcc7fc72d'),(NULL,5200,NULL,'3dce9641066f316f334c8008fd3e364274470068',5200,'2fad4efb8e6aaaa53ecdf638137d390e002f9783'),(NULL,3500,NULL,'3dce9641066f316f334c8008fd3e364274470068',3500,'65609286cc04ece831a844984a6bc9eb80450cf7'),(NULL,3900,NULL,'3dce9641066f316f334c8008fd3e364274470068',3900,'5446569e8572251dcac168152d6c37074427eae3'),(NULL,4800,NULL,'3dce9641066f316f334c8008fd3e364274470068',4800,'98e9f55262269a05dd4f1ed788626580fdef2e95'),(NULL,5000,NULL,'3dce9641066f316f334c8008fd3e364274470068',5000,'f8237d8959e03355010bb85cc3dc46a46fb31110'),(NULL,4800,NULL,'3dce9641066f316f334c8008fd3e364274470068',4800,'98e9f55262269a05dd4f1ed788626580fdef2e95')1607 Query SELECT distinct wf0_hash h from aggregation_tmp_86564985 ORDER BY salary asc1607 Query SELECT *,expr$0 as wf0FROM `aggregation_tmp_86564985` where wf0_hash='3dce9641066f316f334c8008fd3e364274470068' ORDER BY salary asc1607 Query UPDATE aggregation_tmp_86564985 SET wf0 = 3500 WHERE wf_rownum = 61607 Query UPDATE aggregation_tmp_86564985 SET wf0 = 7400 WHERE wf_rownum = 71607 Query UPDATE aggregation_tmp_86564985 SET wf0 = 11600 WHERE wf_rownum = 21607 Query UPDATE aggregation_tmp_86564985 SET wf0 = 16100 WHERE wf_rownum = 31607 Query UPDATE aggregation_tmp_86564985 SET wf0 = 25700 WHERE wf_rownum = 81607 Query UPDATE aggregation_tmp_86564985 SET wf0 = 25700 WHERE wf_rownum = 101607 Query UPDATE aggregation_tmp_86564985 SET wf0 = 30700 WHERE wf_rownum = 91607 Query UPDATE aggregation_tmp_86564985 SET wf0 = 41100 WHERE wf_rownum = 11607 Query UPDATE aggregation_tmp_86564985 SET wf0 = 41100 WHERE wf_rownum = 51607 Query UPDATE aggregation_tmp_86564985 SET wf0 = 47100 WHERE wf_rownum = 4The cumulative distance example doesn't use any column input so in the SQL examples you see "NULL as wf0." for the window sql--JustinOn Tue, Mar 3, 2015 at 1:40 PM, Justin Swanhart <greenlion@gmail.com> wrote:Hi,I was a little confusing. The query orders of the order by columns (salary is project out into the temporary table) but, later ob_hash is used. This is because the frame extends to all the values the order the same, and comparing multiple columns is hard, so a hash is used instead.--JustinOn Tue, Mar 3, 2015 at 1:28 PM, Justin Swanhart <greenlion@gmail.com> wrote:Hi,Well here is how Shard-Query does them. I assume you would do something very similarly internally with a temp table.a) it create a temporary table for the query reserving null rows for the window functionsb) it adds to the temporary a unique id for each row of the resultset. This is used for framing.c) it adds a hash of the order by columns for the window function for orderingd) it adds a hash of the partition columns for partitioningAfter the regular resultset is stored in the temp table, a function sweeps the table foreach window function, calculating the result of the function based on the framing clause,then the column in the resultset is updated to reflect the computed value.Finally, the resultset is returned to the client.Here is the SQL log of the following query:mysql> call shard_query.sq_helper("SELECT depname, empno, salary, cume_dist() OVER (PARTITION BY depname ORDER by salary rows between 1 following and 1 following) ss FROM empsalary", "", 'test','testtab',1,1); +-----------+-------+--------+------------------+| depname | empno | salary | ss |+-----------+-------+--------+------------------+| develop | 7 | 4200 | 0.2 || develop | 9 | 4500 | 0.4 || develop | 11 | 5200 | 0.8 || develop | 10 | 5200 | 0.8 || develop | 8 | 6000 | 1 || sales | 3 | 4800 | 0.66666666666667 || sales | 4 | 4800 | 0.66666666666667 || sales | 1 | 5000 | 1 || personnel | 5 | 3500 | 0.5 || personnel | 2 | 3900 | 1 |+-----------+-------+--------+------------------+10 rows in set (0.08 sec)Query OK, 0 rows affected (0.09 sec)150303 13:12:05 1533 Query call shard_query.sq_helper("SELECT depname, empno, salary, cume_dist() OVER (PARTITION BY depname ORDER by salary rows between 1 following and 1 following) ss FROM empsalary", "", 'test','testtab',1,1)-- get meta data for resultset (notice 0=1 in where clause)1553 Query SELECT NULL as wf_rownum, depname AS expr$0,empno AS expr$1,salary AS expr$2,NULL as wf0,depname,SHA1(CONCAT_WS('#',depname)) as wf0_hash,salary,SHA1(CONCAT_WS('#',salary)) as wf0_obhashFROM empsalary AS `empsalary` WHERE 1=1 LIMIT 0-- create temp table1553 Query CREATE TABLE aggregation_tmp_45403179 (wf_rownum bigint auto_increment primary key,expr$0 VARCHAR(255),expr$1 VARCHAR(255),expr$2 VARCHAR(255),wf0 VARCHAR(255),depname VARCHAR(255),wf0_hash VARCHAR(255),salary VARCHAR(255),wf0_obhash VARCHAR(255)) ENGINE=MYISAM-- get resultset1556 Query SELECT NULL as wf_rownum, depname AS expr$0,empno AS expr$1,salary AS expr$2,NULL as wf0,depname,SHA1(CONCAT_WS('#',depname)) as wf0_hash,salary,SHA1(CONCAT_WS('#',salary)) as wf0_obhashFROM empsalary AS `empsalary` WHERE 1=1-- store resultset1557 Query INSERT INTO `aggregation_tmp_45403179` VALUES (NULL,'develop',11,5200,NULL,'develop','418a6bc4deccf0f7d5182192d51a54e504b3f3c9',5200,'2fad4efb8e6aaaa53ecdf638137d390e002f9783'),(NULL,'develop',7,4200,NULL,'develop','418a6bc4deccf0f7d5182192d51a54e504b3f3c9',4200,'c79fce75b1583ddd36a96178757e0d8d0ac91228'),(NULL,'develop',9,4500,NULL,'develop','418a6bc4deccf0f7d5182192d51a54e504b3f3c9',4500,'97a87b470fe9ed5ff51ff9b8543e937e6016d48c'),(NULL,'develop',8,6000,NULL,'develop','418a6bc4deccf0f7d5182192d51a54e504b3f3c9',6000,'31d9ddeaa80bc88c1f3117b9724726ebcc7fc72d'),(NULL,'develop',10,5200,NULL,'develop','418a6bc4deccf0f7d5182192d51a54e504b3f3c9',5200,'2fad4efb8e6aaaa53ecdf638137d390e002f9783'),(NULL,'personnel',5,3500,NULL,'personnel','fc08ce9ebee8734c2dc883c0dbd607686bdce8f3',3500,'65609286cc04ece831a844984a6bc9eb80450cf7'),(NULL,'personnel',2,3900,NULL,'personnel','fc08ce9ebee8734c2dc883c0dbd607686bdce8f3',3900,'5446569e8572251dcac168152d6c37074427eae3'),(NULL,'sales',3,4800,NULL,'sales','59248c4dae276a021cb296d2ee0e6a0c962a8d7f',4800,'98e9f55262269a05dd4f1ed788626580fdef2e95'),(NULL,'sales',1,5000,NULL,'sales','59248c4dae276a021cb296d2ee0e6a0c962a8d7f',5000,'f8237d8959e03355010bb85cc3dc46a46fb31110'),(NULL,'sales',4,4800,NULL,'sales','59248c4dae276a021cb296d2ee0e6a0c962a8d7f',4800,'98e9f55262269a05dd4f1ed788626580fdef2e95')-- handle window function 0 (the only one in this case)-- get the hashes for each partition1553 Query SELECT distinct wf0_hash h from aggregation_tmp_45403179 ORDER BY salary asc-- compute the values for each partition (three in this case)-- process the window and update the temp table (see code at the end for wf_cume) for each partition in turn.-- as you can see there is a select followed by updates1553 Query SELECT *,NULL as wf0FROM `aggregation_tmp_45403179` where wf0_hash='fc08ce9ebee8734c2dc883c0dbd607686bdce8f3' ORDER BY salary asc1553 Query UPDATE aggregation_tmp_45403179 SET wf0 = 0.5 WHERE wf_rownum in (6)1553 Query UPDATE aggregation_tmp_45403179 SET wf0 = 1 WHERE wf_rownum in (7)1553 Query SELECT *,NULL as wf0FROM `aggregation_tmp_45403179` where wf0_hash='59248c4dae276a021cb296d2ee0e6a0c962a8d7f' ORDER BY salary asc1553 Query UPDATE aggregation_tmp_45403179 SET wf0 = 0.66666666666667 WHERE wf_rownum in (8,10)1553 Query UPDATE aggregation_tmp_45403179 SET wf0 = 1 WHERE wf_rownum in (9)1553 Query SELECT *,NULL as wf0FROM `aggregation_tmp_45403179` where wf0_hash='418a6bc4deccf0f7d5182192d51a54e504b3f3c9' ORDER BY salary asc1553 Query UPDATE aggregation_tmp_45403179 SET wf0 = 0.2 WHERE wf_rownum in (2)1553 Query UPDATE aggregation_tmp_45403179 SET wf0 = 0.4 WHERE wf_rownum in (3)1553 Query UPDATE aggregation_tmp_45403179 SET wf0 = 0.8 WHERE wf_rownum in (1,5)1553 Query UPDATE aggregation_tmp_45403179 SET wf0 = 1 WHERE wf_rownum in (4)-- return resultset to client1559 Query SELECT expr$0 AS `depname`,expr$1 AS `empno`,expr$2 AS `salary`,wf0 as `ss`FROM `aggregation_tmp_45403179` ORDER BY wf0_hash,salary asc-- remove temp table1559 Query DROP TABLE IF EXISTS aggregation_tmp_454031791559 Quit1550 Quit150303 13:12:12 1533 Query set global general_log=0protected function wf_cume_dist($num,$state) {static $sum;$win = $state->windows[$num];if(empty($win['order'])) {if($percent)$sql = "update " . $state->table_name . " set wf{$num}=1";else$sql = "update " . $state->table_name . " set wf{$num}=0";$state->DAL->my_query($sql);if($err = $state->DAL->my_error()) {$this->errors[] = $err;return false;}return true;} else {/* running sum*/$sql = "SELECT distinct wf{$num}_hash h from " . $state->table_name . " ORDER BY " . $win['order_by'];$stmt = $state->DAL->my_query($sql);if($err = $state->DAL->my_error()) {$this->errors[] = $err;return false;}$last_hash = "";$hash = "";$last_ob_hash = "";$ob_hash = "";while($row = $state->DAL->my_fetch_assoc($stmt)) {#$sql = "select * from " . $state->table_name . " where wf{$num}_hash='" . $row['h'] . "' ORDER BY " . $win['order_by'];$sql = "SELECT *," . $state->winfunc_sql . " where wf{$num}_hash='" . $row['h'] . "' ORDER BY " . $win['order_by'];$stmt2 = $state->DAL->my_query($sql);if($err = $state->DAL->my_error()) {$this->errors[] = $err;return false;}$done=array();$rows=array();while($row2=$state->DAL->my_fetch_assoc($stmt2)) {$rows[] = $row2;}$last_hash = "";$last_ob_hash = "";$i = 0;$rowlist="";$rank = 0;while($i<count($rows)) {$row2 = $rows[$i];++$rank;$ob_hash = $row2["wf{$num}_obhash"];$rowlist=$row2['wf_rownum'];for($n=$i+1;$n<count($rows);++$n) {$row3 = $rows[$n];$new_ob_hash = $row3["wf{$num}_obhash"];if($new_ob_hash != $ob_hash) {break;}$rowlist .= "," . $row3['wf_rownum'];++$i;++$rank;}$dist = $rank/count($rows);$sql = "UPDATE " . $state->table_name . " SET wf{$num} = {$dist} WHERE wf_rownum in ({$rowlist})";$state->DAL->my_query($sql);if($err = $state->DAL->my_error()) {$this->errors[] = $err;return false;}++$i;}}}return true;}On Tue, Mar 3, 2015 at 12:55 PM, Igor Babaev <igor@askmonty.org> wrote:On 03/03/2015 10:25 AM, Sergei Golubchik wrote:
> Hi, Igor!
>
> On Mar 03, Igor Babaev wrote:
>>>
>>>> I'd also like to discuss window functions too. I've implemented them
>>>> in shard-query and have ideas about how to implement them in the
>>>> server, but pluggable parser would be really useful here.
>>>
>>> Window functions have a good chance of being in 10.2, it's MDEV-6115.
>>> But I don't think that somebody is working on MDEV-6115 yet.
>>
>> I started working on MDEV-6115 some time ago.
>
> Ah, great. Sorry, I didn't know it.
> Could you then discuss it with Justin, please?
> See above, he has some ideas about the implementation.
Justin,
How do you prefer discussing your ideas?
On IRC (#maria) or by phone? (I don't have skype at the moment)
Regards,
Igor.
>
> Regards,
> Sergei
_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help : https://help.launchpad.net/ListHelp