[Maria-developers] [GSoC] Introduction Mail
Hi all,
I have been accepted for the project "Self-Tuning Optimizer". I will be doing this project under the mentorship of Sergei Golubchik. A task has already been created to track the progress of this project: https://mariadb.atlassian.net/browse/MDEV-350 . I'll start by getting familiar with the code base, reading some literature on cost estimation in optimizers, and interacting with the community.
About me: I am currently a 4th year student pursuing a B.Tech - M.Tech dual degree in Computer Science and Engineering at the Indian Institute of Technology (IIT) Kanpur, India. This will be my first GSoC project. I did an internship at Facebook last year. I'll also be maintaining a blog at http://igniting.in . I'm really excited to be a part of MariaDB and I hope to make a significant contribution to it.
Regards
Anshu Avinash
Hi, Anshu! So, how do we start? I see you're often present on irc, which is great. You've looked at the code, what do you think? Do you understand how different parts of this feature fit together? We can start from just one constant (global or per-engine) and see how it'll work. And, by the way, when you start coding (May 19) or earlier, as you prefer, I would like to start seeing some kind of weekly updates from you. In email or in your blog - whatever you feel more comfortable with. Regards, Sergei
Hi,
I have been busy with my exams (which finished today). I'll come up with a more detailed working plan and my questions in a couple of days.
On Tue, Apr 29, 2014 at 11:28 PM, Sergei Golubchik <serg@mariadb.org> wrote:
> Hi, Anshu!
> So, how do we start?
Till now, I have not done much. I had started exploring the optimizer code, but stopped midway and started reading the book "Inside the SQL Server Query Optimizer" for a better understanding of how an optimizer works.
> I see you're often present on irc, which is great.
(I go by the nick igniting)
> You've looked at the code, what do you think?
> Do you understand how different parts of this feature fit together?
> We can start from just one constant (global or per-engine) and see how it'll work.
I had started exploring tmptable_create_cost. In sql/sql_const.h, HEAP_TEMPTABLE_CREATE_COST is defined as 2.0 and DISK_TEMPTABLE_CREATE_COST as 40.0. As discussed on irc with serg, I'll start profiling this constant (I have not started yet).
> And, by the way, when you start coding (May 19) or earlier, as you prefer, I would like to start seeing some kind of weekly updates from you. In email or in your blog - whatever you feel more comfortable with.
Blog updates should be fine.
> Regards,
> Sergei
Regards
Anshu
Hi, Anshu!
On Apr 30, Anshu Avinash wrote:
>> You've looked at the code, what do you think? Do you understand how different parts of this feature fit together? We can start from just one constant (global or per-engine) and see how it'll work.
> I had started exploring tmptable_create_cost. In sql/sql_const.h: HEAP_TEMPTABLE_CREATE_COST is defined to be 2.0 and DISK_TEMPTABLE_CREATE_COST as 40.0. As discussed on irc with serg, I'll start to profile this constant (which I have not started yet).
Note that you shouldn't profile anything yourself. Instead, the goal is to have profiling code built into the server, and it should profile and adjust these constants automatically.
Regards, Sergei
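To make the distinction concrete, here is a minimal sketch of what "profiling code built into the server" could look like. It is not MariaDB code: the class name SelfTuningFactor, the smoothing weight, and the update rule (an exponential moving average of measured time over estimated cost) are all assumptions chosen for illustration. The thread does not say how the adjustment should be done.

```cpp
#include <cassert>

// Hypothetical illustration only: none of these names exist in MariaDB.
// Keeps one cost factor and nudges it towards the ratio
// (measured time) / (estimated cost) after each observation.
class SelfTuningFactor
{
public:
  explicit SelfTuningFactor(double initial, double w= 0.1)
    : factor(initial), weight(w)
  {
    assert(w > 0.0 && w <= 1.0);
  }

  // estimated_cost: what the optimizer predicted, in cost units.
  // measured_time:  what the server observed while executing.
  void observe(double estimated_cost, double measured_time)
  {
    if (estimated_cost <= 0.0)
      return;                           // nothing to learn from
    double observed= measured_time / estimated_cost;
    // Exponential moving average: old value keeps (1 - weight).
    factor= (1.0 - weight) * factor + weight * observed;
  }

  double value() const { return factor; }

private:
  double factor;
  double weight;
};
```

With a starting factor of 1.0 and a weight of 0.5, one observation of 20 time units for an estimated cost of 10 moves the factor to 1.5. A smaller weight makes the factor react more slowly to any single measurement.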
Hi, Anshu!
How are you doing? Any progress so far?
On Apr 30, Anshu Avinash wrote:
>> And, by the way, when you start coding (May 19) or earlier, as you prefer, I would like to start seeing some kind of weekly updates from you. In email or in your blog - whatever you feel more comfortable with.
> Blog updates should be fine.
That's fine. Whatever you prefer. One blog post every week then, preferably on Monday.
Regards, Sergei
Hi! On 8 May 2014, at 22:33, Sergei Golubchik <serg@mariadb.org> wrote:
> That's fine. Whatever you prefer. One blog post every week then, preferably on Monday.
For the benefit of others, Anshu, please also post your weekly reports to maria-developers@lists.launchpad.net. I think it will be really good for those who don't drop by your blog, and you'll likely get other feedback as well.
This goes for all those participating in GSoC.
Also, those with a blog + RSS feed should aim to get it on http://planetmariadb.org/ and http://planet.mysql.com/
cheers,
-colin
--
Colin Charles, Chief Evangelist, SkySQL - The MariaDB Company
blog: http://bytebot.net/blog/ | t: +6-012-204-3201 | Skype: colincharles
Hi all,
Sorry for the irregular updates. I have been busy for the last couple of days and might still be busy for 1-2 more days. I will be completely free starting next week, and will update my blog every Monday (so the first update will be on May 12). I will also send the link to each post to the mailing list.
As discussed on irc, I started to explore the pair of cost estimates returned by handler::scan_time() and handler::read_time(). I also started looking into sql_statistics.cc for writing the optimizer constants to a persistent database table.
Regards
Anshu Avinash
On Thu, May 8, 2014 at 11:08 PM, Colin Charles <colin@mariadb.org> wrote:
> For the benefit of others Anshu, please also post your weekly reports to maria-developers@lists.launchpad.net
Hi all,
You can find my blog entry for this week at http://igniting.in/gsoc2014/2014/05/11/first-steps/ .
Regards
Anshu Avinash
On Thu, May 8, 2014 at 11:46 PM, Anshu Avinash <anshu.avinash35@gmail.com> wrote:
> I would be completely free starting next week, and would be updating my blog weekly on every Monday (so 1st update would be on May 12).
Hi all,
This week's blog entry will be delayed by a couple of days. I have started coding, though, and would like to give a heads-up on what I'm doing.
I've looked at the diffs for the "Cost model project" of MySQL: http://bazaar.launchpad.net/~mysql/mysql-server/5.7/revision/7596 and http://bazaar.launchpad.net/~mysql/mysql-server/5.7/revision/7222 . These give a pretty good idea of what the hard-coded constants are and where they are used.
The idea is to multiply the values returned by read_time() and scan_time() in handler.h by "READ_TIME_FACTOR" and "SCAN_TIME_FACTOR" before returning them. The factors would be read from a table in the mysql database; for that I've looked at sql_statistics.cc. After completing this, I'll first change the values of the factors manually and check whether better or worse query plans get selected, to verify that everything works as expected. Later I'll automate that step.
Regards
Anshu
On Mon, May 12, 2014 at 11:22 AM, Anshu Avinash <anshu.avinash35@gmail.com> wrote:
> You can find my blog entry for this week at http://igniting.in/gsoc2014/2014/05/11/first-steps/ .
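The scaling idea from the May 19 message (multiplying the results of read_time() and scan_time() by a factor before returning them) can be sketched roughly as follows. ToyHandler and CostFactors are hypothetical stand-ins, not the real handler class from sql/handler.h, and the base estimates inside them are placeholders rather than the server's actual formulas.

```cpp
// Hypothetical stand-in for the two factors; in the proposal they would be
// read from a table in the mysql database. A value of 1.0 leaves the
// estimate unchanged.
struct CostFactors
{
  double read_time_factor= 1.0;
  double scan_time_factor= 1.0;
};

// Toy handler, NOT the real `handler` class.
class ToyHandler
{
public:
  ToyHandler(const CostFactors &f, double blocks)
    : factors(f), data_blocks(blocks) {}

  // Placeholder base estimate: one cost unit per block scanned,
  // scaled by the scan factor just before returning.
  double scan_time() const
  {
    return factors.scan_time_factor * data_blocks;
  }

  // Placeholder base estimate: one unit per range plus one per row,
  // scaled by the read factor just before returning.
  double read_time(unsigned ranges, unsigned long rows) const
  {
    return factors.read_time_factor * (double(ranges) + double(rows));
  }

private:
  CostFactors factors;
  double data_blocks;
};
```

Changing a factor by hand and comparing the chosen plans, as the message proposes, then amounts to constructing the handler with different CostFactors values.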
Wow, that's a lot of work, congratulations! I will read it part by part to better understand the MariaDB code.
2014-05-19 16:33 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
> This week's blog entry would get delayed by couple of days. I have started coding though and would like to give heads up on what I'm doing.
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
Hi all,
You can find this week's blog entry at http://igniting.in/gsoc2014/2014/05/25/coding-things-up/ . I have created a branch on launchpad for my work: http://bazaar.launchpad.net/~igniting/maria/maria/revision/4211 . You can give your suggestions/reviews either on this thread or as a comment on the blog itself.
Regards
Anshu Avinash
On Tue, May 20, 2014 at 1:22 AM, Roberto Spadim <roberto@spadim.com.br> wrote:
> wow a big work, congratulation guy, i will read part by part to better understand mariadb code
Hi, Anshu!
On May 25, Anshu Avinash wrote:
> Hi all,
> You can find my this week's blog entry at http://igniting.in/gsoc2014/2014/05/25/coding-things-up/ . I have created a branch on launchpad for my work: http://bazaar.launchpad.net/~igniting/maria/maria/revision/4211 . You can give your suggestions/reviews either on this thread or as a comment on the blog itself.
I've got used to launchpad, where one can subscribe to a branch and get all pushes by email - so that I could simply reply to these emails. It doesn't seem to work this way on github :(
Anyway...
> diff --git a/mysql-test/t/costmodel.test b/mysql-test/t/costmodel.test
> --- /dev/null
> +++ b/mysql-test/t/costmodel.test
> @@ -0,0 +1,9 @@
> +--disable_warnings
> +DROP TABLE IF EXISTS t1;
> +--enable_warnings
> +
> +CREATE TABLE t1 (a INT);
> +INSERT INTO t1 VALUES (1);
> +SELECT * FROM t1;
This test doesn't seem to exercise your changes in the code. It doesn't test the new constants that you've added.
> +
> +DROP TABLE t1;
> diff --git a/scripts/mysql_system_tables.sql b/scripts/mysql_system_tables.sql
> --- a/scripts/mysql_system_tables.sql
> +++ b/scripts/mysql_system_tables.sql
> @@ -229,3 +229,10 @@ CREATE TABLE IF NOT EXISTS index_stats (db_name varchar(64) NOT NULL, table_name
>  -- we avoid mixed-engine transactions.
>  set storage_engine=@orig_storage_engine;
>  CREATE TABLE IF NOT EXISTS gtid_slave_pos (domain_id INT UNSIGNED NOT NULL, sub_id BIGINT UNSIGNED NOT NULL, server_id INT UNSIGNED NOT NULL, seq_no BIGINT UNSIGNED NOT NULL, PRIMARY KEY (domain_id, sub_id)) comment='Replication slave GTID position';
> +
> +-- Tables for Self Tuning Cost Optimizer
> +
> +CREATE TABLE IF NOT EXISTS all_constants (const_name varchar(64) NOT NULL, const_value double NOT NULL, PRIMARY KEY (const_name) ) ENGINE=MyISAM CHARACTER SET utf8 COLLATE utf8_bin comment='Constants for optimizer';
Uhm... I don't particularly like the "all_constants" name, we don't have other tables or data structures that are called like that. What about "optimizer_cost_factors" ? Same applies to the code changes below (although, in the code it could be simply "cost_factors", for brevity)
> +
> +-- Remember for later if all_constants table already existed
> +set @had_all_constants_table= @@warning_count != 0;
> diff --git a/sql/opt_costmodel.h b/sql/opt_costmodel.h
> --- /dev/null
> +++ b/sql/opt_costmodel.h
> @@ -0,0 +1,15 @@
> +/* Interface to get constants */
> +
> +#ifndef _opt_costmodel_h
SQL_OPT_COSTMODEL_INCLUDED please, not _opt_costmodel_h
> +#define _opt_costmodel_h
> +
> +enum enum_all_constants_col
> +{
> +  ALL_CONSTANTS_CONST_NAME,
> +  ALL_CONSTANTS_CONST_VALUE
> +};
You can move the enum into opt_costmodel.cc.
This opt_costmodel.h file is the interface - other source files include it to get access to your public API. But you don't want them to read your table directly, so they don't need to know how the fields are mapped. I mean, this enum is part of the implementation, your internal stuff, not the public API that others need to see or care about.
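The interface/implementation split described here can be sketched in a single file as below. This is only an illustration under stated assumptions: the in-memory rows array stands in for the table, the names cost_factor_col, rows and lookup are hypothetical, and the THD argument of the real declarations is omitted.

```cpp
#include <cstdlib>
#include <cstring>

// ---- Part that would go in the header: the public API only. ----
// (The THD argument of the real declarations is omitted in this sketch.)
double get_read_time_factor();
double get_scan_time_factor();

// ---- Part that would go in the .cc file: implementation details. ----
namespace
{
  // Column positions: only the code that reads the table needs these.
  enum cost_factor_col { COL_NAME, COL_VALUE, COL_COUNT };

  // Hypothetical stand-in for table rows; each row is an array of fields.
  const char *const rows[][COL_COUNT]=
  {
    { "read_time_factor", "1.5" },
    { "scan_time_factor", "0.5" }
  };

  double lookup(const char *name)
  {
    for (const auto &row : rows)
      if (std::strcmp(row[COL_NAME], name) == 0)
        return std::atof(row[COL_VALUE]);
    return 1.0;                         // neutral default when not found
  }
}

double get_read_time_factor() { return lookup("read_time_factor"); }
double get_scan_time_factor() { return lookup("scan_time_factor"); }
```

Callers see only the two functions. The column enum can be renamed or reordered without touching any file that includes the header.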
> +
> +double get_read_time_factor(THD *thd);
> +double get_scan_time_factor(THD *thd);
On the other hand, these two are API functions all right.
> +
> +#endif /* _opt_costmodel_h */
> diff --git a/sql/opt_costmodel.cc b/sql/opt_costmodel.cc
> --- /dev/null
> +++ b/sql/opt_costmodel.cc
> @@ -0,0 +1,178 @@
> +#include "sql_base.h"
> +#include "key.h"
> +#include "opt_costmodel.h"
> ...
> +static
> +inline int open_table(THD *thd, TABLE_LIST *table,
> +                      Open_tables_backup *backup,
> +                      bool for_write)
> +{
> +  init_table_list(table, for_write);
> +  init_mdl_requests(table);
better use table->init_one_table() instead
> +  return open_system_tables_for_read(thd, table, backup);
> +}
> +
> +class All_constants
> +{
> +private:
> +  /* Handler used for retrieval of all_constants table */
> +  handler *all_constants_file;
> +  /* Table to read constants from or to update/delete */
> +  TABLE *all_constants_table;
> +  /* Length of the key to access all_constants table */
> +  uint key_length;
> +  /* Number of the keys to access all_constants table */
> +  uint key_idx;
> +  /* Structure for the index to access all_constants table */
> +  KEY *key_info;
> +  /* Record buffers used to access/update all_constants table */
> +  uchar *record[2];
record buffers are in the TABLE already, no need to duplicate them here
> +
> +  LEX_STRING const_name;
> +
> +  Field *const_name_field;
> +  Field *const_value_field;
> +
> +  double const_value;
> +
> +public:
> +  All_constants(TABLE *tab, LEX_STRING name)
> +    :all_constants_table(tab), const_name(name)
> +  {
> +    all_constants_file= all_constants_table->file;
> +    /* all_constants table has only one key */
> +    key_idx= 0;
> +    key_info= &all_constants_table->key_info[key_idx];
> +    key_length= key_info->key_length;
> +    record[0]= all_constants_table->record[0];
> +    record[1]= all_constants_table->record[1];
> +    const_name_field= all_constants_table->field[ALL_CONSTANTS_CONST_NAME];
> +    const_value_field= all_constants_table->field[ALL_CONSTANTS_CONST_VALUE];
Why are you copying all this from the TABLE into your class?
> +    const_value= 1.0;
> +  }
> +
> +  /**
> +    @brief
> +    Set the key fields for the all_constants table
> +
> +    @details
> +    The function sets the value of the field const_name
> +    in the record buffer for the table all_constants.
> +    This field is the primary key for the table.
> +
> +    @note
> +    The function is supposed to be called before any use of the
> +    method find_const for an object of the All_constants class.
> +  */
> +
> +  void set_key_fields()
> +  {
> +    const_name_field->store(const_name.str, const_name.length, system_charset_info);
> +  }
> +
> +  /**
> +    @brief
> +    Find a record in the all_constants table by a primary key
> +
> +    @details
> +    The function looks for a record in all_constants by its primary key.
> +    It assumes that the key fields have been already stored in the record
> +    buffer of all_constants table.
> +
> +    @retval
> +    FALSE     the record is not found
> +    @retval
> +    TRUE      the record is found
> +  */
> +
> +  bool find_const()
> +  {
> +    uchar key[MAX_KEY_LENGTH];
> +    key_copy(key, record[0], key_info, key_length);
> +    return !all_constants_file->ha_index_read_idx_map(record[0], key_idx, key,
> +                                                      HA_WHOLE_KEY, HA_READ_KEY_EXACT);
> +  }
> +
> +  void read_const_value()
> +  {
> +    if (find_const())
> +    {
> +      Field *const_field= all_constants_table->field[ALL_CONSTANTS_CONST_VALUE];
> +      if(!const_field->is_null())
> +      {
> +        const_value= const_field->val_real();
> +      }
> +    }
> +  }
> +
> +  double get_const_value()
> +  {
> +    return const_value;
> +  }
> +
> +  ~All_constants() {}
> +};
> +
> +static double read_constant_from_table(THD *thd, const LEX_STRING const_name)
> +{
> +  TABLE_LIST table_list;
> +  Open_tables_backup open_tables_backup;
> +
> +  if (open_table(thd, &table_list, &open_tables_backup, FALSE))
> +  {
> +    thd->clear_error();
> +    return 1.0;
> +  }
> +
> +  All_constants all_constants(table_list.table, const_name);
> +  all_constants.set_key_fields();
> +  all_constants.read_const_value();
> +
> +  close_system_tables(thd, &open_tables_backup);
> +
> +  return all_constants.get_const_value();
> +}
> +
> +double get_read_time_factor(THD *thd)
> +{
> +  return read_constant_from_table(thd, read_time_factor);
> +}
> +
> +double get_scan_time_factor(THD *thd)
> +{
> +  return read_constant_from_table(thd, scan_time_factor);
> +}
No-no-no. *Every* time the optimizer needs a constant, you open the table, search in the index, and close the table! That's many times per query. Per *every* query!
What you need to do is add some kind of an initialization function, for example init_cost_factors() or Cost_factors::init() (if everything is in the Cost_factors class). In this function/method you open the table, read everything into memory, close the table. And then you never touch the table. You only access constants from memory, like
  inline double Cost_factors::scan_time() { return scan_time; }
  inline double Cost_factors::read_time() { return read_time; }
At least until you implement the code that collects statistics and updates cost factors accordingly.
Regards, Sergei
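A minimal sketch of the "load once, then read from memory" pattern suggested above. The std::map argument is a hypothetical stand-in for the rows of the cost factor table, and making the members static is an assumption of this sketch. The data members are named scan_time_factor and read_time_factor because C++ does not allow a data member and a method of the same class to share a name.

```cpp
#include <map>
#include <string>

// Stand-in for the rows of the cost factor table (name -> value).
// In the server these would come from reading the table once.
typedef std::map<std::string, double> Factor_rows;

class Cost_factors
{
public:
  // Read everything into memory once; afterwards the "table" is not touched.
  static void init(const Factor_rows &rows)
  {
    scan_time_factor= find_or_default(rows, "scan_time_factor");
    read_time_factor= find_or_default(rows, "read_time_factor");
  }

  // Cheap accessors: a plain memory read, fine to call many times per query.
  static double scan_time() { return scan_time_factor; }
  static double read_time() { return read_time_factor; }

private:
  static double find_or_default(const Factor_rows &rows, const char *name)
  {
    Factor_rows::const_iterator it= rows.find(name);
    return it == rows.end() ? 1.0 : it->second;   // 1.0 = neutral factor
  }

  static double scan_time_factor;
  static double read_time_factor;
};

double Cost_factors::scan_time_factor= 1.0;
double Cost_factors::read_time_factor= 1.0;
```

A missing row falls back to 1.0, mirroring the default the reviewed patch returns when the table cannot be opened.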
Hi guys,
One question, which could perhaps be answered at the right time: with the new cost model, could EXPLAIN show new information about how much time the query is expected to take? And maybe something like estimated time / executed time (via SHOW WARNINGS?). I will read the patch for this work and check whether I could help with something like this. Is it interesting?
2014-05-26 13:30 GMT-03:00 Sergei Golubchik <serg@mariadb.org>:
Hi, Anshu!
Hi all,
You can find my this week's blog entry at http://igniting.in/gsoc2014/2014/05/25/coding-things-up/ . I have created a branch on launchpad for my work: http://bazaar.launchpad.net/~igniting/maria/maria/revision/4211 . You can give your suggestions/reviews either on this thread or as a comment on
On May 25, Anshu Avinash wrote: the
blog itself.
I've got used to launchpad, where one can subscribe to a branch and get all pushes by email - so that I could simply reply to these emails.
It doesn't seem to work this way on guthub :(
Anyway...
diff --git a/mysql-test/t/costmodel.test b/mysql-test/t/costmodel.test --- /dev/null +++ b/mysql-test/t/costmodel.test @@ -0,0 +1,9 @@ +--disable_warnings +DROP TABLE IF EXISTS t1; +--enable_warnings + +CREATE TABLE t1 (a INT); +INSERT INTO t1 VALUES (1); +SELECT * FROM t1;
This test doesn't seem to do anything with your changes in the code. it doesn't test new constants, that you've added.
+ +DROP TABLE t1; diff --git a/scripts/mysql_system_tables.sql b/scripts/mysql_system_tables.sql --- a/scripts/mysql_system_tables.sql +++ b/scripts/mysql_system_tables.sql @@ -229,3 +229,10 @@ CREATE TABLE IF NOT EXISTS index_stats (db_name varchar(64) NOT NULL, table_name -- we avoid mixed-engine transactions. set storage_engine=@orig_storage_engine; CREATE TABLE IF NOT EXISTS gtid_slave_pos (domain_id INT UNSIGNED NOT NULL, sub_id BIGINT UNSIGNED NOT NULL, server_id INT UNSIGNED NOT NULL, seq_no BIGINT UNSIGNED NOT NULL, PRIMARY KEY (domain_id, sub_id)) comment='Replication slave GTID position'; + +-- Tables for Self Tuning Cost Optimizer + +CREATE TABLE IF NOT EXISTS all_constants (const_name varchar(64) NOT NULL, const_value double NOT NULL, PRIMARY KEY (const_name) ) ENGINE=MyISAM CHARACTER SET utf8 COLLATE utf8_bin comment='Constants for optimizer';
Uhm... I don't particularly like the "all_constants" name, we don't have other tables or data structures that are called like that.
What about "optimizer_cost_factors" ? Same applies to the code changes below (although, in the code it could be simply "cost_factors", for brevity)
+ +-- Remember for later if all_constants table already existed +set @had_all_constants_table= @@warning_count != 0; diff --git a/sql/opt_costmodel.h b/sql/opt_costmodel.h --- /dev/null +++ b/sql/opt_costmodel.h @@ -0,0 +1,15 @@ +/* Interface to get constants */ + +#ifndef _opt_costmodel_h
SQL_OPT_COSTMODEL_INCLUDED please, not _opt_costmodel_h
+#define _opt_costmodel_h + +enum enum_all_constants_col +{ + ALL_CONSTANTS_CONST_NAME, + ALL_CONSTANTS_CONST_VALUE +};
You can move the enum into the opt_costmodel.cc
This opt_costmodel.h file is the interface - other source files includes it to get access to your public API. But you don't want them to read your table directly, so they don't need to know how the fields are mapped.
I mean, this enum is part of the implementation, your internal stuff, not the public API that others need to see or care about.
+ +double get_read_time_factor(THD *thd); +double get_scan_time_factor(THD *thd);
On the other hand, these two are API functions all right.
+ +#endif /* _opt_costmodel_h */ diff --git a/sql/opt_costmodel.cc b/sql/opt_costmodel.cc --- /dev/null +++ b/sql/opt_costmodel.cc @@ -0,0 +1,178 @@ +#include "sql_base.h" +#include "key.h" +#include "opt_costmodel.h" ... +static +inline int open_table(THD *thd, TABLE_LIST *table, + Open_tables_backup *backup, + bool for_write) +{ + init_table_list(table, for_write); + init_mdl_requests(table);
better use table->init_one_table() instead
+ return open_system_tables_for_read(thd, table, backup); +} + +class All_constants +{ +private: + /* Handler used for retrieval of all_constants table */ + handler *all_constants_file; + /* Table to read constants from or to update/delete */ + TABLE *all_constants_table; + /* Length of the key to access all_constants table */ + uint key_length; + /* Number of the keys to access all_constants table */ + uint key_idx; + /* Structure for the index to access all_constants table */ + KEY *key_info; + /* Record buffers used to access/update all_constants table */ + uchar *record[2];
record buffers are in the TABLE already, no need to duplicate them here
+ + LEX_STRING const_name; + + Field *const_name_field; + Field *const_value_field; + + double const_value; + +public: + All_constants(TABLE *tab, LEX_STRING name) + :all_constants_table(tab), const_name(name) + { + all_constants_file= all_constants_table->file; + /* all_constants table has only one key */ + key_idx= 0; + key_info= &all_constants_table->key_info[key_idx]; + key_length= key_info->key_length; + record[0]= all_constants_table->record[0]; + record[1]= all_constants_table->record[1]; + const_name_field= all_constants_table->field[ALL_CONSTANTS_CONST_NAME]; + const_value_field= all_constants_table->field[ALL_CONSTANTS_CONST_VALUE];
Why are you copying all this from the TABLE into your class?
+    const_value= 1.0;
+  }
+
+  /**
+    @brief
+    Set the key fields for the all_constants table
+
+    @details
+    The function sets the value of the field const_name
+    in the record buffer for the table all_constants.
+    This field is the primary key for the table.
+
+    @note
+    The function is supposed to be called before any use of the
+    method find_const for an object of the All_constants class.
+  */
+
+  void set_key_fields()
+  {
+    const_name_field->store(const_name.str, const_name.length, system_charset_info);
+  }
+
+  /**
+    @brief
+    Find a record in the all_constants table by a primary key
+
+    @details
+    The function looks for a record in all_constants by its primary key.
+    It assumes that the key fields have been already stored in the record
+    buffer of all_constants table.
+
+    @retval
+    FALSE the record is not found
+    @retval
+    TRUE the record is found
+  */
+
+  bool find_const()
+  {
+    uchar key[MAX_KEY_LENGTH];
+    key_copy(key, record[0], key_info, key_length);
+    return !all_constants_file->ha_index_read_idx_map(record[0], key_idx, key,
+                                                      HA_WHOLE_KEY, HA_READ_KEY_EXACT);
+  }
+
+  void read_const_value()
+  {
+    if (find_const())
+    {
+      Field *const_field= all_constants_table->field[ALL_CONSTANTS_CONST_VALUE];
+      if(!const_field->is_null())
+      {
+        const_value= const_field->val_real();
+      }
+    }
+  }
+
+  double get_const_value()
+  {
+    return const_value;
+  }
+
+  ~All_constants() {}
+};
+
+static double read_constant_from_table(THD *thd, const LEX_STRING const_name)
+{
+  TABLE_LIST table_list;
+  Open_tables_backup open_tables_backup;
+
+  if (open_table(thd, &table_list, &open_tables_backup, FALSE))
+  {
+    thd->clear_error();
+    return 1.0;
+  }
+
+  All_constants all_constants(table_list.table, const_name);
+  all_constants.set_key_fields();
+  all_constants.read_const_value();
+
+  close_system_tables(thd, &open_tables_backup);
+
+  return all_constants.get_const_value();
+}
+
+double get_read_time_factor(THD *thd)
+{
+  return read_constant_from_table(thd, read_time_factor);
+}
+
+double get_scan_time_factor(THD *thd)
+{
+  return read_constant_from_table(thd, scan_time_factor);
+}
No-no-no. *Every* time when an optimizer needs a constant you open the table, search in the index, and close the table! That's many times per query. Per *every* query!
What you need to do is add some kind of an initialization function, for example
init_cost_factors() or Cost_factors::init() (if everything is in the Cost_factors class)
in this function/method you open the table, read everything in memory, close the table. And then you never touch the table.
You only access constants from memory, like
inline double Cost_factors::scan_time() { return scan_time_factor; }
inline double Cost_factors::read_time() { return read_time_factor; }
At least until you implement the code that collects statistics and updates cost factors accordingly.
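A minimal sketch of the "read once, then serve from memory" idea, assuming the table rows have already been fetched into a map. Factor_rows, the accessor names and the row names used here are hypothetical stand-ins; the real init would read the table through the server's own API.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the rows of the constants table; the real
// code would fetch them through the handler API at server startup.
typedef std::map<std::string, double> Factor_rows;

class Cost_factors
{
  double read_time_factor;
  double scan_time_factor;
  bool initialized;

public:
  Cost_factors()
    : read_time_factor(1.0), scan_time_factor(1.0), initialized(false) {}

  // Called once: copy every known row into memory, keep 1.0 otherwise.
  void init(const Factor_rows &rows)
  {
    Factor_rows::const_iterator it;
    if ((it= rows.find("READ_TIME_FACTOR")) != rows.end())
      read_time_factor= it->second;
    if ((it= rows.find("SCAN_TIME_FACTOR")) != rows.end())
      scan_time_factor= it->second;
    initialized= true;
  }

  // Hot path used by the optimizer: a plain memory read, no table access.
  double read_factor() const { assert(initialized); return read_time_factor; }
  double scan_factor() const { assert(initialized); return scan_time_factor; }
};
```

With this shape the table is touched only inside init(); every later call to read_factor() or scan_factor() costs no more than reading a double.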
Regards, Sergei
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
Hi, Roberto! On May 26, Roberto Spadim wrote:
hi guys, one question that maybe could be answered at the right time... with the new cost model, maybe explain could show new information about how much time we will spend? and maybe something like estimated time / executed time (via show warnings?) i will read the patch about this work, and check if i could help with something like this. is it interesting?
First, it's not about creating a new cost model. Only about providing more adequate real-life values to certain parameters in the current cost model. And yes, the work could, possibly, help to estimate the query execution time. Although this wasn't the goal, we only wanted to find correct relative values, not absolute. Regards, Sergei
Hi Sergei! nice :) i'm reading about it. just one doubt, i don't understand everything yet: there's a read time and a scan time, what's the difference? read = sequential read and scan = non-sequential read? or something like table read cost, index read cost? 2014-05-27 5:41 GMT-03:00 Sergei Golubchik <serg@mariadb.org>:
Hi, Roberto! On May 27, Roberto Spadim wrote:
Hi Sergei! nice :) i'm reading about it. just one doubt, i don't understand everything yet: there's a read time and a scan time, what's the difference? read = sequential read and scan = non-sequential read? or something like table read cost, index read cost?
Yes, the handler methods are:
virtual double handler::scan_time() { return stats.data_file_length / IO_SIZE + 2; }
virtual double handler::read_time(uint index, uint ranges, ha_rows rows) { return ranges+rows; }
these are the default implementations, in the base handler class. Method names are historical and *very* old.
The return values are kind of "number of disk seeks". You can see that scan_time() assumes a sequential disk read of blocks of IO_SIZE bytes.
While read_time() assumes random reads of 'rows' rows from the data file and 'ranges' ranges from the index.
These default implementations, I suspect, predate even MyISAM.
Regards, Sergei
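To make the two default formulas concrete, here is a minimal sketch that evaluates them outside the server. The free functions and the IO_SIZE_BYTES constant are made up for illustration; 4096 is assumed to match IO_SIZE in the server headers.

```cpp
#include <cassert>

// Assumption: IO_SIZE is 4096 bytes in the server headers.
static const double IO_SIZE_BYTES= 4096.0;

// Mirrors the default handler::scan_time(): sequential blocks plus 2.
double default_scan_time(double data_file_length)
{
  assert(data_file_length >= 0);
  return data_file_length / IO_SIZE_BYTES + 2;
}

// Mirrors the default handler::read_time(): one seek per range and per row.
double default_read_time(unsigned ranges, double rows)
{
  return ranges + rows;
}
```

For a 40960-byte data file the scan estimate is 40960 / 4096 + 2 = 12, while reading 5 rows through one range is 1 + 5 = 6. Both numbers are unitless "seeks", which is why only their ratio matters to the optimizer.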
nice :) maybe part of msql? :) very old times. nice, i will stay updated with this GSoC project, very interesting and very good work :) 2014-05-27 15:35 GMT-03:00 Sergei Golubchik <serg@mariadb.org>:
Hi all, As per serg's comments on the previous commit, I have modified the code to add an init function which initializes all the cost factors at server start. The latest code is at https://github.com/igniting/server/commits/selfTuningOptimizer. I have also attached the diff here. However, I am still not able to come up with an example for my test case for READ_TIME_FACTOR and SCAN_TIME_FACTOR. I will go through the code again to figure it out. Currently the test case I have written only shows the workflow: EXPLAIN SELECT ...; update the constants table directly; reconnect; EXPLAIN SELECT. Once I have this figured out, I will move on to measuring the query time and updating the stats for the corresponding cost factors. I will also write a blog post with more details. Regards Anshu Avinash On Wed, May 28, 2014 at 1:05 AM, Roberto Spadim <roberto@spadim.com.br> wrote:
Hi, Anshu! On Jun 08, Anshu Avinash wrote:
Hi all,
As per serg comments on the previous commit, I have modified the code to add an init function which will initialize all the cost factors at the start of the server. The latest code is at https://github.com/igniting/server/commits/selfTuningOptimizer. I have also attached the diff here.
However, I am still not able to come up with an example for my test case for READ_TIME_FACTOR and SCAN_TIME_FACTOR. I will go through the code again to figure it out. Currently the test case which I have written just gives the workflow: EXPLAIN SELECT ...; Update constant tables directly; Reconnect; EXPLAIN SELECT.
For a simple test case, try something like the following:
create table t1 (a int auto_increment primary key, b int);
insert t1 (b) values (1),(2),(3),...
I think, about 20-30 rows should be enough.
explain select * from t1 where a > 15;
it doesn't have to be 15, try to experiment with different constants in the query. Pick the one where 'a > N' gives you a table scan (explain shows 'ALL'), but with 'a > N+1' explain shows 'range'.
Now, modify one of your cost factors and try again. explain should change from 'ALL' to 'range'.
I'll send a second email soon with my comments about your patch, thanks for attaching it!
Regards, Sergei
Hi, Anshu! On Jun 08, Anshu Avinash wrote:
Hi all,
As per serg comments on the previous commit, I have modified the code to add an init function which will initialize all the cost factors at the start of the server. The latest code is at https://github.com/igniting/server/commits/selfTuningOptimizer. I have also attached the diff here.
Much better! Good work! See my comments below.
diff --git a/mysql-test/t/costmodel.test b/mysql-test/t/costmodel.test
--- /dev/null
+++ b/mysql-test/t/costmodel.test
@@ -0,0 +1,37 @@
+--disable_warnings
+DROP DATABASE IF EXISTS world;
+--enable_warnings
+
+CREATE DATABASE world;
+use world;
+
+--source include/world_schema.inc
+--disable_query_log
+--disable_result_log
+--disable_warnings
+--source include/world.inc
+--enable_query_log
+--enable_result_log
+--enable_warnings
+
+EXPLAIN
+  SELECT c.name, ci.name FROM Country c, City ci
+  WHERE c.capital=ci.id;
+
+use mysql;
+
+UPDATE optimizer_cost_factors
+SET const_value=1000.0
+WHERE const_name='READ_TIME_FACTOR';
+
+--enable_reconnect
+--exec echo "restart" > $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+--source include/wait_until_connected_again.inc
+
+use world;
+
+EXPLAIN
+  SELECT c.name, ci.name FROM Country c, City ci
+  WHERE c.capital=ci.id;
+
+DROP DATABASE world;
This is a very correctly written test file. Good! But just to test whether READ_TIME_FACTOR and SCAN_TIME_FACTOR work, I'd suggest to use a simpler test, not a world database, not a join, but as I described in another email - one table, minimal structure, one simple SELECT.
diff --git a/scripts/mysql_system_tables.sql b/scripts/mysql_system_tables.sql
--- a/scripts/mysql_system_tables.sql
+++ b/scripts/mysql_system_tables.sql
@@ -229,3 +229,10 @@ CREATE TABLE IF NOT EXISTS index_stats (db_name varchar(64) NOT NULL, table_name
 -- we avoid mixed-engine transactions.
 set storage_engine=@orig_storage_engine;
 CREATE TABLE IF NOT EXISTS gtid_slave_pos (domain_id INT UNSIGNED NOT NULL, sub_id BIGINT UNSIGNED NOT NULL, server_id INT UNSIGNED NOT NULL, seq_no BIGINT UNSIGNED NOT NULL, PRIMARY KEY (domain_id, sub_id)) comment='Replication slave GTID position';
+
+-- Tables for Self Tuning Cost Optimizer
+
+CREATE TABLE IF NOT EXISTS optimizer_cost_factors (const_name varchar(64) NOT NULL, const_value double NOT NULL, PRIMARY KEY (const_name) ) ENGINE=MyISAM CHARACTER SET utf8 COLLATE utf8_bin comment='Constants for optimizer';
please, make it latin1, not utf8. We don't need to support utf8 names for cost factors :)
+
+-- Remember for later if all_constants table already existed
+set @had_optimizer_cost_factors_table= @@warning_count != 0;
diff --git a/scripts/mysql_system_tables_data.sql b/scripts/mysql_system_tables_data.sql
--- a/scripts/mysql_system_tables_data.sql
+++ b/scripts/mysql_system_tables_data.sql
@@ -53,3 +53,9 @@ INSERT INTO tmp_proxies_priv VALUES ('localhost', 'root', '', '', TRUE, '', now(
 REPLACE INTO tmp_proxies_priv SELECT @current_hostname, 'root', '', '', TRUE, '', now() FROM DUAL WHERE @current_hostname != 'localhost';
 INSERT INTO proxies_priv SELECT * FROM tmp_proxies_priv WHERE @had_proxies_priv_table=0;
 DROP TABLE tmp_proxies_priv;
+
+CREATE TEMPORARY TABLE tmp_optimizer_cost_factors LIKE optimizer_cost_factors;
+INSERT INTO tmp_optimizer_cost_factors VALUES ('READ_TIME_FACTOR', 1.0);
+INSERT INTO tmp_optimizer_cost_factors VALUES ('SCAN_TIME_FACTOR', 1.0);
+INSERT INTO optimizer_cost_factors SELECT * FROM tmp_optimizer_cost_factors WHERE @had_optimizer_cost_factors_table=0;
+DROP TABLE tmp_optimizer_cost_factors;
Eh. I'd suggest to not use a temporary table and not use WHERE @had_optimizer_cost_factors_table but simply
INSERT INTO optimizer_cost_factors VALUES ('READ_TIME_FACTOR', 1.0);
INSERT INTO optimizer_cost_factors VALUES ('SCAN_TIME_FACTOR', 1.0);
the difference is - if optimizer_cost_factors table already exists, your code will insert *nothing at all*. But the one I suggest will insert missing constants while keeping existing values intact. For example, if the table optimizer_cost_factors exists, and has only one row { READ_TIME_FACTOR, 10.0 }. Then after your script it will still have this one row, while after my suggested solution it'll be
| READ_TIME_FACTOR | 10.0 |
| SCAN_TIME_FACTOR | 1.0 |
which, I think, is better. If you'd like you can even use INSERT IGNORE to ignore duplicate key errors. But they'll be ignored by the client anyway, so it doesn't matter much in this case.
diff --git a/sql/handler.h b/sql/handler.h
--- a/sql/handler.h
+++ b/sql/handler.h
@@ -2741,7 +2742,8 @@ public:
     reset_statistics();
   }
   virtual double scan_time()
-  { return ulonglong2double(stats.data_file_length) / IO_SIZE + 2; }
+  { return Cost_factors::get_scan_time_factor() *
+    (ulonglong2double(stats.data_file_length) / IO_SIZE + 2); }
No, this is wrong. You use the scan factor *in the virtual method*, which means that every storage engine that overrides handler::scan_time() will not use Cost_factors::get_scan_time_factor().
The correct solution - see ha_index_init, ha_rnd_next, etc. You make handler::scan_time() a *protected* method. And create a public non-virtual method handler::ha_scan_time(). Like this
double ha_scan_time() { return Cost_factors::scan_factor() * scan_time(); }
(above I've also renamed the get_scan_time_factor() to be a bit shorter)
   /**
     The cost of reading a set of ranges from the table using an index
@@ -2755,7 +2757,7 @@ public:
     using an index by calling it using read_time(index, 1, table_size).
   */
   virtual double read_time(uint index, uint ranges, ha_rows rows)
-  { return rows2double(ranges+rows); }
+  { return Cost_factors::get_read_time_factor() * rows2double(ranges+rows); }
same here. While you have your cost factors in the virtual method, they won't have any effect on the optimizer, you should be able to see it in the test case.
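The wrapper pattern described above can be sketched in isolation as follows. This is only an illustration: g_scan_factor, scan_factor() and ha_example are hypothetical names, and the hard-coded costs stand in for real engine estimates.

```cpp
#include <cassert>

// Hypothetical stand-in for the Cost_factors accessor from the review.
static double g_scan_factor= 1.0;
inline double scan_factor() { return g_scan_factor; }

class handler
{
public:
  virtual ~handler() {}

  // Public and non-virtual: the only entry point outside code can call,
  // so the factor is applied no matter which engine is underneath.
  double ha_scan_time()
  {
    double cost= scan_time();
    assert(cost >= 0);
    return scan_factor() * cost;
  }

protected:
  // Engines override this; calling it from outside the class hierarchy
  // is a compile error, which is how missed call sites get found.
  virtual double scan_time() { return 12.0; }
};

// A made-up engine with its own estimate.
class ha_example : public handler
{
protected:
  virtual double scan_time() { return 100.0; }
};
```

With g_scan_factor set to 2.0, ha_example reports 200 through ha_scan_time() even though its override never mentions the factor. Had the factor been multiplied inside the base class's virtual scan_time(), the override would have silently dropped it.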
   /**
     Calculate cost of 'keyread' scan for given index and number of records.
diff --git a/sql/opt_costmodel.h b/sql/opt_costmodel.h
--- /dev/null
+++ b/sql/opt_costmodel.h
@@ -0,0 +1,25 @@
+/* Interface to get constants */
+
+#ifndef SQL_OPT_COSTMODEL_INCLUDED
+#define SQL_OPT_COSTMODEL_INCLUDED
+
+class Cost_factors
+{
+private:
+  static bool isInitialized;
please, try to stick to naming conventions that we have in MariaDB. as you might've noticed, we don't use camelCase.
+  static double read_time_factor;
+  static double scan_time_factor;
+
+public:
+  static void init();
+  static inline double get_read_time_factor()
+  {
+    return read_time_factor;
+  }
+  static inline double get_scan_time_factor()
+  {
+    return scan_time_factor;
+  }
+};
Ok, so you've made everything static and never instantiated a single instance of the Cost_factors class. I see. What benefits does this have as compared to making nothing inside the class static, but creating one static Cost_factors object?
+
+#endif /* SQL_OPT_COSTMODEL_INCLUDED */
diff --git a/sql/opt_costmodel.cc b/sql/opt_costmodel.cc
--- /dev/null
+++ b/sql/opt_costmodel.cc
@@ -0,0 +1,103 @@
+#include "sql_base.h"
+#include "key.h"
+#include "records.h"
+#include "opt_costmodel.h"
+
+/* Name of database to which the optimizer_cost_factors table belongs */
+static const LEX_STRING db_name= { C_STRING_WITH_LEN("mysql") };
+
+/* Name of all_constants table */
+static const LEX_STRING table_name = { C_STRING_WITH_LEN("optimizer_cost_factors") };
+
+/* Columns in the optimizer_cost_factors_table */
+enum cost_factors_col
+{
+  COST_FACTORS_CONST_NAME,
+  COST_FACTORS_CONST_VALUE
+};
+
+/* Name of the constants present in table */
+static const LEX_STRING read_time_factor_name = { C_STRING_WITH_LEN("READ_TIME_FACTOR") };
+static const LEX_STRING scan_time_factor_name = { C_STRING_WITH_LEN("SCAN_TIME_FACTOR") };
+
+/* Helper functions for Cost_factors::init() */
+
+static
+inline int open_table(THD *thd, TABLE_LIST *table,
+                      Open_tables_backup *backup,
+                      bool for_write)
+{
+  enum thr_lock_type lock_type_arg= for_write? TL_WRITE: TL_READ;
+  table->init_one_table(db_name.str, db_name.length, table_name.str,
+                        table_name.length, table_name.str, lock_type_arg);
+  return open_system_tables_for_read(thd, table, backup);
+}
+
+static inline void clean_up(THD *thd)
+{
+  close_mysql_tables(thd);
+  delete thd;
+}
+
+/* Initialize static class members */
+bool Cost_factors::isInitialized= false;
+double Cost_factors::read_time_factor= 1.0;
+double Cost_factors::scan_time_factor= 1.0;
+
+/* Interface functions */
+
+void Cost_factors::init()
+{
+  TABLE_LIST table_list;
+  Open_tables_backup open_tables_backup;
+  READ_RECORD read_record_info;
+  TABLE *table;
+  MEM_ROOT mem;
+  init_sql_alloc(&mem, 1024, 0, MYF(0));
+  THD *new_thd = new THD;
+
+  if(!new_thd)
+  {
+    free_root(&mem, MYF(0));
+    DBUG_VOID_RETURN;
+  }
+
+  new_thd->thread_stack= (char *) &new_thd;
+  new_thd->store_globals();
+  new_thd->set_db(db_name.str, db_name.length);
+
+  if(open_table(new_thd, &table_list, &open_tables_backup, FALSE))
+  {
+    clean_up(new_thd);
+    DBUG_VOID_RETURN;
it's ok to do goto err: here and put the err: label at the end with the cleanup code - this pattern is used very often in MariaDB
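A minimal sketch of the goto-to-cleanup pattern mentioned above, written with stub resources so it compiles on its own. Session, open_table_stub() and live_sessions are hypothetical; they stand in for the THD, the table open call and whatever must be released.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical resource standing in for the THD plus the opened table.
struct Session { bool table_open; };

static int live_sessions= 0;   // lets a caller check that nothing leaked

// Stub for open_table(); like the patch's helper, non-zero means failure.
static bool open_table_stub(Session *s, bool should_fail)
{
  if (should_fail)
    return true;
  s->table_open= true;
  return false;
}

// Returns 0 on success and 1 on failure; cleanup is written exactly once.
int init_cost_factors(bool fail_open)
{
  int result= 1;
  Session *s= (Session *) std::malloc(sizeof(Session));
  if (!s)
    return 1;
  s->table_open= false;
  live_sessions++;

  if (open_table_stub(s, fail_open))
    goto err;

  assert(s->table_open);
  /* ... the rows would be read here ... */
  result= 0;

err:
  s->table_open= false;        // stands in for close_mysql_tables()
  std::free(s);
  live_sessions--;
  return result;
}
```

The success path simply falls through the label, so both outcomes release the same resources. Note that in C++ a goto may not jump over a variable initialization, which is why every local is declared before the first goto.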
+  }
+
+  table= table_list.table;
+  if(init_read_record(&read_record_info, new_thd, table, NULL, 1, 0, FALSE))
+  {
+    clean_up(new_thd);
+    DBUG_VOID_RETURN;
+  }
+
+  table->use_all_columns();
+  while (!read_record_info.read_record(&read_record_info))
+  {
+    LEX_STRING const_name;
+    const_name.str= get_field(&mem, table->field[COST_FACTORS_CONST_NAME]);
+    const_name.length= strlen(const_name.str);
you don't use const_name.length anywhere, you don't need LEX_STRING here.
+
+    double const_value;
+    const_value= table->field[COST_FACTORS_CONST_VALUE]->val_real();
+    if(!strcmp(const_name.str, read_time_factor_name.str))
+    {
+      Cost_factors::read_time_factor= const_value;
+    }
+    else if(!strcmp(const_name.str, scan_time_factor_name.str))
+    {
+      Cost_factors::scan_time_factor= const_value;
+    }
To avoid many if/else if you can do like this:
struct st_factor { const char *name; double *value; } factors[] = {
  { "READ_TIME_FACTOR", &Cost_factors::read_time_factor },
  { "SCAN_TIME_FACTOR", &Cost_factors::scan_time_factor },
  { 0, 0 }
};
or even
#define FACTOR(X) { #X, &Cost_factors::X }
struct st_factor { const char *name; double *value; } factors[] = {
  FACTOR(read_time_factor),
  FACTOR(scan_time_factor),
  { 0, 0 }
};
either way, instead of if/elseif you do (with const_name being a plain char* here)
st_factor *f;
for (f= factors; f->name; f++)
{
  if (strcasecmp(f->name, const_name) == 0)
  {
    *(f->value)= const_value;
    break;
  }
}
if (f->name == 0)
  sql_print_warning("Invalid row in the optimizer_cost_factors table: %s", const_name);
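A self-contained variant of the table-driven lookup above, as a rough sketch. It uses file-level doubles instead of Cost_factors members and a hypothetical set_factor() helper that returns false where the server code would print a warning; strcasecmp comes from POSIX <strings.h>, so this assumes a POSIX platform.

```cpp
#include <cassert>
#include <strings.h>   // strcasecmp (POSIX, not standard C++)

// Stand-ins for the static members of Cost_factors.
static double read_time_factor= 1.0;
static double scan_time_factor= 1.0;

struct st_factor { const char *name; double *value; };

// #X turns the identifier into its name, so each entry is spelled once.
#define FACTOR(X) { #X, &X }
static st_factor factors[]= {
  FACTOR(read_time_factor),
  FACTOR(scan_time_factor),
  { 0, 0 }               // terminator
};

// Returns true if the row named a known factor and the value was stored.
bool set_factor(const char *const_name, double const_value)
{
  assert(const_name != 0);
  for (st_factor *f= factors; f->name; f++)
  {
    // Case-insensitive: the macro yields lower-case names, while the
    // table rows are upper-case.
    if (strcasecmp(f->name, const_name) == 0)
    {
      *(f->value)= const_value;
      return true;
    }
  }
  return false;          // unknown row; the caller can log a warning
}
```

Adding a third factor then means adding one variable and one FACTOR() line, with no change to the loop.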
+  }
+
+  clean_up(new_thd);
+  DBUG_VOID_RETURN;
+}
Regards, Sergei
Hi serg, I have implemented your suggestions. In the test case I created a table with 25 rows. Explain for 'select * from t1 where a > 19' gives 'ALL' while explain for 'select * from t1 where a > 20' gives 'range'. I have also written public methods ha_scan_time() and ha_read_time(). I should replace every occurrence of handler::scan_time() with ha_scan_time(), right? How do I make sure that I don't miss any place? Also, regarding making everything in the class Cost_factors static vs creating an object and using it everywhere: can't we use a namespace Cost_factors? How is it usually done? I don't see much usage of namespaces in the code. I'll send the updated patch as soon as these things are sorted out. Regards Anshu On Mon, Jun 9, 2014 at 11:25 PM, Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Anshu! On Jun 10, Anshu Avinash wrote:
Hi serg,
I have implemented your suggestions. In the test case I created a table with 25 rows. Explain for 'select * from t1 where a > 19' gave 'ALL' while explain for 'select * from t1 where a > 20' gives range. I have also written public methods ha_scan_time() and ha_read_time(). I should replace every occurrence of handler::scan_time() with ha_scan_time(), right? How do I make sure that I don't miss any place?
See other ha_xxx vs xxx methods, e.g. ha_rnd_init and rnd_init. You make scan_time() protected - and the compiler won't allow it to be called outside of the handler class.
Also regarding making everything in the class Cost_factors static vs creating an object and using it everywhere: can't we use a namespace Cost_factors? How is it done usually? I don't see much usage of namespaces in the code.
Yes, namespaces aren't used much in MariaDB, but feel free to use them, if you'd like. Regards, Sergei
Hi serg, I have attached the updated diff. I'm still unable to observe the effect introduced by the cost factors. Maybe I just need to study the usage of scan_time() and read_time() in greater detail. Regards Anshu On Tue, Jun 10, 2014 at 1:19 PM, Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Anshu! On Jun 11, Anshu Avinash wrote:
Hi serg,
I have attached the updated diff. I'm still unable to observe the effect introduced by the cost factors. Maybe I just need to study the usage of scan_time() and read_time() in greater detail.
Ok, see one suggestion below. If that won't help - push your changes to github, I'll try myself.
diff --git a/mysql-test/r/costmodel.result b/mysql-test/r/costmodel.result
new file mode 100644
index 0000000..6668758
--- /dev/null
+++ b/mysql-test/r/costmodel.result
@@ -0,0 +1,20 @@
+DROP TABLE IF EXISTS t1;
+CREATE TABLE t1 (a int auto_increment primary key, b int);
+INSERT INTO t1(b) values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),
+(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25);
+EXPLAIN
+SELECT * FROM t1
+WHERE a > 21;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 range PRIMARY PRIMARY 4 NULL 5 Using index condition
+use mysql;
+UPDATE optimizer_cost_factors
+SET const_value=2.0
+WHERE const_name='SCAN_TIME_FACTOR';
+use test;
+EXPLAIN
+SELECT * FROM t1
+WHERE a > 21;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 range PRIMARY PRIMARY 4 NULL 5 Using index condition
+DROP TABLE t1;
See? Your first EXPLAIN already shows type=range. And then you increase the cost of the table scan. Of course, nothing can change anymore. Try to make the first EXPLAIN show type=ALL (which means table scan), then increase the cost of the table scan and see how the optimizer switches to type=range. Regards, Sergei
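To make the reasoning concrete, here is a toy model of the choice between a table scan and a range scan. It is only a sketch: the real optimizer compares many more alternatives, and `choose_access`, the cost numbers, and the plain `cost * factor` formula are assumptions made for illustration.

```cpp
#include <string>

// Toy model: pick whichever access method has the lower cost.
// scan_cost is multiplied by the tunable factor, as SCAN_TIME_FACTOR
// is meant to do; range_cost is left untouched.
std::string choose_access(double scan_cost, double range_cost,
                          double scan_time_factor) {
  double adjusted_scan = scan_cost * scan_time_factor;
  return adjusted_scan < range_cost ? "ALL" : "range";
}
```

If the range scan is already cheaper with a factor of 1.0, for example `choose_access(10.0, 6.0, 1.0)`, raising the factor to 2.0 only makes the table scan look worse and the plan stays at range. A visible switch needs a starting point where the table scan wins, for example `choose_access(5.0, 6.0, 1.0)`, and then doubling the factor flips the choice.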
Hi serg, Yes, I had tried that. I had tried many combinations. However, I have updated the test in the code on github: https://github.com/igniting/server/tree/selfTuningOptimizer. Regards Anshu
Hi, In this case WHERE a > 6 should have a better chance to trigger a full table scan than a > 21. Jocelyn
Hi, Anshu!
Please, see the attached patch. It introduces a macro top_level_cond_is_satisfied() that now marks all places where a comparison (in the TIME_FOR_COMPARE sense) is done.
So, all you need to do is to convert this macro into an (inline) function and increment your counter from there.
As for TIME_FOR_COMPARE_ROWID, it's simpler. There's handler::cmp_ref(), a virtual handler method. Create handler::ha_cmp_ref(), non-virtual inline public method (make cmp_ref() protected, as usual). And increment your counter from ha_cmp_ref().
By the way, before you start writing the code to solve our equations, please write an email, explaining how you're going to solve them, that is, what method you'll be using. There are lots of them. Some are faster, some are slower. Some work better with sparse matrices (and ours will be very much sparse).
Regards, Sergei
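A rough sketch of the two counting hooks described above is given below. The definition of top_level_cond_is_satisfied() is in the attached patch and not shown in this thread, so the function signature here is an assumption, and `Cost_counters`, `cond_is_satisfied`, and `Counting_handler` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical per-query counters; the real patch may store them elsewhere.
struct Cost_counters {
  unsigned long compares = 0;        // TIME_FOR_COMPARE events
  unsigned long rowid_compares = 0;  // TIME_FOR_COMPARE_ROWID events
};

// Assumed shape of the macro once converted to an inline function:
// count the comparison, then pass the result through unchanged.
inline bool cond_is_satisfied(bool cond_result, Cost_counters &c) {
  c.compares++;
  return cond_result;
}

// Simplified stand-in for the handler class.
class Counting_handler {
public:
  Counting_handler(Cost_counters &c, std::size_t ref_length)
    : counters_(c), ref_length_(ref_length) {}
  virtual ~Counting_handler() {}

  // Public non-virtual wrapper: counts the call, then delegates.
  int ha_cmp_ref(const unsigned char *a, const unsigned char *b) {
    counters_.rowid_compares++;
    return cmp_ref(a, b);
  }

protected:
  // Default row-id comparison; an engine could override it.
  virtual int cmp_ref(const unsigned char *a, const unsigned char *b) {
    return std::memcmp(a, b, ref_length_);
  }

private:
  Cost_counters &counters_;
  std::size_t ref_length_;
};
```

Because every caller has to go through the wrapper, the counter cannot be bypassed by code outside the class.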
Hi Sergei,
I have been looking into various methods that can meet our conditions. The Gauss-Seidel method (http://en.wikipedia.org/wiki/Gauss%E2%80%93Seidel_method) might meet our requirements. I have just started writing the code.
In a nutshell, Gauss-Seidel is an iterative method. In our case, the initial value for every variable would be (total_query_time)/total_queries if total_queries is not equal to 0, and 0 otherwise. After each query, we will update the value of one variable (which should have a non-zero coefficient in the current equation). We would cycle through all constants, query after query. The logic is that after many queries, the values would converge to the correct values. Also, updating one variable after each query shouldn't be much of an overhead, and we just need to store the coefficients for the current query.
The code for this is very simple; I can give a patch in 1-2 days if the overall idea seems OK.
Regards Anshu
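For reference, a textbook Gauss-Seidel sweep looks roughly like the sketch below. It assumes a small, dense, square system A x = b, which is simpler than the sparse, one-update-per-query setting described above; `gauss_seidel` and its parameters are illustrative names. Convergence is only guaranteed for certain matrices (for example, strictly diagonally dominant ones), so the number of iterations is capped.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Textbook Gauss-Seidel for a square system A x = b.
// x holds the initial guess on entry and the approximation on return.
// Returns true if the largest change in one sweep drops below tol.
bool gauss_seidel(const std::vector<std::vector<double>> &A,
                  const std::vector<double> &b,
                  std::vector<double> &x,
                  double tol = 1e-10, int max_iter = 1000) {
  const std::size_t n = b.size();
  for (int iter = 0; iter < max_iter; iter++) {
    double max_change = 0.0;
    for (std::size_t i = 0; i < n; i++) {
      if (A[i][i] == 0.0)
        return false;               // cannot solve row i for x[i]
      double sum = b[i];
      for (std::size_t j = 0; j < n; j++)
        if (j != i)
          sum -= A[i][j] * x[j];    // uses already-updated x[j] for j < i
      double updated = sum / A[i][i];
      max_change = std::max(max_change, std::fabs(updated - x[i]));
      x[i] = updated;
    }
    if (max_change < tol)
      return true;
  }
  return false;                     // did not converge within max_iter
}
```

For the diagonally dominant system 4x + y = 1, 2x + 3y = 2, the sweep settles on x = 0.1, y = 0.6. A zero on the diagonal, meaning a variable that does not appear in its own equation, makes the function give up immediately.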
Hi, Anshu!
I don't have much experience here :( Are you sure that Gauss-Seidel will converge? It's not obvious to me that the convergence criteria are met on our data.
Besides, we do have a sparse matrix, that's true. But we don't (and won't ever) have a large matrix with millions of unknowns. We might get a thousand in a few years (ten per-engine constants times MAX_HA is already 640). So, methods that are asymptotically faster may not be faster in our case.
I'd suggest the following: in your method that should be solving the equation, instead of solving anything, just dump the data to a file (make sure that data from different threads aren't intermixed, or use different files per thread). Then you can use any existing tool or software package to play with the data, try different methods, even prototype in python, if you'd like.
One of the issues to keep in mind: the system won't be completely defined. Basically never. MAX_HA = 64, but practically you can expect the data for one or two engines only, so no matter how many equations you collect, ~120 variables will always have zero coefficients.
By the way, just a thought: in iterative methods you can use your current values of optimizer constants as initial values for the first iteration.
Regards, Sergei
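The "dump instead of solve" suggestion could look something like the sketch below. Everything in it is an assumption rather than part of the actual patch: the file naming scheme, the `name=value` line format, and the helper names `dump_file_name`, `format_equation`, and `dump_equation`.

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::pair<std::string, double> > Coefficients;

// One file per thread, so rows from different threads never interleave.
std::string dump_file_name(unsigned long thread_id) {
  std::ostringstream name;
  name << "cost_equations." << thread_id << ".txt";
  return name.str();
}

// One equation per line: "<query_time> <name>=<coefficient> ...".
// Only non-zero coefficients are written, since the matrix is sparse.
std::string format_equation(const Coefficients &coefficients,
                            double query_time) {
  std::ostringstream line;
  line << query_time;
  for (Coefficients::const_iterator c = coefficients.begin();
       c != coefficients.end(); ++c)
    if (c->second != 0.0)
      line << ' ' << c->first << '=' << c->second;
  return line.str();
}

// Append the equation for one finished query to this thread's file.
bool dump_equation(unsigned long thread_id, const Coefficients &coefficients,
                   double query_time) {
  std::ofstream out(dump_file_name(thread_id).c_str(), std::ios::app);
  if (!out)
    return false;
  out << format_equation(coefficients, query_time) << '\n';
  return static_cast<bool>(out);
}
```

A plain text format like this is easy to load into an external tool or a python prototype, and skipping zero coefficients keeps each line short when only one or two engines contribute data.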
Hi all, You can find this week's blog entry at http://igniting.in/gsoc2014/2014/06/09/more-coding/. I'm now maintaining the code only on github: https://github.com/igniting/server/tree/selfTuningOptimizer. Regards Anshu Avinash On Sun, May 25, 2014 at 3:27 PM, Anshu Avinash <anshu.avinash35@gmail.com> wrote:
Hi all,
You can find this week's blog entry at http://igniting.in/gsoc2014/2014/05/25/coding-things-up/ . I have created a branch on launchpad for my work: http://bazaar.launchpad.net/~igniting/maria/maria/revision/4211 . You can give your suggestions/reviews either on this thread or as a comment on the blog itself.
Regards Anshu Avinash
On Tue, May 20, 2014 at 1:22 AM, Roberto Spadim <roberto@spadim.com.br> wrote:
Wow, big work, congratulations! I will read it part by part to better understand the MariaDB code.
2014-05-19 16:33 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi all,
This week's blog entry will be delayed by a couple of days. I have started coding, though, and would like to give a heads-up on what I'm doing.
I've looked at the diffs for the "Cost model project" of mysql: http://bazaar.launchpad.net/~mysql/mysql-server/5.7/revision/7596 and http://bazaar.launchpad.net/~mysql/mysql-server/5.7/revision/7222 . These give a pretty good idea of what the hard-coded constants are and where they are being used.
The idea is to multiply the values returned by read_time() and scan_time() in handler.h by "READ_TIME_FACTOR" and "SCAN_TIME_FACTOR" before returning them. These factors would be read from a table in the mysql db. For that I've looked at sql_statistics.cc. After completing this, I'll first change the values of these constants manually and check whether better or worse query plans are being selected, to verify that everything is working as expected, and automate that step later.
Regards Anshu
On Mon, May 12, 2014 at 11:22 AM, Anshu Avinash < anshu.avinash35@gmail.com> wrote:
Hi all,
You can find my blog entry for this week at http://igniting.in/gsoc2014/2014/05/11/first-steps/ .
Regards Anshu Avinash
On Thu, May 8, 2014 at 11:46 PM, Anshu Avinash < anshu.avinash35@gmail.com> wrote:
Hi all,
Sorry for the irregular updates. I have been busy for the last couple of days and might still be busy for 1-2 more days. I will be completely free starting next week, and will update my blog weekly, every Monday (so the 1st update will be on May 12). I will also send the link to my post to the mailing list every week.
As discussed on irc, I started to explore the pair of constants: handler::scan_time() and handler::read_time(). I also started looking into sql_statistics.cc for writing the optimizer constants into a persistent db.
Regards Anshu Avinash
On Thu, May 8, 2014 at 11:08 PM, Colin Charles <colin@mariadb.org> wrote:
Hi!
On 8 May 2014, at 22:33, Sergei Golubchik <serg@mariadb.org> wrote:
> Hi, Anshu!
>
> How are you doing? Any progress so far?
>
> On Apr 30, Anshu Avinash wrote:
>>> And, by the way, when you start coding (May 19) or earlier, as you
>>> prefer, I would like to start seeing some kind of weekly updates from
>>> you. In email or in your blog - whatever you feel more comfortable with.
>>
>> Blog updates should be fine.
>
> That's fine. Whatever you prefer.
> One blog post every week then, preferably on Monday.
For the benefit of others, Anshu, please also post your weekly reports to maria-developers@lists.launchpad.net - I think it will be really good for those who don't drop by your blog, and you'll likely get other feedback too.
This goes to all those participating in GSoC.
Also for those with a blog + RSS feed, you should aim to get it on http://planetmariadb.org/ and http://planet.mysql.com/
cheers, -colin
-- Colin Charles, Chief Evangelist, SkySQL - The MariaDB Company blog: http://bytebot.net/blog/| t: +6-012-204-3201 | Skype: colincharles
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
Well, I was reading your posts. Do you need big data to test read and scan times?
Hi all, You can find this week's blog entry at: http://igniting.in/2014/06/23/work-before-mid-term/ Suggestions/reviews are welcome. Regards Anshu Avinash
"Sorry this page does not exist =("
Hi, Sorry for the confusion, this is the new link: http://igniting.in/gsoc2014/2014/06/23/work-before-mid-term/ Thanks for pointing that out. Regards Anshu
"MDEV." It would be nice to put the full name (MDEV-350), since Google and other search engines help when someone tries to find information about MDEV-350. The text is OK :)
Hi all, This week's blog post is at: http://igniting.in/gsoc2014/2014/07/08/solving-linear-equations/ . Sorry for the delay. Suggestions for an approach to solve the system of linear equations are welcome. Regards Anshu Avinash On Mon, Jun 23, 2014 at 7:39 PM, Roberto Spadim <roberto@spadim.com.br> wrote:
" MDEV. " it's nice to put full name (MDEV-350), since google and others search engines help when someone try to find information about mdev 350
text is ok :)
Hi,
Sorry for the confusion, this is the new link: http://igniting.in/gsoc2014/2014/06/23/work-before-mid-term/ Thanks for pointing it out.
Regards Anshu
On Mon, Jun 23, 2014 at 7:32 PM, Roberto Spadim <roberto@spadim.com.br> wrote:
"Sorry this page does not exist =("
2014-06-23 8:07 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi all,
You can find this week's blog entry at: http://igniting.in/2014/06/23/work-before-mid-term/ Suggestions/reviews are welcome.
Regards Anshu Avinash
On Mon, Jun 9, 2014 at 7:30 PM, Roberto Spadim <roberto@spadim.com.br> wrote:
Well, I was reading your posts. Do you need big data to test read and scan times?
On Monday, June 9, 2014, Anshu Avinash <anshu.avinash35@gmail.com> wrote:
Hi all,
You can find this week's blog entry at http://igniting.in/gsoc2014/2014/06/09/more-coding/. I'm now maintaining the code only on github: https://github.com/igniting/server/tree/selfTuningOptimizer.
Regards Anshu Avinash
On Sun, May 25, 2014 at 3:27 PM, Anshu Avinash <anshu.avinash35@gmail.com> wrote:
Hi all,
You can find this week's blog entry at http://igniting.in/gsoc2014/2014/05/25/coding-things-up/ . I have created a branch on launchpad for my work: http://bazaar.launchpad.net/~igniting/maria/maria/revision/4211 . You can give your suggestions/reviews either on this thread or as a comment on the blog itself.
Regards Anshu Avinash
On Tue, May 20, 2014 at 1:22 AM, Roberto Spadim <roberto@spadim.com.br> wrote:
Wow, big work, congratulations! I will read it part by part to better understand the MariaDB code.
2014-05-19 16:33 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi all,
This week's blog entry will be delayed by a couple of days. I have started coding, though, and would like to give a heads-up on what I'm doing.
Could you 'display' the dataset you used with Octave?
2014-07-08 13:55 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi all,
This week's blog post is at: http://igniting.in/gsoc2014/2014/07/08/solving-linear-equations/ . Sorry for the delay. Suggestions for an approach to solve the system of linear equations are welcome.
Regards Anshu Avinash
Hi all,
You can download it here (https://drive.google.com/file/d/0B7NiQb4EbbUVNVJFZ2xkRVR3Ylk/edit?usp=sharin...). It is around 26M. I have added the link on the blog too.
Regards Anshu
On Tue, Jul 8, 2014 at 10:38 PM, Roberto Spadim <roberto@spadim.com.br> wrote:
could you 'display' the dataset you used with octave?
=] nice
2014-07-08 14:18 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi all,
You can download it here (https://drive.google.com/file/d/0B7NiQb4EbbUVNVJFZ2xkRVR3Ylk/edit?usp=sharin...). It is around 26M. I have added the link on blog too.
Regards Anshu
just to understand...
---
the solve_equation part, today only used to save information:
    std::ofstream datafile;
    char file_name[100];
    my_snprintf(file_name, 100, "/tmp/mariadb_cost_coefficients_%lu.txt", thread_id);
    datafile.open(file_name, std::ios::app);
    for(int i=0; i < MAX_CONSTANTS; i++)
      datafile << coefficients[i].value << " ";
    datafile << total_time << "\n";
    datafile.close();
----
The idea is: given a query and some coefficients[i].value, you get the total_time needed to execute the query. Do you want to "train" something to tell you how much time the same query should take? Or do you want to find the "x[i]" variables of your system (hardware, hard disk, etc.) and extend this to other queries?
2014-07-08 14:20 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
=] nice
Hi,
The idea is that we know the total time the query took and how many times each operation was performed. For example, consider 'read_time': we know how many times an index read took place, but not how long a single index read takes. By solving these equations, we are trying to find the time for individual operations. coefficients[i].value is `how many times operation i took place in a single query`.
Hope this clears things up.
Regards Anshu Avinash
On Tue, Jul 8, 2014 at 10:57 PM, Roberto Spadim <roberto@spadim.com.br> wrote:
just to understand...
---
the solve_equation part, today only used to save information:
    std::ofstream datafile;
    char file_name[100];
    my_snprintf(file_name, 100, "/tmp/mariadb_cost_coefficients_%lu.txt", thread_id);
    datafile.open(file_name, std::ios::app);
    for(int i=0; i < MAX_CONSTANTS; i++)
      datafile << coefficients[i].value << " ";
    datafile << total_time << "\n";
    datafile.close();
----
The idea is: given a query and some coefficients[i].value, you get the total_time needed to execute the query. Do you want to "train" something to tell you how much time the same query should take? Or do you want to find the "x[i]" variables of your system (hardware, hard disk, etc.) and extend this to other queries?
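As a rough illustration of the file format produced by the quoted snippet (one line per query: the operation counts separated by spaces, followed by the total time), the following sketch parses a single line back into a record. Observation and parse_observation are hypothetical names introduced here, not part of the branch under discussion.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One line of the dump: how often each operation ran, plus the measured time.
struct Observation {
  std::vector<double> counts;  // one entry per cost constant
  double total_time = 0.0;     // measured time for the whole query
};

// Parse a line of the form "c0 c1 ... cN total_time".
// Returns false if the line holds fewer than two numbers.
bool parse_observation(const std::string &line, Observation &out) {
  std::istringstream in(line);
  std::vector<double> values;
  double value;
  while (in >> value) values.push_back(value);
  if (values.size() < 2) return false;
  out.total_time = values.back();  // last number is the total time
  values.pop_back();
  out.counts = values;             // everything before it is a count
  return true;
}
```

Each parsed record corresponds to one equation: the counts are the known multipliers and total_time is the right-hand side.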
going back...
a_1 t_1 + a_2 t_2 + … + a_130 t_130 = t_total
a_1, t_1... Is a_1 something you don't know? Is t_1 the coefficients[i]? It's a first-order equation, right?
2014-07-08 15:20 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi,
The idea is that we know the total time the query took and how many times each operation was performed. For example, consider 'read_time': we know how many times an index read took place, but not how long a single index read takes. By solving these equations, we are trying to find the time for individual operations. coefficients[i].value is `how many times operation i took place in a single query`.
Hope this clears things up.
Regards Anshu Avinash
On Tue, Jul 8, 2014 at 10:57 PM, Roberto Spadim <roberto@spadim.com.br> wrote:
Just to understand...
---
the solve_equation part, today only used to save information:
  std::ofstream datafile;
  char file_name[100];
  my_snprintf(file_name, 100, "/tmp/mariadb_cost_coefficients_%lu.txt", thread_id);
  datafile.open(file_name, std::ios::app);
  for(int i=0; i < MAX_CONSTANTS; i++)
    datafile << coefficients[i].value << " ";
  datafile << total_time << "\n";
  datafile.close();
----
So the idea is: for a given query you record coefficients[i].value and the total_time needed to execute it. Do you want to "train" something that tells you how much time the same query should take? Or do you want to find the "x[i]" variables of your system (hardware, hard disk, etc.) and extend this to other queries?
2014-07-08 14:20 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
=] nice
2014-07-08 14:18 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi all,
You can download it here
(https://drive.google.com/file/d/0B7NiQb4EbbUVNVJFZ2xkRVR3Ylk/edit?usp=sharin...). It is around 26M. I have also added the link to the blog.
Regards Anshu
On Tue, Jul 8, 2014 at 10:38 PM, Roberto Spadim <roberto@spadim.com.br> wrote:
could you 'display' the dataset you used with octave?
2014-07-08 13:55 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi all,
This week's blog post is at: http://igniting.in/gsoc2014/2014/07/08/solving-linear-equations/ . Sorry for the delay. Suggestions for an approach to solve the system of linear equations are welcome.
Regards Anshu Avinash
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
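A note on the data file discussed above: the quoted logging snippet appends one line per query, with the MAX_CONSTANTS operation counts followed by total_time. Reading such a line back could look like the sketch below. This is only an illustration, not code from the project: the names Sample and parse_sample are hypothetical, and it assumes every value on the line is written as a plain number.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One logged query: how often each operation ran, plus the measured total time.
// (Hypothetical helper type, not part of the MariaDB sources.)
struct Sample {
  std::vector<double> counts;  // corresponds to coefficients[i].value
  double total_time;           // last number on the line
};

// Parses a line of the form "c0 c1 ... c(n-1) total_time".
// Returns false if the line holds fewer than two numbers or contains
// anything that is not a number.
bool parse_sample(const std::string &line, Sample &out) {
  std::istringstream in(line);
  std::vector<double> values;
  double v;
  while (in >> v) values.push_back(v);
  // The loop must have stopped because the input ran out, not on a bad token.
  if (!in.eof() || values.size() < 2) return false;
  out.total_time = values.back();
  values.pop_back();
  out.counts = values;
  return true;
}
```

Each parsed Sample gives one equation: the counts are the known a_i and total_time is the right-hand side.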
Oops, I meant a linear equation.
2014-07-08 15:47 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
Going back to the equation: a1*t1 + a2*t2 + ... + a130*t130 = t_total
About a1 and t1: is a1 the value you don't know, and is t1 the coefficients[i]?
It's a first-order equation, right?
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
Hi,
a1, a2, ..., a130 are the coefficients; t1, t2, ..., t130 are the unknowns. We would need 130 linearly independent equations to solve for these variables exactly. We can never get 130 linearly independent equations, because some of the coefficients are 0 every time. Hence, we get an approximate solution by forming an overdetermined system (http://en.wikipedia.org/wiki/Overdetermined_system).
Let me know if you have any further doubts.
Regards Anshu Avinash
On Wed, Jul 9, 2014 at 12:18 AM, Roberto Spadim <roberto@spadim.com.br> wrote:
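Oops, I meant a linear equation.
2014-07-08 15:47 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
Going back to the equation: a1*t1 + a2*t2 + ... + a130*t130 = t_total
About a1 and t1: is a1 the value you don't know, and is t1 the coefficients[i]?
It's a first-order equation, right?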
So a1, a2, ... are what your solve_equation function is saving, and you want to find t1, t2, t3, ..., t130 to understand how much time each 'read function' takes. Is that it?
Once this is done, what's the next step? Is this the starting point for deciding which is better, an index read or a table scan?
2014-07-08 15:54 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi,
a1, a2, ..., a130 are the coefficients; t1, t2, ..., t130 are the unknowns. We would need 130 linearly independent equations to solve for these variables exactly. We can never get 130 linearly independent equations, because some of the coefficients are 0 every time. Hence, we get an approximate solution by forming an overdetermined system (http://en.wikipedia.org/wiki/Overdetermined_system). Let me know if you have any further doubts.
Regards Anshu Avinash
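To make the overdetermined-system idea concrete, here is a minimal sketch of a least-squares fit. It is not the approach used in the project (the thread only asks for suggestions); it is one textbook option, the normal equations (A^T A) t = A^T b solved with Gauss-Jordan elimination. The names Matrix and least_squares and the tolerance eps are assumptions made for illustration. An operation whose count is 0 in every query cannot be estimated, so the sketch simply reports 0 for it.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Least-squares estimate of t in A * t ~= b via the normal equations.
// A has one row per logged query and one column per operation;
// b holds the measured total time of each query.
std::vector<double> least_squares(const Matrix &A, const std::vector<double> &b) {
  const std::size_t m = A.size();
  const std::size_t n = m ? A[0].size() : 0;

  // Build the augmented matrix [A^T A | A^T b].
  Matrix M(n, std::vector<double>(n + 1, 0.0));
  for (std::size_t k = 0; k < m; ++k)
    for (std::size_t i = 0; i < n; ++i) {
      for (std::size_t j = 0; j < n; ++j) M[i][j] += A[k][i] * A[k][j];
      M[i][n] += A[k][i] * b[k];
    }

  const double eps = 1e-12;  // assumed tolerance; depends on the data scale
  std::vector<double> t(n, 0.0);
  std::vector<std::size_t> pivot_col;
  std::size_t row = 0;
  for (std::size_t col = 0; col < n && row < n; ++col) {
    // Partial pivoting: pick the largest entry in this column.
    std::size_t best = row;
    for (std::size_t i = row + 1; i < n; ++i)
      if (std::fabs(M[i][col]) > std::fabs(M[best][col])) best = i;
    // No usable pivot: this unknown cannot be determined, leave it at 0.
    if (std::fabs(M[best][col]) < eps) continue;
    std::swap(M[row], M[best]);
    const double p = M[row][col];
    for (std::size_t j = 0; j <= n; ++j) M[row][j] /= p;
    // Clear this column in every other row.
    for (std::size_t i = 0; i < n; ++i) {
      if (i == row) continue;
      const double f = M[i][col];
      if (f == 0.0) continue;
      for (std::size_t j = 0; j <= n; ++j) M[i][j] -= f * M[row][j];
    }
    pivot_col.push_back(col);
    ++row;
  }
  for (std::size_t i = 0; i < pivot_col.size(); ++i) t[pivot_col[i]] = M[i][n];
  return t;
}
```

For example, with four queries over three operations where the second operation never runs, A = {{1,0,2},{2,0,1},{1,0,1},{3,0,0}} and b = {1.0, 1.25, 0.75, 1.5}, the fit recovers t1 = 0.5 and t3 = 0.25 and leaves t2 at 0. On real, noisy data the normal equations can be badly conditioned, so a QR- or SVD-based solver (such as the one behind Octave's backslash operator) is usually the safer choice.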
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
Check the LAPACK library. I used it some years ago and it solves linear equations, so at least you don't have to spend time on 'how to solve linear equations' (unless you want to study that :) ). LAPACK was a nice library; I used it without problems.
2014-07-08 16:04 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
Are a1, a2 what your solve equation function is saving? And you want to know t1, t2, t3, ..., t130 to understand how much time each 'read function' takes, is that it?
After doing this, what's the next step? Is this a starting point for selecting which is better, an index or a table scan?
2014-07-08 15:54 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi,
a1, a2, ..., a130 are coefficients; t1, t2, ..., t130 are the unknowns. We would need 130 linearly independent equations to solve for these variables exactly. We can never get 130 linearly independent equations, as some of the coefficients would be 0 every time. Hence, we get an approximate solution by forming an overdetermined system, i.e. one with more equations than unknowns (http://en.wikipedia.org/wiki/Overdetermined_system). Let me know if you have any further doubts.
Regards Anshu Avinash
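A minimal sketch of how such an approximate solution could be computed, assuming an ordinary least-squares fit through the normal equations (the thread does not say which method was actually used). Each row of A holds the operation counts a1..aN for one query and b holds the measured total times. The name `least_squares` and the tolerance `eps` are illustrative assumptions, not code from the project; a real implementation would more likely call a library such as LAPACK.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Row = std::vector<double>;

// Least-squares fit of A * t = b via the normal equations (A^T A) t = A^T b.
// Hypothetical helper for illustration only.
std::vector<double> least_squares(const std::vector<Row>& A,
                                  const std::vector<double>& b) {
  assert(A.size() == b.size());
  const std::size_t n = A.empty() ? 0 : A[0].size();
  const double eps = 1e-12;  // assumed tolerance for "this pivot is zero"

  // Build the augmented matrix [A^T A | A^T b].
  std::vector<Row> M(n, Row(n + 1, 0.0));
  for (std::size_t r = 0; r < A.size(); ++r) {
    assert(A[r].size() == n);
    for (std::size_t i = 0; i < n; ++i) {
      for (std::size_t j = 0; j < n; ++j) M[i][j] += A[r][i] * A[r][j];
      M[i][n] += A[r][i] * b[r];
    }
  }

  // Gauss-Jordan elimination on the diagonal pivots. A zero pivot means the
  // data carries no information about that unknown, so the column is skipped.
  for (std::size_t col = 0; col < n; ++col) {
    if (std::fabs(M[col][col]) < eps) continue;
    for (std::size_t r = 0; r < n; ++r) {
      if (r == col) continue;
      const double f = M[r][col] / M[col][col];
      for (std::size_t c = 0; c <= n; ++c) M[r][c] -= f * M[col][c];
    }
  }

  // Read off the solution; skipped unknowns stay at 0.
  std::vector<double> t(n, 0.0);
  for (std::size_t i = 0; i < n; ++i)
    if (std::fabs(M[i][i]) >= eps) t[i] = M[i][n] / M[i][i];
  return t;
}
```

In this sketch an operation that never occurs in the collected queries (a column of zeros) comes back as 0, since nothing can be learned about it from the data.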
On Wed, Jul 9, 2014 at 12:18 AM, Roberto Spadim <roberto@spadim.com.br> wrote:
Oops, linear equation.
2014-07-08 15:47 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
Going back... a1*t1 + a2*t2 + … + a130*t130 = t_total
a1, t1...
Is a1 something you don't know, and is t1 the coefficients[i]?
It's a first-order equation, right?
2014-07-08 15:20 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi,
The idea is that we know the total time the query took and how many times each operation was performed. For example, consider the case of 'read_time'. We know how many times an index read took place, but we don't know how much time a single index read takes. By solving these equations, we are trying to find the time taken by individual operations. coefficients[i].value is how many times operation i took place in a single query.
Hope this clears things up.
Regards Anshu Avinash
On Tue, Jul 8, 2014 at 10:57 PM, Roberto Spadim <roberto@spadim.com.br> wrote:
just to understand...
--- the solve_equation part, today only used to save information:
    std::ofstream datafile;
    char file_name[100];
    my_snprintf(file_name, 100, "/tmp/mariadb_cost_coefficients_%lu.txt", thread_id);
    datafile.open(file_name, std::ios::app);
    for(int i=0; i < MAX_CONSTANTS; i++)
      datafile << coefficients[i].value << " ";
    datafile << total_time << "\n";
    datafile.close();
----
The idea is: given a query and some coefficients[i].value, you get the total_time needed to execute the query. Do you want to "train" something to tell you how much time the same query should take to execute? Or do you want to find the "x[i]" variables of your system (hardware, hard disk, etc.) and extend this to other queries?
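To make the data format concrete, here is a small sketch of reading back one line of the file written by the snippet above, assuming each line is the space-separated counts followed by the total time, which is what the loop appears to produce. `Observation`, `parse_observation` and `predicted_time` are hypothetical names introduced for illustration; they are not part of the project's code.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One logged query: operation counts plus the measured total time.
struct Observation {
  std::vector<double> counts;  // coefficients[i].value for each operation i
  double total_time = 0.0;     // last number on the line
};

// Parse a line of the form "c0 c1 ... cN-1 total_time".
// Returns false (leaving `out` untouched) if the line is empty or malformed.
bool parse_observation(const std::string& line, Observation& out) {
  std::istringstream in(line);
  std::vector<double> values;
  double v = 0.0;
  while (in >> v) values.push_back(v);
  // The loop must have stopped at end of input, not at a bad token, and we
  // need at least one count plus the total time.
  if (!in.eof() || values.size() < 2) return false;
  out.total_time = values.back();
  values.pop_back();
  out.counts = std::move(values);
  return true;
}

// Model from the thread: total time = sum over i of count_i * time_i.
double predicted_time(const Observation& obs,
                      const std::vector<double>& per_op_time) {
  assert(obs.counts.size() == per_op_time.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < obs.counts.size(); ++i)
    sum += obs.counts[i] * per_op_time[i];
  return sum;
}
```

Each parsed observation gives one equation of the system: the counts are the known coefficients and the per-operation times are the unknowns being fitted.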
2014-07-08 14:20 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
=] nice
2014-07-08 14:18 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi all,
You can download it here (https://drive.google.com/file/d/0B7NiQb4EbbUVNVJFZ2xkRVR3Ylk/edit?usp=sharin...). It is around 26M. I have added the link on blog too.
Regards Anshu
On Tue, Jul 8, 2014 at 10:38 PM, Roberto Spadim <roberto@spadim.com.br> wrote:
could you 'display' the dataset you used with octave?
2014-07-08 13:55 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi all,
This week's blog post is at: http://igniting.in/gsoc2014/2014/07/08/solving-linear-equations/ . Sorry for the delay. Suggestions for an approach to solve the system of linear equations are welcome.
Regards Anshu Avinash
On Mon, Jun 23, 2014 at 7:39 PM, Roberto Spadim <roberto@spadim.com.br> wrote:
" MDEV. " it's nice to put the full name (MDEV-350), since google and other search engines help when someone tries to find information about mdev 350
text is ok :)
2014-06-23 11:04 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi,
Sorry for the confusion, this is the new link: http://igniting.in/gsoc2014/2014/06/23/work-before-mid-term/ Thanks for pointing out.
Regards Anshu
On Mon, Jun 23, 2014 at 7:32 PM, Roberto Spadim <roberto@spadim.com.br> wrote:
"Sorry this page does not exist =("
2014-06-23 8:07 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi all,
You can find this week's blog entry at: http://igniting.in/2014/06/23/work-before-mid-term/ Suggestions/reviews are welcome.
Regards Anshu Avinash
On Mon, Jun 9, 2014 at 7:30 PM, Roberto Spadim <roberto@spadim.com.br> wrote:
Well, I was reading your posts. Do you need big data to test read and scan times?
On Monday, June 9, 2014, Anshu Avinash <anshu.avinash35@gmail.com> wrote:
Hi all,
You can find this week's blog entry at http://igniting.in/gsoc2014/2014/06/09/more-coding/. I'm now maintaining the code only on github: https://github.com/igniting/server/tree/selfTuningOptimizer.
Regards Anshu Avinash
On Sun, May 25, 2014 at 3:27 PM, Anshu Avinash <anshu.avinash35@gmail.com> wrote:
Hi all,
You can find my this week's blog entry at http://igniting.in/gsoc2014/2014/05/25/coding-things-up/ . I have created a branch on launchpad for my work: http://bazaar.launchpad.net/~igniting/maria/maria/revision/4211 . You can give your suggestions/reviews either on this thread or as a comment on the blog itself.
Regards Anshu Avinash
On Tue, May 20, 2014 at 1:22 AM, Roberto Spadim <roberto@spadim.com.br> wrote:
wow, a big work, congratulations, i will read part by part to better understand mariadb code
2014-05-19 16:33 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi all,
This week's blog entry would get delayed by couple of days. I have started coding though and would like to give heads up on what I'm doing.
I've looked at the diffs for "Cost model project" of mysql: http://bazaar.launchpad.net/~mysql/mysql-server/5.7/revision/7596 and http://bazaar.launchpad.net/~mysql/mysql-server/5.7/revision/7222 . These give a pretty good idea about what are the hard-coded constants and where are they being used.
The idea is to multiply "READ_TIME_FACTOR" and "SCAN_TIME_FACTOR" to the values returned by read_time() and scan_time() in handler.h, while returning. These values would be read from a table in mysql db. For that I've looked at sql_statistics.cc. After completing this, I'll first change the values of these constants manually and check if the better or worse query plans are being selected. I'll first do the last step manually, to check if everything is working as expected and later automate it.
Regards Anshu
On Mon, May 12, 2014 at 11:22 AM, Anshu Avinash <anshu.avinash35@gmail.com> wrote:
Hi all,
You can find my blog entry for this week at http://igniting.in/gsoc2014/2014/05/11/first-steps/ .
Regards Anshu Avinash
On Thu, May 8, 2014 at 11:46 PM, Anshu Avinash <anshu.avinash35@gmail.com> wrote:
Hi all,
Sorry for the irregular updates. I had been busy for last couple of days and might still be busy for 1-2 days more. I would be completely free starting next week, and would be updating my blog weekly on every Monday (so 1st update would be on May 12). I would also send the link of my post weekly on the mailing list.
As discussed on irc, I started to explore the pair of constants: handler::scan_time() and handler::read_time().
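The scaling idea in the quoted 2014-05-19 message (multiply the values returned by read_time() and scan_time() by tunable factors) can be sketched as follows. This is a toy model under stated assumptions, not the real MariaDB handler class: `ToyHandler`, `CostFactors`, the parameter lists and the base cost formulas are all made up for illustration, and in the message the factors would be loaded from a table in the mysql database rather than passed in directly.

```cpp
#include <cassert>

// Hypothetical container for the two tunable constants named in the message.
struct CostFactors {
  double read_time_factor = 1.0;  // stands in for READ_TIME_FACTOR
  double scan_time_factor = 1.0;  // stands in for SCAN_TIME_FACTOR
};

// Toy stand-in for a storage-engine cost interface.
class ToyHandler {
 public:
  explicit ToyHandler(const CostFactors& factors) : factors_(factors) {
    assert(factors_.read_time_factor > 0.0);
    assert(factors_.scan_time_factor > 0.0);
  }

  // Estimated cost of a full table scan, scaled by the tunable factor.
  double scan_time(double data_blocks) const {
    return factors_.scan_time_factor * base_scan_time(data_blocks);
  }

  // Estimated cost of index reads, scaled by the tunable factor.
  double read_time(double ranges, double rows) const {
    return factors_.read_time_factor * base_read_time(ranges, rows);
  }

 private:
  // Placeholder base estimates; the real formulas live in the server code.
  static double base_scan_time(double data_blocks) { return data_blocks + 2.0; }
  static double base_read_time(double ranges, double rows) { return ranges + rows; }

  CostFactors factors_;
};
```

Changing the factors by hand and comparing the two estimates for the same table is the manual experiment the message describes: whichever estimate is lower decides whether an index read or a table scan looks cheaper to the optimizer.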
http://www.netlib.org/lapack/lapacke.html is LAPACKE, a C API to LAPACK.
Please include a link to the output (solutions.txt) too, so we can check what happened.
>> >>> >> >> >> >> >>> >> >> >> >> >>> >> >> >> >> >>> >> >> >> -- >> >>> >> >> >> Roberto Spadim >> >>> >> >> >> SPAEmpresarial >> >>> >> >> >> Eng. Automação e Controle >> >>> >> >> >> >> >>> >> >> > >> >>> >> >> >> >>> >> >> >> >>> >> >> >> >>> >> >> -- >> >>> >> >> Roberto Spadim >> >>> >> >> SPAEmpresarial >> >>> >> >> Eng. Automação e Controle >> >>> >> > >> >>> >> > >> >>> >> >> >>> >> >> >>> >> >> >>> >> -- >> >>> >> Roberto Spadim >> >>> >> SPAEmpresarial >> >>> >> Eng. Automação e Controle >> >>> > >> >>> > >> >>> >> >>> >> >>> >> >>> -- >> >>> Roberto Spadim >> >>> SPAEmpresarial >> >>> Eng. Automação e Controle >> >> >> >> >> > >> > >> > >> > -- >> > Roberto Spadim >> > SPAEmpresarial >> > Eng. Automação e Controle >> >> >> >> -- >> Roberto Spadim >> SPAEmpresarial >> Eng. Automação e Controle > >
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
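The plan quoted above from May 19 (scaling the values returned by read_time() and scan_time() by READ_TIME_FACTOR and SCAN_TIME_FACTOR) can be sketched roughly as follows. This is only an illustration under stated assumptions: `CostFactors`, `scaled_read_time`, `scaled_scan_time` and `prefer_index_read` are hypothetical names, the base costs are passed in as plain doubles rather than taken from the real handler class, and loading the factors from a table is left out.

```cpp
#include <cassert>

// Hypothetical holder for the two tunable factors. In the thread these
// would be read from a table in the mysql database; here they are plain fields.
struct CostFactors {
  double read_time_factor = 1.0;  // stands in for READ_TIME_FACTOR
  double scan_time_factor = 1.0;  // stands in for SCAN_TIME_FACTOR
};

// base_read_time stands in for the value handler::read_time() would return.
inline double scaled_read_time(double base_read_time, const CostFactors& f) {
  assert(f.read_time_factor > 0.0);
  return base_read_time * f.read_time_factor;
}

// base_scan_time stands in for the value handler::scan_time() would return.
inline double scaled_scan_time(double base_scan_time, const CostFactors& f) {
  assert(f.scan_time_factor > 0.0);
  return base_scan_time * f.scan_time_factor;
}

// True when the scaled index-read cost is lower than the scaled full-scan cost.
inline bool prefer_index_read(double base_read_time, double base_scan_time,
                              const CostFactors& f) {
  return scaled_read_time(base_read_time, f) <
         scaled_scan_time(base_scan_time, f);
}
```

Changing a factor by hand and watching whether `prefer_index_read` flips mirrors the manual check described in the quoted message.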
Hi all, Here is a blog post describing my progress so far: http://igniting.in/gsoc2014/2014/08/04/progress-so-far/ Comments and suggestions are welcome. Regards Anshu Avinash On Wed, Jul 9, 2014 at 1:14 AM, Roberto Spadim <roberto@spadim.com.br> wrote:
please include the output (solutions.txt) link too, so we can check what happened
2014-07-08 16:24 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
http://www.netlib.org/lapack/lapacke.html is a C API to LAPACK
2014-07-08 16:23 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
check the LAPACK lib; I used it some years ago and it solves linear equations, so at least you don't waste time on 'how to solve linear equations' (unless you want to study it :) ). LAPACK was a nice lib; I used it without problems
2014-07-08 16:04 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
a1, a2 are what your solve-equation function is saving? and you want to know t1, t2, t3..t130, to understand how much time each 'read function' takes, is that it?
doing this, what's the next step? is this a starting point to select what's better, index vs table scan?
2014-07-08 15:54 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi,
a1, a2, ..., a130 are coefficients. t1, t2, ..., t130 are unknowns. We need 130 linearly independent equations to solve for these variables. We can never get 130 linearly independent equations, as some of the coefficients would be 0 every time. Hence, we get an approximate solution by forming an overdetermined system (http://en.wikipedia.org/wiki/Overdetermined_system). Let me know if you have any further doubts.
Regards Anshu Avinash
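A minimal sketch of the least-squares idea described above, assuming each logged query contributes one equation a1t1 + ... + antn = ttotal. The names `Sample` and `estimate_op_times` and the small ridge term are assumptions made for illustration and do not come from the MariaDB code; a library such as LAPACK would normally replace the hand-written elimination.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// One observation: how often each operation ran in a query (a_1..a_n)
// and how long the whole query took (t_total).
struct Sample {
  std::vector<double> counts;
  double total_time;
};

// Least-squares estimate of the per-operation times t_1..t_n.
// Builds the normal equations (A^T A + ridge*I) t = A^T b and solves them by
// Gaussian elimination with partial pivoting. The ridge term is an assumption:
// it keeps the system solvable when an operation never occurs (all-zero column).
std::vector<double> estimate_op_times(const std::vector<Sample>& samples,
                                      std::size_t n, double ridge = 1e-9) {
  assert(n > 0);
  // Augmented matrix [A^T A | A^T b].
  std::vector<std::vector<double>> m(n, std::vector<double>(n + 1, 0.0));
  for (const Sample& s : samples) {
    assert(s.counts.size() == n);
    for (std::size_t i = 0; i < n; ++i) {
      for (std::size_t j = 0; j < n; ++j) m[i][j] += s.counts[i] * s.counts[j];
      m[i][n] += s.counts[i] * s.total_time;
    }
  }
  for (std::size_t i = 0; i < n; ++i) m[i][i] += ridge;

  // Forward elimination with partial pivoting.
  for (std::size_t col = 0; col < n; ++col) {
    std::size_t pivot = col;
    for (std::size_t r = col + 1; r < n; ++r)
      if (std::fabs(m[r][col]) > std::fabs(m[pivot][col])) pivot = r;
    std::swap(m[col], m[pivot]);
    for (std::size_t r = col + 1; r < n; ++r) {
      const double factor = m[r][col] / m[col][col];
      for (std::size_t c = col; c <= n; ++c) m[r][c] -= factor * m[col][c];
    }
  }

  // Back substitution.
  std::vector<double> t(n, 0.0);
  for (std::size_t i = n; i-- > 0;) {
    double sum = m[i][n];
    for (std::size_t c = i + 1; c < n; ++c) sum -= m[i][c] * t[c];
    t[i] = sum / m[i][i];
  }
  return t;
}
```

With more samples than unknowns the result is the best fit in the squared-error sense rather than an exact solution, which matches the "approximate solution" wording in the message. Nothing here prevents a negative time estimate.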
On Wed, Jul 9, 2014 at 12:18 AM, Roberto Spadim <roberto@spadim.com.br> wrote:
ops, linear equation
2014-07-08 15:47 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
> going back...
> a1t1 + a2t2 + … + a130t130 = ttotal
> a1, t1...
> a1 is something you don't know
> t1 is the coefficients[i]?
> it's a first order equation, right?
> 2014-07-08 15:20 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
>> Hi,
>> The idea is we know the total time the query took, and how many times an operation was performed. For example, consider the case of 'read_time'. We know how many times an index read took place, but don't know how much time does it take to do an index read. By solving these equations, we are trying to find out time for individual operations. coefficients[i].value is `how many time the operation i took place in a single query.`
>> Hope this clears things up.
>> Regards Anshu Avinash
>> On Tue, Jul 8, 2014 at 10:57 PM, Roberto Spadim <roberto@spadim.com.br> wrote:
>>> just to understand...
>>> --- the solve_equation part, today only used to save information:
>>> std::ofstream datafile;
>>> char file_name[100];
>>> my_snprintf(file_name, 100, "/tmp/mariadb_cost_coefficients_%lu.txt", thread_id);
>>> datafile.open(file_name, std::ios::app);
>>> for(int i=0; i < MAX_CONSTANTS; i++)
>>>   datafile << coefficients[i].value << " ";
>>> datafile << total_time << "\n";
>>> datafile.close();
>>> ----
>>> the idea is: given a query and some coefficients[i].value, you got total_time need to execute the query
>>> you want to "train" something to tell you how many time the same query should execute?
>>> or, what's the "x[i]" variables from your system (hardware/hard disk/etc), and extend this to others queries?
>>> 2014-07-08 14:20 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
>>> > =] nice
>>> > 2014-07-08 14:18 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
>>> >> Hi all,
>>> >> You can download it here ( https://drive.google.com/file/d/0B7NiQb4EbbUVNVJFZ2xkRVR3Ylk/edit?usp=sharin... ). It is around 26M. I have added the link on blog too.
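The quoted snippet appends one row per query: the per-operation counts separated by spaces, followed by the total time. A rough sketch of reading such a row back in, assuming exactly that layout; `CostSample` and `parse_cost_sample` are hypothetical names, and the number of counts `n` (MAX_CONSTANTS in the snippet) is passed in because its value is not given in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One parsed row: operation counts for a query and the query's total time.
struct CostSample {
  std::vector<double> counts;
  double total_time = 0.0;
};

// Parses a row of the form "c_0 c_1 ... c_{n-1} total_time".
// Returns false if the row has the wrong number of fields or contains
// anything that is not a number; `out` is only modified on success.
bool parse_cost_sample(const std::string& line, std::size_t n,
                       CostSample& out) {
  assert(n > 0);
  std::istringstream in(line);
  std::vector<double> values;
  double v = 0.0;
  while (in >> v) values.push_back(v);
  if (!in.eof()) return false;               // stopped on a non-numeric token
  if (values.size() != n + 1) return false;  // n counts plus the total time
  out.total_time = values.back();
  values.pop_back();
  out.counts = values;
  return true;
}
```

Each successfully parsed row is one equation of the system discussed in this thread: the counts are the known coefficients and the total time is the right-hand side.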
I haven't read the code yet, but some min/max problems let you add restrictions on the variables; for example, you could require that the variables are >= 0. I must check, but well, that's a nice step, guy :) nice work :) 2014-08-04 13:31 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi all,
Here is a blog post describing my progress so far: http://igniting.in/gsoc2014/2014/08/04/progress-so-far/
Comments and suggestions are welcome.
Regards Anshu Avinash
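Roberto's suggestion to restrict the variables to >= 0 could be tried, for instance, with projected gradient descent on the squared error. This is a sketch of one possible technique, not something taken from the project's code; `fit_nonnegative`, the fixed iteration count and the step-size choice are all assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Minimizes 0.5 * sum_k (a[k] . t - b[k])^2 subject to t[i] >= 0 using
// projected gradient descent: take a gradient step, then clamp negatives to 0.
// a[k] holds the operation counts of query k, b[k] its total time.
std::vector<double> fit_nonnegative(const std::vector<std::vector<double>>& a,
                                    const std::vector<double>& b,
                                    std::size_t n, int iterations = 5000) {
  assert(a.size() == b.size());
  // The sum of squared entries (trace of A^T A) bounds the largest eigenvalue
  // of A^T A from above, so 1/trace is a safe (if conservative) step size.
  double trace = 0.0;
  for (const auto& row : a) {
    assert(row.size() == n);
    for (double v : row) trace += v * v;
  }
  std::vector<double> t(n, 0.0);
  if (trace == 0.0) return t;  // no information: all counts are zero
  const double step = 1.0 / trace;

  for (int it = 0; it < iterations; ++it) {
    // Gradient of the objective: A^T (A t - b).
    std::vector<double> grad(n, 0.0);
    for (std::size_t k = 0; k < a.size(); ++k) {
      double residual = -b[k];
      for (std::size_t i = 0; i < n; ++i) residual += a[k][i] * t[i];
      for (std::size_t i = 0; i < n; ++i) grad[i] += a[k][i] * residual;
    }
    // Gradient step followed by projection onto t >= 0.
    for (std::size_t i = 0; i < n; ++i)
      t[i] = std::max(0.0, t[i] - step * grad[i]);
  }
  return t;
}
```

When the unconstrained fit would give an operation a negative time, this variant pins that estimate at zero and refits the remaining ones.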
If you have time, you could take a look at linear programming. lpsolve is a well-known library; I used it a lot some years ago. There are other libraries, but this one is well known: http://lpsolve.sourceforge.net/5.5/ . Maybe it gives better results: instead of trying to solve an equation exactly, you compute an approximation (there are many optional parameters and flags for solving a linear problem with this library).
2014-08-04 13:35 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
I haven't read the code yet, but some min/max problems let you add restrictions on the variables; for example, you could require that the variables are >= 0. This needs checking, but that's a nice step. Nice work :)
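To make the ">= 0" suggestion concrete, here is a minimal C++ sketch that fits per-operation times under a non-negativity restriction using projected gradient descent on the least-squares objective. This is a swapped-in technique chosen because it needs no external library; it is not lpsolve and not what the project's code does. The name `estimate_nonnegative` and the default iteration count are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Minimise 0.5 * ||A t - total||^2 subject to t >= 0.
// Each row of A holds the operation counts of one query; total holds the
// measured time of that query. (Hypothetical helper, not project code.)
std::vector<double> estimate_nonnegative(const Matrix& A,
                                         const std::vector<double>& total,
                                         int iterations = 20000) {
  assert(!A.empty() && A.size() == total.size());
  const std::size_t m = A.size(), n = A[0].size();

  // trace(A^T A) bounds the largest eigenvalue of A^T A from above,
  // so 1 / trace is a safe (if conservative) step size.
  double trace = 0.0;
  for (const auto& row : A)
    for (double a : row) trace += a * a;
  assert(trace > 0.0);
  const double step = 1.0 / trace;

  std::vector<double> t(n, 0.0), residual(m, 0.0);
  for (int it = 0; it < iterations; ++it) {
    // residual = A t - total, computed from the current estimate.
    for (std::size_t k = 0; k < m; ++k) {
      residual[k] = -total[k];
      for (std::size_t i = 0; i < n; ++i) residual[k] += A[k][i] * t[i];
    }
    // Gradient step followed by projection onto t >= 0.
    for (std::size_t i = 0; i < n; ++i) {
      double grad = 0.0;
      for (std::size_t k = 0; k < m; ++k) grad += A[k][i] * residual[k];
      t[i] = std::max(0.0, t[i] - step * grad);
    }
  }
  return t;
}
```

With counts {1,0} and {1,1} and totals 2 and 1, an unconstrained solve gives a negative second time (2, -1); the constrained fit instead settles at about (1.5, 0).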
2014-08-04 13:31 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi all,
Here is a blog post describing my progress so far: http://igniting.in/gsoc2014/2014/08/04/progress-so-far/
Comments and suggestions are welcome.
Regards Anshu Avinash
On Wed, Jul 9, 2014 at 1:14 AM, Roberto Spadim <roberto@spadim.com.br> wrote:
Please include a link to the output (solutions.txt) too, so we can check what happened.
2014-07-08 16:24 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
http://www.netlib.org/lapack/lapacke.html is a C API to LAPACK.
2014-07-08 16:23 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
Check the LAPACK library. I used it some years ago and it solves linear equations; at least you won't waste time on 'how to solve linear equations', unless you want to study that :) LAPACK was a nice library; I used it without problems.
2014-07-08 16:04 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
Is a1, a2, ... what your solve_equation function is saving? And you want to know t1, t2, ..., t130, to understand how much time each 'read function' takes, is that it?
After this, what's the next step? Is this a starting point for selecting which is better, index or table scan?
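As an illustration of the setup being discussed (each query contributes one equation a1t1 + a2t2 + … = ttotal, where the a's are operation counts and the t's are the unknown per-operation times), here is a minimal C++ sketch of a least-squares fit via the normal equations. The names `estimate_times` and `solve_square` are made up for this example and do not come from the project's code; a library such as LAPACK would be the more robust choice in practice.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Solve the square system M x = b by Gaussian elimination with partial pivoting.
std::vector<double> solve_square(Matrix M, std::vector<double> b) {
  const std::size_t n = b.size();
  for (std::size_t col = 0; col < n; ++col) {
    // Use the row with the largest pivot to limit rounding error.
    std::size_t pivot = col;
    for (std::size_t r = col + 1; r < n; ++r)
      if (std::fabs(M[r][col]) > std::fabs(M[pivot][col])) pivot = r;
    if (std::fabs(M[pivot][col]) < 1e-12)
      throw std::runtime_error("singular normal equations");
    std::swap(M[col], M[pivot]);
    std::swap(b[col], b[pivot]);
    for (std::size_t r = col + 1; r < n; ++r) {
      const double f = M[r][col] / M[col][col];
      for (std::size_t c = col; c < n; ++c) M[r][c] -= f * M[col][c];
      b[r] -= f * b[col];
    }
  }
  // Back substitution on the upper-triangular system.
  std::vector<double> x(n, 0.0);
  for (std::size_t i = n; i-- > 0;) {
    double s = b[i];
    for (std::size_t c = i + 1; c < n; ++c) s -= M[i][c] * x[c];
    x[i] = s / M[i][i];
  }
  return x;
}

// Least-squares estimate of per-operation times: minimises ||A t - total||^2
// by solving (A^T A) t = A^T total. Each row of A is one query's counts.
std::vector<double> estimate_times(const Matrix& A,
                                   const std::vector<double>& total) {
  assert(!A.empty() && A.size() == total.size());
  const std::size_t m = A.size(), n = A.at(0).size();
  Matrix AtA(n, std::vector<double>(n, 0.0));
  std::vector<double> Atb(n, 0.0);
  for (std::size_t k = 0; k < m; ++k)
    for (std::size_t i = 0; i < n; ++i) {
      Atb[i] += A[k][i] * total[k];
      for (std::size_t j = 0; j < n; ++j) AtA[i][j] += A[k][i] * A[k][j];
    }
  return solve_square(AtA, Atb);
}
```

For count rows {1,0}, {0,1}, {1,1}, {2,1} with totals 2, 3, 5, 7, the fit recovers t = (2, 3). An operation whose count is zero in every query makes the normal equations singular, so such columns would have to be dropped before fitting.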
2014-07-08 15:54 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi,
a1, a2, ..., a130 are coefficients. t1, t2, ..., t130 are unknowns. We need 130 linearly independent equations to solve for these variables. We can never get 130 linearly independent equations, as some of the coefficients would be 0 every time. Hence, we get an approximate solution by forming an overdetermined system (http://en.wikipedia.org/wiki/Overdetermined_system). Let me know if you have any further doubts.
Regards Anshu Avinash
On Wed, Jul 9, 2014 at 12:18 AM, Roberto Spadim <roberto@spadim.com.br> wrote:
oops, linear equation
2014-07-08 15:47 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
going back...
a1t1 + a2t2 + … + a130t130 = ttotal
a1, t1...
a1 is something you don't know, and t1 is coefficients[i]?
it's a first-order equation, right?
2014-07-08 15:20 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi,
The idea is that we know the total time the query took and how many times each operation was performed. For example, consider the case of 'read_time'. We know how many times an index read took place, but not how much time a single index read takes. By solving these equations, we are trying to find the time for individual operations. coefficients[i].value is how many times operation i took place in a single query.
Hope this clears things up.
Regards Anshu Avinash
On Tue, Jul 8, 2014 at 10:57 PM, Roberto Spadim <roberto@spadim.com.br> wrote:
just to understand...
--- the solve_equation part, today only used to save information:
    std::ofstream datafile;
    char file_name[100];
    my_snprintf(file_name, 100, "/tmp/mariadb_cost_coefficients_%lu.txt", thread_id);
    datafile.open(file_name, std::ios::app);
    for(int i=0; i < MAX_CONSTANTS; i++)
      datafile << coefficients[i].value << " ";
    datafile << total_time << "\n";
    datafile.close();
----
the idea is: given a query and some coefficients[i].value, you get the total_time needed to execute the query. Do you want to "train" something to tell you how long the same query should take? Or to find the "x[i]" variables of your system (hardware, hard disk, etc.) and extend this to other queries?
2014-07-08 14:20 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
=] nice
2014-07-08 14:18 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi all,
You can download it here ( https://drive.google.com/file/d/0B7NiQb4EbbUVNVJFZ2xkRVR3Ylk/edit?usp=sharin... ). It is around 26M. I have added the link on the blog too.
Regards Anshu
On Tue, Jul 8, 2014 at 10:38 PM, Roberto Spadim <roberto@spadim.com.br> wrote:
could you 'display' the dataset you used with octave?
2014-07-08 13:55 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi all,
This week's blog post is at: http://igniting.in/gsoc2014/2014/07/08/solving-linear-equations/ . Sorry for the delay. Suggestions for an approach to solve the system of linear equations are welcome.
Regards Anshu Avinash
On Mon, Jun 23, 2014 at 7:39 PM, Roberto Spadim <roberto@spadim.com.br> wrote:
" MDEV. " it's nice to put the full name (MDEV-350), since google and other search engines help when someone tries to find information about mdev 350
text is ok :)
2014-06-23 11:04 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi,
Sorry for the confusion, this is the new link: http://igniting.in/gsoc2014/2014/06/23/work-before-mid-term/ Thanks for pointing it out.
Regards Anshu
On Mon, Jun 23, 2014 at 7:32 PM, Roberto Spadim <roberto@spadim.com.br> wrote:
"Sorry this page does not exist =("
2014-06-23 8:07 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi all,
You can find this week's blog entry at: http://igniting.in/2014/06/23/work-before-mid-term/ Suggestions/reviews are welcome.
Regards Anshu Avinash
On Mon, Jun 9, 2014 at 7:30 PM, Roberto Spadim <roberto@spadim.com.br> wrote:
Well, I was reading your posts. Do you need big data to test read and scan times?
On Monday, June 9, 2014, Anshu Avinash <anshu.avinash35@gmail.com> wrote:
Hi all,
You can find this week's blog entry at http://igniting.in/gsoc2014/2014/06/09/more-coding/. I'm now maintaining the code only on github: https://github.com/igniting/server/tree/selfTuningOptimizer.
Regards Anshu Avinash
On Sun, May 25, 2014 at 3:27 PM, Anshu Avinash <anshu.avinash35@gmail.com> wrote:
Hi all,
You can find this week's blog entry at http://igniting.in/gsoc2014/2014/05/25/coding-things-up/ . I have created a branch on launchpad for my work: http://bazaar.launchpad.net/~igniting/maria/maria/revision/4211 . You can give your suggestions/reviews either on this thread or as a comment on the blog itself.
Regards Anshu Avinash
On Tue, May 20, 2014 at 1:22 AM, Roberto Spadim <roberto@spadim.com.br> wrote:
wow, a big work, congratulations guy, i will read part by part to better understand mariadb code
2014-05-19 16:33 GMT-03:00 Anshu Avinash <anshu.avinash35@gmail.com>:
Hi all,
This week's blog entry will be delayed by a couple of days. I have started coding though, and would like to give a heads up on what I'm doing.
I've looked at the diffs for the "Cost model project" of mysql: http://bazaar.launchpad.net/~mysql/mysql-server/5.7/revision/7596 and http://bazaar.launchpad.net/~mysql/mysql-server/5.7/revision/7222 . These give a pretty good idea of what the hard-coded constants are and where they are being used.
The idea is to multiply the values returned by read_time() and scan_time() in handler.h by "READ_TIME_FACTOR" and "SCAN_TIME_FACTOR" when returning. These values would be read from a table in mysql db. For that I've looked at sql_statistics.cc. After completing this, I'll first change the values of these constants manually and check whether better or worse query plans are being selected. I'll first do the last step manually, to check that everything is working as expected, and later automate it.
Regards Anshu
On Mon, May 12, 2014 at 11:22 AM, Anshu Avinash <anshu.avinash35@gmail.com> wrote:
Hi all,
You can find my blog entry for this week at http://igniting.in/gsoc2014/2014/05/11/first-steps/ .
Regards Anshu Avinash
On Thu, May 8, 2014 at 11:46 PM, Anshu Avinash <anshu.avinash35@gmail.com> wrote:
Hi all,
Sorry for the irregular updates. I had been busy for the last couple of days and might still be busy for 1-2 days more. I will be completely free starting next week, and will be updating my blog weekly, every Monday (so the 1st update will be on May 12). I will also send the link to my post weekly on the mailing list.
As discussed on irc, I started to explore the pair of constants: handler::scan_time() and handler::read_time().
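The idea quoted above, scaling what scan_time() and read_time() return by tunable factors, can be sketched in C++ as follows. This is a toy under stated assumptions: `ToyHandler`, `CostFactors` and `prefer_index` are hypothetical names for illustration, the per-row read cost is a simplification, and none of this mirrors MariaDB's real handler class or its method signatures.

```cpp
#include <cassert>

// Hypothetical container for the tunable constants; in the thread these
// would be loaded from a table in the mysql database.
struct CostFactors {
  double read_time_factor = 1.0;
  double scan_time_factor = 1.0;
};

// Toy stand-in for a storage-engine handler (NOT MariaDB's handler class).
class ToyHandler {
 public:
  ToyHandler(double raw_scan_cost, double raw_read_cost_per_row,
             const CostFactors& factors)
      : raw_scan_cost_(raw_scan_cost),
        raw_read_cost_per_row_(raw_read_cost_per_row),
        factors_(factors) {
    assert(raw_scan_cost >= 0.0 && raw_read_cost_per_row >= 0.0);
  }

  // Raw full-scan estimate, scaled by the tunable factor on return.
  double scan_time() const {
    return factors_.scan_time_factor * raw_scan_cost_;
  }

  // Raw index-read estimate for `rows` rows, scaled on return.
  double read_time(unsigned long rows) const {
    return factors_.read_time_factor * raw_read_cost_per_row_ *
           static_cast<double>(rows);
  }

 private:
  double raw_scan_cost_;
  double raw_read_cost_per_row_;
  CostFactors factors_;
};

// The plan choice flips when the factors change the relative costs.
bool prefer_index(const ToyHandler& h, unsigned long rows) {
  return h.read_time(rows) < h.scan_time();
}
```

With a raw scan cost of 100 and a raw read cost of 1 per row, reading 50 rows through the index looks cheaper under the default factors; raising the read factor to 3 makes the full scan win instead, which is the kind of plan change the manual experiment described above would look for.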
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
participants (5)
- Anshu Avinash
- Colin Charles
- Jocelyn Fournier
- Roberto Spadim
- Sergei Golubchik