
[Maria-developers] Rev 2802: Fix for https://bugs.launchpad.net/maria/+bug/509795 in file:///Users/hakan/work/monty_program/maria/
by Hakan Kuecuekyilmaz 22 Jan '10
At file:///Users/hakan/work/monty_program/maria/
------------------------------------------------------------
revno: 2802
revision-id: hakan(a)askmonty.org-20100122211921-e177xzarwy7l8j5y
parent: knielsen(a)knielsen-hq.org-20100118125610-s55p0dsjofl9eclf
committer: Hakan Kuecuekyilmaz <hakan(a)askmonty.org>
branch nick: maria
timestamp: Fri 2010-01-22 22:19:21 +0100
message:
Fix for https://bugs.launchpad.net/maria/+bug/509795
The result of reverse_lookup("localhost") is system dependent.
Therefore we disable its result log.
=== modified file 'mysql-test/r/udf.result'
--- a/mysql-test/r/udf.result 2010-01-11 13:15:28 +0000
+++ b/mysql-test/r/udf.result 2010-01-22 21:19:21 +0000
@@ -38,8 +38,6 @@
select reverse_lookup("127.0.0.1");
select reverse_lookup(127,0,0,1);
select reverse_lookup("localhost");
-reverse_lookup("localhost")
-NULL
select avgcost();
ERROR HY000: Can't initialize function 'avgcost'; wrong number of arguments: AVGCOST() requires two arguments
select avgcost(100,23.76);
=== modified file 'mysql-test/t/udf.test'
--- a/mysql-test/t/udf.test 2010-01-11 13:15:28 +0000
+++ b/mysql-test/t/udf.test 2010-01-22 21:19:21 +0000
@@ -56,9 +56,9 @@
--disable_result_log
select reverse_lookup("127.0.0.1");
select reverse_lookup(127,0,0,1);
+select reverse_lookup("localhost");
--enable_result_log
-select reverse_lookup("localhost");
--error ER_CANT_INITIALIZE_UDF
select avgcost();
--error ER_CANT_INITIALIZE_UDF

[Maria-developers] Updated (by Knielsen): Add support for google protocol buffers (34)
by worklog-noreply@askmonty.org 22 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add support for google protocol buffers
CREATION DATE..: Tue, 21 Jul 2009, 21:11
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Server-Sprint
TASK ID........: 34 (http://askmonty.org/worklog/?tid=34)
VERSION........: WorkLog-3.4
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Knielsen - Fri, 22 Jan 2010, 11:38)=-=-
Low Level Design modified.
--- /tmp/wklog.34.old.29965 2010-01-22 11:38:57.000000000 +0200
+++ /tmp/wklog.34.new.29965 2010-01-22 11:38:57.000000000 +0200
@@ -2,3 +2,12 @@
and a parser for text form of .proto file which then exposes the parsed
file via standard GPB message navigation API.
+* We should have both server-side support and client-side support (client side
+ means functions in libmysqlclient so that user can select the full BLOB and
+ extract fields in the application).
+
+* Add some kind of header to the GPB blob to support versioning and future
+ extensibility.
+
+* Add complete syntax description (update, add, drop, exists, ...).
+
-=-=(Psergey - Tue, 21 Jul 2009, 21:13)=-=-
Low Level Design modified.
--- /tmp/wklog.34.old.6462 2009-07-21 21:13:13.000000000 +0300
+++ /tmp/wklog.34.new.6462 2009-07-21 21:13:13.000000000 +0300
@@ -1 +1,4 @@
+* GPB tarball contains a protocol definition for .proto file structure itself
+ and a parser for text form of .proto file which then exposes the parsed
+ file via standard GPB message navigation API.
-=-=(Psergey - Tue, 21 Jul 2009, 21:12)=-=-
High-Level Specification modified.
--- /tmp/wklog.34.old.6399 2009-07-21 21:12:23.000000000 +0300
+++ /tmp/wklog.34.new.6399 2009-07-21 21:12:23.000000000 +0300
@@ -1 +1,78 @@
+<contents>
+1. GPB Encoding overview
+2. GPB in an SQL database
+2.1 Informing server about GPB field names and types
+2.2 Addressing GPB fields
+2.2.1 Option1: SQL Function
+2.2.2 Option2: SQL columns
+</contents>
+
+
+1. GPB Encoding overview
+========================
+
+GBB is a compact encoding for structured and typed data. A unit of GPB data
+(it is called message) is only partially self-describing: it's possible to
+iterate over its parts, but, quoting the spec
+
+http://code.google.com/apis/protocolbuffers/docs/encoding.html:
+ " the name and declared type for each field can only be determined on the
+ decoding end by referencing the message type's definition (i.e. the .proto
+ file). "
+
+2. GPB in an SQL database
+=========================
+
+It is possible to store GPB data in MariaDB today - one can declare a binary
+blob column and use it to store GPB messages. Storing and retrieving entire
+messages will be the only available operations, though, as the server has no
+idea about the GPB format.
+It is apparent that ability to peek inside GPB data from SQL layer would be of
+great advantage: one would be able to
+- select only certain fields or parts of GPB messages
+- filter records based on the values of GPB fields
+- etc
+performing such operations at SQL layer will allow to reduce client<->server
+traffic right away, and will open path to getting the best possible
+performance.
+
+2.1 Informing server about GPB field names and types
+----------------------------------------------------
+User-friendly/meaningful access to GPB fields requires knowledge of GPB field
+names and types, which are not available from GPB message itself (see "GPB
+encoding overview" section).
+
+So the first issue to be addressed is to get the server to know the definition
+of stored messages. We intend to assume that all records have GPB messages
+that conform to a certain single definition, which gives one definition per
+GPB field.
+
+DecisionToMake: How to pass the server the GPB definition?
+First idea: add a CREATE TABLE parameter which will specify either the
+definition itself or path to .proto file with the definition.
+
+2.2 Addressing GPB fields
+-------------------------
+We'll need to provide a way to access GPB fields. This can be complicated as
+structures that are encoded in GPB message can be nested and recursive.
+
+2.2.1 Option1: SQL Function
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Introduce an SQL function GPB_FIELD(path) which will return contents of the
+field.
+- Return type of the function will be determined from GPB message definition.
+- For path, we can use XPath selector (a subset of XPath) syntax.
+
+(TODO ^ the above needs to be specified in more detail. is the selector as
+simple as filesystem path or we allow quantifiers (with predicates?)?)
+
+2.2.2 Option2: SQL columns
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+Make GPB columns to be accessible as SQL columns.
+This approach has problems:
+- It might be hard to implement code-wise
+ - (TODO will Virtual columns patch help??)
+- It is not clear how to access fields from nested structures. Should we allow
+ quoted names like `foo/bar[2]/baz' ?
+
DESCRIPTION:
Add support for Google Protocol Buffers (further GPB). It should be possible
to have columns that store GPB-encoded data, as well as use SQL constructs to
extract parts of GPB data for use in select list, for filtering, and so forth.
Any support for indexing GPB data is outside of scope of this WL entry.
HIGH-LEVEL SPECIFICATION:
<contents>
1. GPB Encoding overview
2. GPB in an SQL database
2.1 Informing server about GPB field names and types
2.2 Addressing GPB fields
2.2.1 Option1: SQL Function
2.2.2 Option2: SQL columns
</contents>
1. GPB Encoding overview
========================
GPB is a compact encoding for structured, typed data. A unit of GPB data
(called a message) is only partially self-describing: it is possible to
iterate over its parts, but, quoting the spec
http://code.google.com/apis/protocolbuffers/docs/encoding.html:
" the name and declared type for each field can only be determined on the
decoding end by referencing the message type's definition (i.e. the .proto
file). "
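The "partially self-describing" property is easy to demonstrate: the wire format carries only field numbers and wire types, never field names or declared types. A minimal sketch (hand-decoding the classic example from the encoding document quoted above; the helper names are illustrative, not an existing API):

```python
def read_varint(data, pos):
    """Decode a base-128 varint starting at pos; return (value, new_pos)."""
    result = shift = 0
    while True:
        byte = data[pos]
        result |= (byte & 0x7F) << shift
        pos += 1
        if not byte & 0x80:
            return result, pos
        shift += 7

def iterate_fields(message):
    """Yield (field_number, wire_type, value) for each field of a GPB message.

    Only the varint (0) and length-delimited (2) wire types are handled.
    Note what is missing: field *names* and declared types -- those live
    only in the .proto definition, not in the message bytes themselves.
    """
    pos = 0
    while pos < len(message):
        key, pos = read_varint(message, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:            # varint
            value, pos = read_varint(message, pos)
        elif wire_type == 2:          # length-delimited (string/bytes/nested)
            length, pos = read_varint(message, pos)
            value, pos = message[pos:pos + length], pos + length
        else:
            raise NotImplementedError("wire type %d" % wire_type)
        yield field_number, wire_type, value

# The spec's example: a message whose field 1 (a varint) holds 150
# encodes as the three bytes 08 96 01.
print(list(iterate_fields(b"\x08\x96\x01")))  # [(1, 0, 150)]
```

So the server can walk any stored blob, but mapping field 1 to a name and a declared type requires the .proto definition, which is exactly the problem section 2.1 addresses.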
2. GPB in an SQL database
=========================
It is possible to store GPB data in MariaDB today - one can declare a binary
blob column and use it to store GPB messages. Storing and retrieving entire
messages will be the only available operations, though, as the server has no
idea about the GPB format.
It is apparent that the ability to peek inside GPB data from the SQL layer
would be of great advantage: one would be able to
- select only certain fields or parts of GPB messages
- filter records based on the values of GPB fields
- etc.
Performing such operations at the SQL layer would reduce client<->server
traffic right away and would open a path to the best possible performance.
2.1 Informing server about GPB field names and types
----------------------------------------------------
User-friendly/meaningful access to GPB fields requires knowledge of GPB field
names and types, which are not available from GPB message itself (see "GPB
encoding overview" section).
So the first issue to be addressed is to get the server to know the definition
of stored messages. We intend to assume that all records have GPB messages
that conform to a certain single definition, which gives one definition per
GPB field.
DecisionToMake: How to pass the server the GPB definition?
First idea: add a CREATE TABLE parameter which will specify either the
definition itself or path to .proto file with the definition.
2.2 Addressing GPB fields
-------------------------
We'll need to provide a way to access GPB fields. This can be complicated, as
the structures encoded in a GPB message can be nested and recursive.
2.2.1 Option1: SQL Function
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Introduce an SQL function GPB_FIELD(path) which will return the contents of
the field.
- Return type of the function will be determined from GPB message definition.
- For path, we can use XPath selector (a subset of XPath) syntax.
(TODO: the above needs to be specified in more detail. Is the selector as
simple as a filesystem path, or do we allow quantifiers (with predicates)?)
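Since neither GPB_FIELD nor its path grammar exists yet, the intended selector semantics can only be sketched. The following toy resolver works on an already-decoded message represented as nested dicts and lists; the function name, the filesystem-like path syntax, and the 0-based `[n]` index quantifier are all assumptions taken from the proposal above (real XPath would be 1-based):

```python
import re

def gpb_field(message, path):
    """Resolve a path like 'foo/bar[2]/baz' against a decoded message.

    A decoded message is modelled as nested dicts (sub-messages) and
    lists (repeated fields). '[n]' picks the n-th element (0-based here
    for simplicity) of a repeated field.
    """
    node = message
    for step in path.strip("/").split("/"):
        m = re.fullmatch(r"(\w+)(?:\[(\d+)\])?", step)
        if not m:
            raise ValueError("bad path step: %r" % step)
        name, index = m.group(1), m.group(2)
        node = node[name]
        if index is not None:           # repeated field: pick one element
            node = node[int(index)]
    return node

msg = {"foo": {"bar": [{"baz": 10}, {"baz": 20}, {"baz": 30}]}}
print(gpb_field(msg, "foo/bar[2]/baz"))  # 30
```

Even this sketch surfaces the open TODO: plain path steps plus an index quantifier are trivial, but predicates (e.g. "the bar whose baz > 15") would need a real selector grammar.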
2.2.2 Option2: SQL columns
~~~~~~~~~~~~~~~~~~~~~~~~~~
Make GPB fields accessible as SQL columns.
This approach has problems:
- It might be hard to implement code-wise
- (TODO will Virtual columns patch help??)
- It is not clear how to access fields from nested structures. Should we allow
quoted names like `foo/bar[2]/baz' ?
LOW-LEVEL DESIGN:
* GPB tarball contains a protocol definition for .proto file structure itself
and a parser for text form of .proto file which then exposes the parsed
file via standard GPB message navigation API.
* We should have both server-side support and client-side support (client side
means functions in libmysqlclient so that user can select the full BLOB and
extract fields in the application).
* Add some kind of header to the GPB blob to support versioning and future
extensibility.
* Add complete syntax description (update, add, drop, exists, ...).
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] New (by Knielsen): Re-merge PBXT into MariaDB with proper history using bzr merge-into (76)
by worklog-noreply@askmonty.org 22 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Re-merge PBXT into MariaDB with proper history using bzr merge-into
CREATION DATE..: Fri, 22 Jan 2010, 10:03
SUPERVISOR.....: Knielsen
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Server-BackLog
TASK ID........: 76 (http://askmonty.org/worklog/?tid=76)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
DESCRIPTION:
PBXT is maintained in its own repository separate from MariaDB (as it is also
used for MySQL and Drizzle).
Merging new PBXT features into MariaDB is currently a manual process where a
copy inside the MariaDB tree is manually patched to match the latest version
in the real PBXT repository. This is a pity, as it requires extra work, and
the PBXT revision history is not available in the MariaDB tree.
Bzr actually has a feature to properly merge one repository into another with
preservation of full history and bzr-assisted merging. This is used for merging
XtraDB into MariaDB. Some information on this here:
http://askmonty.org/wiki/index.php/Merging_into_MariaDB#Merging_from_XtraDB…
We want to do this for PBXT also. However, this is currently blocked on a bug
in bzr:
bug: https://bugs.launchpad.net/bzr/+bug/375898
Once this bug is resolved, we would want to re-do the inclusion of PBXT into
MariaDB, using `bzr merge-into` and following the procedure in the bug report.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] New (by Knielsen): Extend build to create a shared libmysqld.so library (75)
by worklog-noreply@askmonty.org 22 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Extend build to create a shared libmysqld.so library
CREATION DATE..: Fri, 22 Jan 2010, 09:39
SUPERVISOR.....: Knielsen
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 75 (http://askmonty.org/worklog/?tid=75)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
DESCRIPTION:
Currently, the embedded library libmysqld is only built as a static library,
libmysqld.a.
However, there is also a need for a shared embedded library, libmysqld.so.
A good example is Amarok, which is now using MySQL (and by default an embedded
libmysqld) for storing the user's music collection. Amarok is designed to load
a number of modules as .so plugins, and the code using libmysqld is one such
plugin. Any code that is to be loaded as a shared object on Linux must be
built with -fPIC (position independent code). Any library used must thus also
be built with -fPIC and preferably itself be linked statically.
Amarok is a widely used application (end-user desktop). It could be a good
leverage for making libmysqld more popular eg. in distros. However, currently
the distros need to resort to various hacks to make things work with libmysqld
and Amarok. Some links:
Fedora patches .spec to add -fPIC in the build, extract all objects from
libmysqld.a, and re-link them as libmysqld.so.
http://cvs.fedoraproject.org/viewvc/devel/mysql/mysql.spec?r1=1.108&r2=1.109
Gentoo apparently have a patch for MySQL to build libmysqld.so:
http://bugs.gentoo.org/attachment.cgi?id=188057
Debian seems to suggest building both libmysqld.a and libmysqld_pic.a, the
latter to be used for shared objects linking with embedded server:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=508406#52
It would be nice if the main build system was able to build properly
libmysqld.so, so different distros would not need to resort to different hacks
to get things working.
The normal way to build both .a and .so, as far as I know, is to use libtool;
it will then build each object twice (with and without -fPIC) if needed. One
might want to make this optional to reduce build times. It would also need to
be done for each storage engine.
An alternative would be to build everything with -fPIC. This should probably
be optional also, as on some architectures (ELF x86 32-bit), there is some
speed penalty for -fPIC. The libmysqld.so would then be made only in this
case. Distro packages could do a separate build with -fPIC to make
libmysqld.so (to not get -fPIC into the main server code).
There is a MySQL bug for this:
http://bugs.mysql.com/bug.php?id=39288
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Psergey): Subquery optimization: Avoid recalculating subquery if external fields values found in subquery cache (66)
by worklog-noreply@askmonty.org 20 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Avoid recalculating subquery if external fields
values found in subquery cache
CREATION DATE..: Wed, 25 Nov 2009, 22:25
SUPERVISOR.....: Monty
IMPLEMENTOR....: Sanja
COPIES TO......:
CATEGORY.......: Server-BackLog
TASK ID........: 66 (http://askmonty.org/worklog/?tid=66)
VERSION........: Server-5.2
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Wed, 20 Jan 2010, 14:50)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.26873 2010-01-20 14:50:41.000000000 +0200
+++ /tmp/wklog.66.new.26873 2010-01-20 14:50:41.000000000 +0200
@@ -4,7 +4,6 @@
To check/discuss:
-----------------
-* Do we put subquery cache on all levels of subqueries or on highest level only
* Will there be any means to measure subquery cache hit rate?
* MySQL-6.0 has a one-element predicate result cache. It is called "left
expression cache", grep for left_expr_cache in sql/item_subselect.*
@@ -41,7 +40,12 @@
- subquery_item_result is 'bool' for subquery predicates, and is of
some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
-We dont support cases when outer_expr or correlation_references are blobs.
+We don't support cases when outer_expr or correlation_references are blobs.
+
+All subquery predicates are cached. That is, if one subquery predicate is
+located within another, both of them will have caches (one option to reduce
+cache memory usage was to use cache only for the upper-most select. we decided
+against it).
2. Data structure used for the cache
------------------------------------
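The caching scheme in the note above can be sketched independently of the server internals: key a small lookup table on the values of the outer references, and consult it before re-executing the subquery. A toy Python model (the class and method names are illustrative, not server code; a dict stands in for the temporary table that the real design keys on all fields except the result field):

```python
class SubqueryCache:
    """Toy model of the WL#66 idea: remember the subquery predicate's
    result per combination of outer-reference values, so that repeated
    combinations skip re-executing the subquery."""

    def __init__(self, compute):
        self.compute = compute          # the "subquery": outer values -> result
        self.table = {}                 # stands in for the keyed temptable
        self.hits = self.misses = 0

    def lookup(self, outer_values):
        key = tuple(outer_values)
        if key in self.table:
            self.hits += 1
        else:
            self.misses += 1
            self.table[key] = self.compute(*key)
        return self.table[key]

# e.g. the predicate  outer.a IN (SELECT b FROM t WHERE t.c = outer.c)
t = [(1, "x"), (2, "x"), (3, "y")]
cache = SubqueryCache(lambda a, c: any(b == a for (b, cc) in t if cc == c))
outer_rows = [(1, "x"), (2, "x"), (1, "x"), (3, "x")]
results = [cache.lookup(row) for row in outer_rows]
print(results, cache.hits, cache.misses)  # [True, True, True, False] 1 3
```

The repeated outer row (1, "x") is answered from the cache; this is also why the design above caches every subquery level, since nested predicates each get their own table.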
-=-=(Psergey - Wed, 20 Jan 2010, 13:07)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.17649 2010-01-20 13:07:07.000000000 +0200
+++ /tmp/wklog.66.new.17649 2010-01-20 13:07:07.000000000 +0200
@@ -3,7 +3,13 @@
To check/discuss:
- To put subquery cache on all levels of subqueries or on highest level only.
+-----------------
+* Do we put subquery cache on all levels of subqueries or on highest level only
+* Will there be any means to measure subquery cache hit rate?
+* MySQL-6.0 has a one-element predicate result cache. It is called "left
+ expression cache", grep for left_expr_cache in sql/item_subselect.*
+ When this WL is merged with 6.0's optimizations, these two caches will
+ need to be unified somehow.
<contents>
-=-=(Psergey - Mon, 18 Jan 2010, 16:40)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24899 2010-01-18 16:40:16.000000000 +0200
+++ /tmp/wklog.66.new.24899 2010-01-18 16:40:16.000000000 +0200
@@ -1,3 +1,5 @@
+* Target version: base on mysql-5.2 code
+
All items on which subquery depend could be collected in
st_select_lex::mark_as_dependent (direct of indirect reference?)
-=-=(Psergey - Mon, 18 Jan 2010, 16:37)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24586 2010-01-18 16:37:07.000000000 +0200
+++ /tmp/wklog.66.new.24586 2010-01-18 16:37:07.000000000 +0200
@@ -4,6 +4,11 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to fill the temptable
+-------------------------
+Can reuse approach from SJ-Materialization. Its code is in end_sj_materialize()
+and is supposed to be quite trivial.
+
How to make lookups into temptable
----------------------------------
We'll reuse approach used by SJ-Materialization in 6.0.
-=-=(Psergey - Mon, 18 Jan 2010, 16:34)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24328 2010-01-18 16:34:19.000000000 +0200
+++ /tmp/wklog.66.new.24328 2010-01-18 16:34:19.000000000 +0200
@@ -32,8 +32,8 @@
Question: or perhaps that is not necessarry?
</questionable>
-Execution process
-~~~~~~~~~~~~~~~~~
+Doing the lookup
+~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
@@ -42,4 +42,12 @@
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+The code in this WL will use the same approach
+Extracting the value of the subquery predicate
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The goal of making the lookup is to get the value of subquery predicate.
+This is done by creating an Item_field $I which refers to appropriate
+temporary table's field and then subquery_predicate->val_int() will invoke
+$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str() and so
+forth.
-=-=(Psergey - Mon, 18 Jan 2010, 16:23)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23203 2010-01-18 16:23:18.000000000 +0200
+++ /tmp/wklog.66.new.23203 2010-01-18 16:23:18.000000000 +0200
@@ -31,3 +31,15 @@
Question: or perhaps that is not necessarry?
</questionable>
+
+Execution process
+~~~~~~~~~~~~~~~~~
+SJ-Materialization does lookup in sub_select_sjm(), with this code:
+
+ /* Do index lookup in the materialized table */
+ if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
+ DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
+ if (res || !sjm->in_equality->val_int())
+ DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+
+
-=-=(Psergey - Mon, 18 Jan 2010, 16:22)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23076 2010-01-18 16:22:07.000000000 +0200
+++ /tmp/wklog.66.new.23076 2010-01-18 16:22:07.000000000 +0200
@@ -4,3 +4,30 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to make lookups into temptable
+----------------------------------
+We'll reuse approach used by SJ-Materialization in 6.0.
+
+Setup process
+~~~~~~~~~~~~~
+Setup is performed in the same way as in setup_sj_materialization(),
+see the code that starts these lines:
+
+ /*
+ Create/initialize everything we will need to index lookups into the
+ temptable.
+ */
+
+and ends at this line:
+
+ Remove the injected semi-join IN-equalities from join_tab conds. This
+
+<questionable>
+We'll also need to check equalities, i.e. do an equivalent of this:
+
+ if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
+ emb_sj_nest->sj_subq_pred)))
+ DBUG_RETURN(TRUE); /* purecov: inspected */
+
+Question: or perhaps that is not necessarry?
+</questionable>
-=-=(Psergey - Tue, 12 Jan 2010, 18:39)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.31666 2010-01-12 18:39:43.000000000 +0200
+++ /tmp/wklog.66.new.31666 2010-01-12 18:39:43.000000000 +0200
@@ -4,3 +4,99 @@
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
+
+
+<contents>
+1. Scope of the task
+2. Data structure used for the cache
+3. Cache size
+4. Interplay with other subquery optimizations
+5. User interface
+</contents>
+
+1. Scope of the task
+--------------------
+This WL should handle all subquery predicates, i.e. it should handle these
+cases:
+
+ outer_expr IN (SELECT correlated_select)
+ outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
+ EXISTS (SELECT correlated_select)
+ scalar-context subquery: (SELECT correlated_select)
+
+The cache will maintain
+
+ (outer_expr, correlation_references)-> subquery_item_result
+
+mapping, where
+- correlation_references is a list of tablename.column_name that are referred
+ from the correlated_select but tablename is a table that is ouside the
+ subquery.
+- subquery_item_result is 'bool' for subquery predicates, and is of
+some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
+
+We dont support cases when outer_expr or correlation_references are blobs.
+
+2. Data structure used for the cache
+------------------------------------
+There are two data structures available in the codebase that will allow fast
+equality lookups:
+
+1. HASH (mysys/hash.c) tables
+2. Temporary tables (the ones that are used for e.g. GROUP BY)
+
+None of them has any support for element eviction on overflow (using LRU or
+some other policy).
+
+Query cache and MyISAM/Maria's key/page cache ought to support some eviction
+mechanism, but code-wise it is not readily reusable, one will need to factor
+it out (or copy it).
+
+We choose to use #2, and not to have any eviction policy. See subsequent
+sections for details and reasoning behind the decision.
+
+3. Cache size
+-------------
+Typically, a cache has some maximum size and a policy which is used to
+select a cache entry for removal when the cache becomes full (e.g. find
+and remove the least [recently] used entry)
+
+For this WL entry we will use a cache of infinite size. The reasoning behind
+this is that:
+- is is easy to do: we have temporary tables that can grow to arbitrarily
+ large size while still providing the same insert/lookup interface.
+- it suits us: unless the subquery is resolved with one index lookup,
+ hitting the cache would be many times cheaper than re-running the
+ subquery, so cache is worth having.
+
+4. Interplay with other subquery optimizations
+----------------------------------------------
+* This WL entry should not care about IN->EXISTS transformation: caching for
+ IN subquery and result of its conversion to EXISTS would work in the same
+ way.
+
+* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
+ work/be useful irrespectively of whether the rewrite has been performed or
+ not)
+
+* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
+ this basically the same?
+ A: no, it is not:
+ - IN-Materialization has to perform full materialization before it can
+ do the first subquery evaluation. This WL's code has almost no startup
+ costs.
+ - This optimization has temp.table of (corr_reference, predicate_value),
+ while IN-materialization will have (corr_reference) only.
+
+5. User interface
+-----------------
+* There will be an @@optimizer_switch flag to turn this optimization on and
+ off (TODO: name of the flag?)
+
+* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
+ print something in the warning text of EXPLAIN EXTEDED that would indicate
+ use of cache.
+
+* temporary table sizing (max size for heap table, whether to use MyISAM or
+ Maria) will be controlled with common temp.table control variables.
+
-=-=(Psergey - Mon, 11 Jan 2010, 13:25)=-=-
As of today, there is code that
- collects outside references
- creates a temporary table with an index that would allow for fast lookups.
There is no code to
- fill the temporary table
- make lookups into it
Reported zero hours worked. Estimate unchanged.
-=-=(Sanja - Fri, 11 Dec 2009, 15:09)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.28164 2009-12-11 15:09:15.000000000 +0200
+++ /tmp/wklog.66.new.28164 2009-12-11 15:09:15.000000000 +0200
@@ -3,4 +3,4 @@
To check/discuss:
-Are there sens to put subquery cache on all levels of subqueries of on highest.
+ To put subquery cache on all levels of subqueries or on highest level only.
------------------------------------------------------------
-=-=(View All Progress Notes, 12 total)=-=-
http://askmonty.org/worklog/index.pl?tid=66&nolimit=1
DESCRIPTION:
Collect all outer items/references (the left part of the subquery and the outer
references inside the subquery) into a key string. Compare that string (which
represents a certain value set of the references) against values in a hash table,
and return the cached result of the subquery if this combination of reference
values has already been used.
For example, in the following subquery:
(L1, L2) IN (SELECT A, B FROM T WHERE T.F1>OUTER_FIELD)
the set of references to look up in the subquery cache is (L1, L2, OUTER_FIELD).
The subquery cache should be implemented as a simple LRU cache attached to the
subquery. The size of the subquery cache (in number of results, or maybe in the
amount of memory used) is limited by a session variable (query parameter?).
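To make the description concrete, here is a minimal hedged sketch (plain C++ with invented names, not the actual server code) of the key-string idea: concatenate the current values of the outer references and use the result as the lookup key.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Build a single key string from the current values of all outer references
// (e.g. L1, L2, OUTER_FIELD in the example above). A real implementation
// would use the temptable's key image; '\0' is a naive field separator here.
std::string make_key(const std::vector<std::string>& outer_values)
{
  std::string key;
  for (const std::string& v : outer_values)
  {
    key+= v;
    key+= '\0';
  }
  return key;
}

// Maps a reference-value combination to the cached subquery result.
class SubqueryCache
{
public:
  // Return the cached result, or nullopt on a cache miss.
  std::optional<bool> lookup(const std::vector<std::string>& vals) const
  {
    auto it= cache.find(make_key(vals));
    if (it == cache.end())
      return std::nullopt;
    return it->second;
  }
  // Store the result computed by actually executing the subquery.
  void store(const std::vector<std::string>& vals, bool result)
  {
    cache[make_key(vals)]= result;
  }
private:
  std::map<std::string, bool> cache;
};
```

Note that the separator keeps ("1", "27") and ("12", "7") from colliding; a real key image would also need to encode NULLs and collations.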
HIGH-LEVEL SPECIFICATION:
Attach a subquery cache to each Item_subquery. The interface should allow using
either a hash table or a temporary table inside.
To check/discuss:
-----------------
* Will there be any means to measure subquery cache hit rate?
* MySQL-6.0 has a one-element predicate result cache. It is called "left
expression cache", grep for left_expr_cache in sql/item_subselect.*
When this WL is merged with 6.0's optimizations, these two caches will
need to be unified somehow.
<contents>
1. Scope of the task
2. Data structure used for the cache
3. Cache size
4. Interplay with other subquery optimizations
5. User interface
</contents>
1. Scope of the task
--------------------
This WL should handle all subquery predicates, i.e. it should handle these
cases:
outer_expr IN (SELECT correlated_select)
outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
EXISTS (SELECT correlated_select)
scalar-context subquery: (SELECT correlated_select)
The cache will maintain
(outer_expr, correlation_references)-> subquery_item_result
mapping, where
- correlation_references is a list of tablename.column_name entries that are
  referred to from the correlated_select, where tablename is a table outside
  the subquery.
- subquery_item_result is 'bool' for subquery predicates, and is of some
  scalar or ROW(scalar1,...scalarN) type for a scalar-context subquery.
We don't support cases where outer_expr or correlation_references are blobs.
All subquery predicates are cached. That is, if one subquery predicate is
located within another, both of them will have caches (one option to reduce
cache memory usage was to use a cache only for the upper-most select; we
decided against it).
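A hedged sketch of this per-predicate arrangement (invented names; the real code would live around Item_subselect and use a temptable rather than a map): each predicate object owns its own cache, so nested predicates cache independently. The hit/miss counters also suggest one way the cache hit rate from the "To check/discuss" list could be measured.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Stand-in for one subquery predicate: it owns a private cache, so nested
// predicates each get their own, as section 1 requires.
struct CachedPredicate
{
  // Stand-in for actually executing the correlated select.
  std::function<bool(const std::vector<std::string>&)> run_subquery;
  // Cache keyed on the values of (outer_expr, correlation_references).
  std::map<std::vector<std::string>, bool> cache;
  unsigned hits= 0, misses= 0;

  bool eval(const std::vector<std::string>& ref_values)
  {
    auto it= cache.find(ref_values);
    if (it != cache.end())
    {
      hits++;
      return it->second;                  // cache hit: reuse stored result
    }
    misses++;
    bool res= run_subquery(ref_values);   // cache miss: execute the subquery
    cache.emplace(ref_values, res);
    return res;
  }
};
```

With this shape, re-evaluating a predicate for an already-seen combination of reference values never re-runs the subquery.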
2. Data structure used for the cache
------------------------------------
There are two data structures available in the codebase that will allow fast
equality lookups:
1. HASH (mysys/hash.c) tables
2. Temporary tables (the ones that are used for e.g. GROUP BY)
None of them has any support for element eviction on overflow (using LRU or
some other policy).
The query cache and MyISAM/Maria's key/page cache do support some eviction
mechanism, but code-wise it is not readily reusable; one would need to factor
it out (or copy it).
We choose to use #2, and not to have any eviction policy. See subsequent
sections for details and reasoning behind the decision.
3. Cache size
-------------
Typically, a cache has some maximum size and a policy that is used to select
a cache entry for removal when the cache becomes full (e.g. find and remove
the least-recently-used entry).
For this WL entry we will use a cache of infinite size. The reasoning behind
this is that:
- it is easy to do: we have temporary tables that can grow to an arbitrarily
  large size while still providing the same insert/lookup interface.
- it suits us: unless the subquery is resolved with one index lookup, hitting
  the cache would be many times cheaper than re-running the subquery, so the
  cache is worth having.
4. Interplay with other subquery optimizations
----------------------------------------------
* This WL entry need not care about the IN->EXISTS transformation: caching
  for an IN subquery and for the result of its conversion to EXISTS would
  work in the same way.
* This optimization is orthogonal to the <=>ANY -> MIN/MAX rewrite (it will
  work/be useful irrespective of whether the rewrite has been performed or
  not).
* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
this basically the same?
A: no, it is not:
- IN-Materialization has to perform full materialization before it can
do the first subquery evaluation. This WL's code has almost no startup
costs.
- This optimization has temp.table of (corr_reference, predicate_value),
while IN-materialization will have (corr_reference) only.
5. User interface
-----------------
* There will be an @@optimizer_switch flag to turn this optimization on and
off (TODO: name of the flag?)
* TODO: how do we show this in EXPLAIN [EXTENDED]? The easiest way is to
  print something in the warning text of EXPLAIN EXTENDED that would indicate
  use of the cache.
* temporary table sizing (max size for heap table, whether to use MyISAM or
Maria) will be controlled with common temp.table control variables.
LOW-LEVEL DESIGN:
* Target version: base on mysql-5.2 code
All items on which the subquery depends could be collected in
st_select_lex::mark_as_dependent (direct or indirect reference?)
The temporary table index should be created over all fields except the result
field (TMP_TABLE_PARAM::keyinfo).
How to fill the temptable
-------------------------
We can reuse the approach from SJ-Materialization. Its code is in
end_sj_materialize() and is supposed to be quite trivial.
How to make lookups into temptable
----------------------------------
We'll reuse the approach used by SJ-Materialization in 6.0.
Setup process
~~~~~~~~~~~~~
Setup is performed in the same way as in setup_sj_materialization();
see the code that starts with these lines:
/*
Create/initialize everything we will need to index lookups into the
temptable.
*/
and ends at this line:
Remove the injected semi-join IN-equalities from join_tab conds. This
<questionable>
We'll also need to check equalities, i.e. do an equivalent of this:
if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
emb_sj_nest->sj_subq_pred)))
DBUG_RETURN(TRUE); /* purecov: inspected */
Question: or perhaps that is not necessary?
</questionable>
Doing the lookup
~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
The code in this WL will use the same approach.
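For illustration only, here is the same control flow with the server calls replaced by stand-ins (in the real code, join_read_key2() returning 1 means an error, any other non-zero value means no matching row, and in_equality models sjm->in_equality->val_int()):

```cpp
// Possible outcomes of one lookup into the materialized/cache table,
// mirroring the NESTED_LOOP_* codes in the quoted snippet.
enum nested_loop_state
{
  NESTED_LOOP_OK,
  NESTED_LOOP_NO_MORE_ROWS,
  NESTED_LOOP_ERROR
};

// Stand-ins for the index-lookup return code and the IN-equality check.
struct lookup_result
{
  int res;           // return value of the join_read_key2()-style lookup
  bool in_equality;  // result of evaluating the injected IN-equality
};

nested_loop_state do_lookup(const lookup_result& r)
{
  if (r.res == 1)
    return NESTED_LOOP_ERROR;           // index lookup failed
  if (r.res || !r.in_equality)
    return NESTED_LOOP_NO_MORE_ROWS;    // no row found, or equality check failed
  return NESTED_LOOP_OK;                // matching row found
}
```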
Extracting the value of the subquery predicate
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The goal of making the lookup is to get the value of the subquery predicate.
This is done by creating an Item_field $I which refers to the appropriate
temporary table field; then subquery_predicate->val_int() will invoke
$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str(), and
so forth.
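The delegation described above can be sketched as follows (Item_field, Field, and the temptable are reduced to invented stand-ins; the real classes have much richer interfaces):

```cpp
#include <string>

// Simplified stand-in for the temptable field holding the cached result.
struct TmpTableField
{
  long long int_val;
  std::string str_val;
};

// Stand-in for the Item_field $I that points into the temptable.
struct ItemFieldRef
{
  const TmpTableField* field;
  long long val_int() const { return field->int_val; }
  std::string val_str() const { return field->str_val; }
};

// After a cache hit, the subquery predicate simply forwards to $I instead
// of re-executing the subquery.
struct SubqueryPredicate
{
  ItemFieldRef result_ref;
  long long val_int() const { return result_ref.val_int(); }
  std::string val_str() const { return result_ref.val_str(); }
};
```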
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Updated (by Psergey): Subquery optimization: Avoid recalculating subquery if external fields values found in subquery cache (66)
by worklog-noreply@askmonty.org 20 Jan '10
by worklog-noreply@askmonty.org 20 Jan '10
20 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Avoid recalculating subquery if external fields
values found in subquery cache
CREATION DATE..: Wed, 25 Nov 2009, 22:25
SUPERVISOR.....: Monty
IMPLEMENTOR....: Sanja
COPIES TO......:
CATEGORY.......: Server-BackLog
TASK ID........: 66 (http://askmonty.org/worklog/?tid=66)
VERSION........: Server-5.2
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Wed, 20 Jan 2010, 14:50)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.26873 2010-01-20 14:50:41.000000000 +0200
+++ /tmp/wklog.66.new.26873 2010-01-20 14:50:41.000000000 +0200
@@ -4,7 +4,6 @@
To check/discuss:
-----------------
-* Do we put subquery cache on all levels of subqueries or on highest level only
* Will there be any means to measure subquery cache hit rate?
* MySQL-6.0 has a one-element predicate result cache. It is called "left
expression cache", grep for left_expr_cache in sql/item_subselect.*
@@ -41,7 +40,12 @@
- subquery_item_result is 'bool' for subquery predicates, and is of
some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
-We dont support cases when outer_expr or correlation_references are blobs.
+We don't support cases when outer_expr or correlation_references are blobs.
+
+All subquery predicates are cached. That is, if one subquery predicate is
+located within another, both of them will have caches (one option to reduce
+cache memory usage was to use cache only for the upper-most select. we decided
+against it).
2. Data structure used for the cache
------------------------------------
-=-=(Psergey - Wed, 20 Jan 2010, 13:07)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.17649 2010-01-20 13:07:07.000000000 +0200
+++ /tmp/wklog.66.new.17649 2010-01-20 13:07:07.000000000 +0200
@@ -3,7 +3,13 @@
To check/discuss:
- To put subquery cache on all levels of subqueries or on highest level only.
+-----------------
+* Do we put subquery cache on all levels of subqueries or on highest level only
+* Will there be any means to measure subquery cache hit rate?
+* MySQL-6.0 has a one-element predicate result cache. It is called "left
+ expression cache", grep for left_expr_cache in sql/item_subselect.*
+ When this WL is merged with 6.0's optimizations, these two caches will
+ need to be unified somehow.
<contents>
-=-=(Psergey - Mon, 18 Jan 2010, 16:40)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24899 2010-01-18 16:40:16.000000000 +0200
+++ /tmp/wklog.66.new.24899 2010-01-18 16:40:16.000000000 +0200
@@ -1,3 +1,5 @@
+* Target version: base on mysql-5.2 code
+
All items on which subquery depend could be collected in
st_select_lex::mark_as_dependent (direct of indirect reference?)
-=-=(Psergey - Mon, 18 Jan 2010, 16:37)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24586 2010-01-18 16:37:07.000000000 +0200
+++ /tmp/wklog.66.new.24586 2010-01-18 16:37:07.000000000 +0200
@@ -4,6 +4,11 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to fill the temptable
+-------------------------
+Can reuse approach from SJ-Materialization. Its code is in end_sj_materialize()
+and is supposed to be quite trivial.
+
How to make lookups into temptable
----------------------------------
We'll reuse approach used by SJ-Materialization in 6.0.
-=-=(Psergey - Mon, 18 Jan 2010, 16:34)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24328 2010-01-18 16:34:19.000000000 +0200
+++ /tmp/wklog.66.new.24328 2010-01-18 16:34:19.000000000 +0200
@@ -32,8 +32,8 @@
Question: or perhaps that is not necessarry?
</questionable>
-Execution process
-~~~~~~~~~~~~~~~~~
+Doing the lookup
+~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
@@ -42,4 +42,12 @@
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+The code in this WL will use the same approach
+Extracting the value of the subquery predicate
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The goal of making the lookup is to get the value of subquery predicate.
+This is done by creating an Item_field $I which refers to appropriate
+temporary table's field and then subquery_predicate->val_int() will invoke
+$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str() and so
+forth.
-=-=(Psergey - Mon, 18 Jan 2010, 16:23)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23203 2010-01-18 16:23:18.000000000 +0200
+++ /tmp/wklog.66.new.23203 2010-01-18 16:23:18.000000000 +0200
@@ -31,3 +31,15 @@
Question: or perhaps that is not necessarry?
</questionable>
+
+Execution process
+~~~~~~~~~~~~~~~~~
+SJ-Materialization does lookup in sub_select_sjm(), with this code:
+
+ /* Do index lookup in the materialized table */
+ if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
+ DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
+ if (res || !sjm->in_equality->val_int())
+ DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+
+
-=-=(Psergey - Mon, 18 Jan 2010, 16:22)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23076 2010-01-18 16:22:07.000000000 +0200
+++ /tmp/wklog.66.new.23076 2010-01-18 16:22:07.000000000 +0200
@@ -4,3 +4,30 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to make lookups into temptable
+----------------------------------
+We'll reuse approach used by SJ-Materialization in 6.0.
+
+Setup process
+~~~~~~~~~~~~~
+Setup is performed in the same way as in setup_sj_materialization(),
+see the code that starts these lines:
+
+ /*
+ Create/initialize everything we will need to index lookups into the
+ temptable.
+ */
+
+and ends at this line:
+
+ Remove the injected semi-join IN-equalities from join_tab conds. This
+
+<questionable>
+We'll also need to check equalities, i.e. do an equivalent of this:
+
+ if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
+ emb_sj_nest->sj_subq_pred)))
+ DBUG_RETURN(TRUE); /* purecov: inspected */
+
+Question: or perhaps that is not necessarry?
+</questionable>
-=-=(Psergey - Tue, 12 Jan 2010, 18:39)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.31666 2010-01-12 18:39:43.000000000 +0200
+++ /tmp/wklog.66.new.31666 2010-01-12 18:39:43.000000000 +0200
@@ -4,3 +4,99 @@
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
+
+
+<contents>
+1. Scope of the task
+2. Data structure used for the cache
+3. Cache size
+4. Interplay with other subquery optimizations
+5. User interface
+</contents>
+
+1. Scope of the task
+--------------------
+This WL should handle all subquery predicates, i.e. it should handle these
+cases:
+
+ outer_expr IN (SELECT correlated_select)
+ outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
+ EXISTS (SELECT correlated_select)
+ scalar-context subquery: (SELECT correlated_select)
+
+The cache will maintain
+
+ (outer_expr, correlation_references)-> subquery_item_result
+
+mapping, where
+- correlation_references is a list of tablename.column_name that are referred
+ from the correlated_select but tablename is a table that is ouside the
+ subquery.
+- subquery_item_result is 'bool' for subquery predicates, and is of
+some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
+
+We dont support cases when outer_expr or correlation_references are blobs.
+
+2. Data structure used for the cache
+------------------------------------
+There are two data structures available in the codebase that will allow fast
+equality lookups:
+
+1. HASH (mysys/hash.c) tables
+2. Temporary tables (the ones that are used for e.g. GROUP BY)
+
+None of them has any support for element eviction on overflow (using LRU or
+some other policy).
+
+Query cache and MyISAM/Maria's key/page cache ought to support some eviction
+mechanism, but code-wise it is not readily reusable, one will need to factor
+it out (or copy it).
+
+We choose to use #2, and not to have any eviction policy. See subsequent
+sections for details and reasoning behind the decision.
+
+3. Cache size
+-------------
+Typically, a cache has some maximum size and a policy which is used to
+select a cache entry for removal when the cache becomes full (e.g. find
+and remove the least [recently] used entry)
+
+For this WL entry we will use a cache of infinite size. The reasoning behind
+this is that:
+- is is easy to do: we have temporary tables that can grow to arbitrarily
+ large size while still providing the same insert/lookup interface.
+- it suits us: unless the subquery is resolved with one index lookup,
+ hitting the cache would be many times cheaper than re-running the
+ subquery, so cache is worth having.
+
+4. Interplay with other subquery optimizations
+----------------------------------------------
+* This WL entry should not care about IN->EXISTS transformation: caching for
+ IN subquery and result of its conversion to EXISTS would work in the same
+ way.
+
+* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
+ work/be useful irrespectively of whether the rewrite has been performed or
+ not)
+
+* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
+ this basically the same?
+ A: no, it is not:
+ - IN-Materialization has to perform full materialization before it can
+ do the first subquery evaluation. This WL's code has almost no startup
+ costs.
+ - This optimization has temp.table of (corr_reference, predicate_value),
+ while IN-materialization will have (corr_reference) only.
+
+5. User interface
+-----------------
+* There will be an @@optimizer_switch flag to turn this optimization on and
+ off (TODO: name of the flag?)
+
+* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
+ print something in the warning text of EXPLAIN EXTEDED that would indicate
+ use of cache.
+
+* temporary table sizing (max size for heap table, whether to use MyISAM or
+ Maria) will be controlled with common temp.table control variables.
+
-=-=(Psergey - Mon, 11 Jan 2010, 13:25)=-=-
As of today, there is code that
- collects outside references
- creates a temporary table with index that would allow for fast lookups.
there is no code to
- fill the temporary table
- make lookups into it
Reported zero hours worked. Estimate unchanged.
-=-=(Sanja - Fri, 11 Dec 2009, 15:09)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.28164 2009-12-11 15:09:15.000000000 +0200
+++ /tmp/wklog.66.new.28164 2009-12-11 15:09:15.000000000 +0200
@@ -3,4 +3,4 @@
To check/discuss:
-Are there sens to put subquery cache on all levels of subqueries of on highest.
+ To put subquery cache on all levels of subqueries or on highest level only.
------------------------------------------------------------
-=-=(View All Progress Notes, 12 total)=-=-
http://askmonty.org/worklog/index.pl?tid=66&nolimit=1
DESCRIPTION:
Collect all outer items/references (left part of the subquiery and outer
references inside the subquery) in key string. Compare the string (which
represents certain value set of the references) against values in hash table and
return cached result of subquery if the reference values combination has already
been used.
For example in the following subquery:
(L1, L2) IN (SELECT A, B FROM T WHERE T.F1>OTER_FIELD)
set of references to look into the subquery cache is (L1, L2, OTER_FIELD).
The subquery cache should be implemented as simple LRU connected to the subquery.
Size of the subquery cache (in number of results (but maybe in used memory
amount)) is limited by session variable (query parameter?).
HIGH-LEVEL SPECIFICATION:
Attach subquery cache to each Item_subquery. Interface should allow to use hash
or temporary table inside.
To check/discuss:
-----------------
* Will there be any means to measure subquery cache hit rate?
* MySQL-6.0 has a one-element predicate result cache. It is called "left
expression cache", grep for left_expr_cache in sql/item_subselect.*
When this WL is merged with 6.0's optimizations, these two caches will
need to be unified somehow.
<contents>
1. Scope of the task
2. Data structure used for the cache
3. Cache size
4. Interplay with other subquery optimizations
5. User interface
</contents>
1. Scope of the task
--------------------
This WL should handle all subquery predicates, i.e. it should handle these
cases:
outer_expr IN (SELECT correlated_select)
outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
EXISTS (SELECT correlated_select)
scalar-context subquery: (SELECT correlated_select)
The cache will maintain
(outer_expr, correlation_references)-> subquery_item_result
mapping, where
- correlation_references is a list of tablename.column_name that are referred
from the correlated_select but tablename is a table that is ouside the
subquery.
- subquery_item_result is 'bool' for subquery predicates, and is of
some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
We don't support cases when outer_expr or correlation_references are blobs.
All subquery predicates are cached. That is, if one subquery predicate is
located within another, both of them will have caches (one option to reduce
cache memory usage was to use cache only for the upper-most select. we decided
against it).
2. Data structure used for the cache
------------------------------------
There are two data structures available in the codebase that will allow fast
equality lookups:
1. HASH (mysys/hash.c) tables
2. Temporary tables (the ones that are used for e.g. GROUP BY)
None of them has any support for element eviction on overflow (using LRU or
some other policy).
Query cache and MyISAM/Maria's key/page cache ought to support some eviction
mechanism, but code-wise it is not readily reusable, one will need to factor
it out (or copy it).
We choose to use #2, and not to have any eviction policy. See subsequent
sections for details and reasoning behind the decision.
3. Cache size
-------------
Typically, a cache has some maximum size and a policy which is used to
select a cache entry for removal when the cache becomes full (e.g. find
and remove the least [recently] used entry)
For this WL entry we will use a cache of infinite size. The reasoning behind
this is that:
- is is easy to do: we have temporary tables that can grow to arbitrarily
large size while still providing the same insert/lookup interface.
- it suits us: unless the subquery is resolved with one index lookup,
hitting the cache would be many times cheaper than re-running the
subquery, so cache is worth having.
4. Interplay with other subquery optimizations
----------------------------------------------
* This WL entry should not care about IN->EXISTS transformation: caching for
IN subquery and result of its conversion to EXISTS would work in the same
way.
* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
work/be useful irrespectively of whether the rewrite has been performed or
not)
* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
this basically the same?
A: no, it is not:
- IN-Materialization has to perform full materialization before it can
do the first subquery evaluation. This WL's code has almost no startup
costs.
- This optimization has temp.table of (corr_reference, predicate_value),
while IN-materialization will have (corr_reference) only.
5. User interface
-----------------
* There will be an @@optimizer_switch flag to turn this optimization on and
off (TODO: name of the flag?)
* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
print something in the warning text of EXPLAIN EXTEDED that would indicate
use of cache.
* temporary table sizing (max size for heap table, whether to use MyISAM or
Maria) will be controlled with common temp.table control variables.
LOW-LEVEL DESIGN:
* Target version: base on mysql-5.2 code
All items on which subquery depend could be collected in
st_select_lex::mark_as_dependent (direct of indirect reference?)
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
How to fill the temptable
-------------------------
Can reuse approach from SJ-Materialization. Its code is in end_sj_materialize()
and is supposed to be quite trivial.
How to make lookups into temptable
----------------------------------
We'll reuse approach used by SJ-Materialization in 6.0.
Setup process
~~~~~~~~~~~~~
Setup is performed in the same way as in setup_sj_materialization(),
see the code that starts these lines:
/*
Create/initialize everything we will need to index lookups into the
temptable.
*/
and ends at this line:
Remove the injected semi-join IN-equalities from join_tab conds. This
<questionable>
We'll also need to check equalities, i.e. do an equivalent of this:
if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
emb_sj_nest->sj_subq_pred)))
DBUG_RETURN(TRUE); /* purecov: inspected */
Question: or perhaps that is not necessarry?
</questionable>
Doing the lookup
~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
The code in this WL will use the same approach
Extracting the value of the subquery predicate
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The goal of making the lookup is to get the value of the subquery predicate.
This is done by creating an Item_field $I which refers to the appropriate
temporary table field; subquery_predicate->val_int() will then invoke
$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str(), and
so forth.
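The delegation described above can be sketched as follows. This is an
illustrative, hypothetical sketch only (ItemFieldSketch and
SubqueryPredicateSketch are invented stand-ins, not the server's Item classes):
the predicate object holds a reference to $I, an Item_field bound to the temp
table's result column, and simply forwards every val_*() call to it.

```cpp
#include <cassert>
#include <string>

// Plays the role of $I: an Item_field over the temp table's result column.
struct ItemFieldSketch
{
  long int_val;
  std::string str_val;
  long val_int() const { return int_val; }
  std::string val_str() const { return str_val; }
};

// Plays the role of the subquery predicate: every val_*() call is
// forwarded to the bound result field, so after the index lookup has
// positioned the temp table row, reading the predicate value is trivial.
struct SubqueryPredicateSketch
{
  const ItemFieldSketch *result_field;   // $I from the text above
  long val_int() const { return result_field->val_int(); }
  std::string val_str() const { return result_field->val_str(); }
};
```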
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Updated (by Psergey): Subquery optimization: Avoid recalculating subquery if external fields values found in subquery cache (66)
by worklog-noreply@askmonty.org 20 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Avoid recalculating subquery if external fields
values found in subquery cache
CREATION DATE..: Wed, 25 Nov 2009, 22:25
SUPERVISOR.....: Monty
IMPLEMENTOR....: Sanja
COPIES TO......:
CATEGORY.......: Server-BackLog
TASK ID........: 66 (http://askmonty.org/worklog/?tid=66)
VERSION........: Server-5.2
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Wed, 20 Jan 2010, 13:07)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.17649 2010-01-20 13:07:07.000000000 +0200
+++ /tmp/wklog.66.new.17649 2010-01-20 13:07:07.000000000 +0200
@@ -3,7 +3,13 @@
To check/discuss:
- To put subquery cache on all levels of subqueries or on highest level only.
+-----------------
+* Do we put subquery cache on all levels of subqueries or on highest level only
+* Will there be any means to measure subquery cache hit rate?
+* MySQL-6.0 has a one-element predicate result cache. It is called "left
+ expression cache", grep for left_expr_cache in sql/item_subselect.*
+ When this WL is merged with 6.0's optimizations, these two caches will
+ need to be unified somehow.
<contents>
-=-=(Psergey - Mon, 18 Jan 2010, 16:40)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24899 2010-01-18 16:40:16.000000000 +0200
+++ /tmp/wklog.66.new.24899 2010-01-18 16:40:16.000000000 +0200
@@ -1,3 +1,5 @@
+* Target version: base on mysql-5.2 code
+
All items on which subquery depend could be collected in
st_select_lex::mark_as_dependent (direct of indirect reference?)
-=-=(Psergey - Mon, 18 Jan 2010, 16:37)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24586 2010-01-18 16:37:07.000000000 +0200
+++ /tmp/wklog.66.new.24586 2010-01-18 16:37:07.000000000 +0200
@@ -4,6 +4,11 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to fill the temptable
+-------------------------
+Can reuse approach from SJ-Materialization. Its code is in end_sj_materialize()
+and is supposed to be quite trivial.
+
How to make lookups into temptable
----------------------------------
We'll reuse approach used by SJ-Materialization in 6.0.
-=-=(Psergey - Mon, 18 Jan 2010, 16:34)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24328 2010-01-18 16:34:19.000000000 +0200
+++ /tmp/wklog.66.new.24328 2010-01-18 16:34:19.000000000 +0200
@@ -32,8 +32,8 @@
Question: or perhaps that is not necessarry?
</questionable>
-Execution process
-~~~~~~~~~~~~~~~~~
+Doing the lookup
+~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
@@ -42,4 +42,12 @@
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+The code in this WL will use the same approach
+Extracting the value of the subquery predicate
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The goal of making the lookup is to get the value of subquery predicate.
+This is done by creating an Item_field $I which refers to appropriate
+temporary table's field and then subquery_predicate->val_int() will invoke
+$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str() and so
+forth.
-=-=(Psergey - Mon, 18 Jan 2010, 16:23)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23203 2010-01-18 16:23:18.000000000 +0200
+++ /tmp/wklog.66.new.23203 2010-01-18 16:23:18.000000000 +0200
@@ -31,3 +31,15 @@
Question: or perhaps that is not necessarry?
</questionable>
+
+Execution process
+~~~~~~~~~~~~~~~~~
+SJ-Materialization does lookup in sub_select_sjm(), with this code:
+
+ /* Do index lookup in the materialized table */
+ if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
+ DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
+ if (res || !sjm->in_equality->val_int())
+ DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+
+
-=-=(Psergey - Mon, 18 Jan 2010, 16:22)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23076 2010-01-18 16:22:07.000000000 +0200
+++ /tmp/wklog.66.new.23076 2010-01-18 16:22:07.000000000 +0200
@@ -4,3 +4,30 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to make lookups into temptable
+----------------------------------
+We'll reuse approach used by SJ-Materialization in 6.0.
+
+Setup process
+~~~~~~~~~~~~~
+Setup is performed in the same way as in setup_sj_materialization(),
+see the code that starts these lines:
+
+ /*
+ Create/initialize everything we will need to index lookups into the
+ temptable.
+ */
+
+and ends at this line:
+
+ Remove the injected semi-join IN-equalities from join_tab conds. This
+
+<questionable>
+We'll also need to check equalities, i.e. do an equivalent of this:
+
+ if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
+ emb_sj_nest->sj_subq_pred)))
+ DBUG_RETURN(TRUE); /* purecov: inspected */
+
+Question: or perhaps that is not necessarry?
+</questionable>
-=-=(Psergey - Tue, 12 Jan 2010, 18:39)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.31666 2010-01-12 18:39:43.000000000 +0200
+++ /tmp/wklog.66.new.31666 2010-01-12 18:39:43.000000000 +0200
@@ -4,3 +4,99 @@
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
+
+
+<contents>
+1. Scope of the task
+2. Data structure used for the cache
+3. Cache size
+4. Interplay with other subquery optimizations
+5. User interface
+</contents>
+
+1. Scope of the task
+--------------------
+This WL should handle all subquery predicates, i.e. it should handle these
+cases:
+
+ outer_expr IN (SELECT correlated_select)
+ outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
+ EXISTS (SELECT correlated_select)
+ scalar-context subquery: (SELECT correlated_select)
+
+The cache will maintain
+
+ (outer_expr, correlation_references)-> subquery_item_result
+
+mapping, where
+- correlation_references is a list of tablename.column_name that are referred
+ from the correlated_select but tablename is a table that is ouside the
+ subquery.
+- subquery_item_result is 'bool' for subquery predicates, and is of
+some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
+
+We dont support cases when outer_expr or correlation_references are blobs.
+
+2. Data structure used for the cache
+------------------------------------
+There are two data structures available in the codebase that will allow fast
+equality lookups:
+
+1. HASH (mysys/hash.c) tables
+2. Temporary tables (the ones that are used for e.g. GROUP BY)
+
+None of them has any support for element eviction on overflow (using LRU or
+some other policy).
+
+Query cache and MyISAM/Maria's key/page cache ought to support some eviction
+mechanism, but code-wise it is not readily reusable, one will need to factor
+it out (or copy it).
+
+We choose to use #2, and not to have any eviction policy. See subsequent
+sections for details and reasoning behind the decision.
+
+3. Cache size
+-------------
+Typically, a cache has some maximum size and a policy which is used to
+select a cache entry for removal when the cache becomes full (e.g. find
+and remove the least [recently] used entry)
+
+For this WL entry we will use a cache of infinite size. The reasoning behind
+this is that:
+- is is easy to do: we have temporary tables that can grow to arbitrarily
+ large size while still providing the same insert/lookup interface.
+- it suits us: unless the subquery is resolved with one index lookup,
+ hitting the cache would be many times cheaper than re-running the
+ subquery, so cache is worth having.
+
+4. Interplay with other subquery optimizations
+----------------------------------------------
+* This WL entry should not care about IN->EXISTS transformation: caching for
+ IN subquery and result of its conversion to EXISTS would work in the same
+ way.
+
+* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
+ work/be useful irrespectively of whether the rewrite has been performed or
+ not)
+
+* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
+ this basically the same?
+ A: no, it is not:
+ - IN-Materialization has to perform full materialization before it can
+ do the first subquery evaluation. This WL's code has almost no startup
+ costs.
+ - This optimization has temp.table of (corr_reference, predicate_value),
+ while IN-materialization will have (corr_reference) only.
+
+5. User interface
+-----------------
+* There will be an @@optimizer_switch flag to turn this optimization on and
+ off (TODO: name of the flag?)
+
+* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
+ print something in the warning text of EXPLAIN EXTEDED that would indicate
+ use of cache.
+
+* temporary table sizing (max size for heap table, whether to use MyISAM or
+ Maria) will be controlled with common temp.table control variables.
+
-=-=(Psergey - Mon, 11 Jan 2010, 13:25)=-=-
As of today, there is code that
- collects outside references
- creates a temporary table with an index that would allow for fast lookups.
There is no code to
- fill the temporary table
- make lookups into it
Reported zero hours worked. Estimate unchanged.
-=-=(Sanja - Fri, 11 Dec 2009, 15:09)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.28164 2009-12-11 15:09:15.000000000 +0200
+++ /tmp/wklog.66.new.28164 2009-12-11 15:09:15.000000000 +0200
@@ -3,4 +3,4 @@
To check/discuss:
-Are there sens to put subquery cache on all levels of subqueries of on highest.
+ To put subquery cache on all levels of subqueries or on highest level only.
-=-=(Sanja - Fri, 11 Dec 2009, 15:08)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.28072 2009-12-11 15:08:16.000000000 +0200
+++ /tmp/wklog.66.new.28072 2009-12-11 15:08:16.000000000 +0200
@@ -1 +1,6 @@
+All items on which subquery depend could be collected in
+st_select_lex::mark_as_dependent (direct of indirect reference?)
+
+Temporary table index should be created by all fields except result field
+(TMP_TABLE_PARAM::keyinfo).
------------------------------------------------------------
-=-=(View All Progress Notes, 11 total)=-=-
http://askmonty.org/worklog/index.pl?tid=66&nolimit=1
DESCRIPTION:
Collect all outer items/references (the left part of the subquery and the
outer references inside the subquery) into a key string. Compare the string
(which represents a certain value set of the references) against the values
in a hash table, and return the cached result of the subquery if that
combination of reference values has already been used.
For example in the following subquery:
(L1, L2) IN (SELECT A, B FROM T WHERE T.F1>OUTER_FIELD)
the set of references to look up in the subquery cache is (L1, L2, OUTER_FIELD).
The subquery cache should be implemented as a simple LRU cache attached to the
subquery. The size of the subquery cache (in number of results, or perhaps in
the amount of memory used) is limited by a session variable (query parameter?).
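As an illustration of the intended behavior, here is a minimal, hypothetical
sketch of such a cache. All names (run_subquery, eval_predicate_cached) are
invented for the example, the stand-in predicate is arbitrary, and a std::map
is used where the real design would use a hash or temporary table; the point
is only that the key is the value tuple of the outer references and a hit
skips re-executing the subquery:

```cpp
#include <cassert>
#include <map>
#include <tuple>

// Key: the values of the outer references, e.g. (L1, L2, outer_field).
typedef std::tuple<long, long, long> OuterRefKey;

static std::map<OuterRefKey, bool> subquery_cache;
static int subquery_runs= 0;

// Stand-in for actually executing the correlated subquery.
static bool run_subquery(long l1, long l2, long outer_field)
{
  subquery_runs++;                      // count real executions
  return l1 + l2 > outer_field;         // arbitrary stand-in predicate
}

static bool eval_predicate_cached(long l1, long l2, long outer_field)
{
  OuterRefKey key(l1, l2, outer_field);
  std::map<OuterRefKey, bool>::iterator it= subquery_cache.find(key);
  if (it != subquery_cache.end())
    return it->second;                  // cache hit: no re-execution
  bool res= run_subquery(l1, l2, outer_field);
  subquery_cache[key]= res;             // cache miss: store the result
  return res;
}
```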
HIGH-LEVEL SPECIFICATION:
Attach a subquery cache to each Item_subquery. The interface should allow
using either a hash table or a temporary table inside.
To check/discuss:
-----------------
* Do we put subquery cache on all levels of subqueries or on highest level only
* Will there be any means to measure subquery cache hit rate?
* MySQL-6.0 has a one-element predicate result cache. It is called "left
expression cache", grep for left_expr_cache in sql/item_subselect.*
When this WL is merged with 6.0's optimizations, these two caches will
need to be unified somehow.
<contents>
1. Scope of the task
2. Data structure used for the cache
3. Cache size
4. Interplay with other subquery optimizations
5. User interface
</contents>
1. Scope of the task
--------------------
This WL should handle all subquery predicates, i.e. it should handle these
cases:
outer_expr IN (SELECT correlated_select)
outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
EXISTS (SELECT correlated_select)
scalar-context subquery: (SELECT correlated_select)
The cache will maintain
(outer_expr, correlation_references)-> subquery_item_result
mapping, where
- correlation_references is a list of tablename.column_name that are referred
  to from the correlated_select, where tablename is a table outside the
  subquery.
- subquery_item_result is 'bool' for subquery predicates, and is of some
  scalar or ROW(scalar1,...scalarN) type for a scalar-context subquery.
We don't support cases where outer_expr or correlation_references are blobs.
2. Data structure used for the cache
------------------------------------
There are two data structures available in the codebase that will allow fast
equality lookups:
1. HASH (mysys/hash.c) tables
2. Temporary tables (the ones that are used for e.g. GROUP BY)
None of them has any support for element eviction on overflow (using LRU or
some other policy).
The query cache and MyISAM/Maria's key/page cache ought to support some
eviction mechanism, but code-wise it is not readily reusable; one would need
to factor it out (or copy it).
We choose to use #2, and not to have any eviction policy. See subsequent
sections for details and reasoning behind the decision.
3. Cache size
-------------
Typically, a cache has some maximum size and a policy used to select a cache
entry for removal when the cache becomes full (e.g. find and remove the least
[recently] used entry).
For this WL entry we will use a cache of infinite size. The reasoning behind
this is that:
- it is easy to do: we have temporary tables that can grow to an arbitrarily
  large size while still providing the same insert/lookup interface.
- it suits us: unless the subquery is resolved with one index lookup, hitting
  the cache will be many times cheaper than re-running the subquery, so the
  cache is worth having.
4. Interplay with other subquery optimizations
----------------------------------------------
* This WL entry should not care about IN->EXISTS transformation: caching for
IN subquery and result of its conversion to EXISTS would work in the same
way.
* This optimization is orthogonal to the <=>ANY -> MIN/MAX rewrite (it will
  work/be useful irrespective of whether the rewrite has been performed or
  not).
* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
this basically the same?
A: no, it is not:
- IN-Materialization has to perform full materialization before it can
do the first subquery evaluation. This WL's code has almost no startup
costs.
- This optimization has temp.table of (corr_reference, predicate_value),
while IN-materialization will have (corr_reference) only.
5. User interface
-----------------
* There will be an @@optimizer_switch flag to turn this optimization on and
off (TODO: name of the flag?)
* TODO: how do we show this in EXPLAIN [EXTENDED]? The easiest is to print
  something in the warning text of EXPLAIN EXTENDED that would indicate use
  of the cache.
* temporary table sizing (max size for heap table, whether to use MyISAM or
Maria) will be controlled with common temp.table control variables.
LOW-LEVEL DESIGN:
* Target version: base on mysql-5.2 code
All items on which the subquery depends could be collected in
st_select_lex::mark_as_dependent (direct or indirect reference?)
The temporary table index should be created over all fields except the result
field (TMP_TABLE_PARAM::keyinfo).
How to fill the temptable
-------------------------
We can reuse the approach from SJ-Materialization. Its code is in
end_sj_materialize() and is supposed to be quite trivial.
How to make lookups into temptable
----------------------------------
We'll reuse the approach used by SJ-Materialization in 6.0.
Setup process
~~~~~~~~~~~~~
Setup is performed in the same way as in setup_sj_materialization(),
see the code that starts with these lines:
/*
Create/initialize everything we will need to index lookups into the
temptable.
*/
and ends at this line:
Remove the injected semi-join IN-equalities from join_tab conds. This
<questionable>
We'll also need to check equalities, i.e. do an equivalent of this:
if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
emb_sj_nest->sj_subq_pred)))
DBUG_RETURN(TRUE); /* purecov: inspected */
Question: or perhaps that is not necessary?
</questionable>
Doing the lookup
~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
The code in this WL will use the same approach.
Extracting the value of the subquery predicate
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The goal of making the lookup is to get the value of the subquery predicate.
This is done by creating an Item_field $I which refers to the appropriate
temporary table field; subquery_predicate->val_int() will then invoke
$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str(), and
so forth.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
[Maria-developers] Updated (by Psergey): Subquery optimization: Avoid recalculating subquery if external fields values found in subquery cache (66)
by worklog-noreply@askmonty.org 20 Jan '10
by worklog-noreply@askmonty.org 20 Jan '10
20 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Avoid recalculating subquery if external fields
values found in subquery cache
CREATION DATE..: Wed, 25 Nov 2009, 22:25
SUPERVISOR.....: Monty
IMPLEMENTOR....: Sanja
COPIES TO......:
CATEGORY.......: Server-BackLog
TASK ID........: 66 (http://askmonty.org/worklog/?tid=66)
VERSION........: Server-5.2
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Wed, 20 Jan 2010, 13:07)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.17649 2010-01-20 13:07:07.000000000 +0200
+++ /tmp/wklog.66.new.17649 2010-01-20 13:07:07.000000000 +0200
@@ -3,7 +3,13 @@
To check/discuss:
- To put subquery cache on all levels of subqueries or on highest level only.
+-----------------
+* Do we put subquery cache on all levels of subqueries or on highest level only
+* Will there be any means to measure subquery cache hit rate?
+* MySQL-6.0 has a one-element predicate result cache. It is called "left
+ expression cache", grep for left_expr_cache in sql/item_subselect.*
+ When this WL is merged with 6.0's optimizations, these two caches will
+ need to be unified somehow.
<contents>
-=-=(Psergey - Mon, 18 Jan 2010, 16:40)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24899 2010-01-18 16:40:16.000000000 +0200
+++ /tmp/wklog.66.new.24899 2010-01-18 16:40:16.000000000 +0200
@@ -1,3 +1,5 @@
+* Target version: base on mysql-5.2 code
+
All items on which subquery depend could be collected in
st_select_lex::mark_as_dependent (direct of indirect reference?)
-=-=(Psergey - Mon, 18 Jan 2010, 16:37)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24586 2010-01-18 16:37:07.000000000 +0200
+++ /tmp/wklog.66.new.24586 2010-01-18 16:37:07.000000000 +0200
@@ -4,6 +4,11 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to fill the temptable
+-------------------------
+Can reuse approach from SJ-Materialization. Its code is in end_sj_materialize()
+and is supposed to be quite trivial.
+
How to make lookups into temptable
----------------------------------
We'll reuse approach used by SJ-Materialization in 6.0.
-=-=(Psergey - Mon, 18 Jan 2010, 16:34)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24328 2010-01-18 16:34:19.000000000 +0200
+++ /tmp/wklog.66.new.24328 2010-01-18 16:34:19.000000000 +0200
@@ -32,8 +32,8 @@
Question: or perhaps that is not necessarry?
</questionable>
-Execution process
-~~~~~~~~~~~~~~~~~
+Doing the lookup
+~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
@@ -42,4 +42,12 @@
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+The code in this WL will use the same approach
+Extracting the value of the subquery predicate
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The goal of making the lookup is to get the value of subquery predicate.
+This is done by creating an Item_field $I which refers to appropriate
+temporary table's field and then subquery_predicate->val_int() will invoke
+$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str() and so
+forth.
-=-=(Psergey - Mon, 18 Jan 2010, 16:23)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23203 2010-01-18 16:23:18.000000000 +0200
+++ /tmp/wklog.66.new.23203 2010-01-18 16:23:18.000000000 +0200
@@ -31,3 +31,15 @@
Question: or perhaps that is not necessarry?
</questionable>
+
+Execution process
+~~~~~~~~~~~~~~~~~
+SJ-Materialization does lookup in sub_select_sjm(), with this code:
+
+ /* Do index lookup in the materialized table */
+ if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
+ DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
+ if (res || !sjm->in_equality->val_int())
+ DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+
+
-=-=(Psergey - Mon, 18 Jan 2010, 16:22)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23076 2010-01-18 16:22:07.000000000 +0200
+++ /tmp/wklog.66.new.23076 2010-01-18 16:22:07.000000000 +0200
@@ -4,3 +4,30 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to make lookups into temptable
+----------------------------------
+We'll reuse approach used by SJ-Materialization in 6.0.
+
+Setup process
+~~~~~~~~~~~~~
+Setup is performed in the same way as in setup_sj_materialization(),
+see the code that starts these lines:
+
+ /*
+ Create/initialize everything we will need to index lookups into the
+ temptable.
+ */
+
+and ends at this line:
+
+ Remove the injected semi-join IN-equalities from join_tab conds. This
+
+<questionable>
+We'll also need to check equalities, i.e. do an equivalent of this:
+
+ if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
+ emb_sj_nest->sj_subq_pred)))
+ DBUG_RETURN(TRUE); /* purecov: inspected */
+
+Question: or perhaps that is not necessarry?
+</questionable>
-=-=(Psergey - Tue, 12 Jan 2010, 18:39)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.31666 2010-01-12 18:39:43.000000000 +0200
+++ /tmp/wklog.66.new.31666 2010-01-12 18:39:43.000000000 +0200
@@ -4,3 +4,99 @@
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
+
+
+<contents>
+1. Scope of the task
+2. Data structure used for the cache
+3. Cache size
+4. Interplay with other subquery optimizations
+5. User interface
+</contents>
+
+1. Scope of the task
+--------------------
+This WL should handle all subquery predicates, i.e. it should handle these
+cases:
+
+ outer_expr IN (SELECT correlated_select)
+ outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
+ EXISTS (SELECT correlated_select)
+ scalar-context subquery: (SELECT correlated_select)
+
+The cache will maintain
+
+ (outer_expr, correlation_references)-> subquery_item_result
+
+mapping, where
+- correlation_references is a list of tablename.column_name that are referred
+ from the correlated_select but tablename is a table that is ouside the
+ subquery.
+- subquery_item_result is 'bool' for subquery predicates, and is of
+some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
+
+We dont support cases when outer_expr or correlation_references are blobs.
+
+2. Data structure used for the cache
+------------------------------------
+There are two data structures available in the codebase that will allow fast
+equality lookups:
+
+1. HASH (mysys/hash.c) tables
+2. Temporary tables (the ones that are used for e.g. GROUP BY)
+
+None of them has any support for element eviction on overflow (using LRU or
+some other policy).
+
+Query cache and MyISAM/Maria's key/page cache ought to support some eviction
+mechanism, but code-wise it is not readily reusable, one will need to factor
+it out (or copy it).
+
+We choose to use #2, and not to have any eviction policy. See subsequent
+sections for details and reasoning behind the decision.
+
+3. Cache size
+-------------
+Typically, a cache has some maximum size and a policy which is used to
+select a cache entry for removal when the cache becomes full (e.g. find
+and remove the least [recently] used entry)
+
+For this WL entry we will use a cache of infinite size. The reasoning behind
+this is that:
+- is is easy to do: we have temporary tables that can grow to arbitrarily
+ large size while still providing the same insert/lookup interface.
+- it suits us: unless the subquery is resolved with one index lookup,
+ hitting the cache would be many times cheaper than re-running the
+ subquery, so cache is worth having.
+
+4. Interplay with other subquery optimizations
+----------------------------------------------
+* This WL entry should not care about IN->EXISTS transformation: caching for
+ IN subquery and result of its conversion to EXISTS would work in the same
+ way.
+
+* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
+ work/be useful irrespectively of whether the rewrite has been performed or
+ not)
+
+* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
+ this basically the same?
+ A: no, it is not:
+ - IN-Materialization has to perform full materialization before it can
+ do the first subquery evaluation. This WL's code has almost no startup
+ costs.
+ - This optimization has temp.table of (corr_reference, predicate_value),
+ while IN-materialization will have (corr_reference) only.
+
+5. User interface
+-----------------
+* There will be an @@optimizer_switch flag to turn this optimization on and
+ off (TODO: name of the flag?)
+
+* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
+ print something in the warning text of EXPLAIN EXTEDED that would indicate
+ use of cache.
+
+* temporary table sizing (max size for heap table, whether to use MyISAM or
+ Maria) will be controlled with common temp.table control variables.
+
-=-=(Psergey - Mon, 11 Jan 2010, 13:25)=-=-
As of today, there is code that
- collects outside references
- creates a temporary table with index that would allow for fast lookups.
there is no code to
- fill the temporary table
- make lookups into it
Reported zero hours worked. Estimate unchanged.
-=-=(Sanja - Fri, 11 Dec 2009, 15:09)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.28164 2009-12-11 15:09:15.000000000 +0200
+++ /tmp/wklog.66.new.28164 2009-12-11 15:09:15.000000000 +0200
@@ -3,4 +3,4 @@
To check/discuss:
-Are there sens to put subquery cache on all levels of subqueries of on highest.
+ To put subquery cache on all levels of subqueries or on highest level only.
-=-=(Sanja - Fri, 11 Dec 2009, 15:08)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.28072 2009-12-11 15:08:16.000000000 +0200
+++ /tmp/wklog.66.new.28072 2009-12-11 15:08:16.000000000 +0200
@@ -1 +1,6 @@
+All items on which subquery depend could be collected in
+st_select_lex::mark_as_dependent (direct of indirect reference?)
+
+Temporary table index should be created by all fields except result field
+(TMP_TABLE_PARAM::keyinfo).
------------------------------------------------------------
-=-=(View All Progress Notes, 11 total)=-=-
http://askmonty.org/worklog/index.pl?tid=66&nolimit=1
DESCRIPTION:
Collect all outer items/references (left part of the subquiery and outer
references inside the subquery) in key string. Compare the string (which
represents certain value set of the references) against values in hash table and
return cached result of subquery if the reference values combination has already
been used.
For example in the following subquery:
(L1, L2) IN (SELECT A, B FROM T WHERE T.F1>OTER_FIELD)
set of references to look into the subquery cache is (L1, L2, OTER_FIELD).
The subquery cache should be implemented as simple LRU connected to the subquery.
Size of the subquery cache (in number of results (but maybe in used memory
amount)) is limited by session variable (query parameter?).
HIGH-LEVEL SPECIFICATION:
Attach subquery cache to each Item_subquery. Interface should allow to use hash
or temporary table inside.
To check/discuss:
-----------------
* Do we put subquery cache on all levels of subqueries or on highest level only
* Will there be any means to measure subquery cache hit rate?
* MySQL-6.0 has a one-element predicate result cache. It is called "left
expression cache", grep for left_expr_cache in sql/item_subselect.*
When this WL is merged with 6.0's optimizations, these two caches will
need to be unified somehow.
<contents>
1. Scope of the task
2. Data structure used for the cache
3. Cache size
4. Interplay with other subquery optimizations
5. User interface
</contents>
1. Scope of the task
--------------------
This WL should handle all subquery predicates, i.e. it should handle these
cases:
outer_expr IN (SELECT correlated_select)
outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
EXISTS (SELECT correlated_select)
scalar-context subquery: (SELECT correlated_select)
The cache will maintain
(outer_expr, correlation_references)-> subquery_item_result
mapping, where
- correlation_references is a list of tablename.column_name that are referred
from the correlated_select but tablename is a table that is ouside the
subquery.
- subquery_item_result is 'bool' for subquery predicates, and is of
some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
We dont support cases when outer_expr or correlation_references are blobs.
2. Data structure used for the cache
------------------------------------
There are two data structures available in the codebase that will allow fast
equality lookups:
1. HASH (mysys/hash.c) tables
2. Temporary tables (the ones that are used for e.g. GROUP BY)
None of them has any support for element eviction on overflow (using LRU or
some other policy).
Query cache and MyISAM/Maria's key/page cache ought to support some eviction
mechanism, but code-wise it is not readily reusable, one will need to factor
it out (or copy it).
We choose to use #2, and not to have any eviction policy. See subsequent
sections for details and reasoning behind the decision.
3. Cache size
-------------
Typically, a cache has some maximum size and a policy which is used to
select a cache entry for removal when the cache becomes full (e.g. find
and remove the least [recently] used entry)
For this WL entry we will use a cache of infinite size. The reasoning behind
this is that:
- is is easy to do: we have temporary tables that can grow to arbitrarily
large size while still providing the same insert/lookup interface.
- it suits us: unless the subquery is resolved with one index lookup,
hitting the cache would be many times cheaper than re-running the
subquery, so cache is worth having.
4. Interplay with other subquery optimizations
----------------------------------------------
* This WL entry need not care about the IN->EXISTS transformation: caching
for an IN subquery and for the result of its conversion to EXISTS works in
the same way.
* This optimization is orthogonal to the <=>ANY -> MIN/MAX rewrite (it will
work and be useful irrespective of whether that rewrite has been performed).
* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
this basically the same?
A: no, it is not:
- IN-Materialization has to perform full materialization before it can
do the first subquery evaluation. This WL's code has almost no startup
costs.
- This optimization keeps a temp.table of (corr_reference, predicate_value)
pairs, while IN-materialization's table holds (corr_reference) only.
5. User interface
-----------------
* There will be an @@optimizer_switch flag to turn this optimization on and
off (TODO: name of the flag?)
* TODO: how do we show this in EXPLAIN [EXTENDED]? The easiest is to print
something in the warning text of EXPLAIN EXTENDED that would indicate use
of the cache.
* temporary table sizing (max size for heap table, whether to use MyISAM or
Maria) will be controlled with common temp.table control variables.
LOW-LEVEL DESIGN:
* Target version: base on mysql-5.2 code
All items on which the subquery depends can be collected in
st_select_lex::mark_as_dependent (direct or indirect reference?)
The temporary table index should be created over all fields except the
result field (TMP_TABLE_PARAM::keyinfo).
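That key layout can be sketched as follows (a hedged illustration with a
hypothetical helper, not actual TMP_TABLE_PARAM code): a cache row is laid
out as (corr_ref_1, ..., corr_ref_N, result), and the lookup key covers
every field except the trailing result field:

```cpp
#include <string>
#include <vector>

// Hypothetical sketch: given a temp-table row laid out as
// (corr_ref_1, ..., corr_ref_N, result), the lookup key is built
// from all fields except the last (result) one, mirroring the
// keyinfo described above.
std::vector<std::string> make_lookup_key(const std::vector<std::string> &row) {
  return std::vector<std::string>(row.begin(), row.end() - 1);
}
```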
How to fill the temptable
-------------------------
We can reuse the approach from SJ-Materialization. Its code is in
end_sj_materialize() and is quite trivial.
How to make lookups into temptable
----------------------------------
We'll reuse the approach used by SJ-Materialization in 6.0.
Setup process
~~~~~~~~~~~~~
Setup is performed in the same way as in setup_sj_materialization();
see the code that starts with these lines:
/*
Create/initialize everything we will need to index lookups into the
temptable.
*/
and ends at this line:
Remove the injected semi-join IN-equalities from join_tab conds. This
<questionable>
We'll also need to check equalities, i.e. do an equivalent of this:
  if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
                                                    emb_sj_nest->sj_subq_pred)))
    DBUG_RETURN(TRUE); /* purecov: inspected */
Question: or perhaps that is not necessary?
</questionable>
Doing the lookup
~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
  /* Do index lookup in the materialized table */
  if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
    DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
  if (res || !sjm->in_equality->val_int())
    DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
The code in this WL will use the same approach.
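Putting the fill and lookup steps together, the overall flow can be sketched
like this (eval_with_cache and evaluate_subquery are hypothetical stand-ins;
the real code goes through join_read_key2() and end_sj_materialize()-style
filling on the temporary table):

```cpp
#include <map>
#include <vector>

typedef std::vector<long> CorrelationKey;

// Hedged sketch of the evaluation flow: consult the cache first; on a
// miss, run the subquery once and remember its result for this key.
bool eval_with_cache(std::map<CorrelationKey, bool> &cache,
                     const CorrelationKey &key,
                     bool (*evaluate_subquery)(const CorrelationKey &)) {
  std::map<CorrelationKey, bool>::const_iterator it = cache.find(key);
  if (it != cache.end())
    return it->second;                  // cache hit: no re-execution
  bool value = evaluate_subquery(key);  // cache miss: run the subquery
  cache[key] = value;                   // fill the cache for next time
  return value;
}
```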
Extracting the value of the subquery predicate
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The goal of making the lookup is to get the value of the subquery predicate.
This is done by creating an Item_field $I which refers to the appropriate
temporary table field; then subquery_predicate->val_int() will invoke
$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str(), and
so forth.
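That delegation can be sketched as follows (ItemField and
CachedSubqueryPredicate are hypothetical simplifications of Item_field and
the subquery predicate item, not the codebase classes):

```cpp
#include <string>

// Stand-in for the Item_field $I that reads the temp-table result field.
struct ItemField {
  long int_value;
  std::string str_value;
  long val_int() const { return int_value; }
  std::string val_str() const { return str_value; }
};

// Hypothetical predicate item: every val_*() call is forwarded to $I,
// so the cached row's result field is returned without re-evaluation.
struct CachedSubqueryPredicate {
  ItemField *cached;  // points at the field of the fetched cache row
  long val_int() const { return cached->val_int(); }
  std::string val_str() const { return cached->val_str(); }
};
```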
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
[Maria-developers] Rev 2769: Added create options for table, fields and keys (MWL#43). in file:///Users/bell/maria/bzr/work-maria-5.1-create-options/
by sanja@askmonty.org 19 Jan '10
At file:///Users/bell/maria/bzr/work-maria-5.1-create-options/
------------------------------------------------------------
revno: 2769
revision-id: sanja(a)askmonty.org-20091201120747-cvp0uni2rd4iy5xi
parent: bo.thorsen(a)canonical.com-20091126153249-0xfszhcynbym2lvr
committer: sanja(a)askmonty.org
branch nick: work-maria-5.1-create-options
timestamp: Tue 2009-12-01 14:07:47 +0200
message:
Added create options for table, fields and keys (MWL#43).
Diff too large for email (1985 lines, the limit is 1000).