[Maria-developers] A fix for MDEV-5689 ExtractValue(xml, 'substring(/x, /y)') crashes
Hi Sergei, please review a fix for MDEV-5689. It also fixes MDEV-5709 ExtractValue() with XPath variable references returns wrong result. Description: 1. The main problem was that that nodeset_func->fix_fields() was called in Item_func_xml_extractvalue::val_str() and Item_func_xml_update::val_str(), which led in some cases to execution of the XPath engine *before* having a parsed XML value. Moved to Item_xml_str_func::fix_fields(). 2. Cleanup: added a new method Item_xml_str_func::fix_fields() and moved most of the code from Item_xml_str_func::fix_length_and_dec() to Item_xml_str_func::fix_fields(), to follow the usual Item layout. 3. Cleanup: a parsed XML value is useless without the raw XML value it was built from. Previously the parsed and the raw values where stored in separate String instances. It was hard to follow how they are synchronized. Added a helper class XML which contains both parsed and raw values. Makes things easier to read and modify. 4. MDEV-5709: const_item() could incorrectly return a "true" result when XPath expression contains users/SP variable references. Now nodeset_func->const_item() is also taken into account to catch such cases. 5. Minor code enhancements.
Hi, Alexander! On Feb 20, Alexander Barkov wrote:
please review a fix for MDEV-5689.
It also fixes MDEV-5709 ExtractValue() with XPath variable references returns wrong result.
Description:
1. The main problem was that that nodeset_func->fix_fields() was called in Item_func_xml_extractvalue::val_str() and Item_func_xml_update::val_str(), which led in some cases to execution of the XPath engine *before* having a parsed XML value. Moved to Item_xml_str_func::fix_fields().
2. Cleanup: added a new method Item_xml_str_func::fix_fields() and moved most of the code from Item_xml_str_func::fix_length_and_dec() to Item_xml_str_func::fix_fields(), to follow the usual Item layout.
3. Cleanup: a parsed XML value is useless without the raw XML value it was built from.
Previously the parsed and the raw values where stored in separate String instances. It was hard to follow how they are synchronized. Added a helper class XML which contains both parsed and raw values. Makes things easier to read and modify.
4. MDEV-5709: const_item() could incorrectly return a "true" result when XPath expression contains users/SP variable references. Now nodeset_func->const_item() is also taken into account to catch such cases.
5. Minor code enhancements.
Good, please put this in a changeset comment.
=== modified file 'sql/item_xmlfunc.h' --- sql/item_xmlfunc.h 2013-11-28 21:35:59 +0000 +++ sql/item_xmlfunc.h 2014-02-20 13:45:36 +0000 @@ -26,11 +26,55 @@ #endif
+typedef struct my_xml_node_st MY_XML_NODE; + + class Item_xml_str_func: public Item_str_func { protected: - String tmp_value, pxml; + /* + A helper class to store raw and parsed XML. + */ + class XML + { + bool m_cached; + String *m_raw_ptr; // Pointer to text representation + String m_raw_buf; // Cached text representation + String m_parsed_buf; // Array of MY_XML_NODEs, pointing to raw_buffer
How's that an "Array of MY_XML_NODEs", if it's just a String?
+ bool parse();
Ok. I'm confused. In some places it looks like m_parsed_buf is, indeed, an array of MY_XML_NODEs. In other places it looks like it's a string. How comes? Regards, Sergei
Hi Sergei, Thanks for review. Please see my comments inline: On 03/19/2014 12:03 AM, Sergei Golubchik wrote:
Hi, Alexander!
On Feb 20, Alexander Barkov wrote:
please review a fix for MDEV-5689.
It also fixes MDEV-5709 ExtractValue() with XPath variable references returns wrong result.
Description:
1. The main problem was that that nodeset_func->fix_fields() was called in Item_func_xml_extractvalue::val_str() and Item_func_xml_update::val_str(), which led in some cases to execution of the XPath engine *before* having a parsed XML value. Moved to Item_xml_str_func::fix_fields().
2. Cleanup: added a new method Item_xml_str_func::fix_fields() and moved most of the code from Item_xml_str_func::fix_length_and_dec() to Item_xml_str_func::fix_fields(), to follow the usual Item layout.
3. Cleanup: a parsed XML value is useless without the raw XML value it was built from.
Previously the parsed and the raw values where stored in separate String instances. It was hard to follow how they are synchronized. Added a helper class XML which contains both parsed and raw values. Makes things easier to read and modify.
4. MDEV-5709: const_item() could incorrectly return a "true" result when XPath expression contains users/SP variable references. Now nodeset_func->const_item() is also taken into account to catch such cases.
5. Minor code enhancements.
Good, please put this in a changeset comment.
Sure,
=== modified file 'sql/item_xmlfunc.h' --- sql/item_xmlfunc.h 2013-11-28 21:35:59 +0000 +++ sql/item_xmlfunc.h 2014-02-20 13:45:36 +0000 @@ -26,11 +26,55 @@ #endif
+typedef struct my_xml_node_st MY_XML_NODE; + + class Item_xml_str_func: public Item_str_func { protected: - String tmp_value, pxml; + /* + A helper class to store raw and parsed XML. + */ + class XML + { + bool m_cached; + String *m_raw_ptr; // Pointer to text representation + String m_raw_buf; // Cached text representation + String m_parsed_buf; // Array of MY_XML_NODEs, pointing to raw_buffer
How's that an "Array of MY_XML_NODEs", if it's just a String?
+ bool parse();
Ok. I'm confused. In some places it looks like m_parsed_buf is, indeed, an array of MY_XML_NODEs. In other places it looks like it's a string. How comes?
It's always a dynamic array of MY_XML_NODEs, which just uses String as a dynamic storage. It's never used as a "normal" string. There is one exception though: m_parsed_bug.charset() is used to store character set of the entire XML value, to make MY_XPATH aware of the character set of the XML value. It could be stored in a structure of "DYNAMIC_ARRAY + CHARSET_INFO" instead. But that will need a bigger patch.
Regards, Sergei
Hi, Alexander! On Mar 20, Alexander Barkov wrote:
On 03/19/2014 12:03 AM, Sergei Golubchik wrote:
On Feb 20, Alexander Barkov wrote:
please review a fix for MDEV-5689.
It also fixes MDEV-5709 ExtractValue() with XPath variable references returns wrong result.
+ String m_parsed_buf; // Array of MY_XML_NODEs, pointing to raw_buffer
How's that an "Array of MY_XML_NODEs", if it's just a String? Ok. I'm confused. In some places it looks like m_parsed_buf is, indeed, an array of MY_XML_NODEs. In other places it looks like it's a string. How comes?
It's always a dynamic array of MY_XML_NODEs, which just uses String as a dynamic storage. It's never used as a "normal" string. There is one exception though: m_parsed_bug.charset() is used to store character set of the entire XML value, to make MY_XPATH aware of the character set of the XML value.
It could be stored in a structure of "DYNAMIC_ARRAY + CHARSET_INFO" instead. But that will need a bigger patch.
Okay. While it'd be a nice chance, I agree that it's outside of the scope of this bug fix. Regards, Sergei
Hi Sergei, On 03/20/2014 12:04 PM, Sergei Golubchik wrote:
Hi, Alexander!
On Mar 20, Alexander Barkov wrote:
On 03/19/2014 12:03 AM, Sergei Golubchik wrote:
On Feb 20, Alexander Barkov wrote:
please review a fix for MDEV-5689.
It also fixes MDEV-5709 ExtractValue() with XPath variable references returns wrong result.
+ String m_parsed_buf; // Array of MY_XML_NODEs, pointing to raw_buffer
How's that an "Array of MY_XML_NODEs", if it's just a String? Ok. I'm confused. In some places it looks like m_parsed_buf is, indeed, an array of MY_XML_NODEs. In other places it looks like it's a string. How comes?
It's always a dynamic array of MY_XML_NODEs, which just uses String as a dynamic storage. It's never used as a "normal" string. There is one exception though: m_parsed_bug.charset() is used to store character set of the entire XML value, to make MY_XPATH aware of the character set of the XML value.
It could be stored in a structure of "DYNAMIC_ARRAY + CHARSET_INFO" instead. But that will need a bigger patch.
Okay. While it'd be a nice chance, I agree that it's outside of the scope of this bug fix.
I pushed my patch to 10.0. But the reported test case did not crash without the fix. I guess this is because some fix was earlier merged from MySQL to MariaDB-5.5 and then to MariaDB-10.0. Should now a part of this MySQL fix be reverted in MariaDB-10.0?
Regards, Sergei
Hi, Alexander! On Mar 23, Alexander Barkov wrote:
On Feb 20, Alexander Barkov wrote:
please review a fix for MDEV-5689.
It also fixes MDEV-5709 ExtractValue() with XPath variable references returns wrong result.
I pushed my patch to 10.0. But the reported test case did not crash without the fix. I guess this is because some fix was earlier merged from MySQL to MariaDB-5.5 and then to MariaDB-10.0.
Yes, I've merged the fix from MySQL-5.5
Should now a part of this MySQL fix be reverted in MariaDB-10.0?
As you like. If I recall correctly, it marks Item_nodesets as always not const. I don't see it causing any harm, these items are never part of the item tree, so their non-const-ness cannot be propagated up and affect other items and complex expression. But I could be missing something, sure. Regards, Sergei
participants (2)
-
Alexander Barkov
-
Sergei Golubchik