Hello!

Intermediate review was very helpful. It has made things more clear. I worked on it and here is the report for the same.

I committed the result file for insert_parser. Here is the link: https://github.com/rucha174/server/blob/10.4/mysql-test/main/insert_parser.result

I have made changes according to the diff file and also fixed REPLACE...RETURNING and have pushed the latest changes on my repo.
I also ran the tests related to INSERT...RETURNING. They are passing. 
 
Also, I am willing to work on fixing some of the hacks once GSoC project implementation is perfect. 

Regards,
Rucha Deodhar

On Sun, Jul 21, 2019, 12:05 AM Vicențiu Ciorbaru <vicentiu@mariadb.org> wrote:
Hi Rucha!

I've spent quite a bit of time looking at the implementation. It is working and passes almost all test cases, although you forgot to also commit the insert_parser updated result file. Good job so far.

I want to give you something to work on and think about for a bit before we do another overall overview of the whole project. Hopefully this will help you across all your coding endeavours, not just now. What I want you to learn is that it is important to think about how your code can and will evolve in the future. Just getting things to work is ok up to a certain point, but not implementing things in a decoupled fashion is what we call "hacks". These hacks are to be avoided as much as possible as it leads to much more difficult work down the line. Let's elaborate on the topic.

There are a number of problems with our current code (hacks). I am not referring to yours in particular. All these hacks are a bit difficult to overcome in one go. I do not expect you to fix them yourself, not during this GSoC at least, but it is something to consider and if you are willing to work towards fixing some of them, excellent!

Historically our codebase evolved from a simple parser to the "monster" that you see now. Sometimes due to time constraints, things were not implemented in the most extendable way, rather they made use of certain particular constructs, only valid in the contexts they were written in. Keyword here is context, as you'll soon see. Some of this contain things you have interacted with so far in your project. Here is an example that you've had to deal with:

A SELECT statement has what we call a SELECT LIST, this list is traditionally stored in SELECT_LEX::item_list. This item_list however is overloaded at times. For example DELETE ... RETURNING uses it to store the expressions part of the RETURNING clause. This overloading generally works ok, unless you run into situations like you did with your INSERT ... RETURNING project.

INSERT ... RETURNING clauses already made use of the item_list to store something else. Now you were forced to introduce another list in SELECT_LEX, which we called returning_list. Everything ok so far. But then, you ran into another problem, the bison grammar rules do not put Items in the list you want. They always used item_list, so you could not use your returning_list. So what did we do then? Well, we came up with another hack! We suggested this hack to you because it is something that will work quickly and get you unstuck and that is to use the same grammar rules like before, but swap the item_list with the returning list temporarily, such that the grammar rules will put the right items in the right list. This works, but it masks the underlying problem. The problem is that we have a very rigid implementation for select_item_list grammar rule.

What we should do is to make select_item_list grammar return a list using the bison stack. That means no relying on SELECT_LEX::item_list to store our result temporarily. You have done steps towards fixing this, which brings us to where your code's current state. The problem with your implementation is you have not lifted the restriction of the grammar rules making use of SELECT_LIST::item_list. Instead, you have masked it by returning the address of that member on the bison stack. The funny part is that this works, but it still a hack. It is still a hack, because these grammar rules have a side effect of modifying this member behind the scenes. A very, very, very (!) good rule is to consider all side effects as bad, especially now that you are starting out. With experience you might get away with them every now-and-then, but for now avoid them like the plague and try to remove them whenever possible.

The current code makes use of a function I really don't like. The inline function add_item_list(). It is one of those hacky functions that strictly relies on context. The context is: take the thd->lex->current_select and put the item passed as a parameter in the item_list. The diff I've attached shows an implementation of how to use the bison stack properly. I took inspiration from the "expr" rule. Unfortunately the hack chain goes deeper, but we need to do baby steps and some things are best left outside your GSoC project, or we risk going down a rabbit whole we might not get out of.

One other issue I've observed is with the REPLACE code here:

          insert_field_spec 
          {
            if ($6)
            {
              List<Item> list;
              list=*($6);
              Lex->current_select->item_list.empty();
              $6=&list;
            }
          }
          opt_select_expressions
          {
            if ($8)
              Lex->returning_list=*($8);
            if ($6)
              Lex->current_select->item_list=*($6);
            Lex->pop_select(); //main select
            if (Lex->check_main_unit_semantics())
              MYSQL_YYABORT;
          }

This is really problematic and your probably wrote it because of opt_select_expressions putting stuff into item_list. In the insert_field_spec rule, you receive the address of current_select->item_list as parameter $6. You store the value of this list (basically the pointer to the head element) in a temporary variable that is only in scope during the if statement (!). Attempting to use it after the if statement is over (via the $6 parameter) is illegal! It only works because the compiler did not chose to use that area of memory for something else. Please find a way to store things here properly.

After this is fixed, we'll spend the next month cleaning everything up, documenting our code, the added test cases etc. We have an overall working solution, but we need to make it as hack-free as possible, so we can later work on supporting INSERT ... RETURNING in subqueries.

Let me know if anything in the email is unclear. It took me a while to write it, trying to explain the reasoning, but I may have missed a few things. :)

Vicențiu