Hi!
I am cc:ing maria-discuss as you accidently forgot to also email the
patch there."Oleksandr" == Oleksandr Byelkin <sanja@askmonty.org> writes:
Oleksandr> Hi!
Oleksandr> Here it is dynamic columns library. I do not think that you will have time to review it completely, but please look at it.
Oleksandr> Some notes:
Oleksandr> - I changed indices from int to uint + some other minor changes in prototypes.
Oleksandr> - dynamic_column_exists() has no way to report error (what to do?)
I assume that same problem exists for all the other functions.
Two options:
- Change function type from my_bool to int where -1 means something
went bad (like bad format for the blob).
- Add an 'error' argument to all functions that is set to <> 0 if
something goes wrong.
+my_bool dynamic_column_init_str(DYNAMIC_COLUMN *str, size_t size)+{+ if (!size)+ size= DYNCOL_SYZERESERVE;+ /*+ Make string with no fields (empty header)+ - First \0 is flags+ - other four \0 is 0 fileds counter+ */+ if (init_dynamic_string(str, NULL,+ size + 5, DYNCOL_SYZERESERVE))+ return TRUE;+ return dynstr_append_mem(str, "\0\0\0\0\0", 5);+}
Why is an empty data 5 bytes instead of 1 or even 0?
+ for (i= 0; i < column_count - 1; i++)+ if (columns_order[i][0] == columns_order[i+1][0])+ goto err;++ DBUG_ASSERT(str->length == FIXED_HEADER_SIZE);+ str->str[0]|= (offset_size - 1); // size of offset+ int4store(str->str + 1, non_null_count); // columns number
Lets limit the number of columns to 65536 (2 bytes)
Don't understand why we store non_null_count in header
Don't we need to store total number of columns ?
I think we should store also null's in the dynamic columns as someone
may want to distinguish between a non existing columns and a column
with not a set value.
We could store this either as an extended type or as datetime with
size 0.
We also need functions:
Initialize the DYNAMIC_COLUMN. (Basicly just a call to
init_dynamic_string())
+ /* rewrite header*/+ str->str[0]|= (new_offset_size - 1); // size of offset+ int4store(str->str + 1, column_count - 1); // columns number+ for (i= 0, write= read= (uchar *)str->str + FIXED_HEADER_SIZE;+ i < column_count;+ i++, read+= entry_size, write+= new_entry_size)+ {+ size_t offs;+ uint nm;+ DYNAMIC_COLUMN_TYPE tp;+ if (read == header_entry)+ {+#ifndef DBUG_OFF+ nm= uint2korr(read);+ type_and_offset_read(&tp, &offs, read + 2,+ offset_size);+ DBUG_ASSERT(nm == column_nr);+ DBUG_ASSERT(offs == deleted_entry_offset);+#endif+ write-= new_entry_size; // do not move writer+ continue; // skip removed field+ }++ nm= uint2korr(read),+ type_and_offset_read(&tp, &offs, read + 2,+ offset_size);++ if (offs > deleted_entry_offset)+ offs-= length; // data stored after removed data++ int2store(write, nm);+ type_and_offset_store(write + 2, new_offset_size, tp, offs);+ }
Please optimize that if new_offset_size == old_offset_size then
we should start from header_entry.
You should also make the above a function as update also needs to do this!++ /* move data */+ {+ size_t first_chunk_len= (data - (uchar *)str->str) -+ FIXED_HEADER_SIZE - header_size;+ size_t second_chunk_len= new_data_size - first_chunk_len;+ if (first_chunk_len)+ memmove(str->str + FIXED_HEADER_SIZE + new_header_size,+ str->str + FIXED_HEADER_SIZE + header_size,+ first_chunk_len);+ if (second_chunk_len)+ memmove(str->str ++ FIXED_HEADER_SIZE + new_header_size + first_chunk_len,+ str->str ++ FIXED_HEADER_SIZE + header_size + first_chunk_len + length,+ second_chunk_len);+ }++ /* fix str length */+ DBUG_ASSERT(str->length >=+ FIXED_HEADER_SIZE + new_header_size + new_data_size);+ str->length= FIXED_HEADER_SIZE + new_header_size + new_data_size;++ return FALSE;+}+++my_bool dynamic_column_update(DYNAMIC_COLUMN *str, uint column_nr,+ DYNAMIC_COLUMN_VALUE *value)+{+ size_t offset_size, new_offset_size, data_size, new_data_size,+ add_data_size, entry_size, new_entry_size,+ header_size, new_header_size, additional_size, next_offs,+ data_ins_offset;+ uint column_count, i;+ uchar *read, *write;+ my_bool added;++ /* TODO: optimize update without deleting */+ if (dynamic_column_delete(str, column_nr))+ return TRUE;
The problem with doing delete + insert later is that the whole
function needs to be rewritten and you can't use any of the old code
(for columns that exists).
So I suggest you write this ASAP!
++ if (value->type == DYN_COL_NULL)+ return FALSE;
Should be stored.
++ if (!str || str->length < FIXED_HEADER_SIZE ||+ (str->str[0] & (~DYNCOL_FLG_KNOWN)))+ return TRUE;
length of 0 should be regarded as ok.
+ /* rewrite header */+ next_offs= str->length - FIXED_HEADER_SIZE - new_header_size;+ for (i= column_count,+ read= (uchar *)str->str + FIXED_HEADER_SIZE + header_size - entry_size,+ write= ((uchar *)str->str + FIXED_HEADER_SIZE ++ (new_entry_size * column_count));+ i > 0;+ i--, read-= entry_size, write-= new_entry_size)+ {+ size_t offs;+ uint nm;+ DYNAMIC_COLUMN_TYPE tp;++ nm= uint2korr(read);+ type_and_offset_read(&tp, &offs, read + 2, offset_size);++ if (!added && column_nr > nm)+ {+ /* place to put the new column */+ int2store(write, column_nr);+ type_and_offset_store(write + 2, new_offset_size,+ value->type,+ (data_ins_offset= next_offs - add_data_size));+ /* move previous data */+ memmove(str->str + FIXED_HEADER_SIZE + new_header_size,+ str->str + FIXED_HEADER_SIZE + new_header_size + add_data_size,+ next_offs - add_data_size);+ added= TRUE;+ write-= new_entry_size;+ }+ int2store(write, nm);+ if (!added)+ offs+= add_data_size;+ type_and_offset_store(write + 2, new_offset_size, tp, offs);+ next_offs= offs;+ }
We should just have one function to rewrite the headers!
To make life easer, maybe we should assume that header size change
operation is a very unlikely one and accept to do an extra memmove
when header offset changes?
This would allow us to have a simple function that just rewrites the
header (from a given point) and another that increases/decreases the
header size + all data ?
Then all operations are basicly:
- If offset_pointer_size changes, fix data
- Add / remove things from header and data (with new offset size)
- Update header
The whole things above would then be very trivial:
- realloc str->length if it needs to grow
- If offset_pointer_size changes, fix data and update entry_pos
- Update data (and create new header entry last)
- memmove(entry, entry + entry_size, header_end - entry);
- Write data at entry