2016-10-21 1:06 GMT+08:00 Kristian Nielsen <knielsen@knielsen-hq.org>:
GCSAdmin <notifications@github.com> writes:

> We add new event types to support compressing the binlog, as follows:
>      QUERY_COMPRESSED_EVENT,
>      WRITE_ROWS_COMPRESSED_EVENT_V1,
>      UPDATE_ROWS_COMPRESSED_EVENT_V1,
>      DELETE_ROWS_COMPRESSED_EVENT_V1,
>      WRITE_ROWS_COMPRESSED_EVENT,
>      UPDATE_ROWS_COMPRESSED_EVENT,
>      DELETE_ROWS_COMPRESSED_EVENT
>

Thanks for the patch!

Overall, I'm much impressed with the quality, I see a lot of attention to
detail. Below, I have a number of comments and suggestions, but they are all
minor for a patch of this complexity.

A number of the comments are style fixes or simple changes, and I found it
easier to just do the changes to the source for you to review. I hope that
is ok. I rebased the series without intermediate merge to simplify review to
a single patch. The rebase with my suggested changes on top is here:

  https://github.com/knielsen/server/commits/GCSAdmin-10.2-binlog-compressed2-2

Please check the changes I made and let me know if you disagree with
anything or I misunderstood something. Changes are mostly:

 - A bunch of code style (indentation, lines <= 80 chars, clarify comments,
   and so on).

 - Change LOG_EVENT_IS_QUERY() etc. from macros to static inline functions.

 - check_event_type() updated (new code merged into 10.2 recently).

 - Minor .result file update, also from code merged into 10.2 recently.

 
Thanks for your changes; they look OK to me. My new code will be based on them.
 
I also have the following questions/suggestions:

1. I think the code should sanity-check the compressed event header, to
check that it does not access outside the event buffer in corrupt
events. Eg. if lenlen is 7 and there is less than 7 bytes available in the
event. I know that replication code in general is not robust to corrupt
events, but it seems best that new code avoids overflowing buffers in case
of corrupt event data.

Yes, you are right. I will add the check to avoid overflowing the buffer.
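A minimal sketch of such a check, with a hypothetical helper name; the bit layout (0x80 compressed flag, low bits holding the width of the "Record Original Length" field, masked with 0x07 as in the patch's binlog_buf_uncompress()) follows the patch:

```c
#include <stdint.h>

/* Hypothetical sanity check for a compressed record inside an event.
   'src' points at the 1-byte record header; 'len' is the number of
   bytes actually available in the event buffer. */
static int check_compressed_header(const unsigned char *src, uint32_t len)
{
  if (len < 1 || (src[0] & 0x80) == 0)
    return 1;                        /* empty, or compressed flag not set */
  uint32_t lenlen = src[0] & 0x07;   /* width of "Record Original Length" */
  if (lenlen < 1 || lenlen > 4)
    return 1;                        /* corrupt header byte */
  if (len < 1 + lenlen)
    return 1;                        /* length field runs past the buffer */
  return 0;
}
```

query_event_uncompress() and row_log_event_uncompress() could call something like this before reading the length field, returning an error instead of reading past the end of a corrupt event.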
 
2. There are some places where the compressed event types are not added to
switch() or if() statements, mainly related to minimal binlog images, I
think: Rows_log_event::read_write_bitmaps_cmp(),
Rows_log_event::get_data_size(), Rows_log_event::do_apply_event(),
Rows_log_event::write_data_body(). I *think* the reason for that is that the
SQL thread never sees the compressed events (they are uncompressed by the IO
thread), but I would like your confirmation that my understanding is
correct.

In the SQL thread, a compressed event uncompresses its "body" in the constructor.
So most member functions of the compressed event classes can be inherited directly; there is no need to override them.

Take read_write_bitmaps_cmp() as an example: WRITE_ROWS_EVENT and WRITE_ROWS_COMPRESSED_EVENT share the same general_type_code,
and for compressed events the body has already been uncompressed in the constructor, so no change is needed.

  bool read_write_bitmaps_cmp(TABLE *table)
  {
    bool res= FALSE;

    switch (get_general_type_code())
    {
      case DELETE_ROWS_EVENT:
        res= bitmap_cmp(get_cols(), table->read_set);
        break;
      case UPDATE_ROWS_EVENT:
        res= (bitmap_cmp(get_cols(), table->read_set) &&
              bitmap_cmp(get_cols_ai(), table->rpl_write_set));
        break;
      case WRITE_ROWS_EVENT:
        res= bitmap_cmp(get_cols(), table->rpl_write_set);
        break;
      default:
        /*
          We should just compare bitmaps for Delete, Write
          or Update rows events.
        */
        DBUG_ASSERT(0);
    }
    return res;
  }
 
3. Did you think about testing that BINLOG statements with compressed events
can execute correctly? This happens with mysqlbinlog | mysql, when there are
(compressed) row-based events in the binlog. I'm wondering if this could
expose compressed events to code that normally runs in the SQL thread and
expects to have events uncompressed for it by the IO thread?
 
As mentioned above, in the SQL thread the compressed events are uncompressed in the constructor.
So it would work even if the IO thread did nothing.

Of course, it also works with "mysqlbinlog | mysql". I added test cases for that in mysql-test: "main.mysqlbinlog_row_compressed" and "main.mysqlbinlog_stmt_compressed".
 

A few detailed comments on the patch below (most comments are done as
changes in the above-linked git branch). 
 
Otherwise, after answers to the above 3 questions, I think the patch looks good
to go into 10.2.

 - Kristian.

-----------------------------------------------------------------------

> +#define BINLOG_COMPRESSED_HEADER_LEN 1
> +#define BINLOG_COMPRESSED_ORIGINAL_LENGTH_MAX_BYTES 4
> +/**
> +  Compressed Record
> +    Record Header: 1 Byte
> +                   0 Bit: Always 1, mean compressed;
> +                   1-3 Bit: Reversed, compressed algorithm, Always 0, means zlib
> +                   4-7 Bit: Bytes of "Record Original Length"
> +    Record Original Length: 1-4 Bytes
> +    Compressed Buf:

I did not understand this. Why "reversed"? It seems to be bits 0-2 that have
the bytes of "Record Original Length" and bit 7 that has the '1' bit meaning
compression enabled? Maybe this can be clarified.

Sorry, there was a mistake there. The following should be correct:

Compressed Record
  Record Header: 1 Byte
                 7 Bit: Always 1, meaning compressed;
                 4-6 Bit: Reserved for the compression algorithm.
                          Currently always 0, meaning zlib.
                          Other algorithms may be supported in the future.
                 0-3 Bit: Bytes of "Record Original Length"
  Record Original Length: 1-4 Bytes
  Compressed Buf:
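To make the layout concrete, here is an illustrative encode/decode round-trip of the header byte and length field. The function names are made up for this sketch, the 0x07 mask matches the patch's binlog_buf_uncompress(), and the big-endian byte order of the length field is an assumption of the sketch, not taken from the patch:

```c
#include <stdint.h>

/* Write header byte + "Record Original Length" field (1-4 bytes,
   big-endian assumed).  Returns number of bytes written. */
static uint32_t encode_header(unsigned char *out, uint32_t orig_len)
{
  uint32_t lenlen = 1;
  while (lenlen < 4 && (orig_len >> (8 * lenlen)) != 0)
    lenlen++;
  out[0] = (unsigned char)(0x80 | lenlen);  /* algorithm bits 4-6 left 0 = zlib */
  for (uint32_t i = 0; i < lenlen; i++)
    out[1 + i] = (unsigned char)(orig_len >> (8 * (lenlen - 1 - i)));
  return 1 + lenlen;
}

/* Read the header back; returns number of header bytes consumed. */
static uint32_t decode_header(const unsigned char *in, uint32_t *orig_len)
{
  uint32_t lenlen = in[0] & 0x07;           /* width of the length field */
  uint32_t len = 0;
  for (uint32_t i = 0; i < lenlen; i++)
    len = (len << 8) | in[1 + i];
  *orig_len = len;
  return 1 + lenlen;
}
```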

 

> +int query_event_uncompress(const Format_description_log_event *description_event, bool contain_checksum,
> +                                                const char *src, char* buf, ulong buf_size, bool* is_malloc,
> +               char **dst, ulong *newlen)
> +{
> +  ulong len = uint4korr(src + EVENT_LEN_OFFSET);
> +     const char *tmp = src;
> +
> +  DBUG_ASSERT((uchar)src[EVENT_TYPE_OFFSET] == QUERY_COMPRESSED_EVENT);
> +
> +     uint8 common_header_len= description_event->common_header_len;
> +     uint8 post_header_len= description_event->post_header_len[QUERY_COMPRESSED_EVENT-1];
> +
> +     tmp += common_header_len;
> +
> +     uint db_len = (uint)tmp[Q_DB_LEN_OFFSET];
> +     uint16 status_vars_len= uint2korr(tmp + Q_STATUS_VARS_LEN_OFFSET);
> +
> +     tmp += post_header_len + status_vars_len + db_len + 1;
> +
> +     uint32 un_len = binlog_get_uncompress_len(tmp);
> +     *newlen = (tmp - src) + un_len;

This is one place I would like to see a check on the data in the
event. Maybe just that 'len' is larger than 1+(tmp[0]&7). Or maybe simply
that len is at least 10, the minimal size for compressed events.

> +int row_log_event_uncompress(const Format_description_log_event *description_event, bool contain_checksum,
> +                             const char *src, char* buf, ulong buf_size, bool* is_malloc,
> +                             char **dst, ulong *newlen)
> +{

> +     uint32 un_len = binlog_get_uncompress_len(tmp);
> +     *newlen = (tmp - src) + un_len;

Another place where I'd like a check against looking outside the event
buffer.

> +int binlog_buf_uncompress(const char *src, char *dst, uint32 len, uint32 *newlen)
> +{
> +     if((src[0] & 0x80) == 0)
> +     {
> +             return 1;
> +     }
> +
> +     uint32 lenlen = src[0] & 0x07;
> +    uLongf buflen  = *newlen;
> +     if(uncompress((Bytef *)dst, &buflen, (const Bytef*)src + 1 + lenlen, len) != Z_OK)

Again check on length.
 
Ok, I will add the check.
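A sketch of what the added validation in binlog_buf_uncompress() might look like, factored into a stand-alone helper; the helper name is hypothetical, and the big-endian byte order of the length field is an assumption of the sketch:

```c
#include <stdint.h>

/* Validate a compressed record before handing it to zlib.
   'len' is the number of bytes available in 'src', 'dstlen' the
   capacity of the output buffer.  Returns nonzero on corrupt input. */
static int validate_compressed_buf(const unsigned char *src, uint32_t len,
                                   uint32_t dstlen)
{
  if (len < 2 || (src[0] & 0x80) == 0)
    return 1;                              /* too short, or not compressed */
  uint32_t lenlen = src[0] & 0x07;
  if (lenlen < 1 || lenlen > 4 || len <= 1 + lenlen)
    return 1;                              /* length field overflows the event */
  uint32_t orig_len = 0;                   /* decode "Record Original Length" */
  for (uint32_t i = 0; i < lenlen; i++)
    orig_len = (orig_len << 8) | src[1 + i];
  if (orig_len > dstlen)
    return 1;                              /* would overflow the output buffer */
  return 0;                                /* safe to call uncompress() */
}
```

binlog_buf_uncompress() would run this first and only then pass src + 1 + lenlen to zlib's uncompress(), with the output length capped at the buffer capacity.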