Re: [Maria-discuss] MariaDB Server 10.3 notes

13 Oct 2016

      Whilst I agree it would be a nice feature and something I've heard for 
years, the implementation would likely be more complex than it seems.

For starters you have to consider the possibility of a procedure 
crashing and taking the whole daemon with it. You would likely need to 
fork a small worker process pool and have some kind of shared memory or 
socket communications for safety. In addition the implementation would 
have to be extremely careful not to add any potential security hole due 
to a zero-day in PHP or some bad input filtering for example.

I'm not saying it is impossible, but it will likely be a lot of work to 
get right and the APIs would need to be carefully thought out.

So the question becomes: is it worth spending time developing this over 
another feature? Or is it something that could be better implemented 
safely in another layer, such as in a database proxy?

Kind Regards
Andrew

On 13/10/16 07:04, Federico Razzoli wrote:
> Hi all,
>
> So basically everyone would love, love, love to have external languages for stored procedures, but no one is working on it... so bad. Please consider something:
> 1- Some features could be implemented as stored procedures, it's much easier. This has been done in the past (Flexviews, Securich...) but SQL is too limited.
> 2- I am sure that a lot of people would implement procedures libraries if they could use something like JavaScript or PHP. If we could use Python, stuff like NumPy and SciPy could be used.
> 3- SQL limitations could be lifted (namespaces for global objects, arrays, argv, cursors based on a prepared statement...) but will you ever do it? Probably not. But if we have external languages, who cares about procedural SQL limitations.
>
> I believe that next releases features are selected based on their cost (not only that of course). But please consider points 2 and 3, and try to estimate the cost of features that the community can develop for you, plus the cost of features that won't be really needed anymore if external languages are available (arrays in SQL).
>
> Cheers
> Federico
>
>
>
>
> --------------------------------------------
> Mer 12/10/16, Sergei Golubchik <serg@mariadb.org> ha scritto:
>
>  Oggetto: Re: [Maria-discuss] MariaDB Server 10.3 notes
>  A: "Justin Swanhart" <greenlion@gmail.com>
>  Cc: "Maria Discuss" <maria-discuss@lists.launchpad.net>
>  Data: Mercoledì 12 ottobre 2016, 14:15
>
>  Hi, Justin!
>
>  Very good questions,
>  thanks!
>  Some answers below:
>
>  On Oct 12, Justin Swanhart
>  wrote:
>  >
>  > > *
>  InnoDB: InnoDB native partitioning - so MySQL 8 InnoDB? But
>  Monty
>  > > says there's next to no
>  changes in InnoDB 8...  Instant add column.
>  > > New InnoDB deadlock detection (8.0).
>  New INFORMATION_SCHEMA table
>  > >
>  (8.0). Dedicated tablespace for temporary tables (in 5.7 and
>  merged,
>  > > check). Lock wait policy
>  (contribution)
>  >
>  >
>  Monty is *notorious* for low balling estimates.  His famous
>  phrase is "it
>  > is trivial".
>  Everybody knows that if Monty says it is trivial, you can
>  add
>  > 10x the work to get it done, or
>  more.
>
>  Yes, I tend to agree
>  with that. But Monty estimates are not that far off
>  when applied to *Monty*. They're often too
>  low when applied to others.
>
>  https://www.google.de/search?q=10x+developer
>
>  > includes transportable
>  tablespaces for partitioned tables, and will likely
>  > support native partitioning by the time 8
>  rolls.  Native partitioning also
>  >
>  entails implementing the changes to the SE.  See:
>  > http://mysqlserverteam.com/innodb-native-partitioning-early-access/
>
>  InnoDB native partitioning is
>  in 5.7.6:
>  https://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-6.html
>
>  Is it better than upper-layer
>  partitioning? Why?
>
>  >
>  > * more for window functions (user defined window
>  functions, MDEV-10855)
>  > What other
>  databases have user definable window functions?  Will the
>  syntax
>  > be similar?  I can't really
>  think of a reason to need custom window
>  >
>  functions personally, as the existing set is very
>  comprehensive.  What kind
>  > of custom
>  window function do you think people would need to write?
>  In most
>  > implementations any aggregate
>  functions can be used in a window frame
>  >
>  context, thus any aggregate SQL functions should satisfy
>  requirements for
>  > user defined window
>  funcs.
>
>  Yes, that was a
>  confusing description.
>  What it really means
>  - we will have aggregate SQL functions (MDEV-7773)
>  and preferrably, it should be possible to make
>  them window-aware.
>
>  Using an
>  aggregate function as a window function, means applying it
>  to
>  every possible position of a sliding
>  frame, and it has O(N*n) complexity
>  (where N
>  is total number of rows, n is the number of rows in the
>  frame).
>  If the aggregate function can remove
>  rows from a group, it will only be O(N).
>
>  > > * socket authentication
>  > Can you explain this.  Is there an
>  MDEV?
>
>  That's using
>  unix_socket authentication by default for the root user.
>  It improves security and maintainability, but
>  is not really new - Debian
>  is already doing
>  that for months. This change could be confusing and
>  break exising user scripts, so it needs to be
>  done with care.
>
>  > > *
>  X Protocol/Document store - only if people have time or
>  money will
>  > > it be done
>  > If you want people to pay for it, perhaps
>  you should implement it in
>  > MaxScale
>  instead of the server :)
>
>  Interesting idea :)
>
>  But the MariaDB Meetup is a community event
>  organized by the MariaDB
>  Foundation. This
>  particular session was about MariaDB Server planning.
>  MariaDB Foundation has nothing to do with
>  MaxScale, so we could not have
>  planned
>  anything for it.
>
>  > >*
>  Peter asks if there is any plans to support other languages
>  like V8?
>  > PLEASE PLEASE PLEASE PLEASE
>  IMPLEMENT WL-820.  External stored procs and
>  > table functions.  Been sitting there for
>  the taking for a long time.
>  > Antony has
>  tried to get you to get it into the server, but alas, it
>  has
>  > never happened, and though there
>  continues to be wide requests for this
>  >
>  feature from the community, they fall on deaf ears.
>
>  Frankly speaking, I'd love
>  it. But this feature never got enough
>  priority, not in MySQL times nor in MariaDB.
>  Unfortunately.
>
>  Perhaps
>  we'll be able to sneak it into 10.3, but no promises
>  here.
>
>  > > *
>  Compressed binary log (from Tencent)
>  >
>  Compressing the binary log saves on space on the master, but
>  it makes
>  > seeking into binary logs much
>  more difficult, and searching backwards
>  >
>  through them becomes much more difficult (which has
>  implications for query
>  >
>  'rewind')
>
>  Tencent
>  compresses individual events (think
>  Compressed_query_log_event
>  type). So seeking
>  works as before, events are not decompressed on
>  reading, they are sent compressed to slaves,
>  etc. But the compression
>  ratio is worse, of
>  course.
>
>  > > * Fix the
>  XA transaction bug ( MySQL has fixed it already ) -
>  MDEV-7974
>  > Please actually complete XA
>  support, don't just half fix it like Oracle
>  > did.  Add full support for XA SUSPEND and
>  XA RESUME and allow more than one
>  >
>  thread to participate in a distributed transaction in the
>  server.
>
>  Right. Still,
>  MDEV-7974 is an important step, we cannot have proper XA
>  without it. XA SUSPEND and XA RESUME should be
>  the next one, I agree.
>
>  >
>  > * Indexes on expressions (this is part of virtual
>  columns, will it not
>  > > go into 10.2?
>  Check with Serg)
>  > Indexes on expressions
>  requires parser support.
>  > create index
>  expr_idx on some_table(a + b);
>  > select *
>  from some_table where a+b > 30 and a+b <= 50 -- (uses
>  range over
>  > expr_idx)
>
>  Right, but the parser support
>  is the least of my worries. I don't want
>  to low ball estmates :) but the *parser* can be
>  fixed in a few hours.
>  The most tricky part
>  will be to fix the optimizer.
>
>  > > * Flashback DDL MDEV-10571 (flashback
>  DML will come in 10.2). It only
>  > >
>  works with row based replication. Talk about what to name
>  it. There's
>  > > already a MySQL
>  time machine on github. Many like the name Rewind (but
>  > > not Monty). Let's do a poll on
>  the mailing list
>  > Okay, Flashback query
>  is VERY complicated. A flashback query is a
>  > materialized view.  There are two ways
>  generally to achieve flashback:
>  > a)
>  materialize the query as is, and instead of rolling the
>  query forward
>  > incrementally, you roll
>  it backwards.  Flexviews achieves this by computing
>  > the query as it is now, then computing the
>  delta from a prior point in time
>  > until
>  the transaction in which that computation happened. Then the
>  delta is
>  > played back against the query,
>  but the MONUS of each operation is applied
>  > which changes insertions to deletions and
>  vice versa.  This still presents
>  > a
>  problem for OUTER JOIN.  No documented asynchronous refresh
>  algorithms
>  > exist for outer join that
>  I'm aware of, that would work with the concept of
>  > reading row history from serialized
>  changes.  Flexviews uses the "rolling
>  > join propagation" algorithm, which
>  only works with inner joins.
>
>  Interesting... I'll continue reading on
>  that, thanks.
>  But see below
>
>  > b) provide a
>  point-in-time snapshot of every table used in the query at
>  the
>  > desired point in time, and run the
>  query over those tables (this is a
>  >
>  synchronous mechanism which supports OUTER JOIN).  The
>  problem here is how
>  > to provide such a
>  table.  The most straight-forward way is to copy the
>  > table, and then do the backwards replay
>  for each (ideally in parallel) but
>  > this
>  is obviously undesirable if the query is "select * from
>  some_big_table
>  > join another_big_table
>  on (...)" because you have to fully copy each table
>  > before you can undo changes.  Otherwise
>  you need to have the SE display old
>  >
>  versions, just like it does for MVCC, but it has to display
>  versions from
>  > binary logs, not undo
>  logs, and this requires a lot of SE and engine
>  > changes!
>
>  This is about correct. There are different
>  applications for "flashback"
>  with
>  different use cases and different implementation
>  trade-offs.
>
>  One is, for
>  example, to see historical sales numbers, for different
>  months or years. That is, basically, for some
>  kind of data analytics.
>  Lots of SELECT
>  queries, ad-hoc queries, at different historical time
>  points.
>
>  Another one is "damn, I've made a typo
>  in a WHERE clause and updated too
>  much". In this case one doesn't need
>  joins or materialized views, one
>  needs to
>  see the data before the erroneous statement. This is not
>  used
>  often.
>
>  For the second use case one can afford to copy
>  tables and
>  backward-replay the binary log.
>  For the first use case one would need a
>  completely different approach, I agree. May be
>  something like what
>  Flexviews does.
>
>  > If you are going to have
>  generic flashback, you might as well commit to
>  > incrementally refreshable materialized
>  views.
>
>  Right, when
>  we'll have materialized views, than we could think
>  about incrementally updating them using rolling
>  join propagation.
>
>  > >
>  * Additional GIS functions to stay compatible. Also it would
>  be good
>  > > to have a standalone GIS
>  library (Georg suggests; wlad isn't too happy
>  > > with the suggestion). Georg suggests
>  that calculations should use the
>  > >
>  reference systems (Unflatten the world - MariaDB Server GIS
>  the world
>  > > looks flat)
>  >
>  > GIS functions should
>  use gdal, just like postgresql does.  Then you can
>  > also add support for rasters.  Here is an
>  example of the awesomeness of
>  >
>  postgresql and postgis.  It uses SQL aggregate functions,
>  table functions,
>  > CTE, GIS raster
>  functions, sequences, etc:
>  > https://github.com/greenlion/osmvox/blob/master/postgresql/combined_schema.sql
>
>  May be. On the other hand,
>  we're more precise that postgis, because our
>  implementation uses fixed-point math, not
>  doubles.
>  And let's not forget that MySQL
>  uses Boost::Geometry.
>
>  >
>  > * Query rewriting - MDEV-5561
>  > I
>  would like you to provide a SQL->DOM function call
>  instead of just
>  > providing a DOM to
>  plugins.  This function could be exposed as a regular
>  > item function as well so that anything in
>  the server can parse SQL. I
>  > suggest you
>  implement my SQL "shim" interface and just provide
>  some way to
>  > get a easy to iterate over
>  parsed data structure from the SQL, such as a
>  > nested JSON array of objects.  A nested
>  array is generated by
>  > PHP-SQL-Parser
>  and would provide a good template for such a JSON object.
>  > https://github.com/greenlion/PHP-SQL-Parser
>
>  We actually have an MDEV for
>  that :)
>
>  > > * With
>  MyRocks coming, should we drop TokuDB (and maybe even
>  deprecate
>  > > in 10.2?) - bugs that
>  MariaDB Corporation reports to Percona don't
>  > > seem to get fixed. Peter says that
>  bugs are fixed for customers... and
>  >
>  > there is ongoing development to make it better
>  > It is certainly disheartening that Percona
>  isn't responsive to MariaDB
>  > bugs,
>  but I'm sure you understand that it is hard for a
>  competitor to fix
>  > bugs for another
>  competitor.  MariaDB maintains a forked version of
>  > TokuDB.  It isn't fair to expect the
>  upstream vendor to fix bugs that your
>  >
>  customers are paying you to support, does it?  Perhaps you
>  should pay a
>  > percentage of your support
>  fees for TokuDB issues to Percona, or come to
>  > some other support agreement.  Perhaps
>  YOU should make the bug fixes and
>  >
>  submit them to Percona.
>
>  And
>  we do, look for bug reports I've reported on TokuDB to
>  Percona -
>  with patches :)
>
>  But, really, Percona *is*
>  fixing TokuDB bugs, it's just that they did a
>  bit of refactoring after getting TokuDB and it
>  took time for it to
>  stabilize. And MariaDB
>  Server has a vanilla TokuDB with almost no
>  changes, we have no plans to fork it.
>
>  > > * ORDER BY LIMIT
>  optimizer bugs (MDEV-8306)
>  > Oh god, this
>  is a downward spiral every time somebody touches it.
>  > Fixing it correctly requires rewriting the
>  parser, something that
>  > Oracle is
>  undertaking.
>
>  This
>  didn't have much to do with the parser, and we've
>  refactored that
>  part of the parser in 10.2,
>  so now this has nothing to do with the
>  parser, it's purely the optimizer issue.
>
>  > What about other new
>  MySQL 8 features?  Are you getting rid of the .FRM
>  > nightmare?  Are you going to support SET
>  PERSIST?  etc?
>
>  FRM was
>  made purely optional in 10.0 - every storage engine decides
>  for
>  itself whether it uses FRMs or not. May
>  be InnoDB will use FRMs in 10.3,
>  may be it
>  will not. MyISAM most probably will continue use them.
>
>  SET PERSIST - and this is my
>  personal opinion - this needs to be thought
>  over *very* carefully. There have been quite a
>  few security
>  vulnerabilities with my.cnf
>  stored in the datadir, and that's why since
>  2005 the server no longer reads my.cnf in the
>  datadir. But mysql_safe
>  still does - and
>  there have been new security vulnerabilities *last
>  month*, caused by my.cnf in the datadir. And
>  finally both MySQL (in
>  5.7?) and MariaDB (in
>  10.2) stopped reading my.cnf in the datadir for
>  real.  So, having my security@mariadb.org
>  hat on, I look at SET PERSIST
>  with a lot of
>  suspicion, because it's nothing else than another
>  attempt
>  of storing server configuration
>  information in the datadir and have it
>  writable by the server itself.
>
>  Regards,
>  Sergei
>  Chief Architect
>  MariaDB
>  and security@mariadb.org
>
>  _______________________________________________
>  Mailing list: https://launchpad.net/~maria-discuss
>  Post to     : maria-discuss@lists.launchpad.net
>  Unsubscribe : https://launchpad.net/~maria-discuss
>  More help   : https://help.launchpad.net/ListHelp
>
>
> _______________________________________________
> Mailing list: https://launchpad.net/~maria-discuss
> Post to     : maria-discuss@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~maria-discuss
> More help   : https://help.launchpad.net/ListHelp
>

-- 
Andrew Hutchings - LinuxJedi - http://www.linuxjedi.co.uk/