13 Oct
2016
13 Oct
'16
6:53 p.m.
On 13.10.2016 17:26, Andrew Hutchings wrote: > Whilst I agree it would be a nice feature and something I've heard for > years, the implementation would likely be more complex than it seems. > > For starters you have to consider the possibility of a procedure > crashing and taking the whole daemon with it. You would likely need to > fork a small worker process pool and have some kind of shared memory > or socket communications for safety. In addition the implementation > would have to be extremely careful not to add any potential security > hole due to a zero-day in PHP or some bad input filtering for example. > > I'm not saying it is impossible, but it will likely be a lot of work > to get right and the APIs would need to be carefully thought out. > > So the question becomes: is it worth spending time developing this > over another feature? Or is it something that could be better > implemented safely in another layer, such as in a database proxy? > > Kind Regards > Andrew I think crashing is much easier if one runs native C UDFs, or any kind of natively compiled plugins. For external languages I think an important aspect is often overlooked. Namely, how to would access data. You can create a new JDBC connection, but it is likely not what you want to do. You just want to run a SELECT from the same session you're in, just like you do in SQL stored procedures. Exactly this requires a large amount of work in the connectors. It is not just running a JVM inside the server. There is no API for data access from the running session inside the server, not even for C. > > On 13/10/16 07:04, Federico Razzoli wrote: >> Hi all, >> >> So basically everyone would love, love, love to have external >> languages for stored procedures, but no one is working on it... so >> bad. Please consider something: >> 1- Some features could be implemented as stored procedures, it's much >> easier. This has been done in the past (Flexviews, Securich...) but >> SQL is too limited. >> 2- I am sure that a lot of people would implement procedures >> libraries if they could use something like JavaScript or PHP. If we >> could use Python, stuff like NumPy and SciPy could be used. >> 3- SQL limitations could be lifted (namespaces for global objects, >> arrays, argv, cursors based on a prepared statement...) but will you >> ever do it? Probably not. But if we have external languages, who >> cares about procedural SQL limitations. >> >> I believe that next releases features are selected based on their >> cost (not only that of course). But please consider points 2 and 3, >> and try to estimate the cost of features that the community can >> develop for you, plus the cost of features that won't be really >> needed anymore if external languages are available (arrays in SQL). >> >> Cheers >> Federico >> >> >> >> >> -------------------------------------------- >> Mer 12/10/16, Sergei Golubchik <serg@mariadb.org> ha scritto: >> >> Oggetto: Re: [Maria-discuss] MariaDB Server 10.3 notes >> A: "Justin Swanhart" <greenlion@gmail.com> >> Cc: "Maria Discuss" <maria-discuss@lists.launchpad.net> >> Data: Mercoledì 12 ottobre 2016, 14:15 >> >> Hi, Justin! >> >> Very good questions, >> thanks! >> Some answers below: >> >> On Oct 12, Justin Swanhart >> wrote: >> > >> > > * >> InnoDB: InnoDB native partitioning - so MySQL 8 InnoDB? But >> Monty >> > > says there's next to no >> changes in InnoDB 8... Instant add column. >> > > New InnoDB deadlock detection (8.0). >> New INFORMATION_SCHEMA table >> > > >> (8.0). Dedicated tablespace for temporary tables (in 5.7 and >> merged, >> > > check). Lock wait policy >> (contribution) >> > >> > >> Monty is *notorious* for low balling estimates. His famous >> phrase is "it >> > is trivial". >> Everybody knows that if Monty says it is trivial, you can >> add >> > 10x the work to get it done, or >> more. >> >> Yes, I tend to agree >> with that. But Monty estimates are not that far off >> when applied to *Monty*. They're often too >> low when applied to others. >> >> https://www.google.de/search?q=10x+developer >> >> > includes transportable >> tablespaces for partitioned tables, and will likely >> > support native partitioning by the time 8 >> rolls. Native partitioning also >> > >> entails implementing the changes to the SE. See: >> > http://mysqlserverteam.com/innodb-native-partitioning-early-access/ >> >> InnoDB native partitioning is >> in 5.7.6: >> https://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-6.html >> >> Is it better than upper-layer >> partitioning? Why? >> >> > >> > * more for window functions (user defined window >> functions, MDEV-10855) >> > What other >> databases have user definable window functions? Will the >> syntax >> > be similar? I can't really >> think of a reason to need custom window >> > >> functions personally, as the existing set is very >> comprehensive. What kind >> > of custom >> window function do you think people would need to write? >> In most >> > implementations any aggregate >> functions can be used in a window frame >> > >> context, thus any aggregate SQL functions should satisfy >> requirements for >> > user defined window >> funcs. >> >> Yes, that was a >> confusing description. >> What it really means >> - we will have aggregate SQL functions (MDEV-7773) >> and preferrably, it should be possible to make >> them window-aware. >> >> Using an >> aggregate function as a window function, means applying it >> to >> every possible position of a sliding >> frame, and it has O(N*n) complexity >> (where N >> is total number of rows, n is the number of rows in the >> frame). >> If the aggregate function can remove >> rows from a group, it will only be O(N). >> >> > > * socket authentication >> > Can you explain this. Is there an >> MDEV? >> >> That's using >> unix_socket authentication by default for the root user. >> It improves security and maintainability, but >> is not really new - Debian >> is already doing >> that for months. This change could be confusing and >> break exising user scripts, so it needs to be >> done with care. >> >> > > * >> X Protocol/Document store - only if people have time or >> money will >> > > it be done >> > If you want people to pay for it, perhaps >> you should implement it in >> > MaxScale >> instead of the server :) >> >> Interesting idea :) >> >> But the MariaDB Meetup is a community event >> organized by the MariaDB >> Foundation. This >> particular session was about MariaDB Server planning. >> MariaDB Foundation has nothing to do with >> MaxScale, so we could not have >> planned >> anything for it. >> >> > >* >> Peter asks if there is any plans to support other languages >> like V8? >> > PLEASE PLEASE PLEASE PLEASE >> IMPLEMENT WL-820. External stored procs and >> > table functions. Been sitting there for >> the taking for a long time. >> > Antony has >> tried to get you to get it into the server, but alas, it >> has >> > never happened, and though there >> continues to be wide requests for this >> > >> feature from the community, they fall on deaf ears. >> >> Frankly speaking, I'd love >> it. But this feature never got enough >> priority, not in MySQL times nor in MariaDB. >> Unfortunately. >> >> Perhaps >> we'll be able to sneak it into 10.3, but no promises >> here. >> >> > > * >> Compressed binary log (from Tencent) >> > >> Compressing the binary log saves on space on the master, but >> it makes >> > seeking into binary logs much >> more difficult, and searching backwards >> > >> through them becomes much more difficult (which has >> implications for query >> > >> 'rewind') >> >> Tencent >> compresses individual events (think >> Compressed_query_log_event >> type). So seeking >> works as before, events are not decompressed on >> reading, they are sent compressed to slaves, >> etc. But the compression >> ratio is worse, of >> course. >> >> > > * Fix the >> XA transaction bug ( MySQL has fixed it already ) - >> MDEV-7974 >> > Please actually complete XA >> support, don't just half fix it like Oracle >> > did. Add full support for XA SUSPEND and >> XA RESUME and allow more than one >> > >> thread to participate in a distributed transaction in the >> server. >> >> Right. Still, >> MDEV-7974 is an important step, we cannot have proper XA >> without it. XA SUSPEND and XA RESUME should be >> the next one, I agree. >> >> > >> > * Indexes on expressions (this is part of virtual >> columns, will it not >> > > go into 10.2? >> Check with Serg) >> > Indexes on expressions >> requires parser support. >> > create index >> expr_idx on some_table(a + b); >> > select * >> from some_table where a+b > 30 and a+b <= 50 -- (uses >> range over >> > expr_idx) >> >> Right, but the parser support >> is the least of my worries. I don't want >> to low ball estmates :) but the *parser* can be >> fixed in a few hours. >> The most tricky part >> will be to fix the optimizer. >> >> > > * Flashback DDL MDEV-10571 (flashback >> DML will come in 10.2). It only >> > > >> works with row based replication. Talk about what to name >> it. There's >> > > already a MySQL >> time machine on github. Many like the name Rewind (but >> > > not Monty). Let's do a poll on >> the mailing list >> > Okay, Flashback query >> is VERY complicated. A flashback query is a >> > materialized view. There are two ways >> generally to achieve flashback: >> > a) >> materialize the query as is, and instead of rolling the >> query forward >> > incrementally, you roll >> it backwards. Flexviews achieves this by computing >> > the query as it is now, then computing the >> delta from a prior point in time >> > until >> the transaction in which that computation happened. Then the >> delta is >> > played back against the query, >> but the MONUS of each operation is applied >> > which changes insertions to deletions and >> vice versa. This still presents >> > a >> problem for OUTER JOIN. No documented asynchronous refresh >> algorithms >> > exist for outer join that >> I'm aware of, that would work with the concept of >> > reading row history from serialized >> changes. Flexviews uses the "rolling >> > join propagation" algorithm, which >> only works with inner joins. >> >> Interesting... I'll continue reading on >> that, thanks. >> But see below >> >> > b) provide a >> point-in-time snapshot of every table used in the query at >> the >> > desired point in time, and run the >> query over those tables (this is a >> > >> synchronous mechanism which supports OUTER JOIN). The >> problem here is how >> > to provide such a >> table. The most straight-forward way is to copy the >> > table, and then do the backwards replay >> for each (ideally in parallel) but >> > this >> is obviously undesirable if the query is "select * from >> some_big_table >> > join another_big_table >> on (...)" because you have to fully copy each table >> > before you can undo changes. Otherwise >> you need to have the SE display old >> > >> versions, just like it does for MVCC, but it has to display >> versions from >> > binary logs, not undo >> logs, and this requires a lot of SE and engine >> > changes! >> >> This is about correct. There are different >> applications for "flashback" >> with >> different use cases and different implementation >> trade-offs. >> >> One is, for >> example, to see historical sales numbers, for different >> months or years. That is, basically, for some >> kind of data analytics. >> Lots of SELECT >> queries, ad-hoc queries, at different historical time >> points. >> >> Another one is "damn, I've made a typo >> in a WHERE clause and updated too >> much". In this case one doesn't need >> joins or materialized views, one >> needs to >> see the data before the erroneous statement. This is not >> used >> often. >> >> For the second use case one can afford to copy >> tables and >> backward-replay the binary log. >> For the first use case one would need a >> completely different approach, I agree. May be >> something like what >> Flexviews does. >> >> > If you are going to have >> generic flashback, you might as well commit to >> > incrementally refreshable materialized >> views. >> >> Right, when >> we'll have materialized views, than we could think >> about incrementally updating them using rolling >> join propagation. >> >> > > >> * Additional GIS functions to stay compatible. Also it would >> be good >> > > to have a standalone GIS >> library (Georg suggests; wlad isn't too happy >> > > with the suggestion). Georg suggests >> that calculations should use the >> > > >> reference systems (Unflatten the world - MariaDB Server GIS >> the world >> > > looks flat) >> > >> > GIS functions should >> use gdal, just like postgresql does. Then you can >> > also add support for rasters. Here is an >> example of the awesomeness of >> > >> postgresql and postgis. It uses SQL aggregate functions, >> table functions, >> > CTE, GIS raster >> functions, sequences, etc: >> > >> https://github.com/greenlion/osmvox/blob/master/postgresql/combined_schema.sql >> >> May be. On the other hand, >> we're more precise that postgis, because our >> implementation uses fixed-point math, not >> doubles. >> And let's not forget that MySQL >> uses Boost::Geometry. >> >> > >> > * Query rewriting - MDEV-5561 >> > I >> would like you to provide a SQL->DOM function call >> instead of just >> > providing a DOM to >> plugins. This function could be exposed as a regular >> > item function as well so that anything in >> the server can parse SQL. I >> > suggest you >> implement my SQL "shim" interface and just provide >> some way to >> > get a easy to iterate over >> parsed data structure from the SQL, such as a >> > nested JSON array of objects. A nested >> array is generated by >> > PHP-SQL-Parser >> and would provide a good template for such a JSON object. >> > https://github.com/greenlion/PHP-SQL-Parser >> >> We actually have an MDEV for >> that :) >> >> > > * With >> MyRocks coming, should we drop TokuDB (and maybe even >> deprecate >> > > in 10.2?) - bugs that >> MariaDB Corporation reports to Percona don't >> > > seem to get fixed. Peter says that >> bugs are fixed for customers... and >> > >> > there is ongoing development to make it better >> > It is certainly disheartening that Percona >> isn't responsive to MariaDB >> > bugs, >> but I'm sure you understand that it is hard for a >> competitor to fix >> > bugs for another >> competitor. MariaDB maintains a forked version of >> > TokuDB. It isn't fair to expect the >> upstream vendor to fix bugs that your >> > >> customers are paying you to support, does it? Perhaps you >> should pay a >> > percentage of your support >> fees for TokuDB issues to Percona, or come to >> > some other support agreement. Perhaps >> YOU should make the bug fixes and >> > >> submit them to Percona. >> >> And >> we do, look for bug reports I've reported on TokuDB to >> Percona - >> with patches :) >> >> But, really, Percona *is* >> fixing TokuDB bugs, it's just that they did a >> bit of refactoring after getting TokuDB and it >> took time for it to >> stabilize. And MariaDB >> Server has a vanilla TokuDB with almost no >> changes, we have no plans to fork it. >> >> > > * ORDER BY LIMIT >> optimizer bugs (MDEV-8306) >> > Oh god, this >> is a downward spiral every time somebody touches it. >> > Fixing it correctly requires rewriting the >> parser, something that >> > Oracle is >> undertaking. >> >> This >> didn't have much to do with the parser, and we've >> refactored that >> part of the parser in 10.2, >> so now this has nothing to do with the >> parser, it's purely the optimizer issue. >> >> > What about other new >> MySQL 8 features? Are you getting rid of the .FRM >> > nightmare? Are you going to support SET >> PERSIST? etc? >> >> FRM was >> made purely optional in 10.0 - every storage engine decides >> for >> itself whether it uses FRMs or not. May >> be InnoDB will use FRMs in 10.3, >> may be it >> will not. MyISAM most probably will continue use them. >> >> SET PERSIST - and this is my >> personal opinion - this needs to be thought >> over *very* carefully. There have been quite a >> few security >> vulnerabilities with my.cnf >> stored in the datadir, and that's why since >> 2005 the server no longer reads my.cnf in the >> datadir. But mysql_safe >> still does - and >> there have been new security vulnerabilities *last >> month*, caused by my.cnf in the datadir. And >> finally both MySQL (in >> 5.7?) and MariaDB (in >> 10.2) stopped reading my.cnf in the datadir for >> real. So, having my security@mariadb.org >> hat on, I look at SET PERSIST >> with a lot of >> suspicion, because it's nothing else than another >> attempt >> of storing server configuration >> information in the datadir and have it >> writable by the server itself. >> >> Regards, >> Sergei >> Chief Architect >> MariaDB >> and security@mariadb.org >> >> _______________________________________________ >> Mailing list: https://launchpad.net/~maria-discuss >> Post to : maria-discuss@lists.launchpad.net >> Unsubscribe : https://launchpad.net/~maria-discuss >> More help : https://help.launchpad.net/ListHelp >> >> >> _______________________________________________ >> Mailing list: https://launchpad.net/~maria-discuss >> Post to : maria-discuss@lists.launchpad.net >> Unsubscribe : https://launchpad.net/~maria-discuss >> More help : https://help.launchpad.net/ListHelp >> >