Hi Kentoku,
Thank you, I will certainly look into Oniguruma. :)
On Sat, Apr 20, 2013 at 11:26 PM, kentoku <kentokushiba@gmail.com> wrote:
Hi Sudheera and Sergei,
In case I missed some libraries, I hope you will point them out so I can study them too. Considering the requirements, I didn't see Asian multi-byte support implemented anywhere. What should we do about that?
Do you know "oniguruma"? http://www.geocities.jp/kosako3/oniguruma/ http://en.wikipedia.org/wiki/Oniguruma
Oniguruma is a regular expression library that supports multi-byte character sets like big5, euc-kr and shift_jis. Oniguruma is used by "mregexp", a regex UDF for MySQL with multi-byte support, so I think looking at mregexp will make it easy to understand how to use Oniguruma.
Thanks, Kentoku
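To illustrate why multi-byte awareness matters for a charset such as shift_jis, here is a minimal Python sketch. It does not use Oniguruma or mregexp; the helper names (byte_level_hits, char_level_hits) are made up for this illustration. It relies on the fact that katakana "ソ" (U+30BD) is encoded in shift_jis as the two bytes 0x83 0x5C, where the second byte equals an ASCII backslash, so a matcher that looks at raw bytes can report a backslash that is not really there.

```python
import re

ENCODING = "shift_jis"   # one of the charsets named in the thread
KATAKANA_SO = "\u30bd"   # "ソ"; its shift_jis bytes are 0x83 0x5C


def byte_level_hits(data, needle):
    """Search raw bytes, the way a charset-unaware matcher would (hypothetical helper)."""
    return [m.start() for m in re.finditer(re.escape(needle), data)]


def char_level_hits(data, needle, encoding=ENCODING):
    """Decode to characters first, then search (hypothetical helper)."""
    text = data.decode(encoding)
    return [m.start() for m in re.finditer(re.escape(needle), text)]


encoded = KATAKANA_SO.encode(ENCODING)

# Byte-level search finds a "backslash" inside the two-byte character.
false_hits = byte_level_hits(encoded, b"\\")

# Character-level search correctly finds no backslash at all.
true_hits = char_level_hits(encoded, "\\")

print("byte-level hits:", false_hits)
print("char-level hits:", true_hits)
```

A library with real multi-byte support avoids the false hit by respecting character boundaries, which is the property being asked for in the requirements.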
2013/4/20 Sudheera Palihakkara <catchsudheera@gmail.com>
Hello Sir,
I've been working on this project for the past couple of days. I found that there are a few good regex libraries suitable for this task. Considering the requirements, I think PCRE, ICU regex and RGX would do the job. ICU regex doesn't have recursion, but its code is well documented and easy to understand. Currently I think PCRE is the best option we have.
In case I missed some libraries, I hope you will point them out so I can study them too. Considering the requirements, I didn't see Asian multi-byte support implemented anywhere. What should we do about that?
On the google-melange page, under the application template, there is a field called "Project description". What should I include there? I mean, do you expect a full description of the project including figures, or just a brief one like those on the project ideas page?
Thank you.
On Fri, Apr 19, 2013 at 3:46 PM, Sergei Golubchik <serg@askmonty.org> wrote:
Hi, Sudheera!
On Apr 19, Sudheera Palihakkara wrote:
Hi, I went through the other threads on this topic. In one thread you mentioned choosing a suitable regex library:
(Preliminary research - only about choosing a regex library to use in MariaDB. You should be able to explain why we should use this library and not some other one.)
What do you mean by "choosing"? Don't we have to enhance the existing regex library? Or do we choose from existing, already implemented libraries that are free to use? Sorry if it's a stupid question, but I'm confused. :O
Enhancing our old regex library to support all modern features and multiple charsets is complex and bug-prone work.
I don't see why we should bother doing it, when there are plenty of regex libraries available.
There's PHP's mb_regex, there's PCRE, and many others too. We'd better just pick the one that works best for MariaDB and use it in place of Henry Spencer's library.
Regards, Sergei
P.S. Please, don't reply to me only, use reply-to-all, so that your mails appear on the mailing list.
-- *Sudheera Palihakkara.* Undergraduate Department of *Computer Science and Engineering, *Faculty of Engineering, *University of Moratuwa*, Sri Lanka.
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp