Hi Vladislav,

Thanks for your analysis!

Étienne is back home now. Let me summarize what he has found:
  - Yes, pollset() is far from perfect. We have warned the IBM teams about this and they plan to provide improvements at some point in the future. The current version of pollset() is a damned pain, yes.
  - poll() is not the perfect tool we need either.
  - IOCP on AIX is incomplete compared to what Windows provides, and it does not seem to be usable (I have been told that this AIX version of IOCP was designed years ago for Oracle's needs only).

So, that looks bad. Yes.
Let's wait until tomorrow, when Étienne can provide more details.

Regards,

Tony Reix

tony.reix@atos.net

ATOS / Bull SAS
ATOS Expert
IBM Coop Architect & Technical Leader
Office : +33 (0) 4 76 29 72 67
1 rue de Provence - 38432 Échirolles - France

From: Maria-developers <maria-developers-bounces+tony.reix=atos.net@lists.launchpad.net> on behalf of Vladislav Vaintroub <vvaintroub@gmail.com>
Sent: Tuesday, 17 September 2019 18:04
To: GUESNET, ETIENNE (ext) <etienne.guesnet.external@atos.net>; maria-developers@lists.launchpad.net <maria-developers@lists.launchpad.net>
Subject: Re: [Maria-developers] Implementation of Threadpool
 

I have now read the docs for the AIX pollset API, and it does not look easy to use for the threadpool.

https://www.ibm.com/support/knowledgecenter/en/ssw_aix_71/p_bostechref/pollset.html

 

The main hurdle is that there is no “one-shot” behaviour, so you would need to pollset_ctl(PS_DELETE) each returned descriptor and pollset_ctl(PS_ADD) it again later. That would be an acceptable workaround, but a very surprising behaviour of pollset is that pollset_ctl() waits for pollset_poll() to complete, which really makes it no better than a normal poll(). I have no idea what the IBM folks were thinking of when they designed this API.

 

As for normal poll(2):

The problem with it is that if one does not immediately drain the data from a readable socket, poll() will busy-loop, returning the same fds again and again.

It is possible to fix the busy loop by clearing the pollfd.events flags, but then we need to reset pollfd.events soon afterwards, once threadpool_process_request() has finished. For that, we would need to somehow interrupt poll(), modify the fds array, and issue poll() again. There will be a lot of interrupts, I'm afraid. I foresee a busy loop either way, which makes poll(2) not a very good option, unless you can figure out how to drain all the data from a socket immediately after it becomes readable.
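To make the level-triggered behaviour concrete, here is a minimal sketch in C. It assumes a POSIX system and uses a pipe as a stand-in for a client socket; the function names are made up for this illustration and are not part of MariaDB. It counts how often poll() reports the descriptor in three situations: data left undrained, pollfd.events cleared, and data drained.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Count how many of `rounds` consecutive non-blocking poll() calls
   report POLLIN for the single descriptor in *pfd. */
static int count_readable_reports(struct pollfd *pfd, int rounds)
{
    int hits = 0;
    assert(rounds > 0);
    for (int i = 0; i < rounds; i++) {
        pfd->revents = 0;
        if (poll(pfd, 1, 0) == 1 && (pfd->revents & POLLIN))
            hits++;
    }
    return hits;
}

/* Hypothetical demo. Fills out[0..2] with the number of POLLIN reports
   (a) while the data is left unread,
   (b) while pollfd.events is cleared,
   (c) after the data has been drained.
   Returns 0 on success, -1 if a system call failed. */
int poll_level_trigger_demo(int out[3])
{
    int p[2];
    int rc = 0;
    char c;

    if (pipe(p) != 0)
        return -1;
    if (write(p[1], "x", 1) != 1)
        rc = -1;

    struct pollfd pfd;
    pfd.fd = p[0];
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* (a) Undrained data: every poll() call returns the same fd again. */
    out[0] = count_readable_reports(&pfd, 3);

    /* (b) Clearing events silences the fd, but someone has to set it
       back later and wake up the thread blocked in poll(). */
    pfd.events = 0;
    out[1] = count_readable_reports(&pfd, 3);

    /* (c) Draining the data is the other way to stop the reports. */
    pfd.events = POLLIN;
    if (read(p[0], &c, 1) != 1)
        rc = -1;
    out[2] = count_readable_reports(&pfd, 3);

    close(p[0]);
    close(p[1]);
    return rc;
}
```

Calling poll_level_trigger_demo() with an int[3] array should leave 3, 0 and 0 in it on a typical POSIX system. The sketch does not show the costly part discussed above, namely interrupting a poll() that is blocked in another thread in order to restore pollfd.events.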

 

IOCP really sounds like a better API, if you can use it.

 

From: Vladislav Vaintroub
Sent: Tuesday, 17 September 2019 15:20
To: GUESNET, ETIENNE (ext); maria-developers@lists.launchpad.net
Subject: RE: [Maria-developers] Implementation of Threadpool

 

Hi Etienne,

The reason why there is no poll/select implementation is that the systems we support all have something better than poll/select that can be used instead. So yes, “no interest” would fit. The main factor is of course that we have no access to those commercial Unix distributions that have none of IOCP, epoll, kevent or ports.

 

The notification property that we want here is that, once some data coming from the client is available on a socket, the socket is returned and is taken out of the “poll set”. The socket returns to the “poll set” once the command (e.g. an SQL query, or anything else that the client-server protocol understands) is fully processed. We would like to avoid multiple notifications on a socket until the client command is fully processed; otherwise different threads from the pool will try to process the client's command at the same time.

This is the only thing we need, and this is something that poll/select do not provide.
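As an illustration of this property, here is a minimal sketch in C using Linux epoll with the EPOLLONESHOT flag. It is only an example of the semantics described above, not a copy of the MariaDB code; the function name is hypothetical and a pipe stands in for a client socket.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical demo. Fills out[0..2] with the epoll_wait() results
   (a) for the first notification,
   (b) while the fd is disarmed, although the data is still unread,
   (c) after the fd has been re-armed with EPOLL_CTL_MOD.
   Returns 0 on success, -1 if a system call failed. */
int epoll_oneshot_demo(int out[3])
{
    int p[2];
    int ep;
    int rc = 0;
    struct epoll_event ev, got;

    assert(out != NULL);
    if (pipe(p) != 0)
        return -1;
    ep = epoll_create1(0);
    if (ep < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.fd = p[0];
    if (epoll_ctl(ep, EPOLL_CTL_ADD, p[0], &ev) != 0)
        rc = -1;
    if (write(p[1], "x", 1) != 1)
        rc = -1;

    /* (a) The fd is reported once and thereby taken out of the set. */
    out[0] = epoll_wait(ep, &got, 1, 0);

    /* (b) No second notification, so no other pool thread can pick up
       the same client while the command is being processed. */
    out[1] = epoll_wait(ep, &got, 1, 0);

    /* (c) Command processed: put the fd back into the set. The unread
       byte is still pending, so it is reported again right away. */
    if (epoll_ctl(ep, EPOLL_CTL_MOD, p[0], &ev) != 0)
        rc = -1;
    out[2] = epoll_wait(ep, &got, 1, 0);

    close(ep);
    close(p[0]);
    close(p[1]);
    return rc;
}
```

On Linux, calling epoll_oneshot_demo() with an int[3] array should leave 1, 0 and 1 in it: one notification, silence while the descriptor is disarmed, and a new notification once it has been re-armed.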

 

As for Solaris ports:

The existing Solaris implementation is based on ports and, as far as I could test, it has this one-shot behaviour, which is what we need, i.e. port_get() only returns a single event, and then there must be a new port_associate() to re-enable the socket and return it to the “poll set”. The documentation confirms this observation:

“Objects of type PORT_SOURCE_FD are file descriptors. The event types for PORT_SOURCE_FD objects are described in poll(2). At most one event notification will be generated per associated file descriptor. For example, if a file descriptor is associated with a port for the POLLRDNORM event and data is available on the file descriptor at the time the port_associate() function is called, an event is immediately sent to the port. If data is not yet available, one event is sent to the port when data first becomes available.”

 

I read somewhere that IOCP exists on AIX, so I hoped the existing Windows code in threadpool_generic.cc would be of some help if someone eventually ports it to AIX. Admittedly, though, the Windows code uses the trick of reading zero bytes in ReadFile/WSARecv, which effectively turns asynchronous IO into a “poll-like” notification mode.

 

From: GUESNET, ETIENNE (ext)
Sent: Tuesday, 17 September 2019 14:13
To: maria-developers@lists.launchpad.net
Subject: [Maria-developers] Implementation of Threadpool

 

Hi,

I am implementing a threadpool system for AIX. The AIX equivalent of epoll / kqueue is pollset (and IOCP, but only as a partial implementation). However, pollset only has a level-triggered mode and MariaDB needs edge-triggered behaviour (see the comments in the sql/threadpool_generic.h file). Adding pollset support to MariaDB would be difficult, and probably not very efficient, as we would need to simulate the edge-triggered behaviour.

Obviously, AIX has poll and select support. MariaDB does not. Is there a reason not to implement the threadpool through poll or select? No interest? Performance issues?

MariaDB currently works on AIX without the threadpool; in terms of efficiency, do you know what could be gained by using the threadpool with poll/select or with a more modern solution?

As far as I know, the SunOS/Solaris/Illumos threadpool system (called “port”) is also level-triggered only, but I cannot find specific functions to manage this.

Thanks!

Etienne Guesnet.