On Tue, Jun 11, 2013 at 9:17 AM, Anton Vorontsov wrote:
> On Mon, Jun 10, 2013 at 05:12:58PM +0200, Michal Hocko wrote:
> > > +		if (level >= ev->level && level != vmpr->current_level) {
> > >  			eventfd_signal(ev->efd, 1);
> > >  			signalled = true;
> > > +			vmpr->current_level = level;
> >
> > This would mean that you send a signal for, say, VMPRESSURE_LOW, then
> > the reclaim finishes and two days later when you hit the reclaim again
> > you would simply miss the event, right?
> >
> > So, unless I am missing something, then this is plain wrong.
>
> Yup, in its current version, it is not acceptable. For example, sometimes
> we do want to see all the _LOW events, since the _LOW level shows not just
> the level itself, but the activity (i.e. the reclaiming process).
>
> There are a few ways to make both parties happy, though.
>
> If the app wants to implement time-based throttling, then just close
> the fd and sleep for the needed amount of time (or do not read from the
> eventfd -- the kernel will then just increment the eventfd counter, so
> there won't be context switches at the least). Doing the time-based
> throttling in the kernel won't buy us much, I believe.
>
> Or, if you still want the "one-shot"/"edge-triggered" events (which might
> make perfect sense for medium and critical levels), then I'd propose to
> add some additional flag when you register the event, so that the old
> behaviour would still be available for those who need it. This approach,
> I think, is the best one.
>

OK, we will prepare it this way and resend it.

Thank you,
Kyungmin Park
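
For reference, the two options discussed above can be sketched in userspace.
This is only an illustration, not the kernel patch: `struct listener`,
`notify()`, `drain()` and the `edge_triggered` field are hypothetical names
made up for this sketch, and the eventfd is assumed to have been created
with `eventfd(0, EFD_NONBLOCK)`. Only `eventfd(2)`, `read(2)` and `write(2)`
are real interfaces here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical names for illustration; not the kernel's vmpressure API. */
enum level { LEVEL_LOW, LEVEL_MEDIUM, LEVEL_CRITICAL };

struct listener {
	int efd;              /* eventfd the listener registered */
	enum level level;     /* minimum level the listener asked for */
	bool edge_triggered;  /* hypothetical opt-in flag set at registration */
	bool has_last;        /* false until the first event has been sent */
	enum level last;      /* level of the most recent event sent */
};

/* Signal the listener; returns true if an event was actually sent. */
static bool notify(struct listener *l, enum level now)
{
	uint64_t one = 1;

	if (now < l->level)
		return false;
	/* Edge-triggered listeners only hear about level changes. */
	if (l->edge_triggered && l->has_last && l->last == now)
		return false;
	/* Writing adds 1 to the eventfd counter, as eventfd_signal(efd, 1) would. */
	if (write(l->efd, &one, sizeof(one)) != (ssize_t)sizeof(one))
		return false;
	l->has_last = true;
	l->last = now;
	return true;
}

/* Read and reset the eventfd counter; returns 0 if nothing is pending. */
static uint64_t drain(int efd)
{
	uint64_t count = 0;

	if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return 0;
	return count;
}
```

A listener that leaves `edge_triggered` unset keeps the old behaviour: every
qualifying event bumps the counter, so an app that sleeps instead of reading
finds the accumulated count with a single `drain()` once it wakes up. A
listener that sets the flag gets one event per level change. Note that in
this sketch the remembered level is never reset, so an edge-triggered
listener would still miss a repeat of the same level after a long quiet
period; that is the trade-off such a listener would be opting into.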