* [Question] vmalloc latency in RT-Linux
@ 2022-06-21 12:15 Zhipeng Shi
2022-06-23 10:51 ` Baoquan He
0 siblings, 1 reply; 7+ messages in thread
From: Zhipeng Shi @ 2022-06-21 12:15 UTC (permalink / raw)
To: linux-mm, linux-rt-users; +Cc: tglx, shengjian.xu, schspa
I noticed that vmalloc has a large latency on RT-Linux. This is because
free_vmap_area_lock is held for a long time in the function
__purge_vmap_area_lazy().

On non-RT Linux, spin_is_contended() is properly implemented, so this
problem does not occur.

On RT-Linux, however, spin_is_contended() simply returns 0. I don't
understand why the function was implemented this way, but I have thought
of two ways to solve the problem.

The first is to modify the spin_is_contended() definition in spinlock_rt.h
as shown below, but I'm not sure whether the change has side effects:

-#define spin_is_contended(lock)	(((void)(lock), 0))
+static inline int spin_is_contended(spinlock_t *lock)
+{
+	unsigned long *p = (unsigned long *) &lock->lock.owner;
+
+	return (READ_ONCE(*p) & RT_MUTEX_HAS_WAITERS);
+}
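
To make the intent of the diff above concrete, here is a minimal userspace sketch of the same idea: the rtmutex owner word stores the owner's task pointer, and bit 0 (RT_MUTEX_HAS_WAITERS in the kernel) is set while other tasks are queued on the lock. Everything named model_* is hypothetical and exists only for illustration; this is not kernel code and says nothing about whether the proposed change is safe.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the role of the kernel's RT_MUTEX_HAS_WAITERS: bit 0 of the owner word. */
#define MODEL_HAS_WAITERS ((uintptr_t)1)

/* Hypothetical stand-in for the rtmutex; only the owner word is modeled. */
struct model_rtmutex {
	_Atomic uintptr_t owner;
};

/*
 * Record an owner and whether anyone is waiting. The task pointer must be
 * at least 2-byte aligned so that bit 0 is free to carry the flag.
 */
static inline void model_set_owner(struct model_rtmutex *m, const void *task,
				   bool has_waiters)
{
	uintptr_t val = (uintptr_t)task;

	if (has_waiters)
		val |= MODEL_HAS_WAITERS;
	atomic_store_explicit(&m->owner, val, memory_order_relaxed);
}

/* Same test as the proposed spin_is_contended(): one relaxed read, one mask. */
static inline bool model_is_contended(struct model_rtmutex *m)
{
	uintptr_t val = atomic_load_explicit(&m->owner, memory_order_relaxed);

	return (val & MODEL_HAS_WAITERS) != 0;
}
```

The check is a single unsynchronized read, so the answer is only a hint: a waiter can arrive or leave immediately afterwards. That is acceptable for a "should I drop the lock soon?" heuristic, which is how spin_is_contended() is used.
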

The second is to reduce lazy_max_pages, but that would lower vmalloc
performance.
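
For a sense of scale on the second option, the sketch below reproduces the arithmetic that lazy_max_pages() in mm/vmalloc.c used in kernels of this period, as far as I recall it: fls(num_online_cpus()) multiplied by the number of pages in 32 MiB. Treat the formula as an assumption and check it against your own tree; the model_* helpers are hypothetical userspace stand-ins.

```c
/* Userspace stand-in for the kernel's fls(): index of the highest set bit, 1-based. */
static unsigned int model_fls(unsigned int x)
{
	unsigned int r = 0;

	while (x) {
		r++;
		x >>= 1;
	}
	return r;
}

/*
 * Assumed formula: fls(online CPUs) * (32 MiB / PAGE_SIZE).
 * This is how many lazily freed pages may accumulate before a purge runs.
 */
static unsigned long model_lazy_max_pages(unsigned int online_cpus,
					  unsigned long page_size)
{
	return model_fls(online_cpus) * (32UL * 1024 * 1024 / page_size);
}
```

Under that assumption, a 4-CPU machine with 4 KiB pages can accumulate 3 * 8192 = 24576 pages (96 MiB) of lazily freed areas, all of which are then processed in one purge. Lowering the threshold shortens each purge but makes purges, and the TLB flushes that accompany them, more frequent.
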

Does anyone have a better idea?

Best regards,
Zhipeng
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Question] vmalloc latency in RT-Linux
2022-06-21 12:15 [Question] vmalloc latency in RT-Linux Zhipeng Shi
@ 2022-06-23 10:51 ` Baoquan He
2022-06-23 18:04 ` Waiman Long
0 siblings, 1 reply; 7+ messages in thread
From: Baoquan He @ 2022-06-23 10:51 UTC (permalink / raw)
To: Zhipeng Shi; +Cc: linux-mm, linux-rt-users, tglx, shengjian.xu, schspa, longman

On 06/21/22 at 08:15pm, Zhipeng Shi wrote:
> I noticed in rt-linux, vmalloc has a large latency. This is because the
> free_vmap_area_lock is held for a long time in the function
> __purge_vmap_area_lazy.
[...]
> The second is by reducing the number of lazy_max_pages, but it will lead
> to lower performance of vmalloc.

__purge_vmap_area_lazy() uses cond_resched_lock() to drop the lock and
reschedule. From what you describe, it is spin_is_contended() that fails
to trigger that rescheduling while __purge_vmap_area_lazy() is running.
In that case, the fix should be made on the lock side.

> Guys, Do you have any good ideas?

^ permalink raw reply [flat|nested] 7+ messages in thread
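
The lock-break behaviour discussed in the reply above can be illustrated with a small single-threaded model. As I understand it, cond_resched_lock() drops and retakes the lock when spin_needbreak() reports contention (which in turn relies on spin_is_contended()) or when a reschedule is pending. The sketch below only counts how long the lock would stay held; the names model_purge, purge_stats and the two predicates are hypothetical, and no real locking is performed.

```c
#include <stdbool.h>
#include <stddef.h>

struct purge_stats {
	size_t lock_breaks;	/* times the lock was dropped and retaken */
	size_t longest_hold;	/* most items handled in one uninterrupted hold */
};

/* Stand-in for spin_is_contended(): is someone waiting after item i? */
typedef bool (*contended_fn)(size_t item_index);

static struct purge_stats model_purge(size_t nr_items, contended_fn is_contended)
{
	struct purge_stats st = { 0, 0 };
	size_t hold = 0;

	/* The lock would be taken here. */
	for (size_t i = 0; i < nr_items; i++) {
		hold++;		/* one area freed while holding the lock */

		if (is_contended(i)) {
			/* Lock break: unlock, let the waiter run, lock again. */
			if (hold > st.longest_hold)
				st.longest_hold = hold;
			st.lock_breaks++;
			hold = 0;
		}
	}
	/* The lock would be released here. */
	if (hold > st.longest_hold)
		st.longest_hold = hold;
	return st;
}

/* Models the RT behaviour described in the thread: contention is never reported. */
static bool never_contended(size_t i)
{
	(void)i;
	return false;
}

/* Models a waiter showing up after every eighth item. */
static bool contended_every_8(size_t i)
{
	return (i % 8) == 7;
}
```

With never_contended the whole batch is processed in one hold, which corresponds to the long latency reported at the start of the thread. With a predicate that does report waiters, the longest hold is bounded by how often contention is observed rather than by the batch size.
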
* Re: [Question] vmalloc latency in RT-Linux
2022-06-23 10:51 ` Baoquan He
@ 2022-06-23 18:04 ` Waiman Long
2022-06-24 2:39 ` Baoquan He
0 siblings, 1 reply; 7+ messages in thread
From: Waiman Long @ 2022-06-23 18:04 UTC (permalink / raw)
To: Baoquan He, Zhipeng Shi
Cc: linux-mm, linux-rt-users, tglx, shengjian.xu, schspa, Sebastian Andrzej Siewior

On 6/23/22 06:51, Baoquan He wrote:
[...]
> __purge_vmap_area_lazy() has cond_resched_lock() to reschedule and drop
> the lock. From your saying, it's spin_is_contended() which is not
> working well to make rescheduling happen during __purge_vmap_area_lazy()
> handling. Then the fixing should be done in lock side.

Sebastian sent out a patch last year to fix spin_is_contended():

https://lore.kernel.org/lkml/20210906143004.2259141-3-bigeasy@linutronix.de/

However, there was no follow-up after some discussion, and the patch was
not merged.

Cheers,
Longman

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Question] vmalloc latency in RT-Linux
2022-06-23 18:04 ` Waiman Long
@ 2022-06-24 2:39 ` Baoquan He
2022-06-24 5:56 ` Zhipeng Shi
2022-06-24 6:46 ` Sebastian Andrzej Siewior
0 siblings, 2 replies; 7+ messages in thread
From: Baoquan He @ 2022-06-24 2:39 UTC (permalink / raw)
To: Waiman Long, Zhipeng Shi
Cc: linux-mm, linux-rt-users, tglx, shengjian.xu, schspa, Sebastian Andrzej Siewior, peterz

On 06/23/22 at 02:04pm, Waiman Long wrote:
[...]
> Sebastian had sent out patch last year to fix spin_is_contended().
>
> https://lore.kernel.org/lkml/20210906143004.2259141-3-bigeasy@linutronix.de/
>
> However, there is no follow-up after some discussion and the patch wasn't
> merged.

That's great. Thanks, Longman.

Then this is a good chance to reconsider it, maybe with a test from Zhipeng.

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Question] vmalloc latency in RT-Linux
2022-06-24 2:39 ` Baoquan He
@ 2022-06-24 5:56 ` Zhipeng Shi
2022-06-24 6:46 ` Sebastian Andrzej Siewior
1 sibling, 0 replies; 7+ messages in thread
From: Zhipeng Shi @ 2022-06-24 5:56 UTC (permalink / raw)
To: Baoquan He, Waiman Long
Cc: linux-mm, linux-rt-users, tglx, shengjian.xu, schspa, Sebastian Andrzej Siewior, peterz

On Fri, Jun 24, 2022 at 10:39:43AM +0800, Baoquan He wrote:
[...]
> That's great. Thanks, Longman.
>
> Then this is a good chance to reconsider it, maybe with a test from Zhipeng.

Before that, since I had not found the patch Sebastian sent earlier, I sent
my own patch for this problem, together with test scripts. (It now seems
that Sebastian's changes are better than mine.) Please see the following
link:

https://lore.kernel.org/lkml/20220608142457.GA2400218@ubuntu20/

With this patch, the maximum latency of vmalloc drops from 10+ msec to
200+ usec, because the spinlock is released partway through
__purge_vmap_area_lazy().

Best regards,
Zhipeng

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Question] vmalloc latency in RT-Linux
2022-06-24 2:39 ` Baoquan He
2022-06-24 5:56 ` Zhipeng Shi
@ 2022-06-24 6:46 ` Sebastian Andrzej Siewior
2022-06-25 2:27 ` Waiman Long
1 sibling, 1 reply; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2022-06-24 6:46 UTC (permalink / raw)
To: Baoquan He
Cc: Waiman Long, Zhipeng Shi, linux-mm, linux-rt-users, tglx, shengjian.xu, schspa, peterz

On 2022-06-24 10:39:43 [+0800], Baoquan He wrote:
> Then this is a good chance to reconsider it, maybe with a test from Zhipeng.

I reconsidered, and it was dropped on purpose; see

https://lore.kernel.org/lkml/YT80AB8%2FG59QBSVq@hirez.programming.kicks-ass.net/

Sebastian

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Question] vmalloc latency in RT-Linux
2022-06-24 6:46 ` Sebastian Andrzej Siewior
@ 2022-06-25 2:27 ` Waiman Long
0 siblings, 0 replies; 7+ messages in thread
From: Waiman Long @ 2022-06-25 2:27 UTC (permalink / raw)
To: Sebastian Andrzej Siewior, Baoquan He
Cc: Zhipeng Shi, linux-mm, linux-rt-users, tglx, shengjian.xu, schspa, peterz

On 6/24/22 02:46, Sebastian Andrzej Siewior wrote:
> On 2022-06-24 10:39:43 [+0800], Baoquan He wrote:
>> Then this is a good chance to reconsider it, maybe with a test from Zhipeng.
> I reconsidered and it was dropped purpose, see
> https://lore.kernel.org/lkml/YT80AB8%2FG59QBSVq@hirez.programming.kicks-ass.net/

I agree that is_contended may not be that useful for rwlock, but it can be
useful for spinlock. Would you consider a version just for rt_spinlock?

Cheers,
Longman

^ permalink raw reply [flat|nested] 7+ messages in thread