* [Question] get_vma_policy() isn't compatible with {pin, get}_user_pages_remote
@ 2025-07-08 1:21 Jinjiang Tu
2025-07-08 1:53 ` Huang, Ying
0 siblings, 1 reply; 5+ messages in thread
From: Jinjiang Tu @ 2025-07-08 1:21 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, jgg, jhubbard, Peter Xu,
Zi Yan, matthew.brost, joshua.hahnjy, rakie.kim, byungchul,
gourry, ying.huang, apopple
Cc: linux-mm, Kefeng Wang, tujinjiang

get_vma_policy() returns the mempolicy for a vma. If the vma has a
mempolicy set, that policy is returned. Otherwise, get_task_policy(current)
is called to get the mempolicy of the current task. However, this isn't
reasonable for the pin_user_pages_remote() and get_user_pages_remote()
cases.

Assume task A calls pin_user_pages_remote() to pin user pages from task B.
If [start, start + nr_pages) isn't populated with pages, handle_mm_fault()
will be called by task A. However, if the vma doesn't have a memory policy
set, the mempolicy of task A instead of task B is used for the allocation.
That seems unreasonable. See dequeue_hugetlb_folio_vma()->huge_node().
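
The fallback in question looks roughly like this (a simplified sketch of
mm/mempolicy.c, not the exact mainline code; the signature and helpers
vary across kernel versions):

	struct mempolicy *get_vma_policy(struct vm_area_struct *vma,
					 unsigned long addr)
	{
		/* Per-VMA policy (mbind()/shared policy), if any. */
		struct mempolicy *pol = __get_vma_policy(vma, addr);

		if (!pol)
			/*
			 * No VMA policy: fall back to the policy of the
			 * faulting task, i.e. current.  For a remote
			 * pin/get this is task A, not the mm owner B.
			 */
			pol = get_task_policy(current);

		return pol;
	}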

We can only obtain the mm in get_vma_policy(); we can't get the task,
since an mm can be associated with multiple tasks (threads) and task
mempolicy is per-thread.

Is this situation reasonable? And if not, how could we fix it?

Thanks.

^ permalink raw reply [flat|nested] 5+ messages in thread

* Re: [Question] get_vma_policy() isn't compatible with {pin, get}_user_pages_remote
  2025-07-08 1:21 [Question] get_vma_policy() isn't compatible with {pin, get}_user_pages_remote Jinjiang Tu
@ 2025-07-08 1:53 ` Huang, Ying
  2025-07-08 2:51 ` Jinjiang Tu
  0 siblings, 1 reply; 5+ messages in thread
From: Huang, Ying @ 2025-07-08 1:53 UTC (permalink / raw)
To: Jinjiang Tu
Cc: Andrew Morton, David Hildenbrand, jgg, jhubbard, Peter Xu,
    Zi Yan, matthew.brost, joshua.hahnjy, rakie.kim, byungchul,
    gourry, apopple, linux-mm, Kefeng Wang

Hi, Jinjiang,

Jinjiang Tu <tujinjiang@huawei.com> writes:

> get_vma_policy() returns the mempolicy for a vma. If the vma has a
> mempolicy set, that policy is returned. Otherwise, get_task_policy(current)
> is called to get the mempolicy of the current task. However, this isn't
> reasonable for the pin_user_pages_remote() and get_user_pages_remote()
> cases.
>
> Assume task A calls pin_user_pages_remote() to pin user pages from task B.
> If [start, start + nr_pages) isn't populated with pages, handle_mm_fault()
> will be called by task A. However, if the vma doesn't have a memory policy
> set, the mempolicy of task A instead of task B is used for the allocation.
> That seems unreasonable. See dequeue_hugetlb_folio_vma()->huge_node().
>
> We can only obtain the mm in get_vma_policy(); we can't get the task,
> since an mm can be associated with multiple tasks (threads) and task
> mempolicy is per-thread.
>
> Is this situation reasonable? And if not, how could we fix it?

Yes. This sounds like an issue in theory, and it would be hard to
resolve, if it can be resolved at all. Please take a look at the
get_user_pages_remote() usage in exec().

Do you have a practical issue with pin/get_user_pages_remote()?

---
Best Regards,
Huang, Ying

^ permalink raw reply [flat|nested] 5+ messages in thread
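
For reference, the exec() usage referred to above is get_arg_page() in
fs/exec.c, which pins the argument/stack pages of bprm->mm. A trimmed
sketch (argument lists have changed across kernel versions):

	static struct page *get_arg_page(struct linux_binprm *bprm,
					 unsigned long pos, int write)
	{
		unsigned int gup_flags = write ? FOLL_WRITE : 0;
		struct mm_struct *mm = bprm->mm;
		struct page *page;
		int ret;

		/*
		 * We are doing an exec(): 'current' is the task doing the
		 * exec and 'mm' is the new image's mm, not yet installed
		 * as current->mm.  The pin is "remote" only in the mm
		 * sense; the task (and its task mempolicy) is still
		 * current.
		 */
		mmap_read_lock(mm);
		ret = get_user_pages_remote(mm, pos, 1, gup_flags,
					    &page, NULL);
		mmap_read_unlock(mm);

		return ret <= 0 ? NULL : page;
	}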

* Re: [Question] get_vma_policy() isn't compatible with {pin, get}_user_pages_remote
  2025-07-08 1:53 ` Huang, Ying
@ 2025-07-08 2:51 ` Jinjiang Tu
  2025-07-08 3:05 ` Huang, Ying
  0 siblings, 1 reply; 5+ messages in thread
From: Jinjiang Tu @ 2025-07-08 2:51 UTC (permalink / raw)
To: Huang, Ying
Cc: Andrew Morton, David Hildenbrand, jgg, jhubbard, Peter Xu,
    Zi Yan, matthew.brost, joshua.hahnjy, rakie.kim, byungchul,
    gourry, apopple, linux-mm, Kefeng Wang

On 2025/7/8 9:53, Huang, Ying wrote:
> Hi, Jinjiang,
>
> Jinjiang Tu <tujinjiang@huawei.com> writes:
>
>> get_vma_policy() returns the mempolicy for a vma. If the vma has a
>> mempolicy set, that policy is returned. Otherwise, get_task_policy(current)
>> is called to get the mempolicy of the current task. However, this isn't
>> reasonable for the pin_user_pages_remote() and get_user_pages_remote()
>> cases.
>>
>> Assume task A calls pin_user_pages_remote() to pin user pages from task B.
>> If [start, start + nr_pages) isn't populated with pages, handle_mm_fault()
>> will be called by task A. However, if the vma doesn't have a memory policy
>> set, the mempolicy of task A instead of task B is used for the allocation.
>> That seems unreasonable. See dequeue_hugetlb_folio_vma()->huge_node().
>>
>> We can only obtain the mm in get_vma_policy(); we can't get the task,
>> since an mm can be associated with multiple tasks (threads) and task
>> mempolicy is per-thread.
>>
>> Is this situation reasonable? And if not, how could we fix it?
> Yes. This sounds like an issue in theory, and it would be hard to
> resolve, if it can be resolved at all. Please take a look at the
> get_user_pages_remote() usage in exec().

IIUC, exec() replaces current->mm with a new mm while the task_struct is
unchanged, so the task mempolicy stays the same and it is reasonable for
exec() to use get_user_pages_remote().

> Do you have a practical issue with pin/get_user_pages_remote()?

Yes, I have a driver that calls pin_user_pages_remote() on another task's
mm.

> ---
> Best Regards,
> Huang, Ying

^ permalink raw reply [flat|nested] 5+ messages in thread
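
A minimal sketch of that kind of remote pin, with hypothetical names
(pin_target_pages() and its caller-provided target_mm are illustration
only; flags and signatures vary across kernel versions):

	/*
	 * Pin nr_pages of another process's address space.  The caller
	 * holds a reference on target_mm (e.g. via get_task_mm()).  Any
	 * fault taken here runs in *this* task's context, so a VMA
	 * without its own mempolicy allocates according to the pinner's
	 * task policy, not the target task's.
	 */
	static int pin_target_pages(struct mm_struct *target_mm,
				    unsigned long start,
				    unsigned long nr_pages,
				    struct page **pages)
	{
		long pinned;

		mmap_read_lock(target_mm);
		/* FOLL_LONGTERM is typically added for long-lived pins. */
		pinned = pin_user_pages_remote(target_mm, start, nr_pages,
					       FOLL_WRITE, pages, NULL);
		mmap_read_unlock(target_mm);

		if (pinned < 0)
			return pinned;
		if (pinned != nr_pages) {
			unpin_user_pages(pages, pinned);
			return -EFAULT;
		}
		return 0;
	}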

* Re: [Question] get_vma_policy() isn't compatible with {pin, get}_user_pages_remote
  2025-07-08 2:51 ` Jinjiang Tu
@ 2025-07-08 3:05 ` Huang, Ying
  2025-07-09 4:25 ` Jinjiang Tu
  0 siblings, 1 reply; 5+ messages in thread
From: Huang, Ying @ 2025-07-08 3:05 UTC (permalink / raw)
To: Jinjiang Tu
Cc: Andrew Morton, David Hildenbrand, jgg, jhubbard, Peter Xu,
    Zi Yan, matthew.brost, joshua.hahnjy, rakie.kim, byungchul,
    gourry, apopple, linux-mm, Kefeng Wang

Jinjiang Tu <tujinjiang@huawei.com> writes:

> On 2025/7/8 9:53, Huang, Ying wrote:
>> Hi, Jinjiang,
>>
>> Jinjiang Tu <tujinjiang@huawei.com> writes:
>>
>>> get_vma_policy() returns the mempolicy for a vma. If the vma has a
>>> mempolicy set, that policy is returned. Otherwise, get_task_policy(current)
>>> is called to get the mempolicy of the current task. However, this isn't
>>> reasonable for the pin_user_pages_remote() and get_user_pages_remote()
>>> cases.
>>>
>>> Assume task A calls pin_user_pages_remote() to pin user pages from task B.
>>> If [start, start + nr_pages) isn't populated with pages, handle_mm_fault()
>>> will be called by task A. However, if the vma doesn't have a memory policy
>>> set, the mempolicy of task A instead of task B is used for the allocation.
>>> That seems unreasonable. See dequeue_hugetlb_folio_vma()->huge_node().
>>>
>>> We can only obtain the mm in get_vma_policy(); we can't get the task,
>>> since an mm can be associated with multiple tasks (threads) and task
>>> mempolicy is per-thread.
>>>
>>> Is this situation reasonable? And if not, how could we fix it?
>> Yes. This sounds like an issue in theory, and it would be hard to
>> resolve, if it can be resolved at all. Please take a look at the
>> get_user_pages_remote() usage in exec().
>
> IIUC, exec() replaces current->mm with a new mm while the task_struct is
> unchanged, so the task mempolicy stays the same and it is reasonable for
> exec() to use get_user_pages_remote().
>
>> Do you have a practical issue with pin/get_user_pages_remote()?
>
> Yes, I have a driver that calls pin_user_pages_remote() on another task's
> mm.

Please give more details of your issue to help us understand it. For
example, why can't you use pin_user_pages()?

---
Best Regards,
Huang, Ying

^ permalink raw reply [flat|nested] 5+ messages in thread

* Re: [Question] get_vma_policy() isn't compatible with {pin, get}_user_pages_remote
  2025-07-08 3:05 ` Huang, Ying
@ 2025-07-09 4:25 ` Jinjiang Tu
  0 siblings, 0 replies; 5+ messages in thread
From: Jinjiang Tu @ 2025-07-09 4:25 UTC (permalink / raw)
To: Huang, Ying
Cc: Andrew Morton, David Hildenbrand, jgg, jhubbard, Peter Xu,
    Zi Yan, matthew.brost, joshua.hahnjy, rakie.kim, byungchul,
    gourry, apopple, linux-mm, Kefeng Wang, liruilin4

On 2025/7/8 11:05, Huang, Ying wrote:
> Jinjiang Tu <tujinjiang@huawei.com> writes:
>
>> On 2025/7/8 9:53, Huang, Ying wrote:
>>> Hi, Jinjiang,
>>>
>>> Jinjiang Tu <tujinjiang@huawei.com> writes:
>>>
>>>> get_vma_policy() returns the mempolicy for a vma. If the vma has a
>>>> mempolicy set, that policy is returned. Otherwise, get_task_policy(current)
>>>> is called to get the mempolicy of the current task. However, this isn't
>>>> reasonable for the pin_user_pages_remote() and get_user_pages_remote()
>>>> cases.
>>>>
>>>> Assume task A calls pin_user_pages_remote() to pin user pages from task B.
>>>> If [start, start + nr_pages) isn't populated with pages, handle_mm_fault()
>>>> will be called by task A. However, if the vma doesn't have a memory policy
>>>> set, the mempolicy of task A instead of task B is used for the allocation.
>>>> That seems unreasonable. See dequeue_hugetlb_folio_vma()->huge_node().
>>>>
>>>> We can only obtain the mm in get_vma_policy(); we can't get the task,
>>>> since an mm can be associated with multiple tasks (threads) and task
>>>> mempolicy is per-thread.
>>>>
>>>> Is this situation reasonable? And if not, how could we fix it?
>>> Yes. This sounds like an issue in theory, and it would be hard to
>>> resolve, if it can be resolved at all. Please take a look at the
>>> get_user_pages_remote() usage in exec().
>>
>> IIUC, exec() replaces current->mm with a new mm while the task_struct is
>> unchanged, so the task mempolicy stays the same and it is reasonable for
>> exec() to use get_user_pages_remote().
>>
>>> Do you have a practical issue with pin/get_user_pages_remote()?
>>
>> Yes, I have a driver that calls pin_user_pages_remote() on another task's
>> mm.
> Please give more details of your issue to help us understand it. For
> example, why can't you use pin_user_pages()?

+CC Ruilin, who understands the usage scenario better.

Ruilin, could you please explain why we can't use pin_user_pages()?

> ---
> Best Regards,
> Huang, Ying

^ permalink raw reply [flat|nested] 5+ messages in thread

Thread overview: 5+ messages
2025-07-08 1:21 [Question] get_vma_policy() isn't compatible with {pin, get}_user_pages_remote Jinjiang Tu
2025-07-08 1:53 ` Huang, Ying
2025-07-08 2:51 ` Jinjiang Tu
2025-07-08 3:05 ` Huang, Ying
2025-07-09 4:25 ` Jinjiang Tu