From: John Hubbard <jhubbard@nvidia.com>
To: 胡玮文 <huww98@outlook.com>, "Minchan Kim" <minchan@kernel.org>
Cc: 胡玮文 <sehuww@mail.scut.edu.cn>,
"Andrew Morton" <akpm@linux-foundation.org>,
linux-mm <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
"Michal Hocko" <mhocko@suse.com>,
"David Hildenbrand" <david@redhat.com>,
"Suren Baghdasaryan" <surenb@google.com>,
"John Dias" <joaodias@google.com>
Subject: Re: [RFC] mm: introduce page pinner
Date: Sat, 11 Dec 2021 23:21:33 -0800 [thread overview]
Message-ID: <782d631e-6610-c971-5800-66fbba41b425@nvidia.com> (raw)
In-Reply-To: <TYCP286MB2066003B2045AD6ABE4C0295C0719@TYCP286MB2066.JPNP286.PROD.OUTLOOK.COM>
On 12/10/21 01:54, 胡玮文 wrote:
...
>> So you are suspecting some kernel driver holds an additional refcount
>> via get_user_pages() or a page get API?
>
> Yes. Using the trace events in this patch, I have confirmed that it is the
> nvidia kernel module that holds the refcount. I got a stack trace like this
> (from "perf script"):
>
> cuda-EvtHandlr 31023 [000] 3244.976411: page_pinner:page_pinner_put: pfn=0x13e473 flags=0x8001e count=0 mapcount=0 mapping=(nil) mt=1
> ffffffff82511be4 __page_pinner_put+0x54 (/lib/modules/5.15.6+/build/vmlinux)
> ffffffff82511be4 __page_pinner_put+0x54 (/lib/modules/5.15.6+/build/vmlinux)
> ffffffffc0b71e1f os_unlock_user_pages+0xbf ([nvidia])
The counterpart of os_unlock_user_pages() is os_lock_user_pages(), and
os_lock_user_pages() does call get_user_pages().
This is part of normal operation for many CUDA (and OpenCL) programs: "host memory"
(host == CPU, device == GPU) is pinned, and GPU page tables are set up to point to
it. If your program calls cudaHostRegister() [1], that will in turn work its way
down to os_lock_user_pages(), and if the program is still running on the GPU, it is
reasonable for those pages to still be pinned. This is a very common pattern for
some programs, especially those that have tuned their access patterns and know that
most accesses come from the CPU side, with only occasional access from the GPU.
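In case the user-space side of that pattern is helpful to see, here is a minimal
sketch (not taken from the reporter's program; the buffer size and flags are
illustrative). The cudaHostRegister() call is what eventually reaches
os_lock_user_pages() and get_user_pages(), and the page references persist until
cudaHostUnregister() runs:

```c
/* Sketch only: requires the CUDA toolkit and a GPU to build and run.
 * Error handling is abbreviated for brevity. */
#include <cuda_runtime.h>
#include <stdlib.h>

int main(void)
{
    size_t len = 1 << 20;            /* 1 MiB of ordinary host memory */
    void *buf = malloc(len);

    /* Pins the pages and maps them for device access. On Linux the
     * driver takes page references (get_user_pages) that it holds
     * until the matching unregister call. */
    cudaHostRegister(buf, len, cudaHostRegisterDefault);

    /* ... GPU kernels may read/write buf directly while it is pinned ... */

    /* Drops the pin (os_unlock_user_pages() on the kernel side). */
    cudaHostUnregister(buf);
    free(buf);
    return 0;
}
```

So a long-lived refcount from [nvidia] on a registered buffer is expected as long
as the application has not yet called cudaHostUnregister() or exited.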
> ffffffffc14a4546 _nv032165rm+0x96 ([nvidia])
>
> Still not much information. NVIDIA does not want me to debug its module. Maybe
> the only thing I can do is to report this to NVIDIA.
>
...or hope that someone here, maybe even from NVIDIA, can help! :)
Let me know if there are further questions, and if they are outside of the linux-mm
area, we can take it up in an off-list email thread if necessary.
[1] cudaHostRegister():
https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY.html#group__CUDART__MEMORY_1ge8d5c17670f16ac4fc8fcb4181cb490c
thanks,
--
John Hubbard
NVIDIA
Thread overview: 10+ messages (latest: 2021-12-12 7:21 UTC)
2021-12-06 18:47 Minchan Kim
2021-12-08 11:54 ` 胡玮文
2021-12-08 14:24 ` Matthew Wilcox
2021-12-08 18:42 ` Minchan Kim
2021-12-10 9:54 ` 胡玮文
2021-12-12 7:21 ` John Hubbard [this message]
2021-12-13 22:58 ` Minchan Kim
2021-12-13 1:56 ` John Hubbard
2021-12-13 23:10 ` Minchan Kim
2021-12-14 18:27 ` John Hubbard