From: Prakash Sangappa <prakash.sangappa@oracle.com>
To: Steven Sistare <steven.sistare@oracle.com>,
Michal Hocko <mhocko@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
dave.hansen@intel.com, nao.horiguchi@gmail.com,
akpm@linux-foundation.org, kirill.shutemov@linux.intel.com,
khandual@linux.vnet.ibm.com
Subject: Re: [PATCH V2 0/6] VA to numa node information
Date: Fri, 14 Sep 2018 11:04:54 -0700
Message-ID: <a26a71cb-101b-e7a2-9a2f-78995538dbca@oracle.com>
In-Reply-To: <91988f05-2723-3120-5607-40fabe4a170d@oracle.com>
On 9/14/18 9:01 AM, Steven Sistare wrote:
> On 9/14/2018 1:56 AM, Michal Hocko wrote:
>> On Thu 13-09-18 15:32:25, prakash.sangappa wrote:
>>>
>>> The proc interface provides an efficient way to export address range
>>> to numa node id mapping information compared to using the API.
>> Do you have any numbers?
>>
>>> For example, for sparsely populated mappings, if a VMA has large portions
>>> that do not have any physical pages mapped, the page walk done thru the /proc file
>>> interface can skip over non existent PMDs / ptes. Whereas using the
>>> API the application would have to scan the entire VMA in page size units.
>> What prevents you from pre-filtering by reading /proc/$pid/maps to get
>> ranges of interest?
> That works for skipping holes, but not for skipping huge pages. I did a
> quick experiment to time move_pages on a 3 GHz Xeon and a 4.18 kernel.
> Allocate 128 GB and touch every small page. Call move_pages with nodes=NULL
> to get the node id for all pages, passing 512 consecutive small pages per
> call to move_pages. The total move_pages time is 1.85 secs, or 55 nsec
> per page. Extrapolating to a 1 TB range, it would take 15 sec to retrieve
> the numa node for every small page in the range. That is not terrible, but
> it is not interactive, and it becomes terrible for multiple TB.
>
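For reference, the extrapolation quoted above can be reproduced with a few
lines of arithmetic. This is only a sketch: it assumes 4 KiB small pages and
binary units (GiB/TiB), neither of which is stated in the message, and the
function names are made up for illustration.

```python
# Back-of-the-envelope check of the move_pages timing quoted above.
# Assumptions (not stated in the thread): 4 KiB small pages, binary units.

PAGE_SIZE = 4 * 1024   # assumed small-page size in bytes
GIB = 1024 ** 3
TIB = 1024 ** 4


def pages_in(nbytes, page_size=PAGE_SIZE):
    """Number of small pages covering nbytes (nbytes is page aligned here)."""
    return nbytes // page_size


def nsec_per_page(total_sec, nbytes):
    """Average cost per page, in nanoseconds, for a full scan of nbytes."""
    return total_sec * 1e9 / pages_in(nbytes)


def extrapolate_sec(per_page_nsec, nbytes):
    """Estimated scan time in seconds for nbytes at the given per-page cost."""
    return per_page_nsec * pages_in(nbytes) / 1e9


if __name__ == "__main__":
    # Measured figure from the quoted experiment: 1.85 s for 128 GB.
    per_page = nsec_per_page(1.85, 128 * GIB)
    print(f"{per_page:.1f} nsec per page")
    print(f"{extrapolate_sec(per_page, 1 * TIB):.1f} sec for 1 TB")
```

Under those assumptions the per-page cost comes out at roughly 55 nsec and the
1 TB estimate at just under 15 sec, consistent with the figures quoted.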
Also, for valid VMAs in the 'maps' file, if a VMA is sparsely populated with
physical pages, the page walk can skip over non-existent page table entries
(PMDs) and so can be faster.

For example, reading the VA range of a 400GB VMA which has a few pages mapped
at the beginning and a few pages at the end, with no pages in the rest of the
VMA, takes 0.001s using the /proc interface. Whereas with the move_pages() API,
passing 1024 consecutive small page addresses per call, it takes about
2.4 secs. This is on a similar system running a 4.19 kernel.
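
To make the comparison concrete, here is a rough sketch of the user-space
side: pre-filter with /proc/<pid>/maps as suggested earlier in the thread,
then count how many move_pages() calls a page-by-page scan of each VMA would
need. It assumes 4 KiB small pages and the usual "start-end perms ..." maps
line format; parse_maps and calls_needed are illustrative names, the sample
address range is made up, and no move_pages() call is actually issued.

```python
PAGE_SIZE = 4 * 1024   # assumed small-page size in bytes


def parse_maps(text):
    """Return (start, end) byte addresses for each line of a maps file."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # First field looks like "7f0000000000-7f6400000000" (hex addresses).
        start_s, end_s = fields[0].split("-")
        ranges.append((int(start_s, 16), int(end_s, 16)))
    return ranges


def calls_needed(start, end, batch, page_size=PAGE_SIZE):
    """Calls needed to cover [start, end) when passing `batch` pages per call."""
    pages = (end - start) // page_size
    return -(-pages // batch)   # ceiling division


if __name__ == "__main__":
    # Hypothetical 400 GiB anonymous VMA, not taken from a real process.
    sample = "7f0000000000-7f6400000000 rw-p 00000000 00:00 0\n"
    for start, end in parse_maps(sample):
        print(hex(start), hex(end), calls_needed(start, end, batch=1024))
```

Pre-filtering only removes the holes between VMAs. Inside a VMA the caller
still has to pass every small page address, so the number of calls depends on
the VMA size and not on how many pages are actually populated, whereas the
page walk can skip the empty PMDs.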