From: Muhammad Usama Anjum <usama.anjum@collabora.com>
To: "Michał Mirosław" <emmir@google.com>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>,
Andrei Vagin <avagin@gmail.com>, Mike Rapoport <rppt@kernel.org>,
Nadav Amit <namit@vmware.com>,
David Hildenbrand <david@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Paul Gofman <pgofman@codeweavers.com>,
Cyrill Gorcunov <gorcunov@gmail.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Shuah Khan <shuah@kernel.org>,
Christian Brauner <brauner@kernel.org>,
Yang Shi <shy828301@gmail.com>, Vlastimil Babka <vbabka@suse.cz>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Yun Zhou <yun.zhou@windriver.com>,
Suren Baghdasaryan <surenb@google.com>,
Alex Sierra <alex.sierra@amd.com>, Peter Xu <peterx@redhat.com>,
Matthew Wilcox <willy@infradead.org>,
Pasha Tatashin <pasha.tatashin@soleen.com>,
Axel Rasmussen <axelrasmussen@google.com>,
"Gustavo A . R . Silva" <gustavoars@kernel.org>,
Dan Williams <dan.j.williams@intel.com>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
Greg KH <gregkh@linuxfoundation.org>,
kernel@collabora.com, Danylo Mocherniuk <mdanylo@google.com>
Subject: Re: [PATCH v10 3/6] fs/proc/task_mmu: Implement IOCTL to get and/or the clear info about PTEs
Date: Thu, 23 Feb 2023 11:44:04 +0500
Message-ID: <473b32fd-24f9-88fd-602f-3ba11d725472@collabora.com>
In-Reply-To: <CABb0KFEBpJTNF7V0XfuvbtaHUiN0Zpx6FqD+BRyXf2gjxiVgTA@mail.gmail.com>
On 2/22/23 4:48 PM, Michał Mirosław wrote:
> On Wed, 22 Feb 2023 at 12:06, Muhammad Usama Anjum
> <usama.anjum@collabora.com> wrote:
>>
>> On 2/22/23 3:44 PM, Michał Mirosław wrote:
>>> On Wed, 22 Feb 2023 at 11:11, Muhammad Usama Anjum
>>> <usama.anjum@collabora.com> wrote:
>>>> On 2/21/23 5:42 PM, Michał Mirosław wrote:
>>>>> On Tue, 21 Feb 2023 at 11:28, Muhammad Usama Anjum
>>>>> <usama.anjum@collabora.com> wrote:
>>>>>>
>>>>>> Hi Michał,
>>>>>>
>>>>>> Thank you so much for comment!
>>>>>>
>>>>>> On 2/17/23 8:18 PM, Michał Mirosław wrote:
>>>>> [...]
>>>>>>> For the page-selection mechanism, currently required_mask and
>>>>>>> excluded_mask have conflicting
>>>>>> They are opposite of each other:
>>>>>> All the set bits in required_mask must be set for the page to be selected.
>>>>>> All the set bits in excluded_mask must _not_ be set for the page to be
>>>>>> selected.
>>>>>>
>>>>>>> responsibilities. I suggest to rework that to:
>>>>>>> 1. negated_flags: page flags which are to be negated before applying
>>>>>>> the page selection using following masks;
>>>>>> Sorry I'm unable to understand the negation (which is XOR?). Lets look at
>>>>>> the truth table:
>>>>>> Page Flag   negated_flags   Result (XOR)
>>>>>>     0             0              0
>>>>>>     0             1              1
>>>>>>     1             0              1
>>>>>>     1             1              0
>>>>>>
>>>>>> If a page flag is 0 and negated_flag is 1, the result would be 1 which has
>>>>>> changed the page flag. It isn't making sense to me. Why is the page flag
>>>>>> bit being flipped?
>>>>>>
>>>>>> When Andrei had proposed these masks, they seemed like a fancy way of
>>>>>> filtering inside the kernel and it was straightforward to understand. These
>>>>>> masks would help his use cases for CRIU. So I'd included it. Can you please
>>>>>> elaborate on the purpose of negation?
>>>>>
>>>>> The XOR is a way to invert the tested value of a flag (from positive
>>>>> to negative and the other way) without having the API with invalid
>>>>> values (with required_flags and excluded_flags you need to define a
>>>>> rule about what happens if a flag is present in both of the masks -
>>>>> either prioritise one mask over the other or reject the call).
>>>> At minimum, one mask (required, any or excluded) must be specified. For a
>>>> page to get selected, the page flags must fulfill the criterion of all the
>>>> specified masks.
>>>
>>> [Please see the comment below.]
>>>
>>> [...]
>>>> Lets translate words into table:
>>> [Yes, those tables captured the intent correctly.]
>>>
>>>>> BTW, I think I assumed that both conditions (all flags in
>>>>> required_flags and at least one in anyof_flags is present) need to be
>>>>> true for the page to be selected - is this your intention?
>>>> All the masks are optional. If all or any of the 3 masks are specified, the
>>>> page flags must pass these masks to get selected.
>>>
>>> This explanation contradicts in part the introductory paragraph, but
>>> this version seems more useful as you can pass all masks zero to have
>>> all pages selected.
>> Sorry, I wrote it wrongly. (All the masks are not optional.) Let me
>> rephrase. All or at least any 1 of the 3 masks (required, any, exclude)
>> must be specified. The return_mask must always be specified. Error is
>> returned if all 3 masks (required, anyof, exclude) are zero or return_mask
>> is zero.
>
> Why do you need those restrictions? I'd guess it is valid to request a
> list of all pages with zero return_mask - this will return a compact
> list of used ranges of the virtual address space.
At present, we support 4 flags (PAGE_IS_WRITTEN, PAGE_IS_FILE,
PAGE_IS_PRESENT and PAGE_IS_SWAPPED). The idea is that the user specifies
the flags of interest in return_mask. Admittedly, if the user wants only 1
flag, it may not seem to make sense to require it in return_mask. But we
want uniformity: if the user wants 2 or more flags returned, return_mask
becomes compulsory. So, to keep things simple and generic for any number of
flags of interest, return_mask must be specified even if only 1 flag is of
interest.
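
A minimal sketch of the argument check described above, for illustration
only. The struct layout, the function name and the use of -EINVAL are
assumptions made here and are not taken from the patch; only the two rules
(at least one selection mask set, return_mask non-zero) come from this
thread.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical argument layout; the field names follow the masks
 * discussed in this thread, the struct itself is an assumption. */
struct scan_masks {
	uint32_t required_mask;
	uint32_t anyof_mask;
	uint32_t excluded_mask;
	uint32_t return_mask;
};

/* Returns 0 if the masks are acceptable, -EINVAL otherwise.
 * The choice of -EINVAL is an assumption; the thread only says
 * "error is returned". */
static int validate_masks(const struct scan_masks *m)
{
	/* At least one of required, anyof, excluded must be specified. */
	if (!m->required_mask && !m->anyof_mask && !m->excluded_mask)
		return -EINVAL;

	/* return_mask must always be specified. */
	if (!m->return_mask)
		return -EINVAL;

	return 0;
}
```

With this check, a request with only return_mask set and no selection mask
is rejected, which is the restriction questioned in the quoted text above.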
>
>>>> After taking a while to understand this and compare with already present
>>>> flag system, `negated flags` is comparatively difficult to understand while
>>>> already present flags seem easier.
>>>
>>> Maybe replacing negated_flags in the API with matched_values =
>>> ~negated_flags would make this better?
>>>
>>> We compare having to understand XOR vs having to understand ordering
>>> of required_flags and excluded_flags.
>> There is no ordering in current masks scheme. No mask is preferable. For a
>> page to get selected, all the definitions of the masks must be fulfilled.
>> You have come up with good example that what if required_mask =
>> exclude_mask. In this case, no page will fulfill the criterion and hence no
>> page would be selected. It is user's fault that he isn't understanding the
>> definitions of these masks correctly.
>>
>> Now thinking about it, I can add an error check which would return an error if
>> a bit in required and excluded masks matches. Would you like it? Lets put
>> this check in place.
>> (Previously I'd left it for user's wisdom not to do this. If he'll specify
>> same masks in them, he'll get no addresses out of the syscall.)
>
> This error case is (one of) the problems I propose avoiding. You also
> need much more text to describe the required/excluded flags
> interactions and edge cases than saying that a flag must have a value
> equal to corresponding bit in ~negated_flags to be matched by
> required/anyof masks.
I find excluded_mask much more intuitive than negated_mask, which is so
difficult to understand that I don't know how to use it correctly.
Let's take an example: I want pages which are PAGE_IS_WRITTEN and are not
PAGE_IS_FILE. In addition, the pages must be PAGE_IS_PRESENT or
PAGE_IS_SWAPPED. This can be specified as:
    required_mask = PAGE_IS_WRITTEN
    excluded_mask = PAGE_IS_FILE
    anyof_mask = PAGE_IS_PRESENT | PAGE_IS_SWAPPED
(a) assume page_flags = 0b1111
    skip page as (0b1111 & 0b0010) is non-zero
(b) assume page_flags = 0b1001
    select page as (0b1001 & 0b0010) is zero
It seems intuitive, right? How would you achieve the same thing with
negated_mask?
    required_mask = PAGE_IS_WRITTEN
    negated_mask = PAGE_IS_FILE
    anyof_mask = PAGE_IS_PRESENT | PAGE_IS_SWAPPED
(1) assume page_flags = 0b1111
    tested_flags = 0b1111 ^ 0b0010 = 0b1101
(2) assume page_flags = 0b1001
    tested_flags = 0b1001 ^ 0b0010 = 0b1011
In (1), we wanted to skip pages which have PAGE_IS_FILE set. But
negated_mask has only cleared that bit; the page is still tested against
required_mask and anyof_mask, and it would get selected. That is wrong.
In (2), the PAGE_IS_FILE bit of page_flags was 0 and has been flipped to 1
in tested_flags.
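
The two schemes compared above can be sketched in C as follows. This is an
illustration under assumptions, not code from the patch: the bit values are
chosen to match the 0b0010 == PAGE_IS_FILE used in the examples, the
function names are hypothetical, and a zero anyof mask is assumed to mean
"no anyof constraint".

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit values, consistent with 0b0010 being PAGE_IS_FILE above. */
#define PAGE_IS_WRITTEN  (1u << 0)
#define PAGE_IS_FILE     (1u << 1)
#define PAGE_IS_PRESENT  (1u << 2)
#define PAGE_IS_SWAPPED  (1u << 3)

/* required/excluded/anyof scheme: every required bit must be set, no
 * excluded bit may be set, and at least one anyof bit must be set when
 * an anyof mask is given. */
static bool select_excluded(uint32_t flags, uint32_t required,
			    uint32_t excluded, uint32_t anyof)
{
	if ((flags & required) != required)
		return false;
	if (flags & excluded)
		return false;
	if (anyof && !(flags & anyof))
		return false;
	return true;
}

/* negated scheme: XOR the page flags with negated first, then apply
 * required and anyof to the result. */
static bool select_negated(uint32_t flags, uint32_t negated,
			   uint32_t required, uint32_t anyof)
{
	uint32_t tested = flags ^ negated;

	if ((tested & required) != required)
		return false;
	if (anyof && !(tested & anyof))
		return false;
	return true;
}
```

With these definitions, cases (a) and (b) give skip and select
respectively. Cases (1) and (2), with required_mask = PAGE_IS_WRITTEN only,
both end up selected, which is the concern raised above. If PAGE_IS_FILE
were also listed in required_mask, which is one possible reading of the
quoted proposal, (1) would be skipped and (2) selected.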
>
>>> IOW my proposal is to replace branches in the masks interpretation (if
>>> in one set then matches but if in another set then doesn't; if flags
>>> match ... ) with plain calculation (flag is matching when equals
>>> ~negated_flags; if flags match the masks ...).
>
> Best Regards
> Michał Mirosław
--
BR,
Muhammad Usama Anjum