From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 63ACEC636D7 for ; Thu, 23 Feb 2023 09:42:25 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B64F46B0072; Thu, 23 Feb 2023 04:42:24 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id AEE886B0073; Thu, 23 Feb 2023 04:42:24 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9B6B56B0074; Thu, 23 Feb 2023 04:42:24 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 8B08B6B0072 for ; Thu, 23 Feb 2023 04:42:24 -0500 (EST) Received: from smtpin26.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 1B40E8134F for ; Thu, 23 Feb 2023 09:42:24 +0000 (UTC) X-FDA: 80498066208.26.90A5F82 Received: from mail-ed1-f47.google.com (mail-ed1-f47.google.com [209.85.208.47]) by imf05.hostedemail.com (Postfix) with ESMTP id 4672E100007 for ; Thu, 23 Feb 2023 09:42:22 +0000 (UTC) Authentication-Results: imf05.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=eXToJJmI; spf=pass (imf05.hostedemail.com: domain of emmir@google.com designates 209.85.208.47 as permitted sender) smtp.mailfrom=emmir@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1677145342; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Ecq76MOyjJbouEEYrBGWIv8bJtN500LoCU0lJZ9XGak=; b=k2aAtSqwUASc4/u0WNmTdBexLIhRy9WtA34gCOYN6qzOmn5XBfC6jtd/VIXpXIanyHChPi JJggy95lZt7Gir0FjDHTgTx4XBHktozYlJVbRYw1oncdnvRb/nNhAQfbARLbc0CjxYf+Tc GMvgFxiqXbQTj7DxQBdzDRlE/37vQ6o= ARC-Authentication-Results: i=1; imf05.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=eXToJJmI; spf=pass (imf05.hostedemail.com: domain of emmir@google.com designates 209.85.208.47 as permitted sender) smtp.mailfrom=emmir@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1677145342; a=rsa-sha256; cv=none; b=rwfkge4M7OzLznaYbOGONcgQFhut+ed0K571rEwM6KMeN2qnmFuOU0tkhGZk9aXn4sEr3c dmqSC2w9MKqvTieavQsLrBIE7tc7jBGdbogD7bUosuYYJQtLVkIInMDnANxXZOSJpf7LNF vKMuONl/tDVZxmr3FyWH72oqYNM5jeE= Received: by mail-ed1-f47.google.com with SMTP id ec43so39415819edb.8 for ; Thu, 23 Feb 2023 01:42:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Ecq76MOyjJbouEEYrBGWIv8bJtN500LoCU0lJZ9XGak=; b=eXToJJmIw51YrtYOygLqqPIeSzpSM2S5WHaK56xWG2I5QDaQToqq4NX0NdYMb7Mi4t OTuIpaLudTWPja/vdYRVEQ/DHBiZXUYRn4TyHtBYnKTAu3jCl+HXCEzaSdhGaLMxrqXN JxA1Kbg5ci1GgbgwOg4TdrG9eIG8QWEEJJToBCxOERyrzw/8A9UPr+1uU+dSQoFPNwUG qiPMUZ6YNEXHglgKz9Y/A3lHwr/ctbmo5BvuxwnpWPf4Bsbg+e9qO98dFXYhJeC6Dt0P joA2xZpxkvrrfbJvFygOKt7iF6gvn8k2BNfONQxjHmYS8nrlgzY6NjMMFuPvBezUYfGQ jigw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Ecq76MOyjJbouEEYrBGWIv8bJtN500LoCU0lJZ9XGak=; b=JlzoKOMFcGq0n44m0x6qo2sjFNs9mdrRUqSbbaclSzlm97g3iU/04WkZ5pjwiZFs7y pQiNIHIFA5+VARrMuG5Qepkv8KJxytptZxId6Wm9sErwZkLGQiig/Sp1gbP7gPdWgiFK H66qnEauMd6EtVteJRzTv2R9UOugO56x7QmZlfPVl7VBtN8cYdGLV9v4tLy2ti2v9eaW pbI9cjgagCX7m+8hJEAmgJ+pnnS8afBq5L/wyfDhEfjlPU7/6LvKu6HhEao7n4iGOPbv dmEW8HCPg8cZ5TYAri7cVo1H4ezTE7Wf+ncV8eThiGaKR+Bpsl3uHJ9YKRiyQKZmkfHZ mF6g== X-Gm-Message-State: AO0yUKWVBoDj+JCaSJ9DrlPTdunhkE7S9f7hzrGpq7Y/mF3g4R7nYQ6Y w71LNGdGWYlxXnCYRitmaF2MUc1uZZtrLD8BJ8/0CA== X-Google-Smtp-Source: AK7set8g3+NL8qMaNsG/P3D3VCYixxEJX8SEXwcgNz0g/mt9FnFBM8cKN/Y+zJ2o4H4P8mxzDQH5vHGQgfvmedL5IRA= X-Received: by 2002:a50:d086:0:b0:4ad:72b2:cf57 with SMTP id v6-20020a50d086000000b004ad72b2cf57mr4996730edd.0.1677145340537; Thu, 23 Feb 2023 01:42:20 -0800 (PST) MIME-Version: 1.0 References: <20230202112915.867409-1-usama.anjum@collabora.com> <20230202112915.867409-4-usama.anjum@collabora.com> <36ddfd75-5c58-197b-16c9-9f819099ea6d@collabora.com> <6d2b40c6-bed9-69a6-e198-537b50953acd@collabora.com> <473b32fd-24f9-88fd-602f-3ba11d725472@collabora.com> <324564ba-2cdc-258e-1f57-d0a1d27e9da5@collabora.com> In-Reply-To: <324564ba-2cdc-258e-1f57-d0a1d27e9da5@collabora.com> From: =?UTF-8?B?TWljaGHFgiBNaXJvc8WCYXc=?= Date: Thu, 23 Feb 2023 10:42:08 +0100 Message-ID: Subject: Re: [PATCH v10 3/6] fs/proc/task_mmu: Implement IOCTL to get and/or the clear info about PTEs To: Muhammad Usama Anjum Cc: Andrei Vagin , Danylo Mocherniuk , Mike Rapoport , Nadav Amit , David Hildenbrand , Andrew Morton , Paul Gofman , Cyrill Gorcunov , Alexander Viro , Shuah Khan , Christian Brauner , Yang Shi , Vlastimil Babka , "Liam R . Howlett" , Yun Zhou , Suren Baghdasaryan , Alex Sierra , Peter Xu , Matthew Wilcox , Pasha Tatashin , Axel Rasmussen , "Gustavo A . R . Silva" , Dan Williams , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, Greg KH , kernel@collabora.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 4672E100007 X-Rspam-User: X-Stat-Signature: ah6jgdbqaqic1wat1r37t94mjb46nim6 X-HE-Tag: 1677145342-661060 X-HE-Meta: U2FsdGVkX18WCOUEBTbsJGWtLR1UEOeIuSRzwsW47zL2LotJbfOheFv9CmD3LMGi3i42Qq5AzRNIep4Ql9HjoFCVcqyTWraaoWx256gL4o2tMUE/CmZ5ZNOu5twmyNjaj/Vlbow9+4zX/5zvSHjziOSPRZD3uwX0BoiQ0OcchMgixeX+0iRqZ5M8VySPuAGINfWVsdS1eVjA9CP4o09sywsrFmiZRSuIkj8Kw2GNuHTEsNu4hSQZkCbYpJZZmesnATel0zfJxLU6rPW6dL08ESlN5SvejvfbDpBkQCFzPQmjGfai6WDb6RyZ4M9T9P/SdcOa1IrCc62+c+zNQUXPfAhBGdAouDfBsM3i3aHq6dX8Td7A9WzkztstRuhIxSgnl/hgu0MrEW58zADEMBemWBiSbYk08nlNXwj+n0bOEUkS+yKwn2HQ+rAM4p8jvKNv43tjOvUqpUwDkg9YsZ+qMsgg+FtErAGS+dtbNj2cEheRkWCBRRNzhp+s4veTio6M2rRoo26aQczttrgpET4MT0Nfsut7y/JfyyXPaTprziNrZvRCt80baApMGfv1A1DkWiEX4yrUkchQKYIrF7a+cW+oe61ncz/4y//7KaQoUd6+EByzpJQUoI7wt71Np3OV9OLkApevTg6TURDXWP0fmGSbBV0JgX+d4wXkwDh9lK6ReYHFRZfsdGDCA3qRln+focMAMPvMRQrcHfL7vpLSFtOoQBcmjB+cOXYt2r7qlnAx7Gh3qxj1wjysu4sLxfhu1HN1IAirHkJgOTuyVFI6dDEa9JxEnQUTxeZfecOq0O0jj2dWVcTobKxk2BoEEYiGkiQMWX2SR2b6ap7R0evWDZteFcL9iu4uXDHJijVnnxXd4o3yUSSx0GAbMbCC5wXVsl392rec8dD0+xLKMPyeii6Mot1yrEs8CmCWGRCpBbPtINfHcvWuJhj6roAqtFYmm4LN7UQIWxKbZVFShEh EvivxZaB +D1fxgcbtkidYjsZNvQ/2NU/P4OEhKBcygTn2hxeRRYIp/SQ2hyYuzoK8v/4bALTFukNYoDm49/VRlC6jM/YfrOu61blqSmVkuYApBUEyK4ZNeCTbZDhsWKYGERBXW915u1loZ5YuW2BO5Yb163yTaF3IfxRkV13eo22pQFMlTnxSl7FDvhJNHBAgFOdWtAcwOzEiHM1Ivx7AH0lJSRaQcADoxj3lXSW9XRiS4MMamPw9vnxRjpzuetd+MWORbuXybfmXBk5UCNVdskL1hHKd//lhv2h3F23EPBmH5/lciCpJXFGgxDi5WgE1sk+tYwHrdkKuXKjEBalXLRea6LPmbTNUJoIe49YZjqzhU735mknGddEBQAP6TOEOUw== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000010, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Thu, 23 Feb 2023 at 10:23, Muhammad Usama Anjum wrote: > > On 2/23/23 1:41=E2=80=AFPM, Micha=C5=82 Miros=C5=82aw wrote: > > On Thu, 23 Feb 2023 at 07:44, Muhammad Usama Anjum > > wrote: > >> > >> On 2/22/23 4:48=E2=80=AFPM, Micha=C5=82 Miros=C5=82aw wrote: > >>> On Wed, 22 Feb 2023 at 12:06, Muhammad Usama Anjum > >>> wrote: > > [...] > >>>>>>> BTW, I think I assumed that both conditions (all flags in > >>>>>>> required_flags and at least one in anyof_flags is present) need t= o be > >>>>>>> true for the page to be selected - is this your intention? > >>>>>> All the masks are optional. If all or any of the 3 masks are speci= fied, the > >>>>>> page flags must pass these masks to get selected. > >>>>> > >>>>> This explanation contradicts in part the introductory paragraph, bu= t > >>>>> this version seems more useful as you can pass all masks zero to ha= ve > >>>>> all pages selected. > >>>> Sorry, I wrote it wrongly. (All the masks are not optional.) Let me > >>>> rephrase. All or at least any 1 of the 3 masks (required, any, exclu= de) > >>>> must be specified. The return_mask must always be specified. Error i= s > >>>> returned if all 3 masks (required, anyof, exclude) are zero or retur= n_mask > >>>> is zero. > >>> > >>> Why do you need those restrictions? I'd guess it is valid to request = a > >>> list of all pages with zero return_mask - this will return a compact > >>> list of used ranges of the virtual address space. > >> At the time, we are supporting 4 flags (PAGE_IS_WRITTEN, PAGE_IS_FILE, > >> PAGE_IS_PRESENT and PAGE_IS_SWAPPED). The idea is that user mention hi= s > >> flags of interest in the return_mask. If he wants only 1 flag, he'll > >> specify it. Definitely if user wants only 1 flag, initially it doesn't= make > >> any sense to mention in the return mask. But we want uniformity. If us= er > >> want, 2 or more flags in returned, return_mask becomes compulsory. So = to > >> keep things simple and generic for any number of flags of interest > >> returned, the return_mask must be specified even if the flag of intere= st is > >> only 1. > > > > I'm not sure why do we want uniformity in the case of 1 flag? If a > > user specifies a single required flag, I'd expect he doesn't need to > > look at the flags returned as those will duplicate the information > > from mere presence of a page. A user might also require a single flag, > > but want all of them returned. Both requests - return 1 flag and > > return 0 flags would give meaningful output, so why force one way or > > the other? Allowing two will also enable users to express the intent: > > they need either just a list of pages, or they need a list with > > per-page flags - the need would follow from the code structure or > > other factors. > We can add as much flexibility as much people ask by keeping code simple. > But it is going to be dirty to add error check which detects if return_ma= sk > =3D 0 and if there is only 1 flag of interest mentioned by the user. The > following mentioned error check is essential to return deterministic > output. Do you think this case is worth it to support and we don't want t= o > go with the generality for both 1 or more flag cases? > > if (return_mask =3D=3D 0 && hweight_long(required_mask | any_mask) !=3D 1= ) > return error; Why would you want to add this error check? If a user requires multiple flags but cares only about a list of matching pages, then it would be natural to express this intent as return_mask =3D 0. > >>>>>> After taking a while to understand this and compare with already p= resent > >>>>>> flag system, `negated flags` is comparatively difficult to underst= and while > >>>>>> already present flags seem easier. > >>>>> > >>>>> Maybe replacing negated_flags in the API with matched_values =3D > >>>>> ~negated_flags would make this better? > >>>>> > >>>>> We compare having to understand XOR vs having to understand orderin= g > >>>>> of required_flags and excluded_flags. > >>>> There is no ordering in current masks scheme. No mask is preferable.= For a > >>>> page to get selected, all the definitions of the masks must be fulfi= lled. > >>>> You have come up with good example that what if required_mask =3D > >>>> exclude_mask. In this case, no page will fulfill the criterion and h= ence no > >>>> page would be selected. It is user's fault that he isn't understandi= ng the > >>>> definitions of these masks correctly. > >>>> > >>>> Now thinking about it, I can add a error check which would return er= ror if > >>>> a bit in required and excluded masks matches. Would you like it? Let= s put > >>>> this check in place. > >>>> (Previously I'd left it for user's wisdom not to do this. If he'll s= pecify > >>>> same masks in them, he'll get no addresses out of the syscall.) > >>> > >>> This error case is (one of) the problems I propose avoiding. You also > >>> need much more text to describe the requred/excluded flags > >>> interactions and edge cases than saying that a flag must have a value > >>> equal to corresponding bit in ~negated_flags to be matched by > >>> requried/anyof masks. > >> I've found excluded_mask very intuitive as compared to negated_mask wh= ich > >> is so difficult to understand that I don't know how to use it correctl= y. > >> Lets take an example, I want pages which are PAGE_IS_WRITTEN and are n= ot > >> PAGE_IS_FILE. In addition, the pages must be PAGE_IS_PRESENT or > >> PAGE_IS_SWAPPED. This can be specified as: > >> > >> required_mask =3D PAGE_IS_WRITTEN > >> excluded_mask =3D PAGE_IS_FILE > >> anyof_mask =3D PAGE_IS_PRESETNT | PAGE_IS_SWAP > >> > >> (a) assume page_flags =3D 0b1111 > >> skip page as 0b1111 & 0b0010 =3D true > >> > >> (b) assume page_flags =3D 0b1001 > >> select page as 0b1001 & 0b0010 =3D false > >> > >> It seemed intuitive. Right? How would you achieve same thing with nega= ted_mask? > >> > >> required_mask =3D PAGE_IS_WRITTEN > >> negated_mask =3D PAGE_IS_FILE > >> anyof_mask =3D PAGE_IS_PRESETNT | PAGE_IS_SWAP > >> > >> (1) assume page_flags =3D 0b1111 > >> tested_flags =3D 0b1111 ^ 0b0010 =3D 0b1101 > >> > >> (2) assume page_flags =3D 0b1001 > >> tested_flags =3D 0b1001 ^ 0b0010 =3D 0b1011 > >> > >> In (1), we wanted to skip pages which have PAGE_IS_FILE set. But > >> negated_mask has just masked it and page is still getting tested if it > >> should be selected and it would get selected. It is wrong. > >> > >> In (2), the PAGE_IS_FILE bit of page_flags was 0 and got updated to 1 = or > >> PAGE_IS_FILE in tested_flags. > > > > I require flags PAGE_IS_WRITTEN=3D1, PAGE_IS_FILE=3D0, so: > > > > required_mask =3D PAGE_IS_WRITTEN | PAGE_IS_FILE; > > negated_flags =3D PAGE_IS_FILE; // flags I want zero > You want PAGE_IS_FILE to be zero and at the same time you are requiring t= he > PAGE_IS_FILE. It is confusing. Ok, I believe the misunderstanding comes from the naming. I "require" the flag to be a particular value - hence include it in "required_flags" and specify the required value in ~negated_flags. You "require" the flag to be set (equal 1) and so include it in "required_flags" and you "require" the flag to be clear (equal to 0) so include it in "excluded_flags". Both approaches are correct, but I would not consider one "easier" than the other. The former is more general, though - makes any_of also able to match on flags cleared and removes the possibility of a conflicting case of a flag present in both sets. Maybe considered_flags or matched_flags then would make the field better understandable? Best Regards Micha=C5=82 Miros=C5=82aw