From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5DDA1C6FD1D for ; Fri, 7 Apr 2023 07:34:56 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9F7306B0072; Fri, 7 Apr 2023 03:34:55 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9CE1D6B0074; Fri, 7 Apr 2023 03:34:55 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 86FAD6B0075; Fri, 7 Apr 2023 03:34:55 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 7718C6B0072 for ; Fri, 7 Apr 2023 03:34:55 -0400 (EDT) Received: from smtpin10.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 584C7120F59 for ; Fri, 7 Apr 2023 07:34:55 +0000 (UTC) X-FDA: 80653783350.10.C59E4BC Received: from mail-ej1-f50.google.com (mail-ej1-f50.google.com [209.85.218.50]) by imf11.hostedemail.com (Postfix) with ESMTP id 8AC634001B for ; Fri, 7 Apr 2023 07:34:53 +0000 (UTC) Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=gDMbr9D7; spf=pass (imf11.hostedemail.com: domain of emmir@google.com designates 209.85.218.50 as permitted sender) smtp.mailfrom=emmir@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1680852893; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=xufJewXyhNV91ceoQQRbqkfFuhj5XUSD9386rC0IB28=; b=Wzk6CCjBru7tue1dZOeGcX6VmGLrxQDxxefWkU2GFLLXFjZX1fGL8RMYOC9MwC61oH+QIU 1yEfJ3jhTMoJSFdP1NTNurlP9HEAJ+6Z50HaWX1t1FR7NoPCG9Mb+uyOMEz1OMl+4NxcZ8 7GXzHtaxxQnRWRDoHEOPoqq0SDEf92A= ARC-Authentication-Results: i=1; imf11.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=gDMbr9D7; spf=pass (imf11.hostedemail.com: domain of emmir@google.com designates 209.85.218.50 as permitted sender) smtp.mailfrom=emmir@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1680852893; a=rsa-sha256; cv=none; b=R6cfIU7dl6GOfAiIK1w+d//6B8zAwkMFe6cA1Qo3/o/Tx5dcpkacj+4jfxmHqKzKPFH+LA nYQvlcTLe1SzdUDWLGmlFP8gH36zF3RbtpIqVEhUBA7gVnCXc2R75KG4QM58nlisRRLpxG jywI+yMUgbdEsL9f2uxzbseuE18ZrNg= Received: by mail-ej1-f50.google.com with SMTP id a640c23a62f3a-94748e41044so135898466b.1 for ; Fri, 07 Apr 2023 00:34:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1680852892; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=xufJewXyhNV91ceoQQRbqkfFuhj5XUSD9386rC0IB28=; b=gDMbr9D7tJfW5EhGD1TauIK78v+jZ7HUY7e7xfnLQhu1bEnuRBb0SqZjJ51+UqnBdc QrABNaCBE42IvweCIbz+OmT7FfcD6fDU31Aj8En6ELtBabJM0gTH+lLXGmKH/DZRtp+P FfNP2yFQVIMY0xfUglALIrdcxAM40WZgLRaKlFjvmvF8nzE0LpLZqVSTArizOpK6o86n F59MbAyWPqLUz1onQKYeDGol9W6HmV3j24HnaNBKgFUPpu5qPJSUTORcD+uWZSzrhoJH Gft2x4kLBBdmGR32hJdvvPnUqtQCfIryMpFtmCWIgxEzTKwurrEKiBUVgyu4XHizGsur iMhQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1680852892; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=xufJewXyhNV91ceoQQRbqkfFuhj5XUSD9386rC0IB28=; b=jdz9u+0x81rEWAJ2bq8lyqHCcjB36eUD5i00wS4Nc/YK8r93Esmj0QT0otynFoVr2X GdRHyw+GDqECRh+L6QifFkGmAkzY6o1OAhn+c4BtUBl5SgsJNW3JK5cC4Vb40uB5tWVj QOfMdbtiAE7DEkSjfDLINVeNDgMai7VrjI1WriXHlEzyXiQW0IzTJUMc8Y/psv+qMQsS FIGFmQInoLSM/7HKavt/P2osEtL5dZUZZrq2Tl0ZtG90ntv1b25SG9ZzjPvEqy2g9Seq RVAlItXNNulpjMxOsb9ZAW4S5tsby+aoFkQr8y7Z2Zcq2zB3A7gC4orSuo/eUCb3Dh+j 5K0Q== X-Gm-Message-State: AAQBX9dpH+44S/pxjU0nG5VXXXslX89iFMosP/NfFUxZL+B3p62LGaI+ Xw1y5SCzo5KN6hQHk/SPKPm3t0WKDcFChaHwjPk5AA== X-Google-Smtp-Source: AKy350bF8961SbdQkzvhhN/J2V2TxAtpP2NJYlX5JDDN+lnoULk6K/VoRyybz83rRM6Adk8ORW5BfMtw4ISKJ9e8OKk= X-Received: by 2002:a50:d716:0:b0:502:6d4b:40f5 with SMTP id t22-20020a50d716000000b005026d4b40f5mr997137edi.7.1680852891875; Fri, 07 Apr 2023 00:34:51 -0700 (PDT) MIME-Version: 1.0 References: <20230406074005.1784728-1-usama.anjum@collabora.com> <20230406074005.1784728-3-usama.anjum@collabora.com> <0351b563-5193-6431-aa9c-c5bf5741b791@collabora.com> <8a837998-604f-a871-729e-aa274a621481@collabora.com> In-Reply-To: <8a837998-604f-a871-729e-aa274a621481@collabora.com> From: =?UTF-8?B?TWljaGHFgiBNaXJvc8WCYXc=?= Date: Fri, 7 Apr 2023 09:34:40 +0200 Message-ID: Subject: Re: [PATCH v12 2/5] fs/proc/task_mmu: Implement IOCTL to get and optionally clear info about PTEs To: Muhammad Usama Anjum Cc: Mike Rapoport , Peter Xu , David Hildenbrand , Andrew Morton , Andrei Vagin , Danylo Mocherniuk , Paul Gofman , Cyrill Gorcunov , Nadav Amit , Alexander Viro , Shuah Khan , Christian Brauner , Yang Shi , Vlastimil Babka , "Liam R . Howlett" , Yun Zhou , Suren Baghdasaryan , Alex Sierra , Matthew Wilcox , Pasha Tatashin , Axel Rasmussen , "Gustavo A . R . Silva" , Dan Williams , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, Greg KH , kernel@collabora.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 8AC634001B X-Stat-Signature: mr14rccj3cku4crmqn4oxwbp1ei6c3uj X-HE-Tag: 1680852893-471941 X-HE-Meta: U2FsdGVkX1/If3DZ57eZNzwxpWJPm8Kk907r1DlcQNlEnuaFs8fuRt4I7mFCiXYbcRG78yB1FD8hwTWFokJT/CimVVLffpc/yk4flL10l+QVh3xynTzPzv318jyKJcz5jNCz5InCgljqt704FQ9H7kbOqPuXfgjFOlvS8ypxgX1OzgicVHhGTxPyNAOoaMLdvYRdl/RpsUrz9DV06x9JDtB7VtktSGt3DS+RKld6s8EdohnBbdSLaceZ3hWZYW24MdBipLgmrS9iqnpRiFErcPGDV9ns5fqQp51JiNGjwIpxGCSAzz58fUdZmCZGaj5Y3MGW0ozsGOOs0++9H0ay7yzQ+pIJxk41BcT23CAoLXkcPcWkCVWCK1OKWBH58U+0kM/YVus2MxoN1nckLxpmc41hwpJ6rao5C13PrcT51fAC0Lg7ydPuTmq2Ifcs2v1AQ+pF++IzjOaRPJVm9LyY11rW1YjCnjI00yHL3lMSkBl1eAgjN0fjB49k+gPHBZ9CGY5H2MBt8+KfjVEiIlc12sVEqqqmPUpS7mjhsJX513bY6WngXNxg7bOunnnotRgQ78Bs4g4v1p9w/3jW8RxNKO/EZ/0jRr9ugyTXYTRPQq2BLrl+cvnM/9jccG0oN9npgp6DbcJoeAsgh50pwTQKnJoiIydnPGq8eWEpztbddX37WRIKmGO16FFbf8qVCxepIfO9fTSFDw98DYSfLNhUwp+/1XgdKZkLD/1XBJ1cSVWE+O+WhOkvQrYmmg3a35wpGpgRzue22MjM/9DQWa7JHhh/+qKhKXhSXDn6rEFyK6OakztYBW2mmH2Eniunx2C3IT+Bw2C7A2ar88Cnvqpr0px82DePi7e1Y+SCdBVcwKOqCCcOUCpJwyiOaxRN5TZ80T63Ecs+7PxzHTRfeFFfGVGpkHHqiCYmtqWjUjqg6qsSYOgqvCZflYQ0XdYwqflTH9RvGhhIYrK6nB4bLJM VzC9C2iA yFyTT3AxgzbiD4xGQO1GfgUahXTM9E1rN2CnCR1cYuWU6cFLHADLm/SLb8hCKVVc+KbT4NpaSV8iYAlEFi2mV9HdDIfbZz/xNCWvxSZfVyFAddQgIoe9Q6NtSGErwfn42LAZ5rUcFz23PnsKCWsmIROka5mlqbh6gvpVt7D4x6dRM2jBkFvlPk47wZMMeIw2NNU0qPJM3h4EQ4FnlolhR7G2VEeOLgQvM/n4z4lUjC2oJEnzNbRsHvpIUBADEOontFoJF5Xw+8rrh4GLr5eTzrRm4ccx5+NpMzituPQUBHCBkCKB/tD1GCZGy51WkngRI/jbN/djq8SGPWi43DtMwdq42BWe/iYgyOLzM/6TsZTIhcRieeTJ6rXpU1Z78q5KFzFsLsblKbN7zxlo2kB9OkjE/GA6xUGc4DDG0RE5Q8skp4v7+AkrWwQe8ww== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Thu, 6 Apr 2023 at 23:04, Muhammad Usama Anjum wrote: > On 4/7/23 1:00=E2=80=AFAM, Micha=C5=82 Miros=C5=82aw wrote: > > On Thu, 6 Apr 2023 at 19:58, Muhammad Usama Anjum > > wrote: [...] > >>>> + cur->len +=3D n_pages; > >>>> + p->found_pages +=3D n_pages; > >>>> + > >>>> + if (p->max_pages && (p->found_pages =3D=3D p->max_pa= ges)) > >>>> + return PM_SCAN_FOUND_MAX_PAGES; > >>>> + > >>>> + return 0; > >>>> + } > >>>> + > >>>> + if (!p->vec_index || ((p->vec_index + 1) < p->vec_len)) { > >>> > >>> It looks that `if (p->vec_index < p->vec_len)` is enough here - if we > >>> have vec_len =3D=3D 0 here, then we'd not fit the entry in the usersp= ace > >>> buffer anyway. Am I missing something? > >> No. I'd explained it with diagram last time: > >> https://lore.kernel.org/all/3c8d9ea0-1382-be0c-8dd2-d490eedd3b55@colla= bora.com > >> > >> I'll add a concise comment here. > > > > So it seems, but I think the code changed a bit and maybe could be > > simplified now? Since p->vec_len =3D=3D 0 is currently not valid, the > > field could count only the entries available in p->vec[] -- IOW: not > > include p->cur in the count. > I see. But this'll not work as we need to count p->cur to don't go above > the maximum count, p->vec_size. You can subtract 1 from p->vec_size before the page walk to account for the buffer in `cur`. [...] > >>>> +static inline int pagemap_scan_deposit(struct pagemap_scan_private = *p, > >>>> + struct page_region __user *ve= c, > >>>> + unsigned long *vec_index) > >>> > >>> ..._deposit() is used only in single place - please inline. > >> It is already inline. > > > > Sorry. I mean: please paste the code in place of the single call. > I've made it a separate function to make the code look better in the call= er > function and logically easier to understand. This function is ugly. > do_pagemap_scan() is also already very long function with lots of things > happening. If you still insist, I'll remove this function. Please do remove - it will make the copy to userspace code all neatly toget= her. [...] > >>>> + */ > >>>> + if (is_written && PM_SCAN_OP_IS_WP(p) && > >>>> + ((end - start < HPAGE_SIZE) || > >>>> + (p->max_pages && > >>>> + (p->max_pages - p->found_pages) < n_pages))) { > >>>> + > >>>> + split_huge_pmd(vma, pmd, start); > >>>> + goto process_smaller_pages; > >>>> + } > >>>> + > >>>> + if (p->max_pages && > >>>> + p->found_pages + n_pages > p->max_pages) > >>>> + n_pages =3D p->max_pages - p->found_pages; > >>>> + > >>>> + ret =3D pagemap_scan_output(is_written, is_file, is_= present, > >>>> + is_swap, p, start, n_pages= ); > >>>> + if (ret < 0) > >>>> + return ret; > > > > So let's simplify this: > > > > if (p->max_pages && n_pages > max_pages - found_pages) > > n_pages =3D max_pages - found_pages; > > > > if (is_written && DO_WP && n_pages !=3D HPAGE_SIZE / PAGE_SIZE) { > > split_thp(); > > goto process_smaller_pages; > > } > Clever!! This looks very sleek. > > > > > BTW, THP handling could be extracted to a function that would return > > -EAGAIN if it has split the page or it wasn't a THP -- and that would > > mean `goto process_smaller_pages`. > Other functions in this file handle the THP in this same way. So it feels > like more intuitive that we follow to same pattern in this file. I'll leave it to you. Extracting THP support would avoid a goto and #ifdef inside a function, though (and make the function smaller). > >>>> + /* > >>>> + * Allocate smaller buffer to get output from inside the pag= e walk > >>>> + * functions and walk page range in PAGEMAP_WALK_SIZE size c= hunks. As > >>>> + * we want to return output to user in compact form where no= two > >>>> + * consecutive regions should be continuous and have the sam= e flags. > >>>> + * So store the latest element in p.cur between different wa= lks and > >>>> + * store the p.cur at the end of the walk to the user buffer= . > >>>> + */ > >>>> + p.vec =3D kmalloc_array(p.vec_len, sizeof(struct page_region= ), > >>>> + GFP_KERNEL); > >>>> + if (!p.vec) > >>>> + return -ENOMEM; > >>>> + > >>>> + walk_start =3D walk_end =3D start; > >>>> + while (walk_end < end && !ret) { > >>> > >>> The loop will stop if a previous iteration returned ENOSPC (and the > >>> error will be lost) - is it intended? > >> It is intentional. -ENOSPC means that the user buffer is full even tho= ugh > >> there was more memory to walk over. We don't treat this error. So when > >> buffer gets full, we stop walking over further as user buffer has gott= en > >> full and return as success. > > > > Thanks. What's the difference between -ENOSPC and > > PM_SCAN_FOUND_MAX_PAGES? They seem to result in the same effect (code > > flow). > -ENOSPC --> user buffer has been filled completely > PM_SCAN_FOUND_MAX_PAGES --> max_pages have been found, user buffer may > still have more space What is the difference in code behaviour when those two cases are compared? (I'd expect none.) Best Regards Micha=C5=82 Miros=C5=82aw