From: Michał Mirosław
Date: Mon, 24 Jul 2023 16:38:01 +0200
Subject: Re: [v2] fs/proc/task_mmu: Implement IOCTL for efficient page table scanning
To: Muhammad Usama Anjum
Cc: Michał Mirosław, Andrei Vagin, Danylo Mocherniuk, Alex Sierra, Alexander Viro, Andrew Morton, Axel Rasmussen, Christian Brauner, Cyrill Gorcunov, Dan Williams, David Hildenbrand, Greg KH, "Gustavo A . R . Silva", "Liam R . Howlett", Matthew Wilcox, Mike Rapoport, Nadav Amit, Pasha Tatashin, Paul Gofman, Peter Xu, Shuah Khan, Suren Baghdasaryan, Vlastimil Babka, Yang Shi, Yun Zhou, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, kernel@collabora.com

On Mon, 24 Jul 2023 at 16:04, Muhammad Usama Anjum wrote:
>
> Fixed found bugs. Testing it further.
>
> - Split and backoff in case buffer full case as well
> - Fix the wrong breaking of loop if page isn't interesting, skip instead
> - Untag the address and save them into struct
> - Round off the end address to next page
>
> Signed-off-by: Muhammad Usama Anjum
> ---
>  fs/proc/task_mmu.c | 54 ++++++++++++++++++++++++++--------------------
>  1 file changed, 31 insertions(+), 23 deletions(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index add21fdf3c9a..64b326d0ec6d 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -1985,18 +1989,19 @@ static int pagemap_scan_output(unsigned long categories,
>  	unsigned long n_pages, total_pages;
>  	int ret = 0;
>
> +	if (!p->vec_buf)
> +		return 0;
> +
>  	if (!pagemap_scan_is_interesting_page(categories, p)) {
>  		*end = addr;
>  		return 0;
>  	}
>
> -	if (!p->vec_buf)
> -		return 0;
> -
>  	categories &= p->arg.return_mask;

This is wrong: the is_interesting() check must happen before the output
step, because `*end = addr` means the range should be skipped, while
`return 0` asks the walk to continue.

> @@ -2044,7 +2050,7 @@ static int pagemap_scan_thp_entry(pmd_t *pmd, unsigned long start,
>  	 * Break huge page into small pages if the WP operation
>  	 * need to be performed is on a portion of the huge page.
>  	 */
> -	if (end != start + HPAGE_SIZE) {
> +	if (end != start + HPAGE_SIZE || ret == -ENOSPC) {

Why is it needed? If `end == start + HPAGE_SIZE` then we're handling a
full hugepage anyway.
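
The ordering concern in pagemap_scan_output() can be modelled in userspace. The following is only a sketch under stated assumptions: `struct scan_model`, `output_check_first()`, `output_buffer_first()` and `check_ordering_model()` are hypothetical names invented for illustration, and the model keeps only the two checks from the hunk, not the real function body.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the scan state; not the kernel's struct. */
struct scan_model {
	bool has_vec_buf;	/* models "p->vec_buf != NULL" */
};

/* Ordering before the patch: the "interesting" check runs first. */
static inline int output_check_first(const struct scan_model *p,
				     bool interesting,
				     unsigned long addr, unsigned long *end)
{
	if (!interesting) {
		*end = addr;	/* tell the caller to skip this range */
		return 0;
	}
	if (!p->has_vec_buf)
		return 0;	/* nothing to record, range is accepted */
	return 0;		/* the real function would record the range here */
}

/* Ordering in the quoted hunk: the buffer check runs first. */
static inline int output_buffer_first(const struct scan_model *p,
				      bool interesting,
				      unsigned long addr, unsigned long *end)
{
	if (!p->has_vec_buf)
		return 0;	/* returns before *end can be updated */
	if (!interesting) {
		*end = addr;
		return 0;
	}
	return 0;
}

/* Compare both orderings for an uninteresting page and no output buffer. */
static inline void check_ordering_model(void)
{
	const struct scan_model no_buf = { .has_vec_buf = false };
	const unsigned long addr = 0x1000UL;
	unsigned long end;

	end = addr + 0x1000UL;
	output_check_first(&no_buf, false, addr, &end);
	assert(end == addr);	/* range is skipped, as intended */

	end = addr + 0x1000UL;
	output_buffer_first(&no_buf, false, addr, &end);
	assert(end != addr);	/* range is not skipped: the reported problem */
}
```

Calling check_ordering_model() from a main() exercises the one case where the two orderings differ: no output buffer and an uninteresting page.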
> @@ -2066,8 +2072,8 @@ static int pagemap_scan_pmd_entry(pmd_t *pmd, unsigned long start,
>  {
>  	struct pagemap_scan_private *p = walk->private;
>  	struct vm_area_struct *vma = walk->vma;
> +	unsigned long addr, categories, next;
>  	pte_t *pte, *start_pte;
> -	unsigned long addr;
>  	bool flush = false;
>  	spinlock_t *ptl;
>  	int ret;
> @@ -2088,12 +2094,14 @@ static int pagemap_scan_pmd_entry(pmd_t *pmd, unsigned long start,
>  	}
>
>  	for (addr = start; addr != end; pte++, addr += PAGE_SIZE) {
> -		unsigned long categories = p->cur_vma_category |
> -			pagemap_page_category(vma, addr, ptep_get(pte));
> -		unsigned long next = addr + PAGE_SIZE;
> +		categories = p->cur_vma_category |
> +			pagemap_page_category(vma, addr, ptep_get(pte));
> +		next = addr + PAGE_SIZE;

Why move the variable declarations out of the loop?

>
>  		ret = pagemap_scan_output(categories, p, addr, &next);
> -		if (next == addr)
> +		if (ret == 0 && next == addr)
> +			continue;
> +		else if (next == addr)
>  			break;

Ah, this indeed was a bug. Nit:

  if (next == addr) {
    if (!ret)
      continue;
    break;
  }

> @@ -2204,8 +2212,6 @@ static const struct mm_walk_ops pagemap_scan_ops = {
>  static int pagemap_scan_get_args(struct pm_scan_arg *arg,
>  				 unsigned long uarg)
>  {
> -	unsigned long start, end, vec;
> -
>  	if (copy_from_user(arg, (void __user *)uarg, sizeof(*arg)))
>  		return -EFAULT;
>
> @@ -2219,22 +2225,24 @@ static int pagemap_scan_get_args(struct pm_scan_arg *arg,
>  	     arg->category_anyof_mask | arg->return_mask) & ~PM_SCAN_CATEGORIES)
>  		return -EINVAL;
>
> -	start = untagged_addr((unsigned long)arg->start);
> -	end = untagged_addr((unsigned long)arg->end);
> -	vec = untagged_addr((unsigned long)arg->vec);
> +	arg->start = untagged_addr((unsigned long)arg->start);
> +	arg->end = untagged_addr((unsigned long)arg->end);
> +	arg->vec = untagged_addr((unsigned long)arg->vec);

BTW, we should keep the tag in args writeback().
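
On the `next == addr` nit: the nested form is meant as a pure restructuring. A minimal userspace sketch can check that both forms pick the same loop action for every combination of inputs. The enum and function names are hypothetical, and `LOOP_PROCESS` simply stands for "fall through to the rest of the loop body".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum loop_action { LOOP_PROCESS, LOOP_CONTINUE, LOOP_BREAK };

/* Form used in the quoted hunk. */
static inline enum loop_action action_patch(int ret, bool next_is_addr)
{
	if (ret == 0 && next_is_addr)
		return LOOP_CONTINUE;
	else if (next_is_addr)
		return LOOP_BREAK;
	return LOOP_PROCESS;
}

/* Nested form suggested in the nit. */
static inline enum loop_action action_nit(int ret, bool next_is_addr)
{
	if (next_is_addr) {
		if (!ret)
			return LOOP_CONTINUE;
		return LOOP_BREAK;
	}
	return LOOP_PROCESS;
}

/* Returns true if both forms agree for a few representative ret values. */
static inline bool loop_forms_agree(void)
{
	const int rets[] = { 0, -ENOSPC, 1 };
	unsigned int i;

	for (i = 0; i < sizeof(rets) / sizeof(rets[0]); i++) {
		if (action_patch(rets[i], true) != action_nit(rets[i], true))
			return false;
		if (action_patch(rets[i], false) != action_nit(rets[i], false))
			return false;
	}
	return true;
}
```

With `ret == 0` and `next == addr` both forms skip the page, and with a non-zero `ret` and `next == addr` both stop the loop.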
>  	/* Validate memory pointers */
> -	if (!IS_ALIGNED(start, PAGE_SIZE))
> +	if (!IS_ALIGNED(arg->start, PAGE_SIZE))
>  		return -EINVAL;
> -	if (!access_ok((void __user *)start, end - start))
> +	if (!access_ok((void __user *)arg->start, arg->end - arg->start))
>  		return -EFAULT;
> -	if (!vec && arg->vec_len)
> +	if (!arg->vec && arg->vec_len)
>  		return -EFAULT;
> -	if (vec && !access_ok((void __user *)vec,
> +	if (arg->vec && !access_ok((void __user *)arg->vec,
>  			      arg->vec_len * sizeof(struct page_region)))
>  		return -EFAULT;
>
>  	/* Fixup default values */
> +	arg->end = (arg->end & ~PAGE_MASK) ?
> +		   ((arg->end & PAGE_MASK) + PAGE_SIZE) : (arg->end);

arg->end = ALIGN(arg->end, PAGE_SIZE);

>  	if (!arg->max_pages)
>  		arg->max_pages = ULONG_MAX;
>

Best Regards
Michał Mirosław
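
For reference, the open-coded round-up of `arg->end` and the ALIGN() form should compute the same value. The sketch below checks this in userspace under explicit assumptions: a 4 KiB page, and locally defined `MODEL_*` macros that stand in for the kernel's `PAGE_SIZE`, `PAGE_MASK` and `ALIGN()` rather than the kernel headers themselves. The function names are hypothetical.

```c
#include <assert.h>

/* Local stand-ins for the kernel macros, assuming a 4 KiB page. */
#define MODEL_PAGE_SIZE	4096UL
#define MODEL_PAGE_MASK	(~(MODEL_PAGE_SIZE - 1))
/* Round x up to a power-of-two alignment a. */
#define MODEL_ALIGN(x, a)	(((x) + ((a) - 1)) & ~((a) - 1))

/* Open-coded form from the quoted hunk. */
static inline unsigned long round_end_patch(unsigned long end)
{
	return (end & ~MODEL_PAGE_MASK) ?
		((end & MODEL_PAGE_MASK) + MODEL_PAGE_SIZE) : end;
}

/* Form suggested in the review. */
static inline unsigned long round_end_align(unsigned long end)
{
	return MODEL_ALIGN(end, MODEL_PAGE_SIZE);
}

/* Returns 1 if both forms agree on a few boundary values, 0 otherwise. */
static inline int rounding_forms_agree(void)
{
	const unsigned long samples[] = {
		0UL, 1UL, 4095UL, 4096UL, 4097UL, 8191UL, 8192UL, 0x7fffffffUL,
	};
	unsigned int i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		if (round_end_patch(samples[i]) != round_end_align(samples[i]))
			return 0;
	}
	return 1;
}
```

An already aligned address is left unchanged by both forms, and any other address is moved up to the next page boundary.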