From: Vishal Moola
Date: Mon, 2 Oct 2023 08:55:34 -0700
Subject: Re: [RFC PATCH 2/2] mm/khugepaged: Remove compound_pagelist
To: Yang Shi
Cc: Matthew Wilcox, linux-mm@kvack.org, akpm@linux-foundation.org, linux-kernel@vger.kernel.org
References: <20230922193639.10158-1-vishal.moola@gmail.com> <20230922193639.10158-3-vishal.moola@gmail.com>

On Thu, Sep 28, 2023 at 12:33 PM Yang Shi wrote:
>
> On Thu, Sep 28, 2023 at 2:05 AM Matthew Wilcox wrote:
> >
> > On Tue, Sep 26, 2023 at 03:07:18PM -0700, Yang Shi wrote:
> > > On Fri, Sep 22, 2023 at 9:33 PM Vishal Moola (Oracle)
> > > wrote:
> > > >
> > > > Currently, khugepaged builds a compound_pagelist while scanning, which
> > > > is used to properly account for compound pages. We can now account
> > > > for a compound page as a singular folio instead, so remove this list.
> > > >
> > > > Large folios are guaranteed to have consecutive ptes and addresses, so
> > > > once the first pte of a large folio is found skip over the rest.
> > >
> > > The address space may just map a partial folio, for example, in the
> > > extreme case the HUGE_PMD size range may have HUGE_PMD_NR folios with
> > > mapping one subpage from each folio per PTE. So assuming the PTE
> > > mapped folio is mapped consecutively may be wrong.
> >
> > How? You can do that with two VMAs, but this is limited to scanning
> > within a single VMA. If we've COWed a large folio, we currently do
> > so as a single page folio, and I'm not seeing any demand to change that.
> > If we did COW as a large folio, we'd COW every page in that folio.
> > How do we interleave two large folios in the same VMA?
>
> It is not about COW. The magic from mremap() may cause some corner
> cases. For example,
>
> We have a 2M VMA, every 4K of the VMA may be mapped to a subpage from
> different folios. Like:
>
> 0: #0 subpage of folio #0
> 1: #1 subpage of folio #1
> 2: #2 subpage of folio #2
> ....
> 511: #511 subpage of folio #511
>
> When khugepaged is scanning the VMA, it may just isolate and lock the
> folio #0, but skip all other folios since it assumes the VMA is just
> mapped by folio #0.
>
> This may trigger kernel bug when unlocking other folios which are
> actually not locked and maybe data corruption since the other folios
> may go away under us (unisolated, unlocked and unpinned).

Thanks for the review. I did not know this could happen; I'll drop
this patch for now until I can think of a better way to iterate
through ptes for large folios.
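
For illustration only, the corner case quoted above can be modeled in a
small userspace C sketch. This is not kernel code and not the code from
the patch: struct toy_pte, build_interleaved(), build_contiguous(),
scan_skip(), scan_each() and count_unlocked_mappings() are hypothetical
names made up for this example, and "locking" a folio is reduced to
setting a flag. It assumes every folio spans 512 pages, as in the 2M
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_PTES   512   /* 2M range / 4K pages, as in the example above */
#define NR_FOLIOS 512   /* upper bound on distinct folios in this model */

/* Toy stand-in for a pte: which folio it maps and which subpage of it. */
struct toy_pte {
    int folio;    /* hypothetical folio id */
    int subpage;  /* index of the mapped subpage within that folio */
};

/* Layout from the quoted example: pte i maps subpage #i of folio #i. */
void build_interleaved(struct toy_pte *ptes)
{
    for (int i = 0; i < NR_PTES; i++) {
        ptes[i].folio = i;
        ptes[i].subpage = i;
    }
}

/* Layout the skipping idea relies on: one folio mapped consecutively. */
void build_contiguous(struct toy_pte *ptes)
{
    for (int i = 0; i < NR_PTES; i++) {
        ptes[i].folio = 0;
        ptes[i].subpage = i;
    }
}

/*
 * Simplified model of "skip the rest of a large folio": look at one pte,
 * mark its folio locked, then jump ahead by folio_pages ptes.
 * Returns the number of distinct folios marked locked.
 */
int scan_skip(const struct toy_pte *ptes, int folio_pages, bool *locked)
{
    int count = 0;

    assert(folio_pages > 0);
    memset(locked, 0, NR_FOLIOS * sizeof(*locked));
    for (int i = 0; i < NR_PTES; i += folio_pages) {
        int f = ptes[i].folio;

        if (!locked[f]) {
            locked[f] = true;
            count++;
        }
    }
    return count;
}

/* Conservative scan: inspect every pte, lock each distinct folio once. */
int scan_each(const struct toy_pte *ptes, bool *locked)
{
    return scan_skip(ptes, 1, locked);
}

/* Number of ptes whose folio was never marked locked by the scan. */
int count_unlocked_mappings(const struct toy_pte *ptes, const bool *locked)
{
    int missed = 0;

    for (int i = 0; i < NR_PTES; i++)
        if (!locked[ptes[i].folio])
            missed++;
    return missed;
}
```

With the interleaved layout, scan_skip(ptes, 512, locked) marks only
folio #0 and leaves 511 mapped folios unlocked, which is the situation
described in the review. scan_each() marks all 512. With the contiguous
layout both scans agree.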