Date: Mon, 19 Feb 2024 21:10:13 -0800 (PST)
From: Hugh Dickins
To: Charan Teja Kalla
cc: Hugh Dickins, Andrew Morton, willy@infradead.org, rientjes@google.com,
    surenb@google.com, fvdl@google.com, quic_pkondeti@quicinc.com,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, Minchan Kim
Subject: Re: [PATCH V7 2/2] mm: shmem: implement POSIX_FADV_[WILL|DONT]NEED for shmem
In-Reply-To: <64ed46f4-459c-63b0-a69e-81353e9fcbc9@quicinc.com>
References: <631e42b6dffdcc4b4b24f5be715c37f78bf903db.1676378702.git.quic_charante@quicinc.com>
    <2d56e1dd-68b5-c99e-522f-f8dadf6ad69e@google.com>
    <64ed46f4-459c-63b0-a69e-81353e9fcbc9@quicinc.com>

On Wed, 14 Feb 2024, Charan Teja Kalla wrote:

> Hello Hugh,
>
> Based on offline discussion with some folks in the list, it seems that
> this syscall can be helpful. This patch might have forgotten and I hope
> this ping helps in resurrecting this thread.

Charan, it's not forgotten, but it was relayed to you through another
channel a month ago, that I did not expect to have time to think about
this for 3 months.  Countdown says 2 months to go now.

I realize that it's frustrating for you; it's unpleasant for me too.

>
> On 5/18/2023 6:16 PM, Charan Teja Kalla wrote:
> > On 5/17/2023 5:02 PM, Hugh Dickins wrote:
> >>> Sure, will include those range calculations for shmem pages too.
> >> Oh, I forgot this issue, you would have liked me to look at V8 by now,
> >> to see whether I agree with your resolution there. Sorry, no, I've
> >> not been able to divert my concentration to it yet.
> >>
> >> And it's quite likely that I shall disagree, because I've a history of
> >> disagreeing even with myself on such range widening/narrowing issues -
> >> reconciling conflicting precedents is difficult 🙁
> >>
> > If you can at least help by commenting which part of the patch you
> > disagree with, I can try hard to convince you there:) .
> >
> >>> Please let me know if I'm missing something where I should be counting
> >>> these as NR_ISOLATED.
> >> Please grep for NR_ISOLATED, to see where and how they get manipulated
> >> already, and follow the existing examples. The case that sticks in my
> >> mind is in mm/mempolicy.c, where the migrate_pages() syscall can build
> >> up a gigantic quantity of transiently isolated pages: your syscall can
> >> do the same, so should account for itself in the same way.
>
> Based on the grep, it seems almost all the call stacks that isolates the
> folios is for migrating the pages where after migration the NR_ISOLATED
> is decremented (in migrate_folio_done()). The call paths are(compaction,
> memory hotplug, mempolicy).
>
> The another call path is reclaim where we isolate 'nr' pages belongs to
> a pgdat, account/unaccount them in NR_ISOLATED across the reclaim.
>
> I think it is easy to account for the above call paths as we know "which
> folio corresponds to which pgdat".
>
> Where as in this patch, we are isolating a set of folios(can corresponds
> to different nodes) and relying on the reclaim_pages() to do the swap
> out. It is straightforward to account NR_ISOLATED while isolating, but
> it requires unaccounting changes in the shrink_folio_list() where folio
> is being freed after swap out. Doing so requires changes in all the
> code places(eg: shrink_inactive_list()), where it now requires to
> account NR_ISOLATED while isolating and the shrink_folio_list()
> unaccounts it.
>
> So, accounting NR_ISOLATED requires changes in other code places where
> this patch has not touched.

That surprises me, though I do recall that there's an irritating
asymmetry in where NR_ISOLATED is accounted and unaccounted.
I have not checked what you say there, may do so in 2 months.

>
> If isolating a large amount of pages and not being recorded in
> NR_ISOLATED is really a problem, then may I please know your opinion on
> isolating(with out accounting) and reclaiming in small batches? The
> batch size can be considered as SWAP_CLUSTER_MAX of pages.

In most circumstances, omitting to account NR_ISOLATED wouldn't show up
as a problem; in low memory it would.  Splitting into small batches
without accounting might be an easier and better way; but whatever I
say in a hurried unthoughtful reply is likely to be wrong.

I am not convinced that isolating is even appropriate: I think I hinted
before that I would want to compare what you do here with what
shmem_swapin_range() does in mm/madvise.c, and the
shmem_collapse_swapin() I'll be proposing to avoid swapin while
building up THP in collapse_file().

But it may well be that you've found the switching of LRUs essential:
I'm not prejudging, just saying I cannot rush to judgment.  And this
is also a new UAPI for tmpfs, so should not be rushed then regretted.

But if you can find another champion to force this into mm/shmem.c
for you faster than I can manage, well, I don't own any Linux source.
It's not unusual for me to limp along later and rearrange things to
suit my preference.

Hugh

>
> > I had a V8 posted without this into accounting. Let me make the changes
> > to account for the NR_ISOLATED too.
>
> Thanks,
> Charan
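
The small-batch idea raised in the thread (isolate at most
SWAP_CLUSTER_MAX pages, reclaim them, then continue) can be illustrated
with a rough user-space model. This is only a sketch under stated
assumptions, not kernel code and not the patch's implementation:
reclaim_in_batches(), struct reclaim_stats and BATCH_MAX are hypothetical
names, and BATCH_MAX is assumed to mirror the kernel's SWAP_CLUSTER_MAX
value of 32. It only shows that batching bounds how many items are held
off-list at any one time, which is the quantity NR_ISOLATED is meant to
track.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: mirrors SWAP_CLUSTER_MAX (32) from include/linux/swap.h. */
#define BATCH_MAX 32UL

/* Hypothetical bookkeeping for the model; not a kernel structure. */
struct reclaim_stats {
	size_t reclaimed;      /* total items handed to the "reclaim" step */
	size_t peak_isolated;  /* most items held off-list at one time */
	size_t batches;        /* number of reclaim calls made */
};

/*
 * Hypothetical user-space model: walk nr_items, "isolating" at most
 * `batch` items before "reclaiming" them.  Nothing is accounted in a
 * counter like NR_ISOLATED; the batch size alone bounds the backlog.
 */
static struct reclaim_stats reclaim_in_batches(size_t nr_items, size_t batch)
{
	struct reclaim_stats st = { 0, 0, 0 };
	size_t isolated = 0;

	assert(batch > 0);

	for (size_t i = 0; i < nr_items; i++) {
		isolated++;                     /* isolate one item */
		if (isolated > st.peak_isolated)
			st.peak_isolated = isolated;
		if (isolated == batch) {        /* batch full: reclaim it */
			st.reclaimed += isolated;
			st.batches++;
			isolated = 0;
		}
	}
	if (isolated) {                         /* final partial batch */
		st.reclaimed += isolated;
		st.batches++;
	}
	return st;
}
```

Calling reclaim_in_batches(1000, BATCH_MAX) never holds more than 32
items at once, whereas reclaim_in_batches(1000, 1000) holds all 1000
before reclaiming. The second case corresponds to the large transient
backlog that the thread suggests could matter under low memory.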