Date: Sun, 29 Dec 2024 21:36:41 -0800 (PST)
From: David Rientjes
To: Karim Manaouil
cc: Gregory Price, Aneesh Kumar, David Hildenbrand, John Hubbard,
    Kirill Shutemov, Matthew Wilcox, Mel Gorman, "Rao, Bharata Bhasker",
    Rik van Riel, RaghavendraKT, Wei Xu, Suyeon Lee, Lei Chen,
    "Shukla, Santosh", "Grimm, Jon", sj@kernel.org, shy828301@gmail.com,
    Zi Yan, Liam Howlett, Gregory Price, linux-mm@kvack.org
Subject: Re: Slow-tier Page Promotion discussion recap and open questions
In-Reply-To: <20241226012833.rmmbkws4wdhzdht6@ed.ac.uk>
References: <6d582bb6-3ba5-1768-92f2-6025340a3cd4@google.com>
    <20241226012833.rmmbkws4wdhzdht6@ed.ac.uk>

On Thu, 26 Dec 2024, Karim Manaouil wrote:

> On Wed, Dec 18, 2024 at 07:56:19PM -0500, Gregory Price wrote:
> > On Tue, Dec 17, 2024 at 08:19:56PM -0800, David Rientjes wrote:
> > > ----->o-----
> > > Raghu noted the current promotion destination is node 0 by default. Wei
> > > noted we could get some page owner information to determine things like
> > > mempolicies or compute the distance between nodes and, if multiple nodes
> > > have the same distance, choose one of them just as we do for demotions.
> > >
> > > Gregory Price noted some downsides to using mempolicies for this based on
> > > per-task, per-vma, and cross socket policies, so using the kernel's
> > > memory tiering policies is probably the best way to go about it.
> > >
> >
> > Slightly elaborating here:
> > - In an async context, associating a page with a specific task is not
> >   presently possible (that I know of). The most we know is the last
> >   accessing CPU - maybe - in the page/folio struct. Right now this
> >   is disabled in favor of a timestamp when tiering is enabled.
> >
> >   A process with 2 tasks which have access to the page may not run
> >   on the same socket, so we run the risk of migrating to a bad target.
> >   Best effort here would suggest either socket is fine - since they're
> >   both "fast nodes" - but this requires that we record the last
> >   accessing CPU for a page at identification time.
> >
>
> This can be solved with a two-step migration: first, you promote the
> page from CXL to a NUMA node, then you rely on NUMA balancing to
> further place the page into the right NUMA node. NUMA hint faults can
> still be enabled for pages allocated from NUMA nodes, but not for CXL.
>

I think it would be a shame to promote to the wrong top-tier NUMA node and
rely on NUMA Balancing to fix it up with yet another migration :/

Since these cpuless memory nodes should have a promotion node associated
with them, which by default is chosen from the latency given to us by the
HMAT, can we make that the default promotion target when memory is accessed?
The "normal mode" for NUMA Balancing could fix this up subsequent to the
promotion, but only if enabled.

Raghu noted in the session that the current patch series only promotes to
node 0, but that choice is only for the RFC. I *assume* that every CXL
memory node will have a standard top-tier node to promote to *or* that we
stash that promotion node information at the time of demotion so memory
comes back to the same node it was demoted from.

Either way, this feels like a solvable problem?
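
As a rough illustration of the two options raised above (a default
promotion target picked by latency, or a node stashed at demotion time),
here is a minimal userspace sketch in Python. It is not kernel code: the
function names, the latency table, and the `demoted_from` map are all
hypothetical, and ties are broken by lowest node id only to keep the
example deterministic.

```python
# Userspace model only. Every name here is hypothetical; none of this is a
# kernel API.
from typing import Dict, Optional, Sequence, Set


def default_promotion_target(src: int, top_tier: Set[int],
                             latency: Sequence[Sequence[int]]) -> int:
    """Return the top-tier node with the lowest latency from `src`.

    `latency[a][b]` stands in for whatever the platform reports (for
    example via the HMAT). Ties go to the lowest node id.
    """
    if not top_tier:
        raise ValueError("no top-tier nodes to promote to")
    return min(top_tier, key=lambda node: (latency[src][node], node))


def promotion_target(folio: int, src: int, demoted_from: Dict[int, int],
                     top_tier: Set[int],
                     latency: Sequence[Sequence[int]]) -> int:
    """Prefer the node the folio was demoted from, else the default target."""
    stashed: Optional[int] = demoted_from.get(folio)
    if stashed is not None and stashed in top_tier:
        return stashed
    return default_promotion_target(src, top_tier, latency)


# Made-up topology: nodes 0 and 1 have CPUs, node 2 is CXL memory that sits
# closer to node 1.
LATENCY = [
    [10, 21, 30],
    [21, 10, 24],
    [30, 24, 10],
]
TOP_TIER = {0, 1}

# Folio 7 was demoted from node 0, so it goes back there. Folio 8 has no
# stashed node and falls back to the nearest top-tier node.
print(promotion_target(7, 2, {7: 0}, TOP_TIER, LATENCY))  # 0
print(promotion_target(8, 2, {7: 0}, TOP_TIER, LATENCY))  # 1
```

In this model the stashed node wins whenever it is still a valid top-tier
node, which avoids a second migration if the original placement was right,
at the cost of keeping some per-folio state. Whether that state is cheap
enough to keep is exactly the open question in the thread.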