Date: Wed, 10 Aug 2022 09:17:02 -0400
From: Peter Xu
To: "Huang, Ying"
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Minchan Kim, David Hildenbrand, Nadav Amit, Andrew Morton, Hugh Dickins, Vlastimil Babka, Andrea Arcangeli, Andi Kleen, "Kirill A . Shutemov"
Subject: Re: [PATCH v3 3/7] mm/swap: Add swp_offset_pfn() to fetch PFN from swap entry
References: <20220809220100.20033-1-peterx@redhat.com> <20220809220100.20033-4-peterx@redhat.com> <87bkssfxcf.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <87bkssfxcf.fsf@yhuang6-desk2.ccr.corp.intel.com>

On Wed, Aug 10, 2022 at 02:04:32PM +0800, Huang, Ying wrote:
> Peter Xu writes:
>
> > We've got a bunch of special swap entries that stores PFN inside the swap
> > offset fields.  To fetch the PFN, normally the user just calls swp_offset()
> > assuming that'll be the PFN.
> >
> > Add a helper swp_offset_pfn() to fetch the PFN instead, fetching only the
> > max possible length of a PFN on the host, meanwhile doing proper check with
> > MAX_PHYSMEM_BITS to make sure the swap offsets can actually store the PFNs
> > properly always using the BUILD_BUG_ON() in is_pfn_swap_entry().
> >
> > One reason to do so is we never tried to sanitize whether swap offset can
> > really fit for storing PFN.  At the meantime, this patch also prepares us
> > with the future possibility to store more information inside the swp offset
> > field, so assuming "swp_offset(entry)" to be the PFN will not stand any
> > more very soon.
> >
> > Replace many of the swp_offset() callers to use swp_offset_pfn() where
> > proper.  Note that many of the existing users are not candidates for the
> > replacement, e.g.:
> >
> >   (1) When the swap entry is not a pfn swap entry at all, or,
> >   (2) when we wanna keep the whole swp_offset but only change the swp type.
> >
> > For the latter, it can happen when fork() triggered on a write-migration
> > swap entry pte, we may want to only change the migration type from
> > write->read but keep the rest, so it's not "fetching PFN" but "changing
> > swap type only".  They're left aside so that when there're more information
> > within the swp offset they'll be carried over naturally in those cases.
> >
> > Since at it, dropping hwpoison_entry_to_pfn() because that's exactly what
> > the new swp_offset_pfn() is about.
> >
> > Signed-off-by: Peter Xu
>
> The patch itself looks good.  But I searched swp_entry() in kernel
> source code, and found that we need to do more.
>
> For example, in pte_to_pagemap_entry()
>
>     frame = swp_type(entry) |
>         (swp_offset(entry) << MAX_SWAPFILES_SHIFT);
>
> If it's a migration entry, we need
>
>     frame = swp_type(entry) |
>         (swp_offset_pfn(entry) << MAX_SWAPFILES_SHIFT);
>
> So I think you need to search all swp_offset() calling in the kernel
> source and check whether they need to be changed.

Yeah, I actually looked at all of them and explicitly left this one alone,
since I wanted to dump the whole swp entry.  Even though it's called
"show_pfn", it has always dumped the whole entry: e.g., for genuine swap
entries I don't think what's stored in the swp offset is a PFN, so it's not
about the PFN but about the swp offset itself, IMHO.

But on second thought I agree it should be handled specially here, because
a user app could be relying on the offset being the PFN for migration
entries.

The other thing is that I'm not sure whether the encoding of pagemap
entries can always fit both the PFN and the A/D bits (mainly
PM_PFRAME_MASK), even if the arch swap pte fits; it needs more math.  So
unless necessary, it'd be good to keep the A/D bits internal to the kernel
too.

Thanks for the careful review, I'll fix that.

-- 
Peter Xu
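
To illustrate the point discussed above, here is a minimal userspace sketch
in C of why the pagemap frame differs depending on whether it is built from
the whole swap offset or from the PFN part only.  Everything in it is an
assumption made for illustration: the demo_* names, the 64-bit layout with
the type in the top bits, and the constant values are hypothetical and are
not the kernel's definitions, which depend on the architecture and config.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants only; real values are per-arch and per-config. */
#define DEMO_MAX_SWAPFILES_SHIFT 5
#define DEMO_TYPE_SHIFT   (64 - DEMO_MAX_SWAPFILES_SHIFT)
#define DEMO_OFFSET_MASK  ((UINT64_C(1) << DEMO_TYPE_SHIFT) - 1)
#define DEMO_PFN_BITS     40  /* assumed: 52 physical address bits - 12 page bits */
#define DEMO_PFN_MASK     ((UINT64_C(1) << DEMO_PFN_BITS) - 1)

/* Compile-time check in the spirit of the BUILD_BUG_ON() mentioned above:
 * the offset field must be wide enough to hold any PFN. */
static_assert(DEMO_PFN_BITS <= DEMO_TYPE_SHIFT,
              "swap offset field too small to store a PFN");

/* Hypothetical stand-in for swp_entry_t: type in the top bits, offset below. */
typedef struct { uint64_t val; } demo_swp_entry_t;

static inline demo_swp_entry_t demo_swp_entry(unsigned type, uint64_t offset)
{
    demo_swp_entry_t e = {
        ((uint64_t)type << DEMO_TYPE_SHIFT) | (offset & DEMO_OFFSET_MASK)
    };
    return e;
}

static inline unsigned demo_swp_type(demo_swp_entry_t e)
{
    return (unsigned)(e.val >> DEMO_TYPE_SHIFT);
}

/* Whole offset field, including any extra bits stored above the PFN. */
static inline uint64_t demo_swp_offset(demo_swp_entry_t e)
{
    return e.val & DEMO_OFFSET_MASK;
}

/* Only the PFN part of the offset field. */
static inline uint64_t demo_swp_offset_pfn(demo_swp_entry_t e)
{
    return demo_swp_offset(e) & DEMO_PFN_MASK;
}

/* Pagemap-style frame: type in the low bits, offset (or PFN) above it. */
static inline uint64_t demo_frame(demo_swp_entry_t e, int is_pfn_entry)
{
    uint64_t off = is_pfn_entry ? demo_swp_offset_pfn(e) : demo_swp_offset(e);
    return demo_swp_type(e) | (off << DEMO_MAX_SWAPFILES_SHIFT);
}
```

With these assumed constants, an entry of type 30 whose offset holds PFN
0x12345 plus one extra bit just above the PFN field gives
demo_frame(e, 1) == 0x2468BE, while demo_frame(e, 0) also exposes the extra
bit to the reader of the frame.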