From: Dan Williams
Date: Wed, 13 Nov 2019 11:06:35 -0800
Subject: Re: [PATCH 2/3] mm: Introduce subsection_dev_map
To: David Hildenbrand
Cc: Toshiki Fukasawa, "linux-mm@kvack.org", "linux-kernel@vger.kernel.org",
 "akpm@linux-foundation.org", "mhocko@kernel.org", "adobriyan@gmail.com",
 "hch@lst.de", "longman@redhat.com", "sfr@canb.auug.org.au", "mst@redhat.com",
 "cai@lca.pw", Naoya Horiguchi, Junichi Nomura
References: <20191108000855.25209-1-t-fukasawa@vx.jp.nec.com>
 <20191108000855.25209-3-t-fukasawa@vx.jp.nec.com>
 <163d1d41-19c1-d8cf-6c1c-eec226c34ac1@redhat.com>
In-Reply-To: <163d1d41-19c1-d8cf-6c1c-eec226c34ac1@redhat.com>

On Wed, Nov 13, 2019 at 10:51 AM David Hildenbrand wrote:
>
> On 08.11.19 20:13, Dan Williams wrote:
> > On Thu, Nov 7, 2019 at 4:15 PM Toshiki Fukasawa
> > wrote:
> >>
> >> Currently, there is no way to identify pfn on ZONE_DEVICE.
> >> Identifying pfn on system memory can be done by using a
> >> section-level flag. On the other hand, identifying pfn on
> >> ZONE_DEVICE requires a subsection-level flag since ZONE_DEVICE
> >> can be created in units of subsections.
> >>
> >> This patch introduces a new bitmap subsection_dev_map so that
> >> we can identify pfn on ZONE_DEVICE.
> >>
> >> Also, subsection_dev_map is used to prove that struct pages
> >> included in the subsection have been initialized since it is
> >> set after memmap_init_zone_device(). We can avoid accessing
> >> pages currently being initialized by checking subsection_dev_map.
> >>
> >> Signed-off-by: Toshiki Fukasawa
> >> ---
> >>  include/linux/mmzone.h | 19 +++++++++++++++++++
> >>  mm/memremap.c          |  2 ++
> >>  mm/sparse.c            | 32 ++++++++++++++++++++++++++++++++
> >>  3 files changed, 53 insertions(+)
> >>
> >> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >> index bda2028..11376c4 100644
> >> --- a/include/linux/mmzone.h
> >> +++ b/include/linux/mmzone.h
> >> @@ -1174,11 +1174,17 @@ static inline unsigned long section_nr_to_pfn(unsigned long sec)
> >>
> >>  struct mem_section_usage {
> >>         DECLARE_BITMAP(subsection_map, SUBSECTIONS_PER_SECTION);
> >> +#ifdef CONFIG_ZONE_DEVICE
> >> +       DECLARE_BITMAP(subsection_dev_map, SUBSECTIONS_PER_SECTION);
> >> +#endif
> >
> > Hi Toshiki,
> >
> > There is currently an effort to remove the PageReserved() flag as some
> > code is using that to detect ZONE_DEVICE. In reviewing those patches
> > we realized that what many code paths want is to detect online memory.
> > So instead of a subsection_dev_map add a subsection_online_map. That
> > way pfn_to_online_page() can reliably avoid ZONE_DEVICE ranges. I
> > otherwise question the use case for pfn_walkers to return pages for
> > ZONE_DEVICE pages, I think the skip behavior when pfn_to_online_page()
> > == false is the right behavior.
>
> To be more precise, I recommended an subsection_active_map, to indicate
> which memmaps were fully initialized and can safely be touched (e.g., to
> read the zone/nid). This map would also be set when the devmem memmaps
> were initialized (race between adding memory/growing the section and
> initializing the memmap).
>
> See
>
> https://lkml.org/lkml/2019/10/10/87
>
> and
>
> https://www.spinics.net/lists/linux-driver-devel/msg130012.html

I'm still struggling to understand the motivation of distinguishing
"active" as something distinct from "online". As long as the "online"
granularity is improved from sections down to subsections then most
code paths are good to go.
The others can use get_dev_pagemap() to check for ZONE_DEVICE in a
race-free manner as they currently do.

> I dislike a map that is specific to ZONE_DEVICE or (currently)
> !ZONE_DEVICE. I rather want an indication "this memmap is safe to
> touch". As discussed along the mentioned threads, we can combine this
> later with RCU to handle some races that are currently possible.

The RCU protection is independent of the pfn_active vs pfn_online
distinction afaics.
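
To make the subsection-granularity point concrete, here is a rough
user-space sketch of a per-section bitmap with one bit per subsection
and an "online" lookup against it. It is an illustration only, not
kernel code: toy_section, toy_subsection_idx(), toy_online_range() and
toy_pfn_online() are hypothetical names, and the geometry (4 KiB pages,
2 MiB subsections, 128 MiB sections) is an assumption modelled on
x86-64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry: 4 KiB pages, 2 MiB subsections, 128 MiB sections. */
#define PAGES_PER_SUBSECTION    512UL
#define SUBSECTIONS_PER_SECTION 64UL
#define PAGES_PER_SECTION       (PAGES_PER_SUBSECTION * SUBSECTIONS_PER_SECTION)

/* Hypothetical stand-in for struct mem_section_usage: one bit per subsection. */
struct toy_section {
	uint64_t subsection_online_map;
};

/* Index of the subsection that contains pfn, relative to its section. */
static unsigned long toy_subsection_idx(unsigned long pfn)
{
	return (pfn % PAGES_PER_SECTION) / PAGES_PER_SUBSECTION;
}

/*
 * Mark every subsection touched by [pfn, pfn + nr_pages) as online.
 * The range must be non-empty and must not cross a section boundary.
 */
static void toy_online_range(struct toy_section *s, unsigned long pfn,
			     unsigned long nr_pages)
{
	unsigned long first, last, i;

	assert(nr_pages > 0);
	assert((pfn % PAGES_PER_SECTION) + nr_pages <= PAGES_PER_SECTION);

	first = toy_subsection_idx(pfn);
	last = toy_subsection_idx(pfn + nr_pages - 1);
	for (i = first; i <= last; i++)
		s->subsection_online_map |= UINT64_C(1) << i;
}

/* A pfn counts as online only if its subsection bit is set. */
static bool toy_pfn_online(const struct toy_section *s, unsigned long pfn)
{
	return (s->subsection_online_map >> toy_subsection_idx(pfn)) & 1;
}
```

In this sketch a ZONE_DEVICE range would simply never have its bits
set, so a pfn walker that skips pfns for which toy_pfn_online() is
false would skip device pages without needing a device-specific map.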