From: Dan Williams
Date: Wed, 13 Nov 2019 12:10:34 -0800
Subject: Re: [PATCH 2/3] mm: Introduce subsection_dev_map
To: David Hildenbrand
Cc: Toshiki Fukasawa, "linux-mm@kvack.org", "linux-kernel@vger.kernel.org", "akpm@linux-foundation.org", "mhocko@kernel.org", "adobriyan@gmail.com", "hch@lst.de", "longman@redhat.com", "sfr@canb.auug.org.au", "mst@redhat.com", "cai@lca.pw", Naoya Horiguchi, Junichi Nomura
In-Reply-To: <6193C847-F09C-439A-81EE-98A59473D582@redhat.com>

On Wed, Nov 13, 2019 at 11:53 AM David Hildenbrand wrote:
>
>
>
> > On 13.11.2019 at 20:06, Dan Williams wrote:
> >
> > On Wed, Nov 13, 2019 at 10:51 AM David Hildenbrand wrote:
> >>
> >>> On 08.11.19 20:13, Dan Williams wrote:
> >>> On Thu, Nov 7, 2019 at 4:15 PM Toshiki Fukasawa wrote:
> >>>>
> >>>> Currently, there is no way to identify pfn on ZONE_DEVICE.
> >>>> Identifying pfn on system memory can be done by using a
> >>>> section-level flag. On the other hand, identifying pfn on
> >>>> ZONE_DEVICE requires a subsection-level flag since ZONE_DEVICE
> >>>> can be created in units of subsections.
> >>>>
> >>>> This patch introduces a new bitmap subsection_dev_map so that
> >>>> we can identify pfn on ZONE_DEVICE.
> >>>>
> >>>> Also, subsection_dev_map is used to prove that struct pages
> >>>> included in the subsection have been initialized since it is
> >>>> set after memmap_init_zone_device(). We can avoid accessing
> >>>> pages currently being initialized by checking subsection_dev_map.
> >>>>
> >>>> Signed-off-by: Toshiki Fukasawa
> >>>> ---
> >>>>  include/linux/mmzone.h | 19 +++++++++++++++++++
> >>>>  mm/memremap.c          |  2 ++
> >>>>  mm/sparse.c            | 32 ++++++++++++++++++++++++++++++++
> >>>>  3 files changed, 53 insertions(+)
> >>>>
> >>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >>>> index bda2028..11376c4 100644
> >>>> --- a/include/linux/mmzone.h
> >>>> +++ b/include/linux/mmzone.h
> >>>> @@ -1174,11 +1174,17 @@ static inline unsigned long section_nr_to_pfn(unsigned long sec)
> >>>>
> >>>>  struct mem_section_usage {
> >>>>         DECLARE_BITMAP(subsection_map, SUBSECTIONS_PER_SECTION);
> >>>> +#ifdef CONFIG_ZONE_DEVICE
> >>>> +       DECLARE_BITMAP(subsection_dev_map, SUBSECTIONS_PER_SECTION);
> >>>> +#endif
> >>>
> >>> Hi Toshiki,
> >>>
> >>> There is currently an effort to remove the PageReserved() flag as some
> >>> code is using that to detect ZONE_DEVICE. In reviewing those patches
> >>> we realized that what many code paths want is to detect online memory.
> >>> So instead of a subsection_dev_map add a subsection_online_map. That
> >>> way pfn_to_online_page() can reliably avoid ZONE_DEVICE ranges. I
> >>> otherwise question the use case for pfn_walkers to return pages for
> >>> ZONE_DEVICE pages, I think the skip behavior when pfn_to_online_page()
> >>> == false is the right behavior.
> >>
> >> To be more precise, I recommended a subsection_active_map, to indicate
> >> which memmaps were fully initialized and can safely be touched (e.g., to
> >> read the zone/nid). This map would also be set when the devmem memmaps
> >> were initialized (race between adding memory/growing the section and
> >> initializing the memmap).
> >>
> >> See
> >>
> >> https://lkml.org/lkml/2019/10/10/87
> >>
> >> and
> >>
> >> https://www.spinics.net/lists/linux-driver-devel/msg130012.html
> >
> > I'm still struggling to understand the motivation of distinguishing
> > "active" as something distinct from "online". As long as the "online"
> > granularity is improved from sections down to subsections then most
> > code paths are good to go. The others can use get_dev_pagemap() to
> > check for ZONE_DEVICE in a race free manner as they currently do.
>
> I thought we wanted to unify access if we don't really care about the zone as in most pfn walkers - E.g., for zone shrinking.

Agreed, when the zone does not matter, which is most cases, then
pfn_online() and pfn_valid() are sufficient.

> Anyhow, a subsection online map would be a good start, we can reuse that later for ZONE_DEVICE as well.

Cool. Are you good with me sending a patch to introduce pfn_online() and a
corresponding subsection_map for the same?
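
For readers following the thread, here is a rough user-space sketch of the idea under discussion: tracking "online" at subsection rather than section granularity so that a pfn_online()-style check can skip ZONE_DEVICE ranges. This is not the kernel implementation and not the patch proposed above. The struct, the array of sections, and the model_* / mark_subsection names are all hypothetical, and the constants are only modeled on common x86_64 values (4 KiB pages, 128 MiB sections, 2 MiB subsections).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry, modeled on x86_64: 4 KiB pages, 128 MiB sections,
 * 2 MiB subsections, so 64 subsections per section. */
#define PAGE_SHIFT            12
#define SECTION_SIZE_BITS     27
#define SUBSECTION_SHIFT      21
#define PFN_SECTION_SHIFT     (SECTION_SIZE_BITS - PAGE_SHIFT)
#define PFN_SUBSECTION_SHIFT  (SUBSECTION_SHIFT - PAGE_SHIFT)
#define PAGES_PER_SECTION     (1UL << PFN_SECTION_SHIFT)
#define NR_SECTIONS           8UL   /* hypothetical, tiny address space */

/* Hypothetical stand-in for struct mem_section_usage: one bit per
 * subsection fits in a 64-bit word with the geometry above. */
struct section_usage {
	uint64_t subsection_map;        /* subsection has a memmap */
	uint64_t subsection_online_map; /* subsection is online system RAM */
};

static struct section_usage sections[NR_SECTIONS];

static unsigned long pfn_to_section(unsigned long pfn)
{
	return pfn >> PFN_SECTION_SHIFT;
}

static unsigned int pfn_to_subsection(unsigned long pfn)
{
	return (pfn & (PAGES_PER_SECTION - 1)) >> PFN_SUBSECTION_SHIFT;
}

/* Mark the subsection containing pfn as present; device memory would be
 * added with online == false so pfn walkers skip it. */
static void mark_subsection(unsigned long pfn, bool online)
{
	unsigned long nr = pfn_to_section(pfn);
	uint64_t bit = UINT64_C(1) << pfn_to_subsection(pfn);

	assert(nr < NR_SECTIONS);
	sections[nr].subsection_map |= bit;
	if (online)
		sections[nr].subsection_online_map |= bit;
}

/* Model of a subsection-granular pfn_valid(): is there a memmap at all? */
static bool model_pfn_valid(unsigned long pfn)
{
	unsigned long nr = pfn_to_section(pfn);

	if (nr >= NR_SECTIONS)
		return false;
	return sections[nr].subsection_map >> pfn_to_subsection(pfn) & 1;
}

/* Model of the proposed pfn_online(): true only for online system RAM. */
static bool model_pfn_online(unsigned long pfn)
{
	unsigned long nr = pfn_to_section(pfn);

	if (nr >= NR_SECTIONS)
		return false;
	return sections[nr].subsection_online_map >> pfn_to_subsection(pfn) & 1;
}
```

In this model, calling mark_subsection(0, true) and mark_subsection(512, false) puts system RAM and device memory in adjacent subsections of the same section; a walker that checks model_pfn_online() would visit pfns 0..511 and skip 512..1023, which a section-level flag could not express.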