From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8435DC433DB for ; Thu, 14 Jan 2021 06:18:22 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id EBCD3238EC for ; Thu, 14 Jan 2021 06:18:21 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EBCD3238EC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id DD94C8D00BE; Thu, 14 Jan 2021 01:18:20 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id D8A668D008E; Thu, 14 Jan 2021 01:18:20 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C9FDE8D00BE; Thu, 14 Jan 2021 01:18:20 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0223.hostedemail.com [216.40.44.223]) by kanga.kvack.org (Postfix) with ESMTP id B54928D008E for ; Thu, 14 Jan 2021 01:18:20 -0500 (EST) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 6D5591EF3 for ; Thu, 14 Jan 2021 06:18:20 +0000 (UTC) X-FDA: 77703375960.24.quiet19_0217e3f27524 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin24.hostedemail.com (Postfix) with ESMTP id 5452A1A4A7 for ; Thu, 14 Jan 2021 06:18:20 +0000 (UTC) X-HE-Tag: quiet19_0217e3f27524 X-Filterd-Recvd-Size: 5457 Received: from mail-ej1-f53.google.com (mail-ej1-f53.google.com [209.85.218.53]) by imf45.hostedemail.com (Postfix) with ESMTP for ; Thu, 14 Jan 2021 06:18:19 +0000 (UTC) Received: by mail-ej1-f53.google.com with SMTP id ga15so6562778ejb.4 for ; Wed, 13 Jan 2021 22:18:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=intel-com.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=V6f56GHlvX+30vh+DfZsExZ/lVAivzj42R6p1vKO9YA=; b=1VxNfJeYo2SCW48DoIhJXUv4ZH/NwkAwPSchXiBWcyb0pHcnoyYSzydSAPwShL9/w+ LsvaAe7mF6uXUVEFgv5fSiXdCtUeZ3JEhabsZ1Ccl7uxjM2smFWVoo+HEHwyfNwaow7L 8S9JJletV2OtIkQzBQrT4j7RXy0gfBR2aSiYdcTGriyLkQBsMghrRWwKrTSSGixJta+E JTAaiD6v8ePqVmOmR6wmW6fYZrqwpxpMlS8fPgie+k9CvynrC5o/6nbXAQHkMjMi+pv1 tC0vWVPGaRdnesbjLjoyy6C8l2CbwOBjauh1Df9fiTLBZSzOFVKd8W/XZmcJvLwD0bkw qoOg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=V6f56GHlvX+30vh+DfZsExZ/lVAivzj42R6p1vKO9YA=; b=PsGIqJR/2gqtxgE5nnLnnKivTpQAeL7wXxryjfmea4lqIouasbHaB2SZrMz7H44NPo It/fTYG3k5xMmuoKzxrv0t2JeGEaJ9BxJaaASmfiiRHbDTEJ1zKy+fewOo015EHPMrmc pry4wDLNh9ir2lhewxj5vj+k/LU/bmWVogXPJwUdjTrSttAjc//Ljpb6O4WiqoazzKRs 9CBSXz3u7zpTG/WLHZWcWCoStIBwhQss8CNbyxgYzGR2afuUVtnXdPvrJMxJoq1pGFj/ BdSNPmh1c4Ud6GtKUA6/NXhOxCYcfOgh8biwte2S9aFh20fNbqP9/GHCqDYnfyKy9opw tVdw== X-Gm-Message-State: AOAM53025ptzOO7PFiRRHzT5UVYzguoiOEPsz/mpfPNFb5tJlEWn3Xbe j8bvTo54OntUFJBl1Uy271C+p9eZyfMPdk8fkQ7EsA== X-Google-Smtp-Source: ABdhPJxrgOBl9xEDA1Gqh4koRRCp2CcKtKLfmgUGiGQTbaKZpts+wK9W7/mgOVdXnyraOhZw/5TvuyhbRoA/6YubSNk= X-Received: by 2002:a17:906:a29a:: with SMTP id i26mr4116106ejz.45.1610605097907; Wed, 13 Jan 2021 22:18:17 -0800 (PST) MIME-Version: 1.0 References: <161058499000.1840162.702316708443239771.stgit@dwillia2-desk3.amr.corp.intel.com> <161058501210.1840162.8108917599181157327.stgit@dwillia2-desk3.amr.corp.intel.com> <20210114014944.GA16873@hori.linux.bs1.fc.nec.co.jp> In-Reply-To: <20210114014944.GA16873@hori.linux.bs1.fc.nec.co.jp> From: Dan Williams Date: Wed, 13 Jan 2021 22:18:09 -0800 Message-ID: Subject: Re: [PATCH v4 4/5] mm: Fix page reference leak in soft_offline_page() To: =?UTF-8?B?SE9SSUdVQ0hJIE5BT1lBKOWggOWPoyDnm7TkuZ8p?= Cc: "akpm@linux-foundation.org" , Naoya Horiguchi , Michal Hocko , David Hildenbrand , Oscar Salvador , "stable@vger.kernel.org" , "linux-mm@kvack.org" , "linux-nvdimm@lists.01.org" , "linux-kernel@vger.kernel.org" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed, Jan 13, 2021 at 5:50 PM HORIGUCHI NAOYA(=E5=A0=80=E5=8F=A3=E3=80=80= =E7=9B=B4=E4=B9=9F) wrote: > > On Wed, Jan 13, 2021 at 04:43:32PM -0800, Dan Williams wrote: > > The conversion to move pfn_to_online_page() internal to > > soft_offline_page() missed that the get_user_pages() reference taken by > > the madvise() path needs to be dropped when pfn_to_online_page() fails. > > Note the direct sysfs-path to soft_offline_page() does not perform a > > get_user_pages() lookup. > > > > When soft_offline_page() is handed a pfn_valid() && > > !pfn_to_online_page() pfn the kernel hangs at dax-device shutdown due t= o > > a leaked reference. > > > > Fixes: feec24a6139d ("mm, soft-offline: convert parameter to pfn") > > Cc: Andrew Morton > > Cc: Naoya Horiguchi > > Cc: Michal Hocko > > Reviewed-by: David Hildenbrand > > Reviewed-by: Oscar Salvador > > Cc: > > Signed-off-by: Dan Williams > > I'm OK if we don't have any other better approach, but the proposed chang= es > make code a little messy, and I feel that get_user_pages() might be the > right place to fix. Is get_user_pages() expected to return struct page wi= th > holding refcount for offline valid pages? I thought that such pages are > only used by drivers for dax-devices, but that might be wrong. Can I ask = for > a little more explanation from this perspective? The motivation for ZONE_DEVICE is to allow get_user_pages() for "offline" p= fns. soft_offline_page() wants to only operate on "online" pfns. get_user_pages() on a dax-mapping returns an "offline" ZONE_DEVICE page. When pfn_to_online_page() fails the get_user_pages() reference needs to be dropped. To be honest I dislike the entire flags based scheme for communicating the fact that page reference obtained by madvise needs to be dropped later. I'd rather pass a non-NULL 'struct page *' than redo pfn_to_page() conversions in the leaf functions, but that's a much larger change.