From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5F5AAC433EF for ; Fri, 28 Jan 2022 16:53:20 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id BFDB06B00C2; Fri, 28 Jan 2022 11:53:19 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id BAD896B00C5; Fri, 28 Jan 2022 11:53:19 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A75306B00C6; Fri, 28 Jan 2022 11:53:19 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0203.hostedemail.com [216.40.44.203]) by kanga.kvack.org (Postfix) with ESMTP id 981366B00C2 for ; Fri, 28 Jan 2022 11:53:19 -0500 (EST) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 4DCE7181E7268 for ; Fri, 28 Jan 2022 16:53:19 +0000 (UTC) X-FDA: 79080291318.07.03E310F Received: from mail-ej1-f53.google.com (mail-ej1-f53.google.com [209.85.218.53]) by imf18.hostedemail.com (Postfix) with ESMTP id D73461C0003 for ; Fri, 28 Jan 2022 16:53:18 +0000 (UTC) Received: by mail-ej1-f53.google.com with SMTP id s13so18023098ejy.3 for ; Fri, 28 Jan 2022 08:53:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=BLJFT4AhBocD6kHHT+suS0eJkn3pPtgeNyCXLqLIpak=; b=eXWnQezlzjgM85qsZszg2oBXIDeFvSRt24xoZNoEeaT02DL98kW6vYQmfrS77uf+ck 73j/FMVrXvBV0t9/NOXyQild2tI4WGX8LDLVemr0lyO49SRXaerhke2s3D69BJNjvCE6 EkeqgY43aO97+R6xHwHtbrc9e11d3/+UMkt6snjEgmt0taXrCSuA2GuPd4pHy1hNB7I+ JTGxPnrpmZS0GWmVAx+zeXJM22AXimC3Ot09+82+1KRP1Boz6rvwfZKcSeUJ3JefUBHG ODRGdKMPEWfT57hTNdkNbbhNDpIZ4q6j3j//lxvpy07jrisIO7uQj0TNdx6y12/EJM16 S38A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=BLJFT4AhBocD6kHHT+suS0eJkn3pPtgeNyCXLqLIpak=; b=zY4vIOaSORd10NMnl7l+ovF6GO7KV2PnviWGmrAWEvN4dJXFbH2k+mwJwXqW6foqgs jR9t7yQBlztF0IKISJgORMLvVwmbH3J62eykQgITyhoG1bn2SboIR5xZ5AMvFV7ozH7a oc3HEPJ/FDB9SuY6F8PZc3VodBXs6mxSoepfP0+fR5NDAYL4walAOqJafUzwMYseoqk+ R+/PiqIi/3YaAh1YgA6o36P5etpmG8e+PME858Kq7SUZGT8pV4aKqRpmQQi07xhXI1Ul kMBfYsng+BNlZ78Nv3jL4Bewc2yd5OxGhcsBO2HAOhOQijA4e6LYoDvEoEt07rfRzXMX +H6w== X-Gm-Message-State: AOAM531LydixqgsWxT78t7xuRc3TidwOJhh1KNHtbTY6Ja1o5Ri7w2Pu 92kXBZr2wCz2wgqvaxUivCYcH+7dogvr4U9MzX8= X-Google-Smtp-Source: ABdhPJyQPlCRIBkgo1TAzlQctDBPEi8rIwNs5vVeYsSwaXGWzQ4dqp3N/HFI29N2DS1PrkdvwWQNnykEMmK1wRcsKK4= X-Received: by 2002:a17:907:3d94:: with SMTP id he20mr7753163ejc.637.1643388797473; Fri, 28 Jan 2022 08:53:17 -0800 (PST) MIME-Version: 1.0 References: <20220120202805.3369-1-shy828301@gmail.com> <5b4e2c29-8f1a-5a68-d243-a30467cc02d4@redhat.com> <5a565d5a-0540-4041-ce63-a8fd5d1bb340@redhat.com> <2a1c5bd2-cb8c-b93b-68af-de620438d19a@redhat.com> <8c0430e5-fecb-3eda-3d40-e94caa8cbd78@redhat.com> In-Reply-To: <8c0430e5-fecb-3eda-3d40-e94caa8cbd78@redhat.com> From: Yang Shi Date: Fri, 28 Jan 2022 08:53:05 -0800 Message-ID: Subject: Re: [v2 PATCH] fs/proc: task_mmu.c: don't read mapcount for migration entry To: David Hildenbrand Cc: Jann Horn , "Kirill A. Shutemov" , Matthew Wilcox , Andrew Morton , Linux MM , Linux Kernel Mailing List , stable Content-Type: text/plain; charset="UTF-8" X-Rspam-User: nil X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: D73461C0003 X-Stat-Signature: og1xwnzpb69h497jwytrhzigysr47wdf Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=gmail.com header.s=20210112 header.b=eXWnQezl; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf18.hostedemail.com: domain of shy828301@gmail.com designates 209.85.218.53 as permitted sender) smtp.mailfrom=shy828301@gmail.com X-HE-Tag: 1643388798-403793 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri, Jan 28, 2022 at 12:02 AM David Hildenbrand wrote: > > On 27.01.22 22:16, Yang Shi wrote: > > On Wed, Jan 26, 2022 at 10:54 AM David Hildenbrand wrote: > >> > >>>>> Just page lock or elevated page refcount could serialize against THP > >>>>> split AFAIK. > >>>>> > >>>>>> > >>>>>> But yeah, using the mapcount of a page that is not even mapped > >>>>>> (migration entry) is clearly wrong. > >>>>>> > >>>>>> To summarize: reading the mapcount on an unlocked page will easily > >>>>>> return a wrong result and the result should not be relied upon. reading > >>>>>> the mapcount of a migration entry is dangerous and certainly wrong. > >>>>> > >>>>> Depends on your usecase. Some just want to get a snapshot, just like > >>>>> smaps, they don't care. > >>>> > >>>> Right, but as discussed, even the snapshot might be slightly wrong. That > >>>> might be just fine for smaps (and I would have enjoyed a comment in the > >>>> code stating that :) ). > >>> > >>> I think that is documented already, see Documentation/filesystems/proc.rst: > >>> > >>> Note: reading /proc/PID/maps or /proc/PID/smaps is inherently racy (consistent > >>> output can be achieved only in the single read call). > >> > >> Right, but I think there is a difference between > >> > >> * Atomic values that change immediately afterwards ("this value used to > >> be true at one point in time") > >> * Values that are unstable because we cannot read them atomically ("this > >> value never used to be true") > >> > >> I'd assume with the documented race we actually talk about the first > >> point, but I might be just wrong. > >> > >>> > >>> Of course, if the extra note is preferred in the code, I could try to > >>> add some in a separate patch. > >> > >> When staring at the (original) code I would have hoped to find something > >> like: > >> > >> /* > >> * We use page_mapcount() to get a snapshot of the mapcount. Without > >> * holding the page lock this snapshot can be slightly wrong as we > >> * cannot always read the mapcount atomically. As long we hold the PT > >> * lock, the page cannot get unmapped and it's at safe to call > >> * page_mapcount(). > >> */ > >> > >> With the addition of > >> > >> "... For unmapped pages (e.g., migration entries) we cannot guarantee > >> that, so treat the mapcount as being 1." > > > > It seems a little bit confusing to me, it is not safe to call with PTL > > held either, right? I'd like to rephrase the note to: > > The implication that could have been spelled out is that only a mapped > page can get unmapped. (I know, there are some weird migration entries > nowadays ...) Yes, I see your point. Just felt "only a mapped page can get unmapped" sounds not that straightforward (just my personal feeling). How's about "It is not safe to call page_mapcount() even with PTL held if the page is not mapped, especially for migration entries". > > /* > * We use page_mapcount() to get a snapshot of the mapcount. Without > * holding the page lock this snapshot can be slightly wrong as we > * cannot always read the mapcount atomically. As long we hold the PT > * lock, a mapped page cannot get unmapped and it's at safe to call > * page_mapcount(). Especially for migration entries, it's not safe to > * call page_mapcount(), so we treat the mapcount as being 1. > */ > > > > -- > Thanks, > > David / dhildenb >