From: Jann Horn
Date: Wed, 26 Jan 2022 12:48:29 +0100
Subject: Re: [v2 PATCH] fs/proc: task_mmu.c: don't read mapcount for migration entry
To: David Hildenbrand
Cc: Yang Shi, kirill.shutemov@linux.intel.com, willy@infradead.org, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org
References: <20220120202805.3369-1-shy828301@gmail.com>

On Wed, Jan 26, 2022 at 12:38 PM David Hildenbrand wrote:
> On 26.01.22 12:29, Jann Horn wrote:
> > On Wed, Jan 26, 2022 at 11:51 AM David Hildenbrand wrote:
> >> On 20.01.22 21:28, Yang Shi wrote:
> >>> The syzbot reported the below BUG:
> >>>
> >>> kernel BUG at include/linux/page-flags.h:785!
[...]
> >>> RIP: 0010:PageDoubleMap include/linux/page-flags.h:785 [inline]
> >>> RIP: 0010:__page_mapcount+0x2d2/0x350 mm/util.c:744
[...]
> >> Does this point at the bigger issue that reading the mapcount without
> >> having the page locked is completely unstable?
> >
> > (See also https://lore.kernel.org/all/CAG48ez0M=iwJu=Q8yUQHD-+eZDg6ZF8QCF86Sb=CN1petP=Y0Q@mail.gmail.com/
> > for context.)
>
> Thanks for the pointer.
>
> >
> > I'm not sure what you mean by "unstable". Do you mean "the result is
> > not guaranteed to still be valid when the call returns", "the result
> > might not have ever been valid", or "the call might crash because the
> > page's state as a compound page is unstable"?
>
> A little bit of everything :)

[...]

> > In case you mean "the result might not have ever been valid":
> > Yes, even with this patch applied, in theory concurrent THP splits
> > could cause us to count some page mappings twice. Arguably that's not
> > entirely correct.
>
> Yes, the snapshot is not atomic and, thereby, unreliable. That what I
> mostly meant as "unstable".
>
> >
> > In case you mean "the call might crash because the page's state as a
> > compound page could concurrently change":
>
> I think that's just a side-product of the snapshot not being "correct",
> right?

I guess you could see it that way? The way I look at it is that
page_mapcount() is designed to return a number that's at least as high
as the number of mappings (rarely higher due to races), and using
page_mapcount() on an unlocked page is legitimate if you're fine with
the rare double-counting of references.

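To make the "at least as high, rarely higher" point concrete, here is a
rough userspace sketch. It is a toy model, not kernel code: struct
toy_page and every toy_* function are made up for illustration, and it
assumes only that a split moves one mapping from a compound counter to a
per-subpage counter in two separate steps, with an unlocked reader
summing both counters in between.

```c
#include <assert.h>

/* Toy stand-in for one THP subpage; NOT the kernel's struct page. */
struct toy_page {
	int compound_mapcount;	/* mappings of the whole huge page */
	int sub_mapcount;	/* mappings of this subpage alone */
};

/* Unlocked snapshot: the sum of two counters that are read separately. */
static int toy_page_mapcount(const struct toy_page *p)
{
	return p->sub_mapcount + p->compound_mapcount;
}

/* Split, step 1: the huge mapping is re-accounted as a subpage mapping. */
static void toy_split_step1(struct toy_page *p)
{
	p->sub_mapcount++;
}

/* Split, step 2: the huge mapping itself is dropped. */
static void toy_split_step2(struct toy_page *p)
{
	assert(p->compound_mapcount > 0);
	p->compound_mapcount--;
}

/* Returns what a reader sees if it samples between the two steps. */
static int toy_racy_snapshot(struct toy_page *p)
{
	int seen;

	toy_split_step1(p);
	seen = toy_page_mapcount(p);	/* the one mapping is counted twice */
	toy_split_step2(p);
	return seen;
}
```

In this model the sampled value is never lower than the real number of
mappings; it is only ever too high, and only inside the window between
the two steps.
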
In my view, the problem here is: There are different types of
references to "struct page" - some of them allow you to call
page_mapcount(), some don't. And in particular, get_page() doesn't
give you a reference that can be used with page_mapcount(), but
locking a (real, non-migration) PTE pointing to the page does give you
such a reference. This concept of "different types of references" is
the same one you get with e.g. mmgrab() vs mmget() - they both give
references to the same object, but those references have different
usage restrictions.
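
The mmgrab() vs mmget() analogy can be sketched in userspace roughly as
follows. This is a simplified model under stated assumptions: struct
toy_mm and the toy_* helpers are hypothetical, only the field names
mm_users and mm_count are borrowed from the kernel, and the two "alive"
flags stand in for the real teardown of the mappings and the freeing of
the struct.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for mm_struct; only the counter names match the kernel. */
struct toy_mm {
	int mm_users;		/* refs that keep the address space usable */
	int mm_count;		/* refs that only keep the struct allocated */
	bool mappings_alive;	/* stands in for "mappings not torn down" */
	bool struct_alive;	/* stands in for "struct not freed" */
};

static void toy_mm_init(struct toy_mm *mm)
{
	mm->mm_users = 1;
	mm->mm_count = 1;	/* all users together hold one struct ref */
	mm->mappings_alive = true;
	mm->struct_alive = true;
}

/* Weak reference: pins the struct, says nothing about the mappings. */
static void toy_mmgrab(struct toy_mm *mm)
{
	assert(mm->struct_alive);
	mm->mm_count++;
}

static void toy_mmdrop(struct toy_mm *mm)
{
	if (--mm->mm_count == 0)
		mm->struct_alive = false;
}

/* Strong reference: the mappings stay usable while it is held. */
static void toy_mmget(struct toy_mm *mm)
{
	assert(mm->mappings_alive);
	mm->mm_users++;
}

static void toy_mmput(struct toy_mm *mm)
{
	if (--mm->mm_users == 0) {
		mm->mappings_alive = false;	/* last user: tear down */
		toy_mmdrop(mm);			/* drop the users' struct ref */
	}
}

/* Only a strong reference makes it legitimate to look at the mappings. */
static bool toy_may_touch_mappings(const struct toy_mm *mm)
{
	return mm->struct_alive && mm->mappings_alive;
}
```

A caller that holds only the toy_mmgrab() reference can still
dereference the struct after the last toy_mmput(), but
toy_may_touch_mappings() is false by then. That is the same shape as a
get_page() reference that keeps the struct page around without making
page_mapcount() safe to call.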