Date: Tue, 15 Mar 2022 13:34:02 -0700 (PDT)
From: Hugh Dickins
To: David Hildenbrand
cc: Andrew Morton, Andrew Yang, Matthias Brugger, Matthew Wilcox,
    Vlastimil Babka, David Howells, William Kucharski, Yang Shi,
    Marc Zyngier, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-arm-kernel@lists.infradead.org,
    linux-mediatek@lists.infradead.org, wsd_upstream@mediatek.com,
    Nicholas Tang, Kuan-Ying Lee
Subject: Re: [PATCH] mm/migrate: fix race between lock page and clear PG_Isolated
In-Reply-To: <4cb789a5-c49c-f095-1f7e-67be65ba508a@redhat.com>
Message-ID: <883877a-30b0-96e0-48a6-7cfc3c59de93@google.com>
References: <20220315030515.20263-1-andrew.yang@mediatek.com>
    <20220314212127.a2797926ee0ef8a7ad05dcaa@linux-foundation.org>
    <4cb789a5-c49c-f095-1f7e-67be65ba508a@redhat.com>

On Tue, 15 Mar 2022, David Hildenbrand wrote:
> On 15.03.22 05:21, Andrew Morton wrote:
> > On Tue, 15 Mar 2022 11:05:15 +0800 Andrew Yang wrote:
> >
> >> When memory is tight, system may start to compact memory for large
> >> continuous memory demands.
> >> If one process tries to lock a memory page
> >> that is being locked and isolated for compaction, it may wait a long time
> >> or even forever. This is because compaction will perform non-atomic
> >> PG_Isolated clear while holding page lock, this may overwrite PG_waiters
> >> set by the process that can't obtain the page lock and add itself to the
> >> waiting queue to wait for the lock to be unlocked.
> >>
> >> CPU1                            CPU2
> >> lock_page(page); (successful)
> >>                                 lock_page(); (failed)
> >> __ClearPageIsolated(page);      SetPageWaiters(page) (may be overwritten)
> >> unlock_page(page);
> >>
> >> The solution is to not perform non-atomic operation on page flags while
> >> holding page lock.
> >
> > Sure, the non-atomic bitop optimization is really risky and I suspect
> > we reach for it too often. Or at least without really clearly
> > demonstrating that it is safe, and documenting our assumptions.
>
> I agree. IIRC, non-atomic variants are mostly only safe while the
> refcount is 0. Everything else is just absolutely fragile.

It is normal and correct to use __SetPageFlag(page) on a page just
allocated from the buddy, and not yet logically visible to others:
that has refcount 1.

Of course, it might have refcount 2 or more, through being speculatively
visible to get_page_unless_zero() users: perhaps through earlier usage
of the same struct page, or by physical scan of memmap.

Those few such others - compaction's isolate_migratepages_block() is
the one I know best - must be very careful in their sequence of
operations. Preliminary read-only checks are usually okay (but some
VM_BUG_ON_PGFLAGS are increasingly problematic: I've had to turn off
that CONFIG), then get_page_unless_zero(), then read-only check that
the page is of the manageable kind (PageLRU in my world), and only then
can it be safe to lock the page - which of course touches page flags,
and so would be problematic for a racing user's __SetPageFlag(page).

But PageMovable and PageIsolated are beyond my ken: I can't say there.

Hugh
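
For readers following the thread, the lost update in the quoted CPU1/CPU2 diagram can be replayed in plain userspace C. This is only a sketch under stated assumptions: the DEMO_PG_* bit positions and both function names are hypothetical, not the kernel's layout or API, and the two CPUs are interleaved by hand in a single thread so that the outcome is deterministic. The non-atomic clear is modelled as a separate load and store, which is roughly what a non-atomic bitop may compile to.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit positions for this sketch; not the kernel's real layout. */
#define DEMO_PG_LOCKED   (1UL << 0)
#define DEMO_PG_WAITERS  (1UL << 1)
#define DEMO_PG_ISOLATED (1UL << 2)

/*
 * Replays the quoted interleaving with a non-atomic clear on "CPU1".
 * Returns true if the waiters bit set by "CPU2" is still set afterwards.
 */
static bool waiters_survives_nonatomic_clear(void)
{
    atomic_ulong flags;
    atomic_init(&flags, DEMO_PG_LOCKED | DEMO_PG_ISOLATED);

    /* CPU1: first half of the non-atomic clear, a plain load. */
    unsigned long snapshot = atomic_load(&flags);

    /* CPU2: fails to take the lock and atomically sets the waiters bit. */
    atomic_fetch_or(&flags, DEMO_PG_WAITERS);

    /* CPU1: second half, storing back a value derived from the stale load. */
    atomic_store(&flags, snapshot & ~DEMO_PG_ISOLATED);

    return (atomic_load(&flags) & DEMO_PG_WAITERS) != 0;
}

/*
 * Same scenario, but the clear is one atomic read-modify-write, so CPU2's
 * update lands either wholly before or wholly after it (before, here).
 */
static bool waiters_survives_atomic_clear(void)
{
    atomic_ulong flags;
    atomic_init(&flags, DEMO_PG_LOCKED | DEMO_PG_ISOLATED);

    /* CPU2: sets the waiters bit. */
    atomic_fetch_or(&flags, DEMO_PG_WAITERS);

    /* CPU1: atomic clear touches only the isolated bit. */
    atomic_fetch_and(&flags, ~DEMO_PG_ISOLATED);

    return (atomic_load(&flags) & DEMO_PG_WAITERS) != 0;
}
```

Calling the first function reports that the waiters bit was lost, which in the scenario from the changelog means the unlocker sees no waiter to wake; the second keeps the bit. The sketch says nothing about which fix is appropriate for PageIsolated in the kernel itself.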
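
The ordering described in the reply above (read-only pre-check, get_page_unless_zero(), read-only re-check, and only then a write to the flags word by locking) can likewise be modelled in userspace. Everything here is an assumption made for illustration: struct demo_page, the SEQ_PG_* bits and the demo_* helpers are hypothetical stand-ins written with C11 atomics, and a trylock is used in place of a sleeping lock_page(). It is not a copy of isolate_migratepages_block().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct page. */
struct demo_page {
    atomic_int   refcount;
    atomic_ulong flags;
};

/* Hypothetical bit positions for this sketch. */
#define SEQ_PG_LOCKED (1UL << 0)
#define SEQ_PG_LRU    (1UL << 3)

/* Model of get_page_unless_zero(): take a reference only if one exists. */
static bool demo_get_page_unless_zero(struct demo_page *page)
{
    int old = atomic_load(&page->refcount);

    while (old != 0) {
        /* On failure, old is refreshed with the current count. */
        if (atomic_compare_exchange_weak(&page->refcount, &old, old + 1))
            return true;
    }
    return false;
}

static void demo_put_page(struct demo_page *page)
{
    atomic_fetch_sub(&page->refcount, 1);
}

/* Atomically set the locked bit; true if this caller set it. */
static bool demo_trylock_page(struct demo_page *page)
{
    return !(atomic_fetch_or(&page->flags, SEQ_PG_LOCKED) & SEQ_PG_LOCKED);
}

/* Returns true with the page pinned and locked, false with it untouched. */
static bool demo_isolate_candidate(struct demo_page *page)
{
    /* 1. Preliminary read-only check: nothing is written to the page. */
    if (!(atomic_load(&page->flags) & SEQ_PG_LRU))
        return false;

    /* 2. Pin the page; a zero refcount means it is free or being freed. */
    if (!demo_get_page_unless_zero(page))
        return false;

    /* 3. Re-check the page kind now that the pin keeps it from being freed. */
    if (!(atomic_load(&page->flags) & SEQ_PG_LRU)) {
        demo_put_page(page);
        return false;
    }

    /* 4. Only now write to the flags word, by taking the page lock. */
    if (!demo_trylock_page(page)) {
        demo_put_page(page);
        return false;
    }
    return true;
}
```

The point of the ordering is that steps 1 to 3 never write to the flags word, so a page whose owner may still be using non-atomic flag updates is filtered out before step 4 performs the first write.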