From: Chris Li
Date: Tue, 12 Dec 2023 18:22:07 -0800
Subject: Re: [PATCH 18/24] mm/swap: introduce a helper non fault swapin
To: Kairui Song
Cc: linux-mm@kvack.org, Andrew Morton, "Huang, Ying", David Hildenbrand, Hugh Dickins, Johannes Weiner, Matthew Wilcox, Michal Hocko, linux-kernel@vger.kernel.org
References: <20231119194740.94101-1-ryncsn@gmail.com> <20231119194740.94101-19-ryncsn@gmail.com>

On Tue, Nov 28, 2023 at 3:22 AM Kairui Song wrote:
>
> > >         /*
> > >          * Make sure huge_gfp is always more limited than limit_gfp.
> > >          * Some of the flags set permissions, while others set limitations.
> > > @@ -1854,9 +1838,12 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
> > >  {
> > >         struct address_space *mapping = inode->i_mapping;
> > >         struct shmem_inode_info *info = SHMEM_I(inode);
> > > -       struct swap_info_struct *si;
> > > +       enum swap_cache_result result;
> > >         struct folio *folio = NULL;
> > > +       struct mempolicy *mpol;
> > > +       struct page *page;
> > >         swp_entry_t swap;
> > > +       pgoff_t ilx;
> > >         int error;
> > >
> > >         VM_BUG_ON(!*foliop || !xa_is_value(*foliop));
> > > @@ -1866,34 +1853,30 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
> > >         if (is_poisoned_swp_entry(swap))
> > >                 return -EIO;
> > >
> > > -       si = get_swap_device(swap);
> > > -       if (!si) {
> > > +       mpol = shmem_get_pgoff_policy(info, index, 0, &ilx);
> > > +       page = swapin_page_non_fault(swap, gfp, mpol, ilx, fault_mm, &result);
>
> Hi Chris,
>
> I've been trying to address these issues in V2, most issue in other
> patches have a straight solution, some could be discuss in seperate
> series, but I come up with some thoughts here:
>
> >
> > Notice this "result" CAN be outdated. e.g. after this call, the swap
> > cache can be changed by another thread generating the swap page fault
> > and installing the folio into the swap cache or removing it.
>
> This is true, and it seems a potential race also exist before this
> series for direct (no swapcache) swapin path (do_swap_page) if I
> understand it correctly:

I just noticed that I missed this email while cleaning up my email
archive. Sorry for the late reply; traveling did not help either.

I am not aware of any swap-in race bugs in the existing code. Races,
yes. If you find a code path where a race actually causes a bug,
please report it.

>
> In do_swap_page path, multiple process could swapin the page at the
> same time (a mapped once page can still be shared by sub threads),
> they could get different folios. The later pte lock and pte_same check
> is not enough, because while one process is not holding the pte lock,
> another process could read-in, swap_free the entry, then swap-out the
> page again, using same entry, an ABA problem. The race is not likely
> to happen in reality but in theory possible.

Have you taken into account that while the page is locked, it cannot
change in the swap cache? I think the swap cache find-and-get function
returns the page locked. The swap cache then cannot change the mapping
as long as the page stays locked.

>
> Same issue for shmem here, there are
> shmem_confirm_swap/shmem_add_to_page_cache check later to prevent
> re-installing into shmem mapping for direct swap in, but also not
> enough. Other process could read-in and re-swapout using same entry so
> the mapping entry seems unchanged during the time window. Still very
> unlikely to happen in reality, but not impossible.

Please take another look with the page lock in mind. Report back if
you still think there is a race bug in the existing code. We can then
take a closer look at the concurrent call stacks that trigger it.

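For reference, the ABA pattern under discussion can be shown in a
small user-space sketch. Everything in it (struct slot, struct
snapshot, the generation field and all function names) is hypothetical
and made up for illustration; none of it is kernel code. The
generation counter is only there to make the reuse visible and is not
a claim about how the kernel detects or prevents the race.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one swap slot; not a kernel structure. */
struct slot {
	uint64_t entry;      /* reusable identifier, plays the role of the swap entry */
	uint64_t generation; /* bumped on every free; illustrative addition only */
	int data;            /* contents currently stored under this entry */
};

/* What a reader remembers before it gives up its lock. */
struct snapshot {
	uint64_t entry;
	uint64_t generation;
	int data;
};

static struct snapshot reader_lookup(const struct slot *s)
{
	struct snapshot snap = { s->entry, s->generation, s->data };
	return snap;
}

/*
 * Another task reads the page in, frees the entry, then swaps new
 * contents out again and happens to get the same entry back.
 */
static void free_and_reuse(struct slot *s, int new_data)
{
	s->generation++;    /* the entry was freed once */
	s->data = new_data; /* the same entry now names different contents */
}

/* Value-only recheck: compares nothing but the entry itself. */
static bool recheck_by_value(const struct slot *s, const struct snapshot *snap)
{
	return s->entry == snap->entry;
}

/* Recheck that also notices that the entry was freed in between. */
static bool recheck_with_generation(const struct slot *s,
				    const struct snapshot *snap)
{
	return s->entry == snap->entry && s->generation == snap->generation;
}

/*
 * Returns true when the value-only recheck is fooled (it passes even
 * though the contents changed) while the generation recheck is not.
 */
static bool aba_scenario(void)
{
	struct slot s = { .entry = 42, .generation = 0, .data = 1 };
	struct snapshot snap = reader_lookup(&s);

	free_and_reuse(&s, 2);

	return recheck_by_value(&s, &snap) &&
	       !recheck_with_generation(&s, &snap) &&
	       snap.data != s.data;
}
```

Calling aba_scenario() from a main() exercises the whole sequence. The
open question in this thread is whether the page lock already closes
that window in the real code paths, which this sketch does not model.
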
Chris

>
> When swapcache is used there is no such issue, since swap lock and
> swap_map are used to sync all readers, and while one reader is still
> holding the folio, the entry is locked through swapcache, or if a
> folio is removed from swapcache, folio_test_swapcache will fail, and
> the reader could retry.
>
> I'm trying to come up with a better locking for direct swap in, am I
> missing anything here? Correct me if I get it wrong...
>