Date: Tue, 11 Jan 2022 13:59:47 -0800
From: Minchan Kim
To: John Hubbard
Cc: Yu Zhao, Mauricio Faria de Oliveira, Andrew Morton, linux-mm@kvack.org, linux-block@vger.kernel.org, Huang Ying, Miaohe Lin, Yang Shi
Subject: Re: [PATCH v2] mm: fix race between MADV_FREE reclaim and blkdev direct IO read
References: <20220105233440.63361-1-mfo@canonical.com>

On Tue, Jan 11, 2022 at 12:21:59PM -0800, Minchan Kim wrote:
> On Tue, Jan 11, 2022 at 12:20:13PM -0800, Minchan Kim wrote:
> < snip >
> > > > slow path with __gup_longterm_unlocked and set_dirty_pages
> > > > for them).
> > > >
> > > > This approach would solve other cases where map userspace
> > > > pages into kernel space and then write. Since the write
> > > > didn't go through with the process's page table, we will
> > > > lose the dirty bit in the page table of the process and
> > > > it turns out same problem. That's why I'd like to approach
> > > > this.
> > > >
> > > > If it doesn't work, the other option to fix this specific
> > > > case is can't we make pages dirty in advance in DIO read-case?
> > > >
> > > > When I look at DIO code, it's already doing in async case.
> > > > Could't we do the same thing for the other cases?
> > > > I guess the worst case we will see would be more page
> > > > writeback since the page becomes dirty unnecessary.
> > >
> > > Marking pages dirty after pinning them is a pre-existing area of
> > > problems. See the long-running LWN articles about get_user_pages() [1].
> >
> > Oh, Do you mean marking page dirty in DIO path is already problems?
>
> ^ marking page dirty too late in DIO path
>
> Typo fix.

I looked through the articles but couldn't connect the dots between
those issues and this MADV_FREE issue. However, the man page gives a
clue as to why it's fine.

```
O_DIRECT I/Os should never be run concurrently with the fork(2) system call,
if the memory buffer is a private mapping (i.e., any mapping created with
the mmap(2) MAP_PRIVATE flag; this includes memory allocated on the heap and
statically allocated buffers). Any such I/Os, whether submitted via an
asynchronous I/O interface or from another thread in the process, should be
completed before fork(2) is called. Failure to do so can result in data
corruption and undefined behavior in parent and child processes.
```

I think that would make the page_dup_rmap() in copy_present_pte() safe.
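
For reference, the ordering the man page asks for could look roughly like
the userspace sketch below. This is only an illustration, not code from the
patch: the helper name dio_read_then_fork() and BUF_SIZE are made up, the
4096-byte alignment is an assumption about what the device needs, and the
buffered fallback is there only so the sketch still runs on filesystems
that reject O_DIRECT.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* exposes O_DIRECT in <fcntl.h> on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0		/* not available: degrade to buffered I/O */
#endif

#define BUF_SIZE 4096		/* assumed to satisfy O_DIRECT alignment */

/*
 * Hypothetical helper: read the start of @path with O_DIRECT into a
 * private heap buffer and call fork() only after the read has completed,
 * so no direct I/O targets the private mapping while it is duplicated.
 * Returns 0 on success, -1 on failure.
 * Example: assert(dio_read_then_fork("/proc/self/exe") == 0);
 */
static int dio_read_then_fork(const char *path)
{
	void *buf = NULL;
	ssize_t n;
	pid_t pid;
	int fd, status;

	/* Heap memory is a private mapping, the case the man page warns about. */
	if (posix_memalign(&buf, BUF_SIZE, BUF_SIZE))
		return -1;

	fd = open(path, O_RDONLY | O_DIRECT);
	if (fd < 0 && errno == EINVAL)
		fd = open(path, O_RDONLY);	/* fs rejects O_DIRECT */
	if (fd < 0)
		goto err;

	/* Synchronous read: the I/O is finished when pread() returns. */
	n = pread(fd, buf, BUF_SIZE, 0);
	close(fd);
	if (n < 0)
		goto err;

	/* Safe point: nothing is in flight against buf any more. */
	pid = fork();
	if (pid < 0)
		goto err;
	if (pid == 0)
		_exit(0);			/* child has nothing to do */
	if (waitpid(pid, &status, 0) < 0)
		goto err;

	free(buf);
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
err:
	free(buf);
	return -1;
}
```

With async submission (or I/O issued from another thread) the same rule
would mean waiting for every completion before reaching the fork() call.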