From: Jiaqi Yan <jiaqiyan@google.com>
Date: Fri, 19 May 2023 13:54:21 -0700
Subject: Re: [PATCH v1 2/3] hugetlbfs: improve read HWPOISON hugepage
To: Mike Kravetz
Cc: songmuchun@bytedance.com, naoya.horiguchi@nec.com, shy828301@gmail.com,
 linmiaohe@huawei.com, akpm@linux-foundation.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, duenwen@google.com, axelrasmussen@google.com,
 jthoughton@google.com
References: <20230517160948.811355-1-jiaqiyan@google.com>
 <20230517160948.811355-3-jiaqiyan@google.com>
 <20230518221808.GC4029@monkey>
In-Reply-To: <20230518221808.GC4029@monkey>

On Thu, May 18, 2023 at 3:18 PM Mike Kravetz wrote:
>
> On 05/17/23 16:09, Jiaqi Yan wrote:
> > When a hugepage contains HWPOISON pages, read() fails to read any byte
> > of the hugepage and returns -EIO, although many bytes in the HWPOISON
> > hugepage are readable.
> >
> > Improve this by allowing hugetlbfs_read_iter to return as many bytes as
> > possible. For a requested range [offset, offset + len) that contains a
> > HWPOISON page, return [offset, first HWPOISON page addr); the next read
> > attempt will fail and return -EIO.
> >
> > Signed-off-by: Jiaqi Yan
> > ---
> >  fs/hugetlbfs/inode.c | 62 +++++++++++++++++++++++++++++++++++++++-----
> >  1 file changed, 56 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> > index ecfdfb2529a3..1baa08ec679f 100644
> > --- a/fs/hugetlbfs/inode.c
> > +++ b/fs/hugetlbfs/inode.c
> > @@ -282,6 +282,46 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
> >  }
> >  #endif
> >
> > +/*
> > + * Someone wants to read @bytes from a HWPOISON hugetlb @page from @offset.
> > + * Returns the maximum number of bytes one can read without touching the 1st raw
> > + * HWPOISON subpage.
> > + *
> > + * The implementation borrows the iteration logic from copy_page_to_iter*.
> > + */
> > +static size_t adjust_range_hwpoison(struct page *page, size_t offset, size_t bytes)
> > +{
> > +        size_t n = 0;
> > +        size_t res = 0;
> > +        struct folio *folio = page_folio(page);
> > +
> > +        folio_lock(folio);
>
> What is the reason for taking folio_lock?

I intended to serialize this routine (mainly find_raw_hwp_page()) with
folio_clear_hugetlb_hwpoison() and hwpoison_user_mappings() in
try_memory_failure_hugetlb(). They don't directly affect the
raw_hwp_list, though, so I can remove the lock in v2.

>
> > +
> > +        /* First subpage to start the loop. */
> > +        page += offset / PAGE_SIZE;
> > +        offset %= PAGE_SIZE;
> > +        while (1) {
> > +                if (find_raw_hwp_page(folio, page) != NULL)
> > +                        break;
> > +
> > +                /* Safe to read n bytes without touching HWPOISON subpage. */
> > +                n = min(bytes, (size_t)PAGE_SIZE - offset);
> > +                res += n;
> > +                bytes -= n;
> > +                if (!bytes || !n)
> > +                        break;
> > +                offset += n;
> > +                if (offset == PAGE_SIZE) {
> > +                        page++;
> > +                        offset = 0;
> > +                }
> > +        }
> > +
> > +        folio_unlock(folio);
> > +
> > +        return res;
> > +}
> > +
> >  /*
> >   * Support for read() - Find the page attached to f_mapping and copy out the
> >   * data.  This provides functionality similar to filemap_read().
> > @@ -300,7 +340,7 @@ static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to)
> >
> >          while (iov_iter_count(to)) {
> >                  struct page *page;
> > -                size_t nr, copied;
> > +                size_t nr, copied, want;
> >
> >                  /* nr is the maximum number of bytes to copy from this page */
> >                  nr = huge_page_size(h);
> > @@ -328,16 +368,26 @@ static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to)
> >                  } else {
> >                          unlock_page(page);
> >
> > -                        if (PageHWPoison(page)) {
> > -                                put_page(page);
> > -                                retval = -EIO;
> > -                                break;
> > +                        if (!PageHWPoison(page))
> > +                                want = nr;
> > +                        else {
> > +                                /*
> > +                                 * Adjust how many bytes safe to read without
> > +                                 * touching the 1st raw HWPOISON subpage after
> > +                                 * offset.
> > +                                 */
> > +                                want = adjust_range_hwpoison(page, offset, nr);
> > +                                if (want == 0) {
> > +                                        put_page(page);
> > +                                        retval = -EIO;
> > +                                        break;
> > +                                }
> >                          }
> >
> >                          /*
> >                           * We have the page, copy it to user space buffer.
> >                           */
> > -                        copied = copy_page_to_iter(page, offset, nr, to);
> > +                        copied = copy_page_to_iter(page, offset, want, to);
> >                          put_page(page);
> >                  }
> >                  offset += copied;
> > --
> > 2.40.1.606.ga4b1b128d6-goog
> >
>
> Code looks fine, just wondering about that folio_lock.
> --
> Mike Kravetz
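
For anyone following the thread, the range adjustment discussed above can be
modeled in userspace. The following is only a minimal sketch, not the kernel
code: the poisoned[] array is an assumed stand-in for what
find_raw_hwp_page() reports per subpage, SUBPAGE_SIZE stands in for
PAGE_SIZE, and readable_before_poison() is a hypothetical name. Unlike the
patch, it also bounds the walk by nr_subpages rather than relying on the
caller to cap the byte count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SUBPAGE_SIZE 4096u	/* assumed stand-in for PAGE_SIZE */

/*
 * Userspace model of the range adjustment (hypothetical helper).
 * poisoned[i] models "find_raw_hwp_page(folio, subpage i) != NULL".
 * Returns how many of @bytes, starting at @offset into the hugepage,
 * can be read before reaching the first poisoned subpage.
 */
static size_t readable_before_poison(const bool *poisoned, size_t nr_subpages,
				     size_t offset, size_t bytes)
{
	size_t idx = offset / SUBPAGE_SIZE;	/* first subpage of the range */
	size_t res = 0;

	assert(offset < nr_subpages * SUBPAGE_SIZE);
	offset %= SUBPAGE_SIZE;			/* offset within that subpage */

	while (bytes && idx < nr_subpages && !poisoned[idx]) {
		/* Bytes left in this subpage, capped by the request. */
		size_t n = SUBPAGE_SIZE - offset;

		if (n > bytes)
			n = bytes;
		res += n;
		bytes -= n;
		offset = 0;			/* later subpages start at 0 */
		idx++;
	}
	return res;
}
```

With a four-subpage model in which subpage 2 is poisoned, a 16384-byte
request at offset 0 yields 8192, and a request starting at offset 8192
yields 0, which corresponds to the case where hugetlbfs_read_iter returns
-EIO.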