From: Mike Kravetz <mike.kravetz@oracle.com>
To: Jiaqi Yan <jiaqiyan@google.com>
Cc: "Naoya Horiguchi" <naoya.horiguchi@linux.dev>,
"HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>,
"songmuchun@bytedance.com" <songmuchun@bytedance.com>,
"shy828301@gmail.com" <shy828301@gmail.com>,
"linmiaohe@huawei.com" <linmiaohe@huawei.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"duenwen@google.com" <duenwen@google.com>,
"axelrasmussen@google.com" <axelrasmussen@google.com>,
"jthoughton@google.com" <jthoughton@google.com>
Subject: Re: [PATCH v1 1/3] mm/hwpoison: find subpage in hugetlb HWPOISON list
Date: Sat, 17 Jun 2023 15:59:27 -0700 [thread overview]
Message-ID: <20230617225927.GA3540@monkey> (raw)
In-Reply-To: <CACw3F52iG5bqQbvZ9QkkRkVfy+NbSOu9hnkVOt5khukNNG73OQ@mail.gmail.com>
On 06/16/23 19:18, Jiaqi Yan wrote:
> On Fri, Jun 16, 2023 at 4:35 PM Mike Kravetz <mike.kravetz@oracle.com> wrote:
> > On 06/16/23 14:19, Jiaqi Yan wrote:
> > >
> > > Now looking again this, I think concurrent adding and deleting are
> > > fine with each other and with themselves, because raw_hwp_list is
> > > lock-less llist.
> >
> > Correct.
> >
> > > As for synchronizing traversal with adding and deleting, I wonder is
> > > it a good idea to make __update_and_free_hugetlb_folio hold
> > > hugetlb_lock before it folio_clear_hugetlb_hwpoison(which traverse +
> > > delete raw_hwp_list)? In hugetlb, get_huge_page_for_hwpoison already
> > > takes hugetlb_lock; it seems to me __update_and_free_hugetlb_folio is
> > > missing the lock.
> >
> > I do not think the lock is needed. However, while looking more closely
> > at this I think I discovered another issue.
> > This is VERY subtle.
> > Perhaps Naoya can help verify if my reasoning below is correct.
> >
> > In __update_and_free_hugetlb_folio we are not operating on a hugetlb page.
> > Why is this?
> > Before calling update_and_free_hugetlb_folio we call remove_hugetlb_folio.
> > The purpose of remove_hugetlb_folio is to remove the huge page from the
> > list AND compound page destructor indicating this is a hugetlb page is changed.
> > This is all done while holding the hugetlb lock. So, the test for
> > folio_test_hugetlb(folio) is false.
> >
> > We have technically a compound non-hugetlb page with a non-null raw_hwp_list.
> >
> > Important note: at this time we have not reallocated vmemmap pages if
> > hugetlb page was vmemmap optimized. That is done later in
> > __update_and_free_hugetlb_folio.
>
>
> >
> > The 'good news' is that after this point get_huge_page_for_hwpoison will
> > not recognize this as a hugetlb page, so nothing will be added to the
> > list. There is no need to worry about entries being added to the list
> > during traversal.
> >
> > The 'bad news' is that if we get a memory error at this time we will
> > treat it as a memory error on a regular compound page. So,
> > TestSetPageHWPoison(p) in memory_failure() may try to write a read only
> > struct page. :(
>
> At least I think this is an issue.
>
> Would it help if dissolve_free_huge_page doesn't unlock hugetlb_lock
> until update_and_free_hugetlb_folio is done, or basically until
> dissolve_free_huge_page is done?
Unfortunately, update_and_free_hugetlb_folio is designed to be called
without locks held. This is because we can not hold any locks while
allocating vmemmap pages.
I'll try to think of some way to restructure the code. IIUC, this is a
potential general issue, not just isolated to memory error handling.
--
Mike Kravetz
>
> TestSetPageHWPoison in memory_failure is called after
> try_memory_failure_hugetlb, and folio_test_hugetlb is tested within
> __get_huge_page_for_hwpoison, which is wrapped by the hugetlb_lock. So
> by the time dissolve_free_huge_page returns, subpages already go
> through hugetlb_vmemmap_restore and __destroy_compound_gigantic_folio
> and become non-compound raw pages (folios). Now
> folio_test_hugetlb(p)=false will be correct for memory_failure, and it
> can recover p as a dissolved non-hugetlb page.
next prev parent reply other threads:[~2023-06-17 23:00 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-17 16:09 [PATCH v1 0/3] Improve hugetlbfs read on HWPOISON hugepages Jiaqi Yan
2023-05-17 16:09 ` [PATCH v1 1/3] mm/hwpoison: find subpage in hugetlb HWPOISON list Jiaqi Yan
2023-05-17 23:53 ` Mike Kravetz
2023-05-19 20:54 ` Jiaqi Yan
2023-05-19 22:42 ` Mike Kravetz
2023-05-22 4:50 ` HORIGUCHI NAOYA(堀口 直也)
2023-05-22 18:22 ` Jiaqi Yan
2023-05-23 2:43 ` HORIGUCHI NAOYA(堀口 直也)
2023-05-26 0:28 ` Jiaqi Yan
2023-06-10 5:48 ` Jiaqi Yan
2023-06-12 4:19 ` Naoya Horiguchi
2023-06-16 21:19 ` Jiaqi Yan
2023-06-16 23:34 ` Mike Kravetz
2023-06-17 2:18 ` Jiaqi Yan
2023-06-17 22:59 ` Mike Kravetz [this message]
2023-06-19 8:23 ` Naoya Horiguchi
2023-06-20 18:05 ` Mike Kravetz
2023-06-20 22:39 ` Mike Kravetz
2023-06-23 0:45 ` Jiaqi Yan
2023-06-23 4:19 ` Mike Kravetz
2023-06-23 16:40 ` Jiaqi Yan
2023-05-17 16:09 ` [PATCH v1 2/3] hugetlbfs: improve read HWPOISON hugepage Jiaqi Yan
2023-05-18 22:18 ` Mike Kravetz
2023-05-19 20:54 ` Jiaqi Yan
2023-05-17 16:09 ` [PATCH v1 3/3] selftests/mm: add tests for HWPOISON hugetlbfs read Jiaqi Yan
2023-05-23 7:35 ` kernel test robot
2023-05-17 23:30 ` [PATCH v1 0/3] Improve hugetlbfs read on HWPOISON hugepages Mike Kravetz
2023-05-18 16:02 ` Jiaqi Yan
2023-05-18 16:10 ` Jiaqi Yan
2023-05-18 22:24 ` Mike Kravetz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230617225927.GA3540@monkey \
--to=mike.kravetz@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=axelrasmussen@google.com \
--cc=duenwen@google.com \
--cc=jiaqiyan@google.com \
--cc=jthoughton@google.com \
--cc=linmiaohe@huawei.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=naoya.horiguchi@linux.dev \
--cc=naoya.horiguchi@nec.com \
--cc=shy828301@gmail.com \
--cc=songmuchun@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox