From: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>
To: Oscar Salvador <osalvador@suse.de>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"aris@ruivo.org" <aris@ruivo.org>,
"mhocko@kernel.org" <mhocko@kernel.org>,
"tony.luck@intel.com" <tony.luck@intel.com>,
"cai@lca.pw" <cai@lca.pw>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [PATCH v4 0/7] HWpoison: further fixes and cleanups
Date: Thu, 17 Sep 2020 11:39:21 +0000
Message-ID: <20200917113920.GA19898@hori.linux.bs1.fc.nec.co.jp>
In-Reply-To: <20200917081049.27428-1-osalvador@suse.de>
On Thu, Sep 17, 2020 at 10:10:42AM +0200, Oscar Salvador wrote:
> This patchset includes some fixups (patch#1,patch#2 and patch#3)
> and some cleanups (patch#4-7).
>
> Patch#1 is a fix to take off HWPoison pages off a buddy freelist since
> it can lead us to having HWPoison pages back in the game without no one
> noticing it.
> So fix it (we did that already for soft_offline_page [1]).
>
> Patch#2 is fixing a rebasing problem that made the call
> to page_handle_poison from _soft_offline_page set the
> wrong value for hugepage_or_freepage. [2]
>
> Patch#3 is not really a fixup, but tries to re-handle a page
> in case it was allocated under us.
Thanks for the update.
This patchset triggers the following BUG_ON() with Aristeu's workload:
[ 1010.400900] Soft offlining pfn 0xbff8c at process virtual address 0x7fe6c99c8000
[ 1010.402931] page:00000000f5670686 refcount:1 mapcount:-128 mapping:0000000000000000 index:0x1 pfn:0xbff89
[ 1010.405604] flags: 0xfffe000800000(hwpoison)
[ 1010.406755] raw: 000fffe000800000 ffffcddf029ab848 ffffcddf02ff9448 0000000000000000
[ 1010.408824] raw: 0000000000000001 0000000000000000 00000001ffffff7f 0000000000000000
[ 1010.410877] page dumped because: VM_BUG_ON_PAGE(page_count(buddy) != 0)
[ 1010.412673] ------------[ cut here ]------------
[ 1010.413930] kernel BUG at mm/page_alloc.c:800!
[ 1010.415143] invalid opcode: 0000 [#1] SMP PTI
[ 1010.416320] CPU: 3 PID: 1340 Comm: kworker/3:0 Not tainted 5.9.0-rc2-mm1-v5.9-rc2-200917-1952-00212-gf1a0765b04cb+ #33
[ 1010.419101] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
[ 1010.422645] Workqueue: mm_percpu_wq drain_local_pages_wq
[ 1010.424075] RIP: 0010:__free_one_page+0x552/0x580
[ 1010.425344] Code: 48 c7 c6 90 6c 0f 84 4c 89 e7 e8 69 7e fd ff 0f 0b 0f 1f 44 00 00 e9 e5 fc ff ff 48 c7 c6 c8 f3 11 84 4c 89 f7 e8 4e 7e fd ff <0f> 0b 83 fb 08 0f 86 cb fc ff ff 48 83 c4 20 5b 5d 41 5c 41 5d 41
[ 1010.430231] RSP: 0018:ffffaa96c171fda0 EFLAGS: 00010082
[ 1010.431651] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000027
[ 1010.433598] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8dc8bbd18d08
[ 1010.435627] RBP: 00000000000bff88 R08: ffff8dc8bbd18d00 R09: 6573756163656220
[ 1010.437544] R10: 6163656220646570 R11: 6d75642065676170 R12: ffffcddf02ffe200
[ 1010.439376] R13: 00000000000bff89 R14: ffffcddf02ffe240 R15: ffff8dc7bffd5680
[ 1010.441271] FS: 0000000000000000(0000) GS:ffff8dc8bbd00000(0000) knlGS:0000000000000000
[ 1010.443349] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1010.444892] CR2: 00007f6b69f92000 CR3: 0000000139c4a000 CR4: 00000000001506e0
[ 1010.446746] Call Trace:
[ 1010.447424] free_pcppages_bulk+0x1d4/0x2c0
[ 1010.448553] drain_pages_zone+0x42/0x50
[ 1010.449585] drain_local_pages_wq+0xe/0x10
[ 1010.450702] process_one_work+0x1b0/0x360
[ 1010.451769] worker_thread+0x50/0x3a0
[ 1010.452940] ? process_one_work+0x360/0x360
[ 1010.454072] kthread+0xfe/0x140
[ 1010.454989] ? kthread_park+0x90/0x90
[ 1010.455970] ret_from_fork+0x22/0x30
This VM_BUG_ON_PAGE() seems to show that pages being moved to the buddy allocator still hold a refcount.
Could you check which of the v3 -> v4 changes causes this?
Here's my reproducer.
[build1:~]$ cat test_ksm_madv_soft.c
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>
#include <sys/types.h>
#include <errno.h>
#include <stdlib.h>

/* Some libc headers already provide this; define it only if missing. */
#ifndef MADV_SOFT_OFFLINE
#define MADV_SOFT_OFFLINE 101
#endif

#define err(x) do { perror(x); exit(EXIT_FAILURE); } while (0)

int main(void)
{
	int ret;
	size_t size = 100000 * 0x1000;	/* 100000 pages of 4 KiB each */

	/* Two anonymous private mappings that KSM can merge with each other. */
	char *p1 = mmap(NULL, size, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p1 == MAP_FAILED)
		err("mmap(p1)");
	printf("p1 %p\n", (void *)p1);

	char *p2 = mmap(NULL, size, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p2 == MAP_FAILED)
		err("mmap(p2)");
	printf("p2 %p\n", (void *)p2);

	ret = madvise(p1, size, MADV_MERGEABLE);
	printf("madvise(p1) %d\n", ret);
	ret = madvise(p2, size, MADV_MERGEABLE);
	printf("madvise(p2) %d\n", ret);

	/* Fill both ranges with identical content so every page is a merge candidate. */
	printf("writing p1 ... ");
	memset(p1, 'a', size);
	printf("done\n");
	printf("writing p2 ... ");
	memset(p2, 'a', size);
	printf("done\n");

	/* Give ksmd time to scan and merge the pages. */
	sleep(10);

	printf("soft offline\n");
	ret = madvise(p1, size, MADV_SOFT_OFFLINE);
	printf("soft offline returns %d\n", ret);
	if (ret)
		err("madvise");

	madvise(p1, size, MADV_UNMERGEABLE);
	madvise(p2, size, MADV_UNMERGEABLE);
	printf("OK\n");
	return 0;
}
[build1:~/upstream/mm_regression/lib]$ cat tmp_run_ksm_madv.sh
rm test_ksm_madv_soft 2> /dev/null
gcc -o test_ksm_madv_soft test_ksm_madv_soft.c || exit 1
echo 0 > /sys/kernel/mm/ksm/sleep_millisecs
echo 100000 > /sys/kernel/mm/ksm/pages_to_scan
echo 100000 > /sys/kernel/mm/ksm/max_page_sharing
echo 2 > /sys/kernel/mm/ksm/run
echo 1 > /sys/kernel/mm/ksm/run
./test_ksm_madv_soft
Thanks,
Naoya Horiguchi