From: "Mika Penttilä" <mpenttil@redhat.com>
To: Alistair Popple <apopple@nvidia.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
linux-kernel@vger.kernel.org, "Sierra Guiza,
Alejandro (Alex)" <alex.sierra@amd.com>,
Felix Kuehling <Felix.Kuehling@amd.com>,
Jason Gunthorpe <jgg@nvidia.com>,
John Hubbard <jhubbard@nvidia.com>,
David Hildenbrand <david@redhat.com>,
Ralph Campbell <rcampbell@nvidia.com>,
Matthew Wilcox <willy@infradead.org>,
Karol Herbst <kherbst@redhat.com>, Lyude Paul <lyude@redhat.com>,
Ben Skeggs <bskeggs@redhat.com>,
Logan Gunthorpe <logang@deltatee.com>,
linuxram@us.ibm.com, paulus@ozlabs.org
Subject: Re: [PATCH 2/2] selftests/hmm-tests: Add test for dirty bits
Date: Mon, 15 Aug 2022 07:05:01 +0300
Message-ID: <2aa2013a-735d-a96a-2f35-0a44a06d85f0@redhat.com>
In-Reply-To: <87h72ew4p6.fsf@nvdebian.thelocal>
On 15.8.2022 6.21, Alistair Popple wrote:
>
> Mika Penttilä <mpenttil@redhat.com> writes:
>
>> On 15.8.2022 5.35, Alistair Popple wrote:
>>> Mika Penttilä <mpenttil@redhat.com> writes:
>>>
>>>> Hi Alistair!
>>>>
>>>> On 12.8.2022 8.22, Alistair Popple wrote:
>>> [...]
>>>
>>>>> + buffer->ptr = mmap(NULL, size,
>>>>> + PROT_READ | PROT_WRITE,
>>>>> + MAP_PRIVATE | MAP_ANONYMOUS,
>>>>> + buffer->fd, 0);
>>>>> + ASSERT_NE(buffer->ptr, MAP_FAILED);
>>>>> +
>>>>> + /* Initialize buffer in system memory. */
>>>>> + for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i)
>>>>> + ptr[i] = 0;
>>>>> +
>>>>> + ASSERT_FALSE(write_cgroup_param(cgroup, "memory.reclaim", 1UL<<30));
>>>>> +
>>>>> + /* Fault pages back in from swap as clean pages */
>>>>> + for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i)
>>>>> + tmp += ptr[i];
>>>>> +
>>>>> + /* Dirty the pte */
>>>>> + for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i)
>>>>> + ptr[i] = i;
>>>>> +
>>>>
>>>> The anon pages are quite likely in memory at this point, with their ptes dirty.
>>> Why would the pte be dirty? I just confirmed using some modified pagemap
>>> code that on my system at least this isn't the case.
>>>
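
As an aside, stock /proc/self/pagemap cannot show the hardware pte dirty
bit at all; it only exposes the soft-dirty bit plus the present/swapped
bits, which is presumably why modified pagemap code was needed here. For
anyone who wants to poke at the test buffer from userspace, a minimal
sketch of reading a pagemap entry is below; the PM_* names are just local
defines for the sketch, the bit positions come from
Documentation/admin-guide/mm/pagemap.rst:

#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Pagemap entry bits, see Documentation/admin-guide/mm/pagemap.rst. */
#define PM_SOFT_DIRTY	(1ULL << 55)
#define PM_SWAP		(1ULL << 62)
#define PM_PRESENT	(1ULL << 63)

/* Read the /proc/self/pagemap entry for the page containing addr. */
static int pagemap_entry(void *addr, uint64_t *entry)
{
	long pagesize = sysconf(_SC_PAGESIZE);
	off_t offset = (uintptr_t)addr / pagesize * sizeof(uint64_t);
	int fd = open("/proc/self/pagemap", O_RDONLY);
	int ret = -1;

	if (fd < 0)
		return -1;
	if (pread(fd, entry, sizeof(*entry), offset) == sizeof(*entry))
		ret = 0;
	close(fd);
	return ret;
}

Checking PM_PRESENT/PM_SWAP on the buffer before and after the
memory.reclaim write shows whether the pages really went out to swap.
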
>>>>> + /*
>>>>> + * Attempt to migrate memory to device, which should fail because
>>>>> + * hopefully some pages are backed by swap storage.
>>>>> + */
>>>>> + ASSERT_TRUE(hmm_migrate_sys_to_dev(self->fd, buffer, npages));
>>>>
>>>> And the pages are also marked dirty now. But could you elaborate in more
>>>> detail how and where the above fails? I couldn't immediately see it...
>>> Not if you don't have patch 1 of this series applied. If the
>>> trylock_page() in migrate_vma_collect_pmd() succeeds (which it almost
>>> always does) it will have cleared the pte without setting PageDirty.
>>>
>>
>> Ah yes, but I meant with patch 1 applied: the comment "Attempt to migrate
>> memory to device, which should fail because hopefully some pages are backed by
>> swap storage" indicates that hmm_migrate_sys_to_dev() would fail, and the
>> ASSERT_TRUE means the test expects a failure here.
>>
>> So I understand the data loss, but where does hmm_migrate_sys_to_dev() fail,
>> with or without patch 1 applied?
>
> Oh right. hmm_migrate_sys_to_dev() will fail because the page is in the
> swap cache, and migrate_vma_*() doesn't currently support migrating
> pages with a mapping.
>
Ok, I forgot we also skip page cache pages, not just file-backed pages...
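
write_cgroup_param() isn't visible in this hunk, but presumably it boils
down to writing a byte count into the cgroup v2 memory.reclaim file
(available since v5.19). A sketch under that assumption; the function name
and the /sys/fs/cgroup path below are hypothetical stand-ins, not the
test's actual helper:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Ask the kernel to proactively reclaim up to 'bytes' from a cgroup v2
 * group by writing to its memory.reclaim file. Returns 0 only if the
 * kernel managed to reclaim the requested amount.
 */
static int reclaim_from_cgroup(const char *cgroup, unsigned long bytes)
{
	char path[256], val[32];
	int fd, len, ret = -1;

	snprintf(path, sizeof(path), "/sys/fs/cgroup/%s/memory.reclaim", cgroup);
	len = snprintf(val, sizeof(val), "%lu", bytes);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;
	if (write(fd, val, len) == len)
		ret = 0;
	close(fd);
	return ret;
}

With the test buffer charged to that cgroup, a large enough request
(1UL << 30 in the test) pushes the anonymous pages out to swap, so the
following read loop faults them back in as clean swap cache pages.
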
>>> So now we have a dirty page without PageDirty set and without a dirty
>>> pte. If this page gets swapped back to disk and is still in the swap
>>> cache, data will be lost because reclaim will see a clean page and won't
>>> write it out again.
>>> At least that's my understanding - please let me know if you see
>>> something that doesn't make sense.
>>>
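
For reference, my reading of what patch 1 does (sketched from the
description above, not the actual diff): once migrate_vma_collect_pmd()
has cleared the pte for migration, the dirty state only lives in the saved
pte value, so it has to be transferred to the folio before it is lost,
roughly:

	pte = ptep_get_and_clear(mm, addr, ptep);

	/*
	 * The pte is gone, so any dirty bit it carried has to be moved to
	 * the folio now. Otherwise an aborted migration leaves a dirty page
	 * that neither the pte nor PageDirty records, and reclaim will
	 * treat it as clean.
	 */
	if (pte_dirty(pte))
		folio_mark_dirty(page_folio(page));
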
>>>>> +
>>>>> + ASSERT_FALSE(write_cgroup_param(cgroup, "memory.reclaim", 1UL<<30));
>>>>> +
>>>>> + /* Check we still see the updated data after restoring from swap. */
>>>>> + for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i)
>>>>> + ASSERT_EQ(ptr[i], i);
>>>>> +
>>>>> + hmm_buffer_free(buffer);
>>>>> + destroy_cgroup();
>>>>> +}
>>>>> +
>>>>> /*
>>>>> * Read anonymous memory multiple times.
>>>>> */
>>>>
>>>>
>>>> --Mika
>>>
>
Thread overview: 14+ messages
2022-08-12 5:22 [PATCH 1/2] mm/migrate_device.c: Copy pte dirty bit to page Alistair Popple
2022-08-12 5:22 ` [PATCH 2/2] selftests/hmm-tests: Add test for dirty bits Alistair Popple
2022-08-12 7:58 ` Mika Penttilä
2022-08-15 2:35 ` Alistair Popple
2022-08-15 3:11 ` Mika Penttilä
2022-08-15 3:21 ` Alistair Popple
2022-08-15 4:05 ` Mika Penttilä [this message]
2022-08-15 4:06 ` Mika Penttilä
2022-08-15 20:29 ` [PATCH 1/2] mm/migrate_device.c: Copy pte dirty bit to page Peter Xu
2022-08-16 0:51 ` Alistair Popple
2022-08-16 1:39 ` huang ying
2022-08-16 2:28 ` Alistair Popple
2022-08-16 6:37 ` huang ying
2022-08-17 1:27 ` Alistair Popple