linux-mm.kvack.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Oleg Nesterov <oleg@redhat.com>, Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Namhyung Kim <namhyung@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Ian Rogers <irogers@google.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Kan Liang <kan.liang@linux.intel.com>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-trace-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>, Jann Horn <jannh@google.com>,
	linux-mm@kvack.org, Peng Zhang <zhangpeng.00@bytedance.com>,
	syzbot+2d788f4f7cb660dac4b7@syzkaller.appspotmail.com
Subject: Re: [PATCH v2] fork: avoid inappropriate uprobe access to invalid mm
Date: Tue, 10 Dec 2024 22:43:55 +0100
Message-ID: <63ae2ae4-b023-4802-9b34-a2c0d272f6d7@redhat.com>
In-Reply-To: <ec718cd8-afcf-4002-95c5-4cc610a44107@lucifer.local>

On 10.12.24 21:59, Lorenzo Stoakes wrote:
> On Tue, Dec 10, 2024 at 08:35:30PM +0100, David Hildenbrand wrote:
>> On 10.12.24 18:24, Lorenzo Stoakes wrote:
>>> If dup_mmap() encounters an issue, currently uprobe is able to access the
>>> relevant mm via the reverse mapping (in build_map_info()), and if we are
>>> very unlucky with a race window, observe invalid XA_ZERO_ENTRY state which
>>> we establish as part of the fork error path.
>>>
>>> This occurs because uprobe_write_opcode() invokes anon_vma_prepare() which
>>> in turn invokes find_mergeable_anon_vma() that uses a VMA iterator,
>>> invoking vma_iter_load() which uses the advanced maple tree API and thus is
>>> able to observe XA_ZERO_ENTRY entries added to dup_mmap() in commit
>>> d24062914837 ("fork: use __mt_dup() to duplicate maple tree in
>>> dup_mmap()").
>>>
>>> This change was made on the assumption that only process tear-down code
>>> would actually observe (and make use of) these values. However this very
>>> unlikely but still possible edge case with uprobes exists and unfortunately
>>> does make these observable.
>>>
>>> The uprobe operation prevents races against the dup_mmap() operation via
>>> the dup_mmap_sem semaphore, which is acquired via uprobe_start_dup_mmap()
>>> and dropped via uprobe_end_dup_mmap(), and held across
>>> register_for_each_vma() prior to invoking build_map_info() which does the
>>> reverse mapping lookup.
>>>
>>> Currently these are acquired and dropped within dup_mmap(), which exposes
>>> the race window prior to error handling in the invoking dup_mm() which
>>> tears down the mm.
>>>
>>> We can avoid all this by just moving the invocation of
>>> uprobe_start_dup_mmap() and uprobe_end_dup_mmap() up a level to dup_mm()
>>> and only release this lock once the dup_mmap() operation succeeds or clean
>>> up is done.
>>
>> What I understand is: we need to perform the uprobe_end_dup_mmap() after the
>> mmput().
> 
> Ack yes.
> 
>>
>> I assume/hope that we cannot see another mmget() before we return here. In
>> that case, this LGTM.
> 
> We are dealing with a tiny time window and brief rmap availability, so it's hard
> to say that's impossible. You also have to have failed to allocate really very
> small amounts of memory, so we are talking lottery odds for this to even happen
> in the first instance :)

Yes, likely the error injection framework is one of the only reliable 
ways to trigger that :)

> 
> I mean the syzkaller report took a year or so to hit it, and had to do
> fault injection to do so.

Ah, there it is: "fault injection" :D

> 
> Also it's not impossible that there are other means of accessing the mm
> containing XA_ZERO_ENTRY items (I believe Liam was looking into this).
> 
> However this patch is intended to at least eliminate the most proximate obvious
> case with as simple a code change as possible.
> 
> Ideally we'd mark the mm as being inaccessible somehow, but MMF_ flags
> are out, and the obvious one to extend to mean this here, MMF_UNSTABLE, may
> interact with OOM killer logic in some horrid way.
> 
>>
>> --
>> Cheers,
>>
>> David / dhildenb
>>
> 
> So overall this patch is a relatively benign attempt to deal with the most
> obvious issue with no apparent cost, but doesn't really rule out the need
> to do more going forward...

Maybe add a bit of that to the patch description. Like "Fixes the 
reproducer, but likely there is more we'll tackle separately", your call.

Thanks for the details!

-- 
Cheers,

David / dhildenb
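
For readers following the thread, the ordering change discussed above can be sketched as a small userspace toy model. This is not kernel code: the function names mirror those in the patch description, but every body here is a hypothetical stand-in, dup_mmap_sem is reduced to a boolean, and "uprobe can reach the mm via rmap" is reduced to a check made at the moment the lock is dropped.

```c
#include <stdbool.h>

/* Toy model only: states a child mm can be in during a (failed) fork. */
enum mm_state { MM_ABSENT, MM_PARTIAL, MM_COMPLETE };

struct model {
	bool dup_mmap_sem_held;   /* stands in for dup_mmap_sem held by fork */
	enum mm_state mm;         /* state of the child mm reachable via rmap */
	bool uprobe_saw_partial;  /* set if a uprobe could walk a partial mm */
};

/* A uprobe writer can only run while fork does not hold the semaphore. */
static void uprobe_window(struct model *m)
{
	if (!m->dup_mmap_sem_held && m->mm == MM_PARTIAL)
		m->uprobe_saw_partial = true;
}

static void uprobe_start_dup_mmap(struct model *m)
{
	m->dup_mmap_sem_held = true;
}

static void uprobe_end_dup_mmap(struct model *m)
{
	m->dup_mmap_sem_held = false;
	uprobe_window(m);         /* uprobes may proceed from here on */
}

/* Hypothetical stand-in for the copy itself; 'fail' models fault injection. */
static int dup_mmap_body(struct model *m, bool fail)
{
	m->mm = fail ? MM_PARTIAL : MM_COMPLETE;
	return fail ? -1 : 0;
}

/* Stand-in for the error-path teardown of the child mm. */
static void mmput(struct model *m)
{
	m->mm = MM_ABSENT;
}

/* Old ordering: the lock is dropped before the caller tears the mm down. */
static int dup_mm_old(struct model *m, bool fail)
{
	uprobe_start_dup_mmap(m);
	int err = dup_mmap_body(m, fail);
	uprobe_end_dup_mmap(m);   /* race window opens here on failure */
	if (err)
		mmput(m);
	return err;
}

/* New ordering: the lock is dropped only after success or full cleanup. */
static int dup_mm_new(struct model *m, bool fail)
{
	uprobe_start_dup_mmap(m);
	int err = dup_mmap_body(m, fail);
	if (err)
		mmput(m);
	uprobe_end_dup_mmap(m);
	return err;
}
```

In this model, a failing dup_mm_old() leaves uprobe_saw_partial set while a failing dup_mm_new() does not, which is the "uprobe_end_dup_mmap() after the mmput()" summary in miniature. It says nothing about other paths that might still reach the mm.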


