linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrey Ryabinin <aryabinin@virtuozzo.com>
To: Andrea Arcangeli <aarcange@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org
Cc: Evgheni Dereveanchin <ederevea@redhat.com>,
	Petr Holasek <pholasek@redhat.com>,
	Hugh Dickins <hughd@google.com>,
	Arjan van de Ven <arjan@linux.intel.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Gavin Guo <gavin.guo@canonical.com>,
	Jay Vosburgh <jay.vosburgh@canonical.com>,
	Mel Gorman <mgorman@techsingularity.net>
Subject: Re: [PATCH 1/1] ksm: fix use after free with merge_across_nodes = 0
Date: Mon, 15 May 2017 19:14:36 +0300	[thread overview]
Message-ID: <359799de-6afa-8385-1573-4ea45625eac8@virtuozzo.com> (raw)
In-Reply-To: <20170512193805.8807-2-aarcange@redhat.com>

On 05/12/2017 10:38 PM, Andrea Arcangeli wrote:
> If merge_across_nodes was manually set to 0 (not the default value) by
> the admin or a tuned profile on NUMA systems triggering cross-NODE
> page migrations, a stable_node use after free could materialize.
> 
> If the chain is collapsed stable_node would point to the old chain
> that was already freed. stable_node_dup would be the stable_node dup
> now converted to a regular stable_node and indexed in the rbtree in
> replacement of the freed stable_node chain (not anymore a dup).
> 
> This special case where the chain is collapsed in the NUMA replacement
> path, is now detected by setting stable_node to NULL by the
> chain_prune callee if it decides to collapse the chain. This tells the
> NUMA replacement code that even if stable_node and stable_node_dup are
> different, this is not a chain if stable_node is NULL, as the
> stable_node_dup was converted to a regular stable_node and the chain
> was collapsed.
> 
> It is generally safer for the callee to force the caller stable_node
> to NULL the moment it become stale so any other mistake like this
> would result in an instant Oops easier to debug than an use after free.
> 
> Otherwise the replace logic would act like if stable_node was a valid
> chain, when in fact it was freed. Notably
> stable_node_chain_add_dup(page_node, stable_node) would run on a
> stable stable_node.
> 
> Andrey Ryabinin found the source of the use after free in
> chain_prune().
> 
> Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Reported-by: Evgheni Dereveanchin <ederevea@redhat.com>
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> ---


Works for me,

Tested-by: Andrey Ryabinin <aryabinin@virtuozzo.com>


Bellow is reproducer which causes crash in ksm in several minutes without this fix.


$ cat  ksm_test.c

#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/mman.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <numaif.h>
#include <sys/types.h>
#include <sys/wait.h>

#define NR_NODES 4
#define MAP_SIZE 4096

#define NR_THREADS 1024

pid_t pids[NR_THREADS];

int merge_and_migrate(void)
{
        void *p;
        unsigned long rnd;
        unsigned long old_node, new_node;
        pid_t p_pid, pid;
        int j;

        p = mmap(NULL, MAP_SIZE, PROT_READ|PROT_WRITE,
                MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
                perror("mmap"), exit(1);

        memset(p, 0xff, MAP_SIZE);
        if (madvise(p, MAP_SIZE, MADV_MERGEABLE))
                perror("madvise"), exit(1);


        while (1) {
                sleep(0);
                rnd = rand() % 2;
                switch (rnd) {
                case 0: {
                        rnd = rand() % 128;
                        memset(p, rnd, MAP_SIZE);
                        break;
                }
                case 1: {
                        j = rand()%NR_NODES;
                        old_node = 1 << j;
                        new_node = 1<<((j+1)%NR_NODES);

                        migrate_pages(0, NR_NODES, &old_node, &new_node);
                        break;
                }
                }
        }
        return 0;
}

int main(void)
{
        int i,ret,j;
        pid_t pid;
        int wstatus;
        unsigned long old_node, new_node;

        for (i = 0; i < NR_THREADS; i++) {
                pid = fork();
                if (pid < 0) {
                        perror("fork");
                        return 1;
                }
                if (pid) {
                        pids[i] = pid;
                        continue;
                } else
                        merge_and_migrate();
        }

        while (1) {
                pid = waitpid(-1, &wstatus, WNOHANG);
                if (pid < 0) {
                        perror("waitpid failed");
                        return 1;
                }
                if (pid) {
                        for (i = 0; i< NR_THREADS; i++) {
                                if (pids[i] == pid) {
                                        pid = fork();
                                        if (pid < 0) {
                                                perror("fork in while");
                                                return 1;
                                        }
                                        if (pid) {
                                                pids[i] = pid;
                                                break;
                                        } else
                                                merge_and_migrate();
                                }
                        }
                        continue; /*while(1)*/
                }
                i = rand()%NR_THREADS;
                kill(pids[i], SIGKILL);
        }
        return 0;
}

$ cat run_ksm.sh
#!/bin/bash

gcc -lnuma -O2 ksm_test.c -o ksm_test
echo 1 > /sys/kernel/mm/ksm/run
echo 0 > /sys/kernel/mm/ksm/merge_across_nodes
echo 2 > /sys/kernel/mm/ksm/max_page_sharing
echo 0 > /sys/kernel/mm/ksm/stable_node_chains_prune_millisecs
./ksm_test


$ ./run_ksm.sh
[  203.251200] ==================================================================
[  203.251679] BUG: KASAN: use-after-free in stable_tree_search+0x1450/0x16f0
[  203.252229] Read of size 4 at addr ffff880037e9d938 by task ksmd/170
[  203.252800] 
[  203.252957] CPU: 2 PID: 170 Comm: ksmd Not tainted 4.12.0-rc1-next-20170515+ #639
[  203.253627] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
[  203.254670] Call Trace:
[  203.254907]  dump_stack+0x67/0x98
[  203.255222]  print_address_description+0x7c/0x290
[  203.255652]  ? stable_tree_search+0x1450/0x16f0
[  203.256073]  kasan_report+0x26e/0x350
[  203.256418]  __asan_report_load4_noabort+0x19/0x20
[  203.256852]  stable_tree_search+0x1450/0x16f0
[  203.257262]  ? __stable_node_chain+0x8a0/0x8a0
[  203.257668]  ? follow_page_mask+0x5f9/0xd80
[  203.258060]  ksm_scan_thread+0xb47/0x2790
[  203.258438]  ? stable_tree_search+0x16f0/0x16f0
[  203.258858]  ? __schedule+0x904/0x1ad0
[  203.259214]  ? clkdev_alloc+0xd0/0xd0
[  203.259553]  ? wake_atomic_t_function+0x2a0/0x2a0
[  203.259985]  ? trace_hardirqs_on+0xd/0x10
[  203.260361]  kthread+0x2d6/0x3d0
[  203.260658]  ? stable_tree_search+0x16f0/0x16f0
[  203.261073]  ? kthread_create_on_node+0xb0/0xb0
[  203.261485]  ret_from_fork+0x2e/0x40
[  203.261819] 
[  203.261936] Allocated by task 170:
[  203.262251]  save_stack_trace+0x1b/0x20
[  203.262601]  kasan_kmalloc+0xee/0x180
[  203.262938]  kasan_slab_alloc+0x12/0x20
[  203.263290]  kmem_cache_alloc+0x129/0x2d0
[  203.263654]  alloc_stable_node_chain+0x29/0x310
[  203.264072]  ksm_scan_thread+0x2048/0x2790
[  203.264444]  kthread+0x2d6/0x3d0
[  203.264744]  ret_from_fork+0x2e/0x40
[  203.265075] 
[  203.265220] Freed by task 170:
[  203.265503]  save_stack_trace+0x1b/0x20
[  203.265852]  kasan_slab_free+0xad/0x180
[  203.266208]  kmem_cache_free+0xc7/0x300
[  203.266558]  __stable_node_chain+0x68a/0x8a0
[  203.266948]  stable_tree_search+0x18e/0x16f0
[  203.267339]  ksm_scan_thread+0xb47/0x2790
[  203.267655]  kthread+0x2d6/0x3d0
[  203.267910]  ret_from_fork+0x2e/0x40





--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-05-15 16:12 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-12 19:38 [RFC] [PATCH 0/1] " Andrea Arcangeli
2017-05-12 19:38 ` [PATCH 1/1] " Andrea Arcangeli
2017-05-15 16:14   ` Andrey Ryabinin [this message]
2017-05-12 20:37 ` [RFC] [PATCH 0/1] " Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=359799de-6afa-8385-1573-4ea45625eac8@virtuozzo.com \
    --to=aryabinin@virtuozzo.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=arjan@linux.intel.com \
    --cc=dave@stgolabs.net \
    --cc=ederevea@redhat.com \
    --cc=gavin.guo@canonical.com \
    --cc=hughd@google.com \
    --cc=jay.vosburgh@canonical.com \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=pholasek@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox