On Thu, Jun 16, 2016 at 01:23:43PM +0900, Sergey Senozhatsky wrote:
On (06/16/16 11:58), Minchan Kim wrote: [..]
RAX: 2065676162726166
So rax is total garbage, I think. It means obj_to_head returns garbage because get_first_obj_offset is utter crap, since (page_idx / class->pages_per_zspage) was totally wrong.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 6408: f0 0f ba 28 00 lock btsl $0x0,(%rax)
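A quick way to check the "rax is garbage" reading is to look at the raw bytes of the register value. The sketch below is only an illustration: it assumes x86-64 with 48-bit virtual addresses, and `is_canonical` is a hypothetical helper name, not anything from the kernel.

```python
# RAX value copied from the oops quoted above.
rax = 0x2065676162726166

def is_canonical(addr: int) -> bool:
    """True if bits 63..47 are all equal (assumes 48-bit x86-64 addressing)."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF

# x86-64 is little-endian, so read the least significant byte first.
as_text = rax.to_bytes(8, "little").decode("ascii")

print(repr(as_text))       # 'farbage '
print(is_canonical(rax))   # False
```

The value decodes to printable ASCII and is not a canonical address, which fits the view that `lock btsl $0x0,(%rax)` was handed data bytes rather than a valid pointer.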
<snip>
Could you test with [zsmalloc: keep first object offset in struct page] in mmotm?
sure, I can. will it help, tho? we have a race condition here I think.
I guess the root cause is get_first_obj_offset.
sounds reasonable.
Please test with it.
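To illustrate why a wrong page index matters here, below is a minimal sketch of a first-object-offset calculation. It is not the zsmalloc code: it assumes 4 KiB pages and objects packed back to back from the start of the zspage, and `first_obj_offset` and the class size 1536 are made up for the example.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages

def first_obj_offset(class_size: int, page_idx: int) -> int:
    """Offset of the first object that starts inside page `page_idx`,
    assuming objects are packed contiguously from the start of the zspage."""
    return (-(page_idx * PAGE_SIZE)) % class_size

# Hypothetical size class of 1536 bytes spanning several pages.
offsets = [first_obj_offset(1536, idx) for idx in range(4)]
print(offsets)  # [0, 512, 1024, 0]
```

With these assumptions, using index 2 where index 1 was meant gives 1024 instead of 512, so the computed object header lands in the middle of some other object's payload. Storing the offset in struct page, as the mmotm patch title suggests, avoids recomputing it from an index.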
this is what I'm getting with the [zsmalloc: keep first object offset in struct page] applied: "count:0 mapcount:-127". which may not be related to zsmalloc at this point.
kernel: BUG: Bad page state in process khugepaged pfn:101db8
kernel: page:ffffea0004076e00 count:0 mapcount:-127 mapping: (null) index:0x1
Hm, it seems double free.
It doesn't happen if you disable zram? IOW, it seems to be related to zsmalloc migration?
How easily can you reproduce it? Could you bisect it?
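The double-free reading of "mapcount:-127" can be checked with a little arithmetic. The sketch below rests on two assumptions about kernels of this era that are not stated in the thread: the page dump prints page_mapcount(), i.e. the raw _mapcount plus one, and a page sitting in the buddy allocator is marked with a raw _mapcount of -128 (PAGE_BUDDY_MAPCOUNT_VALUE). `raw_mapcount` is a hypothetical helper.

```python
# Assumption: buddy-page marker used by 4.7-era kernels.
PAGE_BUDDY_MAPCOUNT_VALUE = -128

def raw_mapcount(printed: int) -> int:
    """Undo the +1 that page_mapcount() is assumed to apply before printing."""
    return printed - 1

looks_like_buddy_page = raw_mapcount(-127) == PAGE_BUDDY_MAPCOUNT_VALUE
print(looks_like_buddy_page)
```

If those assumptions hold, the page being freed was already marked as free in the buddy allocator, which is what a double free would look like.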
kernel: flags: 0x8000000000000000()
kernel: page dumped because: nonzero mapcount
kernel: Modules linked in: lzo zram zsmalloc mousedev coretemp hwmon crc32c_intel snd_hda_codec_realtek i2c_i801 snd_hda_codec_generic r8169 mii snd_hda_intel snd_hda_codec snd_hda_core acpi_cpufreq snd_pcm snd_timer snd soundcore lpc_ich processor mfd_core sch_fq_codel sd_mod hid_generic usb
kernel: CPU: 3 PID: 38 Comm: khugepaged Not tainted 4.7.0-rc3-next-20160615-dbg-00005-gfd11984-dirty #491
kernel: 0000000000000000 ffff8801124c73f8 ffffffff814d69b0 ffffea0004076e00
kernel: ffffffff81e658a0 ffff8801124c7420 ffffffff811e9b63 0000000000000000
kernel: ffffea0004076e00 ffffffff81e658a0 ffff8801124c7440 ffffffff811e9ca9
kernel: Call Trace:
kernel: [<ffffffff814d69b0>] dump_stack+0x68/0x92
kernel: [<ffffffff811e9b63>] bad_page+0x158/0x1a2
kernel: [<ffffffff811e9ca9>] free_pages_check_bad+0xfc/0x101
kernel: [<ffffffff811ee516>] free_hot_cold_page+0x135/0x5de
kernel: [<ffffffff811eea26>] __free_pages+0x67/0x72
kernel: [<ffffffff81227c63>] release_freepages+0x13a/0x191
kernel: [<ffffffff8122b3c2>] compact_zone+0x845/0x1155
kernel: [<ffffffff8122ab7d>] ? compaction_suitable+0x76/0x76
kernel: [<ffffffff8122bdb2>] compact_zone_order+0xe0/0x167
kernel: [<ffffffff8122bcd2>] ? compact_zone+0x1155/0x1155
kernel: [<ffffffff8122ce88>] try_to_compact_pages+0x2f1/0x648
kernel: [<ffffffff8122ce88>] ? try_to_compact_pages+0x2f1/0x648
kernel: [<ffffffff8122cb97>] ? compaction_zonelist_suitable+0x3a6/0x3a6
kernel: [<ffffffff811ef1ea>] ? get_page_from_freelist+0x2c0/0x133c
kernel: [<ffffffff811f0350>] __alloc_pages_direct_compact+0xea/0x30d
kernel: [<ffffffff811f0266>] ? get_page_from_freelist+0x133c/0x133c
kernel: [<ffffffff811ee3b2>] ? drain_all_pages+0x1d6/0x205
kernel: [<ffffffff811f21a8>] __alloc_pages_nodemask+0x143d/0x16b6
kernel: [<ffffffff8111f405>] ? debug_show_all_locks+0x226/0x226
kernel: [<ffffffff811f0d6b>] ? warn_alloc_failed+0x24c/0x24c
kernel: [<ffffffff81110ffc>] ? finish_wait+0x1a4/0x1b0
kernel: [<ffffffff81122faf>] ? lock_acquire+0xec/0x147
kernel: [<ffffffff81d32ed0>] ? _raw_spin_unlock_irqrestore+0x3b/0x5c
kernel: [<ffffffff81d32edc>] ? _raw_spin_unlock_irqrestore+0x47/0x5c
kernel: [<ffffffff81110ffc>] ? finish_wait+0x1a4/0x1b0
kernel: [<ffffffff8128f73a>] khugepaged+0x1d4/0x484f
kernel: [<ffffffff8128f566>] ? hugepage_vma_revalidate+0xef/0xef
kernel: [<ffffffff810d5bcc>] ? finish_task_switch+0x3de/0x484
kernel: [<ffffffff81d32f18>] ? _raw_spin_unlock_irq+0x27/0x45
kernel: [<ffffffff8111d13f>] ? trace_hardirqs_on_caller+0x3d2/0x492
kernel: [<ffffffff81111487>] ? prepare_to_wait_event+0x3f7/0x3f7
kernel: [<ffffffff81d28bf5>] ? __schedule+0xa4d/0xd16
kernel: [<ffffffff810cd0de>] kthread+0x252/0x261
kernel: [<ffffffff8128f566>] ? hugepage_vma_revalidate+0xef/0xef
kernel: [<ffffffff810cce8c>] ? kthread_create_on_node+0x377/0x377
kernel: [<ffffffff81d3387f>] ret_from_fork+0x1f/0x40
kernel: [<ffffffff810cce8c>] ? kthread_create_on_node+0x377/0x377
-- Reboot --
-ss