On 2011.12.02 at 21:06 +0100, Markus Trippelsdorf wrote:
On 2011.12.02 at 14:43 -0500, Jerome Glisse wrote:
On Thu, Dec 01, 2011 at 09:44:37AM +0100, Markus Trippelsdorf wrote:
On 2011.11.24 at 09:50 +0100, Markus Trippelsdorf wrote:
On 2011.11.23 at 10:06 -0600, Christoph Lameter wrote:
On Wed, 23 Nov 2011, Markus Trippelsdorf wrote:
> FIX idr_layer_cache: Marking all objects used
Yesterday I couldn't reproduce the issue at all. But today I've hit exactly the same spot again. (CCing the drm list)
Well, this looks like a write after free.
=============================================================================
BUG idr_layer_cache: Poison overwritten
Object ffff8802156487c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8802156487d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8802156487e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8802156487f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff880215648800: 00 00 00 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b ....kkkkkkkkkkkk
Object ffff880215648810: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
And it's an integer-sized write of 0. If you look at the struct definition and look up the offset, you should be able to locate the field that was modified.
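A minimal sketch of how the overwritten bytes could be located mechanically, assuming every "Object" line of the dump carries exactly 16 bytes and that 0x6b is the free-poison byte. The helper names (parse_dump, overwritten_runs) are made up for illustration. The offset it prints is relative to the first dumped address, which equals the field offset only if the quoted dump begins at the start of the object; that is an assumption, not something the report confirms.

```python
POISON_FREE = 0x6B  # byte the slab allocator writes into freed objects

# Dump lines copied from the report above.
DUMP = """\
Object ffff8802156487c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8802156487d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8802156487e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8802156487f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff880215648800: 00 00 00 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b ....kkkkkkkkkkkk
Object ffff880215648810: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
"""


def parse_dump(dump):
    """Yield (address, byte value) for every byte in the 'Object' lines."""
    for line in dump.splitlines():
        if not line.startswith("Object "):
            continue
        head, rest = line.split(":", 1)
        base = int(head.split()[1], 16)
        # Assumption: 16 hex bytes per line; the trailing ASCII column is ignored.
        for i, tok in enumerate(rest.split()[:16]):
            yield base + i, int(tok, 16)


def overwritten_runs(dump, poison=POISON_FREE):
    """Group consecutive non-poison bytes into (start address, [bytes]) runs."""
    runs = []
    for addr, val in parse_dump(dump):
        if val == poison:
            continue
        if runs and runs[-1][0] + len(runs[-1][1]) == addr:
            runs[-1][1].append(val)  # extends the previous run
        else:
            runs.append((addr, [val]))
    return runs


first_addr = next(parse_dump(DUMP))[0]
for start, data in overwritten_runs(DUMP):
    print(f"{len(data)} byte(s) at {start:x}, "
          f"offset {start - first_addr:#x} from first dumped address: "
          f"{' '.join(f'{b:02x}' for b in data)}")
```

For the dump quoted above this finds a single 4-byte run of zeros at ffff880215648800, which matches the "integer-sized write of 0" reading. Mapping that address to a field still requires the real start of the object and the struct layout of the kernel that produced the report.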
It also happens with CONFIG_SLAB. (If someone wants to reproduce the issue, just run a kexec boot loop and the bug will occur after a few (~10) iterations.)
Can you provide the kexec command line you are using and the full kernel log? (I'm mostly interested in the kernel options.)
/usr/sbin/kexec -l "/usr/src/linux/arch/x86/boot/bzImage" --append="root=PARTUUID=6d6a4009-3a90-40df-806a-e63f48189719 init=/sbin/minit rootflags=logbsize=262144 fbcon=rotate:3 drm_kms_helper.poll=0 quiet"
/usr/sbin/kexec -e
The loop happens after autologin in .zprofile:
sleep 4 && sudo /etc/minit/ctrlaltdel/run
(The last script kills, unmounts and then runs the two kexec commands above.)
BTW I always see (mostly only on screen, sometimes in the logs):
[Firmware Bug]: cpu 2, try to use APIC500 (LVT offset 0) for vector 0x10400, but the register is already in use for vector 0xf9 on another cpu
[Firmware Bug]: cpu 2, IBS interrupt offset 0 not available (MSRC001103A=0x0000000000000100)
[Firmware Bug]: using offset 1 for IBS interrupts
[Firmware Bug]: workaround enabled for IBS LVT offset
perf: AMD IBS detected (0x0000001f)
But I hope that it is only a harmless warning. (perf Instruction-Based Sampling)
Robert?