-----Original Message----- From: Josh Poimboeuf [mailto:jpoimboe@redhat.com] Sent: Monday, March 07, 2016 6:10 PM To: Deucher, Alexander; Koenig, Christian Cc: dri-devel@lists.freedesktop.org; linux-kernel@vger.kernel.org; kbuild test robot; Ingo Molnar Subject: [PATCH] drm/radeon: refactor CIK tiling table initialization
When compiling the radeon driver on x86_64 with CONFIG_STACK_VALIDATION enabled, objtool gives the following warnings:
drivers/gpu/drm/radeon/cik.o: warning: objtool: cik_tiling_mode_table_init()+0x6ce: call without frame pointer save/setup drivers/gpu/drm/radeon/cik.o: warning: objtool: cik_tiling_mode_table_init()+0x72b: call without frame pointer save/setup drivers/gpu/drm/radeon/cik.o: warning: objtool: cik_tiling_mode_table_init()+0x464: call without frame pointer save/setup ...
These are actually false positive warnings; there are no frame pointer bugs. Instead objtool gets confused by the jump tables created by all the switch statements, combined with some other gcc optimizations. It tries to follows all possible code paths, but it fails to realize that some of the paths aren't possible. For example:
4c97: 31 c0 xor %eax,%eax ... 4ca2: 89 c1 mov %eax,%ecx 4ca4: ff 24 cd 00 00 00 00 jmpq *0x0(,%rcx,8) 4ca7: R_X86_64_32S
.rodata+0x148
First eax is cleared to zero with the "xor %eax,%eax" instruction. Later, it moves the value of eax (zero in this case) to ecx, and uses that value to jump to the first entry in a jump table in .rodata.
Because objtool doesn't have an x86 emulator, it doesn't know that rcx is zero. So instead of following a single code path to the first jump table entry, it follows all possible jump table entry paths in parallel.
Usually such overactive analysis isn't a problem. In every other jump table in the kernel, all the jump targets have the same frame pointer state. But in this exceedingly rare case, different targets have different frame pointer states. Objtool notices that and creates the false positive warnings.
In theory we could use the STACK_FRAME_NON_STANDARD marker to tell objtool to skip analysis of the function. However, that's less than ideal.
Looking at the cik_tiling_mode_table_init() code, it seems overly complex with lots of repetition. So let's simplify it. All the switch statements and conditionals can be replaced with much simpler logic by generalizing the different behaviors and moving the initialization data into data structures.
The change is a win-win: it's easier to parse for both humans and machines. It also reduces the binary size by about 2%:
text data bss dec hex filename
101011 30360 0 131371 2012b cik-before.o 98699 30200 0 128899 1f783 cik-after.o
[ Note: Unfortunately I don't know how to test this code, so it's completely untested. Any help or guidance with ensuring that the correct initialization is still being written would be greatly appreciated! ]
I think it would be clearer to rework it similarly to how it was reworked in amdgpu (see gfx_v8_0.c and gfx_v7_0.c in drm-next). Also ideally you'd update the similar code in si.c as well for consistency.
Alex