The varargs macro trick in _PIPE3/_PHY3/_PORT3 was meant as an optimization to shrink the i915 kernel module by around 1000 bytes. However, the downside is a size regression with CONFIG_KASAN, as I found from stack size warnings with gcc-7.0.1:
before:
  drivers/gpu/drm/i915/intel_dpll_mgr.c: In function 'bxt_ddi_pll_get_hw_state':
  drivers/gpu/drm/i915/intel_dpll_mgr.c:1644:1: error: the frame size of 176 bytes is larger than 100 bytes [-Werror=frame-larger-than=]
  drivers/gpu/drm/i915/intel_dpll_mgr.c: In function 'bxt_ddi_pll_enable':
  drivers/gpu/drm/i915/intel_dpll_mgr.c:1548:1: error: the frame size of 224 bytes is larger than 100 bytes [-Werror=frame-larger-than=]

after:
  drivers/gpu/drm/i915/intel_dpll_mgr.c: In function 'bxt_ddi_pll_get_hw_state':
  drivers/gpu/drm/i915/intel_dpll_mgr.c:1644:1: error: the frame size of 1016 bytes is larger than 1000 bytes [-Werror=frame-larger-than=]
  drivers/gpu/drm/i915/intel_dpll_mgr.c: In function 'bxt_ddi_pll_enable':
  drivers/gpu/drm/i915/intel_dpll_mgr.c:1548:1: error: the frame size of 1960 bytes is larger than 1000 bytes [-Werror=frame-larger-than=]
I also checked the module sizes with gcc-7.0.1 and got:
original:
   text    data     bss     dec     hex filename
2380830 1155436    4448 3540714  3606ea drivers/gpu/drm/i915/i915-kasan.o
1298054  543692    2884 1844630  1c2596 drivers/gpu/drm/i915/i915-nokasan.o

after ce64645d86ac:
   text    data     bss     dec     hex filename
2389515 1154476    4448 3548439  362517 drivers/gpu/drm/i915/i915-kasan.o
1299639  543692    2884 1846215  1c2bc7 drivers/gpu/drm/i915/i915-nokasan.o

with this patch:
   text    data     bss     dec     hex filename
2381275 1163884    4448 3549607  3629a7 drivers/gpu/drm/i915/i915-kasan.o
1296038  543692    2884 1842614  1c1db6 drivers/gpu/drm/i915/i915-nokasan.o
This actually shows a code size growth in .text after ce64645d86ac, both with and without KASAN, and my version gets most of it back at the expense of a larger .data when KASAN is enabled.
Fixes: ce64645d86ac ("drm/i915: use variadic macros and arrays to choose port/pipe based registers")
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80114
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 drivers/gpu/drm/i915/i915_reg.h | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 04c8f69fcc62..39b53878a188 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -48,7 +48,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 	return !i915_mmio_reg_equal(reg, INVALID_MMIO_REG);
 }
 
-#define _PICK(__index, ...) (((const u32 []){ __VA_ARGS__ })[__index])
+#define _PICK(__index, ...) ({static const u32 __arr[] = { __VA_ARGS__ }; __arr[__index];})
 
 #define _PIPE(pipe, a, b) ((a) + (pipe)*((b)-(a)))
 #define _MMIO_PIPE(pipe, a, b) _MMIO(_PIPE(pipe, a, b))
@@ -2657,10 +2657,10 @@ enum skl_disp_power_wells {
 /*
  * Clock control & power management
  */
-#define _DPLL_A (dev_priv->info.display_mmio_offset + 0x6014)
-#define _DPLL_B (dev_priv->info.display_mmio_offset + 0x6018)
-#define _CHV_DPLL_C (dev_priv->info.display_mmio_offset + 0x6030)
-#define DPLL(pipe) _MMIO_PIPE3((pipe), _DPLL_A, _DPLL_B, _CHV_DPLL_C)
+#define _DPLL_A 0x6014
+#define _DPLL_B 0x6018
+#define _CHV_DPLL_C 0x6030
+#define DPLL(pipe) _MMIO(dev_priv->info.display_mmio_offset + _PIPE3((pipe), _DPLL_A, _DPLL_B, _CHV_DPLL_C))
 
 #define VGA0 _MMIO(0x6000)
 #define VGA1 _MMIO(0x6004)
@@ -2756,10 +2756,10 @@ enum skl_disp_power_wells {
 #define SDVO_MULTIPLIER_SHIFT_HIRES 4
 #define SDVO_MULTIPLIER_SHIFT_VGA 0
 
-#define _DPLL_A_MD (dev_priv->info.display_mmio_offset + 0x601c)
-#define _DPLL_B_MD (dev_priv->info.display_mmio_offset + 0x6020)
-#define _CHV_DPLL_C_MD (dev_priv->info.display_mmio_offset + 0x603c)
-#define DPLL_MD(pipe) _MMIO_PIPE3((pipe), _DPLL_A_MD, _DPLL_B_MD, _CHV_DPLL_C_MD)
+#define _DPLL_A_MD 0x601c
+#define _DPLL_B_MD 0x6020
+#define _CHV_DPLL_C_MD 0x603c
+#define DPLL_MD(pipe) _MMIO(dev_priv->info.display_mmio_offset + _PIPE3((pipe), _DPLL_A_MD, _DPLL_B_MD, _CHV_DPLL_C_MD))
 
 /*
  * UDI pixel divider, controlling how many pixels are stuffed into a packet.
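
For anyone who wants to try this outside the kernel tree, the two _PICK forms from the first hunk can be compared in a small userspace sketch. It assumes GCC or Clang, since the static-array form relies on the GNU statement-expression extension. PICK_LITERAL, PICK_STATIC, dpll_literal(), dpll_static() and check_forms_agree() are illustrative names, uint32_t stands in for the kernel's u32, and the mmio_base parameter is a hypothetical stand-in for dev_priv->info.display_mmio_offset.

```c
#include <assert.h>
#include <stdint.h>

/* Old form: an anonymous compound literal. At function scope it has
 * automatic storage, so each use may be built on the stack. */
#define PICK_LITERAL(idx, ...) (((const uint32_t[]){ __VA_ARGS__ })[idx])

/* New form: a static const table inside a GNU statement expression.
 * Static initializers must be constant expressions, so a runtime base
 * offset cannot be part of the table and has to be added outside. */
#define PICK_STATIC(idx, ...) \
	({ static const uint32_t arr_[] = { __VA_ARGS__ }; arr_[idx]; })

/* Register offsets are the ones in the patch; mmio_base is hypothetical. */
static uint32_t dpll_literal(uint32_t mmio_base, int pipe)
{
	/* Runtime values inside the literal are fine for the old form. */
	return PICK_LITERAL(pipe, mmio_base + 0x6014, mmio_base + 0x6018,
			    mmio_base + 0x6030);
}

static uint32_t dpll_static(uint32_t mmio_base, int pipe)
{
	/* Only constants go in the table; the base is added afterwards. */
	return mmio_base + PICK_STATIC(pipe, 0x6014, 0x6018, 0x6030);
}

/* Usage: both forms should select the same register for every pipe. */
static void check_forms_agree(uint32_t mmio_base)
{
	for (int pipe = 0; pipe < 3; pipe++)
		assert(dpll_literal(mmio_base, pipe) ==
		       dpll_static(mmio_base, pipe));
}
```

The constant-initializer requirement is presumably why the patch also moves display_mmio_offset out of the _DPLL_* defines and into DPLL() and DPLL_MD().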
On Mon, 20 Mar 2017, Arnd Bergmann <arnd@arndb.de> wrote:
> The varargs macro trick in _PIPE3/_PHY3/_PORT3 was meant as an optimization to shrink the i915 kernel module by around 1000 bytes.
Really, I didn't care one bit about the size shrink, I only cared about making it easier and less error prone to increase the number of args in a number of places. Maintainability and correctness were the goals. Just for the record. ;)
Otherwise, this seems like an acceptable approach to me. Would be interesting to see what happens with this, and f4c3a88e5f04 ("drm/i915: Tighten mmio arrays for MIPI_PORT") reverted on top.
BR, Jani.
On Tue, Mar 21, 2017 at 9:26 AM, Jani Nikula <jani.nikula@linux.intel.com> wrote:
> On Mon, 20 Mar 2017, Arnd Bergmann <arnd@arndb.de> wrote:
>> The varargs macro trick in _PIPE3/_PHY3/_PORT3 was meant as an optimization to shrink the i915 kernel module by around 1000 bytes.
>
> Really, I didn't care one bit about the size shrink, I only cared about making it easier and less error prone to increase the number of args in a number of places. Maintainability and correctness were the goals. Just for the record. ;)
Ok. My only interest here is the warning about possible stack overflow, though the fact that KASAN considers the array code to be fragile is an indication that it is perhaps actually dangerous: if we ever run into a bug that causes the array index to overflow, we might in theory have a security bug that lets users access arbitrary kernel pointers.
While the risk for that actually happening is very low, the original code was safer in that regard. My patch on top of yours merely turns a hypothetical arbitrary stack access into an arbitrary .data access, and I don't even know which one would be worse.
Arnd
On Tue, Mar 21, 2017 at 09:44:07AM +0100, Arnd Bergmann wrote:
> Ok. My only interest here is the warning about possible stack overflow, though the fact that KASAN considers the array code to be fragile is an indication that it is perhaps actually dangerous: if we ever run into a bug that causes the array index to overflow, we might in theory have a security bug that lets users access arbitrary kernel pointers.
>
> While the risk for that actually happening is very low, the original code was safer in that regard. My patch on top of yours merely turns a hypothetical arbitrary stack access into an arbitrary .data access, and I don't even know which one would be worse.
Even without these arrays, if userspace could control the index we feed into these you get arbitrary mmio access. Or semi-arbitrary at least.
None of these are bugs we should ever let through, and I think with the current code design (where the driver constructs structs that contain the right indices, and userspace only ever gets to point at these structs using an idr lookup) none of these are likely to happen.
-Daniel
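
To make the point about index arithmetic concrete: the two-register _PIPE() macro visible in the patch context computes an offset for any index, with no table involved. The sketch below restates it as a userspace function. pipe_offset() and check_pipe_offset() are illustrative names, and the out-of-range index values are made up for the demonstration.

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as _PIPE(pipe, a, b): a + pipe * (b - a). */
static uint32_t pipe_offset(uint32_t pipe, uint32_t a, uint32_t b)
{
	return a + pipe * (b - a);
}

/* Usage: valid indices hit the two registers; anything larger still
 * yields an offset, just not one the caller intended. */
static void check_pipe_offset(void)
{
	assert(pipe_offset(0, 0x6014, 0x6018) == 0x6014);
	assert(pipe_offset(1, 0x6014, 0x6018) == 0x6018);
	/* Index 2 lands on 0x601c, not on _CHV_DPLL_C (0x6030). */
	assert(pipe_offset(2, 0x6014, 0x6018) == 0x601c);
}
```

Nothing in the arithmetic rejects a bad index, so the protection has to come from how the index is produced, as described above.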
On Tue, 21 Mar 2017, Daniel Vetter <daniel@ffwll.ch> wrote:
> Even without these arrays, if userspace could control the index we feed into these you get arbitrary mmio access. Or semi-arbitrary at least.
>
> None of these are bugs we should ever let through, and I think with the current code design (where the driver constructs structs that contain the right indices, and userspace only ever gets to point at these structs using an idr lookup) none of these are likely to happen.
That's all true, but I'm curious if explicit checks would help kasan. Something like:
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 04c8f69fcc62..0ab32a05b5d8 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -48,7 +48,8 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 	return !i915_mmio_reg_equal(reg, INVALID_MMIO_REG);
 }
 
-#define _PICK(__index, ...) (((const u32 []){ __VA_ARGS__ })[__index])
+#define _PICK_NARGS(...) ARRAY_SIZE(((const u32 []){ __VA_ARGS__ }))
+#define _PICK(__index, ...) ((__index) >= 0 && (__index) < _PICK_NARGS(__VA_ARGS__) ? ((const u32 []){ __VA_ARGS__ })[__index] : 0)
 
 #define _PIPE(pipe, a, b) ((a) + (pipe)*((b)-(a)))
 #define _MMIO_PIPE(pipe, a, b) _MMIO(_PIPE(pipe, a, b))
---
Arnd, can you check that with kasan please? (I don't have gcc 7.) For me the size diff against current git is
    text    data     bss     dec     hex filename
-1137236   31211    2948 1171395  11dfc3 drivers/gpu/drm/i915/i915.ko
+1139702   31211    2948 1173861  11e965 drivers/gpu/drm/i915/i915.ko
BR, Jani.
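
The bounds-checked variant proposed above can be exercised in userspace with the following sketch. It is not the kernel macro: PICK_NARGS, PICK_CHECKED, dpll_checked() and check_dpll_checked() are illustrative names, plain sizeof arithmetic replaces ARRAY_SIZE, and the index is cast to size_t after the sign check to avoid a signed/unsigned comparison. Out-of-range indices fall back to 0, as in the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of variadic arguments, derived from the size of a compound
 * literal that is never evaluated (operand of sizeof). */
#define PICK_NARGS(...) \
	(sizeof((const uint32_t[]){ __VA_ARGS__ }) / sizeof(uint32_t))

/* Index the literal only when 0 <= idx < nargs; otherwise yield 0. */
#define PICK_CHECKED(idx, ...) \
	((idx) >= 0 && (size_t)(idx) < PICK_NARGS(__VA_ARGS__) ? \
	 ((const uint32_t[]){ __VA_ARGS__ })[idx] : 0)

/* Register offsets are the DPLL values from the earlier patch. */
static uint32_t dpll_checked(int pipe)
{
	return PICK_CHECKED(pipe, 0x6014, 0x6018, 0x6030);
}

/* Usage: in-range indices select a register, others return 0. */
static void check_dpll_checked(void)
{
	assert(dpll_checked(0) == 0x6014);
	assert(dpll_checked(2) == 0x6030);
	assert(dpll_checked(3) == 0);
	assert(dpll_checked(-1) == 0);
}
```

Whether the extra comparisons help or hurt code size under KASAN is exactly the open question here; this sketch only shows the behaviour, not the cost.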
On Tue, Mar 21, 2017 at 12:23 PM, Jani Nikula <jani.nikula@linux.intel.com> wrote:
> On Tue, 21 Mar 2017, Daniel Vetter <daniel@ffwll.ch> wrote:
>> On Tue, Mar 21, 2017 at 09:44:07AM +0100, Arnd Bergmann wrote:
>
> Arnd, can you check that with kasan please? (I don't have gcc 7.) For me the size diff against current git is
>
>     text    data     bss     dec     hex filename
> -1137236   31211    2948 1171395  11dfc3 drivers/gpu/drm/i915/i915.ko
> +1139702   31211    2948 1173861  11e965 drivers/gpu/drm/i915/i915.ko
Sorry for the late reply.
I was rather sure that I had done the numbers and replied to you earlier, but I see no evidence of that, so here it comes again, using gcc-7 and kasan:
   text    data     bss     dec     hex filename
2623339  511153   12064 3146556  30033c obj-x86/drivers/gpu/drm/i915/i915-original.o
2634886  511153   12064 3158103  303057 obj-x86/drivers/gpu/drm/i915/i915-linux-next.o
2617989  520561   12064 3150614  301316 obj-x86/drivers/gpu/drm/i915/i915-arndpatch.o
The first one is linux-next with ce64645d86ac ("drm/i915: use variadic macros and arrays to choose port/pipe based registers") reverted, the second one is the current version, and the third is with my patch applied on top.
Arnd
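
The three builds are easier to compare as deltas against the first row. The sketch below recomputes them from the numbers in the table. struct size_row and the helper names are illustrative; only the figures come from the message above.

```c
#include <assert.h>

struct size_row {
	const char *name;
	long text, data, bss;
};

static const struct size_row rows[] = {
	{ "i915-original.o",   2623339, 511153, 12064 },
	{ "i915-linux-next.o", 2634886, 511153, 12064 },
	{ "i915-arndpatch.o",  2617989, 520561, 12064 },
};

/* The "dec" column printed by size(1) is text + data + bss. */
static long total(const struct size_row *r)
{
	return r->text + r->data + r->bss;
}

static void check_deltas(void)
{
	/* ce64645d86ac grows .text by 11547 bytes over the reverted build. */
	assert(rows[1].text - rows[0].text == 11547);
	/* The patch shrinks .text by 5350 bytes but grows .data by 9408. */
	assert(rows[2].text - rows[0].text == -5350);
	assert(rows[2].data - rows[0].data == 9408);
	/* Net effect against the reverted build: 4058 bytes larger. */
	assert(total(&rows[2]) - total(&rows[0]) == 4058);
}
```

Relative to linux-next, the patched build is 16897 bytes smaller in .text and 9408 bytes larger in .data.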
On Tue, Mar 21, 2017 at 12:23 PM, Jani Nikula <jani.nikula@linux.intel.com> wrote:
> Arnd, can you check that with kasan please? (I don't have gcc 7.) For me the size diff against current git is
>
>     text    data     bss     dec     hex filename
> -1137236   31211    2948 1171395  11dfc3 drivers/gpu/drm/i915/i915.ko
> +1139702   31211    2948 1173861  11e965 drivers/gpu/drm/i915/i915.ko
I just revisited my old patch when I ran into the stack size warning once more, and realized I had not really answered your question earlier.
I compared your version to what is in 4.15-rc3 now and to my version. Yours produces the largest code size of the three and doesn't address the warnings we get, but it does cause additional warnings ("comparison of constant '3' with boolean expression is always true"), so that won't get us anywhere. Here are the numbers I get with gcc-8:
   text    data     bss     dec     hex filename
2500045  486453    6912 2993410  2dad02 i915-kasan-4_14.ko
2488028  497909    6912 2992849  2daad1 i915-kasan-arnd.ko
2508814  486453    6912 3002179  2dcf43 i915-kasan-jani.ko
1639798   63269    4448 1707515  1a0dfb i915-nokasan-4.15.ko
1635284   63269    4448 1703001  19fc59 i915-nokasan-arnd.ko
1648331   63269    4448 1716048  1a2f50 i915-nokasan-jani.ko
I'll resend my old patch with the original description since I can't easily reproduce it now without your original change, and the code has changed again in the meantime, so I had to slightly adapt my patch to still apply.
Arnd