Re: [PATCH 2/3] find: micro-optimize for_each_{set,clear}_bit()

19 Jun 2021

On Fri, 18 Jun 2021 20:57:34 +0100,
Yury Norov <yury.norov@gmail.com> wrote:
...
The macros iterate through all set/clear bits in a bitmap. They search for
the first bit using find_first_bit(), and the remaining bits using
find_next_bit(). Since find_next_bit() is called shortly after
find_first_bit(), we can save a few lines of I-cache by not using
find_first_bit().
Really?
...
Signed-off-by: Yury Norov <yury.norov@gmail.com>
include/linux/find.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/find.h b/include/linux/find.h
index 4500e8ab93e2..ae9ed52b52b8 100644
--- a/include/linux/find.h
+++ b/include/linux/find.h
@@ -280,7 +280,7 @@ unsigned long find_next_bit_le(const void *addr, unsigned
 #endif
 #define for_each_set_bit(bit, addr, size) \
-	for ((bit) = find_first_bit((addr), (size));		\
+	for ((bit) = find_next_bit((addr), (size), 0);		\

On which architecture do you observe a gain? Only 32bit ARM and m68k
implement their own version of find_first_bit(), and everyone else
uses the canonical implementation:
#ifndef find_first_bit
#define find_first_bit(addr, size) find_next_bit((addr), (size), 0)
#endif
These architectures explicitly have different implementations for
find_first_bit() and find_next_bit() because they can do better
(whether that is true or not is another debate). I don't think you
should remove this optimisation until it has been measured on these
two architectures.
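
To make the point about the canonical implementation concrete, here is a
minimal userspace sketch, not kernel code. Every sketch_* name is hypothetical,
the bit search is a naive stand-in for the real find_next_bit(), and the loop
condition and increment are assumed, since the quoted hunk only shows the first
line of the macro. It models only the generic fallback, so it says nothing
about 32bit ARM or m68k, where a separate find_first_bit() exists and only a
measurement can settle the question.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_MAX_BITS 128UL

/*
 * Naive stand-in for find_next_bit(): index of the first set bit at or
 * after 'offset', or 'size' if there is none.
 */
static unsigned long sketch_find_next_bit(const unsigned long *addr,
					  unsigned long size,
					  unsigned long offset)
{
	unsigned long i;

	for (i = offset; i < size; i++) {
		if (addr[i / SKETCH_BITS_PER_LONG] &
		    (1UL << (i % SKETCH_BITS_PER_LONG)))
			return i;
	}
	return size;
}

/* Same shape as the generic fallback quoted above. */
#ifndef sketch_find_first_bit
#define sketch_find_first_bit(addr, size) \
	sketch_find_next_bit((addr), (size), 0)
#endif

/* Loop as it looks before the patch (condition and step are assumed). */
#define sketch_for_each_set_bit_old(bit, addr, size)			\
	for ((bit) = sketch_find_first_bit((addr), (size));		\
	     (bit) < (size);						\
	     (bit) = sketch_find_next_bit((addr), (size), (bit) + 1))

/* Loop as it looks after the patch (condition and step are assumed). */
#define sketch_for_each_set_bit_new(bit, addr, size)			\
	for ((bit) = sketch_find_next_bit((addr), (size), 0);		\
	     (bit) < (size);						\
	     (bit) = sketch_find_next_bit((addr), (size), (bit) + 1))

/* Returns 1 if both loop forms visit the same bit indices, else 0. */
static int sketch_same_walk(const unsigned long *map, unsigned long size)
{
	unsigned long old_bits[SKETCH_MAX_BITS], new_bits[SKETCH_MAX_BITS];
	unsigned long bit, i, n_old = 0, n_new = 0;

	assert(size <= SKETCH_MAX_BITS);

	sketch_for_each_set_bit_old(bit, map, size)
		old_bits[n_old++] = bit;
	sketch_for_each_set_bit_new(bit, map, size)
		new_bits[n_new++] = bit;

	if (n_old != n_new)
		return 0;
	for (i = 0; i < n_old; i++) {
		if (old_bits[i] != new_bits[i])
			return 0;
	}
	return 1;
}
```

With the fallback in place, the two macros expand to the same token sequence
after preprocessing, so sketch_same_walk() returning 1 is expected by
construction. That is the crux: where find_first_bit() is just the fallback,
the patch changes nothing, and where it is not, the change drops an
arch-specific routine.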
Thanks,
M.
-- 
Without deviation from the norm, progress is not possible.

    
