• Zacryon@feddit.de
      1 year ago

      Yes. Also, the number of clock cycles required depends a lot on the individual CPU architecture.

      For example, division: some CPUs have hardwired logic that computes the division directly at the circuit level. Others are basically running a for loop with subtraction. The difference in required clock cycles for a division operation can then be huge.
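
To make the "for loop with subtraction" idea concrete, here is a minimal sketch in Python. It is not how any particular CPU or runtime library implements division (real software routines are usually smarter, e.g. shift-and-subtract); the function name `divide_by_subtraction` is made up for illustration. It just shows that the amount of work grows with the size of the quotient, while a hardware divider does the job in a roughly fixed number of cycles.

```python
def divide_by_subtraction(dividend: int, divisor: int) -> tuple[int, int, int]:
    """Naive division for non-negative integers.

    Returns (quotient, remainder, loop_iterations).
    Hypothetical illustration only, not a real CPU routine.
    """
    if dividend < 0 or divisor <= 0:
        raise ValueError("expects dividend >= 0 and divisor > 0")

    quotient = 0
    remainder = dividend
    iterations = 0
    # Keep subtracting the divisor until what is left is smaller than it.
    while remainder >= divisor:
        remainder -= divisor
        quotient += 1
        iterations += 1
    return quotient, remainder, iterations


print(divide_by_subtraction(1000, 7))  # (142, 6, 142)
```

The third value is the number of loop passes: 142 subtractions for one division, and it gets worse the larger the quotient is.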

      Another example: is it a scalar or superscalar CPU?

      A rather obvious example: the bit width of the CPU. 32-bit systems process 64-bit data much less efficiently than 64-bit systems, because each 64-bit operation has to be split into several 32-bit ones.
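
As a rough sketch of why that is, here is a 64-bit addition done using only 32-bit-sized pieces, the way a 32-bit machine has to do it (typically an add followed by an add-with-carry). Python integers are arbitrary precision, so the masking below only simulates 32-bit registers; `add64_with_32bit_ops` and `MASK32` are names I made up for this example.

```python
MASK32 = 0xFFFFFFFF  # simulates the width of a 32-bit register


def add64_with_32bit_ops(a: int, b: int) -> int:
    """Add two unsigned 64-bit values using 32-bit halves plus a carry."""
    # Split each operand into a low and a high 32-bit word.
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    # First add: low words. Anything above bit 31 is the carry.
    lo_sum = a_lo + b_lo
    lo = lo_sum & MASK32
    carry = lo_sum >> 32

    # Second add: high words plus the carry, wrapping at 64 bits total.
    hi = (a_hi + b_hi + carry) & MASK32

    return (hi << 32) | lo


print(hex(add64_with_32bit_ops(0xFFFFFFFF, 1)))  # 0x100000000
```

A 64-bit CPU does this in a single add. Multiplication and division get split into even more steps than addition does.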

      Then there is other stuff like branch prediction, or system dependencies like memory bus width and clock, cache size and associativity etc. etc…

      Long story short: When evaluating the performance of code, multiple performance metrics have to be considered simultaneously and prioritized according to the development goals.

      Lines of code is usually a veeery bad metric. (I sometimes spend hours just to write a few lines of code. But those are good ones then.) Cycles per code segment is better, but still not good (unless you are developing for a very specific target system). Do benchmarking and profiling, run the code on different systems, and maybe design individual performance metrics based on your expectations.
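
A minimal benchmarking sketch with Python's standard `timeit` module, assuming you just want to compare two implementations of the same thing on whatever machine you run it on. The functions `sum_loop`, `sum_builtin` and `benchmark` are made up for the example. I deliberately print no numbers here, since they depend entirely on your system, which is kind of the point.

```python
import timeit


def sum_loop(n: int) -> int:
    """More lines of code: explicit loop."""
    total = 0
    for i in range(n):
        total += i
    return total


def sum_builtin(n: int) -> int:
    """Fewer lines of code: built-in sum()."""
    return sum(range(n))


def benchmark(func, arg, repeat: int = 5, number: int = 100) -> float:
    """Return the best observed time per call in seconds.

    Taking the minimum over several repeats reduces noise from
    other processes running on the machine.
    """
    timings = timeit.repeat(lambda: func(arg), repeat=repeat, number=number)
    return min(timings) / number


for func in (sum_loop, sum_builtin):
    seconds = benchmark(func, 10_000)
    print(f"{func.__name__}: {seconds * 1e6:.1f} µs per call")
```

Run it on a few different machines and interpreter versions and compare the numbers. For finding out where the time goes inside a larger program, a profiler such as `cProfile` is the better tool.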