|
|
|
|
|
|
|
|
|
|
|
\chapter{False Sharing, Race Conditions, and Schedules}
|
|
|
|
|
|
|
|
\section{False Sharing}
|
|
|
|
|
|
|
|
\begin{figure}[h!]
|
|
|
|
|
|
|
|
\centering
|
|
|
|
|
|
|
|
\includegraphics[width=0.8\textwidth]{false_sharing.png}
|
|
|
|
|
|
|
|
\caption{Infographic (created in Inkscape) illustrating false sharing}
|
|
|
|
|
|
|
|
\end{figure}
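The situation in the figure can be sketched in code. In the following minimal C++ sketch (all names are illustrative), two threads update two logically independent counters. Without padding, both counters typically land in the same 64-byte cache line, so each write invalidates the other core's copy of the line; \texttt{alignas(64)} places each counter on its own line and removes the false sharing.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Both counters usually share one 64-byte cache line: the threads
// contend on the line even though they never touch the same data.
struct SharedLine {
    std::atomic<std::uint64_t> a{0};
    std::atomic<std::uint64_t> b{0};
};

// alignas(64) forces each counter onto its own cache line
// (64 bytes is a common line size), eliminating false sharing.
struct PaddedLines {
    alignas(64) std::atomic<std::uint64_t> a{0};
    alignas(64) std::atomic<std::uint64_t> b{0};
};

// Each thread increments its own counter; the result is identical
// for both layouts, only the cache traffic differs.
template <typename Counters>
std::uint64_t run(Counters& c, std::uint64_t iters) {
    std::thread t1([&] {
        for (std::uint64_t i = 0; i < iters; ++i)
            c.a.fetch_add(1, std::memory_order_relaxed);
    });
    std::thread t2([&] {
        for (std::uint64_t i = 0; i < iters; ++i)
            c.b.fetch_add(1, std::memory_order_relaxed);
    });
    t1.join();
    t2.join();
    return c.a.load() + c.b.load();
}
```

Timing \texttt{run} for both layouts on a multicore machine typically shows the padded variant running noticeably faster, even though the computed result is the same.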
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
\pagebreak
|
|
|
|
|
|
|
|
\section{Computer Performance after Moore's Law}
|
|
|
|
|
|
|
|
The figure \enquote{Performance gains after Moore's law ends}\footnote{From page 1 of the paper \href{https://www.microsoft.com/en-us/research/wp-content/uploads/2020/11/Leiserson-et-al-Theres-plenty-of-room-at-the-top.pdf}{\enquote{There's plenty of room at the Top: What will drive computer performance after Moore's law?}} by Charles E. Leiserson, Neil C. Thompson, Joel S. Emer, Bradley C. Kuszmaul, Butler W. Lampson, Daniel Sanchez, and Tao B. Schardl} depicts the computing stack: the layered model of components, from hardware at the bottom to software at the top. According to {\bfseries Moore's Law}, the bottom of the computing stack would continuously improve computing performance by doubling the number of transistors on a chip approximately every two years. However, we are now approaching the physical limits of that approach.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This means that further performance improvements can no longer come from the bottom layer but must instead come from the top layers. The figure identifies three main top-level components of the computing stack that can continue to improve computing performance.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
\paragraph{Software}
|
|
|
|
|
|
|
|
The paper, as well as the infographic, describes several ways to improve efficiency through {\bfseries software performance engineering}. One is restructuring or refactoring code: removing {\bfseries software bloat} (inefficient solutions that were fast to implement) and tailoring code to the hardware it runs on (e.g., parallel processors or accelerators).
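A classic instance of tailoring code to the hardware is choosing a memory access order that matches the memory layout. The following sketch (illustrative, not from the paper) sums a row-major matrix twice: both functions compute the same result, but the row-major traversal walks memory contiguously and therefore uses the cache far better.

```cpp
#include <cstddef>
#include <vector>

// Matrix stored row-major as one flat n x n vector.

// Stride-1 access: consecutive iterations touch adjacent memory,
// so each fetched cache line is fully used.
double sum_row_major(const std::vector<double>& m, std::size_t n) {
    double s = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            s += m[i * n + j];
    return s;
}

// Stride-n access: each iteration jumps n elements ahead, touching
// a new cache line almost every time. Same result, far more misses.
double sum_col_major(const std::vector<double>& m, std::size_t n) {
    double s = 0.0;
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t i = 0; i < n; ++i)
            s += m[i * n + j];
    return s;
}
```

For matrices larger than the cache, the row-major version is typically several times faster despite performing exactly the same additions.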
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
\paragraph{Algorithms}
|
|
|
|
|
|
|
|
Algorithmic progress is less consistent but still remains important for performance improvement. At this point in time, improvements mostly come from {\bfseries new algorithms} for {\bfseries new problem domains} (e.g., machine learning), and from improved theoretical machine models that do not idealise the hardware but instead realistically reflect, for example, the differing memory access times across the memory hierarchy, or multiple and specialised processing units.
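An algorithm designed for a realistic machine model accounts for the memory hierarchy rather than assuming uniform-cost memory. As a small illustrative sketch (names and the tile size are assumptions, not from the paper), compare a naive matrix transpose with a blocked one: on an idealised model both cost the same, but the blocked version keeps its working set within the cache.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Naive transpose: one of the two matrices is always accessed with
// stride n, which an idealised machine model treats as free but a
// real memory hierarchy does not.
void transpose_naive(const std::vector<int>& src,
                     std::vector<int>& dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            dst[j * n + i] = src[i * n + j];
}

// Blocked transpose: copies B x B tiles so that the touched parts of
// both matrices fit in the cache while a tile is being processed.
void transpose_blocked(const std::vector<int>& src,
                       std::vector<int>& dst, std::size_t n,
                       std::size_t B = 32) {
    for (std::size_t ii = 0; ii < n; ii += B)
        for (std::size_t jj = 0; jj < n; jj += B)
            for (std::size_t i = ii; i < std::min(ii + B, n); ++i)
                for (std::size_t j = jj; j < std::min(jj + B, n); ++j)
                    dst[j * n + i] = src[i * n + j];
}
```

Both functions produce identical output; the difference only shows up in cache-miss counts and, for large matrices, in running time.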
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
\paragraph{Hardware architecture}
|
|
|
|
|
|
|
|
The last component with opportunities to improve performance is the {\bfseries hardware architecture}. Just as we can tailor software to utilise the hardware, we can tailor hardware to better suit the software it is supposed to run. This can be accomplished through {\bfseries processor simplification}, where physical resources are effectively reallocated to prioritise parallelism (e.g., many simpler cores), or through {\bfseries hardware tailoring}, creating domain-specific architectures for specific problems (e.g., reducing floating-point precision where it is not needed).
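The precision-reduction idea can be made concrete from the software side. In the following minimal sketch (the helper name is an assumption for illustration), rounding a \texttt{double} through a \texttt{float} keeps roughly 7 significant decimal digits instead of about 16; when that is enough for the application, halving the operand width halves memory traffic and lets the same silicon area hold more arithmetic units.

```cpp
#include <cmath>

// Simulates reduced-precision hardware: round a 64-bit value through
// a 32-bit float and check whether the relative error stays within a
// tolerance the application can accept.
bool close_enough(double x, double tol) {
    float reduced = static_cast<float>(x);   // ~7 significant digits survive
    return std::fabs(static_cast<double>(reduced) - x) <= tol * std::fabs(x);
}
```

A value like $\pi$ survives the round-trip to within a relative error of about $10^{-8}$, which is ample for many graphics or machine-learning workloads but not for, say, long-running numerical integration.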
|