[FE training-materials-updates] Boot time slides: add filesystems section
Michael Opdenacker
michael.opdenacker at free-electrons.com
Thu Jan 2 07:59:58 CET 2014
Repository : git://git.free-electrons.com/training-materials.git
On branch : master
Link : http://git.free-electrons.com/training-materials/commit/?id=4eb4147a113fc362347572282b125f503791f753
>---------------------------------------------------------------
commit 4eb4147a113fc362347572282b125f503791f753
Author: Michael Opdenacker <michael.opdenacker at free-electrons.com>
Date: Thu Jan 2 07:57:38 2014 +0100
Boot time slides: add filesystems section
Signed-off-by: Michael Opdenacker <michael.opdenacker at free-electrons.com>
>---------------------------------------------------------------
4eb4147a113fc362347572282b125f503791f753
.../boottime-course-outline.tex | 1 +
.../boottime-filesystems/boottime-filesystems.tex | 277 ++++++--------------
.../sysdev-flash-filesystems.tex | 2 +-
3 files changed, 84 insertions(+), 196 deletions(-)
diff --git a/slides/boottime-course-outline/boottime-course-outline.tex b/slides/boottime-course-outline/boottime-course-outline.tex
index e41b35b..856cfb8 100644
--- a/slides/boottime-course-outline/boottime-course-outline.tex
+++ b/slides/boottime-course-outline/boottime-course-outline.tex
@@ -4,6 +4,7 @@ Generic optimizations
\begin{itemize}
\item Principles
\item Measuring
+\item Filesystems
\item Userland
\item Kernel
\item Bootloader
diff --git a/slides/boottime-filesystems/boottime-filesystems.tex b/slides/boottime-filesystems/boottime-filesystems.tex
index 4c52c92..c133700 100644
--- a/slides/boottime-filesystems/boottime-filesystems.tex
+++ b/slides/boottime-filesystems/boottime-filesystems.tex
@@ -1,234 +1,121 @@
-\section{Kernel optimizations}
-
-\begin{frame}[fragile]
-\frametitle{Measure - Kernel initialization functions}
-To find out which kernel initialization functions are the longest to
-execute, add \code{initcall_debug} to the kernel command line.
-Here's what you get on the kernel log:
-\begin{block}{}
-\tiny
-\begin{verbatim}
-...
-[ 3.750000] calling ov2640_i2c_driver_init+0x0/0x10 @ 1
-[ 3.760000] initcall ov2640_i2c_driver_init+0x0/0x10 returned 0 after 544 usecs
-[ 3.760000] calling at91sam9x5_video_init+0x0/0x14 @ 1
-[ 3.760000] at91sam9x5-video f0030340.lcdheo1: video device registered @ 0xe0d3e340, irq = 24
-[ 3.770000] initcall at91sam9x5_video_init+0x0/0x14 returned 0 after 10388 usecs
-[ 3.770000] calling gspca_init+0x0/0x18 @ 1
-[ 3.770000] gspca_main: v2.14.0 registered
-[ 3.770000] initcall gspca_init+0x0/0x18 returned 0 after 3966 usecs
-...
-\end{verbatim}
-\end{block}
-It is probably a good idea to increase the log buffer size with
-\code{CONFIG_LOG_BUF_SHIFT} in your kernel configuration. You will
-also need \code{CONFIG_PRINTK_TIME} and \code{CONFIG_KALLSYMS}.
-\end{frame}
+\section{Filesystem optimizations}
\begin{frame}
-\frametitle{Kernel boot graph}
-With \code{initcall_debug}, you can generate a boot graph
-making it easy to see which kernel initialization functions
-take most time to execute.
+\frametitle{Filesystem impact on performance}
+Tuning the filesystem is usually one of the first things
+we work on in boot time projects.
\begin{itemize}
-\item Copy and paste the console output or the output of
- the \code{dmesg} command to a file (let's call it \code{boot.log})
-\item On your workstation, run the \code{scripts/bootgraph.pl} script
- in the kernel sources: \\
- \code{perl scripts/bootgraph.pl boot.log > boot.svg}
-\item You can now open the boot graph with a vector graphics
- editor such as \code{inkscape}:
+\item Different filesystems can have different initialization
+ and mount times. In particular, the type of filesystem
+ for the root filesystem directly impacts boot time.
+\item Different filesystems can exhibit different read, write
+ and access time performance, according to the type
+ of filesystem activity and to the type of files in the
+ system.
\end{itemize}
-\begin{center}
- \includegraphics[width=\textwidth]{slides/boottime-kernel/boot.png}
-\end{center}
\end{frame}
\begin{frame}
-\frametitle{Using the kernel boot graph (1)}
-Start working on the functions consuming most time first. For each
-function:
+\frametitle{Different filesystems for different storage types}
\begin{itemize}
-\item Look for its definition in the kernel source code. You can use LXR
- (for example \url{http://lxr.free-electrons.com}).
-\item Remove unnecessary functionality:
+\item Raw flash storage
\begin{itemize}
- \item Look for kernel parameters in C sources and Makefiles, starting
- with \code{_CONFIG}. Some settings for such parameters could help
- to remove code complexity or remove unnecessary features.
- \item Find which module (if any) it belongs to. Loading this module
- could be deferred.
+ \item JFFS2
+ \item YAFFS2
+ \item UBIFS
\end{itemize}
-\end{itemize}
-\end{frame}
-
-\begin{frame}
-\frametitle{Using the kernel boot graph (2)}
-\begin{itemize}
-\item Postpone:
+\item Block storage (including memory cards, eMMC)
\begin{itemize}
- \item Find which module (if any) the function belongs to.
- Load this module later if possible.
- \end{itemize}
-\item Optimize necessary functionality:
- \begin{itemize}
- \item Look for parameters which could be used to reduce probe time,
- looking for the \code{module_param} macro.
- \item Look for delay loops and calls to functions containing
- \code{delay} in their name, which could take more time than
- needed. You could reduce such delays, and see whether the
- code still works or not.
+ \item ext2, ext3, ext4
+ \item xfs, jfs, reiserfs
+ \item btrfs
+ \item f2fs
+ \item SquashFS
\end{itemize}
\end{itemize}
-\end{frame}
-
-\begin{frame}
-\frametitle{Reduce kernel size}
-First, we focus on reducing the size without removing features
-\begin{itemize}
- \item The main mechanism is to use kernel modules
- \item Compile everything that is not needed at boot time as a
- module
- \item Two benefits: the kernel will be smaller and load faster and
- less initialization code will get executed
- \item Remove features that are not used by userland:
- \code{CONFIG_KALLSYMS}, \code{CONFIG_DEBUG_FS},
- \code{CONFIG_BUG}
- \item Use features designed for embedded systems:
- \code{CONFIG_SLOB}, \code{CONFIG_EMBEDDED}
-\end{itemize}
-\end{frame}
-
-\begin{frame}
-\frametitle{Results}
-Before 8.54s
-\begin{center}
- \includegraphics[width=\textwidth]{slides/boottime-init-scripts2/timechart-initramfs.pdf}
-\end{center}
-After:
-\begin{center}
- \includegraphics[width=\textwidth]{slides/boottime-kernel/timechart-modules.pdf}
-\end{center}
-Total: 6.45s.
-\end{frame}
-
-\begin{frame}
-\frametitle{Kernel Compression}
-Depending on the balance between your storage reading speed and your
-CPU power to decompress the kernel, you will need to benchmark
-different compression algorithms.
+See our embedded Linux training materials for full details:
+{\small
+\url{http://free-electrons.com/doc/training/embedded-linux/}
+}
-Before (gzip): 6.45s.
-\begin{center}
- \includegraphics[width=\textwidth]{slides/boottime-kernel/timechart-modules.pdf}
-\end{center}
-After (LZO):
-\begin{center}
- \includegraphics[width=\textwidth]{slides/boottime-kernel/timechart-lzo.pdf}
-\end{center}
-Total: 6.46s.
-Conclusion: don't use LZO for now.
+See also our flash filesystem benchmarks:
+{\small
+\url{http://elinux.org/Flash_Filesystem_Benchmarks}.
+}
\end{frame}
\begin{frame}
-\frametitle{Deferred initcalls}
+\frametitle{JFFS2}
+For raw flash storage
\begin{itemize}
-\item If you can't compile a feature as a module (e.g. networking or block
- subsystem), try \code{deferred_initcalls}.
-\item Your kernel will not shrink but some initializations will be
- postponed. Once your critical application is ready, you can
- execute the remaining initcalls.
-\item See \url{http://elinux.org/Deferred_Initcalls}
+\item Mount time depends on filesystem size: the kernel has to
+ scan the whole filesystem at mount time, to read which block
+ belongs to each file.
+\item Need to use the \code{CONFIG_JFFS2_SUMMARY} kernel option
+ to store such information in flash. This dramatically reduces
+ mount time (from 16 s to 0.8 s for a 128 MB partition).
+\item Rather poor read and write performance (compared to YAFFS2 and
+ UBIFS)
\end{itemize}
\end{frame}
-\begin{frame}[fragile]
-\frametitle{Tuning the command line}
+\begin{frame}
+\frametitle{YAFFS2}
+For raw flash storage
\begin{itemize}
- \item At each boot, the Linux kernel calibrates a delay loop (for
- the udelay function). This measures a number of loops per
- jiffy ({\em lpj}) value. You just need to measure this once! Find
- the \code{lpj} value in the kernel boot messages:
-\begin{block}{}
-\small
-\begin{verbatim}
-Calibrating delay loop... 262.96 BogoMIPS (lpj=1314816)
-\end{verbatim}
-\end{block}
- Now, you can use the \code{lpj=<value>} argument. This saves
- around 180 ms on ARM.
- \item The console output is actually taking a lot of time. You
- probably don't need it in production. It can be disabled by
- passing the \code{quiet} argument on the kernel command line.
- You will still be able to use \code{dmesg} to get the
- messages.
+\item Good mount time
+\item Good read and write performance
+\item Drawbacks: no compression, not in the mainline Linux kernel
\end{itemize}
\end{frame}
\begin{frame}
- \frametitle{Multiprocessor support (SMP)}
- \begin{itemize}
- \item SMP is quite slow to initialize
- \item UP systems may be faster to boot
- \item What you can try is to hotplug the other cores after your critical application has started
- \end{itemize}
-\end{frame}
-
-\setuplabframe
-{Reduce kernel boot time}
-{
+\frametitle{UBIFS}
+For raw flash storage
\begin{itemize}
-\item Recompile the kernel, switching to an initramfs
-\item Use \code{initcall_debug} to find the biggest
- time consumers
-\item Reduce the number of modules
-\item Tune kernel command line parameters
+\item Not so good mount time, because of the time needed
+ to initialize UBI (\code{ubiattach} command in userspace).
+ Filesystem getting slower and slower as it gets older.
+\item Need \code{CONFIG_MTD_UBI_FASTMAP} (introduced in Linux 3.7) to
+ attach UBI in constant time, and get a good mount time.
+\item Good read and write performance (similar to YAFFS2)
+\item Other advantages: better for wear leveling (operates on the whole
+ flash storage, not only within a flash partition).
\end{itemize}
-}
-
-\begin{frame}
-\frametitle{Kernel Optimization results}
-Before (gzip): 6.45s.
-\begin{center}
- \includegraphics[width=\textwidth]{slides/boottime-kernel/timechart-modules.pdf}
-\end{center}
-After:
-\begin{center}
- \includegraphics[width=\textwidth]{slides/boottime-kernel/timechart-final.pdf}
-\end{center}
-Total: 5.77s. Without losing any functionality!
\end{frame}
\begin{frame}
-\frametitle{Kernel: last milliseconds (1)}
-To shave off the last milliseconds, you will probably want to remove
-unnecessary features:
+\frametitle{Block filesystems}
+For block storage
\begin{itemize}
- \item \code{CONFIG_PRINTK=n} will have the same effect as the
- \code{quiet} command line argument but you won't have
- any access to kernel messages. You will have a
- significantly smaller kernel though.
- \item Try \code{CONFIG_CC_OPTIMIZE_FOR_SIZE=y}. This will have
- an impact on performance, you will have to benchmark.
- \item Try to initialize less RAM by passing a \code{mem} value
- on the kernel command line. The less RAM you need to
- initialize, the faster you will boot.
+\item ext4: best for rather big partitions, good read and write
+ performance
+\item xfs, jfs, reiserfs: can be good in some read or write scenarios
+ as well.
+\item btrfs, f2fs: can achieve best read and write performance,
+ taking advantage of the characteristics of flash-based block
+ devices.
+\item SquashFS: best mount time and read performance, for read-only
+ partitions. Great for root filesystems which can be read-only.
\end{itemize}
\end{frame}
\begin{frame}
-\frametitle{Kernel last milliseconds (2)}
-More features you could remove:
+\frametitle{Finding the best filesystem}
\begin{itemize}
- \item Module loading/unloading
- \item Block layer
- \item Network stack
- \item USB stack
- \item Power management features
- \item \code{CONFIG_SYSFS_DEPRECATED}
- \item Input: keyboards / mice / touchscreens
- \item \code{CONFIG_LEGACY_PTY_COUNT} or the
- \code{pty.legacy_count} kernel parameter
+\item Raw flash storage: UBIFS with \code{CONFIG_MTD_UBI_FASTMAP} is
+ probably the best solution.
+\item Block storage: SquashFS is the best solution for root filesystems
+ which can be read-only. Btrfs and f2fs are probably the best solutions
+ for read/write filesystems.
+\item Fortunately, changing filesystem types is quite cheap,
+ and completely transparent for applications. Just try
+ several filesystem options, and see which one works best
+ for you!
+\item Do not focus only on boot time. \\
+ For systems in which read and write performance matters, we
+ recommend using a separate root filesystem (for quick
+ boot time) and separate data partitions (for good runtime performance).
\end{itemize}
\end{frame}
-
diff --git a/slides/sysdev-flash-filesystems/sysdev-flash-filesystems.tex b/slides/sysdev-flash-filesystems/sysdev-flash-filesystems.tex
index 9854896..85e25d9 100644
--- a/slides/sysdev-flash-filesystems/sysdev-flash-filesystems.tex
+++ b/slides/sysdev-flash-filesystems/sysdev-flash-filesystems.tex
@@ -196,7 +196,7 @@ Creating 5 MTD partitions on "omap2-nand.0":
belongs to each file.
\item Need to use the \code{CONFIG_JFFS2_SUMMARY} kernel option
to store such information in flash. This dramatically reduces
- mount time (from 16 s to 0.8s for a 128 MB partition).
+ mount time (from 16 s to 0.8 s for a 128 MB partition).
\end{itemize}
\item \url{http://www.linux-mtd.infradead.org/doc/jffs2.html}
\end{itemize}
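
The new slides suggest trying several filesystems and seeing which one works best. As a rough illustration only (not part of this commit), read performance could be compared with a small Python sketch like the one below. The function name time_tree_read and the chunk size are assumptions made up for this example. The script does not mount anything and does not drop the page cache, so it only gives a first-order comparison.

```python
import os
import time


def time_tree_read(root, chunk_size=1 << 20):
    """Read every regular file under root; return (bytes_read, seconds).

    Hypothetical helper for illustration: it measures wall-clock time
    for a sequential read of the whole tree, nothing more.
    """
    total = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files (devices, FIFOs, sockets).
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as f:
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
    elapsed = time.perf_counter() - start
    return total, elapsed


if __name__ == "__main__":
    import sys

    nbytes, seconds = time_tree_read(sys.argv[1] if len(sys.argv) > 1 else ".")
    print(f"{nbytes} bytes read in {seconds:.3f} s")
```

To compare candidates, one would populate each filesystem with the same contents, mount it, and run the script on the mount point. For cold-cache numbers, remount the filesystem or write 3 to /proc/sys/vm/drop_caches as root before each run. Mount time itself has to be measured separately, for example from the kernel log timestamps.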