Study Notes of CS:APP
• Textbooks
• Randal E. Bryant and David R. O'Hallaron, Computer Systems: A Programmer's Perspective, Third Edition, Pearson, 2016
• Courses
• 15-213 Instances
• 15-213/18-213 Fall 2019 (Randy's latest)
• http://www.cs.cmu.edu/~213 (original)
• Five system courses
• 嵌入式与Linux那些事's cnblogs posts
• 北洛's cnblogs posts
• FannieGirl's cnblogs posts
• 頔潇's CSDN posts
• 从园客博开始's cnblogs post
• gou4shi1's WordPress post
• Sebdog's CSDN post
• Other course variations
• SJTU ICS SE101 2019
• UW CSE351: H/S Interface
• Comparisons of some course variations
Chapter | CSE 351: H/SI | 15-213: ICS | SE 101: ICS
1 | 1.0-1.10 | 1 | 1.1-1.2, 1.4.1, 1.7-1.8, 1.9.2
2 | 2.0-2.4.3 | 2.1-2.3 | 2.1-2.3
3 | 3.2-3.5.3, 3.6.0-3.6.5, 3.6.7-3.7.5, 3.8-3.10 | 3.1-3.10 | 3.1, 3.4, 3.6.1-3.6.8, 3.7-3.11
4 | 4.1-4.5.9 | |
5 | 5 | 5.1-5.14 |
6 | 6.0, 6.2-6.7 | 6.1-6.7 | 6.1-6.6
7 | 7.1-7.12 | |
8 | 8.0-8.4 | 8.1-8.8 | 8.1-8.6
9 | 9.0-9.7, 9.9-9.12 | 9.1-9.12 | 9.1-9.11
10 | 10 | 10.1-10.11 |
11 | 11.1-11.6 | 11.1-11.6 |
12 | 12.1-12.8 | 12.1-12.7 |
• Compiler Explorer
• Easy review
• Sharing
• Content adapted to conditions in China and to my personal needs
Chapter | Phase 1 (New) | Phase 2 (Review) | Phase 2 (New) | …
1 | 1.0-1.10 | 1.0-1.10 | |
2 | 2.0-2.3, | 2.0-2.3, 2.5 | 2.4-2.4.3, 2.4.4.1, |
3 | 3.0-3.10, | 3.0-3.10, 3.12 | |
4 | 4.0 | 4.0 | |
5 | 5.0 | 5.0 | |
6 | 6.0-6.4, | | |
7 | 7.0 | | |
8 | 8.0-8.6, | | |
9 | 9.0-9.7, | | |
10 | 10.0-10.12 | | |
11 | 11.0-11.7 | | |
12 | 12.0-12.8 | | |
• Every time I learn something:
• First, read the book on MoonReader with the assistance of eudic and mark the key points.
• Second, watch the corresponding 15-213 video(s) if available and refine the notes.
• Any instance of a bug listed in the book errata will be corrected.
• WARNING: Earlier versions are not readable enough!
• Environment
• Microsoft Word 2019
• Organization
• Lies between that of a typical article and that of a typical presentation
• Levels
• Almost identical to those of the global version
• What's the biggest difference
• + Heading 6
• Formatting
• Follow the style scheme of the global edition as closely as possible, to the degree that formatting does not become a burden on me.
• New/noteworthy conventions
• Heading 6: Times New Roman, bold
• Colors: #000000, #00AEEF, #767676
• Date updated: [yy-mm]
• Code: Courier New
• Terminology: bold
• Highlight (primary): italic
• Highlight (secondary): underline (deprecated)
• Screenshots
• Captured at 200% zoom
• Displayed at 55% of the 200% capture
• NOTE: Not fully applied yet for the reasons explained below
• On Word
• File – Share – Post to Blog
• Run some macros that add marks for additional styles
• But Normal.dotm was permanently deleted by accident :(
• Publish
• On VS Code
• Paste the HTML source from cnblogs
• On a simple Java program
• Written by me
• A little bit unstable but pretty efficient
• Perform some replacements, to
• Recover the additional styles
• Adjust text size and font for Web reading
• Remove useless attributes
• Read more.
• Refine my notes.
• Word-count goal: ~100 words per noted page
• Properly decide the length of list items
• Remove all derivations
• Rewrite chapter summaries
• …
• Rewrite Word macros
• A computer system consists of hardware and systems software that work together to run application programs.
• A program begins life as a source program (or source file).
• A source program is a sequence of bits, each with a value of 0 or 1
• Organized in 8-bit chunks called bytes; each byte represents some text character in the program
• Most computer systems represent text characters using ASCII
• Text files: files that consist exclusively of ASCII characters
• Binary files: all other files
• All information in a system is represented as a bunch of bits.
• The only thing that distinguishes different data objects is the context in which we view them.
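• A minimal C sketch (my own illustration, not from the book) of "bits + context": the same 4-byte pattern is a large integer in one context and 1.0 in another.

    #include <stdio.h>
    #include <string.h>

    int main(void) {
        /* The same 4 bytes mean different things in different contexts:
         * as an unsigned integer they are 1065353216; reinterpreted as
         * an IEEE-754 float they are 1.0. */
        unsigned int bits = 0x3F800000;
        float f;
        memcpy(&f, &bits, sizeof f);   /* copy the bits, not the value */
        printf("as int:   %u\n", bits);
        printf("as float: %g\n", f);
        return 0;
    }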
• Compilation system: the programs that collectively perform the four phases (preprocessor, compiler, assembler, and linker).
• Preprocessing phase
• Preprocessor (cpp) modifies hello.c according to directives that begin with '#'
• Result: hello.i C program
• Compilation phase
• Compiler (cc1) translates hello.i into hello.s assembly-language program
• Assembly language provides a common output language for different compilers for different high-level languages.
• Assembly phase
• Assembler (as) translates hello.s into machine language instructions, packages them in a relocatable object program (binary)
• Result: hello.o object file
• Linking phase
• printf
• Part of the standard C library provided by every C compiler
• Resides in a separate precompiled object file printf.o
• Linker (ld) merges printf.o with hello.o
• Result: hello executable object file (an executable object file, or simply executable, binary) ready to be loaded into memory and executed by the system
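• On a typical Linux system with GCC, the four phases can be run one at a time; a sketch using the book's hello.c (the flags are standard GCC options, though the driver normally hides these steps):

    /* hello.c
     * Each phase of the compilation system can be invoked separately:
     *   gcc -E hello.c -o hello.i   (preprocessing: cpp)
     *   gcc -S hello.i -o hello.s   (compilation: cc1)
     *   gcc -c hello.s -o hello.o   (assembly: as)
     *   gcc hello.o -o hello        (linking: ld)
     */
    #include <stdio.h>

    int main(void) {
        printf("hello, world\n");
        return 0;
    }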
• Some important reasons why programmers need to understand how compilation systems work:
• Optimizing program performance
• Understanding link-time errors
• Avoiding security holes
• Buffer overflow vulnerabilities
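• A minimal sketch (my own example; the function names are made up) of the kind of bug meant by "buffer overflow":

    #include <stdio.h>
    #include <string.h>

    /* Writes past the end of buf whenever input needs more than 16
     * bytes, corrupting whatever lives next to buf on the stack. */
    void vulnerable(const char *input) {
        char buf[16];
        strcpy(buf, input);                       /* no bounds check */
        printf("%s\n", buf);
    }

    /* A safer variant: snprintf never writes more than sizeof buf. */
    void safer(const char *input) {
        char buf[16];
        snprintf(buf, sizeof buf, "%s", input);
        printf("%s\n", buf);
    }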
• Shell: command-line interpreter that prints a prompt, waits for you to type a command line, and then performs the command.
• If the first word of the command line does not correspond to a built-in shell command, then the shell assumes that it is the name of an executable file.
• Buses: collection of electrical conduits running throughout the system that carry bytes of information back and forth between the components.
• Typically designed to transfer words
• Word: fixed-size chunk of bytes
• Word size: # bytes in a word
• Fundamental system parameter that varies across systems.
• Most machines: either 4 bytes (32 bits) or 8 bytes (64 bits).
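• A quick check of the word size on the machine at hand (my own sketch; the expected sizes assume typical ILP32/LP64 ABIs, and 64-bit Windows differs for long):

    #include <stdio.h>

    int main(void) {
        /* Prints 8 on typical 64-bit systems, 4 on 32-bit systems. */
        printf("sizeof(void *) = %zu\n", sizeof(void *));
        printf("sizeof(long)   = %zu\n", sizeof(long));
        return 0;
    }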
• Input/output (I/O) devices: system's connection to the external world
• Example system has 4:
• Keyboard and mouse
• Display
• Disk drive (or simply disk)
• Each is connected to the I/O bus by either a controller or an adapter.
• The distinction: packaging
• Motherboard: system's main printed circuit board
• Controllers: chip sets in the device itself or on the motherboard
• Adapter: card that plugs into a slot on the motherboard
• Purpose of each: to transfer information back and forth between the I/O bus and an I/O device.
• Main memory: temporary storage device that holds both a program and the data it manipulates while the processor is executing the program.
• Physically, consists of a collection of dynamic random-access memory (DRAM) chips.
• Logically, organized as a linear array of bytes, each with its own unique address (array index) starting at 0.
• Central processing unit (CPU), or simply processor: engine that executes (interprets) instructions stored in main memory.
• Program counter (PC): register at its core
• Register: word-size storage device
• At any point in time, points at (contains the address of) some machine-language instruction in main memory.
• Processor repeatedly
• Executes the instruction pointed at by the PC
• Updates the PC to point to the next instruction
• Processor appears to operate according to a very simple instruction execution model, defined by its instruction set architecture.
• Instructions execute in strict sequence, and executing a single instruction involves performing a series of steps.
• Operations revolve around:
• Main memory
• Register file: small storage device that consists of a collection of word-size registers, each with its own unique name
• Arithmetic/logic unit (ALU): Computes new data and address values
• Examples of operations:
• Load: Copy a byte or a word from main memory into a register, overwriting the previous contents of the register.
• Store: Copy a byte or a word from a register to a location in main memory, overwriting the previous contents of that location.
• Operate: Copy the contents of two registers to the ALU, perform an arithmetic operation on the two words, and store the result in a register, overwriting the previous contents of that register.
• Jump: Extract a word from the instruction itself and copy that word into the PC, overwriting the previous value of the PC.
• ISA describes the effect of each machine-code instruction
• Microarchitecture describes how the processor is actually implemented
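• A toy interpreter loop (my own sketch, not the book's model machine) showing the execution model: fetch the instruction at the PC, execute it, update the PC, with a jump that overwrites the PC outright.

    #include <stdio.h>

    enum { LOAD, ADD, JUMP, HALT };

    typedef struct { int op; int arg; } Instr;

    int main(void) {
        Instr mem[] = {          /* a tiny "main memory" of instructions */
            {LOAD, 5},           /* reg = 5                  */
            {ADD,  3},           /* reg += 3                 */
            {JUMP, 4},           /* pc = 4 (skip next instr) */
            {ADD,  100},         /* skipped                  */
            {HALT, 0},
        };
        int pc = 0, reg = 0;     /* program counter and one register */
        for (;;) {
            Instr in = mem[pc++];            /* fetch, point past it */
            switch (in.op) {
            case LOAD: reg = in.arg;  break;
            case ADD:  reg += in.arg; break;
            case JUMP: pc = in.arg;   break; /* overwrite the PC */
            case HALT: printf("reg = %d\n", reg); return 0;
            }
        }
    }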
• Using direct memory access (DMA), the data travel directly from disk to main memory, without passing through the processor.
• Lesson: A system spends a lot of time moving information from one place to another.
• Much of this copying is overhead: it slows down the "real work" of the program.
• Major goal for system designers: to make these copy operations run as fast as possible.
• Processor–memory gap: [22-01]
• Register file stores far fewer bytes of information than main memory.
• Processor can read data from register file much faster than from memory.
• Cache memories (or simply caches): smaller, faster storage devices that serve as temporary staging areas for information that the processor is likely to need in the near future.
• L1 caches
• On the processor chip
• Holds tens of thousands of bytes
• Can be accessed nearly as fast as the register file
• L2 caches
• Larger
• Hundreds of thousands to millions of bytes
• Connected to the processor by a special bus
• Still much faster to access than main memory
• L3 caches
• Included in newer and more powerful systems
• Implemented with static random-access memory (SRAM, a hardware technology)
• Idea behind: Exploit locality to get a very large and fast memory
• Locality: tendency for programs to access data and code in localized regions
• Set up caches to hold data that are likely to be accessed often
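• A standard C illustration of locality (the book develops this in chapter 6; the function names here are mine): both functions compute the same sum, but the row-wise version runs much faster on typical hardware because it works with the cache rather than against it.

    #define N 1024
    static int a[N][N];

    /* Row-major traversal: consecutive accesses touch adjacent memory,
     * so each cache line fetched is fully used (good spatial locality). */
    long sum_rowwise(void) {
        long sum = 0;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                sum += a[i][j];
        return sum;
    }

    /* Column-major traversal: successive accesses are N*sizeof(int)
     * bytes apart, so most accesses miss (poor spatial locality). */
    long sum_colwise(void) {
        long sum = 0;
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                sum += a[i][j];
        return sum;
    }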
• Storage devices in a system are organized as a memory hierarchy
• [Figure: the memory hierarchy]
• Main idea: Storage at one level serves as a cache for storage at the next lower level
• Operating system: layer of software interposed between the application program and the hardware.
• All attempts by an application program to manipulate the hardware must go through the operating system.
• Two primary purposes:
(1) to protect the hardware from misuse by runaway applications
(2) to provide applications with simple and uniform mechanisms for manipulating complicated and often wildly different low-level hardware devices
• Both goals are achieved via the fundamental abstractions: processes, virtual memory, and files
Overview
• Process: OS's abstraction for a running program
• Multiple processes can run concurrently on the same system, and each process appears to have exclusive use of the hardware.
• Concurrently: The instructions of one process are interleaved with the instructions of another process.
• Single CPU can appear to execute multiple processes concurrently by having the processor switch among them
• OS performs this interleaving with a mechanism known as context switching.
• Consider only a uniprocessor system
• Uniprocessor system contains a single CPU
Context
• Context: all the state information that the process needs in order to run
• OS keeps track of the context.
• Including information such as:
• Current values of the PC
• Register file
• Contents of main memory
Context Switching
• At any point in time, uniprocessor system can only execute the code for a single process.
• When OS decides to transfer control from current process to some new process, it performs a context switch by:
• 1. saving context of current process
• 2. restoring context of new process
• 3. passing control to new process
• New process picks up exactly where it left off.
• Basic idea for hello scenario:
• [Figure: context switching between the shell process and the hello process]
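• A minimal POSIX sketch (mine) of two concurrent processes: the OS context-switches between them, and each appears to have exclusive use of the hardware.

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();              /* one process becomes two */
        if (pid == 0) {
            printf("child:  pid %d\n", (int)getpid());
        } else {
            printf("parent: pid %d\n", (int)getpid());
            wait(NULL);                  /* reap the child */
        }
        return 0;
    }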
The Operating System Kernel
• Transition from one process to another is managed by the OS kernel.
• Kernel: portion of the OS code always resident in memory.
• When application program requires some action by OS, it executes a special system call instruction, transferring control to the kernel.
• Kernel then performs the requested operation and returns to the application program.
• Note: Kernel is:
• Not a separate process
• Instead, a collection of code and data structures that the system uses to manage all the processes.
• A process can actually consist of multiple threads (execution units)
• Each running in the context of the process and sharing the same code and global data
• Increasingly important programming model, because of:
• The requirement for concurrency in network servers
• The ease of sharing data between multiple threads (easier than between multiple processes)
• The greater efficiency of threads compared to processes
• Multi-threading can also make programs run faster when multiple processors are available (see the pthreads sketch below)
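• A minimal pthreads sketch (mine; compile with gcc -pthread) of one process containing two threads that share the same global data:

    #include <pthread.h>
    #include <stdio.h>

    static int shared = 0;               /* visible to every thread */

    void *worker(void *arg) {
        shared++;                        /* shared state: real code
                                            would need a lock here */
        printf("hello from thread %ld\n", (long)arg);
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, (void *)1L);
        pthread_create(&t2, NULL, worker, (void *)2L);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("shared = %d\n", shared);
        return 0;
    }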
• Virtual memory: abstraction that provides each process with the illusion that it has exclusive use of main memory.
Virtual Address Space
• Virtual address space: the uniform view of memory that each process has
• Topmost region is reserved for OS code and data that are common to all processes
• Lower region holds the code and data defined by the user's process.
• [Figure: the virtual address space of a process]
• Areas
• Program code and data
• Code begins at the same fixed address for all processes, followed by data locations that correspond to global C variables.
• Initialized directly from the contents of an executable object file.
• Linking and loading
• Heap (run-time heap)
• Expands and contracts dynamically at run time as a result of calls to C standard library routines such as malloc and free.
• Managing virtual memory
• Shared libraries
• Code and data such as the C standard library and the math library.
• Dynamic linking
• Stack (user stack)
• Used by the compiler to implement function calls.
• Expands and contracts dynamically during the execution of the program.
• In particular, each time we call a function, it grows; each time we return from a function, it contracts
• Kernel virtual memory
• Top region reserved
• Applications are not allowed to read or write the contents of this area or to directly call functions defined in the kernel code; they must invoke the kernel to perform these operations.
Summary
• Basic idea that makes virtual memory work: store the contents of a process's virtual memory on disk and use main memory as a cache for the disk.
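• A small C probe (mine) that hints at the layout above: on a typical Linux system the printed addresses rise from code, to data, to heap, with the stack far above (exact values vary run to run because of address-space randomization).

    #include <stdio.h>
    #include <stdlib.h>

    int global = 1;                           /* data area */

    int main(void) {                          /* code area */
        int local = 2;                        /* stack */
        int *heap = malloc(sizeof *heap);     /* run-time heap */
        printf("code:  %p\n", (void *)main);
        printf("data:  %p\n", (void *)&global);
        printf("heap:  %p\n", (void *)heap);
        printf("stack: %p\n", (void *)&local);
        free(heap);
        return 0;
    }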
• File: sequence of bytes
• Every I/O device is modeled as a file
• All input and output in the system is performed by reading and writing files, using Unix I/O (a small set of system calls).
• Very powerful
• Provides applications with a uniform view of all the varied I/O devices.
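• A minimal Unix I/O sketch (mine; "hello.txt" is a made-up file name) using the same read/write system calls that also work on terminals, pipes, sockets, and other devices:

    #include <fcntl.h>
    #include <unistd.h>

    int main(void) {
        char buf[512];
        int fd = open("hello.txt", O_RDONLY);     /* a file descriptor */
        if (fd < 0)
            return 1;
        ssize_t n = read(fd, buf, sizeof buf);    /* read up to 512 bytes */
        if (n > 0)
            write(STDOUT_FILENO, buf, (size_t)n); /* copy them to stdout */
        close(fd);
        return 0;
    }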
• From the point of view of an individual system, network can be viewed as just another I/O device.
• Network adapter
• Example: using telnet
• Client and server
• Network programming
• A system is a collection of intertwined hardware and systems software that must cooperate in order to achieve the ultimate goal of running application programs.
• Amdahl's law: observation about the effectiveness of improving the performance of one part of a system.
• Main idea: when we speed up one part of a system, the effect on the overall system performance depends on both how significant this part was and how much it sped up.
• Speedup
• Consider a system in which:
• Executing some application requires time T_old
• Some part that accounts for a fraction α of that time (αT_old) has its performance improved by a factor of k
• Overall execution time: T_new = T_old[(1 − α) + α/k]
• Speedup: S = T_old/T_new = 1/[(1 − α) + α/k]
• Major insight: to significantly speed up the entire system, we must improve the speed of a very large fraction of the overall system.
• Special case (k → ∞): S_∞ = 1/(1 − α)
• Describes general principle for improving any process.
• Most meaningful in the world of computers.
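• The arithmetic, as a tiny C program (mine): speeding up 60% of a system by 3x gives only about 1.67x overall, and even k → ∞ for that part caps the speedup at 2.5x.

    #include <stdio.h>

    /* Amdahl's law: a fraction alpha of total time is sped up by k. */
    double speedup(double alpha, double k) {
        return 1.0 / ((1.0 - alpha) + alpha / k);
    }

    int main(void) {
        printf("S     = %.2f\n", speedup(0.6, 3.0));   /* 1.67 */
        printf("S_inf = %.2f\n", 1.0 / (1.0 - 0.6));   /* 2.50 */
        return 0;
    }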
• Concurrency: general concept of a system with multiple, simultaneous activities.
• Parallelism: use of concurrency to make a system run faster.
• Can be exploited at multiple levels of abstraction in a system, from highest to lowest:
• Thread-level concurrency: Multiple program control flows execute concurrently.
Concurrent Execution on Uniprocessors
• Uniprocessor system: system consisting of a single processor
• Since the advent of time-sharing
• Only simulated, by having a single computer rapidly switch among its executing processes
Concurrent Execution on Multiprocessors
• Multiprocessor system: system consisting of multiple processors all under the control of a single kernel
• Have become commonplace with the advent of multi-core processors and hyperthreading
• Multi-core processors have several "cores" (CPUs) integrated onto a single integrated-circuit chip.
• Each core with its own L1 and L2 caches
• Each L1 cache split into an i-cache (instruction cache) and a d-cache (data cache)
• Cores share higher levels of cache as well as interface to main memory
• Hyperthreading (simultaneous multi-threading): technique that allows a single CPU to execute multiple flows of control.
• It involves:
• Having multiple copies of some of the CPU hardware,
• E.g., PCs and register files
• Having only single copies of other parts of the hardware,
• E.g., FPUs.
• Hyperthreaded processor decides which of its threads to execute on a cycle-by-cycle basis.
• Use of multiprocessing can improve system performance in two ways:
• Reduces the need to simulate concurrency when multitasking
• Can run a single multi-threaded program faster
• Instruction-level parallelism: at a much lower level of abstraction, processors can execute multiple instructions at one time.
• Pipelining: actions required to execute an instruction are partitioned into different steps and the processor hardware is organized as a series of stages, each performing one of these steps.
• The stages can operate in parallel, working on different parts of different instructions.
• Superscalar processors: Processors that can sustain execution rates faster than 1 instruction per cycle.
• Most modern processors support superscalar operation.
• Single-instruction, multiple-data (SIMD) parallelism: with special hardware, a single instruction can cause multiple operations to be performed in parallel.
• Write programs using special vector data types
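• A sketch of such a vector data type using the GCC/Clang vector extension (the typedef name v4sf is conventional, not standard C): the compiler can turn the single + into one SIMD instruction where the hardware supports it.

    #include <stdio.h>

    /* Four packed floats; vector_size(16) means 16 bytes total. */
    typedef float v4sf __attribute__((vector_size(16)));

    int main(void) {
        v4sf a = {1.0f, 2.0f, 3.0f, 4.0f};
        v4sf b = {10.0f, 20.0f, 30.0f, 40.0f};
        v4sf c = a + b;              /* four additions in parallel */
        for (int i = 0; i < 4; i++)
            printf("%g ", c[i]);     /* 11 22 33 44 */
        printf("\n");
        return 0;
    }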
• Use of abstractions is one of the most important concepts in computer science.
• On the processor side
• Instruction set architecture: abstraction of the actual processor hardware
• A machine-code program behaves as if it were executed on a processor that performs just one instruction at a time.
• Underlying hardware executes multiple instructions in parallel
• On the OS side
• Files: abstraction of I/O devices
• Virtual memory: abstraction of program memory
• Processes: abstraction of a running program
• Virtual machine: abstraction of the entire computer
• Computer system = hardware + systems software + application programs
• Information = bits + context
• Compilation system
• It pays to understand how it works
• Processors read and interpret instructions stored in main memory.
• Hardware organization
• Buses
• I/O devices
• Networks
• Main memory
• Processor
• Running the hello program
• Memory hierarchy: storage devices
• Regs
• Cache memories
• Cache matters
• DRAM
• Disk storage
• Abstractions provided by the OS:
• Processes
• Kernel
• Virtual memory
• Files
• Important themes
• Amdahl's Law
• Concurrency and Parallelism
• Abstractions