OSPP: File Systems - Winthrop

File Systems Main Points File layout Directory layout The External View of the File Manager Application Application Program Program UNIX Memory Mgr Process Mgr

File Mgr Device Mgr WriteFile() CreateFile() CloseHandle() ReadFile() SetFilePointer() Memory Mgr Process Mgr Device Mgr File Mgr

mount() write() close() open() read() lseek() Windows Hardware Operating Systems: A Modern Perspective, Chapter 13 Why Programmers Need Files

HTML HTML Editor Editor Web Web Browser Browser foo.html

File File Manager Manager File File Manager Manager

Structured information Can be read by any applic Accessibility Operating Systems: A Modern Perspective, Protocol Chapter 13 Persistent storage Shared device File Management File is a named, ordered collection of information The file manager administers the collection by:

Storing the information on a device Mapping the block storage to a logical view Allocating/deallocating storage Providing file directories Operating Systems: A Modern Perspective, Chapter 13 Information Structure Applications Records Byte Stream Files Stream-Block Translation Storage device

Operating Systems: A Modern Perspective, Chapter 13 Byte Stream File Interface fileID = open(fileName) close(fileID) read(fileID, buffer, length) write(fileID, buffer, length) seek(fileID, filePosition) Operating Systems: A Modern Perspective, Chapter 13 Low Level Files

fid = open(fileName,); read(fid, buf, buflen); close(fid); int int int int int open() {} close() {} read() {} write() {} seek() {}

b0 b1 b2 ... bi ... Stream-Block Translation Storage device response to commands Operating Systems: A Modern Perspective, Chapter 13 Structured Files Records

Structured Record Files Record-Block Translation Operating Systems: A Modern Perspective, Chapter 13 Record-Oriented Sequential Files Logical Record fileID = open(fileName) close(fileID) getRecord(fileID, record) putRecord(fileID, record) seek(fileID, position)

Operating Systems: A Modern Perspective, Chapter 13 Record-Oriented Sequential Files Logical Record H byte header k byte logical record ... Operating Systems: A Modern Perspective, Chapter 13 Record-Oriented Sequential Files Logical Record

H byte header k byte logical record ... ... Physical Storage Blocks Operating Systems: A Modern Perspective, Chapter 13 Fragment Indexed Sequential File Suppose we want to directly access records Add an index to the file

fileID = open(fileName) close(fileID) getRecord(fileID, index) index = putRecord(fileID, record) deleteRecord(fileID, index) Operating Systems: A Modern Perspective, Chapter 13 Indexed Sequential File (cont) Application structure Account # index = i Index

012345 123456 i 294376 k index = k ... 529366 ... 965987

j index = j Operating Systems: A Modern Perspective, Chapter 13 File System Design File System is an organized collection of regular files and directories (mkfs) Data structures Directories: file name -> file metadata Store directories as files File metadata: how to find file data blocks Free map: list of free disk blocks

File System Design Constraints For small files: Small blocks for storage efficiency Files used together should be stored together For large files: Contiguous allocation for sequential access Efficient lookup for random access May not know at file creation Whether file will become small or large Design Challenges Index structure How do we locate the blocks of a file? Index granularity

What block size do we use? Free space How do we find unused blocks on disk? Locality How do we preserve spatial locality? Reliability What if machine crashes in middle of a file system op? File Systems Traditional FFS file system (Linux) Microsofts FAT, FAT2 and NTFS file systems Journaling file systems, ext3 others

Microsoft File Allocation Table (FAT) Linked list index structure Simple, easy to implement Still widely used (e.g., thumb drives) File table: Linear map of all blocks on disk Each file a linked list of blocks FAT FAT Pros: Easy to find free block Easy to append to a file Easy to delete a file

Cons: Small file access is slow Random access is very slow Fragmentation File blocks for a given file may be scattered Files in the same directory may be scattered Problem becomes worse as disk fills Berkeley UNIX FFS (Fast File System) inode table Analogous to FAT table inode Metadata File owner, access permissions, access times, Set of 12 data pointers

With 4KB blocks => max size of 48KB files File System Structure Basic unit for allocating space on the disk is a block Disk File System partition Boot Block partition

Super-block i-node table partition Data blocks I-nodes Each file or directory in the file system has a unique entry in the i-node table. File type (regular, symbolic link, directory) Owner Permissions Timestamps for last access; last modification, last status change Size

i-node entry 0 5 11 12 13 14 15 DB 0 DB 5 DB

IPB IPB IPB DB FFS inode Metadata File owner, access permissions, access times, Set of 12 data pointers With 4KB blocks => max size of 48KB files Indirect block pointer pointer to disk block of data pointers

Indirect block: 1K data blocks => 4MB (+48KB) FFS inode Metadata File owner, access permissions, access times, Set of 12 data pointers With 4KB blocks => max size of 48KB Indirect block pointer pointer to disk block of data pointers 4KB block size => 1K data blocks => 4MB Doubly indirect block pointer Doubly indirect block => 1K indirect blocks 4GB (+ 4MB + 48KB)

FFS inode Metadata File owner, access permissions, access times, Set of 12 data pointers With 4KB blocks => max size of 48KB Indirect block pointer pointer to disk block of data pointers 4KB block size => 1K data blocks => 4MB Doubly indirect block pointer Doubly indirect block => 1K indirect blocks 4GB (+ 4MB + 48KB) Triply indirect block pointer Triply indirect block => 1K doubly indirect blocks

4TB (+ 4GB + 4MB + 48KB) FFS Asymmetric Tree Small files: shallow tree Efficient storage for small files Large files: deep tree Efficient lookup for random access in large files Disk Organization Boot Sector Blk Blk0 0 Volume Directory

Blk Blk1 1 Blk Blkk k Blk Blkk+1 k+1 Blk Blk Blk Blk

Blk Blkk-1k-1 Track 0, Cylinder 0 Blk Blk2k-1 2k-1 Track 0, Cylinder 1 Blk

Blk Track 1, Cylinder 0 Blk Blk Track N-1, Cylinder 0 Blk Blk Track N-1, Cylinder M-1 Blk

Blk Blk Blk Blk Blk Blk Blk Operating Systems: A Modern Perspective,

Chapter 13 Low-level File System Architecture Block 0 b0 b1 b2 b3 bn-1 ... Sequential Device Randomly Accessed Device

Operating Systems: A Modern Perspective, Chapter 13 Block Management The job of selecting & assigning storage blocks to the file Three basic strategies: Contiguous allocation Linked lists Indexed allocation Operating Systems: A Modern Perspective, Chapter 13 Contiguous Allocation

Maps the N blocks into N contiguous blocks on the secondary storage device Difficult to support dynamic file sizes File descriptor Head position 237 First block 785 Number of blocks25 Operating Systems: A Modern Perspective, Chapter 13 Linked Lists

Each block contains a header with Number of bytes in the block Pointer to next block Blocks need not be contiguous Files can expand and contract Seeks can be slow First block Head: 417 ... Length Byte 0 ... Byte 4095

Operating Systems: ABlock Modern0Perspective, Chapter 13 Length Length Byte 0 ... Byte 4095 Byte 0 ... Byte 4095 Block 1

Block N-1 Indexed Allocation Extract headers and put them in an index Simplify seeks May link indices together (for large files) Index block Head: 417 ... Byte 0 ... Byte 4095

Length Block 0 Byte 0 ... Byte 4095 Length Length Operating Systems: A Modern Perspective, Chapter 13 Byte 0 ... Byte 4095 Block N-1

Block 1 DOS FAT Files File Descriptor 43 254 Disk Block 107

Disk Block Disk Block File Descriptor 43 107 254 43 Disk Block

Disk Block 107 Disk Block 254 File Access Table (FAT) Operating Systems: A Modern Perspective, Chapter 13

UNIX Files inode mode owner Direct block 0 Direct block 1 Direct block 11 Single indirect Double indirect Triple indirect Data Data Data

Index Data Index Data Index Index Index Data Index Index Data Index

Index Operating Systems: A Modern Perspective, Chapter 13 Data Data Unallocated Blocks How should unallocated blocks be managed? Need a data structure to keep track of them Linked list Very large Hard to manage spatial locality Block status map (disk map)

Bit per block Easy to identify nearby free blocks Useful for disk recovery Operating Systems: A Modern Perspective, Chapter 13 FFS Locality Block group allocation Block group is a set of nearby cylinders Files in same directory located in same group Subdirectories located in different block groups inode table spread throughout disk inodes, bitmap near file blocks First fit allocation

Small files fragmented, large files contiguous FFS First Fit Block Allocation FFS First Fit Block Allocation FFS First Fit Block Allocation FFS Pros Efficient storage for both small and large files Locality for both small and large files Locality for metadata and data Cons Inefficient for tiny files (a 1 byte file requires both an inode and a data block)

Inefficient encoding when file is mostly contiguous on disk (no equivalent to superpages) Need to reserve 10-20% of free space to prevent fragmentation NTFS Master File Table Flexible 1KB storage for metadata and data Extents Block pointers cover runs of blocks Similar approach in linux (ext4) File create can provide hint as to size of file Journalling for reliability Discussed next time

NTFS Small File NTFS Medium File NTFS Indirect Block NTFS Multiple Indirect Blocks Named Data in a File System Directories A set of logically associated files and sub directories File manager provides set of controls: enumerate copy

rename delete traverse etc. Operating Systems: A Modern Perspective, Chapter 13 Directory Implementation Device Directory A device can contain a collection of files Easier to manage if there is a root for every file on the device -- the device root directory File Directory Typical implementations have directories implemented as a file with a special format

Entries in a file directory are handles for other files (which can be files or subdirectories) Operating Systems: A Modern Perspective, Chapter 13 Directory Structures How should files be organized within directory? Flat name space All files appear in a single directory Hierarchical name space Directory contains files and subdirectories Each file/directory appears as an entry in exactly one other directory -- a tree Popular variant: All directories form a tree, but a

file can have multiple parents. Operating Systems: A Modern Perspective, Chapter 13 Directories unix etc passwd home motd pro

dev twd unix ... slide1 slide2 Directory Representation Component Name Inode Number directory entry

. .. unix etc home pro dev 1 1 117 4 18 36 93 Directories

Directories Directories can be files Map file name to file number (MFT #, inode num) Table of file name -> file number Small directories: linear search Large Directories: B-Trees Hard Links unix etc home

pro dev twd image motd unix ... slide1 slide2 % ln /unix /etc/image # link system call

Directory Representation . .. unix etc home pro dev 1 1 117 4 18 36 93

. .. image motd 4 1 117 33 Create hard links $ echo n It is good to collect things, > abc $ ln abc xyz $ echo but it is better to go on walks >> xyz $ cat abc It is good to collect things, but it is better to on walks

Hard links If one is removed, the other name and the file itself continue to exist. Soft (symbolic) links Differing from a hard link, a soft link (or symbolic link) is a special kind of file containing the name of another file. Created with the ln s Soft Links unix etc home

pro dev twd image twd /home/twd % ln s /unix /home/twd/mylink % ln s /home/twd /etc/twd # symlink system call unix ...

slide1 slide2 mylink /unix Working Directory Maintained in kernel for each process paths not starting from / start with the working directory changed by use of the chdir system call displayed (via shell) using pwd how is this done? UNIX mount Command /

/ bin usr etc bill bin usr etc foo bill nutt foo /

nutt FS abc cde xyz / FS abc cde blah xyz blah

Operating Systems: A Modern Perspective, Chapter 13 mount FS at foo VFS-based File Manager Exports OS-specific API File FileSystem SystemIndependent Independent Part Partof ofFile FileManager Manager

Virtual VirtualFile FileSystem SystemSwitch Switch MS-DOS MS-DOSPart Partof of File FileManager Manager ISO ISO9660 9660Part Partof

of File FileManager Manager Operating Systems: A Modern Perspective, Chapter 13 ext2 ext2Part Partof of File FileManager Manager

File Management Operating Systems: A Modern Perspective, Chapter 13 13 An open() Operation Locate the on-device (external) file descriptor Extract info needed to read/write file Authenticate that process can access the file Create an internal file descriptor in primary memory

Create an entry in a per process open file status table Allocate resources, e.g., buffers, to support file usage Operating Systems: A Modern Perspective, Chapter 13 File Manager Data Structures 2 Keep the state of the processfile session 3 Return a reference to the data structure Process-File Session

Open File Descriptor External File Descriptor Operating Systems: A Modern Perspective, Chapter 13 1 Copy info from external to the open file descriptor Opening a UNIX File fid = open(fileA, flags);

read(fid, buffer, len); 0 1 2 3 stdin stdout stderr ... On-Device File Descriptor File structure inode

Open File Table Operating Systems: A Modern Perspective, Chapter 13 Internal File Descriptor File Descriptors External name Current state Sharable Owner User Locks Protection settings Length Time of creation

Time of last modification Time of last access Reference count Storage device details Operating Systems: A Modern Perspective, Chapter 13 Marshalling the Byte Stream Must read at least one buffer ahead on input Must write at least one buffer behind on output Seek flushing the current buffer and finding the correct one to load into memory Inserting/deleting bytes in the interior of

the stream Operating Systems: A Modern Perspective, Chapter 13 Full Block Buffering Storage devices use block I/O Files place an explicit order on the bytes Therefore, it is possible to predict what is likely to be read after bytei When file is opened, manager reads as many blocks ahead as feasible After a block is logically written, it is queued for writing behind, whenever the disk is available Buffer pool usually variably sized, depending on virtual memory needs Interaction with the device manager and memory

manager Operating Systems: A Modern Perspective, Chapter 13 File-Descriptor Table File-descriptor table 0 1 2 3 File descriptor . ref count

. . User address space n1 Kernel address space access mode file inode

location pointer Allocation of File Descriptors Whenever a process requests a new file descriptor, the lowest numbered file descriptor not already associated with an open file is selected; thus #include #include close(0); fd = open("file", O_RDONLY); will always associate file with file descriptor 0 (assuming that the open succeeds) Redirecting Output Twice if (fork() == 0) { /* set up file descriptors 1 and 2 in the close(1);

close(2); if (open("/home/twd/Output", O_WRONLY) == exit(1); } if (open("/home/twd/Output", O_WRONLY) == exit(1); } execl("/home/twd/bin/program", "program", exit(1); } /* parent continues here */ child process */ -1) { -1) {

0); Redirected Output File-descriptor table File descriptor 1 1 WRONLY 0 inode pointer

1 WRONLY 0 inode pointer File descriptor 2 User address space Kernel address space Redirected Output After Write

File-descriptor table File descriptor 1 1 WRONLY 100 inode pointer 1 WRONLY

0 inode pointer File descriptor 2 User address space Kernel address space Sharing Context Information if (fork() == 0) { /* set up file descriptors 1 and 2 in the child process */ close(1); close(2);

if (open("/home/twd/Output", O_WRONLY) == -1) { exit(1); } dup(1); /* set up file descriptor 2 as a duplicate of 1 */ execl("/home/twd/bin/program", "program", 0); exit(1); } /* parent continues here */ Redirected Output After Dup File-descriptor table File descriptor 1 2 File descriptor 2

User address space Kernel address space WRONLY 100 inode pointer Fork and File Descriptors int logfile = open("log", O_WRONLY); if (fork() == 0) { /* child process computes something, then does: */ write(logfile, LogEntry, strlen(LogEntry));

exit(0); } /* parent process computes something, then does: */ write(logfile, LogEntry, strlen(LogEntry)); File Descriptors After Fork logfile Parents address space 2 logfile Childs address space

Kernel address space WRONLY 0 inode pointer Naming (almost) everything has a path name files directories devices (known as special files)

keyboards displays disks etc. Uniformity int file = open("/home/twd/data", O_RDWR); // opening a normal file int device = open("/dev/tty", O_RDWR); // opening a device (ones terminal // or window) int bytes = read(file, buffer, sizeof(buffer)); write(device, buffer, bytes);

Recently Viewed Presentations

  • Today's Program - pocc.nais.org

    Today's Program - pocc.nais.org

    e.g. Phillips Exeter Academy, March 2, 2017 report. See also: Revil v. Coleman. July 19, 2016, Appeals Court of Massachusetts. Factually truthful press release that school employee was on leave pending investigation of sexual misconduct when ultimately no charges were...
  • Lessons from erigone occultation of regulus DSLR campaign

    Lessons from erigone occultation of regulus DSLR campaign

    Transcoding is the process of changing some part of a video file format to another type. Transcoding is needed to change one container format to another, such as .mov files to .avi files used by Limovie or Tangra. For this...
  • Environmental Compliance and Enforcement - Bowmans

    Environmental Compliance and Enforcement - Bowmans

    Persistent wrongdoing (dolus eventualis); Serious harm to people or the environment (fault - at least negligence) (Kidd) Over criminalisation? "…One of the . most serious problems . facing criminal law today is the fact that legislatures have . abused the...
  • Slide 1

    Slide 1

    Transition class offered by Lvn tri club in APRIL- will show that slide later Even IM walk Consider a Bike ride early that week maybe a light run two days prior to the race * See ankle chip on left...
  • Physical Optics Corp. Torrance, California TOPIC NUMBER: AF132-001

    Physical Optics Corp. Torrance, California TOPIC NUMBER: AF132-001

    Traditional development methods could have taken many years to produce results, so officials turned to the Air Force Small Business Innovation Research/Small Business Technology Transfer (SBIR/STTR) Program to form a partnership - and find an innovative fix - with California-based...
  • Viticulture in the Willamette Valley, Oregon: creating a

    Viticulture in the Willamette Valley, Oregon: creating a

    University of Wisconsin-Eau Claire. Objectives & Research Questions. Methods. Analysis of Results. Conclusions. Research Questions. What biological factors play the largest role in plant health and winegrape quality? In what ways does the biodynamic approach relate to the Triple Bottom...
  • Frog Anatomy - Ms. Fellow's Science Class Webpage

    Frog Anatomy - Ms. Fellow's Science Class Webpage

    Frog Anatomy External Anatomy External nares or nostrils - Anterior openings for the entry or exit of air. Tympanic Membrane - The eardrum - receives sound waves Nictitating Membrane - A transparent part of a frog's lower eyelid that moves...
  • PLE, TLE & Related Matters: The "After Thoughts" of Eminent ...

    PLE, TLE & Related Matters: The "After Thoughts" of Eminent ...

    USPAP & the Appraiser Expert. Which version of USPAP applies? State Certification. Wisconsin is a non-mandatory appraiser license state. All appraisals performed in conjunction with federally related transactions and non-federally related transactions shall conform to the USPAP in effect at...