Does the Ram Read Method Move the File Pointer

Programmer's View of Files

  • Logical view of files:
    • An array of bytes.
    • A file pointer marks the current position.
  • Three fundamental operations:
    • Read bytes from current position (move file pointer)
    • Write bytes to current position (move file pointer)
    • Set file pointer to specified byte position.

Java File Functions

RandomAccessFile(String name, String mode)

close()

read(byte[] b)

write(byte[] b)

seek(long pos)
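
The calls listed above can be tried out with a short sketch like the following. It is illustrative only and assumes a scratch file created with `File.createTempFile` (the file name prefix is arbitrary); `getFilePointer()` is used to observe the pointer after each call. It shows that both `read` and `write` advance the file pointer, while `seek` sets it directly.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Arrays;

class FilePointerDemo {
    // Returns the file pointer observed after each of four operations.
    static long[] trace() throws IOException {
        File tmp = File.createTempFile("fileptr", ".bin"); // scratch file (assumption)
        tmp.deleteOnExit();
        long[] positions = new long[4];
        try (RandomAccessFile raf = new RandomAccessFile(tmp.getPath(), "rw")) {
            raf.write(new byte[] {10, 20, 30, 40, 50, 60, 70, 80});
            positions[0] = raf.getFilePointer(); // after writing 8 bytes

            raf.seek(2);                         // set pointer to byte 2
            positions[1] = raf.getFilePointer();

            byte[] b = new byte[3];
            raf.read(b);                         // read bytes 2..4
            positions[2] = raf.getFilePointer(); // read moved the pointer

            raf.write(new byte[] {99});          // overwrite the byte at the pointer
            positions[3] = raf.getFilePointer(); // write moved the pointer
        }
        return positions;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Arrays.toString(trace())); // prints [8, 2, 5, 6]
    }
}
```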

Main vs. Secondary Storage

  • Primary storage: Main memory (RAM)
  • Secondary Storage: Peripheral devices
    • Disk drives
    • Tape drives
    • Flash drives

Comparisons

\[\begin{split}\begin{array}{l|r|r|r|r|r|r|r}
\hline
\textbf{Medium}& 1996 & 1997 & 2000 & 2004 & 2006 & 2008 & 2011\\
\hline
\textbf{RAM}& \$45.00 & 7.00 & 1.500 & 0.3500 & 0.1500 & 0.0339 & 0.0138\\
\textbf{Disk}& 0.25 & 0.10 & 0.010 & 0.0010 & 0.0005 & 0.0001 & 0.0001\\
\textbf{USB drive}& -- & -- & -- & 0.1000 & 0.0900 & 0.0029 & 0.0018\\
\textbf{Floppy}& 0.50 & 0.36 & 0.250 & 0.2500 & -- & -- & --\\
\textbf{Tape}& 0.03 & 0.01 & 0.001 & 0.0003 & -- & -- & --\\
\textbf{Solid State}& -- & -- & -- & -- & -- & -- & 0.0021\\
\hline
\end{array}\end{split}\]

  • (Costs per Megabyte)
  • RAM is usually volatile.
  • RAM is about half a million times faster than disk.

Golden Rule of File Processing

  • Minimize the number of disk accesses!
    1. Arrange information so that you get what you want with few disk accesses.
    2. Arrange information to minimize future disk accesses.
  • An organization for data on disk is often called a file structure.
  • Disk-based space/time tradeoff: Compress information to save processing time by reducing disk accesses.

Disk Drives

Disk drive platters

Sectors

The organization of a disk platter

  • A sector is the basic unit of I/O.

Terms

  • Locality of Reference: When a record is read from disk, the next request is likely to come from near the same place on the disk.
  • Cluster: Smallest unit of file allocation, usually several sectors.
  • Extent: A group of physically contiguous clusters.
  • Internal fragmentation: Wasted space within a sector if record size does not match sector size; wasted space within a cluster if file size is not a multiple of cluster size.

Seek Time

  • Seek time: Time for I/O head to reach desired track. Largely determined by distance between I/O head and desired track.
  • Track-to-track time: Minimum time to move from one track to an adjacent track.
  • Average Access time: Average time to reach a track for random access.

Other Factors

  • Rotational Delay or Latency: Time for data to rotate under I/O head.
    • One half of a rotation on average.
    • At 7200 rpm, this is 8.3/2 = 4.2ms.
  • Transfer time: Time for data to move under the I/O head.
    • At 7200 rpm: Number of sectors read/Number of sectors per track * 8.3ms.
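
The 8.3 ms used above is the time for one full rotation. As a worked step (added here for clarity, not part of the original slides), at 7200 rpm:

\[
t_{\text{rotation}} = \frac{60{,}000\ \text{ms/min}}{7200\ \text{rev/min}} \approx 8.3\ \text{ms},
\qquad
t_{\text{latency}} = \frac{t_{\text{rotation}}}{2} \approx 4.2\ \text{ms}
\]

The same calculation at 5400 rpm gives \(60{,}000 / 5400 \approx 11.1\) ms per rotation.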

(Old) Disk Spec Example

  • 16.8 GB disk on 10 platters = 1.68GB/platter
  • 13,085 tracks/platter
  • 256 sectors/track
  • 512 bytes/sector
  • Track-to-track seek time: 2.2 ms
  • Average seek time: 9.5ms
  • 4KB clusters, 32 clusters/track.
  • 5400RPM

Disk Access Cost Example (1)

  • Read a 1MB file divided into 2048 records of 512 bytes (1 sector) each.
  • Assume all records are on 8 contiguous tracks.
  • First track: 9.5 + (11.1)(1.5) = 26.2 ms
  • Remaining 7 tracks: 2.2 + (11.1)(1.5) = 18.9ms.
  • Total: 26.2 + 7 * 18.9 = 158.5ms

Disk Access Cost Example (2)

  • Read a 1MB file divided into 2048 records of 512 bytes (1 sector) each.
  • Assume all file clusters are randomly spread across the disk.
  • 256 clusters. Cluster read time is 8/256 of a rotation, for about 5.9ms for both latency and read time.
  • 256(9.5 + 5.9) is about 3942ms or about 4 sec.
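
The two cost examples can be reproduced with a small calculation. The sketch below is illustrative: the class and method names are made up for this example, and the constants are the old disk spec values (9.5 ms average seek, 2.2 ms track-to-track seek, 11.1 ms per rotation at 5400 RPM, 256 sectors per track). Because it does not round intermediate values, its totals differ slightly from the hand-rounded figures above.

```java
class DiskCostSketch {
    static final double AVG_SEEK_MS = 9.5;     // average seek time
    static final double TRACK_SEEK_MS = 2.2;   // track-to-track seek time
    static final double ROTATION_MS = 11.1;    // one rotation at 5400 RPM
    static final int SECTORS_PER_TRACK = 256;

    // Example (1): the file fills `tracks` contiguous tracks.
    // Each track costs half a rotation of latency plus one full rotation to read.
    static double contiguousMs(int tracks) {
        double perTrack = 1.5 * ROTATION_MS;
        double first = AVG_SEEK_MS + perTrack;
        double each = TRACK_SEEK_MS + perTrack;
        return first + (tracks - 1) * each;
    }

    // Example (2): every cluster needs its own average seek, half a rotation
    // of latency, and a transfer of sectorsPerCluster/256 of a rotation.
    static double scatteredMs(int clusters, int sectorsPerCluster) {
        double transfer = ROTATION_MS * sectorsPerCluster / SECTORS_PER_TRACK;
        double perCluster = AVG_SEEK_MS + ROTATION_MS / 2 + transfer;
        return clusters * perCluster;
    }

    public static void main(String[] args) {
        System.out.printf("contiguous: %.1f ms%n", contiguousMs(8));
        System.out.printf("scattered:  %.1f ms%n", scatteredMs(256, 8));
    }
}
```

The contiguous layout comes out at roughly 158 ms and the scattered layout at roughly 3.9 seconds, a factor of about 25 for the same 1MB of data.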

How Much to Read?

  • Read time for one track: \(9.5 + (11.1)(1.5) = 26.2\) ms
  • Read time for one sector: \(9.5 + 11.1/2 + (1/256)11.1 = 15.1\) ms
  • Read time for one byte: \(9.5 + 11.1/2 = 15.05\) ms
  • Nearly all disk drives read/write one sector (or more) at every I/O access
  • Also referred to as a page or block

Newer Disk Spec Example

  • Samsung Spinpoint T166
  • 500GB (nominal)
  • 7200 RPM
  • Track to track: 0.8 ms
  • Average track access: 8.9 ms
  • Bytes/sector: 512
  • 6 surfaces/heads

Buffers

  • The information in a sector is stored in a buffer or cache.
  • If the next I/O access is to the same buffer, then there is no need to go to disk.
  • Disk drives usually have one or more input buffers and one or more output buffers.

Buffer Pools

  • A series of buffers used by an application to cache disk data is called a buffer pool.
  • Virtual memory uses a buffer pool to imitate greater RAM memory by actually storing information on disk and "swapping" between disk and RAM.

Organizing Buffer Pools

  • Which buffer should be replaced when new data must be read?
  • First-in, First-out: Use the first one on the queue.
  • Least Frequently Used (LFU): Count buffer accesses, reuse the least used.
  • Least Recently Used (LRU): Keep buffers on a linked list. When buffer is accessed, bring it to front. Reuse the one at end.
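
The LRU rule can be sketched with a linked list of block numbers. This is a minimal illustration, not a full buffer pool: the class name `LruOrderSketch` and the convention of returning -1 when nothing is evicted are assumptions made for this example, and no block data is stored.

```java
import java.util.LinkedList;

class LruOrderSketch {
    private final int capacity;
    // Front of the list = most recently used block, end = least recently used.
    private final LinkedList<Integer> blocks = new LinkedList<>();

    LruOrderSketch(int capacity) {
        this.capacity = capacity;
    }

    // Record an access to `block`. Returns the block number whose buffer
    // was reused, or -1 if no buffer had to be reused.
    int access(int block) {
        if (blocks.remove(Integer.valueOf(block))) {
            blocks.addFirst(block);        // hit: bring it to the front
            return -1;
        }
        int evicted = -1;
        if (blocks.size() == capacity) {
            evicted = blocks.removeLast(); // miss with full pool: reuse the one at the end
        }
        blocks.addFirst(block);
        return evicted;
    }

    public static void main(String[] args) {
        LruOrderSketch lru = new LruOrderSketch(3);
        for (int b : new int[] {1, 2, 3, 1, 4}) {
            System.out.println("access " + b + " -> evicted " + lru.access(b));
        }
    }
}
```

With three buffers and the access sequence 1, 2, 3, 1, 4, the last access reuses the buffer holding block 2, since block 1 was brought to the front by its second access.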

LRU

Dirty Bit

Bufferpool ADT: Message Passing

    // ADT for buffer pools using the message-passing style
    public interface BufferPoolADT {
      // Copy "sz" bytes from "space" to position "pos" in the buffered storage
      public void insert(byte[] space, int sz, int pos);

      // Copy "sz" bytes from position "pos" of the buffered storage to "space"
      public void getbytes(byte[] space, int sz, int pos);
    }
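
One possible way to back the two message-passing methods is sketched below. It is a simplified illustration under several assumptions that are not in the slides: the "disk" is an in-memory byte array, blocks are only 4 bytes, each request stays within a single block, and LRU replacement is borrowed from `LinkedHashMap` in access order. The class does not declare `implements BufferPoolADT` so that it compiles on its own. A dirty flag per buffer ensures that only modified blocks are written back when their buffer is reused.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SimpleBufferPool {
    static final int BLOCK = 4;      // tiny block size, for illustration only
    final byte[] disk;               // stands in for the file on disk
    int diskReads = 0;
    int diskWrites = 0;

    private static class Buf {
        byte[] data = new byte[BLOCK];
        boolean dirty;               // the dirty bit
    }

    private final LinkedHashMap<Integer, Buf> pool;

    SimpleBufferPool(byte[] disk, int numBuffers) {
        this.disk = disk;
        // Access-ordered map: the eldest entry is the least recently used buffer.
        this.pool = new LinkedHashMap<Integer, Buf>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Buf> eldest) {
                if (size() <= numBuffers) return false;
                flush(eldest.getKey(), eldest.getValue());
                return true;
            }
        };
    }

    // Write the buffer back to "disk" only if it was modified.
    private void flush(int block, Buf b) {
        if (b.dirty) {
            System.arraycopy(b.data, 0, disk, block * BLOCK, BLOCK);
            diskWrites++;
            b.dirty = false;
        }
    }

    // Return the buffer for a block, reading it from "disk" on a miss.
    private Buf fetch(int block) {
        Buf b = pool.get(block);
        if (b == null) {
            b = new Buf();
            System.arraycopy(disk, block * BLOCK, b.data, 0, BLOCK);
            diskReads++;
            pool.put(block, b);      // may evict (and flush) the LRU buffer
        }
        return b;
    }

    // Copy "sz" bytes from "space" to position "pos" (must not cross a block).
    public void insert(byte[] space, int sz, int pos) {
        Buf b = fetch(pos / BLOCK);
        System.arraycopy(space, 0, b.data, pos % BLOCK, sz);
        b.dirty = true;
    }

    // Copy "sz" bytes from position "pos" to "space" (must not cross a block).
    public void getbytes(byte[] space, int sz, int pos) {
        Buf b = fetch(pos / BLOCK);
        System.arraycopy(b.data, pos % BLOCK, space, 0, sz);
    }
}
```

Note that every call copies bytes between the caller's array and the buffer, which is the cost of message passing raised under Design Issues.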

Bufferpool ADT: Buffer Passing

    // ADT for buffer pools using the buffer-passing style
    public interface BufferPoolADT {
      // Return pointer to the requested block
      public byte[] getblock(int block);

      // Set the dirty bit for the buffer holding "block"
      public void dirtyblock(int block);

      // Tell the size of a buffer
      public int blocksize();
    }

Design Issues

  • Disadvantage of message passing:
    • Messages are copied and passed back and forth.
  • Disadvantages of buffer passing:
    • The user is given access to system memory (the buffer itself)
    • The user must explicitly tell the buffer pool when buffer contents have been modified, so that modified information can be rewritten to disk when the buffer is flushed.
    • The pointer might become stale when the bufferpool replaces the contents of a buffer.

Some Goals

  • Be able to avoid reading data when the block contents will be replaced.
  • Be able to support multiple users accessing a buffer, and independently releasing a buffer.
  • Don't make an active buffer stale.

Improved Interface

    // Improved ADT for buffer pools using the buffer-passing style.
    // Most user functionality is in the buffer class, not the buffer pool itself.

    // A single buffer in the buffer pool
    public interface BufferADT {
      // Read the associated block from disk (if necessary) and return a
      // pointer to the data
      public byte[] readBlock();

      // Return a pointer to the buffer's data array (without reading from disk)
      public byte[] getDataPointer();

      // Flag buffer's contents as having changed, so that flushing the
      // block will write it back to disk
      public void markDirty();

      // Release the block's access to this buffer. Further accesses to
      // this buffer are illegal
      public void releaseBuffer();
    }

Improved Interface (2)

    public interface BufferPoolADT {
      // Relate a block to a buffer, returning a pointer to a buffer object
      Buffer acquireBuffer(int block);
    }

External Sorting

  • Problem: Sorting data sets too big to fit into main memory.
    • Assume data are stored on disk drive.
  • To sort, portions of the information must be brought into main memory, processed, and returned to disk.
  • An external sort should minimize disk accesses.

Model of External Computation

  • Secondary memory is divided into equal-sized blocks (512, 1024, etc…)
  • A basic I/O operation transfers the contents of one disk block to/from main memory.
  • Under certain circumstances, reading blocks of a file in sequential order is more efficient. (When?)
  • Primary goal is to minimize I/O operations.
  • Assume only one disk drive is available.

Key Sorting

  • Often, records are large, keys are small.
    • Ex: Payroll entries keyed on ID number
  • Approach 1: Read in entire records, sort them, then write them out again.
  • Approach 2: Read only the key values, store with each key the location on disk of its associated record.
  • After keys are sorted the records can be read and rewritten in sorted order.
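
Approach 2 can be sketched as sorting record positions by key rather than moving the records themselves. In this illustrative example (the class and method names are made up), the index of each key in the input array stands in for the disk location of its record; a real implementation would store byte offsets and then seek to each one in the returned order.

```java
import java.util.Arrays;
import java.util.Comparator;

class KeySortSketch {
    // keys[i] is the key of the record stored at "disk location" i.
    // Returns the record locations in ascending key order.
    static int[] sortedPositions(int[] keys) {
        Integer[] pos = new Integer[keys.length];
        for (int i = 0; i < pos.length; i++) {
            pos[i] = i;
        }
        // Sort the small (key, location) pairs; the large records stay put.
        Arrays.sort(pos, Comparator.comparingInt(p -> keys[p]));

        int[] result = new int[pos.length];
        for (int i = 0; i < pos.length; i++) {
            result[i] = pos[i];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] keys = {42, 7, 19};
        // Records should be read in this order to produce sorted output.
        System.out.println(Arrays.toString(sortedPositions(keys))); // prints [1, 2, 0]
    }
}
```

Reading the records in the returned order generally means random access to the disk, which is the price paid for sorting only the keys.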

Simple External Mergesort (1)

  • Quicksort requires random access to the entire set of records.
  • Better: Modified Mergesort algorithm.
    • Process \(n\) elements in \(\Theta(\log n)\) passes.
  • A group of sorted records is called a run.

Simple External Mergesort (2)

1. Split the file into two files.

2. Read in a block from each file.

3. Take first record from each block, output them in sorted order.

4. Take next record from each block, output them to a second file in sorted order.

5. Repeat until finished, alternating between output files. Read new input blocks as needed.

6. Repeat steps 2-5, except this time input files have runs of two sorted records that are merged together.

7. Each pass through the files provides larger runs.
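
The doubling of run length on each pass can be seen in a simplified in-memory sketch. This is not the file-based procedure itself: as an assumption for illustration, a single array stands in for the input files and a second array for the output files, and adjacent runs are merged instead of runs taken from two separate files. The pass structure is the same, and the method returns how many passes were needed.

```java
import java.util.Arrays;

class TwoWayMergePasses {
    // One pass: merge each pair of adjacent runs of length runLen from src into dst.
    static void mergePass(int[] src, int[] dst, int runLen) {
        for (int lo = 0; lo < src.length; lo += 2 * runLen) {
            int mid = Math.min(lo + runLen, src.length);
            int hi = Math.min(lo + 2 * runLen, src.length);
            int i = lo, j = mid, k = lo;
            while (i < mid && j < hi) {
                dst[k++] = (src[i] <= src[j]) ? src[i++] : src[j++];
            }
            while (i < mid) dst[k++] = src[i++];
            while (j < hi) dst[k++] = src[j++];
        }
    }

    // Sorts data in place and returns the number of passes made.
    static int sort(int[] data) {
        int[] src = data;
        int[] dst = new int[data.length];
        int passes = 0;
        for (int runLen = 1; runLen < data.length; runLen *= 2) {
            mergePass(src, dst, runLen);
            int[] t = src; src = dst; dst = t;   // output becomes next input
            passes++;
        }
        if (src != data) {
            System.arraycopy(src, 0, data, 0, data.length);
        }
        return passes;
    }

    public static void main(String[] args) {
        int[] data = {7, 3, 8, 1, 6, 2, 5, 4};
        int passes = sort(data);
        System.out.println(Arrays.toString(data) + " in " + passes + " passes");
    }
}
```

For 8 records the runs grow from 1 to 2 to 4 to 8, so three passes are made, matching the \(\Theta(\log n)\) pass count.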

Simple External Mergesort (3)

Problems with Simple Mergesort

  • Is each pass through input and output files sequential?
  • What happens if all work is done on a single disk drive?
  • How can we reduce the number of Mergesort passes?
  • In general, external sorting consists of two phases:
    • Break the files into initial runs
    • Merge the runs together into a single run.

A Better Procedure

Breaking a File into Runs

  • General approach:
    • Read as much of the file into memory as possible.
    • Perform an in-memory sort.
    • Output this group of records as a single run.

Replacement Selection (1)

  • Break available memory into an array for the heap, an input buffer, and an output buffer.
  • Fill the array from disk.
  • Make a min-heap.
  • Send the smallest value (root) to the output buffer.

Replacement Selection (2)

  • If the next key in the file is greater than the last value output, then

    • Replace the root with this key

    else

    • Replace the root with the last key in the array

    Add the next record in the file to a new heap (actually, stick it at the end of the array).
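
The replacement selection rule can be sketched as follows. This is a simplified illustration with two deliberate departures from the description above, both assumptions made to keep the code short: it uses two `PriorityQueue` objects (one for the current run, one holding keys for the next run) instead of a single array shared by the heap and the held-back keys, and it lets a key equal to the last output join the current run. The combined size of the two queues never exceeds `memory`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class ReplacementSelectionSketch {
    // Splits input into sorted runs using at most `memory` keys of working space.
    static List<List<Integer>> makeRuns(int[] input, int memory) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(); // min-heap for current run
        PriorityQueue<Integer> next = new PriorityQueue<>(); // keys held for the next run
        int in = 0;
        while (in < input.length && heap.size() < memory) {
            heap.add(input[in++]);                           // fill memory from "disk"
        }

        List<List<Integer>> runs = new ArrayList<>();
        List<Integer> run = new ArrayList<>();
        while (!heap.isEmpty()) {
            int out = heap.poll();                           // smallest value goes to output
            run.add(out);
            if (in < input.length) {
                int key = input[in++];
                if (key >= out) {
                    heap.add(key);                           // can still go in this run
                } else {
                    next.add(key);                           // too small: wait for next run
                }
            }
            if (heap.isEmpty()) {                            // current run is finished
                runs.add(run);
                run = new ArrayList<>();
                PriorityQueue<Integer> t = heap; heap = next; next = t;
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        int[] input = {5, 3, 8, 6, 2, 9, 1, 7};
        System.out.println(makeRuns(input, 3)); // prints [[3, 5, 6, 8, 9], [1, 2, 7]]
    }
}
```

With room for only three keys, the first run here is five records long, which illustrates how replacement selection can produce runs longer than the available memory.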

RS Example

Snowplow Analogy (1)

  • Imagine a snowplow moving around a circular track on which snow falls at a steady rate.
  • At any instant, there is a certain amount of snow S on the track. Some falling snow comes in front of the plow, some behind.
  • During the next revolution of the plow, all of this is removed, plus 1/2 of what falls during that revolution.
  • Thus, the plow removes 2S amount of snow.
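
The step from the bullets above to the 2S result can be written out. This is a sketch of the reasoning, assuming the system is in steady state, so that the amount of snow \(R\) removed in one revolution equals the amount that falls during that revolution:

\[
R = S + \frac{R}{2}
\quad\Longrightarrow\quad
\frac{R}{2} = S
\quad\Longrightarrow\quad
R = 2S
\]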

Snowplow Analogy (2)

Problems with Simple Merge

  • Simple mergesort: Place runs into two files.
    • Merge the first two runs to output file, then next two runs, etc.
  • Repeat process until only one run remains.
    • How many passes for r initial runs?
  • Is there benefit from sequential reading?
  • Is working memory well used?
  • Need a way to reduce the number of passes.

Multiway Merge (1)

  • With replacement selection, each initial run is several blocks long.
  • Assume each run is placed in a separate file.
  • Read the first block from each file into memory and perform an r-way merge.
  • When a buffer becomes empty, read a block from the appropriate run file.
  • Each record is read only once from disk during the merge process.
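
An r-way merge can be sketched with a priority queue that holds the current front record of each run. In this illustrative example the runs are plain in-memory arrays (an assumption; the slides have each run in its own file, read one block at a time), and the class name is made up. Each record is taken from its run exactly once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

class MultiwayMergeSketch {
    // Merges any number of sorted runs into one sorted list.
    static List<Integer> merge(List<int[]> runs) {
        // Queue entries are {value, run index, position within run}, ordered by value.
        PriorityQueue<int[]> pq =
            new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int r = 0; r < runs.size(); r++) {
            if (runs.get(r).length > 0) {
                pq.add(new int[] {runs.get(r)[0], r, 0});
            }
        }

        List<Integer> out = new ArrayList<>();
        while (!pq.isEmpty()) {
            int[] e = pq.poll();               // smallest front record among all runs
            out.add(e[0]);
            int[] run = runs.get(e[1]);
            int nextPos = e[2] + 1;
            if (nextPos < run.length) {        // refill from the same run
                pq.add(new int[] {run[nextPos], e[1], nextPos});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<int[]> runs = Arrays.asList(
            new int[] {1, 4, 9}, new int[] {2, 3, 10}, new int[] {5, 6});
        System.out.println(merge(runs)); // prints [1, 2, 3, 4, 5, 6, 9, 10]
    }
}
```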

Multiway Merge (2)

  • In practice, use only one file and seek to the appropriate block.

Limits to Multiway Merge (1)

  • Assume working memory is \(b\) blocks in size.
  • How many runs can be processed at once?
  • The runs are \(2b\) blocks long (on average).
  • How big a file can be merged in one pass?
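
A rough version of the arithmetic behind these questions, assuming one block of working memory serves as the input buffer for each run and ignoring the space needed for an output buffer:

\[
\underbrace{b}_{\text{runs merged at once}} \times \underbrace{2b}_{\text{blocks per run}} = 2b^2 \ \text{blocks in one pass}
\]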

Limits to Multiway Merge (2)

  • Larger files will need more passes, but the run size grows rapidly!
  • This approach trades (\(\log b\)) (possibly) sequential passes for a single or very few random (block) access passes.

General Principles

  • A good external sorting algorithm will seek to do the following:
    • Make the initial runs as long as possible.
    • At all stages, overlap input, processing and output as much as possible.
    • Use as much working memory as possible. Applying more memory usually speeds processing.
    • If possible, use additional disk drives for more overlapping of processing with I/O, and allow for more sequential file processing.

Source: https://opendsa-server.cs.vt.edu/OpenDSA/Books/CS3slides/html/FileProc.html
