Tuesday, February 26, 2013

FILE ORGANIZATION IN DBMS



File Organization


  • File organization refers to the relationship of the key of the record to the physical location of that record in the computer file.
  • A file may be either a physical file or a logical file. A physical file is a physical unit, such as a magnetic tape or a disk.
  • A logical file, on the other hand, is a complete set of records for a specific application or purpose.
  • A logical file may occupy part of a physical file or may extend over more than one physical file.
  • Typical DBMS applications need only a small subset of the database at any given time.
  • When a portion of the data is needed, it must be located on disk, copied to memory for processing, and rewritten to disk if the data was modified.
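
The locate, copy, modify and rewrite cycle described above can be sketched in Python. This is only an illustration: the fixed-length record layout, the file name and the helper functions are assumptions made for the example, not part of any particular DBMS.

```python
import os
import struct
import tempfile

# Hypothetical fixed-length record: 4-byte integer key + 12-byte name field.
RECORD = struct.Struct("<i12s")

def write_records(path, rows):
    """Create the file by writing the records one after another."""
    with open(path, "wb") as f:
        for key, name in rows:
            f.write(RECORD.pack(key, name.encode("ascii")))

def update_name(path, slot, new_name):
    """Locate record number `slot`, copy it to memory, modify it, rewrite it."""
    with open(path, "r+b") as f:
        f.seek(slot * RECORD.size)                    # locate on disk
        key, _ = RECORD.unpack(f.read(RECORD.size))   # copy to memory
        f.seek(slot * RECORD.size)                    # go back to the same slot
        f.write(RECORD.pack(key, new_name.encode("ascii")))  # rewrite to disk

def read_record(path, slot):
    """Read one record and strip the padding from the name field."""
    with open(path, "rb") as f:
        f.seek(slot * RECORD.size)
        key, name = RECORD.unpack(f.read(RECORD.size))
    return key, name.rstrip(b"\x00").decode("ascii")

def demo():
    path = os.path.join(tempfile.mkdtemp(), "employees.dat")
    write_records(path, [(101, "Asha"), (102, "Ravi"), (103, "Meena")])
    update_name(path, 1, "Ravi K")   # only the second record is touched
    return [read_record(path, i) for i in range(3)]
```

Because every record has the same size, the position of record `slot` is simply `slot * RECORD.size`, so only that part of the file has to be read and rewritten.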

Advantages of File Organization


  1. Fast access to a single record or a collection of related records.
  2. Easy insertion, update and removal of records without disrupting the rest of the file.
  3. Storage efficiency.
  4. Redundancy as a safeguard against data corruption.

Types of File Organization

  1. Sequential File
  2. Indexed Sequential
  3. Direct file / Hash file


Sequential file

  • A sequential file maintains the records in the logical sequence of its primary key values.
  • A sequential file can be stored on devices like magnetic tape that allow sequential access.
  • In this organization records are written consecutively when the file is created. Records in a sequential file can be stored in two ways.

Pile file: Records are placed one after another as they arrive (no sorting of any kind).

Sorted file: Records are placed in ascending or descending order of the primary key values.
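
As a small illustration of the difference, the sketch below stores the same arriving records as a pile file and as a sorted file. The in-memory lists stand in for the file, and the function names are made up for this example.

```python
import bisect

def pile_insert(records, record):
    """Pile file: append the record in arrival order, with no sorting."""
    records.append(record)

def sorted_insert(records, record):
    """Sorted file: keep records in ascending order of the primary key."""
    bisect.insort(records, record)   # tuples compare on the key (first item) first

arrivals = [(30, "C"), (10, "A"), (20, "B")]   # (primary key, data)
pile, ordered = [], []
for rec in arrivals:
    pile_insert(pile, rec)
    sorted_insert(ordered, rec)
```

After the loop, `pile` holds the records in arrival order (30, 10, 20), while `ordered` holds them in key order (10, 20, 30).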

File reorganization: During file reorganization, all records that are marked as deleted are physically removed, and all inserted records are moved to their correct place (sorting).

File reorganization steps are:

  1. Read the entire file (all blocks) into RAM.
  2. Remove all the deleted records.
  3. Write all the modified blocks to a different place on the disk.
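
The reorganization steps can be sketched as follows. This is a simplified model, assuming each block is a Python list of `(key, data, deleted)` tuples and that returning a new list of blocks stands in for writing to a different place on the disk.

```python
def reorganize(blocks, block_size=2):
    """Return a new, compacted and sorted list of blocks."""
    # 1. Read the entire file (all blocks) into memory.
    records = [rec for block in blocks for rec in block]
    # 2. Remove all records that are marked as deleted.
    live = [rec for rec in records if not rec[2]]
    # Move the inserted records to their correct place by sorting on the key.
    live.sort(key=lambda rec: rec[0])
    # 3. Build fresh blocks; the old blocks are left untouched.
    return [live[i:i + block_size] for i in range(0, len(live), block_size)]

old_blocks = [
    [(30, "C", False), (10, "A", True)],   # record 10 is marked deleted
    [(20, "B", False), (5, "E", False)],   # record 5 was appended later
]
new_blocks = reorganize(old_blocks)
```

Here `new_blocks` contains records 5, 20 and 30 in key order, and the deleted record 10 is gone.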

1. Inserting a record: To insert a record, place it at the end of the file. There is no need to sort the file into ascending or descending order.

2. Deleting or modifying a record: This requires fetching the block that contains the record, finding the record in the block, marking it as deleted, and then writing the modified block back to the disk. Total time required: T = T_F + 2r, where T_F is the time to fetch the block and 2r is the time to write the modified block back.

3. Sorted sequential file: In a sorted file, a new record is first inserted at the end of the file and then moved to its correct location (ascending or descending order), so that records are stored in order of the values of the key field.
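
Operations 1 and 2 on a pile file can be sketched as below. The block size, the dictionary layout of a record and the function names are assumptions chosen for illustration only.

```python
BLOCK_SIZE = 2   # hypothetical number of records per block

def insert_record(blocks, key, data):
    """Insert: place the record at the end of the file, with no sorting."""
    if not blocks or len(blocks[-1]) == BLOCK_SIZE:
        blocks.append([])            # last block is full, start a new one
    blocks[-1].append({"key": key, "data": data, "deleted": False})

def find_record(blocks, key):
    """Fetch blocks one by one until a live record with this key is found."""
    for block_no, block in enumerate(blocks):
        for slot, rec in enumerate(block):
            if rec["key"] == key and not rec["deleted"]:
                return block_no, slot
    return None

def delete_record(blocks, key):
    """Delete: find the record in its block and just mark it as deleted."""
    found = find_record(blocks, key)
    if found is None:
        return False
    block_no, slot = found
    blocks[block_no][slot]["deleted"] = True   # the record stays in the block
    return True

blocks = []
for key, data in [(30, "C"), (20, "B"), (10, "A")]:
    insert_record(blocks, key, data)
deleted = delete_record(blocks, 20)
```

The deleted record still occupies space in its block until the file is reorganized; it is only skipped by later searches.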


Advantages of Sequential File Organisation

     
1. Good for report generation, statistical computation and inventory control.
2. Fast and efficient when dealing with large volumes of data that need to be processed periodically (batch systems).
3. Simple file design.
4. Very efficient when most of the records must be processed, e.g. payroll.
5. Very efficient if the data has a natural order.
6. Can be stored on inexpensive devices like magnetic tape.



Disadvantages of Sequential File Organisation

• All new transactions must be sorted into the proper sequence for sequential access processing.

• Locating, storing, modifying, deleting or adding records requires rearranging the file.

• This method is too slow to handle applications that require immediate updates or responses.


Direct File Organization

1. Records are read directly from or written directly to the file.
2. The records are stored at known addresses.
3. The address is calculated by applying a mathematical function to the key field.
4. Such files are created using a hashing function, so this is called hashing organization and the files are called hashed files.
5. Files of this type are stored on direct access storage devices such as magnetic disks, using an identifying key.
6. The identifying key relates to the record's actual storage position in the file.
7. The computer can use the key to locate the desired record directly, without having to search through any other record first.
8. The records are stored randomly, hence the name random file.
9. It is used in online systems, where responses and updates must be fast.

Example:
Any information retrieval system, e.g. a train timetable system.
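
A minimal sketch of the idea, using the train timetable example: the storage address is calculated from the key with a simple mathematical function. The modulo function, the number of buckets, the train numbers and the choice to keep colliding keys in the same bucket are all assumptions made for this illustration.

```python
NUM_BUCKETS = 7   # hypothetical number of storage addresses

def address(key):
    """Mathematical function applied to the key field (here: key modulo 7)."""
    return key % NUM_BUCKETS

class DirectFile:
    def __init__(self):
        # One bucket per address; a bucket may hold several colliding keys.
        self.buckets = [[] for _ in range(NUM_BUCKETS)]

    def store(self, key, data):
        """Write the record at the address computed from its key."""
        bucket = self.buckets[address(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                 # existing record: amend it in place
                bucket[i] = (key, data)
                return
        bucket.append((key, data))

    def fetch(self, key):
        """Go straight to the computed address; no other bucket is searched."""
        for k, data in self.buckets[address(key)]:
            if k == key:
                return data
        return None

# Made-up train numbers and departure times.
timetable = DirectFile()
timetable.store(12001, "06:00")
timetable.store(12002, "07:15")
timetable.store(12008, "09:30")   # same address as 12001 (a collision)
timetable.store(12002, "07:45")   # update of an existing record
```

A lookup such as `timetable.fetch(12008)` only inspects the bucket at `address(12008)`, which is why no sorting of transactions is needed.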

Advantages of Direct File Organization



1. Records can be accessed immediately for updating.
2. Several files can be updated simultaneously during transaction processing.
3. Transactions need not be sorted.
4. Existing records can be amended or modified.
5. Most suitable for interactive online applications. Very easy to handle random enquiries.

Disadvantages of Direct File Organization

1. Data may be accidentally erased or over written unless special precautions are taken.
2. Risk of loss of accuracy and breach of security. 
3. Special backup and reconstruction procedures must be established.
4. Less efficient use of storage space.
5. Expensive hardware and software are required.
6. High complexity in programming.
7. Updating the file is more difficult than with the sequential method.

Indexed File Organization

1. An indexed file contains records ordered by a record key.
2. Each record contains a field that contains the record key.
3. The record key uniquely identifies the record and determines the sequence in which it is accessed with respect to other records.
4. A record key for a record might be, for example, an employee number or an invoice
number.
5. An indexed file can also use alternate indexes, that is, record keys that let you access the file using a different logical arrangement of the records.
6. For example, you could access the file through employee department rather than
through employee number.
7. The record transmission (access) modes allowed for indexed files are sequential,
random, or dynamic. When indexed files are read or written sequentially, the sequence 
is that of the key values.
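
The sketch below illustrates a primary index, an alternate index, and the sequential and random access modes. It is a simplified in-memory model: the employee data is invented, and keeping the records in arrival order with separate index dictionaries is an assumption of this example rather than how any specific system lays out an indexed file.

```python
# Data area: records in arrival order (invented sample data).
records = [
    {"emp_no": 103, "dept": "Sales", "name": "Meena"},
    {"emp_no": 101, "dept": "IT", "name": "Asha"},
    {"emp_no": 102, "dept": "Sales", "name": "Ravi"},
]

# Primary index: record key (employee number) -> position of the record.
primary_index = {rec["emp_no"]: pos for pos, rec in enumerate(records)}

# Alternate index: department -> positions of all records in that department.
dept_index = {}
for pos, rec in enumerate(records):
    dept_index.setdefault(rec["dept"], []).append(pos)

def read_random(emp_no):
    """Random access: go straight to one record through the primary index."""
    return records[primary_index[emp_no]]

def read_sequential():
    """Sequential access: the sequence is that of the key values."""
    return [records[primary_index[key]] for key in sorted(primary_index)]

def read_by_dept(dept):
    """Access through the alternate index instead of the employee number."""
    return [records[pos] for pos in dept_index.get(dept, [])]

names_in_key_order = [rec["name"] for rec in read_sequential()]
```

Dynamic access, as described in the list, would simply mix the two modes: locate a starting record with `read_random` and continue in key order from there.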

Advantages of Indexed File Organization

1.     Quite easy to process.
2.     With proper selection of a key field, records in a large file can be searched and accessed very quickly.
3.     Any field of the records can be used as the key. The key field can be numerical or alphanumerical.

Disadvantages Of Indexed Files

1.     Extra data structures have to be maintained. These extra data structures, kept on the disk, can use up much disk space, especially for long key values.
2.     Indexed files have to be reorganized from time to time to get rid of deleted records and to restore performance, which gradually degrades as new records are added.


























