Master file table (MFT)
Clusters are the key element of allocation:
- Logically, the disk consists of allocation units called
clusters.
- A cluster is a power-of-two multiple of the physical disk block
size. The cluster size is set when the disk is formatted. A small cluster
provides a finer granularity of allocation, but may require more space to
describe the file and more separate operations to transfer data to or from
memory.
- The free list is a bitmap, each of whose bits describes one
cluster.
- Clusters on the disk are numbered from zero to the
number of clusters minus one. These numbers are called logical cluster
numbers (LCNs) and are used to name blocks (clusters) on disk.
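The points above can be sketched in a few lines of Python. This is only an illustrative model, not the on-disk NTFS format: the class name `ClusterBitmap`, the helper `clusters_needed`, and the convention that a set bit means "allocated" are all assumptions made for the example.

```python
class ClusterBitmap:
    """Free list modeled as a bitmap: bit i describes the cluster with LCN i.

    Assumed convention for this sketch: 1 = allocated, 0 = free.
    """

    def __init__(self, total_clusters):
        self.total = total_clusters
        # One bit per cluster, rounded up to whole bytes; all clusters start free.
        self.bits = bytearray((total_clusters + 7) // 8)

    def is_allocated(self, lcn):
        return bool(self.bits[lcn // 8] & (1 << (lcn % 8)))

    def allocate(self, lcn):
        self.bits[lcn // 8] |= 1 << (lcn % 8)

    def free(self, lcn):
        self.bits[lcn // 8] &= ~(1 << (lcn % 8)) & 0xFF

    def find_free_run(self, length):
        """Return the first LCN of a run of `length` free clusters, or None."""
        run_start, run_len = 0, 0
        for lcn in range(self.total):
            if self.is_allocated(lcn):
                run_start, run_len = lcn + 1, 0
            else:
                run_len += 1
                if run_len == length:
                    return run_start
        return None


def clusters_needed(file_size, cluster_size):
    """Number of clusters needed to hold file_size bytes, rounded up."""
    return (file_size + cluster_size - 1) // cluster_size
```

With these helpers, a 5000-byte file needs 2 clusters of 4096 bytes (3192 bytes unused) but 10 clusters of 512 bytes (120 bytes unused), which is the granularity trade-off described above: smaller clusters waste less space but leave more clusters to describe and transfer.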
The MFT is the major, and in some ways the only,
data structure on disk:
- All files, and therefore all objects stored on disk, are
described by the MFT.
- All files are logically stored in the MFT and, for small files,
are physically within the bounds of the MFT. In this sense, the MFT is the filesystem.
- The MFT logically can be described as a table with one row per
file.
- The first rows in the table describe important configuration
files, including files for the MFT itself.
MFT entries
As stated previously, each row or entry in the MFT
(called a record) describes a file and logically contains the file. In the case
of small files, the entry actually contains the contents of the file.
Each entry consists of (attribute, value) pairs.
While the conceptual design of NTFS is such that this set of pairs is extensible
to include user-defined attributes, current versions of NTFS have a fixed set.
The main attributes are:
- Standard information: This attribute includes the information
that was standard in the MS-DOS world:
  - read/write permissions,
  - creation time,
  - last modification time,
  - count of how many directories point to this file (hard link count).
- File Name: This attribute describes the file's name in the
Unicode character set. Multiple file names are possible, such as when:
  - the file has multiple links, or
  - the file has an MS-DOS short name.
- Security Descriptor: This attribute lists which user owns the
file and which users can access it (and how they can access it).
- Data: This attribute either contains the actual file data in the
case of a small file or points to the data (or points to the objects that point
to the data) in the case of larger files.
For small files, this design is extremely
efficient. By looking no further than the MFT entry, you have the complete
contents of the file.
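As a rough illustration of an entry as a set of (attribute, value) pairs, the following Python sketch models a record as a dictionary. Everything here is hypothetical: the key names, the `MAX_RESIDENT_BYTES` threshold, the `resident` flag, and the owner name are assumptions chosen for the example and do not reflect the actual on-disk layout.

```python
# Hypothetical limit on how much file data fits inside one MFT entry.
MAX_RESIDENT_BYTES = 700


def make_record(name, data, links=1, owner="alice"):
    """Build a toy MFT entry as (attribute, value) pairs."""
    record = {
        "standard_information": {"read_only": False, "hard_links": links},
        "file_name": [name],  # a list, since a file may have several names
        "security_descriptor": {"owner": owner},
    }
    if len(data) <= MAX_RESIDENT_BYTES:
        # Small file: the Data attribute holds the contents directly.
        record["data"] = {"resident": True, "contents": data}
    else:
        # Larger file: the Data attribute would hold pointers instead.
        record["data"] = {"resident": False, "extents": []}
    return record


def read_small_file(record):
    """Return file contents using only the MFT entry, if they are stored in it."""
    data = record["data"]
    if not data["resident"]:
        raise ValueError("data is stored outside the MFT entry")
    return data["contents"]
```

Reading a small file in this model touches nothing but the record itself, which is the efficiency argument made above.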
However, the Data attribute gets interesting when the
data contained in the file is larger than an MFT entry. When dealing with large
data, the Data attribute contains pointers to the data, rather than the data
itself.
- The pointers to data are actually pointers to sequences of
logical clusters on the disk.
- Each sequence is identified by three parts:
  - starting cluster in the file, called the virtual cluster number (VCN),
  - starting logical cluster (LCN) of the sequence on disk,
  - length, counted as the number of clusters.
- The run of clusters is called an extent, following the
terminology developed by IBM in the 1960s.
- NTFS allocates new extents as necessary. When there is no more
space left in the MFT entry, then another MFT entry is allocated. This design is
effectively a list of extents, rather than the Unix or DEMOS tree of extents.
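The (VCN, LCN, length) triples above can be sketched as a simple list lookup. This is a minimal model under stated assumptions: the names `Extent`, `vcn_to_lcn`, and `byte_offset_to_disk` are invented for the example, and the extent list is kept as a plain Python list rather than in any real on-disk encoding.

```python
from typing import NamedTuple


class Extent(NamedTuple):
    vcn: int     # starting cluster within the file (virtual cluster number)
    lcn: int     # starting cluster on disk (logical cluster number)
    length: int  # number of clusters in the run


def vcn_to_lcn(extents, vcn):
    """Walk the extent list and translate a file-relative cluster to a disk cluster."""
    for ext in extents:
        if ext.vcn <= vcn < ext.vcn + ext.length:
            return ext.lcn + (vcn - ext.vcn)
    return None  # this VCN is not covered by any extent


def byte_offset_to_disk(extents, offset, cluster_size):
    """Translate a byte offset in the file to a byte offset on disk, or None."""
    vcn, within = divmod(offset, cluster_size)
    lcn = vcn_to_lcn(extents, vcn)
    if lcn is None:
        return None
    return lcn * cluster_size + within


# Example file made of three extents covering VCNs 0-3, 4-11, and 12-13.
example = [Extent(0, 1000, 4), Extent(4, 2500, 8), Extent(12, 300, 2)]
```

In this example, cluster 5 of the file falls in the second extent, one cluster past its start, so it maps to LCN 2501. Because the extents form a flat list, the lookup is a linear scan, which is the cost of the list-of-extents design noted above.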