Tuesday, 3 August 2021

Linux fiemap internals

In this post, let's try to find the physical blocks of a file, given the struct inode * of that file.

Every file has an inode. There is an on-disk inode and an in-memory struct inode. The on-disk inode is stored on the disk, so it persists across reboots and records where the file's data lives on disk, along with other details of the file.

The on-disk inode has a different layout for each filesystem:

For ext4, it is defined in ext4.h and looks like this:
/*
 * Structure of an inode on the disk
 */
struct ext4_inode {
..
..
};

This structure holds the mapping from file blocks to disk blocks.
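
To make the mapping a bit more concrete: on ext4 filesystems that use extents, the i_block area of the on-disk inode holds the root of an extent tree, and each leaf entry maps a run of file blocks to a run of disk blocks. Below is a minimal user-space sketch that decodes one such 12-byte leaf entry. The field offsets follow the kernel's struct ext4_extent as I understand it; the names decoded_extent and decode_extent are made up for illustration and are not kernel APIs.

```c
#include <stdint.h>

/* Read little-endian values byte by byte, so host endianness does not matter. */
static uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical decoded form of one extent leaf entry. */
struct decoded_extent {
    uint32_t lblock;    /* first file (logical) block covered */
    uint32_t len;       /* number of blocks covered */
    uint64_t pblock;    /* first disk (physical) block */
    int      unwritten; /* non-zero if the extent is allocated but unwritten */
};

/*
 * Assumed on-disk layout (all little-endian):
 *   bytes 0-3   ee_block     first logical block
 *   bytes 4-5   ee_len       length; values above 32768 mark an unwritten extent
 *   bytes 6-7   ee_start_hi  high 16 bits of the physical block
 *   bytes 8-11  ee_start_lo  low 32 bits of the physical block
 */
static struct decoded_extent decode_extent(const uint8_t raw[12])
{
    struct decoded_extent d;
    uint16_t ee_len = get_le16(raw + 4);

    d.lblock = get_le32(raw);
    d.unwritten = ee_len > 32768;
    d.len = d.unwritten ? (uint32_t)ee_len - 32768 : ee_len;
    d.pblock = ((uint64_t)get_le16(raw + 6) << 32) | get_le32(raw + 8);
    return d;
}
```

Multiplying pblock by the filesystem block size gives a byte offset on the device, which is the same kind of value fiemap reports.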

For filesystems that support it, we can call the FS_IOC_FIEMAP ioctl to get the file's extent map.
A C program that calls this ioctl is available here:
https://github.com/shekkbuilder/fiemap/blob/master/fiemap.c

The user-space C code looks like this:
/* Find out how many extents there are */
if (ioctl(fd, FS_IOC_FIEMAP, fiemap) < 0) {
        fprintf(stderr, "fiemap ioctl() failed\n");
        return NULL;
}
..
..
if (ioctl(fd, FS_IOC_FIEMAP, fiemap) < 0) {
        fprintf(stderr, "fiemap ioctl() failed\n");
        return NULL;
}
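
For readers who want something they can compile on its own, here is a minimal sketch of a helper built around the same ioctl. The function name fiemap_get_extents and its return convention (extent count, or a negative errno value) are my own choices for illustration, and so is the use of FIEMAP_FLAG_SYNC; the linked program may be structured differently.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fiemap.h>
#include <linux/fs.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy up to max_extents extents of the file at
 * path into out. With max_extents == 0 nothing is copied and only the
 * number of extents is returned. Returns a negative errno on failure.
 */
static int fiemap_get_extents(const char *path, struct fiemap_extent *out,
                              unsigned int max_extents)
{
    struct fiemap *fm;
    size_t size;
    int fd, ret;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -errno;

    /* The header and the extent array live in one contiguous buffer. */
    size = sizeof(*fm) + (size_t)max_extents * sizeof(struct fiemap_extent);
    fm = calloc(1, size);
    if (!fm) {
        close(fd);
        return -ENOMEM;
    }

    fm->fm_start = 0;
    fm->fm_length = FIEMAP_MAX_OFFSET;  /* map the whole file */
    fm->fm_flags = FIEMAP_FLAG_SYNC;    /* flush dirty data before mapping */
    fm->fm_extent_count = max_extents;  /* 0 means "just count the extents" */

    if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
        ret = -errno;
    } else {
        ret = (int)fm->fm_mapped_extents;
        if (max_extents > 0)
            memcpy(out, fm->fm_extents,
                   (size_t)ret * sizeof(struct fiemap_extent));
    }

    free(fm);
    close(fd);
    return ret;
}
```

A caller would typically call it once with max_extents set to 0 to learn the extent count, then again with a buffer of that size. Each returned extent carries fe_logical, fe_physical and fe_length, all in bytes.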

The kernel handler invoked for this ioctl is:

static int ioctl_fiemap(struct file *filp, struct fiemap __user *ufiemap)
{
        ..
        error = inode->i_op->fiemap(inode, &fieinfo, fiemap.fm_start,
                                    fiemap.fm_length);
        ..
}

This in turn calls the filesystem-specific fiemap handler through inode->i_op->fiemap.
For ext4 it is:
int ext4_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
u64 start, u64 len)
{
int error = 0;

For Btrfs it is:
static int btrfs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
__u64 start, __u64 len)
{

For XFS it is :
STATIC int
xfs_vn_fiemap(
struct inode *inode,
struct fiemap_extent_info *fieinfo,
u64 start,
u64 length)
{
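
The dispatch itself is just a call through a function pointer stored in the inode's operations table. The toy user-space model below shows the idea, including the case where a filesystem provides no fiemap handler, in which case the kernel's ioctl path returns -EOPNOTSUPP as far as I know. Every name starting with toy_ is invented for this sketch; none of them are kernel symbols.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct toy_inode;

/* Toy stand-in for the fiemap slot of struct inode_operations. */
struct toy_inode_ops {
    int (*fiemap)(struct toy_inode *inode, uint64_t start, uint64_t len);
};

struct toy_inode {
    const struct toy_inode_ops *i_op; /* set by the "filesystem" */
    int handler_calls;                /* counts handler invocations */
};

/* Pretend ext4 handler: a real one would walk the extent tree. */
static int toy_ext4_fiemap(struct toy_inode *inode, uint64_t start, uint64_t len)
{
    (void)start;
    (void)len;
    inode->handler_calls++;
    return 0;
}

static const struct toy_inode_ops toy_ext4_ops = { .fiemap = toy_ext4_fiemap };
static const struct toy_inode_ops toy_plain_ops = { .fiemap = NULL };

/* Generic entry point: check that the handler exists, then dispatch. */
static int toy_do_fiemap(struct toy_inode *inode, uint64_t start, uint64_t len)
{
    if (!inode->i_op->fiemap)
        return -EOPNOTSUPP;
    return inode->i_op->fiemap(inode, start, len);
}
```

The generic code never needs to know which filesystem it is talking to; it only follows the pointer that the filesystem installed when it set up the inode.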


The kernel documentation describes fiemap here:
Documentation/filesystems/fiemap.rst

So, can we print the file map from inside the kernel? The answer is yes: we can do it by calling the inode's fiemap function directly.

Let's look at an implementation that fetches this information from within the kernel:
u64 traverse_extent(struct inode *inode)
{
        struct fiemap_extent_info fieinfo = { 0, };
        struct fiemap_extent *ext1;
        u64 sector_num = 0;
        u64 start = 0;
        u64 len = 4096;         /* map only the first 4 KiB of the file */
        int ret;

        /* Not every filesystem implements the fiemap inode operation. */
        if (!inode->i_op->fiemap)
                return 0;

        printk("traverse_extent fiemap present = %ps\n", inode->i_op->fiemap);

        ext1 = kzalloc(sizeof(struct fiemap_extent), GFP_KERNEL);
        if (!ext1)
                return 0;

        /*
         * Note: fi_extents_start is declared __user and is filled in by
         * fiemap_fill_next_extent() using copy_to_user(), so passing a
         * kernel buffer like this may fail with -EFAULT, depending on
         * the kernel version.
         */
        fieinfo.fi_extents_start = ext1;
        fieinfo.fi_extents_max = 1;

        ret = inode->i_op->fiemap(inode, &fieinfo, start, len);
        if (ret == 0 && fieinfo.fi_extents_mapped > 0) {
                printk("after dest logical = %llu\n", ext1->fe_logical);
                printk("after dest physical = %llu\n", ext1->fe_physical);
                /* fe_physical is a byte offset; convert it to 512-byte sectors. */
                sector_num = ext1->fe_physical >> 9;
        }

        kfree(ext1);
        return sector_num;
}

This is very loosely written code, but with some more time it can be cleaned up and extended.
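
One thing worth tightening is the byte-to-sector conversion, because fe_physical is not always a usable device offset. The fiemap documentation describes flags such as FIEMAP_EXTENT_UNKNOWN (location not known yet, for example delayed allocation), FIEMAP_EXTENT_ENCODED (data cannot be read straight off the block device) and FIEMAP_EXTENT_NOT_ALIGNED (for example inline data). Here is a small sketch of a helper that refuses to convert such extents. The name extent_start_sector and the choice of which flags to reject are my assumptions, and it is written against the user-space header so it can be tried outside the kernel.

```c
#include <linux/fiemap.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert an extent's physical byte offset to a
 * 512-byte sector number. Returns 0 on success and stores the result
 * in *sector, or -1 if the extent's flags say the physical offset
 * should not be used as a plain device address.
 */
static int extent_start_sector(const struct fiemap_extent *ext, uint64_t *sector)
{
    const uint32_t unusable = FIEMAP_EXTENT_UNKNOWN |
                              FIEMAP_EXTENT_ENCODED |
                              FIEMAP_EXTENT_NOT_ALIGNED;

    if (ext->fe_flags & unusable)
        return -1;

    *sector = ext->fe_physical >> 9; /* 512 bytes per sector */
    return 0;
}
```

Even when the conversion succeeds, the offset is relative to whatever address space the filesystem reports in, which on multi-device filesystems is not necessarily a single disk.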

Happy Hacking 😉😉
