Inspecting Filesystem Using SPL

The Standard PHP Library (SPL) is a collection of interfaces and classes that are meant to solve common problems.

SPL was introduced in 2004 with PHP 5.0.0.

SPL doesn’t require any additional libraries; it comes by default with the PHP installation.

In this post, I’ll be showing you some iterators that are used to deal with the filesystem.

File Handling

SPL provides three classes for file handling:

  • SplFileInfo: file information, such as size, pathname, real path, etc.
  • SplFileObject: an object-oriented interface for a file.
  • SplTempFileObject: an object-oriented interface for a temporary file.

Let’s dive into these three classes.

As I mentioned, SplFileObject provides an object-oriented interface for a file. Instead of calling the fopen(), fgets(), and feof() functions, you use the SplFileObject methods:

// Writing to a file
$file = new SplFileObject('myfile', 'w+');
$file->fwrite('Hello World');

Let me show you another example.

Create a new text file named php.txt with the following contents:

Hello World
I love PHP
PHP is amazing
SPL is great


PHP is the most used server-side programming language.

SplFileObject extends SplFileInfo and implements the Iterator interface, which means that you can iterate over the lines of the file using foreach:

$file = new SplFileObject('php.txt');
foreach ($file as $item) {
    echo $item;
}
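A plain foreach also returns blank lines and keeps the trailing line endings. A minimal sketch of one way around that, using the SplFileObject flag constants; the scratch file and its contents are assumptions made only for this illustration:

```php
<?php
// Hypothetical scratch file; the article itself uses php.txt.
$path = tempnam(sys_get_temp_dir(), 'spl');
file_put_contents($path, "Hello World\nI love PHP\n\nPHP is amazing\n");

$file = new SplFileObject($path);
// Skip blank lines and strip the line ending from each returned line.
$file->setFlags(
    SplFileObject::READ_AHEAD | SplFileObject::SKIP_EMPTY | SplFileObject::DROP_NEW_LINE
);

$lines = [];
foreach ($file as $line) {
    $lines[] = $line;
}

$file = null;   // release the file handle before deleting the file
unlink($path);

print_r($lines);
```

With these flags set, the blank line in the middle of the file should not show up in $lines.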

Since SplFileObject extends SplFileInfo, it also inherits many useful methods such as getMTime(), getPath(), getSize(), etc. Note that they are called on the file object, not on the line returned by the loop:

echo $file->getSize();

Refer to the PHP documentation for the full list of SplFileInfo methods.

SplFileInfo can also be instantiated directly from a file path:

$info = new SplFileInfo('example.php');
if ($info->isFile()) {
    echo $info->getRealPath();
}
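To give a feel for what SplFileInfo reports, here is a small sketch that collects a few of its getters into an array. The scratch file name and the $details variable are hypothetical and exist only for this example:

```php
<?php
// Hypothetical scratch file used only for this illustration.
$path = sys_get_temp_dir() . '/spl_info_' . uniqid() . '.txt';
file_put_contents($path, 'Hello World'); // 11 bytes

$info = new SplFileInfo($path);

$details = [
    'filename'  => $info->getFilename(),        // name including the extension
    'basename'  => $info->getBasename('.txt'),  // name without the given suffix
    'extension' => $info->getExtension(),
    'size'      => $info->getSize(),            // size in bytes
    'isFile'    => $info->isFile(),
    'isDir'     => $info->isDir(),
];

unlink($path);

print_r($details);
```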

You may want to use the SplTempFileObject to create a memory-based temporary file:

$temp = new SplTempFileObject();
$temp->fwrite('Hello World');

var_dump($temp->getPathname()); // php://temp

As you can see, the temporary file is stored in memory and not on disk. That’s not always the case, though, as I’ll discuss below.

You may be wondering why the file is kept in memory rather than on the filesystem.

Memory is much faster than the filesystem.

Let’s see an example. Imagine that you’re building a CSV file and sending it back to the end user so she can download it:

$file = new SplTempFileObject();
// csv processing here
$file->rewind();
header('Content-Type: text/csv');
header('Content-Disposition: attachment; filename=mycsvfile.csv');
$file->fpassthru();
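The "csv processing" step could look something like the sketch below. The $rows data is made up, and the output is captured with output buffering instead of being sent with headers, purely so the result can be inspected:

```php
<?php
// Hypothetical rows standing in for the real CSV processing.
$rows = [
    ['id', 'language'],
    [1, 'PHP'],
    [2, 'SPL'],
];

$file = new SplTempFileObject();
foreach ($rows as $row) {
    // Separator, enclosure and escape are passed explicitly rather than relying on defaults.
    $file->fputcsv($row, ',', '"', '\\');
}

// Go back to the start so fpassthru() outputs everything that was written.
$file->rewind();

// In a real response the header() calls would go here.
ob_start();
$file->fpassthru();
$csv = ob_get_clean();

echo $csv;
```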

Read more about parsing the csv files in PHP

If the data exceeds the maximum memory value passed to the constructor (2 MB by default), it is moved to a file in the system’s temp directory:

$temp = new SplTempFileObject(10); // 10 bytes is the max size
$temp->fwrite("This is the first line\n");
$temp->fwrite("And this is the second.\n");

Even after the data has been moved to the temp directory, there’s no way to get its real path on disk: getPathname() keeps returning a php://temp stream identifier. Weird, isn’t it?
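A short sketch of that behaviour: the reported path stays a php://temp identifier, yet the data can still be read back as usual. The $path and $content variables are only there for illustration:

```php
<?php
$temp = new SplTempFileObject(10); // spill to disk once more than 10 bytes are written
$temp->fwrite("This is the first line\n");
$temp->fwrite("And this is the second.\n");

// Still a php://temp stream identifier, not a path on disk.
$path = $temp->getPathname();

// Reading works the same way wherever the data actually lives.
$content = implode('', iterator_to_array($temp, false));

echo $path, PHP_EOL, $content;
```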

You may use the tmpfile() function to create a disk-based temp file and then retrieve its path with stream_get_meta_data():

$file = tmpfile();
$path = stream_get_meta_data($file)['uri']; // eg: /tmp/phpFx0513a
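Keep in mind that a file created by tmpfile() is removed automatically when its handle is closed. A minimal sketch that checks this; the two $exists… variables are hypothetical names used only here:

```php
<?php
$file = tmpfile();
fwrite($file, 'Hello World');

$path = stream_get_meta_data($file)['uri'];
$existsWhileOpen = file_exists($path);

fclose($file);      // closing the handle removes the temporary file
clearstatcache();   // make sure file_exists() does not use a cached result
$existsAfterClose = file_exists($path);

var_dump($existsWhileOpen, $existsAfterClose);
```

If the file has to outlive the handle, copy it somewhere else before closing.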

DirectoryIterator

As its name implies, DirectoryIterator traverses the given directory:

$files = new DirectoryIterator('/Users/ahmad');
echo '<ul>';
foreach ($files as $file) {
    echo '<li>'.$file.'</li>';
}
echo '</ul>';

By default, DirectoryIterator includes the . and .. entries when listing the files. You may use the isDot() method to skip them while traversing:

$dir = new DirectoryIterator('/Users/ahmad');
/** @var DirectoryIterator $item */
foreach ($dir as $key => $item) {
    if ($item->isDot()) {
        continue; // skip the . and .. entries
    }
    echo '<li>'.$item.'</li>';
}

You may view the available methods of DirectoryIterator either by calling get_class_methods() or by checking the PHP documentation:

print_r(get_class_methods(DirectoryIterator::class));

Sometimes you need to collect the items into a new array, for example to filter or sort them:

$files = [];
foreach ($dir as $key => $item) {
    $files[] = $item;
}
echo $files[0]->getFilename();

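Because DirectoryIterator returns itself as the current item, every element stored in the array is the same object, and once the loop finishes that object no longer points at a valid entry. The last line therefore prints nothing. To fix it, clone each item: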

$files = [];
foreach ($dir as $key => $item) {
    $files[] = clone $item;
}
echo $files[0]->getFilename();
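The difference can be checked with a small experiment. This is only a sketch: the scratch directory, the file names, and the $shared and $cloned arrays are all made up for the illustration:

```php
<?php
// Hypothetical scratch directory with two files.
$dirPath = sys_get_temp_dir() . '/spl_dir_' . uniqid();
mkdir($dirPath);
touch($dirPath . '/a.txt');
touch($dirPath . '/b.txt');

$shared = [];
$cloned = [];
foreach (new DirectoryIterator($dirPath) as $item) {
    if ($item->isDot()) {
        continue;
    }
    $shared[] = $item;        // the iterator object itself, stored again and again
    $cloned[] = clone $item;  // an independent copy that keeps its own position
}

// Both "shared" entries are one and the same object.
$sameObject = $shared[0] === $shared[1];

$names = array_map(fn ($file) => $file->getFilename(), $cloned);
sort($names);

unlink($dirPath . '/a.txt');
unlink($dirPath . '/b.txt');
rmdir($dirPath);

var_dump($sameObject);
print_r($names);
```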

FilesystemIterator

The FilesystemIterator is an enhanced version of the DirectoryIterator.

In fact, FilesystemIterator extends the DirectoryIterator and adds a few more features:

  • Flags: options that control the keys, the values, and whether the dot entries are skipped.
  • Returns each item as an SplFileInfo object instead of the DirectoryIterator itself.
  • Uses the file’s full path as the key by default.

Let’s see a few examples:

// Skipping the dots by using the SKIP_DOTS flag
$files = new FilesystemIterator('/Users/ahmad', FilesystemIterator::SKIP_DOTS);
foreach ($files as $key => $item) {
    var_dump($key); // $key is the full path name
}

You may want to use the filename instead of the full path as the key:

// Use | to add more flags
$files = new FilesystemIterator($dirName, FilesystemIterator::SKIP_DOTS | FilesystemIterator::KEY_AS_FILENAME);
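With KEY_AS_FILENAME the iterator converts neatly into an array indexed by file name. A minimal sketch, assuming a scratch directory with two empty files created just for this example:

```php
<?php
// Hypothetical scratch directory used only for this illustration.
$dirName = sys_get_temp_dir() . '/spl_fs_' . uniqid();
mkdir($dirName);
touch($dirName . '/a.txt');
touch($dirName . '/b.pdf');

$files = new FilesystemIterator(
    $dirName,
    FilesystemIterator::SKIP_DOTS | FilesystemIterator::KEY_AS_FILENAME
);

// Keys are file names, values are SplFileInfo objects.
$byName = iterator_to_array($files);

$keys = array_keys($byName);
sort($keys);
$isFileInfo = $byName['a.txt'] instanceof SplFileInfo;

unlink($dirName . '/a.txt');
unlink($dirName . '/b.pdf');
rmdir($dirName);

print_r($keys);
```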

If you dump the $item you can see that it’s of type SplFileInfo and not DirectoryIterator:

foreach ($files as $item) {
    var_dump($item); // SplFileInfo
}
// Listing the full SplFileInfo methods
print_r(get_class_methods(SplFileInfo::class));

Unlike with DirectoryIterator, you don’t need to clone the item before storing it in an array:

$dir = new FilesystemIterator('/Users/ahmad');
$files = [];
foreach ($dir as $key => $file) {
    $files[] = $file; // each $file is a separate SplFileInfo object
}
var_dump($files[0]->getSize());

I highly encourage you to use the FilesystemIterator instead of DirectoryIterator.

RecursiveDirectoryIterator

You may use the RecursiveDirectoryIterator to get all the files/directories recursively.

RecursiveDirectoryIterator extends FilesystemIterator and implements the RecursiveIterator interface:

$files = new RecursiveDirectoryIterator('/Users/ahmad');
foreach ($files as $item) {
    echo $item.'<br>';
}

As you can see, this only lists the top-level entries, much as FilesystemIterator would; nothing is traversed recursively yet.

To get all the files and directories recursively, you have to pass the RecursiveDirectoryIterator to a RecursiveIteratorIterator. It looks a bit odd, but it makes sense: RecursiveDirectoryIterator only knows how to descend into a child directory, while RecursiveIteratorIterator does the actual walk through all the children. Let’s see how it works:

$files = new RecursiveDirectoryIterator('/Users/ahmad');
$files = new RecursiveIteratorIterator($files);
foreach ($files as $file) {
    echo $file.PHP_EOL;
}

You may use the setMaxDepth() method to limit how deep the traversal goes. The default value is -1, which means no limit; a value of 0 restricts the traversal to the top-level directory.
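Here is a rough sketch of how the depth limit behaves on a small tree. The directory layout and the $listFiles helper are hypothetical; SKIP_DOTS is passed so the . and .. entries stay out of the results:

```php
<?php
// Hypothetical tree: top.txt, sub/inner.txt, sub/deep/deepest.txt
$root = sys_get_temp_dir() . '/spl_depth_' . uniqid();
mkdir($root . '/sub/deep', 0777, true);
touch($root . '/top.txt');
touch($root . '/sub/inner.txt');
touch($root . '/sub/deep/deepest.txt');

// Helper (hypothetical): sorted file names found within the given depth.
$listFiles = function (string $root, int $maxDepth): array {
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    $iterator->setMaxDepth($maxDepth);

    $names = [];
    foreach ($iterator as $file) {
        $names[] = $file->getFilename();
    }
    sort($names);

    return $names;
};

$topOnly  = $listFiles($root, 0);   // top-level directory only
$oneLevel = $listFiles($root, 1);   // one level of subdirectories
$all      = $listFiles($root, -1);  // no limit

// Clean up, children before their parent directories.
$cleanup = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS),
    RecursiveIteratorIterator::CHILD_FIRST
);
foreach ($cleanup as $entry) {
    $entry->isDir() ? rmdir($entry->getPathname()) : unlink($entry->getPathname());
}
rmdir($root);

print_r([$topOnly, $oneLevel, $all]);
```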

LimitIterator

LimitIterator allows iteration over a limited subset of items in an Iterator.

$files = new RecursiveDirectoryIterator('/Users/ahmad');
$files = new RecursiveIteratorIterator($files);

// Get the first 10 results
$files = new LimitIterator($files, 0, 10);
foreach ($files as $file) {
	echo $file.PHP_EOL;
}
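LimitIterator also works for simple paging. The order in which a directory is read is not guaranteed, so this sketch sorts the names first and then pages over an ArrayIterator. The scratch directory and the $page and $pageSize variables are assumptions for the example:

```php
<?php
// Hypothetical scratch directory with five files: f1.txt ... f5.txt
$dir = sys_get_temp_dir() . '/spl_limit_' . uniqid();
mkdir($dir);
foreach (range(1, 5) as $i) {
    touch($dir . '/f' . $i . '.txt');
}

$names = [];
foreach (new FilesystemIterator($dir, FilesystemIterator::SKIP_DOTS) as $file) {
    $names[] = $file->getFilename();
}
sort($names); // make the order predictable before paging

$pageSize = 2;
$page = 2; // 1-based page number

$pageItems = iterator_to_array(
    new LimitIterator(new ArrayIterator($names), ($page - 1) * $pageSize, $pageSize),
    false // re-index the results from zero
);

foreach ($names as $name) {
    unlink($dir . '/' . $name);
}
rmdir($dir);

print_r($pageItems);
```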

GlobIterator

As its name implies, GlobIterator iterates over the files that match a glob pattern, much like the glob() function.

GlobIterator extends FilesystemIterator, which means that each result is an SplFileInfo object.

Let’s see how we can get all the PDF files that reside in a particular directory:

$files = new GlobIterator('/Users/ahmad/Library/*.pdf');
/** @var SplFileInfo $file */
foreach ($files as $file) {
	echo $file->getFilename();
}
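GlobIterator is also countable, so you can find out how many files matched without looping. A minimal sketch on a scratch directory created only for this example:

```php
<?php
// Hypothetical scratch directory with two PDFs and one text file.
$dir = sys_get_temp_dir() . '/spl_glob_' . uniqid();
mkdir($dir);
touch($dir . '/a.pdf');
touch($dir . '/b.pdf');
touch($dir . '/c.txt');

$files = new GlobIterator($dir . '/*.pdf');

$count = count($files); // number of matches

$names = [];
foreach ($files as $file) {
    $names[] = $file->getFilename();
}
sort($names);

unlink($dir . '/a.pdf');
unlink($dir . '/b.pdf');
unlink($dir . '/c.txt');
rmdir($dir);

echo $count, PHP_EOL;
print_r($names);
```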

GlobIterator only accepts an absolute path.

RegexIterator

As its name implies, RegexIterator filters another iterator with a regular expression, which makes it handy for matching file names:

// Get all pdf and epub files
$files = new FilesystemIterator('/Users/ahmad/Library');
$files = new RegexIterator($files, '/\.(?:pdf|epub)$/i');
foreach ($files as $file) {
	echo $file.PHP_EOL;
}

You may want to apply the RegexIterator to search for the given pattern recursively:

$files = new RecursiveDirectoryIterator('/Users/ahmad/Library');
$files = new RecursiveIteratorIterator($files);
$files = new RegexIterator($files, '/\.(?:pdf|epub)$/i');

foreach ($files as $item) {
	echo $item.PHP_EOL;
}
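If a regular expression feels like overkill, one alternative (not used in the examples above) is CallbackFilterIterator, which filters with a plain PHP callback. The sketch below compares extensions instead of matching a pattern; the directory layout is hypothetical:

```php
<?php
// Hypothetical tree: a.pdf, b.EPUB, c.txt, sub/d.pdf
$root = sys_get_temp_dir() . '/spl_filter_' . uniqid();
mkdir($root . '/sub', 0777, true);
touch($root . '/a.pdf');
touch($root . '/b.EPUB');
touch($root . '/c.txt');
touch($root . '/sub/d.pdf');

$files = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
);

// Keep only pdf and epub files, ignoring the case of the extension.
$books = new CallbackFilterIterator($files, function (SplFileInfo $file): bool {
    return in_array(strtolower($file->getExtension()), ['pdf', 'epub'], true);
});

$names = [];
foreach ($books as $book) {
    $names[] = $book->getFilename();
}
sort($names);

// Clean up, children before their parent directories.
$cleanup = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS),
    RecursiveIteratorIterator::CHILD_FIRST
);
foreach ($cleanup as $entry) {
    $entry->isDir() ? rmdir($entry->getPathname()) : unlink($entry->getPathname());
}
rmdir($root);

print_r($names);
```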

Conclusion

SPL is a rich library that provides solutions for many common problems.

I highly encourage you to use SPL instead of reaching for a Composer package, unless the package provides something that is difficult to implement yourself, such as the Symfony Finder component.
