Directory Traversal
Add these crates to your own project:
cargo add glob walkdir
Find all png files recursively
Recursively find all PNG files in the current directory.
In this case, the ** pattern matches the current directory and all subdirectories.
The ** pattern can be used in any path portion. For example, /media/**/*.png matches all PNGs in media and its subdirectories.
use std::error::Error;

use glob::glob;

fn main() -> Result<(), Box<dyn Error>> {
    // Each entry is a Result: a path can match but still be unreadable.
    for entry in glob("**/*.png")? {
        println!("{}", entry?.display());
    }
    Ok(())
}
Recursively calculate file sizes at given depth
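The recursion depth can be bounded with the WalkDir::min_depth and WalkDir::max_depth methods.
This example sums all file sizes down to a depth of 3 below the current directory. Depth 0 is the root entry itself, so min_depth(1) skips only that entry; files directly inside the root folder are at depth 1 and are counted.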
use walkdir::WalkDir;

fn main() {
    let total_size = WalkDir::new(".")
        .min_depth(1)
        .max_depth(3)
        .into_iter()
        // Skip entries that cannot be read.
        .filter_map(|entry| entry.ok())
        // Skip entries whose metadata cannot be read.
        .filter_map(|entry| entry.metadata().ok())
        .filter(|metadata| metadata.is_file())
        .fold(0, |acc, m| acc + m.len());

    println!("Total size: {} bytes.", total_size);
}
WalkDir uses the builder pattern to set its options; the configured struct is then converted into an iterator.
For each entry that can be read and has metadata, the size of every regular file is retrieved and summed.
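To make the depth numbering concrete, here is a minimal std-only sketch of the same idea. The names total_size and sum_below are hypothetical helpers, not part of walkdir, and unlike the example above this version propagates I/O errors instead of skipping them.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Sums the sizes of regular files whose depth lies in `min_depth..=max_depth`.
/// The starting directory is depth 0, so its direct children are at depth 1.
fn total_size(root: &Path, min_depth: usize, max_depth: usize) -> io::Result<u64> {
    sum_below(root, 0, min_depth, max_depth)
}

/// Recursive helper; `depth` is the depth of `dir` itself.
fn sum_below(dir: &Path, depth: usize, min_depth: usize, max_depth: usize) -> io::Result<u64> {
    // Children of `dir` sit one level deeper than `dir`.
    let child_depth = depth + 1;
    if child_depth > max_depth {
        return Ok(0);
    }
    let mut total = 0;
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // `file_type` does not follow symlinks, so links are neither
        // descended into nor counted.
        let file_type = entry.file_type()?;
        if file_type.is_dir() {
            total += sum_below(&entry.path(), child_depth, min_depth, max_depth)?;
        } else if file_type.is_file() && child_depth >= min_depth {
            total += entry.metadata()?.len();
        }
    }
    Ok(total)
}
```

Under these assumptions, total_size(Path::new("."), 1, 3) roughly corresponds to the WalkDir call above, and passing 2 as min_depth is what would leave out files sitting directly in the root folder.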
File names that have been modified in the last 24 hours
Gets the current working directory by calling env::current_dir, then, for each entry in fs::read_dir, extracts the path with DirEntry::path and gets its metadata via fs::metadata. Metadata::modified returns the last modification time as a SystemTime, and SystemTime::elapsed returns the time that has passed since then. Duration::as_secs converts that duration to seconds, which is compared with 24 hours (24 * 60 * 60 seconds). Metadata::is_file filters out directories.
use std::error::Error;
use std::{env, fs};

fn main() -> Result<(), Box<dyn Error>> {
    let current_dir = env::current_dir()?;
    println!(
        "Entries modified in the last 24 hours in {:?}:",
        current_dir
    );

    for entry in fs::read_dir(current_dir)? {
        let entry = entry?;
        let path = entry.path();

        let metadata = fs::metadata(&path)?;
        // Seconds elapsed since the last modification.
        let last_modified = metadata.modified()?.elapsed()?.as_secs();

        if last_modified < 24 * 3600 && metadata.is_file() {
            println!(
                "Last modified: {:?} seconds, is read only: {:?}, size: {:?} bytes, filename: {:?}",
                last_modified,
                metadata.permissions().readonly(),
                metadata.len(),
                path.file_name().ok_or("No filename")?
            );
        }
    }

    Ok(())
}
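The age check can also be factored into a small reusable helper. The sketch below uses only the standard library; modified_within is a hypothetical name, and treating a timestamp in the future as an age of zero is an assumption of this sketch rather than behaviour of the example above, which returns the error instead.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::Duration;

/// Returns true if `path` was last modified less than `window` ago.
fn modified_within(path: &Path, window: Duration) -> io::Result<bool> {
    let modified = fs::metadata(path)?.modified()?;
    // `elapsed` fails when the timestamp lies in the future (for example
    // after a clock adjustment); this sketch treats that as an age of zero.
    let age = modified.elapsed().unwrap_or(Duration::ZERO);
    Ok(age < window)
}
```

Comparing Duration values directly avoids the manual conversion to seconds; a 24-hour window would be written as Duration::from_secs(24 * 60 * 60).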
Find loops for a given path
Because it creates symbolic links through std::os::unix::fs::symlink, this example runs on:
- Linux / Unix
- macOS
- Windows Subsystem for Linux (WSL)
Add the same-file crate to your own project:
cargo add same-file
Use same_file::is_same_file to detect loops for a given path.
For example, a loop could be created on a Unix system via symlinks:
mkdir -p /tmp/foo/bar/baz
ln -s /tmp/foo/ /tmp/foo/bar/baz/qux
The following asserts that a loop exists.
use std::error::Error;
use std::fs::{create_dir_all, remove_file};
use std::io;
use std::os::unix::fs::symlink;
use std::path::{Path, PathBuf};

use same_file::is_same_file;

// Walks up from `path` and reports the first ancestor that is the same
// file as one of its own descendants on the path.
fn contains_loop<P: AsRef<Path>>(path: P) -> io::Result<Option<(PathBuf, PathBuf)>> {
    let path = path.as_ref();
    let mut path_buf = path.to_path_buf();
    while path_buf.pop() {
        if is_same_file(&path_buf, path)? {
            return Ok(Some((path_buf, path.to_path_buf())));
        } else if let Some(looped_paths) = contains_loop(&path_buf)? {
            return Ok(Some(looped_paths));
        }
    }
    Ok(None)
}

fn main() -> Result<(), Box<dyn Error>> {
    create_dir_all("/tmp/foo/bar/baz")?;
    // Don't care if the file doesn't exist yet.
    let _ = remove_file("/tmp/foo/bar/baz/qux");
    symlink("/tmp/foo", "/tmp/foo/bar/baz/qux")?;
    assert_eq!(
        contains_loop("/tmp/foo/bar/baz/qux/bar/baz")?,
        Some((
            PathBuf::from("/tmp/foo"),
            PathBuf::from("/tmp/foo/bar/baz/qux")
        ))
    );
    Ok(())
}
The Unix / Linux find utility also detects the loop:
$ find /tmp/foo | xargs file
/tmp/foo: directory
/tmp/foo/bar: directory
/tmp/foo/bar/baz: directory
/tmp/foo/bar/baz/qux: symbolic link to /tmp/foo
$ find -L /tmp/foo
find: File system loop detected; ‘/tmp/foo/bar/baz/qux’ is part of the same file system loop as ‘/tmp/foo’.
/tmp/foo
/tmp/foo/bar
/tmp/foo/bar/baz
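For intuition about what "same file" means here, the following Unix-only sketch compares device and inode numbers using the standard library alone. same_file_unix is a hypothetical helper; it is meant to mirror the idea behind same_file::is_same_file on Unix, not to reproduce the crate's implementation.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Unix-only: two paths name the same file when, after following symlinks,
/// they share both a device number and an inode number.
fn same_file_unix(a: &Path, b: &Path) -> io::Result<bool> {
    // `fs::metadata` follows symlinks, so a link and its target compare equal.
    let (meta_a, meta_b) = (fs::metadata(a)?, fs::metadata(b)?);
    Ok(meta_a.dev() == meta_b.dev() && meta_a.ino() == meta_b.ino())
}
```

With the symlink created above, same_file_unix would report /tmp/foo and /tmp/foo/bar/baz/qux as the same file, which is the condition the loop check relies on.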
Recursively find duplicate file names
Recursively find duplicate file names in the current directory, printing each of them only once.
A HashMap is used to collect the file names: the file name itself is the key, and the number of times that name has been seen is the value. Any errors encountered, such as a directory that lacks read and execute permissions for the current user, are silently ignored.
use std::collections::HashMap;

use walkdir::WalkDir;

fn main() {
    let mut filenames = HashMap::new();

    for entry in WalkDir::new(".")
        .into_iter()
        .filter_map(Result::ok)
        .filter(|e| !e.file_type().is_dir())
    {
        let f_name = String::from(entry.file_name().to_string_lossy());
        let counter = filenames.entry(f_name.clone()).or_insert(0);
        *counter += 1;

        // Print on the second sighting only, so each duplicate appears once.
        if *counter == 2 {
            println!("{}", f_name);
        }
    }
}
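The counting logic can be looked at in isolation from the directory walk. The sketch below is a std-only illustration; duplicate_names is a hypothetical helper that takes any sequence of names and returns the duplicates instead of printing them.

```rust
use std::collections::HashMap;

/// Returns each name that occurs more than once, in the order in which its
/// second occurrence is seen, so every duplicate is reported exactly once.
fn duplicate_names<I>(names: I) -> Vec<String>
where
    I: IntoIterator<Item = String>,
{
    let mut counts: HashMap<String, u32> = HashMap::new();
    let mut duplicates = Vec::new();
    for name in names {
        let counter = counts.entry(name.clone()).or_insert(0);
        *counter += 1;
        // Record the name on its second sighting; later repeats are ignored.
        if *counter == 2 {
            duplicates.push(name);
        }
    }
    duplicates
}
```

Testing for a count of exactly 2, rather than anything greater than 1, is what keeps a name that appears three or more times from being reported repeatedly.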
Recursively find all files with given predicate
Find JSON files modified within the last day in the current directory and its subdirectories.
Using follow_links ensures symbolic links are followed as if they were normal directories and files.
use std::error::Error;

use walkdir::WalkDir;

fn main() -> Result<(), Box<dyn Error>> {
    for entry in WalkDir::new(".")
        .follow_links(true)
        .into_iter()
        .filter_map(|e| e.ok())
    {
        let f_name = entry.file_name().to_string_lossy();
        // Last modification time of the entry.
        let sec = entry.metadata()?.modified()?;

        if f_name.ends_with(".json") && sec.elapsed()?.as_secs() < 86400 {
            println!("{}", f_name);
        }
    }

    Ok(())
}
Traverse directories while skipping dotfiles
Uses filter_entry to descend recursively only into entries passing the is_not_hidden predicate, thus skipping hidden files and directories together with everything below them.
Iterator::filter, by contrast, would be applied to each walkdir::DirEntry even if its parent is a hidden directory.
The root directory "." is still yielded because is_not_hidden checks DirEntry::depth and accepts any entry at depth 0.
use walkdir::{DirEntry, WalkDir};

fn is_not_hidden(entry: &DirEntry) -> bool {
    entry
        .file_name()
        .to_str()
        // Depth 0 is the root ("."), which must not be treated as hidden.
        .map(|s| entry.depth() == 0 || !s.starts_with("."))
        .unwrap_or(false)
}

fn main() {
    WalkDir::new(".")
        .into_iter()
        .filter_entry(|e| is_not_hidden(e))
        .filter_map(|v| v.ok())
        .for_each(|x| println!("{}", x.path().display()));
}
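The pruning behaviour can be sketched without walkdir as a hand-written recursive walk that never descends into a hidden directory. is_hidden and visible_paths are hypothetical helpers; unlike is_not_hidden above, this sketch treats names that are not valid UTF-8 as visible rather than skipping them.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// True if the final path component starts with a dot.
fn is_hidden(path: &Path) -> bool {
    path.file_name()
        .and_then(|name| name.to_str())
        .map(|name| name.starts_with('.'))
        .unwrap_or(false)
}

/// Collects the visible paths below `dir`. The starting directory itself is
/// never tested, which plays the role of the depth-0 exception.
fn visible_paths(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();
        if is_hidden(&path) {
            // Pruned: nothing below a hidden directory is ever visited.
            continue;
        }
        let is_dir = entry.file_type()?.is_dir();
        out.push(path.clone());
        if is_dir {
            visible_paths(&path, out)?;
        }
    }
    Ok(())
}
```

A plain filter over a full walk would still visit a visible file inside a directory such as .git and let it through; pruning at the directory level avoids both the wasted traversal and that leak.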
Find all files with given pattern ignoring filename case
Find all files in the test_data directory matching the foo_[0-9]*.txt pattern.
A custom MatchOptions struct is passed to the glob_with function, making the glob pattern case insensitive while keeping the other options at their Default values.
use std::error::Error;

use glob::{glob_with, MatchOptions};

fn main() -> Result<(), Box<dyn Error>> {
    // Only case sensitivity is changed; all other options keep their defaults.
    let options = MatchOptions {
        case_sensitive: false,
        ..Default::default()
    };

    for entry in glob_with("test_data/foo_[0-9]*.txt", options)? {
        println!("{}", entry?.display());
    }

    Ok(())
}