find_duplicates
find_duplicates(directory, method='content')
Finds duplicate files within a given directory and its subdirectories. This is the main entry point, which calls the other functions.
Parameters
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| directory | str | The path to the directory to search for duplicates. | required |
| method | str | The method to use for finding duplicates. Can be 'name', 'size', or 'content'. | 'content' |
Returns
| Type | Description |
| --- | --- |
| dict | A dictionary where keys are duplicate identifiers and values are lists of matching file paths. Empty if no duplicates are found. |
Raises
| Type | Description |
| --- | --- |
| ValueError | If the provided method is not one of 'name', 'size', or 'content'. |
| FileNotFoundError | If the provided directory path does not exist or is not a directory. |
Examples
>>> import tempfile
>>> import os
>>> with tempfile.TemporaryDirectory() as tmp:
...     path_1 = os.path.join(tmp, "a.txt")
...     path_2 = os.path.join(tmp, "b.txt")
...     _ = open(path_1, "w").write("same")
...     _ = open(path_2, "w").write("same")
...     duplicates = find_duplicates(tmp, method="content")
...     any(len(paths) > 1 for paths in duplicates.values())
True
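The contract described above (three methods, a dictionary of groups, and the two exceptions) can be illustrated with a minimal sketch. This is not the library's implementation: `find_duplicates_sketch` is a hypothetical stand-in, and the choices of SHA-256 for content comparison, the bare file name as the 'name' identifier, and the byte count as the 'size' identifier are all assumptions made for illustration.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates_sketch(directory, method="content"):
    """Hypothetical stand-in for find_duplicates; not the library's code."""
    if method not in ("name", "size", "content"):
        raise ValueError(
            f"method must be 'name', 'size', or 'content', got {method!r}"
        )
    if not os.path.isdir(directory):
        raise FileNotFoundError(f"not an existing directory: {directory!r}")

    groups = defaultdict(list)
    # os.walk visits the directory and all of its subdirectories.
    for root, _dirs, files in os.walk(directory):
        for filename in files:
            path = os.path.join(root, filename)
            if method == "name":
                key = filename  # assumption: identifier is the bare file name
            elif method == "size":
                key = os.path.getsize(path)  # assumption: size in bytes
            else:
                # Assumption: contents are compared via a SHA-256 digest,
                # read in chunks so large files are not loaded at once.
                digest = hashlib.sha256()
                with open(path, "rb") as handle:
                    for chunk in iter(lambda: handle.read(65536), b""):
                        digest.update(chunk)
                key = digest.hexdigest()
            groups[key].append(path)

    # Keep only identifiers shared by two or more files.
    return {key: sorted(paths) for key, paths in groups.items() if len(paths) > 1}
```

Under these assumptions, the dictionary keys depend on the method: a file name, a size in bytes, or a hex digest. The real function may use different identifiers, so code that consumes the result is safer iterating over `.values()` than relying on the key format.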