find_duplicates_by_size

find_duplicates_by_size(directory)

Finds candidate duplicate files by grouping files of equal size within a given directory and all of its subdirectories.

Parameters

Name       Type  Description                                          Default
directory  str   The path to the directory to search for duplicates.  required

Returns

Name Type Description
dict A dictionary mapping each file size (in bytes) to the list of paths of the files with that size. Only sizes shared by two or more files are included.

Examples

>>> import os
>>> import tempfile
>>> with tempfile.TemporaryDirectory() as tmp:
...     # Create two files with identical 4-byte contents, closing each one
...     # so the data is flushed before sizes are read.
...     for name in ("a.txt", "b.txt"):
...         with open(os.path.join(tmp, name), "w") as f:
...             _ = f.write("same")
...     duplicates = find_duplicates_by_size(tmp)
...     list(duplicates) == [4]
True
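
The behaviour described above can be illustrated with a minimal sketch. This is not the library's actual implementation: the name `find_duplicates_by_size_sketch` is hypothetical, and the use of `os.walk`, `os.path.getsize`, and the choice to skip unreadable files are assumptions made for illustration only.

```python
import os
from collections import defaultdict


def find_duplicates_by_size_sketch(directory):
    """Hypothetical sketch: group files under `directory` by size in bytes."""
    by_size = defaultdict(list)
    # os.walk visits the directory and every subdirectory beneath it.
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Assumption: files that cannot be stat'ed are skipped.
                continue
            by_size[size].append(path)
    # Keep only the sizes shared by two or more files.
    return {size: paths for size, paths in by_size.items() if len(paths) > 1}
```

Equal size does not guarantee equal content, so the returned groups are best treated as candidates to be confirmed by comparing the file contents.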