Dupseek
Details
| Size: | 13K |
| Last Update: | 2008-05-31 01:16:09 |
| Version: | 1.3 |
| OS Support: | Linux |
| License/Program Type: | GPL (GNU General Public License) |
| Publisher: | Antonio Bellezza |
| Price: | $0.00 |
Description:
Dupseek 1.3 is a file-management utility developed by Antonio Bellezza.
Dupseek is an interactive command-line Perl program that finds and removes duplicate files.
A few strategies are possible for finding duplicate files in a big set, such as a heavily populated directory.
One of the most widely used consists of grouping files by size (because files of different sizes can't be identical) and then computing a short digital fingerprint (such as an MD5 checksum) for each file in a group.
Files with a different fingerprint are different, and files with the same digital fingerprint are very probably the same. Just to be sure, one can further check possible duplicates.
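The size-then-fingerprint strategy described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not Dupseek's code (Dupseek is written in Perl and uses a different approach); the function names `md5_of` and `find_duplicates_by_hash` and the 64 KiB block size are assumptions made for the example.

```python
import hashlib
import os
from collections import defaultdict


def md5_of(path, block_size=65536):
    """Return the hex MD5 digest of a file, read in fixed-size blocks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def find_duplicates_by_hash(root):
    """Group files under root by size, then by MD5; return groups of 2+ paths."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a file with a unique size cannot have a duplicate
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[md5_of(path)].append(path)
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return groups
```

Note that every file sharing its size with another file is read in full here, even when two files differ in their first byte. A final byte-by-byte comparison of each reported group would cover the "just to be sure" step.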
Dupseek takes a different approach. Here are its key features:
It starts by grouping files by size.
Then it starts reading small chunks of the files of the same size and comparing them. It creates smaller groups depending on these comparisons.
It goes on with bigger and bigger chunks (of size up to a hard-coded limit).
It stops reading from files as soon as they form a single-element group or they are read completely (which only happens when they have a very high probability of having duplicates).
This algorithm can be much more efficient than checksum-based tools when dealing with large files of the same size, because a file does not have to be read in full just to be ruled out. When files differ, reading usually stops after very few reads.
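The chunk-by-chunk refinement described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not a port of Dupseek: the names `refine_group` and `find_duplicates`, the 4 KiB starting chunk, the doubling rule and the 1 MiB cap are all hypothetical, and the sketch reads each file sequentially rather than reproducing Dupseek's actual read pattern.

```python
import os
from collections import defaultdict


def refine_group(paths, size, first_chunk=4096, max_chunk=1 << 20):
    """Split same-size files into groups of matching content.

    Chunks grow from first_chunk up to max_chunk (both assumed values).
    A file is dropped, and never read again, once it is alone in its group.
    """
    # One open handle per file; a real tool would need to cap this number.
    handles = {p: open(p, "rb") for p in paths}
    try:
        groups = [list(paths)] if len(paths) > 1 else []
        offset, chunk = 0, first_chunk
        while offset < size and groups:
            next_groups = []
            for group in groups:
                buckets = defaultdict(list)
                for path in group:
                    handles[path].seek(offset)
                    buckets[handles[path].read(chunk)].append(path)
                # Single-element buckets cannot contain duplicates: drop them.
                next_groups.extend(b for b in buckets.values() if len(b) > 1)
            groups = next_groups
            offset += chunk
            chunk = min(chunk * 2, max_chunk)
        return [sorted(g) for g in groups]
    finally:
        for handle in handles.values():
            handle.close()


def find_duplicates(root):
    """Group files under root by size, then refine each group chunk by chunk."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)
    result = []
    for size, paths in by_size.items():
        if len(paths) > 1:
            result.extend(refine_group(paths, size))
    return result
```

Because this sketch compares every byte of the files that stay grouped, the groups it returns are exact duplicates. Only files that really are identical get read to the end, which is the property the description above relies on.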
Dupseek (and destroy) can be interrupted at any moment. The user is then presented with partial results and can either intervene manually or go on with the reading and computation, on a group-by-group basis. Since subsequent reads happen sparsely in the file, if some files are still in the same group after many iterations, they are most probably identical, unless the differences are very small.
Requirements:
File::Find (directory recursion)
IO::File (object-oriented file handles)
Getopt::Std (option parsing)
Dupseek 1.3 has an English-language interface and runs on Linux.