How to find a file by hash

Is there a way to give a hash value as input when searching for files, and get a complete list of matching files and their locations as output?

This could be helpful when trying to pinpoint duplicate files. I often find myself in situations where I have a bunch of files that I know I already have stored in some location, but I don’t know where. They are essentially duplicates.

For instance, I could have a bunch of files on a portable hard drive, and also copies of those files on the internal hard drive of a desktop computer… but I’m not sure of the location! If the files have not been renamed, I could do a file name search to try to locate the copy on the desktop. I could then compare them side by side and, if they are the same, delete the copy I have on the portable hard drive. But if the files have been renamed on either one of the hard drives, this would probably not work (depending on how much the new names differ from the original).

If a file is renamed, but not edited, I could calculate its hash value, e.g. its SHA1 value is 74e7432df4a66f246b5214d60b190b67e2f6ce52. I would then like to use this value as input when searching for files and have the operating system search through a given directory or the entire file system for files with this exact SHA1 hash value and output a complete list of locations where these files are stored.

I’m using Windows, but I am generally interested in knowing how something like this could be achieved, regardless of operating system.

asked Dec 24, 2013 at 12:52

Samir

Linux example:

hash='74e7432df4a66f246b5214d60b190b67e2f6ce52'
find . -type f -exec sh -c '
   # sha1sum prefixes its output with a backslash for some filenames; sed strips it
   sha1sum "$2" | cut -f 1 -d " " | sed "s|^\\\\||" | grep -Eqi "$1"
' find-sh "$hash" {} \; -print

This code is more complex than you would think it should be because:

  • it is intended to correctly handle filenames with spaces, newlines, backslashes, quotations, special characters etc. (change -print to -print0 to parse them further);
  • it is intended to accept hash(es) as regex (compatible with grep -E i.e. egrep),
    e.g. '^00|00$' will match if the file hash starts or ends with 00; a more practical example is searching by many hashes at once: '74…|a9…|…|…|…' (ellipses for brevity, use full hashes).

You can use other *sum tools with compatible interface (e.g. md5sum).
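For readers without a POSIX shell, the same search can be sketched in Python. This is only an illustration and not part of the original answer: the function names `file_digest` and `find_by_hash` are made up for this example, and it matches whole hashes rather than regular expressions.

```python
import hashlib
import os


def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Return the hex digest of a file, read in chunks to keep memory use flat."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_by_hash(root, wanted, algorithm="sha1"):
    """Yield paths under root whose digest is in `wanted` (an iterable of hex strings)."""
    wanted = {w.lower() for w in wanted}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if file_digest(path, algorithm) in wanted:
                    yield path
            except OSError:
                # Unreadable files (permissions, broken links) are skipped.
                continue


if __name__ == "__main__":
    for hit in find_by_hash(".", ["74e7432df4a66f246b5214d60b190b67e2f6ce52"]):
        print(hit)
```

Passing several hashes in the list corresponds to the `'74…|a9…'` alternation above; passing `algorithm="md5"` corresponds to swapping in md5sum.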

answered Jul 7, 2017 at 11:19

Kamil Maciorowski

This is an intriguing question. I have been using a tool called fdupes to accomplish something similar. Fdupes recursively searches through directories and compares every file with every other file. First it compares sizes. If the sizes are identical, it creates hashes of the files and compares those. If the hashes are the same, it then goes through the files byte by byte and compares them.
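The three stages described above (size, then hash, then byte-by-byte comparison) can be sketched as follows. This is not fdupes’ code, only a minimal Python illustration of the idea; `find_duplicates` is a hypothetical name, and SHA-1 is an arbitrary choice of hash here.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def _sha1(path, chunk_size=1 << 20):
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return a list of groups (sorted lists of paths) with identical contents."""
    # Stage 1: a file whose size is unique cannot have a duplicate.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        # Stage 2: hash only the files that share a size.
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[_sha1(path)].append(path)
        # Stage 3: confirm byte by byte against the first file of each bucket.
        # Simplification: a file that fails this check is dropped, not regrouped.
        for candidates in by_hash.values():
            first = candidates[0]
            same = [first] + [p for p in candidates[1:]
                              if filecmp.cmp(first, p, shallow=False)]
            if len(same) > 1:
                groups.append(sorted(same))
    return groups
```

The ordering matters for speed: the size check needs no file reads at all, so most files are ruled out before any hashing happens.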

When it finds all the files that are truly identical, you can have it do several things. I have it delete the duplicate and create a hardlink in its place (thus saving me HDD space), although you can have it simply output the locations of the duplicate files and not do anything with them. This is the scenario you are asking about.

Some downsides of fdupes are that, as far as I know, it’s Linux only, and since it compares every file to every other file it takes quite a bit of I/O and time to run. It does not “search” for a file per se, but it would list all the files that have an identical hash.

I would highly recommend it and I set it to run in a cron job every day so that I never have any unnecessary duplicates of my data (it excludes my backups of course).

Fdupes Source Page

answered Dec 24, 2013 at 20:38

tbenz9

If you have PowerShell 4.0 or higher, you can use the command:

Get-ChildItem _search_location_ -Recurse | Get-FileHash | 
Where-Object hash -eq (Get-FileHash _search_file_).hash | Select path

Here _search_location_ is the folder or disk where you want to search for a duplicate, and _search_file_ is a file that has a duplicate somewhere. You can put this command in a loop to search for several files, or add | Remove-Item at the end of the line to automatically delete duplicates. Note that if _search_file_ itself lies inside _search_location_, it matches its own hash and would be deleted as well.

Also note that this command is suitable for small search folders only: every file in the search location has to be hashed, so it will take a long time if the location holds thousands of files (like a whole HDD).
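One way to reduce that cost, sketched here in Python rather than PowerShell, is to compare file sizes first and hash only files whose size equals that of the reference file. This is an illustrative sketch, not part of the answer above; `find_copies` is a hypothetical name.

```python
import hashlib
import os


def _digest(path, chunk_size=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_copies(reference, root):
    """Yield files under root with the same SHA-256 as `reference`."""
    ref_size = os.path.getsize(reference)
    ref_hash = _digest(reference)
    ref_real = os.path.realpath(reference)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                # Cheap check first: a different size rules the file out unread.
                if os.path.getsize(path) != ref_size:
                    continue
                if os.path.realpath(path) == ref_real:
                    continue  # do not report the reference file itself
                if _digest(path) == ref_hash:
                    yield path
            except OSError:
                continue
```

Skipping the reference file itself also avoids the self-match that makes piping the results into a delete command risky.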

Seth

answered Dec 20, 2016 at 15:31

Alex K

I like to use simple tools that I happen to already have, so here is a way to do that with Windows PowerShell (so it obviously only works on Windows). It is actually a small edit to Alex K’s answer. The question, however, was how to search using hashes, whereas his answer searched for a copy of a specific file.

Get-ChildItem "_search_location_" -Recurse | Get-FileHash | Where-Object hash -eq _hash_here_ | Select path

Simply replace _search_location_ with what directory you wish to search and replace _hash_here_ with the hash of the file you wish to find.

Seth

answered Jul 7, 2017 at 7:22

user746340

There’s a paid tool called FileLocator Pro that can search by file hash (SHA-x or MD5).

Excerpt from this page:
http://www.mythicsoft.com/filelocatorpro/help/en/advanced_criteria.htm

Note: If the expression type is set to ‘File Hash’ then the containing
text box can include a comma separated list of hash values or a
pointer to a file containing a list of hash values, e.g.

5A9C9B42A16F5E1985B7B0A019114C7A,675C9B42A16F5E1985B7B0A019114C7A

or,

=c:\FileHashTable.txt

The actual algorithms used to calculate the hash, e.g. SHA1, MD5, are
specified in the Options tab.
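To make the two input forms concrete, here is a small Python sketch that accepts either a comma-separated list or a `=path` pointer and returns a set of hashes. It only mimics the syntax quoted above and is not FileLocator Pro’s implementation; `load_hash_list` is a hypothetical name, and the layout of the hash file (values separated by commas or whitespace) is an assumption.

```python
def load_hash_list(spec):
    """Parse 'h1,h2,...' or '=path/to/file' into a set of lowercase hex strings."""
    if spec.startswith("="):
        # Assumed file layout: hashes separated by commas, spaces, or newlines.
        with open(spec[1:], encoding="utf-8") as f:
            text = f.read()
    else:
        text = spec
    tokens = text.replace(",", " ").split()
    return {t.lower() for t in tokens}
```

Lowercasing both the list and the computed digests makes the comparison independent of how a given tool prints hex digits.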

answered Dec 30, 2013 at 13:42

snowdude

Here’s an example using the MD5 algorithm:

Get-ChildItem "_search_location_" -Recurse | Get-FileHash -Algorithm MD5 | Where-Object hash -eq _hash_here_ | Select path

Replace _search_location_ with what directory you wish to search and replace _hash_here_ with the hash of the file you wish to find.

Get-FileHash computes SHA256 by default. To search by a different kind of hash, add -Algorithm _algorithm_ after Get-FileHash, where _algorithm_ is the chosen algorithm.

Beware that this requires PowerShell 4.0 and will recalculate every hash for every file for every search!
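The recalculation can be avoided by hashing every file once and keeping the results in a lookup table. The following Python sketch illustrates the idea; it is not a translation of the PowerShell command, and `build_index` and `lookup` are hypothetical names. The `algorithm` parameter plays the role of `-Algorithm`.

```python
import hashlib
import os
from collections import defaultdict


def build_index(root, algorithm="sha256"):
    """Hash every file under root once; return {lowercase hex digest: [paths]}."""
    index = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            h = hashlib.new(algorithm)  # e.g. "md5", "sha1", "sha256"
            try:
                with open(path, "rb") as f:
                    for chunk in iter(lambda: f.read(1 << 20), b""):
                        h.update(chunk)
            except OSError:
                continue  # skip unreadable files
            index[h.hexdigest()].append(path)
    return index


def lookup(index, hex_digest):
    """Return the paths recorded for a digest; letter case is ignored."""
    return index.get(hex_digest.lower(), [])
```

After one call to `build_index`, each further search is a dictionary lookup. The index goes stale when files change, so it has to be rebuilt or refreshed from time to time.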

Seth

answered Jul 7, 2017 at 7:39

user746347

The Voidtools Everything 1.5 search tool has an option to add a column showing one of various hashes, such as CRC-32, CRC-64, MD5, SHA-1 or SHA-256, for each file.

You can then search for a particular hash as well, for example md5:71E..

answered Apr 22 at 19:53

Rudolph


Unit Crc32;
INTERFACE
Function  GetCRC(Const Name: string): LongWord;
Function  CalcCRC(Const s: String): LongWord;
Function  CRCstring2Num(Const f: string): LongWord;
 
IMPLEMENTATION
 
Uses Crt;
 
CONST crc_32_tab: ARRAY[0..255] OF cardinal = (
$00000000, $77073096, $ee0e612c, $990951ba, $076dc419, $706af48f, $e963a535,$9e6495a3,
$0edb8832, $79dcb8a4, $e0d5e91e, $97d2d988, $09b64c2b, $7eb17cbd, $e7b82d07,$90bf1d91,
$1db71064, $6ab020f2, $f3b97148, $84be41de, $1adad47d, $6ddde4eb, $f4d4b551,$83d385c7,
$136c9856, $646ba8c0, $fd62f97a, $8a65c9ec, $14015c4f, $63066cd9, $fa0f3d63,$8d080df5,
$3b6e20c8, $4c69105e, $d56041e4, $a2677172, $3c03e4d1, $4b04d447, $d20d85fd,$a50ab56b,
$35b5a8fa, $42b2986c, $dbbbc9d6, $acbcf940, $32d86ce3, $45df5c75, $dcd60dcf,$abd13d59,
$26d930ac, $51de003a, $c8d75180, $bfd06116, $21b4f4b5, $56b3c423, $cfba9599,$b8bda50f,
$2802b89e, $5f058808, $c60cd9b2, $b10be924, $2f6f7c87, $58684c11, $c1611dab,$b6662d3d,
$76dc4190, $01db7106, $98d220bc, $efd5102a, $71b18589, $06b6b51f, $9fbfe4a5,$e8b8d433,
$7807c9a2, $0f00f934, $9609a88e, $e10e9818, $7f6a0dbb, $086d3d2d, $91646c97,$e6635c01,
$6b6b51f4, $1c6c6162, $856530d8, $f262004e, $6c0695ed, $1b01a57b, $8208f4c1,$f50fc457,
$65b0d9c6, $12b7e950, $8bbeb8ea, $fcb9887c, $62dd1ddf, $15da2d49, $8cd37cf3,$fbd44c65,
$4db26158, $3ab551ce, $a3bc0074, $d4bb30e2, $4adfa541, $3dd895d7, $a4d1c46d,$d3d6f4fb,
$4369e96a, $346ed9fc, $ad678846, $da60b8d0, $44042d73, $33031de5, $aa0a4c5f,$dd0d7cc9,
$5005713c, $270241aa, $be0b1010, $c90c2086, $5768b525, $206f85b3, $b966d409,$ce61e49f,
$5edef90e, $29d9c998, $b0d09822, $c7d7a8b4, $59b33d17, $2eb40d81, $b7bd5c3b,$c0ba6cad,
$edb88320, $9abfb3b6, $03b6e20c, $74b1d29a, $ead54739, $9dd277af, $04db2615,$73dc1683,
$e3630b12, $94643b84, $0d6d6a3e, $7a6a5aa8, $e40ecf0b, $9309ff9d, $0a00ae27,$7d079eb1,
$f00f9344, $8708a3d2, $1e01f268, $6906c2fe, $f762575d, $806567cb, $196c3671,$6e6b06e7,
$fed41b76, $89d32be0, $10da7a5a, $67dd4acc, $f9b9df6f, $8ebeeff9, $17b7be43,$60b08ed5,
$d6d6a3e8, $a1d1937e, $38d8c2c4, $4fdff252, $d1bb67f1, $a6bc5767, $3fb506dd,$48b2364b,
$d80d2bda, $af0a1b4c, $36034af6, $41047a60, $df60efc3, $a867df55, $316e8eef,$4669be79,
$cb61b38c, $bc66831a, $256fd2a0, $5268e236, $cc0c7795, $bb0b4703, $220216b9,$5505262f,
$c5ba3bbe, $b2bd0b28, $2bb45a92, $5cb36a04, $c2d7ffa7, $b5d0cf31, $2cd99e8b,$5bdeae1d,
$9b64c2b0, $ec63f226, $756aa39c, $026d930a, $9c0906a9, $eb0e363f, $72076785,$05005713,
$95bf4a82, $e2b87a14, $7bb12bae, $0cb61b38, $92d28e9b, $e5d5be0d, $7cdcefb7,$0bdbdf21,
$86d3d2d4, $f1d4e242, $68ddb3f8, $1fda836e, $81be16cd, $f6b9265b, $6fb077e1,$18b74777,
$88085ae6, $ff0f6a70, $66063bca, $11010b5c, $8f659eff, $f862ae69, $616bffd3,$166ccf45,
$a00ae278, $d70dd2ee, $4e048354, $3903b3c2, $a7672661, $d06016f7, $4969474d,$3e6e77db,
$aed16a4a, $d9d65adc, $40df0b66, $37d83bf0, $a9bcae53, $debb9ec5, $47b2cf7f,$30b5ffe9,
$bdbdf21c, $cabac28a, $53b39330, $24b4a3a6, $bad03605, $cdd70693, $54de5729,$23d967bf,
$b3667a2e, $c4614ab8, $5d681b02, $2a6f2b94, $b40bbe37, $c30c8ea1, $5a05df1b,$2d02ef8d);
 
Function GetCRC(Const Name: string): longword;
 
  (* This function returns the 32 bit CRC of the contents of the file    *)
  (* Name. It returns $FFFFFFFF (-1 as a signed value) if the file could *)
  (* not be found or opened.                                             *)
 
  Const Size = 4096;
  Type  Buffer = Array[1..Size] of Byte;
 
  var f  : file;
      crc: longword;
      Buf: ^Buffer;
      {$IFDEF OS2}
      nr : longint;
      {$ELSE}
      nr : word;
      {$ENDIF}
      Cnt: word;
      IO : word;
 
  begin
    crc := $FFFFFFFF;
    New(Buf);
 
    assign(f,Name);
    {$I-} reset(f,1); {$I+}
    IO := IOresult;
    If IO = 162 then
    begin
      cnt := 0;
      While (Cnt < 10) and (IO = 162) do
      begin
        inc(Cnt);
        Delay(10);
        {$I-} Reset(f,1); {$I+}
        IO := IOresult;
      end;
    end;
    If IO = 0 then
    begin
      Blockread(f,Buf^,Size,Nr);
      while (nr > 0) do
      begin
        For cnt := 1 to nr do
          Crc := crc_32_tab[byte(crc xor longint(Buf^[Cnt]))] xor ((crc shr 8) and $00FFFFFF);
        Blockread(f,Buf^,Size,Nr);
      end;
      close(f);
      Crc := not Crc;
    end;
    GetCRC := Crc;
    Dispose(Buf);
  end;
 
Function CalcCRC(Const s: string): LongWord;
 
  (* This function will return the 32 bit CRC of a string *)
 
  var crc: LongWord;
      Cnt: Longint;
 
  begin
    crc := $FFFFFFFF;
 
    For cnt := 1 to Length(s) do
      Crc := crc_32_tab[byte(crc xor Ord(s[Cnt]))] xor ((crc shr 8) and $00FFFFFF);
 
    (* Final inversion, as in GetCRC; avoids negating an unsigned value *)
    CalcCRC := not Crc;
  end;
 
Function CRCstring2Num(Const f: string): LongWord;
 
  (* This function will convert an 8-digit hexadecimal representation of *)
  (* a 32 bit CRC to a numerical value                                   *)
 
  var CRC: LongWord;
      Cnt: Longint;
      Weight: LongWord;
      c: Char;
 
  begin
    CRC := 0;
 
    For cnt := 8 downto 1 do
    begin
      (* Unsigned weight: a signed Longint overflows for the top digit *)
      Weight := LongWord(1) shl ((8-cnt)*4);
      c := UpCase(f[cnt]);   (* accept lowercase hex digits as well *)
      If c > #57 then
        CRC := CRC + Weight*(Ord(c)-55) else
        CRC := CRC + Weight*(Ord(c)-48);
    end;
    CRCstring2Num := Crc;
  end;
end.
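The table-driven loop used in GetCRC and CalcCRC can be cross-checked with a short Python sketch. It assumes the unit implements the common reflected CRC-32 (polynomial 0xEDB88320), which is what the initial value $FFFFFFFF, the final inversion, and the first and last table entries suggest; the names `make_table` and `crc32` are chosen for this example only.

```python
import zlib


def make_table():
    """Build the 256-entry table for the reflected polynomial 0xEDB88320."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


TABLE = make_table()


def crc32(data):
    """Table-driven CRC-32 of a bytes object, mirroring the Pascal loop."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF  # final inversion, the Pascal "not Crc"


if __name__ == "__main__":
    sample = b"123456789"
    # Compare against the standard library implementation.
    print(hex(crc32(sample)), hex(zlib.crc32(sample)))
```

In practice `zlib.crc32` is the simpler choice; the hand-written loop is shown only to make the correspondence with the Pascal code visible.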

This tutorial explains how to find files by hash in Windows. It covers two tools that locate files by their SHA1, MD5, or SHA256 hash. Each tool takes the folder to search and the hash of the file you are looking for, scans the folder, and shows the matching file. Neither works in bulk: you can only search for one hash at a time. Besides the hash, you can also search by other criteria such as content, publisher, name, or a mask.

Most file search software relies on traditional criteria such as the file name, so it cannot look up a file by its hash. That is where the tools below come in handy: you specify a hash value and the folder where the search should start, and the tool shows the matching file.

Find Files by Hash in Windows

To find files by hash, I have listed one free program and one small script that you can easily run on your PC. Both can find files by their MD5 hash. Each is covered in its own section below.

Find Files by Hash in Windows using Smart File Finder

Smart File Finder is a powerful file finder for Windows that supports various search criteria, one of which is the hash value. It supports MD5, SHA1, SHA256, SHA512, TIGER192 and some other hash types. Specify the hash value in the search field and the path where the search should start, and it shows which file matches, with its path and name. The results can be exported to a file.

The software offers further options to refine the search. You can include or exclude certain files and folders, and you can skip hidden files and subdirectories. It also supports wildcards, which you can use to narrow down the search.

Searching by hash is simple. Run the software after downloading it and, in the Search type drop-down, select the type of hash you have. Then enter the hash value in the search string box, choose the folder where the search should start, and hit the Search button. Before starting, you can optionally set other parameters to refine the search. The software scans the specified folder and shows any matching file it finds.

Find Files by Hash in Windows using a PowerShell Script

Instead of a dedicated program, you can use a simple script to do the same. I found a script on GitHub that opens a form window where you specify an MD5 hash and a folder to search. It is easy to run on a Windows PC. When it finds a file matching the hash, it shows the file’s name and path. However, you cannot select or copy that text.

Using the script is straightforward. The following steps show how.

Step 1: Open PowerShell ISE. To do so, type “PowerShell” in the Start menu and pick PowerShell ISE from the results.

Step 2: Go to this link and copy all the code. Next, paste the code into the PowerShell ISE editor.

Step 3: To run the script, click the ‘play’ button and a window will appear. Enter the MD5 hash and the path of the folder to search. Finally, hit the Search button to start the search. When the script finds the file, it shows its name and path in the window.

In this way, you can use this simple PowerShell script to find a file by its MD5 hash. The speed of the search depends mainly on the number of files in the specified folder, so a long wait on a large folder is normal. I do wish the script had an option to specify filters.
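A filter of the kind wished for here could work by checking a filename mask before any hashing is done. The Python sketch below illustrates that idea; it is unrelated to the GitHub script, and the function name `find_by_md5` and its `mask` parameter are hypothetical.

```python
import fnmatch
import hashlib
import os


def find_by_md5(root, md5_hex, mask="*"):
    """Yield files under root that match `mask` and whose MD5 equals md5_hex."""
    md5_hex = md5_hex.lower()
    for dirpath, _dirs, files in os.walk(root):
        # Apply the cheap name filter before the expensive hashing step.
        for name in fnmatch.filter(files, mask):
            path = os.path.join(dirpath, name)
            h = hashlib.md5()
            try:
                with open(path, "rb") as f:
                    for chunk in iter(lambda: f.read(1 << 20), b""):
                        h.update(chunk)
            except OSError:
                continue  # skip unreadable files
            if h.hexdigest() == md5_hex:
                yield path
```

With a mask such as `*.iso`, files of other types are never opened, which shortens the search when the file type is known.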

Closing Thoughts

Normally, file search software uses the file name or some other piece of text to find files. The tools above take a different approach. A hash is, for practical purposes, unique to a file’s contents, so if the hash is the only thing you know about a file, you can still locate it with these tools. Both find the target file from the hash value alone.

Search by hash

The technology of searching by preconfigured sets of file checksums (hash sums) is used to detect specific files in the file systems of workstations.

The system supports adding hash sums of individual files as well as hash banks, which are configured in the Administrator Console or in the hash bank manager.

To create a search condition based on hash sums:

  1. On the toolbar, click the arrow of the Search combo menu and select Search by hashes.
  2. In the Add drop-down list, select the option to add a hash sum (the available hash algorithms are MD5, SHA-1 and SHA-256) or a hash bank, or click the Hash bank manager button to create and add your own hash bank or to view and edit the contents of previously added banks (for details, see Hash bank manager).

  3. If required, specify additional restricting conditions as described in the chapter Creating a search query.

Note:

You can also manage banks in the Administrator Console, provided you have the appropriate user rights (for details, see Configuring hash banks in the SecureTower System Administrator Guide).
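As a rough illustration of the hash bank concept, the Python sketch below treats a bank as a named collection of hash sums grouped by algorithm and reports which banks contain a given file. It says nothing about how SecureTower stores or evaluates its banks; the data layout and the names `digests` and `matching_banks` are assumptions made for this example.

```python
import hashlib

ALGORITHMS = ("md5", "sha1", "sha256")  # the three algorithms named above


def digests(path):
    """Compute all three digests in a single pass over the file."""
    hashers = {name: hashlib.new(name) for name in ALGORITHMS}
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            for h in hashers.values():
                h.update(chunk)
    return {name: h.hexdigest() for name, h in hashers.items()}


def matching_banks(path, banks):
    """Return the sorted names of banks that contain the file.

    banks: {bank name: {algorithm: set of lowercase hex digests}}
    (a hypothetical layout chosen for this sketch).
    """
    d = digests(path)
    return sorted(
        bank for bank, by_alg in banks.items()
        if any(d[alg] in hashes for alg, hashes in by_alg.items() if alg in d)
    )
```

Computing the digests once per file and then testing set membership keeps the cost independent of the number of banks.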
