I am looking into contributing to software for whistleblowers and need to protect the privacy of their submissions as well as possible. That includes stripping all metadata from the files they upload. Metadata that cannot be removed (e.g. certain file system metadata) should be wiped clean or reset (so the creation date becomes today, etc.).

I've found tools that do this for specific file types, but nothing seems comprehensive.

Naturally, supporting every file type is a bit unreasonable, but I think the most common file types would be good enough: text files, office documents, PDF documents, image files, audio files, video files, and zip/tar files.

balupton
  • 471

1 Answer

Removing metadata from all file types is impossible, because the metadata is stored within the contents of the file, in a way that is specific to each format. Such a tool would therefore need to understand every file type that has been, or ever will be, created.

However, it is possible to create a tool that is as comprehensive as you can make it. In short, find tools that strip the metadata from the "most common file types", however you choose to define that, and then write a wrapper around them. You can use the file command, namely its "magic" database, to detect the file type, and then delegate the stripping to the tool designated for that type.
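As a rough illustration, such a wrapper could look like the Python sketch below. It is only a sketch under stated assumptions: the choice of exiftool and mat2 as the delegated tools, the MIME prefixes in the table, and the function names (detect_mime, command_for, reset_timestamps, strip_file) are all illustrative and not taken from the question. Which formats each tool really handles should be verified before relying on it.

```python
import os
import subprocess

# Assumed mapping from MIME type prefix to the command that cleans it.
# The tools and prefixes here are illustrative; verify each tool's
# actual format support before use.
STRIPPERS = {
    "image/": lambda p: ["exiftool", "-all=", "-overwrite_original", p],
    "application/pdf": lambda p: ["mat2", "--inplace", p],
    "application/vnd.openxmlformats-officedocument": lambda p: ["mat2", "--inplace", p],
    "application/vnd.oasis.opendocument": lambda p: ["mat2", "--inplace", p],
    "audio/": lambda p: ["mat2", "--inplace", p],
    "video/": lambda p: ["mat2", "--inplace", p],
    "application/zip": lambda p: ["mat2", "--inplace", p],
    "application/x-tar": lambda p: ["mat2", "--inplace", p],
}


def detect_mime(path):
    """Ask file(1), and therefore its magic database, for the MIME type."""
    result = subprocess.run(
        ["file", "--brief", "--mime-type", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def command_for(mime, path):
    """Return the cleaning command for this MIME type, or None if unsupported."""
    for prefix, build in STRIPPERS.items():
        if mime.startswith(prefix):
            return build(path)
    return None


def reset_timestamps(path):
    """Set access and modification times to now.

    Creation time generally cannot be changed this way.
    """
    os.utime(path, None)


def strip_file(path):
    """Detect the type, delegate to the designated tool, then reset timestamps."""
    path = os.path.abspath(path)  # avoids file names being parsed as options
    command = command_for(detect_mime(path), path)
    if command is None:
        # Refuse unknown types rather than passing them through uncleaned.
        raise ValueError(f"no metadata stripper configured for {path}")
    subprocess.run(command, check=True)
    reset_timestamps(path)
```

Rejecting unrecognised types, instead of silently accepting them, keeps the wrapper from giving a false sense of safety for formats it does not understand.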