DocFetcher - File Content Indexer

Submit portable freeware that you find here. It helps if you include information like a description, extraction instructions, Unicode support, whether it writes to the registry, and so on.
webfork
Posts: 10818
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

Re: DocFetcher - File Content Indexer

#46 Post by webfork »

Emka wrote: v1.1.11 is out
Ah great news.

February 19th, 2014
  • Features:
    Added EPUB support.
    Advanced settings: New entry 'SkipTarArchives' for disabling tar archive support.
  • Bugfixes:
    Bug #659: Windows installer downloaded an obsolete Java runtime version.
    Bug #670: Crash when entering an invalid value into the occurrence field above the preview pane.
    Bug #573, #612 and others: Crash while indexing zip archives.
  • Changes:
    (For developers:) Removed AspectJ dependency.
    Updated Russian GUI translation.
http://www.softpedia.com/progChangelog/ ... 10899.html

robertcollier4
Posts: 39
Joined: Fri Apr 12, 2013 3:20 am

Re: DocFetcher - File Content Indexer

#47 Post by robertcollier4 »

https://www.donationcoder.com/forum/ind ... _next=next
Skrommel
"InEverything - Search inside files using the Everything NTFS search engine.

I use Everything to search for files, but it doesn't search inside files, so I've thrown together a solution. Beta.

To run the script, you need to download Everything and ES from http://www.voidtools.com/download.php. Install Everything, turn on Tools - Options - ETP/FTP and check Start ETP/FTP server on startup. Place ES.exe next to the script."

Re: DocFetcher - File Content Indexer

#48 Post by webfork »

Update: whatever was causing this issue has cleared up in 1.1.13. Note: I rolled back to version 1.1.11 because there were issues creating multiple indexes on 1.1.12 in my Win7 x64 environment.

---

DocFetcher 1.1.12 is out. I use this program quite a bit and v1.1.11 was fairly bug-free, so I haven't seen a clear sign that the new version fixes anything in particular, but it's always good to see an open, active project behind freeware I love.

Changelog: http://www.softpedia.com/get/PORTABLE-S ... cher.shtml

Home: http://docfetcher.sourceforge.net/en/index.html and Softpedia: http://www.softpedia.com/progChangelog/ ... 48322.html

Emka
Posts: 290
Joined: Fri Sep 17, 2010 9:31 pm

Re: DocFetcher - File Content Indexer

#49 Post by Emka »


Re: DocFetcher - File Content Indexer

#50 Post by webfork »

Thanks for adding that, Emka.

Re: DocFetcher - File Content Indexer

#51 Post by webfork »

I've been working with this program for the last two years now and feel like my wishlist deserves a separate post.

Wishlist:
  • Search entirely by file dates (thinking of another box next to the size box)
  • Ability to search by various types of metadata
  • DjVu capability (according to this feature request it doesn't look very possible: http://sourceforge.net/p/docfetcher/fea ... quests/35/)
  • Ability to set processor usage. While the program is running, the javaw.exe *32 process almost always uses as much as 13% of the processor for indexing. I'd like to be able to keep that to a minimum.
  • In the "Document Types" window, alongside check, uncheck, and invert, I'd like to see options for "Microsoft", "OpenOffice", and "other"

Re: DocFetcher - File Content Indexer

#52 Post by webfork »

I'm pleased to report that PortableApps (PA) is working on a DocFetcher version. I'm obviously a fan of the program, so more attention is always good, but this is huge for people who need a solid search program but have network requirements against Java installations, which I ran into recently.

http://portableapps.com/node/53747

Re: DocFetcher - File Content Indexer

#53 Post by webfork »

Usage note: this program lists support for "PST" but two separate installations of Office 2010 I tested didn't have a PST file to check. I went digging and fortunately you can use an "OST" file located here:

C:\Users\USERNAME\AppData\Local\Microsoft\Outlook\EMAIL@ADDRESS.ost

DocFetcher didn't seem to know the difference and indexed it as normal.

This article gives some insight on finding the file: https://askleo.com/where_is_my_outlook_ ... e_located/
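
The OST location mentioned above can be checked with a short script. This is only a sketch: it assumes a Windows machine where the LOCALAPPDATA environment variable points at C:\Users\USERNAME\AppData\Local, and the function names are made up for illustration. It lists any .ost or .pst files in the Outlook folder so you know what to point DocFetcher at.

```python
import os
from pathlib import Path


def find_outlook_data_files(outlook_dir):
    """Return .ost and .pst files found directly inside outlook_dir, sorted."""
    folder = Path(outlook_dir)
    if not folder.is_dir():
        return []
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in {".ost", ".pst"}
    )


def default_outlook_dir():
    """Build the folder from the post using LOCALAPPDATA (assumed to be set)."""
    local_app_data = os.environ.get("LOCALAPPDATA", "")
    return Path(local_app_data) / "Microsoft" / "Outlook"


if __name__ == "__main__":
    for data_file in find_outlook_data_files(default_outlook_dir()):
        size_mb = data_file.stat().st_size / (1024 * 1024)
        print(f"{data_file}  ({size_mb:.1f} MB)")
```

If nothing is printed, the folder either does not exist for that profile or Outlook keeps its data somewhere else, in which case the linked article is the better guide.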

Re: DocFetcher - File Content Indexer

#54 Post by webfork »

I recently ran into a situation where I had to search many thousands of very large, detailed documents and realized that DocFetcher runs into memory limitations past a certain number of documents. I tested out X1 (https://www.x1.com/), which has some amazing features, but I still find myself really preferring DocFetcher.

DocFetcher:
* Generally more responsive
* Always finds the text you're searching for in the document preview window
* Generally simpler (setup can sometimes be tedious but past that it's very easy)

X1:
* MUCH more detailed preview window (DF is just text)
* Searches more file types
* Ability to search a much larger database of documents
* Searches Outlook a bit better, searches Outlook attachments

Re: DocFetcher - File Content Indexer

#55 Post by webfork »

I've been a fan of DocFetcher for some time now but somehow had not caught on to the advanced searching capability. It's hugely valuable, and it's frustrating that I didn't know about it back in 2012 (when this article came out).

For example, title:"portable" AND 200? NOT "MP3 Player" would return files with "portable" in the title that also contain a term such as a year from the 2000s (200? matches any four-character term starting with 200), excluding items with the phrase "MP3 Player".

The program supports the excellent Apache Lucene query toolset, which looks radically more mature than what was available in the commercial X1 program.
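
To make the example query above a bit more concrete, here is a rough Python emulation of what its three parts ask for. It is not how Lucene or DocFetcher evaluates queries internally. The tokenizer, the function names, and the choice to check the unfielded terms against the document body are all assumptions made for illustration.

```python
import fnmatch
import re


def tokens(text):
    """Lowercase word tokens, a crude stand-in for a real analyzer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_example_query(title, body):
    """Emulate: title:"portable" AND 200? NOT "MP3 Player"."""
    body_tokens = tokens(body)
    # title:"portable" -> the word must appear in the title field.
    title_has_portable = "portable" in tokens(title)
    # 200? -> '?' stands for exactly one character (2005 yes, 20050 no).
    has_200x = any(fnmatch.fnmatchcase(t, "200?") for t in body_tokens)
    # "MP3 Player" -> both words, adjacent and in this order.
    has_phrase = any(
        pair == ("mp3", "player")
        for pair in zip(body_tokens, body_tokens[1:])
    )
    return title_has_portable and has_200x and not has_phrase
```

A document titled "Portable apps roundup" that mentions 2007 would pass. The same document would be rejected as soon as its text contains "MP3 player", and a document without "portable" in the title never matches.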

Re: DocFetcher - File Content Indexer

#56 Post by webfork »

webfork wrote: Mon Jun 18, 2018 5:33 pm The program supports the excellent Apache Lucene query toolset, which looks radically more mature than what was available in the commercial X1 program.
One trick has come in handy the most: proximity searches. Using the syntax:

Code:

"term1 term2"~10

...means finding all appearances of term1 within 10 words of term2. This has made digging through hundreds or thousands of documents for relevant information radically better than standard keyword searches. If you haven't found a reason to use DocFetcher, this is a good test.
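
For anyone wondering what the proximity query above boils down to, here is a small Python approximation. It only counts the words sitting between the two terms. Lucene's real slop calculation is more subtle (for instance, it treats terms in reverse order differently), so take this as a sketch of the idea, with a function name invented for the example.

```python
import re


def within_n_words(text, term1, term2, max_gap=10):
    """True if term1 and term2 occur with at most max_gap words between them."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    pos1 = [i for i, w in enumerate(words) if w == term1.lower()]
    pos2 = [i for i, w in enumerate(words) if w == term2.lower()]
    # abs(i - j) - 1 is the number of words strictly between the two hits.
    return any(
        abs(i - j) - 1 <= max_gap
        for i in pos1
        for j in pos2
        if i != j
    )
```

Calling within_n_words(text, "backup", "update") on a paragraph is roughly the per-document question that "backup update"~10 asks across the whole index.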

Re: DocFetcher - File Content Indexer

#57 Post by webfork »

Some usage notes with this program ...

Large database note: I keep pushing this program well past what it was designed for (~200 gigs) and ran out of memory again. Luckily I found a workaround: run two separate instances of DocFetcher, each one indexing separate areas.

Although it initially looked like this could work by running both instances simultaneously, that took up ~1 GB of RAM and then errored out. It looks like you have to run them separately (i.e. close one down before launching the other).
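
If you'd rather not rely on remembering to close one instance first, a tiny wrapper can enforce the "one at a time" rule described above. This is a sketch under assumptions: the function name is made up, and it only helps if the launcher you start stays alive until the program window is closed, which I have not verified for every DocFetcher launcher.

```python
import subprocess


def run_one_at_a_time(commands):
    """Run each command and wait for it to exit before starting the next."""
    exit_codes = []
    for command in commands:
        # subprocess.run blocks until the process has exited, so the
        # second instance is never started while the first is still open.
        completed = subprocess.run(command)
        exit_codes.append(completed.returncode)
    return exit_codes
```

You would pass it a list with one entry per DocFetcher copy, for example [[r"C:\Tools\DocFetcher-A\DocFetcher.exe"], [r"C:\Tools\DocFetcher-B\DocFetcher.exe"]]. Both paths are hypothetical.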

Outlook 2016 - if you want to index your Outlook email (recent versions), you first have to export to PST format. This isn't ideal of course because you have to index a new backup each time, but it's good for digging through old emails: https://www.systoolsgroup.com/updates/b ... look-2016/

Re: DocFetcher - File Content Indexer

#58 Post by webfork »

I'm sad to report that I've been having issues with a recent install of Win10 Pro and DocFetcher. While I strongly suspect it's something to do with the most recent updates to Java that created some odd incompatibility, it does mean the program isn't functional. Happily there is a workaround: the still-in-beta DocFetcher by PortableApps works not just well but (so far) perfectly.

DocFetcher Portable
https://portableapps.com/node/53747

As crucial as this program is to me, this was a huge relief. I've tested this on two separate machines so far, so I'll go ahead and add something tentative to the entry. If more people report issues, I'm happy to switch the entry over to the beta version. Even all these years later, it's still drastically better than Windows' own search tools and (also surprisingly) much better than several newer programs I tested.

---

EDIT: an update just came out: DocFetcher - https://www.portablefreeware.com/index.php?id=2660 (thanks Andrew)
