
Re: DocFetcher - File Content Indexer

Posted: Sun Mar 09, 2014 8:35 am
by webfork
Emka wrote:v1.1.11 is out
Ah great news.

February 19th, 2014
  • Features:
      Added EPUB support.
      Advanced settings: New entry 'SkipTarArchives' for disabling tar archive support.
  • Bugfixes:
      Bug #659: Windows installer downloaded an obsolete Java runtime version.
      Bug #670: Crash when entering an invalid value into the occurrence field above the preview pane.
      Bug #573, #612 and others: Crash while indexing zip archives.
  • Changes:
      (For developers:) Removed AspectJ dependency.
      Updated Russian GUI translation.
http://www.softpedia.com/progChangelog/ ... 10899.html

Re: DocFetcher - File Content Indexer

Posted: Fri Mar 14, 2014 11:14 pm
by robertcollier4
https://www.donationcoder.com/forum/ind ... _next=next
Skrommel
"InEverything - Search inside files using the Everything NTFS search engine.

I use Everything to search for files, but it doesn't search inside files, so I've thrown together a solution. Beta.

To run the script, you need to download Everything and ES from http://www.voidtools.com/download.php. Install Everything, turn on Tools - Options - ETP/FTP and check Start ETP/FTP server on startup. Place ES.exe next to the script."

Re: DocFetcher - File Content Indexer

Posted: Sat Oct 04, 2014 10:25 am
by webfork
Update: whatever was causing this issue has cleared up in 1.1.13. Note: I rolled back to version 1.1.11 as there were issues creating multiple indexes on 1.1.12 in my Win7x64 environment.

---

DocFetcher 1.1.12 is out. I use this program quite a bit, and v1.1.11 was fairly bug-free, so I haven't seen a clear sign that the new version fixes anything in particular. Still, it's always good to see an open, active project surrounding freeware I love.

Changelog: http://www.softpedia.com/get/PORTABLE-S ... cher.shtml

Home: http://docfetcher.sourceforge.net/en/index.html and Softpedia: http://www.softpedia.com/progChangelog/ ... 48322.html

Re: DocFetcher - File Content Indexer

Posted: Sun Dec 07, 2014 12:50 am
by Emka

Re: DocFetcher - File Content Indexer

Posted: Fri Dec 12, 2014 7:44 pm
by webfork
Thanks for adding that, Emka.

Re: DocFetcher - File Content Indexer

Posted: Sat May 16, 2015 7:33 pm
by webfork
I've been working with this program for the last two years now and feel like my wishlist deserves a separate post.

Wishlist:
  • Search entirely by file dates (thinking of another box next to the size box)
  • Ability to search by various types of metadata
  • DjVu capability - according to this it doesn't look very possible: http://sourceforge.net/p/docfetcher/fea ... quests/35/
  • Ability to set processor usage. While the program is running, the javaw.exe *32 process almost always uses as much as 13% of the processor for indexing. Would like to be able to keep that to a minimum.
  • In the "Document Types" window along with check, uncheck, and invert, I'd like to see "Microsoft, OpenOffice, and 'other'"

Re: DocFetcher - File Content Indexer

Posted: Fri Mar 24, 2017 6:19 pm
by webfork
I'm pleased to report that PA is working on a DocFetcher version. I'm obviously a fan of the program so more attention is always good but this is huge for people who need a solid search program but have network requirements against Java installations, which I ran into recently.

http://portableapps.com/node/53747

Re: DocFetcher - File Content Indexer

Posted: Fri Apr 28, 2017 10:39 pm
by webfork
Usage note: this program lists support for "PST" but two separate installations of Office 2010 I tested didn't have a PST file to check. I went digging and fortunately you can use an "OST" file located here:

C:\Users\USERNAME\AppData\Local\Microsoft\Outlook\EMAIL@ADDRESS.ost

DocFetcher didn't seem to know the difference and indexed it as normal.

This article gives some insight on finding the file: https://askleo.com/where_is_my_outlook_ ... e_located/
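For anyone who would rather check that folder with a script, here is a minimal Python sketch. The function name is my own invention, and the default folder is just the path quoted above rebuilt for the current user; other Outlook versions or setups may keep the file elsewhere.

```python
from __future__ import annotations

from pathlib import Path


def find_outlook_data_files(folder: Path) -> list[Path]:
    """Return the .ost and .pst files found directly inside `folder`.

    Hypothetical helper for illustration; it only lists files and
    never opens or modifies them.
    """
    if not folder.is_dir():
        return []
    wanted = {".ost", ".pst"}
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in wanted
    )


if __name__ == "__main__":
    # Assumed default location, based on the path quoted in this post.
    default = Path.home() / "AppData" / "Local" / "Microsoft" / "Outlook"
    for data_file in find_outlook_data_files(default):
        print(data_file)
```

Any file it prints can then be added to a DocFetcher index by hand.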

Re: DocFetcher - File Content Indexer

Posted: Sun Nov 19, 2017 10:23 am
by webfork
So I recently ran into a situation where I had to search many thousands of very large, detailed documents and realized that DocFetcher has some memory limitations past a certain number of documents. I tested out X1 (https://www.x1.com/), which has some amazing features, but I still find myself really preferring DocFetcher.

DocFetcher:
* Generally more responsive
* Always finds the text you're searching for in the document preview window
* Generally simpler (setup can sometimes be tedious but past that it's very easy)

X1:
* MUCH more detailed preview window (DF is just text)
* Searches more file types
* Ability to search a much larger database of documents
* Searches Outlook a bit better, searches Outlook attachments

Re: DocFetcher - File Content Indexer

Posted: Mon Jun 18, 2018 5:33 pm
by webfork
I've been a fan of DocFetcher for some time now but somehow had not caught on to the advanced searching capability. This is hugely valuable, and it's frustrating that I didn't know about it back in 2012 (when this article came out).

For example, title:"portable" AND 200? NOT "MP3 Player" would return files with "portable" in the title that mention a year from the 2000s (the ? stands for any single character, so 200? matches 2000 through 2009), while excluding items with the phrase "MP3 Player".
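To make the single-character wildcard concrete, here is a small Python sketch of how a pattern like 200? lines up against individual terms. It uses Python's own fnmatch module as a stand-in, not Lucene, and the helper name is made up for illustration; fnmatch also gives square brackets a special meaning that the query syntax may not share.

```python
from fnmatch import fnmatchcase


def matches_wildcard(term: str, pattern: str) -> bool:
    """Check one term against a pattern where ? is exactly one
    character and * is any run of characters (case-insensitive).

    Illustration only: this mimics the wildcard idea with fnmatch,
    it is not how DocFetcher or Lucene evaluates a query.
    """
    return fnmatchcase(term.lower(), pattern.lower())


terms = ["1999", "2004", "2009", "2010", "200", "20045"]
print([t for t in terms if matches_wildcard(t, "200?")])
# prints ['2004', '2009']
```

Note that "200" and "20045" are rejected because ? has to stand for exactly one character.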

The program supports the excellent Apache Lucene query toolset, which looks radically more mature than what was available in the commercial X1 program.

Re: DocFetcher - File Content Indexer

Posted: Thu Nov 08, 2018 2:59 pm
by webfork
webfork wrote: Mon Jun 18, 2018 5:33 pm The program supports the excellent Apache Lucene query toolset, which looks radically more mature than what was available in the commercial X1 program.
One trick in particular has come in handy the most: proximity searches. Using the syntax:

Code:

"term1 term2"~10
...means finding all appearances of term1 within 10 words of term2. This has made digging through hundreds or thousands of documents for relevant information radically better versus standard keyword searches. If you haven't found a reason to use DocFetcher, this is a good test.
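For anyone unsure what "within 10 words" means in practice, here is a rough Python sketch of the idea. It is a simplification I wrote for illustration, not Lucene's actual algorithm: it just compares word positions, whereas Lucene's exact distance counting differs in some details (for example when the terms appear in reverse order). The function name and sample sentence are made up.

```python
import re


def within_distance(text: str, term1: str, term2: str, max_gap: int) -> bool:
    """Return True if term1 and term2 occur at most `max_gap` word
    positions apart anywhere in `text` (case-insensitive).

    Simplified illustration of a proximity search, not Lucene's
    implementation.
    """
    words = re.findall(r"\w+", text.lower())
    positions1 = [i for i, w in enumerate(words) if w == term1.lower()]
    positions2 = [i for i, w in enumerate(words) if w == term2.lower()]
    return any(abs(i - j) <= max_gap
               for i in positions1 for j in positions2)


sample = "The portable player shipped in 2004 with a small battery"
print(within_distance(sample, "portable", "battery", 10))  # prints True
print(within_distance(sample, "portable", "battery", 3))   # prints False
```

Here "portable" and "battery" are 8 word positions apart, so a distance of 10 matches and a distance of 3 does not.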

Re: DocFetcher - File Content Indexer

Posted: Tue May 07, 2019 6:10 am
by webfork
Some usage notes with this program ...

Large database note: So I keep pushing this program well past what it was designed for (~200 gigs) and ran out of memory again. I luckily found a workaround: run two separate instances of DocFetcher, each one indexing separate areas.

Although initially it looked like this could work by running both instances simultaneously, it took up ~1 GB of RAM and then errored out. It looks like you have to run them separately (i.e. close one down before launching the other).

Outlook 2016 - if you want to index your Outlook email (recent versions), you first have to export to PST format. This isn't ideal of course because you have to index a new backup each time, but it's good for digging through old emails: https://www.systoolsgroup.com/updates/b ... look-2016/

Re: DocFetcher - File Content Indexer

Posted: Fri May 28, 2021 6:12 pm
by webfork
I'm sad to report that I've been having issues with a recent install of Win10 Pro and DocFetcher. While I strongly suspect it's something to do with the most recent updates to Java that created some odd incompatibility, it does mean the program isn't functional. Happily there is a workaround: the still-in-beta DocFetcher by PortableApps works not just well but (so far) perfectly.

DocFetcher Portable
https://portableapps.com/node/53747

As crucial as this program is to me, this was a huge relief. I've tested this on two separate machines so far, so I'll go ahead and add something tentative to the entry. If more people report issues, I'm happy to switch over to the beta version. Even all these years later, it's still drastically better than Windows' own search tools and (also surprisingly) much better than several newer programs I tested.

---

EDIT: an update just came out: DocFetcher - https://www.portablefreeware.com/index.php?id=2660 (thanks Andrew)