Ease the "too common word" search rule

All suggestions about TPFC should be posted here. Discussions about changes to TPFC will also be carried out here.

vevy
Posts: 752
Joined: Tue Sep 10, 2019 11:17 am

#1 Post by vevy » Tue Nov 03, 2020 4:00 pm

The following words in your search query were ignored because they are too common words: {word}.
You must specify at least one word to search for. Each word must consist of at least 3 characters and must not contain more than 14 characters excluding wildcards.
Many times, I know exactly what I want to search for: for example, the word "test" within a topic, or a word combination that narrows a search, where one or both words are common but the combination is not.

I suggest an option that says "Search anyway" after the warning. And don't automatically ignore the common word in a multi-word search.

Thanks
Help make the comprehensive CLI database happen: Vote for filters/badges!

Andrew Lee
Posts: 2608
Joined: Sat Feb 04, 2006 9:19 am

Re: Ease the "too common word" search rule

#2 Post by Andrew Lee » Tue Nov 03, 2020 5:52 pm

That is a limitation of the phpBB forum software. Don't think there is an easy way to solve it. Try the old "keywords site:portablefreeware.com/forums" Google search to work around this limitation.

There are basically 2 different search functions, and these are not integrated. The database search is done using MySQL full-text search, while the forum search is done by phpBB's own native search engine. Both have their limitations (MySQL search filters out keywords that are 3 chars or less, and also a bunch of stop words).
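
To make the limitation concrete, here is a minimal Python sketch of that kind of filtering. It is not MySQL's or phpBB's actual code: MIN_LEN and the STOP_WORDS sample are assumptions chosen only to mirror the "3 chars or less" and "stop words" rules mentioned above.

```python
# Illustrative only: mimics the kind of keyword filtering described above.
# MIN_LEN and STOP_WORDS are assumptions, not the real server settings.
MIN_LEN = 4  # keywords of 3 chars or less are dropped
STOP_WORDS = {"this", "that", "with", "from", "have"}  # tiny sample list


def split_query(query: str) -> tuple[list[str], list[str]]:
    """Return (kept, ignored) keywords for a whitespace-separated query."""
    kept, ignored = [], []
    for word in query.lower().split():
        if len(word) < MIN_LEN or word in STOP_WORDS:
            ignored.append(word)
        else:
            kept.append(word)
    return kept, ignored


kept, ignored = split_query("convert from png to jpeg")
print(kept)     # ['convert', 'jpeg']
print(ignored)  # ['from', 'png', 'to']
```

With a filter like this, a query made up only of short or common words ends up with nothing left to search for, which is where the warning comes from.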

Search is a difficult problem, although when it works it looks so easy (spelling correction, alternative suggestions etc.). I guess maybe that's why Google is worth its billions today.

Maybe we should just replace the default search boxes on both database and forum with Google custom search. What do you guys think?

webfork
Posts: 9799
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

Re: Ease the "too common word" search rule

#3 Post by webfork » Tue Nov 03, 2020 8:17 pm

Andrew Lee wrote:
Tue Nov 03, 2020 5:52 pm
Maybe we should just replace the default search boxes on both database and forum with Google custom search. What do you guys think?
I acknowledge the current database search is very hit and miss, but I do still find it useful. If you'd like to *add* some kind of additional search function (i.e. Google/Bing/DuckDuckGo etc. with the "site:www.portablefreeware.com/forums" trick), that's fine, but I'd rather not hand off all search operations to Google. I already use way more of their products and services than I'd like.

SYSTEM
Posts: 1969
Joined: Sat Jul 31, 2010 1:19 am
Location: Helsinki, Finland

Re: Ease the "too common word" search rule

#4 Post by SYSTEM » Tue Nov 03, 2020 9:08 pm

Andrew Lee wrote:
Tue Nov 03, 2020 5:52 pm
Maybe we should just replace the default search boxes on both database and forum with Google custom search. What do you guys think?
Please no. I prefer the existing search options over Google custom search.
My YouTube channel | Release date of my 13th playlist: August 24, 2020

billon
Posts: 844
Joined: Sat Jun 23, 2012 4:28 pm

Re: Ease the "too common word" search rule

#5 Post by billon » Tue Nov 03, 2020 9:34 pm

Andrew Lee wrote:
Tue Nov 03, 2020 5:52 pm
Maybe we should just replace the default search boxes on both database and forum with Google custom search.
NO!

vevy
Posts: 752
Joined: Tue Sep 10, 2019 11:17 am

Re: Ease the "too common word" search rule

#6 Post by vevy » Tue Nov 03, 2020 10:09 pm

How about a workaround hack? Find the common word list and trim it down, or replace it with a bare-minimum list?

I have had good search experiences in many non-Google places, including open-source projects and even desktop applications. There has got to be a better ready-to-use solution out there!

Andrew Lee
Posts: 2608
Joined: Sat Feb 04, 2006 9:19 am

Re: Ease the "too common word" search rule

#7 Post by Andrew Lee » Wed Nov 04, 2020 2:18 am

OK, I did a little poking around.

phpBB supports a number of search backends. The default is "native", written in PHP. It also supports MySQL full-text indexing. That used to have certain restrictions on table types (only MyISAM was supported), but that restriction is gone in more recent versions of MySQL.

MySQL indexing has certain advantages, so I switched over to give it a try. The full search syntax is given here:

https://wiki.phpbb.com/MySQL_Fulltext_Search

For the casual searcher, searching for e.g. "test jpeg" will now automatically search for both keywords. And since the minimum keyword length is 3, keywords like "test" will pass without issue (though queries like "for jpeg" or "to jpeg" still won't work).
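
As a rough illustration of "search for both keywords": in MySQL's boolean full-text mode, a leading + marks a word as required. The Python sketch below shows one way a casual query could be turned into such an expression. I have not checked how phpBB actually builds its query, so treat the helper as hypothetical; the table and column names in the SQL string are made up as well.

```python
def to_boolean_mode(query: str) -> str:
    """Prefix each keyword with '+' so MySQL boolean mode requires all of them."""
    terms = []
    for word in query.split():
        # Leave a word alone if the user already wrote an explicit + or - operator.
        terms.append(word if word[0] in "+-" else "+" + word)
    return " ".join(terms)


# Hypothetical table and column names; %s is a placeholder filled in by the DB driver.
SQL = "SELECT post_id FROM posts WHERE MATCH (post_text) AGAINST (%s IN BOOLEAN MODE)"

print(to_boolean_mode("test jpeg"))  # +test +jpeg
```

Stop words and too-short words would still be dropped by the server itself, which is why "for jpeg" behaves like a search for "jpeg" alone.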

Let's give this a shot and see how it goes. From what I have read, it should be better than native indexing both in terms of performance and resource utilization. If for some reason we hate it, it will be trivial to switch back to native indexing.

vevy
Posts: 752
Joined: Tue Sep 10, 2019 11:17 am

Re: Ease the "too common word" search rule

#8 Post by vevy » Wed Nov 04, 2020 2:31 am

Thanks!

A quick test (search.php?keywords=is+desel*&t=24751&sf=msgonly): Overall, a much better experience. 👍

Midas
Posts: 5832
Joined: Mon Dec 07, 2009 7:09 am
Location: Sol3

Re: Ease the "too common word" search rule

#9 Post by Midas » Wed Nov 04, 2020 6:32 am

Andrew Lee wrote: Maybe we should just replace the default search boxes on both database and forum with Google custom search. What do you guys think?

At least, add it as an option. You can't have too many of them IMHO... :)

Andrew Lee
Posts: 2608
Joined: Sat Feb 04, 2006 9:19 am

Re: Ease the "too common word" search rule

#10 Post by Andrew Lee » Wed Nov 04, 2020 3:10 pm

vevy wrote:
Wed Nov 04, 2020 2:31 am
  • it says "is" is ignored but it is highlighted.
I suspect they are ignored in the search, but if they are present in the search result, they will be highlighted i.e. search and highlighting are independent modules.
Midas wrote:
Wed Nov 04, 2020 6:32 am
At least, add it as an option. You can't have too many of them IMHO... :)
Don't bait me :lol:
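
A small Python sketch of what "independent modules" could look like in practice. This is only a guess at the behaviour, not phpBB's code: the highlighter below works from the raw query string, so it marks every query word it finds, including words the search step ignored.

```python
import re


def highlight(text: str, query: str) -> str:
    """Wrap every query word found in text with [ ], regardless of what the search dropped."""
    words = [re.escape(w) for w in query.split()]
    if not words:
        return text
    pattern = re.compile(r"\b(" + "|".join(words) + r")\b", re.IGNORECASE)
    return pattern.sub(r"[\1]", text)


print(highlight("This is a test post", "is test"))  # This [is] a [test] post
```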

Andrew Lee
Posts: 2608
Joined: Sat Feb 04, 2006 9:19 am

Re: Ease the "too common word" search rule

#11 Post by Andrew Lee » Wed Nov 04, 2020 3:26 pm

OK, I took the bait :D

There is this Google Search extension for phpBB here, which took like a minute to install. Another minute to set up a custom search engine for Google, and we are done.

This extension will only appear in "Advanced search", just like "Detailed search" in the database frontend.

webfork
Posts: 9799
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

Re: Ease the "too common word" search rule

#12 Post by webfork » Wed Nov 04, 2020 10:02 pm

Andrew Lee wrote:
Wed Nov 04, 2020 3:26 pm
This extension will only appear in "Advanced search", just like "Detailed search" in the database frontend.
Looks good -- thanks for enabling both options.

Midas
Posts: 5832
Joined: Mon Dec 07, 2009 7:09 am
Location: Sol3

Re: Ease the "too common word" search rule

#13 Post by Midas » Thu Nov 05, 2020 4:26 am

And my thanks for that, too, Andrew. 8)

FWIW, after a search is run, the search URL/expression is available as a link in the bottom left corner, labelled "Search for [string] on Google", in the following basic form:

https://www.google.com/search?client=ms-google-coop&q=[string]&cx=15a760a66939137fe
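
For anyone who wants to generate such links themselves, here is a minimal Python sketch that fills in the [string] part with proper URL encoding. The function name is made up; the client and cx values are simply copied from the URL above.

```python
from urllib.parse import urlencode


def google_cse_url(query: str, cx: str = "15a760a66939137fe") -> str:
    """Build a search URL in the basic form shown above (hypothetical helper)."""
    params = {"client": "ms-google-coop", "q": query, "cx": cx}
    return "https://www.google.com/search?" + urlencode(params)


print(google_cse_url("test jpeg"))
# https://www.google.com/search?client=ms-google-coop&q=test+jpeg&cx=15a760a66939137fe
```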
