<Weird unicode characters thread issue>

All suggestions about TPFC should be posted here. Discussions about changes to TPFC will also be carried out here.
Orca
Posts: 2
Joined: Mon Oct 17, 2016 9:00 am

<Weird unicode characters thread issue>

#1 Post by Orca »

[Moderator note: this user was given a warning on his account and then decided to bail and delete all his previous entries. What follows are the replies to a site issue with odd characters.]
Last edited by Orca on Fri May 05, 2017 4:45 pm, edited 3 times in total.

Zero3K
Posts: 68
Joined: Sun Oct 30, 2016 1:48 pm

Re: "Бесплатная версия pc tools antivirus":

#2 Post by Zero3K »

It seems like you all are popular with foreigners then. I think that box should support unicode if that's the case.

Specular
Posts: 443
Joined: Sun Feb 16, 2014 10:54 pm

Re: "Бесплатная версия pc tools antivirus":

#3 Post by Specular »

Already brought up here. The mangled characters are caused by the lack of Unicode support, and they surface now because the box displays the most popular searches for the day rather than for all time (the all-time list was rather static).

I do wonder about the potential for abuse though if spammers game the search queries to place random software/ads in the popular search items.
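
The kind of mangling described above can be reproduced with a minimal Python sketch. It assumes, without confirmation from the thread, that the query arrives as UTF-8 bytes and is then decoded through a single-byte or ASCII-only path; the actual TPFC code is not shown anywhere in this discussion.

```python
# Illustrative only: how a UTF-8 search query gets mangled when a
# component on the path does not handle Unicode. The two failure
# modes below are assumptions, not TPFC's confirmed behaviour.
query = "Бесплатная версия pc tools antivirus"
raw = query.encode("utf-8")  # bytes as a browser would typically send them

# Hypothetical failure 1: the bytes are decoded with the wrong charset.
# Every byte becomes one Latin-1 character, producing mojibake.
mojibake = raw.decode("latin-1")

# Hypothetical failure 2: an ASCII-only decoder replaces each byte it
# cannot handle with U+FFFD, the replacement character.
replaced = raw.decode("ascii", errors="replace")

# Decoding with the charset that was used for encoding restores the text.
restored = raw.decode("utf-8")

print(mojibake)
print(replaced)
print(restored)
```

Only the last decode restores the original query; the first two show why a box without Unicode support displays unreadable entries.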

Andrew Lee
Posts: 3048
Joined: Sat Feb 04, 2006 9:19 am

Re: �-zip?

#4 Post by Andrew Lee »

Will look into this to find out what's going on...

joby_toss
Posts: 2970
Joined: Sat Feb 09, 2008 9:57 am
Location: Romania

Re: <Weird unicode characters thread issue>

#5 Post by joby_toss »

Image

billon
Posts: 843
Joined: Sat Jun 23, 2012 4:28 pm

Weird unicode characters thread issue

#6 Post by billon »

This topic is not visible on the main page and is not clickable:
x.png
Is that because of the "< >" around the title?
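
One plausible mechanism for this, sketched in Python: if a title is written into a page without HTML escaping, the browser parses `<Weird unicode characters thread issue>` as an unknown tag and the link has no visible text. The thread does not say what the actual cause or fix was; the `topic_link` helper and the URL below are hypothetical.

```python
import html

# Hypothetical helper: builds the anchor for a topic title.
def topic_link(title: str, url: str, escape: bool = True) -> str:
    # html.escape turns <, > and & into entities so the title is shown
    # as text instead of being parsed as markup.
    text = html.escape(title) if escape else title
    return f'<a href="{url}">{text}</a>'

title = "<Weird unicode characters thread issue>"
url = "/forums/viewtopic.php"  # placeholder URL, not the real topic link

broken = topic_link(title, url, escape=False)  # title is parsed as a tag
safe = topic_link(title, url)                  # title is rendered as text

print(broken)
print(safe)
```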

Andrew Lee
Posts: 3048
Joined: Sat Feb 04, 2006 9:19 am

Re: <Weird unicode characters thread issue>

#7 Post by Andrew Lee »

billon wrote:This topic is not visible on the main page and is not clickable:
Fixed. Thanks for pointing this out.

SYSTEM
Posts: 2041
Joined: Sat Jul 31, 2010 1:19 am
Location: Helsinki, Finland

Re: <Weird unicode characters thread issue>

#8 Post by SYSTEM »

Orca wrote:At TPFC, "Бесплатная версия pc tools antivirus" is a popular search. Really? I wouldn't have thought so.
Sounds believable to me. TPFC is such a small site that a handful of Russian visitors can probably bring such a search near the top.
My YouTube channel | Release date of my 13th playlist: August 24, 2020

HairyPorter
Posts: 26
Joined: Sat Jan 07, 2017 8:27 pm

Re: <Weird unicode characters thread issue>

#9 Post by HairyPorter »

Orca wrote:At TPFC, "[markb /forums/ucp.php?mode=login" is a popular search. Really? I wouldn't have thought so.
Such generic URLs can turn out to be a popular "search term" because, besides text queries entered via the search box, TPFC's search engine also seems to capture user clicks on hyperlinks & buttons.

Note: If interested, just mouseover the below sample generic URLs to view them. Don't click on them, or you will make them even more "popular" !

The URL syntax in question suggests that a TPFC registered user "markb" might have been trying to log in today &/or yesterday from TPFC's homepage or forum index page. (And TPFC does have a user named MarkB who had previously commented on various TPFC software pages.) Perhaps he encountered repeated login failures (eg. wrong password, or forgot to allow session cookies), & kept clicking the login button upon every page refresh. And TPFC's search engine duly captured all his clicks.

Google Search likewise captured one of MarkB's login clicks on 09 Jan 2017 (this time from pg 5 of TPFC's software pages). Screenshot:
TPFCSearch-CapturesUserClicksGenericURLs-21Jan17.png
Furthermore, Google Search also captured MarkB's clicks on various occasions when he rated different software (Eg 1: 29 Dec 2016 | Eg 2: 11 Jan 2017 | Eg 3: 15 Jan 2017) whilst browsing through TPFC's software index pages.

To be fair, TPFC's & Google's search engines appear to capture all visitors' & registered users' clicks on hyperlinks & buttons. Except that most URLs don't receive numerous clicks within a short span of time, so these URLs (eg. the aforementioned software rating clicks) are deemed "not popular" by the search algorithm, are buried way down in search results, & hence don't usually come to attention.

SYSTEM
Posts: 2041
Joined: Sat Jul 31, 2010 1:19 am
Location: Helsinki, Finland

Re: <Weird unicode characters thread issue>

#10 Post by SYSTEM »

HairyPorter wrote: Note: If interested, just mouseover the below sample generic URLs to view them. Don't click on them to make them even more "popular" !
That's not a problem.

https://www.portablefreeware.com/forums ... 614#p83614
Andrew Lee wrote:The "u=0" parameter in those links ensures that they do not count towards the stats. I even double-checked again to make sure there isn't a bug in the code.

HairyPorter
Posts: 26
Joined: Sat Jan 07, 2017 8:27 pm

Re: <Weird unicode characters thread issue>

#11 Post by HairyPorter »

SYSTEM wrote:https://www.portablefreeware.com/forums ... 614#p83614
Andrew Lee wrote:The "u=0" parameter in those links ensures that they do not count towards the stats. I even double-checked again to make sure there isn't a bug in the code.
@SYSTEM -- Thanks for the info about TPFC's "u=0" parameter. Based on the above description, I assume "u=0" is supposed to work the same way as the standard rel="nofollow" HTML attribute.

1) But why is TPFC's search engine apparently ignoring the "u=0" parameter currently hardcoded into TPFC's 'Popular Searches' links, as well as functional links such as those related to Login/ Rate/ Register etc. ? As implied by the recently-indexed "user login" URL example (which does have "u=0" appended to it), TPFC's search engine seems to be following & indexing user clicks on Login/ Rate buttons, to the point that the functional clicks of a persistent TPFC user managed to get ranked highly in TPFC's 'Popular Searches'.

In contrast, it is understandable that Google Search & other search engines are ignoring "u=0", since it is not the HTML standard. Hence the long list of TPFC functional-click URLs stored in their search indices.

2) Based on brief research, other phpBB-powered forums & websites appear to be using rel="nofollow" instead to make their internal &/or external links automatically obey that directive. Egs:-



Related Issue: Use rel="nofollow" for Specific Links (Google Webmasters)
Google Webmasters Search Console Help Center wrote: Crawl Prioritization: Search engine robots can't sign in or register as a member on your forum, so there's no reason to invite Googlebot to follow "register here" or "sign in" links. Using nofollow on these links enables Googlebot to crawl other pages you'd prefer to see in Google's index.
On the other hand, I can't find any phpBB documentation, examples of phpBB-powered sites, or any non-phpBB website using "u=0" for this purpose.

How did this "u=0" parameter come about ? Is it some special code unique to TPFC's backend ? More importantly, is it working as it should ?

SYSTEM
Posts: 2041
Joined: Sat Jul 31, 2010 1:19 am
Location: Helsinki, Finland

Re: <Weird unicode characters thread issue>

#12 Post by SYSTEM »

HairyPorter wrote:How did this "u=0" parameter come about ? Is it some special code unique to TPFC's backend ? More importantly, is it working as it should ?
Yes, "u=0" is unique to the TPFC backend written in PHP.

In the post I quoted, Andrew said that he had double-checked that "u=0" works correctly.
HairyPorter wrote:
SYSTEM wrote:https://www.portablefreeware.com/forums ... 614#p83614
Andrew Lee wrote:The "u=0" parameter in those links ensures that they do not count towards the stats. I even double-checked again to make sure there isn't a bug in the code.
@SYSTEM -- Thanks for the info about TPFC's "u=0" parameter. Based on the above description, I assume "u=0" is supposed to work the same way as the standard rel="nofollow" HTML attribute.
No, not really. rel="nofollow" advises search engines not to follow the link or pass ranking credit through it. "u=0" tells TPFC code not to count the search towards the search popularity statistics.
HairyPorter wrote: 1) But why is TPFC's search engine apparently ignoring the "u=0" parameter currently hardcoded into TPFC's 'Popular Searches' links, as well as functional links such as those related to Login/ Rate/ Register etc. ? As implied by the recently-indexed "user login" URL example (which does have "u=0" appended to it), TPFC's search engine seems to be following & indexing user clicks on Login/ Rate buttons, to the point that the functional clicks of a persistent TPFC user managed to get ranked highly in TPFC's 'Popular Searches'.
What is happening is that the Popular Searches box appends the "u=0" parameter. The searches the TPFC code indexes are without "u=0". However, when the Popular Searches box shows the most popular searches, then it adds "u=0" to prevent a feedback loop.

Following and indexing clicks on login and rate buttons sounds like a believable (although very strange) explanation. :|
HairyPorter wrote: 2) Based on brief research, other phpBB-powered forums & websites appear to be using rel="nofollow" instead to make their internal &/or external links automatically obey that directive. Egs:-



Related Issue: Use rel="nofollow" for Specific Links (Google Webmasters)
Google Webmasters Search Console Help Center wrote: Crawl Prioritization: Search engine robots can't sign in or register as a member on your forum, so there's no reason to invite Googlebot to follow "register here" or "sign in" links. Using nofollow on these links enables Googlebot to crawl other pages you'd prefer to see in Google's index.
On the other hand, I can't find any phpBB documentation, examples of phpBB-powered sites, or any non-phpBB website using "u=0" for this purpose.
TPFC can't use rel="nofollow" here. It's an HTML attribute. TPFC search code (written in PHP) can't know if the visitor triggered the search by clicking a link that has a rel="nofollow" attribute.
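
The behaviour described in this post can be sketched in Python. TPFC's real backend is written in PHP and is not shown in the thread, so everything except the "u=0" parameter itself is an assumption: the `/search` path, the `q` parameter name and both function names are hypothetical.

```python
from collections import Counter
from urllib.parse import parse_qs, urlencode, urlsplit

search_counts = Counter()  # query text -> number of counted searches

def handle_search(url: str) -> str:
    """Run a search; count it towards the stats unless the URL has u=0."""
    # The server only sees the requested URL. The rel attribute of the
    # link that was clicked is never sent, so it cannot be checked here.
    params = parse_qs(urlsplit(url).query)
    query = params.get("q", [""])[0]
    counted = params.get("u", ["1"])[0] != "0"
    if query and counted:
        search_counts[query] += 1
    return query

def popular_links(n: int = 5) -> list[str]:
    """Links for the Popular Searches box, each tagged with u=0."""
    return ["/search?" + urlencode({"q": q, "u": 0})
            for q, _ in search_counts.most_common(n)]

# Two searches typed into the search box are counted.
handle_search("/search?q=7-zip")
handle_search("/search?q=7-zip")

# Clicking the generated links runs the search but does not count it.
for link in popular_links():
    handle_search(link)

print(search_counts)
```

The stored counts only ever come from requests without "u=0", which is why clicks in the Popular Searches box cannot reinforce themselves.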

Andrew Lee
Posts: 3048
Joined: Sat Feb 04, 2006 9:19 am

Re: <Weird unicode characters thread issue>

#13 Post by Andrew Lee »

I have verified again that "u=0" works as intended, and clicking on the "Popular searches" links does not create a feedback loop.

Maybe the current window of 1 day is too short and creates all kinds of spurious results. Markb's careless romping within an hour is enough to skew the stats.

Should we increase the stats window to 3 or 5 days to smooth things out?

[EDIT] Stats window increased to 3 days.
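
The effect of the window length can be sketched in Python. The event data below is made up purely for illustration (a steady query searched every 6 hours, plus a burst of 10 hits within an hour), and the `popular` function is a hypothetical stand-in, not TPFC's code.

```python
from collections import Counter
from datetime import datetime, timedelta

def popular(events, now, window_days, n=3):
    """Top-n queries among events newer than the stats window."""
    cutoff = now - timedelta(days=window_days)
    counts = Counter(query for ts, query in events if ts > cutoff)
    return counts.most_common(n)

now = datetime(2017, 1, 22, 12, 0)  # arbitrary reference time

# Hypothetical steady traffic: one "7-zip" search every 6 hours.
events = [(now - timedelta(hours=6 * i + 1), "7-zip") for i in range(12)]
# Hypothetical burst: one visitor produces 10 hits within the last hour.
events += [(now - timedelta(minutes=5 * i), "login url") for i in range(10)]

print(popular(events, now, window_days=1))
print(popular(events, now, window_days=3))
```

With this sample data the burst tops the 1-day list, while the 3-day window lets the steady query outrank it. A longer window dampens short bursts but does not remove them.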

SYSTEM
Posts: 2041
Joined: Sat Jul 31, 2010 1:19 am
Location: Helsinki, Finland

Re: <Weird unicode characters thread issue>

#14 Post by SYSTEM »

Andrew Lee wrote:Should we increase the stats window to 3 or 5 days to smooth things out?
Sounds good to me.

__philippe
Posts: 687
Joined: Wed Jun 26, 2013 2:09 am

Weird results in the Popular Searches box

#15 Post by __philippe »

Improbable current "Popular Searches"... :?:
[markb /forums/ucp.php?mode=login
[markb /forums/forums/ucp.php?mode=login
[markb /forums/?p=2
[markb /forums/?p=5
[markb /forums/?p=4
[markb /forums/?p=3
...
