House of fail (we got scraped)

Andrew Lee
Posts: 3052
Joined: Sat Feb 04, 2006 9:19 am
Contact:

Re: House of fail (we got scraped)

#16 Post by Andrew Lee »

My apologies, the site in question is "auqk dot org", not the earlier ones listed in this topic.

webfork brought it to my attention in a PM.

phpBB passwords are stored as hashes in the database, not plaintext. However, given today's computation power, I suspect that's as good as gone if someone wants to crack it.
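
To illustrate the worry: a hash cannot be reversed, but anyone holding a leaked hash and its salt can hash guesses until one matches. The following is a minimal sketch under stated assumptions, not phpBB's actual scheme: PBKDF2 from Python's hashlib stands in for whatever algorithm the forum uses, and the password, word list, and iteration count are made up for the demo.

```python
import hashlib
import hmac
import os


def hash_password(password, salt, iterations=1000):
    """Derive a hash with PBKDF2-HMAC-SHA256 (a stand-in algorithm for this demo)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def dictionary_attack(target_hash, salt, candidates, iterations=1000):
    """Hash each candidate with the known salt; return the first guess that matches."""
    for guess in candidates:
        if hmac.compare_digest(hash_password(guess, salt, iterations), target_hash):
            return guess
    return None


# Hypothetical leaked record: the attacker sees the salt and the hash, not the password.
salt = os.urandom(16)
leaked_hash = hash_password("letmein", salt)

wordlist = ["123456", "password", "letmein", "qwerty"]
print(dictionary_attack(leaked_hash, salt, wordlist))  # letmein
```

A common password falls as soon as it appears in the attacker's word list; a long random one never does, whatever the computing power.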

SYSTEM
Posts: 2041
Joined: Sat Jul 31, 2010 1:19 am
Location: Helsinki, Finland

Re: House of fail (we got scraped)

#17 Post by SYSTEM »

Andrew Lee wrote:My apologies, the site in question is "auqk dot org", not the earlier ones listed in this topic.

webfork brought it to my attention in a PM.

phpBB passwords are stored as hashes in the database, not plaintext. However, given today's computation power, I suspect that's as good as gone if someone wants to crack it.
Whoa. The site looks exactly the same as TPFC. :shock: It indeed looks like they have the source code. :o

In fact, I had to check that it's a distinct website (and not just a DNS redirect):


>nslookup www.portablefreeware.com
Server:  resolver2.dnaip.fi
Address:  62.241.198.246

Non-authoritative answer:
Name:    www.portablefreeware.com
Address:  162.250.145.13


>nslookup auqk.org
Server:  resolver2.dnaip.fi
Address:  62.241.198.246

Non-authoritative answer:
Name:    auqk.org
Address:  146.0.78.80
I repeated the UniExtract test there, and it indeed returned Universal Extractor and Universal Extractor 2.
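
The nslookup comparison above can also be scripted. This is a rough sketch: `resolve_ipv4` and `shares_address` are names invented here, and the demo at the bottom uses the two addresses copied from the output above rather than doing a live lookup. Different addresses only rule out a plain DNS alias; they say nothing about a proxy.

```python
import socket


def resolve_ipv4(hostname):
    """Return the set of IPv4 addresses the hostname currently resolves to."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port).
    return {info[4][0] for info in infos}


def shares_address(addrs_a, addrs_b):
    """True if the two address sets overlap, i.e. both names can reach the same server."""
    return bool(set(addrs_a) & set(addrs_b))


# Addresses taken from the nslookup output in this post.
tpfc_addrs = {"162.250.145.13"}
clone_addrs = {"146.0.78.80"}
print(shares_address(tpfc_addrs, clone_addrs))  # False
```

For a live check, the same comparison would be `shares_address(resolve_ipv4("www.portablefreeware.com"), resolve_ipv4("auqk.org"))`.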

Before we panic, there is one more thing to check: could it be that the site is just a proxy that forwards all requests, including searches, to TPFC? You could check by searching for a random string of characters and then looking if such a string was searched via our website search.
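
The random-string test could look roughly like this. It is only a sketch: the `canary-` prefix and the idea that logged searches are available as a list of strings are assumptions, since how TPFC records searches is not described in this thread.

```python
import secrets


def make_canary(nbytes=8):
    """Build a search term random enough that no real visitor would type it."""
    return "canary-" + secrets.token_hex(nbytes)


def was_forwarded(canary, logged_queries):
    """True if the canary shows up in any logged search query on the original site."""
    return any(canary in query for query in logged_queries)


canary = make_canary()
# Step 1: search for `canary` on the suspect site (done by hand in a browser).
# Step 2: look for it in the original site's search records (hypothetical data here).
logged_queries = ["uniextract", "portable browser"]
print(was_forwarded(canary, logged_queries))  # False
```

If the string appears in the original site's records, the suspect site is forwarding requests; if it does not, the suspect site is answering searches from its own copy.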
My YouTube channel | Release date of my 13th playlist: August 24, 2020

Emka
Posts: 290
Joined: Fri Sep 17, 2010 9:31 pm

Re: House of fail (we got scraped)

#18 Post by Emka »

Weird: I went to said clone (or whatever you should call it) of TPFC in my TPFC browser tab, and when I pushed the browser's back button I wasn't taken back to TPFC but to http://instatime.dist-app.com/?c=rhinst ... 2279010446, which prompted me to download the InstaTime app.

JohnTHaller
Posts: 715
Joined: Wed Feb 10, 2010 4:44 pm
Location: New York, NY
Contact:

Re: House of fail (we got scraped)

#19 Post by JohnTHaller »

For what it's worth, it looks like a scrape of every page on here rather than a download of the database. Every RSS entry is scraped and added as a new WordPress page. The search works because those pages are indexed within WordPress. You can tell it's WordPress by the theme being used, the indicator at the bottom (a default footer for said theme), some of the plugins linked within the HTML, etc.
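
The HTML clues mentioned above can be checked mechanically. A minimal sketch, assuming only the standard WordPress conventions of `/wp-content/themes/` and `/wp-content/plugins/` asset paths and the optional generator meta tag; the sample markup and the theme and plugin names in it are invented.

```python
import re

# Patterns for common WordPress fingerprints in a page's HTML source.
WP_MARKERS = {
    "theme": re.compile(r"/wp-content/themes/([\w-]+)/"),
    "plugin": re.compile(r"/wp-content/plugins/([\w-]+)/"),
    "generator": re.compile(
        r'<meta[^>]*name="generator"[^>]*content="(WordPress[^"]*)"', re.I
    ),
}


def wordpress_clues(html):
    """Return the WordPress fingerprints found in the HTML, grouped by kind."""
    clues = {}
    for kind, pattern in WP_MARKERS.items():
        found = pattern.findall(html)
        if found:
            clues[kind] = sorted(set(found))
    return clues


# Invented sample markup, not taken from the site in question.
sample = """
<link rel="stylesheet" href="/wp-content/themes/example-theme/style.css">
<script src="/wp-content/plugins/example-plugin/main.js"></script>
"""
print(wordpress_clues(sample))
```

An empty result does not prove a site is not WordPress, since these paths can be renamed or hidden.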

UPDATE: Sorry, this was in reference to houseofportable dot com, not the later one referenced. If that one was an exact clone, that's another story.
Last edited by JohnTHaller on Sun Feb 14, 2016 7:44 am, edited 1 time in total.
PortableApps.com - The open standard for portable software | Support Net Neutrality

Midas
Posts: 6710
Joined: Mon Dec 07, 2009 7:09 am
Location: Sol3

Re: House of fail (we got scraped)

#20 Post by Midas »

Couldn't this be just some kind of experiment with the database dump mentioned elsewhere? :shock:

SYSTEM
Posts: 2041
Joined: Sat Jul 31, 2010 1:19 am
Location: Helsinki, Finland

Re: House of fail (we got scraped)

#21 Post by SYSTEM »

Midas wrote:Couldn't this be just some kind of experiment with the database dump mentioned elsewhere? :shock:
No way. They have exactly the same theme and functionality. Duplicating all that would be way too much effort for an "experiment". They either have the TPFC source code or have a proxy of some kind (as I suggested in my previous comment).

Midas
Posts: 6710
Joined: Mon Dec 07, 2009 7:09 am
Location: Sol3

Re: House of fail (we got scraped)

#22 Post by Midas »

Please bear with me. Couldn't someone get pretty much the same result by scraping the site layout files and then fleshing them out somehow with a database engine backend and the database dump? :?

I'm trying to apply Occam's razor here...

SYSTEM
Posts: 2041
Joined: Sat Jul 31, 2010 1:19 am
Location: Helsinki, Finland

Re: House of fail (we got scraped)

#23 Post by SYSTEM »

Midas wrote:Please bear with me. Couldn't someone get pretty much the same result by scraping the site layout files and then fleshing them out somehow with a database engine backend and the database dump? :?

I'm trying to apply Occam's razor here...
It would still be too much work. TPFC has not only the CSS and JavaScript controlling the final layout and look, but also PHP code to generate the HTML code based on database content. Even the PHP code alone would be too much to duplicate as an experiment.

Andrew Lee
Posts: 3052
Joined: Sat Feb 04, 2006 9:19 am
Contact:

Re: House of fail (we got scraped)

#24 Post by Andrew Lee »

SYSTEM wrote:Before we panic, there is one more thing to check: could it be that the site is just a proxy that forwards all requests, including searches, to TPFC? You could check by searching for a random string of characters and then looking if such a string was searched via our website search.
Good hypothesis. I tried a couple of random strings. Unfortunately, they did not appear in our database, meaning it's not a proxy.
Midas wrote:Couldn't this be just some kind of experiment with the database dump mentioned elsewhere? :shock:
The database dump does not have enough auxiliary information to pull off this stunt. Much of the working data associated with each entry is not exported, yet it is essential to running the site.

webfork
Posts: 10818
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas
Contact:

Re: House of fail (we got scraped)

#25 Post by webfork »

Andrew Lee wrote:webfork brought it to my attention in a PM.
Credit for the initial find belongs to lintalist.

Steps I've taken
Suggestions welcome.

nickoftime
Posts: 72
Joined: Sat Dec 22, 2007 11:46 am
Location: USA

Re: House of fail (we got scraped)

#26 Post by nickoftime »

I hate to hear that this happened and if they actually 'stole' the site source code, I hope you can track these idiots down and prosecute them to the highest extent of the law. The credibility of TPFC will NOT be compromised on our watch, that's for sure.

Shout out to lintalist for finding it, webfork for calling out to Andrew and taking steps to stop it...

Hope you can get to the bottom of this... this is still our favorite site and you guys are the best.

tactictoe
Posts: 283
Joined: Thu Dec 10, 2015 10:56 am
Location: A galaxy far far downunder
Contact:

Re: House of fail (we got scraped)

#27 Post by tactictoe »

..., the site in question is "auqk dot org", not the earlier ones listed in this topic
This last one is definitely a clone.

These links might give you some ideas and help:
How do you send Digital Millennium Copyright Act (DMCA) notifications to AOL, Bing, Google, Yahoo!, and the other major search engines?

http://www.seologic.com/faq/dmca-notifications.

This website also talks about DMCA violations and is quite good: https://lorelle.wordpress.com/2006/04/1 ... r-content/

This website offers paid protection plus some free services that might interest you and is worth investigating: https://www.dmca.com/. Even just for reading up on copyright violation, the site is pretty good.
This website offers protection information for free that might interest you and answers lots of questions: https://www.owasp.org/index.php/Main_Page. It is really worth reading.

Some rough ideas for the server side. I am just trying to help here; I have no idea whether any of this is possible, or whether copyright violators could bypass it in the future. Maybe I am just dreaming:

Encrypt your pages and decrypt them only on this site (tied to its DNS name), using a sort of 'ATM' code that is kept offline (off the internet, but not the LAN). If the input code is incorrect, the page stays encrypted and unusable.
Anti-web-crawling measures. Surely there is a way to block this type of request, isn't there?
Websites that are particularly challenging to crawl and scrape?
http://stackoverflow.com/questions/1876 ... and-scrape
This document might help you too: http://www.imperva.com/docs/wp_detectin ... ttacks.pdf
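
One common anti-crawling measure is per-client rate limiting. The following is only a sketch of the idea, not something TPFC is known to run: the class name, the limits, and the choice to key on IP address are all assumptions, and a determined scraper can spread requests over many addresses.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window = window_seconds
        self.hits = defaultdict(deque)  # client IP -> timestamps of accepted requests

    def allow(self, client_ip, now):
        """Return True if the request at time `now` (in seconds) should be served."""
        recent = self.hits[client_ip]
        # Forget accepted requests that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=3, window_seconds=10)
# A hypothetical client firing four requests in quick succession.
print([limiter.allow("203.0.113.7", t) for t in (0, 1, 2, 3)])
# [True, True, True, False]
```

In a real deployment `now` would come from `time.monotonic()`, and blocked clients would typically receive an HTTP 429 response.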


That is it.

I am so sorry for what is happening here. I do hope that this site will be taken down and/or that the countermeasures you find make the offending site unusable.
Best of luck to the team of TPFC.

Kind regards, Ben (aka tactictoe)
Last edited by tactictoe on Sat Feb 13, 2016 7:35 pm, edited 1 time in total.

tactictoe
Posts: 283
Joined: Thu Dec 10, 2015 10:56 am
Location: A galaxy far far downunder
Contact:

Re: House of fail (we got scraped)

#28 Post by tactictoe »

Forgot to mention in my last post, but I am sure you have been there too:
http://dpivst.com/en/ireport/auqk dot org

SYSTEM
Posts: 2041
Joined: Sat Jul 31, 2010 1:19 am
Location: Helsinki, Finland

Re: House of fail (we got scraped)

#29 Post by SYSTEM »

Okay, I have run out of easy explanations.

Image

Their website even has the latest updates, so whatever they are doing to access the database, it's still going on. :shock:

Just how are they doing this? And why?

joby_toss
Posts: 2970
Joined: Sat Feb 09, 2008 9:57 am
Location: Romania
Contact:

Re: House of fail (we got scraped)

#30 Post by joby_toss »

BUT! If someone has full access to the TPFC database, would it be wise to try to change anything in our accounts now (password, etc.)? This is my question.
