House of fail (we got scraped)
- Andrew Lee
- Posts: 3052
- Joined: Sat Feb 04, 2006 9:19 am
- Contact:
Re: House of fail (we got scraped)
My apologies, the site in question is "auqk dot org", not the earlier ones listed in this topic.
webfork brought it to my attention in a PM.
phpBB passwords are stored as hashes in the database, not plaintext. However, given today's computing power, I suspect they're as good as gone if someone wants to crack them.
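To illustrate the point about hashes: how long a leaked hash survives depends mostly on how expensive each guess is for the attacker. Below is a rough sketch using only Python's standard library. It is not phpBB's actual hashing scheme; the function names, the iteration count and the sample guesses are all assumptions made up for illustration.

```python
import hashlib
import os
import time


def fast_hash(password: str) -> str:
    # A single unsalted MD5: cheap to compute, so cheap to brute-force.
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def slow_hash(password: str, salt: bytes, iterations: int = 200_000) -> str:
    # PBKDF2-HMAC-SHA256 with a per-user salt: every guess costs
    # `iterations` rounds (the count here is an illustrative assumption).
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return digest.hex()


def time_guesses(hash_fn, guesses) -> float:
    # Returns the seconds needed to hash every candidate once.
    start = time.perf_counter()
    for guess in guesses:
        hash_fn(guess)
    return time.perf_counter() - start


if __name__ == "__main__":
    salt = os.urandom(16)
    guesses = [f"guess{i}" for i in range(20)]  # hypothetical candidate list
    print("fast:", time_guesses(fast_hash, guesses))
    print("slow:", time_guesses(lambda g: slow_hash(g, salt), guesses))
```

On any machine the second timing should come out far larger than the first, which is the whole idea: a salted, deliberately slow hash buys time, but a weak password still falls eventually.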
Re: House of fail (we got scraped)
Andrew Lee wrote:My apologies, the site in question is "auqk dot org", not the earlier ones listed in this topic.
webfork brought it to my attention in a PM.
phpBB passwords are stored as hashes in the database, not plaintext. However, given today's computation power, I suspect that's as good as gone if someone wants to crack it.
Whoa. The site looks exactly the same as TPFC. It indeed looks like they have the source code.
In fact, I had to check that it's a distinct website (and not just a DNS redirect):
Code:
>nslookup www.portablefreeware.com
Server: resolver2.dnaip.fi
Address: 62.241.198.246
Non-authoritative answer:
Name: www.portablefreeware.com
Address: 162.250.145.13
>nslookup auqk.org
Server: resolver2.dnaip.fi
Address: 62.241.198.246
Non-authoritative answer:
Name: auqk.org
Address: 146.0.78.80
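The same comparison can be scripted. This is a minimal sketch that resolves both names and checks whether they share an IPv4 address; the function names are made up for illustration, and the resolver is passed in as a parameter only so the logic can be tried without network access.

```python
import socket


def resolve_ipv4(hostname, resolver=socket.gethostbyname_ex):
    # gethostbyname_ex returns (canonical_name, alias_list, ip_address_list).
    _name, _aliases, addresses = resolver(hostname)
    return set(addresses)


def share_an_address(host_a, host_b, resolver=socket.gethostbyname_ex):
    # True if the two hostnames resolve to at least one common IPv4 address.
    return bool(resolve_ipv4(host_a, resolver) & resolve_ipv4(host_b, resolver))


if __name__ == "__main__":
    # Needs network access; the answer depends on current DNS records.
    print(share_an_address("www.portablefreeware.com", "auqk.org"))
```

Different addresses only rule out a plain DNS alias. A server at another address could still be relaying requests.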
Before we panic, there is one more thing to check: could it be that the site is just a proxy that forwards all requests, including searches, to TPFC? You could check by searching for a random string of characters and then looking if such a string was searched via our website search.
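A rough sketch of that check, in case it helps. Everything here is hypothetical: the search URL, the query parameter name and the list of logged queries stand in for whatever the real site uses, and only someone with access to the TPFC search log could do the second half.

```python
import secrets
from urllib.parse import urlencode


def make_canary(prefix="canary"):
    # A random string that nobody would search for by accident.
    return f"{prefix}{secrets.token_hex(8)}"


def search_url(base_url, canary, param="q"):
    # `base_url` and `param` are assumptions; adjust to the real search form.
    return f"{base_url}?{urlencode({param: canary})}"


def canary_was_forwarded(canary, logged_queries):
    # If the canary shows up in our own search log after it was searched
    # only on the other site, that site is relaying requests to us.
    return any(canary in query for query in logged_queries)


if __name__ == "__main__":
    canary = make_canary()
    print("Search for this on the suspect site:", canary)
    print(search_url("http://suspect.example/search", canary))
```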
My YouTube channel | Release date of my 13th playlist: August 24, 2020
Re: House of fail (we got scraped)
Weird: I went to said clone (or whatever you should call it) of TPFC in my TPFC browser tab, and when I pushed the browser's back button I wasn't taken back to TPFC but to http://instatime.dist-app.com/?c=rhinst ... 2279010446, which prompted me to download the InstaTime app.
- JohnTHaller
- Posts: 715
- Joined: Wed Feb 10, 2010 4:44 pm
- Location: New York, NY
- Contact:
Re: House of fail (we got scraped)
For what it's worth, it looks like a scrape of every page on here rather than a download of the database. Every RSS entry is scraped and added as a new WordPress page. The search works because those pages are indexed within WordPress. You can tell it's WordPress by the theme being used, the indicator at the bottom which is a default footer for said theme, some of the plugins linked within the HTML, etc.
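Those tells can be checked mechanically on a saved copy of a page. A minimal sketch, assuming the usual WordPress path conventions (wp-content, wp-includes) and the generator meta tag; the marker list and function name are illustrative, and a site can hide any of these.

```python
import re

# Common WordPress fingerprints found in page source (not exhaustive).
WORDPRESS_MARKERS = {
    "generator meta tag": re.compile(
        r"<meta[^>]+name=[\"']generator[\"'][^>]+content=[\"']WordPress", re.I
    ),
    "theme path": re.compile(r"/wp-content/themes/", re.I),
    "plugin path": re.compile(r"/wp-content/plugins/", re.I),
    "core script path": re.compile(r"/wp-includes/", re.I),
}


def wordpress_hints(html):
    # Returns the labels of every marker that appears in the HTML.
    return [label for label, pattern in WORDPRESS_MARKERS.items() if pattern.search(html)]


if __name__ == "__main__":
    sample = '<link rel="stylesheet" href="/wp-content/themes/demo/style.css">'
    print(wordpress_hints(sample))
```

An empty result does not prove a site isn't WordPress; it only means none of these particular markers showed up.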
UPDATE: Sorry, this was in reference to houseofportable dot com, not the later one referenced. If that one was an exact clone, that's another story.
Last edited by JohnTHaller on Sun Feb 14, 2016 7:44 am, edited 1 time in total.
PortableApps.com - The open standard for portable software | Support Net Neutrality
Re: House of fail (we got scraped)
Couldn't this be just some kind of experiment with the database dump mentioned elsewhere?
Re: House of fail (we got scraped)
Midas wrote:Couldn't this be just some kind of experiment with the database dump mentioned elsewhere?
No way. They have exactly the same theme and functionality. Duplicating all that would be way too much effort for an "experiment". They either have the TPFC source code or have a proxy of some kind (as I suggested in my previous comment).
Re: House of fail (we got scraped)
Please, bear with me. Couldn't someone get pretty much the same result by scraping the site layout files and then fleshing it out somehow with a database engine backend and the database dump?
I'm trying to apply Occam's razor here...
Re: House of fail (we got scraped)
Midas wrote:Please, bear with me. Couldn't someone get pretty much the same result by scraping the site layout files and then fleshing it out somehow with a database engine backend and the database dump?
I'm trying to apply Occam's razor here...
It would still be too much work. TPFC has not only the CSS and JavaScript controlling the final layout and look, but also PHP code to generate the HTML code based on database content. Even the PHP code alone would be too much to duplicate as an experiment.
- Andrew Lee
- Posts: 3052
- Joined: Sat Feb 04, 2006 9:19 am
- Contact:
Re: House of fail (we got scraped)
Quote: Before we panic, there is one more thing to check: could it be that the site is just a proxy that forwards all requests, including searches, to TPFC? You could check by searching for a random string of characters and then looking if such a string was searched via our website search.
Good hypothesis. I tried with a couple of random strings. Unfortunately, they did not appear in our database, meaning it's not a proxy.
Quote: Couldn't this be just some kind of experiment with the database dump mentioned elsewhere?
The database dump does not have enough auxiliary information to pull off this stunt. Much of the working data associated with each entry is not exported, but it is essential to running the site.
Re: House of fail (we got scraped)
Andrew Lee wrote:webfork brought it to my attention in a PM.
Credit for the initial find belongs to lintalist.
Steps I've taken:
- WOT post: https://www.mywot.com/en/scorecard/auqk.org
- Google takedown request: https://support.google.com/legal/answer/3110420?hl=en
- nickoftime
- Posts: 72
- Joined: Sat Dec 22, 2007 11:46 am
- Location: USA
Re: House of fail (we got scraped)
I hate to hear that this happened and if they actually 'stole' the site source code, I hope you can track these idiots down and prosecute them to the highest extent of the law. The credibility of TPFC will NOT be compromised on our watch, that's for sure.
Shout out to lintalist for finding it, webfork for calling out to Andrew and taking steps to stop it...
Hope you can get to the bottom of this... this is still our favorite site and you guys are the best.
- tactictoe
- Posts: 283
- Joined: Thu Dec 10, 2015 10:56 am
- Location: A galaxy far far downunder
- Contact:
Re: House of fail (we got scraped)
This last one is definitely a clone: the site in question is "auqk dot org", not the earlier ones listed in this topic.
These links might give you some ideas and help:
How do you send Digital Millennium Copyright Act (DMCA) notifications to AOL, Bing, Google, Yahoo!, and the other major search engines?
http://www.seologic.com/faq/dmca-notifications.
This website also talks about DMCA violations and is quite good: https://lorelle.wordpress.com/2006/04/1 ... r-content/
This website offers paid protection, plus some free services that might interest you and are worth investigating: https://www.dmca.com/. Even just for reading about copyright violation, this site is pretty good.
This website offers free protection resources that might interest you and answers lots of questions: https://www.owasp.org/index.php/Main_Page. It is really worth reading.
An idea for the server as a whole (I am trying to help here; I have no idea whether this is possible, or whether a copyright violator could bypass it in the future, so maybe I am just dreaming):
Encrypt your pages and decrypt them only on this site (DNS), using a sort of 'ATM' code that is kept offline (off the internet, but not the LAN). If the input code is incorrect, the page stays encrypted and unusable.
Anti-web-crawling measures: surely there is a way to block this type of request, isn't there?
Websites that are particularly challenging to crawl and scrape?
http://stackoverflow.com/questions/1876 ... and-scrape
This document might help you too: http://www.imperva.com/docs/wp_detectin ... ttacks.pdf
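One common anti-crawling measure is per-client rate limiting. Here is a minimal sliding-window sketch in Python, offered as an illustration only: TPFC itself runs on PHP, the class name and limits are made up, and a determined scraper can get around this by rotating addresses.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._hits = defaultdict(deque)  # client id -> timestamps of recent requests

    def allow(self, client_id):
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max:
            return False  # over the limit: reject or delay this request
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(max_requests=3, window_seconds=1.0)
    print([limiter.allow("203.0.113.7") for _ in range(5)])
```

A crawler that fetches every page in quick succession trips the limit, while a person browsing at normal speed should not notice it.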
That is it.
I am so sorry for what is happening here. I do hope that this site will be taken down, and/or that the countermeasures you find will make the guilty site unusable.
Best of luck to the team of TPFC.
Kind regards, Ben (aka tactictoe)
Last edited by tactictoe on Sat Feb 13, 2016 7:35 pm, edited 1 time in total.
- tactictoe
- Posts: 283
- Joined: Thu Dec 10, 2015 10:56 am
- Location: A galaxy far far downunder
- Contact:
Re: House of fail (we got scraped)
Forgot to mention in my last post, but I am sure you have been there too:
http://dpivst.com/en/ireport/auqk dot org
Re: House of fail (we got scraped)
Okay, I have run out of easy explanations.
Their website even has the latest updates, so whatever they are doing to access the database, it's still going on.
Just how are they doing this? And why?
Re: House of fail (we got scraped)
BUT! If someone has full access to the TPFC database, would it be wise to try and change anything in our accounts now (password, etc.)? This is my question.