Posting program hashes

All suggestions about TPFC should be posted here. Discussions about changes to TPFC will also be carried out here.
webfork
Posts: 10836
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

Posting program hashes

#1 Post by webfork »

Problem
  • A recurring problem here on the site is losing track of an individual file or download location, as with the Simpo PDF to Text situation today. Right now I do a bunch of digging to find an alternative, but unless I already had that program downloaded to my machine, I have no way to verify that what I find is the same file.


Solution
  • A faster, probably more reliable way was suggested by a friend of mine: search Google for the file's MD5 hash. With Simpo, for example, I pasted the MD5 hash and tracked down a site that didn't come up in my previous search. Additionally, that file is MUCH more likely to be the file that I'm looking for, and I don't have to go digging inside the archive file for a version number. This is awesome.

    Because they are considered weak, MD5 and SHA1 are gradually being replaced by stronger hashing methods including SHA256 and SHA512. Since the newer ones are still relatively new and less widely published, maybe one hash value (such as MD5) would be useful for tracking down the file while another (such as SHA256) is more suitable for actual verification.
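
A minimal sketch of how all four values could be produced for a downloaded file in one pass, assuming Python's standard `hashlib`. The function name `file_hashes` and the chunk size are made up for illustration and are not part of any site tooling.

```python
import hashlib

# The four algorithms mentioned in this thread, using hashlib's names.
ALGORITHMS = ("md5", "sha1", "sha256", "sha512")


def file_hashes(path, algorithms=ALGORITHMS, chunk_size=1024 * 1024):
    """Return {algorithm: hex digest} for the file at `path`."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as handle:
        # Read in chunks so a large installer does not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

The MD5 value from the result is the one to paste into a search engine; the SHA256 or SHA512 value is the one to compare when verifying.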
Recommendation
  • Obviously we could just paste a hash value into the entry but to avoid crowding, I also came up with something a bit more involved:
    1. Include a "Hash" link somewhere in the entry. This would go to a sub-page with available values including MD5, SHA-1, SHA256, and SHA512 (as far as I can tell, these are the most popular). Obviously this would be optional and if the poster/updater doesn't fill these out or care, the link would not show up in the entry.
    2. When updating an entry, any changes to version number would mean a second page asking if the user wants to update the hash values. This page would have blanks for available hash values and the options "Ignore/Skip" (leave them empty) "Keep Previous Values" (no change in hash values) or "Save" to add whatever was typed in.

      The "Keep Previous Values" would be necessary if there was a case the version was entered incorrectly or some other oddity (this has come up for me).
  • Ideally old hashes should be somewhere in the edit history in case we have to go back a version due to software phoning home or otherwise causing problems (like when XMedia Recode was somehow broken a few months back).
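
To make the recommendation concrete, here is a rough sketch of the data it implies: an optional set of hash values per entry, the three choices on the update page, and a history of earlier values. Everything here (`HashEntry`, `update_entry`, the choice strings) is hypothetical and only models the idea; it is not how the site database works.

```python
from dataclasses import dataclass, field


@dataclass
class HashEntry:
    version: str
    hashes: dict = field(default_factory=dict)   # e.g. {"md5": "...", "sha256": "..."}
    history: list = field(default_factory=list)  # earlier (version, hashes) pairs

    def show_hash_link(self):
        # The "Hash" link only appears when at least one value was filled in.
        return any(self.hashes.values())


def update_entry(entry, new_version, choice, typed=None):
    """Apply a version change; `choice` is "skip", "keep", or "save"."""
    if choice not in ("skip", "keep", "save"):
        raise ValueError("choice must be 'skip', 'keep', or 'save'")
    # Keep the old values so it is possible to go back a version later.
    entry.history.append((entry.version, dict(entry.hashes)))
    entry.version = new_version
    if choice == "skip":
        entry.hashes = {}            # Ignore/Skip: leave the values empty
    elif choice == "save":
        # Save: store whatever was typed in, dropping blank fields.
        entry.hashes = {k: v for k, v in (typed or {}).items() if v}
    # Keep Previous Values: entry.hashes is left untouched.
    return entry
```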

---

Edit: I'm not the first person to come up with this.

m^(2)
Posts: 890
Joined: Sat Mar 31, 2007 2:38 am
Location: Kce,PL

Re: Posting program hashes

#2 Post by m^(2) »

My thoughts:
1. Maintenance nightmare. Quite costly and will frequently be outdated.
2. When it's outdated, searching for it will probably give only sites with an outdated version.
3. When a link is broken, users can search for the app name. It's far more straightforward, I guess that usually will lead to more results and you usually get the latest version.

So IMHO no solution is actually better than this. But if you want to solve the issue, I suggest a link crawler scanning all downloads links daily and giving some kind of alert when a link is dead for a couple of days.
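
A sketch of what such a crawler's daily pass might look like, assuming Python's standard `urllib` and that download hosts answer HEAD requests (not all do). The names `link_is_alive` and `daily_scan` and the two-day grace period are illustrative assumptions.

```python
import urllib.error
import urllib.request
from datetime import date


def link_is_alive(url, timeout=10):
    """Return True if a HEAD request to `url` succeeds."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # HTTP errors, network failures, timeouts, and malformed URLs.
        return False


def daily_scan(links, first_failure, today: date, check=link_is_alive, grace_days=2):
    """Run one daily pass over `links`.

    `first_failure` maps url -> date of the first failed check and is
    updated in place. Returns the links dead for at least `grace_days`.
    """
    alerts = []
    for url in links:
        if check(url):
            first_failure.pop(url, None)   # link recovered, forget old failures
            continue
        started = first_failure.setdefault(url, today)
        if (today - started).days >= grace_days:
            alerts.append(url)
    return alerts
```

Waiting a couple of days before alerting avoids noise from servers that are only briefly down.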

webfork
Posts: 10836
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

Re: Posting program hashes

#3 Post by webfork »

> 1. Maintenance nightmare. Quite costly
Why?
> 2. When it's outdated, searching for it will probably give only sites with an outdated version.
True.
> 3. When a link is broken, users can search for the app name. It's far more straightforward, I guess that usually will lead to more results and you usually get the latest version.
I've spent a lot of time fixing dead entries over the past year and oddly app name searching has been a hit-and-miss process. With hashing, I can find and verify the download quickly. We get around problems with Download.com and other sites yet to be named who modify the original executable, as well as whether you trust the place you're downloading from.
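
The "verify" half of this can be sketched in a few lines, assuming Python's standard `hashlib` and a hash value copied from a trusted listing. The function name `matches_published_hash` is hypothetical.

```python
import hashlib


def matches_published_hash(path, expected_hex, algorithm="sha256"):
    """Return True if the file at `path` hashes to `expected_hex`."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    # Published hashes are often upper-case or padded with whitespace.
    return hasher.hexdigest() == expected_hex.strip().lower()
```

If a mirror has wrapped or modified the original download, the digest differs and the check returns False, wherever the file came from.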

m^(2)
Posts: 890
Joined: Sat Mar 31, 2007 2:38 am
Location: Kce,PL

Re: Posting program hashes

#4 Post by m^(2) »

webfork wrote:
> > 1. Maintenance nightmare. Quite costly
> Why?
When I wrote this I thought about the need to install an app to get the checksums. But now I see that you meant checksums of the installer. Quite sensible and invalidates my remark above.
webfork wrote:
> > 3. When a link is broken, users can search for the app name. It's far more straightforward, I guess that usually will lead to more results and you usually get the latest version.
> I've spent a lot of time fixing dead entries over the past year and oddly app name searching has been a hit-and-miss process. With hashing, I can find and verify the download quickly. We get around problems with Download.com and other sites yet to be named who modify the original executable, as well as whether you trust the place you're downloading from.
I don't see any mention of executable modification either in the thread or on the CNET site linked from there.
Only installer mods. But IMHO these are not problems for PF to solve. It's rather app developers who are willing to upload their software to crap sites.

Idea: API for app devs that lets them update entries automatically.
Idea2: Torrents for app storage for apps that have their entries kept up to date.

webfork
Posts: 10836
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

Re: Posting program hashes

#5 Post by webfork »

>Only installer mods.
  • Yes that's a better description.
> But IMHO these are not problems for PF to solve. It's rather app developers who are willing to upload their software to crap sites.
>
> Idea: API for app devs that lets them update entries automatically.

  • Mainly, the problem I'm trying to solve is the slow transition from freeware to abandonware: the hours and hours of work that people have done over the years are basically thrown away when the executable disappears. I don't want to ask Andrew to host the files as we're already using a lot of bandwidth.
> Idea2: Torrents for app storage for apps that have their entries kept up to date.
  • Yes, that would be a legitimate and workable alternative that I could probably put together over a weekend. Good call.

    The only weakness is a single point of failure; if no one is seeding, or if someone can't download torrents from their location due to security or other issues, you're stuck. The point of the hashing bit is to be location agnostic -- it doesn't matter where you get it -- assuming you trust PFWC -- you know it's good.

    The hashing idea also has probably the same number of weaknesses so at this stage a torrent seems the most reasonable route.

donald
Posts: 561
Joined: Wed Dec 19, 2007 4:14 am
Location: knoxville TN USA

Hash this out ...

#6 Post by donald »

Hash this out ... :lol: A proposition

Possibly we could post hashes to the comments; if we do, a version number and time stamp are appended automatically (potentially helpful if a new version is unacceptable after an update, as well as with abandonware, etc.).

When an application is updated the primary version could be downloaded, hashed and the hash posted to the comments by volunteers.
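
As a rough illustration, a volunteer's hash comment could be generated in a fixed layout so that it is easy to search for later. The layout and the function name `hash_comment` are invented here, and the app name in the usage note is a placeholder.

```python
from datetime import datetime, timezone


def hash_comment(app, version, hashes, when=None):
    """Format hash values as a plain-text comment with version and time stamp."""
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    lines = [f"{app} {version} (hashed {when.strftime('%Y-%m-%d %H:%M')} UTC)"]
    for name in sorted(hashes):
        lines.append(f"{name.upper()}: {hashes[name]}")
    return "\n".join(lines)
```

For example, `hash_comment("ExampleApp", "1.2", {"md5": "..."})` returns a header line with the version and time stamp, followed by one line per hash.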

Finally, an option in the entry form (Y/N Dist.) could point out whether an application may be freely distributed or not, and volunteers could host it and make it available through cloud storage like that offered with email accounts (for example, SkyDrive).

Maybe this relies too much on the elusive resource Volunteers? ... Perhaps.

But good applications have their supporters who might volunteer to help support their favorite redistributable application.

m^(2)
Posts: 890
Joined: Sat Mar 31, 2007 2:38 am
Location: Kce,PL

Re: Posting program hashes

#7 Post by m^(2) »

webfork wrote:
> > Idea2: Torrents for app storage for apps that have their entries kept up to date.
> Yes, that would be a legitimate and workable alternative that I could probably put together over a weekend. Good call.
> The only weakness is a single point of failure; if no one is seeding, or if someone can't download torrents from their location due to security or other issues, you're stuck.
Yeah, I've been thinking about authors seeding their stuff, but it won't work for abandonware.

Well, I think there's a niche for one program; this is not the first place where I thought it would be useful.
A super-simple app for community members who are willing to help: a torrent daemon that you configure once and forget. By configure I mean bandwidth/disk space limits, just that. A project like TPFC would take a template, insert their own artwork and settings, set up a server/tracker telling clients what needs seeding and acting as a backup download place (for users who have torrent traffic blocked and for things that are without seeders for the moment), and ask their users for help. The backup server might not work well for TPFC if bandwidth or space are too limited though. And I'm not sure if TPFC has a big enough community, but I think yes.
webfork wrote:The point of the hashing bit is to be location agnostic -- it doesn't matter where you get it -- assuming you trust PFWC -- you know it's good.
Indeed.
In fact torrents/ed2k links etc. are nothing but hashes too. I wonder why they are never used as such, for hashing downloads they would do just as well as SHA, but would come with the benefit that when a link is dead, you can get it from other places too.
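
To illustrate the point that a torrent is itself a hash: in the original BitTorrent format (v1), the identifier in a magnet link is the SHA-1 of the bencoded "info" dictionary, which in turn contains the SHA-1 of every piece of the file. The sketch below handles only a single file held in memory; the function names and the 256 KiB piece length are illustrative choices.

```python
import hashlib


def bencode(value):
    """Encode ints, bytes, str, lists, and dicts using BitTorrent's bencoding."""
    if isinstance(value, bool):
        raise TypeError("booleans are not part of bencoding")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda pair: pair[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def single_file_info_hash(data, name, piece_length=256 * 1024):
    """Return the hex v1 info-hash for a single-file torrent of `data`."""
    pieces = b"".join(
        hashlib.sha1(data[start:start + piece_length]).digest()
        for start in range(0, len(data), piece_length)
    )
    info = {"length": len(data), "name": name,
            "piece length": piece_length, "pieces": pieces}
    return hashlib.sha1(bencode(info)).hexdigest()


def magnet_link(info_hash):
    return "magnet:?xt=urn:btih:" + info_hash
```

Note that the info-hash also depends on the file name and piece length, so unlike a plain SHA value, two people hashing the same download get the same identifier only if they use the same settings.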

Overall, hashing seems the simplest option though.
But it opens some new possibilities too. The server could download apps from time to time and check the checksums. If they change, it could send notifications that the apps have been updated.
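
A sketch of that periodic check, assuming Python's standard `urllib` and `hashlib`. The names `fetch_bytes` and `detect_updates` and the url-to-digest dictionary are hypothetical; a real server would persist the digests somewhere and send the notifications itself.

```python
import hashlib
import urllib.request


def fetch_bytes(url, timeout=30):
    """Download `url` and return its content as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def detect_updates(known, fetch=fetch_bytes):
    """Re-hash every download and report the ones whose content changed.

    `known` maps url -> last seen SHA256 hex digest (or None if never
    hashed) and is updated in place. Returns the list of changed urls.
    """
    changed = []
    for url, old_digest in list(known.items()):
        try:
            digest = hashlib.sha256(fetch(url)).hexdigest()
        except OSError:
            continue  # unreachable right now; leave the stored digest alone
        if old_digest is not None and digest != old_digest:
            changed.append(url)
        known[url] = digest
    return changed
```

A changed digest at a fixed URL suggests the developer published a new build, so each returned URL is a candidate for an "entry may be outdated" notification.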

bzl333
Posts: 167
Joined: Wed Jan 12, 2011 3:11 pm

Re: Posting program hashes

#8 Post by bzl333 »

m^(2) wrote:
It's rather app developers who are willing to upload their software to crap sites.
It might be worthwhile if PFC would post a list of decent sites and also let folks know which ones aren't so good. For myself, I've been downloading free programs for at least 10 years and sometimes I'll stumble onto some site like "FreeDownloadCenter" or something similar and won't have any idea if it's bogus or not.

webfork
Posts: 10836
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

Re: Posting program hashes

#9 Post by webfork »

Old thread update:

I haven’t been able to do the site torrent as m^(2) suggested (this would be a great project) but I did want to link to a related thread where I’ve implemented SOME of the original thread suggestions around hash inclusion using SigCheckGUI, though informally and in the forums.

Specular
Posts: 443
Joined: Sun Feb 16, 2014 10:54 pm

Re: Posting program hashes

#10 Post by Specular »

I like the idea. Having hashes of downloads would provide a useful way of comparing if anything had been modified in any mirrors one finds of unsigned, abandoned software.

The problems I can see are: increased user maintenance and entry time per update, and no clear way of other users verifying which source the hashes are from (which is important since it could be any version).

Since the database already keeps a history of changes the separate hash page mightn't be necessary.

The best way of sourcing original, abandoned files is still archive.org imo. That way everyone can check for themselves that the source is original, the archived copy can be used as the download link if necessary, and users can compare files themselves using hashes with programs like Md5Hash. Sometimes such archived sites also provide their own hashes (though that's rarer).

Midas
Posts: 6905
Joined: Mon Dec 07, 2009 7:09 am
Location: Sol3

Re: Posting program hashes

#11 Post by Midas »

A further possibility would be automatically generated hashes -- allowing the journalling capabilities already in place to take care of the rest... Is this too much of a far shot? I'm thinking of plain old md5 (though VirusTotal integration would be even better).
