Software catalog - Database for TPC content

tactictoe
Posts: 283
Joined: Thu Dec 10, 2015 10:56 am
Location: A galaxy far far downunder

Software catalog - Database for TPC content

#1 Post by tactictoe »

I have an idea that might interest the whole community.

What about creating a portable software database (as light as possible) containing the catalog of ALL the software on TPFC, by category?

This software would be able to update the catalog on demand. How?
Here is where it might hurt: a JSON API for the TPFC catalog? An XML file? A special server report? Any of these would be better than scraping/parsing the whole site, which is never efficient: every page modification means code to review, and potentially broken software (if the built-in algorithm cannot recover from it). No other solution comes to mind for now...

In terms of features, I can see this (draft):
Menu list of software categories > Software list per category (including submitted software?) > A software page from that category, as presented here:
- With or without the possibility to download the file directly (without it, a visit to the TPFC page is needed to retrieve the link).
- Information on whether this software was downloaded by the user and where it is located (launcher).
- Personal rating and notes for the software.
Possibility to submit software directly from the tool, with a check that it is not a bot doing it?
And who knows what else could be useful.

If TPFC approves, I say let's do it. I am even willing to develop this software solo, with the list of features approved by the community and the TPFC team.

This is just a draft of an idea that has potential.
What do you think?

SYSTEM
Posts: 2041
Joined: Sat Jul 31, 2010 1:19 am
Location: Helsinki, Finland

Re: Software catalog - Database for TPC content

#2 Post by SYSTEM »

It is already possible to get a database dump: http://www.portablefreeware.com/dump.php. The database dump is in the CSV format. To download it, you need to be logged in with an account that has the privilege to insert new programs into the database (registered for at least one day + not banned).
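As a rough illustration of what a tool could do with that dump, here is a minimal Python sketch that groups entries by category. The column positions (name in the second column, category in the fifth) are an assumption inferred from the single sample record quoted later in this thread; the real dump may be laid out differently, and the sample data below is made up.

```python
import csv
import io
from collections import defaultdict

# Assumed column positions, inferred from one sample record in this thread.
NAME_COL = 1
CATEGORY_COL = 4


def group_by_category(stream):
    """Return {category: [program names]} read from a CSV dump stream."""
    catalog = defaultdict(list)
    for row in csv.reader(stream):
        if len(row) <= CATEGORY_COL:
            continue  # skip rows too short to hold a category
        catalog[row[CATEGORY_COL]].append(row[NAME_COL])
    return dict(catalog)


# Tiny made-up sample standing in for the real dump.
SAMPLE = (
    '"1","Alpha","http://example.com/a","http://example.com/a.zip","Text - Editors"\n'
    '"2","Beta","http://example.com/b","http://example.com/b.zip","Text - Editors"\n'
    '"3","Gamma","http://example.com/c","http://example.com/c.zip","System - Misc"\n'
)

if __name__ == "__main__":
    groups = group_by_category(io.StringIO(SAMPLE))
    for category, names in sorted(groups.items()):
        print(f"{category}: {', '.join(names)}")
```

With a downloaded file, the same function could be fed `open("tpfc.csv", newline="", encoding="utf-8")`; the file name and the encoding are assumptions here.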
My YouTube channel | Release date of my 13th playlist: August 24, 2020

webfork
Posts: 10818
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

Re: Software catalog - Database for TPC content

#3 Post by webfork »

tactictoe wrote:What do you think?
I'm still not sure I understand the use case of something like this. But if you can demonstrate something with that CSV file, please do.

tactictoe
Posts: 283
Joined: Thu Dec 10, 2015 10:56 am
Location: A galaxy far far downunder

Re: Software catalog - Database for TPC content

#4 Post by tactictoe »

SYSTEM wrote:It is already possible to get a database dump: http://www.portablefreeware.com/dump.php. The database dump is in the CSV format. To download it, you need to be logged in with an account that has the privilege to insert new programs into the database (registered for at least one day + not banned).

Webfork wrote:I'm still not sure I understand the use case of something like this. But if you can demonstrate something with that CSV file, please do.
A tool from SYSTEM pointing to a database dump, and Webfork, though still a bit in the dark, giving me the green light to demonstrate: I am going to work on this one.
I will probably have questions... but a demonstration will be provided.
I intend to work on this software step by step until it is final, hoping to get more suggestions/ideas/requests along the way to implement in it.

One scenario...
A catalog that lives on a USB thumb drive:
On the go, you need one or more pieces of software? You have this software (the TPFC database tool, no name for now) on a thumb drive and have verified that you do not have the tool you need, but the TPFC database lists one, with a link to download it. You read about it, as you would on this site, then add it to your collection on your thumb drive and launch it from the software itself. Of course, the download is extracted and installed according to the install instructions in the TPFC DB; it is also marked as downloaded and stored on the thumb drive under the category provided by the TPFC DB.
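The bookkeeping part of this scenario (store each program under its category, remember that it was downloaded) could look roughly like the Python sketch below. Everything in it is hypothetical: the `state.json` file name, the folder layout and the function names are illustration only, and the download and extraction steps are left out.

```python
import json
import re
import tempfile
from pathlib import Path


def safe_name(text):
    """Replace characters that are not allowed in folder names with '_'."""
    return re.sub(r'[\\/:*?"<>|]+', "_", text).strip(" ._")


def install_dir(root, category, program):
    """Folder for one program: <root>/<category>/<program>."""
    return Path(root) / safe_name(category) / safe_name(program)


def mark_downloaded(root, entry_id, category, program):
    """Create the program folder and record it in <root>/state.json."""
    target = install_dir(root, category, program)
    target.mkdir(parents=True, exist_ok=True)
    state_file = Path(root) / "state.json"
    state = json.loads(state_file.read_text()) if state_file.exists() else {}
    # Store a path relative to the drive root, so a changed drive letter
    # does not invalidate the record.
    state[str(entry_id)] = {"downloaded": True, "path": str(target.relative_to(root))}
    state_file.write_text(json.dumps(state, indent=2))
    return target


if __name__ == "__main__":
    drive = tempfile.mkdtemp()  # stands in for the thumb drive root
    print(mark_downloaded(drive, 1687, "System - Installation/Configuration", "Reprofiler"))
```

Category names such as "System - Installation/Configuration" contain a slash, which is why the sketch sanitises names before using them as folders.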

Another scenario...
You already have the software/tool you need for the job: the TPFC database tool becomes a launcher and lets you launch this tool straight away from your thumb drive... A launcher with a database.

TPFC admin scenario...
The admin launches the tool, which performs these verifications:
- Download links are validated, and a report is generated for incorrect links. (easy)
- The version number of each program is compared with the one in the database, and a report is generated for updated versions TPFC is not aware of. (Complicated: needs an algorithm and page parsing, but can be done)
- more?
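The two checks above can be sketched in a few lines of Python. This is only an outline under stated assumptions: a link counts as valid when a HEAD request ends in status 200 (some servers refuse HEAD, so a real tool would need a GET fallback), and version numbers are assumed to be dot-separated integers. Finding the upstream version by parsing a page, the complicated part, is not covered. All function names are made up for illustration.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except (urllib.error.URLError, OSError, ValueError):
        return None


def broken_links(entries, fetch_status=http_status):
    """Yield (name, url, status) for every entry whose link does not answer 200."""
    for name, url in entries:
        status = fetch_status(url)
        if status != 200:
            yield name, url, status


def version_key(version):
    """Turn '1.10.2' into (1, 10, 2); non-numeric parts are ignored."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def is_newer(found, recorded):
    """True if the version found upstream is higher than the recorded one."""
    return version_key(found) > version_key(recorded)
```

Passing the status function as a parameter keeps the report logic testable without network access. Comparing versions as integer tuples avoids the string-comparison trap where "1.10" sorts before "1.9".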

And what about a developer scenario:
Someone would like to suggest/submit software (or modify their software's entry)... Easy or not so easy? A TWebBrowser integrated into the software will permit that: submission form and forum for an authenticated member.

Registration scenario:
At the first start of the software, you are asked for your TPFC credentials, or for a new member registration if you have none. The software remembers your credentials/registration information at the next startup. Banned users are informed that they cannot use this software.

Update of catalog: could be done manually or automatically via an option, pointing to the file SYSTEM provided. For the login part, I suppose I could create an internal login dialog to identify/authenticate a user against what is already enforced on the TPFC site.

And more scenarios: requests from approved users can be implemented...

I hope this helps to explain a little more clearly what I think is a good idea... The TPFC Special launcher tool, or whatever the name will be. To me, that would be a complementary tool that helps the community... IMHO.


Please ALL feel free to suggest anything for this newborn project... let's make it something UNIQUE, something no other software distribution website has ever done before. So, if you have an idea, as crazy as it may be, do not be shy: POST it.

Have a nice day.

tactictoe
Posts: 283
Joined: Thu Dec 10, 2015 10:56 am
Location: A galaxy far far downunder

Re: Software catalog - Database for TPC content

#5 Post by tactictoe »

Project is born: 26/02/2016
It will take a bit of time but it's on the way. I will post the result on a private page of my site and PM the concerned parties when done.
If approved, I will then publish it officially.

Project progress:
- Analysis is done and revised.
- UI look and function are currently being optimised according to the analysis.
- Some routines are done, and I will have some questions in the future, especially about database access. For now I am stuck with a TWebBrowser-type login routine. It works, though.

tactictoe
Posts: 283
Joined: Thu Dec 10, 2015 10:56 am
Location: A galaxy far far downunder

Re: Software catalog - Database for TPC content

#6 Post by tactictoe »

Project Suspended Until further notice. Sorry guys.

webfork
Posts: 10818
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

Re: Software catalog - Database for TPC content

#7 Post by webfork »

Sorry this took me so long to get back to. Even though you suspended it, I'll try to respond to this just because it was a bit rude on my part.
tactictoe wrote:
Webfork wrote:I'm still not sure I understand the use case of something like this.
On the go, you need one or more pieces of software? You have this software (the TPFC database tool, no name for now) on a thumb drive and have verified that you do not have the tool you need, but the TPFC database lists one, with a link to download it. You read about it, as you would on this site, then add it to your collection on your thumb drive and launch it from the software itself. Of course, the download is extracted and installed according to the install instructions in the TPFC DB; it is also marked as downloaded and stored on the thumb drive under the category provided by the TPFC DB.

You already have the software/tool you need for the job: the TPFC database tool becomes a launcher and lets you launch this tool straight away from your thumb drive... A launcher with a database.
This sounds a bit like what UGMFree was trying to do with SyMenu. I welcome other efforts on something like this, as I feel his approach was too dependent on one person.
tactictoe wrote:The admin launches the tool, which performs these verifications:
- Download links are validated, and a report is generated for incorrect links. (easy)
- The version number of each program is compared with the one in the database, and a report is generated for updated versions TPFC is not aware of. (Complicated: needs an algorithm and page parsing, but can be done)
I could definitely use that. I've been meaning to run the website as a whole through some link-checking tools for quite some time now. There are a LOT of things I'd like to do on the website when I have time.
tactictoe wrote:developer, registration, update of catalog
All those items sound like a front-end for the website. It's intriguing but I wonder if you're reinventing something that seems to work fairly well.

If there was a way to build in a distributed database (all the data listed here no longer relies on portablefreeware.com) I could see that being very useful.
tactictoe wrote:Please ALL feel free to suggest anything for this newborn project... let's make it something UNIQUE, something no other software distribution website has ever done before. So, if you have an idea, as crazy as it may be, do not be shy: POST it.
That is a bold statement. I could (and have) come up with multiple ideas here. Also, have you seen the requests thread?
Last edited by webfork on Fri Feb 26, 2016 12:59 pm, edited 1 time in total.
Reason: (updated the requests link which was pointing to the wrong place)

tactictoe
Posts: 283
Joined: Thu Dec 10, 2015 10:56 am
Location: A galaxy far far downunder

Re: Software catalog - Database for TPC content

#8 Post by tactictoe »

The project was suspended for family reasons, not because I am running at capacity. Sometimes one is kindly asked to remember one's family and forget computers and community for a little while. It will resume when the other software I work on is completed or can wait for its next upgrade. For now, mainly:
- Movie Info search: slowed down a little, but only because a major upgrade is coming soon. This upgrade is bringing me some more grey hairs.
- Search and Delete: slowed down a little for the same reason as MIS; one of the new features will be multiple-file search with multiple masks. Even though it already works, I keep improving the algorithm to cut the time the search takes. Another source of grey hairs, and still the same coffee machine.
- URL Monit@r: this one is on steroids, as I have not yet achieved all I wanted with this software. And it's a lot. Documentation also has to be done; it is only partial for now.
- Other projects that will never be released to the public because they are private. Mainly robotics and automation software and hardware for improving my family's life, and some gadgets for me.
- Work is taking its toll too; I bring work home.
This sounds a bit like what UGMFree was trying to do with SyMenu. I welcome other efforts on something as I feel like his approach was too dependent on one person.
I agree. I could share the source (.pas) with anyone savvy in the Pascal language, with Delphi as the main compiler or even Lazarus (free IDE). That is, if someone would like to work with a guy who is on steroids when passionate about one of his creations.
I could definitely use that. I've been meaning to run the website as a whole through some link-checking tools for quite some time now. There are a LOT of things I'd like to do on the website when I have time.
These are features I find very useful for the end user and for TPFC itself, and they WILL be implemented.
Edited: Actually, URL Monit@r was born from one of my own needs, but it is also, in a way, a finished reference prototype (code reuse) for most of the functions I might need in this proposed software.
All those items sound like a front-end for the website. It's intriguing but I wonder if you're reinventing something that seems to work fairly well.

If there was a way to build in a distributed database (all the data listed here no longer relies on portablefreeware.com) I could see that being very useful.
I am aware TPFC bandwidth is precious; the only thing the software will do is update the database on demand via the dump. The main idea is to use the user's bandwidth for every online feature proposed, except the database dump.
That is a bold statement. I could (and have) come up with multiple ideas here.

Who dares wins. It is a dare to trigger a reaction. IMHO being bold is a good thing if you do not overdo your action or statement and do not offend anyone.
Also, have you seen the requests thread?
This is something that bothers me a little, not your question but the way information can be found in the forum. It is NOT easy for newbies to find a topic about something specific. I am still lost in that part of this forum. There are hundreds of topics scattered all around it. How to get the right one? How to know it has not been superseded by another topic, or whether it is even relevant? It might seem easy for old members, but to me, as a fairly new member, it certainly is not. On top of that, you need to have the idea to look for it, and the will or courage to browse and read one or more entries among these hundreds of topics (especially the popular ones). So, to answer your question: no, I did not.

tactictoe
Posts: 283
Joined: Thu Dec 10, 2015 10:56 am
Location: A galaxy far far downunder

Re: Software catalog - Database for TPC content

#9 Post by tactictoe »

I am relaunching the project and working on it. However, I have met some serious obstacles, resulting in questions. I am trying to solve these myself first; if I cannot, they will be posted here in the hope of an answer.

:?: A bit of a Chinese puzzle, lost in a maze...
Main obstacle: logging on to the TPFC site through the software. So far I have tried around 12 different methods to get access via the software; even when it seems to work, it does not. When I try to download the database I am asked to log in, even with a cookie handler and the like. I have a workaround for this one, but I am still trying first to log in with simple user input (login and password), to be able to download the database dump without a physical login from the web. I would love more input on where to send the login and the password (a URL where they are submitted, perhaps), the structure of the data expected... whatever can help. As I am sure this information is not to be posted, please PM or email me if you have it. I would appreciate it. The goal is to be able to log in securely and genuinely from the software, with the username and password required by TPFC, to update the database.

:!: Workable CSV file with errors, another puzzle...
Secondary obstacle: I am forced to parse the whole database for bad CSV input. YES, your database contains errors. I have created a prototype just for you to verify your database. The file contains the database I was working on at the time (your CSV, untouched) and the executable that loads that database. The errors come from, e.g., HTML tags sometimes included inside the database. Check between field entries 57 and 58... 633 and 634... 919 and 920... 1452 and 1453 (interesting)... 1656 and 1657... 1687 and 1688... 2125 and 2126... 2289 and 2290... 2305 and 2306... > errors out of 2768 records. And please do not tell me my code is faulty... IT IS NOT. I use this code with hundreds of CSV files from various sources, including my own CSV data dumps, and it never fails. Even though the database is workable, you might want to look into those errors.
And here is the link to the prototype to check for yourself: http://www.geocities.ws/tactictoe/TPFCP ... aProto.zip
To use it: just extract the executable and the database (tpfc.csv) into the same folder and launch the prototype (Project1.exe). The prototype accepts any CSV file (delimiter is ','). It works just fine with other CSV files loaded into it (of course, for now it looks for the tpfc.csv database, hard coded; any CSV named this way will load).

I can still work around both problems and continue to develop the software, though. I need more coffee now; I am losing myself in the shuffle again.
Anyway, have a nice day, all.

Edited: you will also notice these errors in Excel, but at other positions, as it handles the CSV in a different way. However, they are there.

SYSTEM
Posts: 2041
Joined: Sat Jul 31, 2010 1:19 am
Location: Helsinki, Finland

Re: Software catalog - Database for TPC content

#10 Post by SYSTEM »

tactictoe wrote: :?: A bit of a Chinese puzzle, lost in a maze...
Main obstacle: logging on to the TPFC site through the software. So far I have tried around 12 different methods to get access via the software; even when it seems to work, it does not. When I try to download the database I am asked to log in, even with a cookie handler and the like. I have a workaround for this one, but I am still trying first to log in with simple user input (login and password), to be able to download the database dump without a physical login from the web. I would love more input on where to send the login and the password (a URL where they are submitted, perhaps), the structure of the data expected... whatever can help. As I am sure this information is not to be posted, please PM or email me if you have it. I would appreciate it. The goal is to be able to log in securely and genuinely from the software, with the username and password required by TPFC, to update the database.
I sent Andrew a PM. Hopefully he can help you.
tactictoe wrote: :!: Workable CSV file with errors, another puzzle...
Secondary obstacle: I am forced to parse the whole database for bad CSV input. YES, your database contains errors. I have created a prototype just for you to verify your database. The file contains the database I was working on at the time (your CSV, untouched) and the executable that loads that database. The errors come from, e.g., HTML tags sometimes included inside the database. Check between field entries 57 and 58... 633 and 634... 919 and 920... 1452 and 1453 (interesting)... 1656 and 1657... 1687 and 1688... 2125 and 2126... 2289 and 2290... 2305 and 2306... > errors out of 2768 records. And please do not tell me my code is faulty... IT IS NOT. I use this code with hundreds of CSV files from various sources, including my own CSV data dumps, and it never fails.
Your code is faulty. :P

I had a look at the database dump. In all the records you complain about, the License field spans multiple lines. For example here:

"1687","Reprofiler","http://iwr.cc/reprofiler","http://iwr.cc/download/reprofiler.zip","System - Installation/Configuration","Win2K/WinXP/Vista/Win7","[url=http://www.gnu.org/licenses/gpl.html]GPL v3[/url] (note that source does not appear to be available on the [url=http://sourceforge.net/projects/reprofiler/]SourceForge site[/url])
[url=http://www.gnu.org/licenses/gpl.html][/url]","","1","Reprofiler offers a quick and easy way to see which profile is associated with which user. If a problem is evident, it then provides an intuitive and straightforward means of correcting the profile ownership."
Fields that span multiple lines are allowed per RFC 4180:
RFC 4180 wrote: 6. Fields containing line breaks (CRLF), double quotes, and commas
should be enclosed in double-quotes. For example:

"aaa","b CRLF
bb","ccc" CRLF
zzz,yyy,xxx
Thus, a standards-compliant CSV parser can load our dump. If Excel gives you an error when trying to open the file, then its parser isn't standards-compliant. LibreOffice Calc can open the dump flawlessly.
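To make the difference concrete, here is a small Python sketch that feeds the RFC 4180 example above to a naive line-by-line splitter and to Python's standard `csv` module. It only demonstrates the multi-line rule; it says nothing about how any particular prototype or spreadsheet parses the dump.

```python
import csv
import io

# The RFC 4180 example: two records, the first with a line break
# inside its quoted second field.
DATA = '"aaa","b\r\nbb","ccc"\r\nzzz,yyy,xxx\r\n'

# Naive approach: treat every physical line as one record.
naive = [line.split(",") for line in DATA.splitlines()]

# RFC 4180-aware approach: the csv module keeps the quoted line break
# inside the field instead of starting a new record.
proper = list(csv.reader(io.StringIO(DATA)))

print(len(naive))   # 3 physical lines
print(len(proper))  # 2 logical records
print(proper[0])    # ['aaa', 'b\r\nbb', 'ccc']
```

When reading a real file instead of a string, the Python documentation recommends opening it with `newline=""` so that embedded line breaks reach the parser untouched.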

tactictoe
Posts: 283
Joined: Thu Dec 10, 2015 10:56 am
Location: A galaxy far far downunder

Re: Software catalog - Database for TPC content

#11 Post by tactictoe »

No, the code isn't faulty, as it reads the database. You would know if you had used the prototype. BUT... are URL tags part of your database? Hmmmm? If so, why are 97% of the URLs decoded correctly and the other 3% not? To me it indicates a visible problem you might not be aware of.

With Excel, LibreOffice, OpenOffice or any other CSV reader (I tried 5 of them, all compliant) I get the same result:
Libroffice.png
So you are telling me this is absolutely normal, isn't it? Wrong output leads to wrong reading... that is all I am saying. I did my homework before posting, and I am trying to help... Oh well, same old, same old. I can still parse the data, strip it of any BBCode (?) and tags found, and make the output data REALLY suitable for human reading. It is already done anyway.
And please do not lecture me on how a CSV file works, will you? I have MORE than 40 years of experience in developing hardware and software. CSV is not new to me, and neither is using it.
Anyway, thanks for PMing Andrew, and have a nice day, :)

SYSTEM
Posts: 2041
Joined: Sat Jul 31, 2010 1:19 am
Location: Helsinki, Finland

Re: Software catalog - Database for TPC content

#12 Post by SYSTEM »

tactictoe wrote: Image
So you are telling me this is absolutely normal, isn't it?
Oh, that. Yes, it's normal.

The description of each entry is arbitrary text and can optionally contain BBCode. For wrappered portable applications, we have usually, by convention, started the entry with a link to the upstream project. In your screenshot, Firefox Portable, Thunderbird Portable and Sunbird Portable are all wrappered applications and their entries start with such links. The database dump, of course, gives you the raw BBCode.

If you want to display the entries properly, you need a BBCode parser. Alternatively you can just drop BBCode from the entries.
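The second option, dropping the BBCode, can be approximated with two regular expressions. The Python sketch below is deliberately simple and rests on assumptions: it keeps the label of `[url=...]label[/url]` links, removes any other simple `[tag]` or `[tag=value]`, and would also strip ordinary bracketed words that happen to look like a tag. A full BBCode parser would be needed to render the entries properly.

```python
import re

# [url=http://...]label[/url] and [url]http://...[/url]
URL_TAG = re.compile(r"\[url(?:=([^\]]*))?\](.*?)\[/url\]", re.IGNORECASE | re.DOTALL)
# Any remaining simple tag such as [b], [/b], [i] or [size=12]
ANY_TAG = re.compile(r"\[/?[a-zA-Z*]+(?:=[^\]]*)?\]")


def strip_bbcode(text):
    """Return text without BBCode; link labels are kept, empty links vanish."""
    text = URL_TAG.sub(lambda match: match.group(2), text)
    text = ANY_TAG.sub("", text)
    return re.sub(r"[ \t]+", " ", text).strip()


if __name__ == "__main__":
    print(strip_bbcode("[url=http://www.gnu.org/licenses/gpl.html]GPL v3[/url]"))
```

Applied to the License field of the Reprofiler record quoted earlier in the thread, this leaves plain text ending in "SourceForge site)", and the empty trailing link disappears.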
tactictoe wrote: And please do not lecture me on how a CSV file works, will you? I have MORE than 40 years of experience in developing hardware and software. CSV is not new to me, and neither is using it.
Yeah, I guess I was a bit too harsh. Sorry. :(

Andrew Lee
Posts: 3052
Joined: Sat Feb 04, 2006 9:19 am

Re: Software catalog - Database for TPC content

#13 Post by Andrew Lee »

tactictoe wrote:Main obstacle: Logon the TPFC site through the software.
This one is a toughie. I'm primarily tagging on to phpBB's authentication system. The PHP code looks something like:

global $auth, $user;
$user->session_begin();
$auth->acl($user->data);
which is boilerplate stuff, I believe. If your code is able to login and retrieve the forum front page showing the user is logged in, I think you should be able to retrieve the CSV file.

Feel free to PM me any sample code or executable. I will try to help.
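For readers attempting the same thing, the idea of "log in, keep the session cookie, then fetch the CSV" can be sketched in Python as below. Several things are assumptions and not confirmed anywhere in this thread: that the login form lives at `ucp.php?mode=login` under the forum URL, that it expects `username`, `password` and `login` fields (stock phpBB conventions as far as I know), and that the hidden fields on the login page must be echoed back. The forum URL itself is a parameter because the thread does not state it.

```python
import http.cookiejar
import urllib.parse
import urllib.request
from html.parser import HTMLParser


class HiddenFields(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        is_hidden = (attrs.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_login_form(login_page_html, username, password):
    """Merge the login page's hidden fields with the credentials."""
    parser = HiddenFields()
    parser.feed(login_page_html)
    form = dict(parser.fields)
    # Field names below are assumed phpBB defaults.
    form.update({"username": username, "password": password, "login": "Login"})
    return form


def download_dump(forum_url, dump_url, username, password):
    """Log in through one cookie-aware opener, then fetch the dump with it."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    login_url = urllib.parse.urljoin(forum_url, "ucp.php?mode=login")  # assumed path
    with opener.open(login_url) as page:
        html = page.read().decode("utf-8", errors="replace")
    form = build_login_form(html, username, password)
    opener.open(login_url, urllib.parse.urlencode(form).encode()).close()
    with opener.open(dump_url) as response:
        return response.read()
```

The key design point is that a single opener (and therefore a single cookie jar) is used for the login page, the login POST and the dump request, so the session started by the first request is the one that ends up authenticated.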
tactictoe wrote:Secondary obstacle: I am forced to parse the whole database for bad CSV Input.
You are right. This one is a far easier problem. I have fixed up the CSV output. Let me know if you hit any other issues and I will fix those as well.

tactictoe
Posts: 283
Joined: Thu Dec 10, 2015 10:56 am
Location: A galaxy far far downunder

Re: Software catalog - Database for TPC content

#14 Post by tactictoe »

@Andrew Lee

Thanks for answering the call of System about these two obstacles.

Main obstacle:
That's what I suspected: a script acting a little like a firewall. The code I wrote attempts to log in by sending an HTTP request. Even with a successful request, nothing happened in the background. The phpBB script is the guilty one; the information is passed but not used. The code can still reach the login page the classic way, via a TWebBrowser or the like, and the database download works that way. That was my original idea, and it worked fine for the database download; I just did not like the fact that the end user had to log in that way. I will go with this solution and work out later how to get the phpBB script to accept the input of the HTTP request.
I will first need to learn how it works: phpBB, PHP, CSS and all these internet languages, except Java and XML, are still to be studied before I can do this login. At least the basics, but this is what developing software is all about: learning something new every time you meet an obstacle. In a way, the software engineer might be the least useful engineer at the beginning of his career, but at the end he is the one who knows the most about every domain he wrote software for. A multi-talented engineer.

Secondary obstacle:
Thank you for fixing the CSV database. There is still some BBCode present, but I can work with that. The main thing is that the database is readable and each column now holds the right data for the right record. Sorry for the trouble here.

@SYSTEM

Don't be sorry. I know you like to argue and try to help as much as you can. That is what makes you a fantastic and unique person. You have a great spirit here. Never lose it. Even an old dinosaur like me does not take well to being patronised a little. We just growl but never bite, and we listen. :wink:
