Discussions related to Visual Prolog
Kari Rastas
Active Member
Posts: 36
Joined: 4 Mar 2000 0:01

Filtering the needed facts of very large internal database files

I have an internal database holding the USA's customs statistics on monthly trade in goods with different countries. The database now contains 17.5 million facts, and new facts are added every month.

The problem is that creating pages and charts of imports and exports for the different product groups (for example Gases, SITC 34) by country means going through the whole large database in memory, which is very time consuming. Producing the yearly and monthly charts and HTML pages for nearly 100 product groups took almost 24 hours, so developing and/or correcting the pages was rather “difficult”.

Finally I divided the database into smaller files, each holding the facts of one product group. Reading the rows of the large data file is fast, and writing the wanted fact rows to a new file is easy. With the new internal database files, the charts and pages can now be made in some tens of minutes.

Is it possible to create the needed internal database directly, by asserting the chosen rows while reading them from the input stream?

Code:

    clauses
        readProductTerms(InFile, ProductCode) :-
            Input = inputStream_file::openFile8(InFile),
            readProductStream(Input, string::format(",\"%\",", ProductCode)),
            Input:close(),
            !.

    predicates
        readProductStream : (inputStream Input, string SearchStr) procedure (i,i).
    clauses
        readProductStream(Input, _) :-
            Input:endOfStream(),
            !.
        readProductStream(Input, SearchStr) :-
            String = Input:readLine(),
            handleString(String, SearchStr),
            readProductStream(Input, SearchStr).

    predicates
        handleString : (string RowsString, string SearchString).
    clauses
        handleString(String, SearchStr) :-
            LEN = string::length(String),
            LEN > 8,
            string::search(String, SearchStr) = N,
            N > 0,
            !,
            ?????
                assert this fact in the string to internal database in memory
            .

Harrison Pratt
VIP Member
Posts: 432
Joined: 5 Nov 2000 0:01

Maybe something like this?

Code:

    class facts
        myData : (string).

    class predicates
        readProductTerms : (string InFile, string ProductCode).
    clauses
        readProductTerms(InFile, ProductCode) :-
            Input = inputStream_file::openFile8(InFile),
            % Build the search pattern ,"<ProductCode>", with a single format call
            SearchCode = string::format(",\"%\",", ProductCode),
            foreach
                Input:repeatToEndOfStream()
                and S = Input:readLine()
                and string::length(S) > 8
                and string::search(S, SearchCode) > 0
            do
                % do your custom parsing here
                assert(myData("Some data you extract from S"))
            end foreach,
            Input:close().

Given the large size of your application data, you might want to create a productReader class whose instances each read the facts for one ProductCode.
Kari Rastas

The readLine produces a string.

Assert needs that string to be identified as a fact.

That is the problem, to which I do not know the solution. I suppose there must be some predicate to perform that.

With very large internal databases it would be rather useful to pick the needed facts while reading the input stream and build a smaller database straight in memory. Naturally, the same result can also easily be achieved by writing the picked fact strings to an output stream, forming a new smaller database file, and then consulting that new file.
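
The file-based route described above might look roughly like the sketch below. It is only an illustration under assumptions: the fact layout of trade and the names productDB, filterToFile, copyMatching and loadProduct are hypothetical, outputStream_file::create8 and file::consult are used as I recall them from the PFC library, and any header line the saved file format needs would have to be copied as well.

```prolog
class facts - productDB
    % Hypothetical layout; use the declaration that matches the saved file.
    trade : (string Country, string ProductCode, integer Year, integer Month, real Value).

class predicates
    filterToFile : (string InFile, string OutFile, string ProductCode).
clauses
    filterToFile(InFile, OutFile, ProductCode) :-
        Input = inputStream_file::openFile8(InFile),
        Output = outputStream_file::create8(OutFile),
        SearchStr = string::format(",\"%\",", ProductCode),
        copyMatching(Input, Output, SearchStr),
        Output:close(),
        Input:close().

class predicates
    copyMatching : (inputStream Input, outputStream Output, string SearchStr).
clauses
    copyMatching(Input, _, _) :-
        Input:endOfStream(),
        !.
    copyMatching(Input, Output, SearchStr) :-
        Line = Input:readLine(),
        % Copy only the rows that mention the wanted product code
        if _ = string::search(Line, SearchStr) then
            Output:write(Line),
            Output:nl()
        end if,
        copyMatching(Input, Output, SearchStr).

class predicates
    loadProduct : (string SmallFile).
clauses
    loadProduct(SmallFile) :-
        % Load the small per-product file into the named fact database
        file::consult(SmallFile, productDB).
```

The drawback is the one already noted: every product group needs its own intermediate file on disk before anything reaches memory.
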
Harrison Pratt

If the data on disk is already in the form of Visual Prolog facts, you can do something like the following to assert facts into different databases (if those facts have different structures).

Code:

    class facts - myDataDB
        myData : (string).

    class facts - yourDataDB
        yourData : (integer).

    clauses
        run() :-
            MyS = "myData( \"III\" )",
            YourS = "yourData(333)",
            if MyTerm = tryToTerm(myDataDB, MyS) and YourTerm = tryToTerm(yourDataDB, YourS) then
                assert(MyTerm),
                assert(YourTerm)
            end if.

Kari Rastas

Of course the large data file is in VIP data format. I would not have asked the question if it were not. I have used PDC Prolog for over 30 years.

That tryToTerm(myDataDB, String) is exactly what I need and what I was originally searching for, but there is a "problem": it does not exist, at least not in VIP7.5, which I mainly use. There the predicate takes a single argument, tryToTerm(_). When I tried toTerm before posting this question to the forum, the error was that the type of the term could not be determined. I have not yet checked the situation with VIP10.

I have VIP10 (I updated to it 5 months ago), but I have not yet started to use it, because when I updated the code for my picture DLL, the new font "type" caused some problems (with choosing the angle of the text). I have not had the energy and time to figure out the needed changes. Nowadays it takes time and lots of coffee to find the needed new information, because VIP has grown so large to keep up with the development of computers and programming languages.
Harrison Pratt

Have you considered writing a tiny tool in Vip10 just to do the data allocation so you can continue to use your Vip7.5 application? It would be a "safe" and easy way to get up to speed on Vip10. The conversion of your 7.5 legacy app(s) in their entirety could be tedious.
Thomas Linder Puls
VIP Member
Posts: 1395
Joined: 28 Feb 2000 0:01

Writing:

Code:

    hasDomain(myDataDB, MyTerm), MyTerm = tryToTerm(MyS)

will give the same effect as:

Code:

    MyTerm = tryToTerm(myDataDB, MyS)

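
Applied to the handleString predicate from the original question, the one-argument form might be used roughly as follows. This is a sketch under assumptions, not tested code: productDB and the layout of trade are hypothetical, and each input row is assumed to be exactly one fact term in text form (if the rows end with a period or similar, that would need stripping before the conversion).

```prolog
class facts - productDB
    % Hypothetical layout; use the declaration that matches the saved file.
    trade : (string Country, string ProductCode, integer Year, integer Month, real Value).

class predicates
    handleString : (string Row, string SearchStr).
clauses
    handleString(Row, SearchStr) :-
        string::length(Row) > 8,
        % search fails when SearchStr does not occur in Row
        _ = string::search(Row, SearchStr),
        % Tell the compiler which fact domain the term belongs to ...
        hasDomain(productDB, Term),
        % ... so that the one-argument tryToTerm can be resolved
        Term = tryToTerm(Row),
        !,
        assert(Term).
    % Rows that do not match, or do not parse, are skipped
    handleString(_, _).
```

The catch-all second clause keeps the predicate a procedure, so the reading loop continues past rows that are filtered out.
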
When we have that much data we always store it in an SQL database (but usually we also have the need to share it simultaneously between many clients).

But when you keep it in memory, you should really consider creating some "indexes" in the form of maps and the like. See the Collection library. I can't remember how the collection library looked in VIP 7.3, but I am pretty sure that there were at least algebraic red-black trees.
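
As an illustration of such an index, the sketch below groups rows by product code in a map, so that one product group can be fetched without scanning all facts. It assumes the collection classes as I recall them from newer Visual Prolog versions (mapM_redBlack with set and tryGet); they may be missing or named differently in older releases, and rowsByProduct, addRow and rowsFor are hypothetical names.

```prolog
class facts
    % Index: product code -> rows belonging to that product group
    rowsByProduct : mapM{string ProductCode, string* Rows} := mapM_redBlack::new().

class predicates
    addRow : (string ProductCode, string Row).
clauses
    addRow(ProductCode, Row) :-
        if Old = rowsByProduct:tryGet(ProductCode) then
            % Known product code: put the new row in front of the old ones
            rowsByProduct:set(ProductCode, [Row | Old])
        else
            rowsByProduct:set(ProductCode, [Row])
        end if.

class predicates
    rowsFor : (string ProductCode) -> string* Rows.
clauses
    rowsFor(ProductCode) = Rows :-
        Rows = rowsByProduct:tryGet(ProductCode),
        !.
    % Unknown product code: no rows
    rowsFor(_) = [].
```

With a tree-based map the lookup cost grows with the number of distinct product codes, not with the 17.5 million facts.
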
Regards Thomas Linder Puls
PDC