Problem reading large VIP 5.2 database file as text strings

Posted: 19 Jan 2017 18:05
by Harrison Pratt
A simple routine that reads some saved facts from a 20,000 KB VIP 5.2 database file and writes them to another file stops unexpectedly in the middle of the source file. I assume it hits an endOfStream condition. I looked at the file in both Notepad++ and a hex editor, and there are no spurious control characters in the region where it stops reading. I opened the database file in Notepad++, saved it as a plain text file, and did a text comparison using CodeCompare; the files are identical.

Then I had the 5.2 application purge some redundant facts from "above" the break point, and the 7.5 read clause still stops at the same line, but at a different point in the 5.2 database file. The file position of the break is 15,494,135, at the 285,435th line.

Maybe there's something strange about the file that I'm reading as lines of string data. If it were a memory issue then the stopping point (last record read) would change after the purge, but it didn't.

I haven't tried having my 5.2 application export this data yet because I'd really like to understand this 7.5 problem.

The data around the breakpoint looks like this with the last line successfully read indicated:

Code:

dr(1603151030,"03/15/2016","10:30:00",35.16,2,305,2,1010,46.4,44.4,-99.9)
dr(1603151036,"03/15/2016","10:36:00",36.02,1,301,3,1010.1,46.6,44.4,100)
dr(1603151042,"03/15/2016","10:42:00",36.02,1,281,3,1010.1,46.4,44.4,-99.9)
dr(1603151048,"03/15/2016","10:48:00",37.32,1,312,2,1010,46.5,44.4,-99.9)
dr(1603151054,"03/15/2016","10:54:00",37.36,0,326,1,1009.9,46.8,44.4,-99.9) % <== last line successfully read
dr(1603151100,"03/15/2016","11:00:00",37.13,0,159,1,1009.8,46.9,44.4,-99.9)
dr(1603151106,"03/15/2016","11:06:00",37.13,1,317,1,1009.9,47,44.4,-99.9)
dr(1603151112,"03/15/2016","11:12:00",36.69,1,286,2,1010.1,46.8,44.4,-99.9)
dr(1603151118,"03/15/2016","11:18:00",36.5,1,302,2,1010.3,45.7,44.4,-99.9)
dr(1603151124,"03/15/2016","11:24:00",36.69,2,335,3,1010.3,44.5,44.4,-99.9)

A test predicate to copy the 5.2 database file lines to another file is:

Code:

class predicates
    copyData : ().
clauses
    copyData():-
        file::existExactFile( dataFile52 ),
        stdio::write( "\nREADING FILE ... " ),
        _ = vpi::processEvents(),
        IS = inputstream_file::openFile8( dataFile52 ),
        OS = outputStream_file::create8( "TEST.FILE" ),
        IS:repeatToEndOfStream(),
            S = IS:readLine(),
                frontToken(S,_Tok,_),
                OS:write(S,"\n"),
            IS:endOfStream(),
            !,
        OS:close(),
        IS:close(),
        stdio::write("\nSUCCESS ", predicate_fullname() ).
    copyData():-
        stdio::write( "\nFAILED ", predicate_fullname() ).

I experimented with file5x clauses and had the same result. I've done this sort of thing with different file types for many years and have to admit that I'm baffled on this one! :?

Posted: 19 Jan 2017 18:16
by Paul Cerkez
A 'dumb' suggestion.

Have you tried breaking up the file being imported to see if you can do multiple successive file reads?

I had a situation years ago where I could not import the entire single file but if I broke into two, I could import everything using two reads.


Posted: 19 Jan 2017 21:18
by Harrison Pratt
Not a dumb suggestion. There are a few ways I can work around this problem but I really wanted to see if there is something I'm doing wrong in VIP 7.5 reading that file.

I now think it may be something in the way that 5.2 saves databases because if I use the DOS FIND command to pipe the lines of interest to another file the data transfer breaks (stops being transmitted) at the same spot as when I read the file with 7.5.

However, if I export the data from the 5.2 application to a CSV file all of the data is exported, so I know that 5.2 can read the in-memory database.

I update the 5.2 file daily with consult & save, so I'm pretty sure it's a good file (for 5.2, anyhow). When I get some time I'll try regenerating the database from the CSV file and see if this behavior persists.

Thanks for the hint,

Posted: 20 Jan 2017 2:01
by Gukalov

Maybe a flush is needed...

Code:

foreach
    IS:repeatToEndOfStream(),
    S = IS:readLine(),
    frontToken(S,_Tok,_)
do
    OS:write(S,"\n"),
    OS:flush()
end foreach,

Posted: 20 Jan 2017 3:10
by Harrison Pratt
Good idea, but didn't help. That doesn't explain the problem with FIND either.


Posted: 20 Jan 2017 10:17
by Thomas Linder Puls
Hi, Harrison.

Your output stream is only closed if your test reaches "SUCCESS"; if it reaches "FAILED", the file is not closed.
When you don't close a file, the last buffer will not be flushed to the file.

You may convince yourself that if the file ends with empty lines then you will end in the "FAILED" case (even though I don't believe you consider that an error).

Anyway since you have also tried your test with flush this cannot explain the problem.

If, however, the problem is that frontToken fails in some unexpected situation (from your line and on through all the rest), then you will also end in the "FAILED" case.

So for solving the problem I suggest that you also run the test without the frontToken test.

And in general for good "stream behavior" I suggest that you always close your streams in a finally section:
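
A minimal sketch of what such a finally section could look like, assuming the VIP 7.5 stream classes and the dataFile52 constant from the test predicate above. The predicate name copyDataSafe is hypothetical, and this is an illustration of the idea rather than a definitive implementation:

```visualprolog
class predicates
    copyDataSafe : ().   % hypothetical name, used for illustration only
clauses
    copyDataSafe() :-
        IS = inputStream_file::openFile8(dataFile52),
        try
            OS = outputStream_file::create8("TEST.FILE"),
            try
                % copy every line that starts with a token; other lines are skipped
                foreach
                    IS:repeatToEndOfStream(),
                    S = IS:readLine(),
                    string::frontToken(S, _Tok, _)
                do
                    OS:write(S, "\n")
                end foreach
            finally
                % executed however the body terminates, so the last buffer gets flushed
                OS:close()
            end try
        finally
            IS:close()
        end try.
```

With this shape both streams are closed whether the copy loop completes or terminates with an exception, so a partially written output file still receives its last buffer.
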
For the stream handling loop I think you should use a foreach. You can place the readLine and frontToken test inside the foreach part as Gukalov suggests, or if you like your code to be more explicit you can use an if-then construction:
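
A minimal sketch of the more explicit if-then variant, assuming the same VIP 7.5 stream interfaces as in the test predicate. The helper name copyLines and its parameters are hypothetical:

```visualprolog
class predicates
    copyLines : (inputStream IS, outputStream OS).   % hypothetical helper
clauses
    copyLines(IS, OS) :-
        foreach IS:repeatToEndOfStream() do
            S = IS:readLine(),
            % write the line only if it starts with a token;
            % an empty line is skipped instead of making the predicate fail
            if string::frontToken(S, _Tok, _) then
                OS:write(S, "\n")
            end if
        end foreach.
```

Because the frontToken test sits inside the loop body, a line without a token no longer decides whether the whole copy succeeds or fails; the loop simply moves on to the next line.
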
If the problem persists I would like to see the file, but it is too large to attach here and I also think it is too large to attach to a mail (even if you zip it). If you can place it somewhere on the net (e.g. in a dropbox) and mail the link to then I will look at it.

Posted: 20 Jan 2017 14:42
by Harrison Pratt
Thanks to everyone for their guidance on good stream handling practices. The "problem" is between the keyboard and my chair -- I inadvertently inserted the name of an OLD file left over from when I migrated to a new PC, but was comparing the output to the current working file. :oops:

My original code works as intended, but I have revised my approach per your suggestions, so your tutorial efforts were not in vain.
Please accept my apologies.

Posted: 23 Jan 2017 9:24
by Thomas Linder Puls
No problem :-), I think everyone knows that situation. Typically you compile one program, but it's another one that is used when you run.