Discussions related to Visual Prolog
David Snook
Active Member
Posts: 36
Joined: 6 Feb 2003 0:01

Coding question

Unread post by David Snook »

This is a general question concerning code usage and style.

Given:
A list of reference numbers - SLCODES
An internal database - nameRefDB
A temp working internal database - labelDB
Predicate to find the relation to each number - get_lable

is there a difference in the following coding methods to obtain a list of labels from the list of reference numbers:

Code: Select all

LNAMES = [Name || CODE = list::getMember_nd(SLCODES), nameRefDB(CODE,LabelCODE), get_label(LabelCODE,Name)]

Code: Select all

foreach CODE = list::getMember_nd(SLCODES) and nameRefDB(CODE,LabelCODE) and get_label(LabelCODE,Name) do      assert(labelDb(Code,Name)) end foreach, LNAMES = [LName || labelDb(Code,Name)]
I started with the second version and realized that the first version may be simpler but I'm not sure if it works the way I think it does and am wondering if there is a possible performance difference?

Any thoughts and recommendations are gladly received.

Regards,

David Snook
User avatar
Thomas Linder Puls
VIP Member
Posts: 1398
Joined: 28 Feb 2000 0:01

Unread post by Thomas Linder Puls »

The first version is clearly much simpler, and straightforward.

Moreover, you should always avoid having facts that are only used as "auxiliary variables" in an algorithm. Facts should be used to represent the state of your program.

Your algorithm also leaves the result behind in labelDB, so you will both have a memory leakage and even worse the second time you run the algorithm will get both the result from the first run and the second (and so forth).

Performance: The complexity is the same in both cases. O(N*X) where N is the number of elements in SLCODES and X is the complexity coming from nameRefDB and get_label.

But the second version makes a lot of extra memory allocations and an extra run through that facts database. So it will be slower (you will most likely not notice this, because I believe the complexity X above will hide the difference).

As a little final note. If you are using version 7.5 you should also consider using "in" instead of list::getMember_nd:

Code: Select all

LNAMES = [Name || CODE in SLCODES, nameRefDB(CODE, LabelCODE), get_label(LabelCODE, Name)]
Regards Thomas Linder Puls
PDC
Peter Muraya
VIP Member
Posts: 147
Joined: 5 Dec 2012 7:29

Coding question

Unread post by Peter Muraya »

If I wanted to use the foreach construct while avoiding the issues Thomas raised, this is how I would code it:-

Code: Select all

          Result = varM::new([]),           foreach                CODE = list::getMember_nd(SLCODES),                nameRefDB(CODE,LabelCODE),                get_label(LabelCODE,Name)           do                Result:value:=[Name|Result:value]           end foreach,           LNAMES = Result:value,

Note how I indent the code as it helps me to step through the individual statements (when debugging) more easily than alternative arrangements.
Mutall Data Management Technical Support
User avatar
Thomas Linder Puls
VIP Member
Posts: 1398
Joined: 28 Feb 2000 0:01

Unread post by Thomas Linder Puls »

It is a good idea to use varM's, setM_xxx's, mapM_xxx's, etc instead of "algorithmic facts".

In this particular use you may however notice that the resulting list is reversed (which may sometimes be significant).

You can use a queueM_fact to maintain the order when collecting a list:

Code: Select all

          Result = queueM_fact::new(),           foreach                CODE = list::getMember_nd(SLCODES),                nameRefDB(CODE,LabelCODE),                get_label(LabelCODE,Name)           do                Result:insert(Name)           end foreach,           LNAMES = Result:asList,
A queueM_fact actually stores things in a fact database, so this version of the algorithm is very similar to the foreach loop in David's mail. However this time the fact database is in an object which is thrown away after the algorithm have finished, and thereby all traces of the fact database disappear.
Regards Thomas Linder Puls
PDC
David Snook
Active Member
Posts: 36
Joined: 6 Feb 2003 0:01

Unread post by David Snook »

Thank you Thomas and Peter.

The following is simple and clean:

Code: Select all

LNAMES = [Name || CODE in SLCODES, nameRefDB(CODE, LabelCODE), get_label(LabelCODE, Name)]
Having read a little further on this now I'm starting to get a feel for using varM and collections. I have a feeling I could use collections a great deal in my projects but don't fully understand how they work yet. If there are a couple of small real-world examples showing the coding without using collections along with coding using collections that would be very helpful :D.

I have used temp internal databases when collecting a list to avoid duplicates as an alternative to using list::removeDuplicates after the fact. I don't see how I could do that in the list comprehension clause above but would the following modification to the last example provided be the appropriate method to achieve that?

Code: Select all

          Result = queueM_fact::new(),           foreach                CODE = list::getMember_nd(SLCODES),                nameRefDB(CODE,LabelCODE),                get_label(LabelCODE,Name),                not(Result:contains(Name))           do                Result:insert(Name)           end foreach,           LNAMES = Result:asList,
Peter Muraya
VIP Member
Posts: 147
Joined: 5 Dec 2012 7:29

Coding question

Unread post by Peter Muraya »

Hi David,

I have not used the setM_redBlack or setM collections but I think they are what you need if you want don't want duplicates. Is that correct Thomas? What is the difference between the two?

I have found array2M to be very useful for working with 2-dimensional arrays.

More about collections.... what do the attributes at the top of collection interface definition mean?

Code: Select all

interface collection{@Type}     [in_test(contains), in_iterate(getAll_nd)]
Mutall Data Management Technical Support
User avatar
Thomas Linder Puls
VIP Member
Posts: 1398
Joined: 28 Feb 2000 0:01

Unread post by Thomas Linder Puls »

Hi, David, there is a tutorial about collections: Collection library.

A list comprehension cannot avoid duplicates. There are several possible ways to deal with it. A major factor when deciding which method to use is whether you have any requirements/wishes to the order of the result. If you need to preserve the order from SLCODES, I see two obvious possible choices:
  • For equal elements you want to retain the first (and skip all the other).
  • For equal elements you want to retain the last (and skip all the other).
list::removeDuplicates will satisfy the first choice. If you need the second choice or something even more complex it will have to be "hand-coded".

A setM_redBlack will result in a sorted list (or in the set itself). You will use it in exactly the same way as I used the queueM_fact:

Code: Select all

          Result = setM_redBlack::new(),           foreach                CODE = list::getMember_nd(SLCODES),                nameRefDB(CODE,LabelCODE),                get_label(LabelCODE,Name)           do                Result:insert(Name)           end foreach,           LNAMES = Result:asList,
Peter Muraya wrote:.... what do the attributes at the top of collection interface definition mean?

Code: Select all

interface collection{@Type}     [in_test(contains), in_iterate(getAll_nd)]
The in_test and in_iterate attributes defines the in operator for collections: see Language Reference: Terms: In
Regards Thomas Linder Puls
PDC
David Snook
Active Member
Posts: 36
Joined: 6 Feb 2003 0:01

Unread post by David Snook »

Thank you Thomas. I'm starting to get a feel for collections now and followed the "keyword recognition" and "text concordance" examples.

Just 1 more question regarding collections; if lists and internal databases are collections then it seems that the collections library is another method of handling these data structures. To help me understand when to use them are there general guidelines for using collections over using the standard list and internal database routines?

David
User avatar
Thomas Linder Puls
VIP Member
Posts: 1398
Joined: 28 Feb 2000 0:01

Unread post by Thomas Linder Puls »

I see two factors to consider when choosing:
  • What makes most sense, i.e. makes the code clearer, simpler, etc.
  • Performance
Clearer: If you think of the names you collect as the set of names used in ..., then it is also clearer if your code uses a set to represent the data. Not only will this ensure that there are no duplicates, it will set also clearly indicate that this is the intention.

I am not advocator for solving performance problems that does not exist. Especially not if the solution makes the code worse. I do however always take complexity issues into account. And complexity issues is what ruins performance. Whether it is most efficient to iterate a list or a fact database is not important, what is important is whether you iterate it N times or N*N times. And here it is worth noticing that the collections have rather significant some performance gains for some operations.

A (red-black) map with N keys (mapped to values) will require O(log(N)) operations to look-up a key. If you place the same data in a list or a fact chain look-up will have to traverse the list/fact from one end to the other giving a complexity of O(N).

If you double N then linear look-up will take twice as long time, the logarithmic will only take C longer (C being some fixed time). When N is small the doubling in time may be faster than the extra C, but as N becomes larger it is certain that doubling will become worse than adding C.

With collections we are rather lucky: if your code becomes clearer and simpler using collections it will most likely also perform significantly better. This is a free benefit.

I will suggest that you at first focus on the clarity aspect and forget the performance aspect until you have become familiar with the usage.
Regards Thomas Linder Puls
PDC
David Snook
Active Member
Posts: 36
Joined: 6 Feb 2003 0:01

Unread post by David Snook »

That makes good sense.

Cheers,

David
David Snook
Active Member
Posts: 36
Joined: 6 Feb 2003 0:01

Unread post by David Snook »

I'm probably getting carried away now but am looking at applications of the collection library in my code and another question has arisen.

The following is a simplified example for the purposes of my question. The predicate "collect_data" creates a column of report data subtotalling by DEPT and ending with the grand total. On completion, the fact "tList" contains the columnar data in a queueM . I see now the next step might be to have the internal database "dataDB" as a collection (perhaps ordered in some way) but I'll start with the following. Is this an appropriate way to use collections and varM?

David Snook

Code: Select all

class facts    tList : queueM{real} := queueM_fact::new().    subtotal : varM{real} := varM::new(0).    total : varM{real} := varM::new(0).   class predicates    collect_data(unsigned,gmttime*,gmttime*,unsigned_list). clauses    collect_data(Code,[Date|T2],DATES,[DEPT|T3]):-          if dataDB(Code,Date2,DEPT,Real) and Date2:sametime(Date)          then retract(dataDB(Code,Date2,DEPT,Real))          else Real = 0          end if,          subtotal:value := subtotal:value + Real,          tList:insert(Real),!,          collect_data(Code,T2,DATES,[DEPT|T3]). %collect next date    collect_data(Code,[],DATES,[_DEPT|T3]):- %end of data for DEPT (all dates collected)          tList:insert(subtotal:value),          total:value := total:value + subtotal:value,          subtotal:value := 0,!,          collect_data(Code,DATES,DATES,T3). %collect data for next dept    collect_data(Code,_,_,[]):-          tlist:insert(subtotal:value),          tlist:insert(total:value + subtotal:value).
User avatar
Thomas Linder Puls
VIP Member
Posts: 1398
Joined: 28 Feb 2000 0:01

Unread post by Thomas Linder Puls »

It never (i.e. never) makes sense to have a varM in a fact variable, because that is completely equivalent to having the value directly in the fact variable, except that it becomes more complex.

A varM's should be in "green" variables, to replace a "black" variable.

Collections can make sense in fact variables (i.e. if they represent the state of the program). But I don't think you gain anything by substituting an "auxiliary" nondeterm fact with an "auxiliary" fact variable holding a queue. The idea should again be to move the "auxiliary" data from "black" to "green".

(I find it strange that you mix values and sums in a single list.)
Regards Thomas Linder Puls
PDC
David Snook
Active Member
Posts: 36
Joined: 6 Feb 2003 0:01

Unread post by David Snook »

It never (i.e. never) makes sense to have a varM in a fact variable, because that is completely equivalent to having the value directly in the fact variable, except that it becomes more complex.
Yes, I was getting carried away trying to apply these new routines.
(I find it strange that you mix values and sums in a single list.)
The sample code would produce the raw data for a single column financial report subtotalled by department. I attached an example of a simple multi-column report that displays data in vpiTableed (this screen capture is not related to the sample code in any way). Subtotals may be present for multi-company reports or any number of other reasons.

Pre if-then-else functionality I had more clauses and an output parameter. Using the same example would be something like below. When preparing data for a detailed multi-company financial report comparing Actual against Budget against Prior Years with averages and percentage columns these predicates to collect the data become quite large and complex so I try to think of other ways to simplify the data collection. The reports themselves are not static in the sense that they're pre-defined. It's strictly parameter driven from the user point-of-view and everything else is handled by the system.

Thanks for the feedback Thomas and setting me straight on proper usage. I'll let it sink in and hopefully come up with more productive methods.

David

Note: the following code does not relate to the jpg attached.

Assume separate predicates addSubtotals, getSubTotal and getTotal.

Code: Select all

class predicates    collect_data(unsigned,gmttime*,gmttime*,unsigned_list,real_list) procedure (i,i,i,i,o). clauses    collect_data(Code,[Date|T2],DATES,[DEPT|T3],[Real|T4]):-          if dataDB(Code,Date2,DEPT,Real) and Date2:sametime(Date)          then retract(dataDB(Code,Date2,DEPT,Real))          else Real = 0          end if,          addSubtotals(Real),!,          collect_data(Code,T2,DATES,[DEPT|T3],T4). %collect next date    collect_data(Code,[],DATES,[_DEPT|T3],[SubTotal|T4]):- %end of data for DEPT (all dates collected)          getSubTotal(SubTotal),!,          collect_data(Code,DATES,DATES,T3,T4). %collect data for next department    collect_data(_,_,_,[SubTotal|Total]):-          getSubTotal(SubTotal),          getTotal(Total).
Attachments
screen.jpg
screen.jpg (90.69 KiB) Viewed 20124 times
User avatar
Thomas Linder Puls
VIP Member
Posts: 1398
Joined: 28 Feb 2000 0:01

Unread post by Thomas Linder Puls »

I guessed it was something like that.
Regards Thomas Linder Puls
PDC
Post Reply