Discussions related to Visual Prolog
Vitaly Markov
Active Member
Posts: 40
Joined: 30 Nov 2003 0:01

Additional function in the class redBlackTree

Unread post by Vitaly Markov »

Greetings Thomas
I offer additional function for the class redBlackTree.

The predicate redBlackTree::insert rewrites an element in the unique rbt if this element exist.
It would be desirable to rewrite elements only in case of the success of some condition.
Such feature is very useful to realisation breadth-first search to find minPath (maxPath) and other tasks.

Let's consider in detail:
The RedBlackTree stores tuple(PreviousNode, NewPathWeight) at key NextNode. That is there is mapping: NextNode -> tuple(PreviousNode, NewPathWeight).

Now three steps should be made for RedBlackTree updating after a finding of the edge: Node1->NextNode of the current path in the graph, NewWeight - the weight of the new path:
1) To call the predicate redBlackTree::tryLookUp to get the "old" data tuple(Node, Weight) at key NextNode. Estimation is equal log(N).
2) If new weight NewWeight is less than "old" weight Weight (NewWeight<Weight), then to remove the "old" data tuple(Node, Weight). To call the predicate redBlackTree::delete for this purpose. Estimation is equal log(N).
3) To insert the new data tuple(Node1, NewWeight) at key NextNode. To call the predicate redBlackTree::insert for this purpose. Estimation is equal log(N).
Total 3*log(N).

The new predicate redBlackTree::insert1 will demand one step and an estimation log(N).
Declaration of the new predicate:

Code: Select all

predicates     insert1 : (tree{Key, Data} Tree1, Key Key, Data Data, predicate_dt{Data,Data} OverWriteCondition) -> tree{Key, Data} Tree0.     % @short #Tree0 is the result of inserting (#Key, #Data) into unique tree #Tree1     % @detail If predicate OverWriteCondition is succeed, then data #Data at #Key will be overwritten.     % @end
Implementation of the new predicate:

Code: Select all

clauses     insert1(tree(P, T), Key, Data, Cond) = tree(P, insert_rep1(P, Key, Data, T, Cond)). class predicates     insert_rep1 : (treeProperty{Key} P, Key Key, Data Data, treeRep{Key, Data} Tree1, predicate_dt{Data, Data} OverWriteCondition) -> treeRep{Key, Data} Tree0. clauses     insert_rep1(P, K, V, T, Cond) = makeblack(ins1(P, K, V, T, Cond)). class predicates     ins1 : (treeProperty{Key} P, Key Key, Data Data, treeRep{Key, Data} Tree, predicate_dt{Data, Data} OverWriteCondition) -> treeRep{Key, Data} Tree. clauses     ins1(_P, K, V, emptyRep, _Cond) = nodeR(emptyRep, K, V, emptyRep).     ins1(P, K, V, nodeB(A, Xk, Xd, B), Cond) = ins_black1(P, comp(P, K, Xk), K, V, A, Xk, Xd, B, Cond).     ins1(P, K, V, nodeR(A, Xk, Xd, B), Cond) = ins_red1(P, comp(P, K, Xk), K, V, A, Xk, Xd, B, Cond).   class predicates     ins_black1 : (treeProperty{Key} P, compareResult Compare, Key Key, Data Data, treeRep{Key, Data} A, Key Xkey, Data Xdata, treeRep{Key, Data} B, predicate_dt{Data, Data} OverWrite) ->         treeRep{Key, Data} Tree. clauses     ins_black1(P, less(), K, V, A, Xk, Xd, B, Cond) = balance(ins1(P, K, V, A, Cond), Xk, Xd, B).     ins_black1(P, greater(), K, V, A, Xk, Xd, B, Cond) = balance(A, Xk, Xd, ins1(P, K, V, B, Cond)).     ins_black1(treeProperty(_, unique()), equal(), K, V, A, Xk, Xd, B, Cond) = NodeB :-         NodeB = if Cond(V,Xd) then nodeB(A, K, V, B) else nodeB(A, Xk, Xd, B) end if,         !.     ins_black1(P, equal(), K, V, A, Xk, Xd, B, Cond) = balance(ins1(P, K, V, A, Cond), Xk, Xd, B).   class predicates     ins_red1 : (treeProperty{Key} P, compareResult Compare, Key Key, Data Data, treeRep{Key, Data} A, Key Xkey, Data Xdata, treeRep{Key, Data} B, predicate_dt{Data, Data} OverWrite) ->         treeRep{Key, Data} Tree. clauses     ins_red1(P, less(), K, V, A, Xk, Xd, B, Cond) = nodeR(ins1(P, K, V, A, Cond), Xk, Xd, B).     ins_red1(P, greater(), K, V, A, Xk, Xd, B, Cond) = nodeR(A, Xk, Xd, ins1(P, K, V, B, Cond)).     ins_red1(treeProperty(_, unique()), equal(), K, V, A, Xk, Xd, B, Cond) = NodeR :-         NodeR = if Cond(V,Xd) then nodeR(A, K, V, B) else nodeR(A, Xk, Xd, B) end if,         !.     ins_red1(P, equal(), K, V, A, Xk, Xd, B, Cond) = nodeR(A, Xk, Xd, ins1(P, K, V, B, Cond)).
Five predicates have been slightly modified. They are marked by postfix 1.
I while realised really this predicate in a separate class.
Code optimisation is possible at addition of this predicate in the module redBlackTree.

Example of a call of a predicate in task about minPath:

Code: Select all

T1 = insert1(Tree, K, V, {(tuple(_,NewWeight),tuple(_,PrevWeight)) :- NewWeight<PrevWeight})
Example of a call of a predicate in task about maxPath:

Code: Select all

T1 = insert1(Tree, K, V, {(tuple(_,NewWeight),tuple(_,PrevWeight)) :- NewWeight>PrevWeight})
Example of a call of a predicate in some task about geometrical graph:

Code: Select all

X = 23,  % The fixed co-ordinate T1 = insert1(Tree, K, V, {(Y1,Y2) :- sqrt(X^2+Y1^2) > sqrt(X^2+Y2^2)})
I suggest a predicate insert1 to transfer to the class redBlackTree.
User avatar
Thomas Linder Puls
VIP Member
Posts: 1398
Joined: 28 Feb 2000 0:01

Unread post by Thomas Linder Puls »

Hi, Vitally.

You are of course aware that 3*O(<something>) is the same as O(<something>). That said it may of course still be a good idea to reduce the constant.

The first thing we can do to reduce the constant is to remove your step 2, i.e. deleting the entry because in a tree with unique keys inserting will replace an existing same key.

Secondly, if there is better data for the key in the tree then the step 3 is not performed.

Finally, step 1 is less expensive than your replacement code, because step 1 does not build a new tree.

So first of all the actual overhead constant is always less than 2 (rather than 3), furthermore it actually depends how often the data is actually updated or not.

I have run a little test with your implementation:

Code: Select all

implement main     open core, redBlackTree   constants     nodeCount : unsigned = 100000.     testCount : unsigned = 100000.     updatePropability : real = 0.5.   clauses     run() :-         Tree = mkTree(nodeCount, emptyUnique()),         Test = mkTest(testCount),         doTest("test1", Tree, Test, test1),         doTest("test2", Tree, Test, test2),         profileTime::printAndReset().   domains     key = unsigned.     data = unsigned.     tree = tree{key, data}.     testSet = tuple{key, data}*.     test = (tree Tree1, key Key, data Data) -> tree Tree0.   class predicates     mkTree : (key Count, tree Tree1) -> tree Tree0. clauses     mkTree(N, T) = if N > 0 then mkTree(N - 1, insert(T, N, 0)) else T end if.   class predicates     mkTest : (unsigned Count) -> testSet Test. clauses     mkTest(Count) =         [ tuple(Key, Data) ||             _ = std::cIterate(Count),             Key = math::random(nodeCount) + 1,             Data = if updatePropability < math::random() then 0 else 1 end if         ].   class predicates     doTest : (profileTime::costName Name, tree Tree, testSet TestSet, test Test). clauses     doTest(Name, Tree, TestSet, Test) :-         profileTime::start_pr(Name),         foreach tuple(K, D) in TestSet do             _ = Test(Tree, K, D)         end foreach,         profileTime::stop_pr(Name).   class predicates     test1 : test. clauses     test1(T, K, D) = if X = tryLookUp(T, K) and D > X then insert(T, K, D) else T end if.   class predicates     test2 : test. clauses     test2(T, K, D) = insert1(T, K, D, { (New, Old) :- New > Old }).   end implement main
If the update probability is 0 then test1 is ~35% more efficient than test2.
If the update probability is 1 then test1 is ~60% less efficient than test2.

There seems to be a break even when the update probability is around 40%.

For the kind of optimization you mention the update probability will start high and then become smaller and smaller as the search finds better and better results.

Given this I don't really think there is a reason to add your code to the redBalckTree class. At least not at present.

I do however have an other change in mind, namely to change the basic insert method into a determ predicate which fails if insert is not necessary. This will:
  • Make insert more efficient, because instead of rebuilding a an unchanged tree it can simply return the original tree (as in test1).
  • Make it simple to add a changed event to the mapM_redBlack class (which can be handy in a lot of scenarios)
If this change is also applied to your conditional insert method, it will also improve performance (not building unchanged trees) and then we can reconsider whether to add the code to the redBlackTree class.

I believe there are many contexts that could use a special variant of redBlackTree's (the redBlackSet is such a variant). Some variations can be implemented using additional configuration (like the controling the comparison and the key uniqueness in the current class). Such configuration gives a performance overhead and it also increase the conceptual complexity of the class.

Using dummy data the redBlackTree can easily be used as a set, but since sets have an important right on their own the dummy data overhead seems excessive. And therefore we have chosen to create a specialized (duplicate) version for this purpose.

Your code could either replace the original code (with a performance overhead for the normal non-conditional case) or live as a specialized duplicate (in the same or another class). For "standard" maintenance and size reasons we are not very happy about duplicate code, but there is clearly a non-trivial choice between specialized duplicates and configurable generality.

By now I think I have reached the conclusion that we will reject your suggestion. I think you should instead create a duplicate that you specialize completely to your needs:
  • Add the "better" test as a treeProperty.
  • Remove the uniqueKey from the treeProperty.
  • Optionally (i.e. if you don't need it) remove the key comparator from the treeProperty
  • Implement insert using determ predicate as I described above
Regards Thomas Linder Puls
PDC
Vitaly Markov
Active Member
Posts: 40
Joined: 30 Nov 2003 0:01

Unread post by Vitaly Markov »

Thomas, thanks for the detailed answer.

The sign on results of the test is obvious before the test:
insert is better without tree updating,
insert1 is better at tree updating.
Anyway, thanks for the test.

I did not suggest to replace standart insert.
I have offered idea for improvement of redBlackTree functions.

Regards Vitaly Markov.
Post Reply