Looping with Python in XSI
June 20th, 2005 by Bernard Lebel
ABSTRACT
There are many ways to loop in programming, each programming language offering its own flavors of looping techniques. In this article I will explore a few of the Python ones, and see how they perform in XSI. The techniques I will test are: the built-in map() function, the classic for loop (both over a sequence and with an index), the list comprehension, as well as various other methods of retrieving information in the XSI scene.
WHAT YOU SHOULD READ BEFORE
Before reading this article, you should read these two articles that deal with the same subject.
The first is not specific to Python, and explores various looping techniques in XSI. Check it out here
The second one is not XSI-related, instead it explores various looping techniques of Python. Click here to read.
TIMER
To time performance, I have used exactly the same approach as the author of the Python looping article (unsigned unfortunately). Each looping technique is put into a function, and the function is called 10 times by the timer function, so the timer overhead is minimized (and the result more accurate).
    import win32com.client
    import time
    from array import array

    xsi = Application
    c = win32com.client.constants
    XSIFactory = win32com.client.Dispatch('XSI.Factory')

    def timer(functionName):
        # Print function name
        xsi.logmessage('%s %s' % ('------------------------------------\n\t\t', functionName.__name__))
        # Record current time
        time1 = time.clock()
        # Start loop to perform single function calls
        for i in range(0, 10):
            # Call function
            functionName()
        # Record new time
        time2 = time.clock()
        # Print execution time, rounded to 3 decimal places
        xsi.logmessage(round(time2 - time1, 3))

    # Put function names into tuple, eventually replace by real function names
    tFunctions = function1, function2

    xsi.logmessage('A new series of tests')

    # Loop over list of functions
    for functionName in tFunctions:
        # Call timer for single function
        timer(functionName)
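The timer above relies on time.clock(), which was removed in Python 3.8. For readers running a more recent interpreter, here is a minimal sketch of the same idea using time.perf_counter(). The log parameter and the sample_loop function are assumptions made for illustration only; inside XSI one would presumably pass xsi.logmessage in place of print.

```python
import time

def timer(func, repeats=10, log=print):
    """Call func `repeats` times and log the total elapsed time in seconds."""
    # Print function name
    log('------------------------------------\n\t\t%s' % func.__name__)
    # Record current time
    start = time.perf_counter()
    # Call the function several times so the timer overhead is minimized
    for _ in range(repeats):
        func()
    # Total elapsed time, rounded to 3 decimal places
    elapsed = round(time.perf_counter() - start, 3)
    log('Time: %s' % elapsed)
    return elapsed

def sample_loop():
    # Hypothetical stand-in for one of the looping functions under test
    return [n * 2 for n in range(1000)]

elapsed = timer(sample_loop)
```

Returning the elapsed time, in addition to logging it, makes it easier to collect the results of several runs into a list and compare them afterwards.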
XSI SETUP
To experiment, I have created a polygon sphere, onto which I have created 3 clusters of points, one of these clusters being a weight map. The sphere is frozen. I have then made 100 duplicates of the sphere. I have maximized the Explorer.
In this session, my scripting preferences are set this way:
Scripting language: Python ActiveX Scripting Engine (of course)
Log Size Max Lines: 3000
Log Commands: disabled
Log Messages: enabled
Real-Time Message Logging: disabled
Log To File: disabled
TEST 1: COLLECTING POINT CLUSTERS
Let’s say we want to collect the point clusters of all spheres, and put them into an XSICollection.
The sequence for loop
The first test we’re gonna do is with a series of sequence for loops:
    def collectPointClusters_ForLoop1():
        oClusters = XSIFactory.CreateObject('XSI.Collection')
        for oObject in xsi.activesceneroot.children:
            if oObject.type == c.siPolyMeshType:
                if oObject.activeprimitive.geometry.clusters.count > 0:
                    for oCluster in oObject.activeprimitive.geometry.clusters:
                        if oCluster.type == 'pnt':
                            oClusters.add(oCluster)
Time: 3.205
The range for loop
The next loop we’re gonna do is a range type of loop. That is, we’ll loop over a sequence using a range index, and retrieve objects from the collection using that index.
    def collectPointClusters_ForRange1():
        oClusters = XSIFactory.CreateObject('XSI.Collection')
        for i in range(0, xsi.activesceneroot.children.count):
            if xsi.activesceneroot.children(i).type == c.siPolyMeshType:
                if xsi.activesceneroot.children(i).activeprimitive.geometry.clusters.count > 0:
                    for j in range(0, xsi.activesceneroot.children(i).activeprimitive.geometry.clusters.count):
                        if xsi.activesceneroot.children(i).activeprimitive.geometry.clusters(j).type == 'pnt':
                            oClusters.add(xsi.activesceneroot.children(i).activeprimitive.geometry.clusters(j))
Time: 28.631
What? Did I read this result properly? The for range loop is nearly 9 times slower than the sequence for loop!
Another range for loop
I wonder though… would assigning a name to the collected object (i) and to the cluster (j) make this script faster? Let’s try it…
    def collectPointClusters_ForRange2():
        oClusters = XSIFactory.CreateObject('XSI.Collection')
        for i in range(0, xsi.activesceneroot.children.count):
            oChild = xsi.activesceneroot.children(i)
            if oChild.type == c.siPolyMeshType:
                if oChild.activeprimitive.geometry.clusters.count > 0:
                    for j in range(0, oChild.activeprimitive.geometry.clusters.count):
                        oCluster = oChild.activeprimitive.geometry.clusters(j)
                        if oCluster.type == 'pnt':
                            oClusters.add(oCluster)
Time: 8.179
Whoa, much better! Not quite as fast as the sequence for loop, but roughly 3.5 times faster than the first range for loop. I suspect this dramatic increase in speed is due to the fact that we have assigned a name to the sphere and the cluster, so the script does not have to walk through the collections again each time it needs the same object. Once a name is bound to an object, Python keeps a reference to it, while in the first range loop every use of the sphere or the cluster fetched it again from its collection.
This result comes somewhat as a surprise, although it should not. I’m surprised because I had the assumption that reserving a space in memory for data would involve more overhead than a table lookup, especially in cases like this where we do a lot of looping. I should not be surprised because I have ignored one of the fundamentals of Python, that is, the lookup table.
In Python, every name that you use (a name is a generic term that can be a variable, a constant, a function, a module, and anything else that uses a name) is part of a lookup table. When Python attempts to retrieve the data associated with that name, it will look up this table, which will point it to the data in memory. The lookup table is actually made of 4 tables, and a table is better known as a scope. The scopes are nested in a hierarchical manner, and this structure is known as the LEGB rule. L is for Local, that is, names inside a function. E is for Enclosing, that is, names inside any function that contains the current one (in Python, you can nest functions inside functions). G is for Global, that is, the script file (also known as module file). B is for Built-in.
When you use a name, Python will look up the scope name tables one after another until it can find the data. It looks in the local scope, then the enclosing one, then the global one, and finally the built-in one. If it can’t find the name, it will raise a NameError saying the name is not defined. Each time Python has to reach to the next scope to look up a name, it has a performance cost.
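To make the four scopes more concrete, here is a small sketch in plain Python (no XSI required). All the names in it (label, outer, read_global and so on) are made up for this illustration; it only shows in which scope each name ends up being found.

```python
label = 'global'            # G: name defined at the script (module) level

def outer():
    label = 'enclosing'     # E: name defined in the enclosing function

    def inner_local():
        label = 'local'     # L: name defined inside the innermost function
        return label

    def inner_enclosing():
        # No local 'label' here, so Python finds the one in outer()
        return label

    return inner_local(), inner_enclosing()

def read_global():
    # Neither local nor enclosing: found in the global scope
    return label

def read_builtin():
    # 'len' is not defined anywhere in this script: found in the built-in scope
    return len('abc')

def read_missing():
    try:
        # Hypothetical name that exists in none of the four scopes
        return undefined_name
    except NameError as error:
        return type(error).__name__
```

Calling outer() returns the local and the enclosing value side by side, while read_missing() shows what happens when the search runs out of scopes.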
Where I’m going with this is that I believe what happened here is basically the same as with the name lookup mechanism. Assigning a name to data allows Python to retrieve the data much faster than by looking up the members of a sequence several times in a row. I assume that each time you get the sphere/cluster using the i/j index, you tell Python to look up the members of a collection and retrieve the one that matches that index. So by assigning a local name to data, Python instantly retrieves the information, so there is a gain in performance in the long run. This observation is in perfect concordance with the statements of the author of the second article I have presented above.
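The difference between the two range loops can be illustrated outside of XSI by counting how many times an item is fetched from a collection. The CountingCollection class below is purely hypothetical: it only mimics the collection(i) call syntax and says nothing about how XSI collections are actually implemented. It is a sketch of the assumption made above, namely that every indexed access costs a fresh lookup.

```python
class CountingCollection:
    """Hypothetical stand-in for an XSI collection that counts item lookups."""

    def __init__(self, items):
        self._items = list(items)
        self.lookups = 0

    @property
    def count(self):
        return len(self._items)

    def __call__(self, index):
        # Mimics collection(i); every call counts as one lookup
        self.lookups += 1
        return self._items[index]


def collect_by_index(collection):
    # Like the first range loop: the item is fetched again each time it is used
    found = []
    for i in range(collection.count):
        if collection(i).startswith('pnt'):
            if len(collection(i)) > 0:
                found.append(collection(i))
    return found


def collect_by_name(collection):
    # Like the second range loop: the item is fetched once and bound to a name
    found = []
    for i in range(collection.count):
        item = collection(i)
        if item.startswith('pnt'):
            if len(item) > 0:
                found.append(item)
    return found


names = ['pnt_a', 'poly_b', 'pnt_c']
first = CountingCollection(names)
second = CountingCollection(names)
result_by_index = collect_by_index(first)
result_by_name = collect_by_name(second)
print(first.lookups, second.lookups)
```

Both functions return the same list, but the indexed version performs more lookups for the same work, and the gap grows with every additional use of the item inside the loop body.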
The list comprehension
Then we will try a more “pythonic” way: the list comprehension. This will use exactly the same approach, but with the list comprehension syntax.
    def collectPointClusters_ListComprehension():
        oClusters = XSIFactory.CreateObject('XSI.Collection')
        oClusters.additems([oCluster for oObject in xsi.activesceneroot.children if oObject.type == c.siPolyMeshType if oObject.activeprimitive.geometry.clusters.count > 0 for oCluster in oObject.activeprimitive.geometry.clusters if oCluster.type == 'pnt'])
Time: 3.592
Most surprising. I would have thought the list comprehension faster than the for loop, as it is supposedly a faster loop technique in Python.
Another list comprehension
I’m a bit disappointed so I decide to give a second go, but this time I will do things a little differently. Instead of using a single long list comprehension, I’ll break it up into smaller list comprehensions, and populate the XSICollection only at the very end.
    def collectPointClusters_ListComprehension2():
        oChildren = xsi.activesceneroot.children
        aSpheres = [oChild for oChild in oChildren if oChild.type == c.siPolyMeshType]
        aClusters = [oCluster for oSphere in aSpheres if oSphere.activeprimitive.geometry.clusters.count > 0 for oCluster in oSphere.activeprimitive.geometry.clusters]
        aPointClusters = [oCluster for oCluster in aClusters if oCluster.type == 'pnt']
        oPointClusters = XSIFactory.CreateObject('XSI.Collection')
        oPointClusters.additems(aPointClusters)
Time: 3.752
Even worse! Okay, so far it seems that list comprehension is not the best way to traverse the scene hierarchy and test types in XSI.
The Python map() function
So now we’ll experiment with an “implied” type of loop. Python has some built-in functions that involve a loop, but there is no need for a for statement to launch them. One of them is the map() function. Now we have to consider that map() requires an additional function or two, as the implied loop is mapped to that function. Therefore, we will use a series of functions that are implicitly called by the map() function.
    def mapClusters(oCluster):
        if oCluster.type == 'pnt':
            return oCluster

    def mapChildren(oChild):
        aObjectClusters = []
        if oChild.type == c.siPolyMeshType:
            if oChild.activeprimitive.geometry.clusters.count > 0:
                aClusters = map(mapClusters, oChild.activeprimitive.geometry.clusters)
                aObjectClusters.extend(aClusters)
        return aObjectClusters

    def collectPointClusters_map():
        oClusters = XSIFactory.CreateObject('XSI.Collection')
        aChildrenClusters = map(mapChildren, xsi.activesceneroot.children)
        oClusters.additems([oCluster for aCluster in aChildrenClusters for oCluster in aCluster])
Time: 3.59
Another big surprise. In fact, the execution times of the list comprehensions and the map() function are somewhat inconsistent, and sometimes the map() function is quicker. This time, I would have expected the map() loop to be much slower than the list comprehensions. The reason is that map() is at its best when used with other built-in functions (written in C), not “custom” functions (written in Python) like we have here: map() is itself a C function, and each call it makes to a Python function carries the usual function call overhead. What is even more confusing is that I once used a map() function to call a self-installed custom command that would create proxy parameters between objects, and the execution would slow down at each iteration, to the point where XSI stopped responding and eventually crashed.
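For readers who want to check the built-in versus custom function claim on their own machine, here is a small sketch in plain Python using the standard timeit module. The words list and the my_len function are made up for the illustration, and no timings are quoted here since they depend on the machine and the Python version. Note that in Python 3 map() returns an iterator, hence the list() calls; in the Python 2 used throughout this article it returned a list directly.

```python
import timeit

# Hypothetical test data
words = ['sphere', 'cluster', 'point'] * 1000

def with_builtin_map():
    # map() driving a built-in function written in C
    return list(map(len, words))

def my_len(word):
    # Plain Python function giving the same result as len()
    return len(word)

def with_python_map():
    # map() driving a function written in Python
    return list(map(my_len, words))

def with_comprehension():
    # Same result with a list comprehension
    return [len(word) for word in words]

for func in (with_builtin_map, with_python_map, with_comprehension):
    seconds = timeit.timeit(func, number=100)
    print('%-20s %.4f' % (func.__name__, seconds))
```

The three functions build exactly the same list, so any difference in the printed times comes from the looping technique alone.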
The XSI FindObjects() command
Well, I think it takes too much time to retrieve the information, so now I’ll try out an XSI command called FindObjects(). This command allows you to retrieve objects of a certain classid. The classid is a unique identifier for object classes in XSI.
    def collectPointClusters_FindObjects():
        oClusters = xsi.FindObjects(None, '{E4DD8E40-1E1C-11D0-AA2E-00A0243E34C4}')
Time: 0.031
Oh-my-god. Roughly 100 times faster than the sequence for loop. Can you ask for faster than that?
A filtered FindObjects() function
There is only a slight problem though. If you print the number of clusters, it will return 400 instead of 300, which means it finds something we don’t want. So we have to filter the result.
    def collectPointClusters_FindObjects2():
        oClusters = XSIFactory.CreateObject('XSI.Collection')
        oFoundClusters = xsi.FindObjects(None, '{E4DD8E40-1E1C-11D0-AA2E-00A0243E34C4}')
        for oCluster in oFoundClusters:
            if 'default_Point' not in str(oCluster.parameters(0)):
                oClusters.add(oCluster)
Time: 3.36
To go from surprise to surprise: just to filter out the undesired clusters, our function has become even slower than the sequence for loop. In other words, FindObjects() will work well as long as you get exactly what you want, otherwise you may want to go with the sequence loop.
Summary of runtime for test 1
I have to say there were discrepancies in the execution time of the functions from one execution to the next. Although the tendency of one being faster than the other remained moderately consistent, the exact times were never the same from one run to the next. Sometimes the map() function was slower than the list comprehensions, sometimes faster, and sometimes the second list comprehension was faster than the first one.
Here is a series of execution times for these tests.
#INFO : A new series of tests
#INFO : ------------------------------------
# collectPointClusters_ForSequence
#INFO : Time: 3.205
#INFO : ------------------------------------
# collectPointClusters_ForRange1
#INFO : Time: 28.631
#INFO : ------------------------------------
# collectPointClusters_ForRange2
#INFO : Time: 8.179
#INFO : ------------------------------------
# collectPointClusters_ListComprehension1
#INFO : Time: 3.592
#INFO : ------------------------------------
# collectPointClusters_ListComprehension2
#INFO : Time: 3.752
#INFO : ------------------------------------
# collectPointClusters_map
#INFO : Time: 3.59
#INFO : ------------------------------------
# collectPointClusters_FindObjects1
#INFO : Time: 0.031
#INFO : ------------------------------------
# collectPointClusters_FindObjects2
#INFO : Time: 3.36
#INFO : A new series of tests
#INFO : ------------------------------------
# collectPointClusters_ForSequence
#INFO : Time: 3.192
#INFO : ------------------------------------
# collectPointClusters_ForRange1
#INFO : Time: 28.574
#INFO : ------------------------------------
# collectPointClusters_ForRange2
#INFO : Time: 8.201
#INFO : ------------------------------------
# collectPointClusters_ListComprehension1
#INFO : Time: 3.719
#INFO : ------------------------------------
# collectPointClusters_ListComprehension2
#INFO : Time: 3.591
#INFO : ------------------------------------
# collectPointClusters_map
#INFO : Time: 3.58
#INFO : ------------------------------------
# collectPointClusters_FindObjects1
#INFO : Time: 0.031
#INFO : ------------------------------------
# collectPointClusters_FindObjects2
#INFO : Time: 3.363
#INFO : A new series of tests
#INFO : ------------------------------------
# collectPointClusters_ForSequence
#INFO : Time: 3.137
#INFO : ------------------------------------
# collectPointClusters_ForRange1
#INFO : Time: 28.763
#INFO : ------------------------------------
# collectPointClusters_ForRange2
#INFO : Time: 8.241
#INFO : ------------------------------------
# collectPointClusters_ListComprehension1
#INFO : Time: 3.736
#INFO : ------------------------------------
# collectPointClusters_ListComprehension2
#INFO : Time: 3.572
#INFO : ------------------------------------
# collectPointClusters_map
#INFO : Time: 3.759
#INFO : ------------------------------------
# collectPointClusters_FindObjects1
#INFO : Time: 0.032
#INFO : ------------------------------------
# collectPointClusters_FindObjects2
#INFO : Time: 3.387
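Run-to-run discrepancies like the ones in these logs are common when timing code. One widely used way to deal with them, sketched here in plain Python with the standard timeit module, is to repeat the whole measurement several times and look at the best time and the spread. The sample_loop function is a hypothetical stand-in for one of the functions under test; this is not what was used to produce the numbers in this article.

```python
import timeit

def sample_loop():
    # Hypothetical stand-in for one of the functions under test
    total = 0
    for n in range(10000):
        total += n
    return total

# Five independent measurements, each one calling the function 10 times
timings = timeit.repeat(sample_loop, repeat=5, number=10)
best = min(timings)
spread = max(timings) - best
print('best: %.4f s, spread: %.4f s' % (best, spread))
```

The minimum is usually the most stable figure to compare, since the slower runs mostly reflect whatever else the machine was doing at the time.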
Conclusions of test 1
So what to think of all these tests? Let’s summarize the results.
- In all cases except with FindObjects(), the fastest loop was with the sequence for loop.
- A range loop is much slower than the sequence for loop.
- We can dramatically speed up the range for loop by assigning names to the objects retrieved from sequences, although it still does not get nearly as fast as the sequence for loop.
- List comprehension, when involving testing types and iterating over sequences of objects, is slower than the sequence for loop.
- A single long list comprehension and multiple small ones run at about the same speed; which one is faster varies from one run to the next.
- FindObjects() is the unchallenged champion of speed to retrieve objects of a certain class.
- FindObjects() sometimes finds too much information, so its return value has to be filtered, which negates its efficiency. Again, filtering with a sequence for loop is generally faster than with a range for loop or a list comprehension.
- map(), when involved in testing types and looping over sequences, is roughly the same speed as the list comprehension. However, if used to call XSI commands and perform more resource demanding operations, it may compromise stability.
TEST 2: LISTING VALUES
I’m still flabbergasted at the results I got with the list comprehension. These results defy what is usually recognized as one of the fastest looping techniques Python has to offer. So I wanted to further investigate how list comprehension could be used to speed up Python code execution. So far my tests have demonstrated that to compare types and traverse the object model, the list comprehension is not the best suited. But what if I want to populate a list of values? Let’s say I have all my objects, and I want to record their vertex indices. Here I do not traverse the object model nor compare types, I just record information.
The sequence for loop
The first test will be with a sequence loop:
    def listVertexIndices_For():
        oSpheres = XSIFactory.CreateObject('XSI.Collection')
        for oChild in xsi.activesceneroot.children:
            if oChild.type == c.siPolyMeshType:
                oSpheres.add(oChild)

        aIndices = []

        for oSphere in oSpheres:
            for oPoint in oSphere.activeprimitive.geometry.points:
                aIndices.append(oPoint.index)
Time: 15.495
The list comprehension
Now let’s try with a list comprehension:
    def listVertexIndices_ListComprehension():
        oSpheres = XSIFactory.CreateObject('XSI.Collection')
        for oChild in xsi.activesceneroot.children:
            if oChild.type == c.siPolyMeshType:
                oSpheres.add(oChild)
        aIndices = [oPoint.index for oSphere in oSpheres for oPoint in oSphere.activeprimitive.geometry.points]
Time: 15.466
Just slightly better. I guess in larger scenes the speed difference will become more significant; for example, you may try increasing the sphere subdivisions or duplicating the existing spheres 10 times (to have 100 spheres).
Conclusions of test 2
My final conclusion regarding list comprehensions is that when used in XSI, they are better suited for generating lists that do not require a deep traversal of the object model and do not require comparisons. When such operations are required, the sequence for loop is the better choice.
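For readers less familiar with the syntax, the two versions of test 2 boil down to the pattern below, shown here in plain Python. The spheres list is hypothetical stand-in data, each inner list playing the role of the point indices of one sphere; no XSI objects are involved.

```python
# Hypothetical stand-in data: one inner list of point indices per sphere
spheres = [[0, 1, 2], [0, 1], [0, 1, 2, 3]]

def indices_with_for(objects):
    # Explicit nested for loops, appending one value at a time
    indices = []
    for points in objects:
        for index in points:
            indices.append(index)
    return indices

def indices_with_comprehension(objects):
    # Same nesting order as the for loops: outer loop first, inner loop second
    return [index for points in objects for index in points]

print(indices_with_for(spheres))
print(indices_with_comprehension(spheres))
```

Both functions return the same flat list. The comprehension simply avoids looking up and calling the append method at each iteration, which is where its usual speed advantage comes from.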
Summary of runtime for test 2
#INFO : A new series of tests
#INFO : ------------------------------------
# listVertexIndices_For
#INFO : Time: 15.495
#INFO : ------------------------------------
# listVertexIndices_ListComprehension
#INFO : Time: 15.466
#INFO : A new series of tests
#INFO : ------------------------------------
# listVertexIndices_For
#INFO : Time: 15.515
#INFO : ------------------------------------
# listVertexIndices_ListComprehension
#INFO : Time: 15.501
#INFO : A new series of tests
#INFO : ------------------------------------
# listVertexIndices_For
#INFO : Time: 15.357
#INFO : ------------------------------------
# listVertexIndices_ListComprehension
#INFO : Time: 15.346
FINAL CONCLUSION
I guess this series of basic tests indicates that there is no magic looping solution: the performance of a looping technique is relative to the context it is being used in. I’d be curious to compare my results with other XSI-Python users’ own tests; after all, my test is by no means definitive.
FULL SCRIPT
You can download the full script here if you wish to run these tests on your end.





June 24th, 2005 at 8:51 pm
Well I don’t exactly understand it all yet (as I’m just getting started) but thank you for the article… I am sure it will be very helpful someday soon.
June 25th, 2005 at 10:27 am
Hi Jason,
This article is all about optimization. Since you are just starting, although it’s good to know from the beginning what is better, you should focus on learning the language, the XSI object model, then program design (i.e. functionality), and then you can bother about optimization. I find it easier to write a script that does the job, without too much attention to performance, and then modify it (sometimes completely rewrite it) in order to optimize.
Cheers
Bernard
September 7th, 2005 at 12:38 am
List comprehension is a type of map function
map() is an integral part of python, it is also part of C, it’s simply another part of computer science, it’s efficient.
I’m surprised you didn’t go into filter and lambda functions as they have more effect on the speed of code
The speed of code is not as important as legibility and ease of editing.
On top of this the biggest speed ups come from pushing a whole array of data rather than passing through each element individually.
I find this article confusing for someone with less programming experience
September 7th, 2005 at 12:49 am
Hey Bernard!
I think that while reading some of the docs you probably got confused in regards to what python does and how it does it, which could be why your comparison of looping techniques is largely flawed (together with the fact that the example is not the most relevant in a performance test, nor is using just one test bed).
several books state that map is a C function while for loops are handled by python, and I guess that is fairly confusing to people who lack in IT and abstract theory fundamentals.
to oversimplify and clarify a bit:
a for loop, whichever way you range it, relies on python restarting an iteration at the end of the body, and this accounts for its performance when dealing with large datasets.
map instead picks the list as a whole and hammers it through a compiled and optimized library (hence the use of C and possibly the confusion) and lets this more performant handler deal with the operations. and that’s why map is usually faster than explicit looping in most cases.
last but not least, as the article is about loops, I believe you left out an important part of comparing looping techniques, which is the amount of functionalities that each technique provides.
IE: mapping is fast but it can be trickier to handle certain arguments with it (as you have to cross reference variables to make it accept anything other than equally sized lists for arguments).
list comprehension is an elegant way of dealing with streamlining, and it’s just as performant as mapping (it’s in fact a form of mapping underneath) but it very easily makes your code atrociously obfuscated (which in fact happens in one of your examples that takes more time to read than it takes to execute ;) )
Hope you can revise the article.
good job anyway.
September 7th, 2005 at 8:20 am
Sam:
>> The speed of code is not as important as legibility and ease of editing.
I think one must balance between the two. It’s important indeed that the code be easily maintainable. But if execution takes time and slows down the user (who is ultimately the one benefiting from the script), then steps have to be taken to alleviate this problem, even if it means more code. The context of this article is executing the code in XSI, so dealing with objects to test types and make comparisons are very common tasks. I focused this article on that aspect.
>> I’m surprised you didn’t go into filter and lambda functions as they have more effect on the speed of code
I considered testing filter(), but I assumed it would have the same kind of results as with map(), considering it’s a C function used with Python functions. I’m too new to lambdas to use them efficiently. Could you provide some examples?
>> On top of this the biggest speed ups come from pushing a whole array of data rather than passing through each element individually.
Definitely. But this article is about looping :-)
Raffaele:
>> I think that while reading some of the docs you probably got confused in regards to what python does and how it does it, which could be why your comparison of looping techniques is largely flawed (together with the fact that the example is not the most relevant in a performance test, nor is using just one test bed).
Please contribute to this article with your own ideas, I think that would be more useful than this comment…
>> map instead picks the list as a whole and hammers it through a compiled and optimized library (hence the use of C and possibly the confusion) and lets this more performant handler deal with the operations. and that’s why map is usually faster than explicit looping in most cases.
Well, yes and no. I assumed the same when I found out about map(), but when I originally put it to the test with many XSI self-installed commands, I noticed a significant decrease in execution speed. After some research in books and mailing lists, everything I have read pointed at the same conclusion: map() works better with built-in Python functions (written in C), but not necessarily with Python functions (written in Python). When I replaced the map() calls in my self-installed commands by for loops, the gain in speed was humanly measurable, 100% of the cases. Could I have done something wrong with the use of map()? That’s quite possible. Then again I have never seen an example of map used in XSI, so any contribution in that regard, with performance comparisons, would be welcome.
>> last but not least, as the article is about loops, I believe you left out an important part of comparing looping techniques, which is the amount of functionalities that each technique provides.
I’m not sure what you mean here… Can you elaborate?
Bernard
September 7th, 2005 at 11:43 pm
lemme refine my statements a bit then.
what you deal with in this article is clearly oriented to performance tuning.
now, performance tuning is a VERY task specific issue to deal with, no 2 pieces of code, even when the apparent differences are minimal, benefit in the same way from the same exact tuning.
performance is dictated by several elements working in synergy. part of it is the language you are using or, more specifically, the backend to that language and how it deals with different programming paradigms and the various and different tools that it offers to handle a generic dataset in substantially different ways. however, a HUGE part of performance issues could very well spawn from something entirely different, like the API and the SDK you are using, the way such tools handle call marshalling and internal automation/optimization of processes and all that.
when you use a single test bed for your performance analysis, and one that involves such a highly SDK dependent thing like XSI’s internal collections, you’re totally voiding your efforts.
one example is map, that you insist saying isn’t much of a performance booster when dealing with functions written inside your python code. that statement could be almost correct in this one particular case, but have you tried mapping a simple transform over a position array list?
on something simple, like getting the position array, decomposing it and adding a fixed amount to one axis of the position, and then dumping it back into the geometry, mapping that function VS looping it in a range provides a tenfold difference in performance, and that is because what you do is create the list, manipulate it all in one go, and then dump it back.
further than that, in such a simple case you could achieve the same result, without obfuscating the code, by inlining the operation with a lambda, and that would give you another small, but nearly irrelevant boost.
last but not least, the same thing, with list comprehension, will probably result in something even faster, if only by 10-20%, than map.
now, isn’t that a case that happens VERY often when writing for XSI? manipulating an array to alter geometry I mean, and isn’t that giving results radically different from what you are stating as a fairly cold hard truth here?
trying to generalize the subject of performance tuning, even in such a simple environment like an interpreted language in one single application, is still damn hard.
there’s hundreds of pages in huge very technical books that deal with optimization in a single language in a single API in a single environment (like optimizing terrain generation in C + OGL), and yet they barely scratch the surface.
drawing conclusions about all the possibilities in python from one single test is misleading to say the least.
if you want to decompose and chart performance optimization, you first have to do it for XSI’s SDK, figuring out what objects and methods, under a constant condition (like using always the same iteration technique), offer what kind of results.
once you’ve done that you test the same set under all the looping techniques.
once you’ve done that you test this second level permutation (which will be at least 16 different snippets just using 4 possible looping techniques on 4 different kinds of storing and handling data) you’ll then have to figure out what direct optimization techniques you could use and when they are beneficial or detrimental.
like sometimes committing a collection to a different datatype, handling it, and committing it back into a collection (or parsing the dataset) will provide obvious boosts, because the translation overhead is only a small loss compared to the speed boost gained from a language’s tricks you could use.
some other times that overhead will negate the benefits and then slow you down some.
considering 4 or 5 different tricks and techniques of committing or dereferencing data, your 3rd level permutation would be already 80 different situations to test and chart, and that is leaving out all the shades of gray in between.
hopefully now you understand what I meant when I said I find the article flawed in its approach and in the adamantine quality some of the statements have in them.
September 28th, 2005 at 9:16 pm
Raffaele,
It is with the greatest of interest that I have read your last post. Recently I had some time to tinker about it, so let’s review your comments…
>> you first have to do it for XSI’s SDK, figuring out what objects and methods, under a constant condition (like using always the same iteration technique), offer what kind of results.
[Bernard] This article is about using in XSI some of the available looping techniques offered by Python, with the only exception being FindObjects(), which I provided only for the sake of comparison. I don’t know where you got that but I have never claimed that some techniques were better or worse than some techniques offered by the XSI objects and methods (other than the FindObjects call). I voluntarily left out things like FindChildren, Filter, Find, Item, and so on. I focused entirely on Python specific looping techniques, hence the title of this article. The intention was not to cover all possible combinations of all available looping techniques, which would probably make a book. I don’t see how this made my conclusions flawed.
>> once you’ve done that you test the same set under all the looping techniques.
[Bernard] This is what I have done nearly throughout the entire article, Raffaele…
>> once you’ve done that you test this second level permutation (which will be at least 16 different snippets just using 4 possible looping techniques on 4 different kinds of storing and handling data) you’ll then have to figure out what direct optimization techniques you could use and when they are beneficial or detrimental.
[Bernard] Okay another point here. I’m in complete agreement with you that each looping technique has to be carefully chosen depending on the context. Yes, in some cases, some types of loops will work better than others. I used only a handful of looping techniques in only a handful of contexts.
Still, one has to admit that even though pulling constants out of a few tests is not possible, tendencies, or should I say, outlines, can be drawn and held on to until proof of the contrary is found. As opposed to what you state I have not attempted to find stone truths, I explored a few possibilities in a few contexts. The possibilities I have explored may or may not extend beyond their context, that is up again for experimentation. Still, I believe the conclusion I have pulled from this experimentation is 100% valid and could easily apply to many different scenarios.
>> one example is map, that you insist saying isn’t much of a performance booster when dealing with functions written inside your python code. that statement could be almost correct in this one particular case, but have you tried mapping a simple transform over a position array list?
[Bernard] What if I did?
===================================================
1- Moving a keyframe in X translation by 3 units
In this exercise, I created a null, animated it over 100 frames, changed the fcurves to linear, then plotted its transformations for each frame. As a result, the null had 100 keyframes on all translation axes.
The script has to take the posx fcurve, and apply a simple addition of 3 units to each keyframe.
import win32com
import time

c = win32com.client.constants
xsi = Application

def timer():
    xsi.logmessage( "" )
    # Record current time
    time1 = time.clock()
    # Start loop to perform single function calls
    for i in range( 0, 10 ):
        # Call function
        main()
        # Record new time
        time2 = time.clock()
        # Print execution time, rounded to milliseconds
        xsi.logmessage( round( time2 - time1, 3 ) )

def moveUp( oKey ):
    # Append in-place the list of new keyframes
    aList.append( oKey.time )
    aList.append( oKey.value + nAmount )

def main():
    global aList
    global nAmount
    # Get selected object
    oSel = xsi.selection(0)
    # Specify the amount of transformation we want
    nAmount = 3.0
    # Create empty list to store new array of time/value pairs
    aList = []
    # Map keys to function
    map( moveUp, oSel.posx.source.keys )
    # Put new keyframes onto fcurve
    oSel.posx.source.setkeys( aList )

# Call timer
timer()
# RESULTS:
#INFO :
#INFO : 0.052
#INFO : 0.104
#INFO : 0.156
#INFO : 0.209
#INFO : 0.26
#INFO : 0.313
#INFO : 0.365
#INFO : 0.417
#INFO : 0.469
#INFO : 0.521
In the next script, I just made a slight modification. Instead of putting times and values into a list and then setting keys on the fcurve, I modify the keys in-place directly. That made the code a little bit shorter, as there was no need to declare globals.
import win32com
import time

c = win32com.client.constants
xsi = Application

def timer():
    xsi.logmessage( "" )
    # Record current time
    time1 = time.clock()
    # Start loop to perform single function calls
    for i in range( 0, 10 ):
        # Call function
        main()
        # Record new time
        time2 = time.clock()
        # Print execution time, rounded to milliseconds
        xsi.logmessage( round( time2 - time1, 3 ) )

def moveUp( oKey ):
    oKey.value += 3.0

def main():
    # Get selected object
    oSel = xsi.selection(0)
    # Map keys to function
    map( moveUp, oSel.posx.source.keys )

# Call timer
timer()
# RESULTS:
#INFO :
#INFO : 0.061
#INFO : 0.122
#INFO : 0.183
#INFO : 0.245
#INFO : 0.307
#INFO : 0.368
#INFO : 0.429
#INFO : 0.491
#INFO : 0.552
#INFO : 0.614
Now let's try with a classic for loop. There is no need for another function; the loop can now be done in the main function.
import win32com
import time

c = win32com.client.constants
xsi = Application

def timer():
    xsi.logmessage( "" )
    # Record current time
    time1 = time.clock()
    # Start loop to perform single function calls
    for i in range( 0, 10 ):
        # Call function
        main()
        # Record new time
        time2 = time.clock()
        # Print execution time, rounded to milliseconds
        xsi.logmessage( round( time2 - time1, 3 ) )

def main():
    # Get selected object
    oSel = xsi.selection(0)
    oFcurve = oSel.posx.source
    # Specify the amount of transformation we want
    nAmount = 3.0
    # Create empty list to store new array of time/value pairs
    aList = []
    for oKey in oFcurve.keys:
        aList.append( oKey.time )
        aList.append( oKey.value + nAmount )
    # Put new keyframes onto fcurve
    oFcurve.setkeys( aList )

# Call timer
timer()
# RESULTS
#INFO :
#INFO : 0.051
#INFO : 0.103
#INFO : 0.155
#INFO : 0.206
#INFO : 0.258
#INFO : 0.309
#INFO : 0.36
#INFO : 0.412
#INFO : 0.464
#INFO : 0.515
Very, very slightly faster. It's quite possible that I have not written the map() calls in the best possible way, but there are other factors to consider. First, since the loop runs in the main function, the calls to another function are eliminated. In Python, every call, however lightweight, has a performance cost. Also, removing the need to look up aList and nAmount in the global scope may have sped things up a little; as I already explained, looking up a name in a wider scope than the "current" one also has a performance cost.
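The two costs mentioned above (one Python function call per element, and global name lookups) can be observed outside XSI. The following is only a minimal sketch, written to run in a plain Python interpreter: the `values` list is a stand-in for the fcurve key values, and the names `move_up`, `with_map` and `with_for` are made up for this illustration. Absolute timings will differ from machine to machine and from those logged in XSI.

```python
import timeit

N_AMOUNT = 3.0
# Stand-in data: plain floats, not XSI fcurve keys
values = [float(i) for i in range(1000)]

def move_up(value):
    # N_AMOUNT is looked up in the global scope on every call
    return value + N_AMOUNT

def with_map():
    # One Python-level function call per element
    return list(map(move_up, values))

def with_for():
    # No per-element function call; n_amount is a local name
    n_amount = 3.0
    result = []
    for value in values:
        result.append(value + n_amount)
    return result

for func in (with_map, with_for):
    seconds = timeit.timeit(func, number=200)
    print("%s: %.3f s for 200 runs" % (func.__name__, seconds))
```

Both functions return the same list, so any timing difference comes from how the loop is driven, not from the work done per element.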
Now let's try a modification in-place using a classic for loop:
import win32com
import time

c = win32com.client.constants
xsi = Application

def timer():
    xsi.logmessage( "" )
    # Record current time
    time1 = time.clock()
    # Start loop to perform single function calls
    for i in range( 0, 10 ):
        # Call function
        main()
        # Record new time
        time2 = time.clock()
        # Print execution time, rounded to milliseconds
        xsi.logmessage( round( time2 - time1, 3 ) )

def main():
    # Get selected object
    oSel = xsi.selection(0)
    oFcurve = oSel.posx.source
    # Specify the amount of transformation we want
    nAmount = 3.0
    # Create empty list to store new array of time/value pairs
    aList = []
    for oKey in oFcurve.keys:
        oKey.value += nAmount

# Call timer
timer()
# RESULTS
#INFO :
#INFO : 0.06
#INFO : 0.122
#INFO : 0.183
#INFO : 0.245
#INFO : 0.307
#INFO : 0.368
#INFO : 0.43
#INFO : 0.492
#INFO : 0.554
#INFO : 0.617
Interesting. Compared to the map version, the for loop is faster at the beginning but gradually slows down, to the point where it ends up slower than the map() version. What is also interesting is that the two versions that populate a list, map and for, stayed rather steady relative to each other.
Now I wonder what the result of the last two scripts, those that modify keyframe values in place, would be if we had 10,000 keyframes instead of 100. Let's see.
The map version:
#INFO :
#INFO : 64.185
#INFO : 128.029
#INFO : 194.52
#INFO : 258.883
#INFO : 326.49
#INFO : 392.737
#INFO : 459.738
#INFO : 526.23
#INFO : 593.348
#INFO : 660.818
The for version:
#INFO :
#INFO : 61.918
#INFO : 123.811
#INFO : 185.709
#INFO : 247.661
#INFO : 309.602
#INFO : 371.551
#INFO : 433.595
#INFO : 495.631
#INFO : 557.791
#INFO : 619.819
Hmm. In this case, the map version not only did worse than the for version, but the pattern is not at all the same as before: the for loop stays consistently faster than the map version, from the first call to the last.
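To put a number on that gap, the cumulative times after the 10th call can be turned into per-call averages and a relative difference. This is just a bit of arithmetic on the two totals logged above for 10,000 keyframes; nothing here talks to XSI.

```python
# Cumulative times after the 10th call, copied from the 10,000-keyframe results
map_total = 660.818
for_total = 619.819
calls = 10

map_per_call = map_total / calls
for_per_call = for_total / calls
# Extra time taken by map(), relative to the for loop
slowdown = (map_total - for_total) / for_total

print("map: %.1f s per call" % map_per_call)
print("for: %.1f s per call" % for_per_call)
print("map is %.1f%% slower" % (slowdown * 100))
```

The same calculation on the 100-keyframe totals (0.614 against 0.617) gives a difference well under one percent, in the other direction.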
===================================================
2- Computing transform multiplications
Very well, enough of that. I figured I could do a different exercise; perhaps I had not chosen the right use for the map call. So I created a null and keyframed it at the origin, then moved it to (1000, 1000, 1000) and keyframed it again. I changed the curves to linear interpolation, then duplicated this null from animation and ended up with 1000 nulls. In this exercise I extract the transformations of the nulls, multiply them by another transform, and set the resulting transforms back on the nulls.
import win32com
import time

c = win32com.client.constants
xsi = Application
XSIMath = win32com.client.Dispatch( "XSI.Math" )

def timer():
    xsi.logmessage( "" )
    # Record current time
    time1 = time.clock()
    # Start loop to perform single function calls
    for i in range( 0, 10 ):
        # Call function
        main()
        # Record new time
        time2 = time.clock()
        # Print execution time, rounded to milliseconds
        xsi.logmessage( round( time2 - time1, 3 ) )

def moveDown( oObject ):
    # Get current object transforms
    oObjTransform = oObject.Kinematics.Local.Transform
    # Create new transformation
    oNewTransform = XSIMath.CreateTransform()
    # Multiply object transform by global transform
    oNewTransform.Mul( oObjTransform, oTransform )
    oObject.Kinematics.Local.Transform = oNewTransform

def main():
    global oTransform
    oTransform = XSIMath.CreateTransform()
    oTransform.SetTranslationFromValues( -1.578, 3.45987, 4.23490 )
    map( moveDown, xsi.selection )

# Call timer
timer()
# RESULTS
#INFO :
#INFO : 2.916
#INFO : 5.857
#INFO : 8.809
#INFO : 11.765
#INFO : 14.708
#INFO : 17.669
#INFO : 20.625
#INFO : 23.542
#INFO : 26.487
#INFO : 29.444
Then with a for loop version:
import win32com
import time

c = win32com.client.constants
xsi = Application
XSIMath = win32com.client.Dispatch( "XSI.Math" )

def timer():
    xsi.logmessage( "" )
    # Record current time
    time1 = time.clock()
    # Start loop to perform single function calls
    for i in range( 0, 10 ):
        # Call function
        main()
        # Record new time
        time2 = time.clock()
        # Print execution time, rounded to milliseconds
        xsi.logmessage( round( time2 - time1, 3 ) )

def main():
    oTransform = XSIMath.CreateTransform()
    oTransform.SetTranslationFromValues( -1.578, 3.45987, 4.23490 )
    for oObject in xsi.selection:
        # Get current object transforms
        oObjTransform = oObject.Kinematics.Local.Transform
        # Create new transformation
        oNewTransform = XSIMath.CreateTransform()
        # Multiply object transform by global transform
        oNewTransform.Mul( oObjTransform, oTransform )
        oObject.Kinematics.Local.Transform = oNewTransform

# Call timer
timer()
#INFO :
#INFO : 2.912
#INFO : 5.876
#INFO : 8.869
#INFO : 11.851
#INFO : 14.87
#INFO : 17.857
#INFO : 20.848
#INFO : 23.821
#INFO : 26.813
#INFO : 29.786
Hmm, not much difference; all in all the for loop is slower by a fraction. Not very conclusive. Let's make some modifications to the scripts. This time, instead of creating a new transform object in every iteration, we'll create one reusable transform object. I'm not sure whether it will speed things up or slow them down, but it's worth a try, no?
import win32com
import time

c = win32com.client.constants
xsi = Application
XSIMath = win32com.client.Dispatch( "XSI.Math" )

def timer():
    xsi.logmessage( "" )
    # Record current time
    time1 = time.clock()
    # Start loop to perform single function calls
    for i in range( 0, 10 ):
        # Call function
        main()
        # Record new time
        time2 = time.clock()
        # Print execution time, rounded to milliseconds
        xsi.logmessage( round( time2 - time1, 3 ) )

def moveDown( oObject ):
    # Get current object transforms
    oObjTransform = oObject.Kinematics.Local.Transform
    # Multiply object transform by global transform
    oNewTransform.Mul( oObjTransform, oTransform )
    oObject.Kinematics.Local.Transform = oNewTransform

def main():
    global oTransform
    oTransform = XSIMath.CreateTransform()
    oTransform.SetTranslationFromValues( -1.578, 3.45987, 4.23490 )
    global oNewTransform
    oNewTransform = XSIMath.CreateTransform()
    map( moveDown, xsi.selection )

# Call timer
timer()
# RESULTS
#INFO :
#INFO : 2.186
#INFO : 4.45
#INFO : 6.734
#INFO : 8.993
#INFO : 11.249
#INFO : 13.496
#INFO : 15.743
#INFO : 18.012
#INFO : 20.298
#INFO : 22.537
Nice! Just using a global object instead of recreating it at each iteration made things faster. Now let's see how things go with the for loop.
import win32com
import time

c = win32com.client.constants
xsi = Application
XSIMath = win32com.client.Dispatch( "XSI.Math" )

def timer():
    xsi.logmessage( "" )
    # Record current time
    time1 = time.clock()
    # Start loop to perform single function calls
    for i in range( 0, 10 ):
        # Call function
        main()
        # Record new time
        time2 = time.clock()
        # Print execution time, rounded to milliseconds
        xsi.logmessage( round( time2 - time1, 3 ) )

def main():
    oTransform = XSIMath.CreateTransform()
    oTransform.SetTranslationFromValues( -1.578, 3.45987, 4.23490 )
    # Create new transformation
    oNewTransform = XSIMath.CreateTransform()
    for oObject in xsi.selection:
        # Get current object transforms
        oObjTransform = oObject.Kinematics.Local.Transform
        # Multiply object transform by global transform
        oNewTransform.Mul( oObjTransform, oTransform )
        oObject.Kinematics.Local.Transform = oNewTransform

# Call timer
timer()
# RESULTS
#INFO :
#INFO : 2.224
#INFO : 4.492
#INFO : 6.705
#INFO : 8.975
#INFO : 11.209
#INFO : 13.437
#INFO : 15.658
#INFO : 17.871
#INFO : 20.095
#INFO : 22.351
The for loop starts a little slower, but catches up with and even surpasses the map loop.
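The gain from reusing one transform object, seen in both the map and for versions, can be illustrated outside XSI. What follows is a sketch under heavy assumptions: `FakeTransform` is a hypothetical stand-in that only stores a translation, not the XSI transform object, and it counts how many instances get created so the difference is visible without a timer. Reuse is only safe when whatever receives the result copies the values out before the next iteration overwrites them, which is what `tuple(tmp.pos)` does here.

```python
class FakeTransform(object):
    """Hypothetical stand-in for a transform; it only stores a translation."""
    created = 0  # number of instances constructed so far

    def __init__(self, x=0.0, y=0.0, z=0.0):
        FakeTransform.created += 1
        self.pos = [x, y, z]

    def mul(self, a, b):
        # For pure translations, combining two transforms adds the offsets
        self.pos = [p + q for p, q in zip(a.pos, b.pos)]

def new_each_time(objects, offset):
    results = []
    for obj in objects:
        tmp = FakeTransform()          # allocated on every iteration
        tmp.mul(obj, offset)
        results.append(tuple(tmp.pos))
    return results

def reuse_one(objects, offset):
    results = []
    tmp = FakeTransform()              # allocated once, overwritten by mul()
    for obj in objects:
        tmp.mul(obj, offset)
        results.append(tuple(tmp.pos))
    return results

objects = [FakeTransform(i, i, i) for i in range(1000)]
offset = FakeTransform(-1.578, 3.45987, 4.23490)

FakeTransform.created = 0
a = new_each_time(objects, offset)
created_each = FakeTransform.created

FakeTransform.created = 0
b = reuse_one(objects, offset)
created_once = FakeTransform.created

print("objects created: %d vs %d" % (created_each, created_once))
```

In XSI each creation is presumably more expensive than in this toy class, since it goes through a COM call, which would be consistent with the drop from roughly 29 to 22 seconds in the results.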
Okay, time to draw some conclusions. First, there are many other uses of loops; these tests were just another attempt at measuring differences in speed in a given context. The differences were generally very minor: on the order of hundredths of a second for the “quick” loops, and from a few tenths of a second up to about 40 seconds for the longer loops. In some cases map performed better, in other cases the for loop performed better. Only in the longest loops did the for loop really make a big difference. So based on this new series of tests, I still can't change my mind about the fact that a for loop is faster than a map when used with Python functions.
===================================================
3- Other forms of loop
But wait! I remembered that lambdas and list comprehensions were mentioned. Lambdas are still a little bit obscure to me, but I decided to give it a go, hoping I will not mess things up. Let's see.
import win32com
import time

c = win32com.client.constants
xsi = Application

def timer():
    xsi.logmessage( "" )
    # Record current time
    time1 = time.clock()
    # Start loop to perform single function calls
    for i in range( 0, 10 ):
        # Call function
        main()
        # Record new time
        time2 = time.clock()
        # Print execution time, rounded to milliseconds
        xsi.logmessage( round( time2 - time1, 3 ) )

def main():
    # Get selected object
    oSel = xsi.selection(0)
    oFcurve = oSel.posx.source
    # Specify the amount of transformation we want
    nAmount = 3.0
    # Map to lambda function, returns tuples
    aTuples = map( ( lambda oKey: ( oKey.time, oKey.value + nAmount ) ), oFcurve.keys )
    # Flatten list
    aFlatList = [ nValue for tTuple in aTuples for nValue in tTuple ]
    # Put new keyframes onto fcurve
    oFcurve.setkeys( aFlatList )

# Call timer
timer()
# RESULTS
#INFO :
#INFO : 0.052
#INFO : 0.104
#INFO : 0.156
#INFO : 0.207
#INFO : 0.259
#INFO : 0.31
#INFO : 0.362
#INFO : 0.414
#INFO : 0.465
#INFO : 0.517
I suspect that I may not have made the best possible use of the lambda (for instance, I don't know if it's possible to return multiple values without a tuple), but using a lambda in a map call is still perfectly legal. Possibly the list comprehension that flattens the values right after the map/lambda involves a performance hit that cancels out any gain from the map/lambda combination.
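On the flattening step, here is a small sketch of two alternatives, assuming nothing about XSI: `Key` is a hypothetical stand-in for an fcurve key with `time` and `value` attributes. The first alternative folds the flattening into a single comprehension; the second uses `itertools.chain.from_iterable`, which only exists in Python 2.6 and later, so it would not have been available in the Python version of the time. Whether either is faster inside XSI has not been measured here.

```python
from collections import namedtuple
from itertools import chain

# Hypothetical stand-in for an fcurve key; real keys come from oFcurve.keys
Key = namedtuple("Key", ["time", "value"])
keys = [Key(float(frame), frame * 0.5) for frame in range(1, 101)]
n_amount = 3.0

# Two steps, as in the script above: build tuples, then flatten them
pairs = [(key.time, key.value + n_amount) for key in keys]
flat_two_steps = [number for pair in pairs for number in pair]

# One step: the inner loop walks a small tuple built for each key
flat_one_step = [number for key in keys
                 for number in (key.time, key.value + n_amount)]

# chain.from_iterable flattens the pairs without an explicit inner loop
flat_chain = list(chain.from_iterable(pairs))
```

All three lists hold the same alternating time/value sequence that setkeys() is given in the scripts above.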
Now let's try with a list comprehension version…
import win32com
import time

c = win32com.client.constants
xsi = Application

def timer():
    xsi.logmessage( "" )
    # Record current time
    time1 = time.clock()
    # Start loop to perform single function calls
    for i in range( 0, 10 ):
        # Call function
        main()
        # Record new time
        time2 = time.clock()
        # Print execution time, rounded to milliseconds
        xsi.logmessage( round( time2 - time1, 3 ) )

def main():
    # Get selected object
    oSel = xsi.selection(0)
    oFcurve = oSel.posx.source
    # Specify the amount of transformation we want
    nAmount = 3.0
    # Get pairs of new time/value
    aTuples = [ ( oKey.time, oKey.value + nAmount ) for oKey in oFcurve.keys ]
    # Flatten list
    aFlatList = [ nValue for tTuple in aTuples for nValue in tTuple ]
    # Put new keyframes onto fcurve
    oFcurve.setkeys( aFlatList )

# Call timer
timer()
# RESULT
#INFO :
#INFO : 0.051
#INFO : 0.104
#INFO : 0.156
#INFO : 0.208
#INFO : 0.26
#INFO : 0.311
#INFO : 0.363
#INFO : 0.414
#INFO : 0.465
#INFO : 0.516
Again, extremely minimal difference….
===================================================
4- Summary of tests
Okay, so let's put all the result tables next to each other, so we can have a better idea. All values are cumulative seconds, logged after each of the 10 calls.

Test 1, 100 keyframes, keys set from a list:

Call   map      for      map + lambda   list comprehension
1      0.052    0.051    0.052          0.051
2      0.104    0.103    0.104          0.104
3      0.156    0.155    0.156          0.156
4      0.209    0.206    0.207          0.208
5      0.26     0.258    0.259          0.26
6      0.313    0.309    0.31           0.311
7      0.365    0.36     0.362          0.363
8      0.417    0.412    0.414          0.414
9      0.469    0.464    0.465          0.465
10     0.521    0.515    0.517          0.516

Test 1, keys modified in place:

Call   map, 100 keys   for, 100 keys   map, 10,000 keys   for, 10,000 keys
1      0.061           0.06            64.185             61.918
2      0.122           0.122           128.029            123.811
3      0.183           0.183           194.52             185.709
4      0.245           0.245           258.883            247.661
5      0.307           0.307           326.49             309.602
6      0.368           0.368           392.737            371.551
7      0.429           0.43            459.738            433.595
8      0.491           0.492           526.23             495.631
9      0.552           0.554           593.348            557.791
10     0.614           0.617           660.818            619.819

Test 2, 1000 nulls, transform multiplication:

Call   map, new transform   for, new transform   map, reused transform   for, reused transform
1      2.916                2.912                2.186                   2.224
2      5.857                5.876                4.45                    4.492
3      8.809                8.869                6.734                   6.705
4      11.765               11.851               8.993                   8.975
5      14.708               14.87                11.249                  11.209
6      17.669               17.857               13.496                  13.437
7      20.625               20.848               15.743                  15.658
8      23.542               23.821               18.012                  17.871
9      26.487               26.813               20.298                  20.095
10     29.444               29.786               22.537                  22.351
To conclude this series of tests, I will say that the use I made of map, for, lambda and list comprehension did NOT invalidate any of the conclusions of the article above, and I have yet to find a scenario where using map is faster than a for loop with Python functions. At this point, if you disagree with me, I think it would be better for you to provide some code. Unless you provide evidence for your statements, I'm afraid I'll have to stand by my position on that one. I think I have done my homework.
Btw, all the scripts I wrote for this post are available here:
http://www.bernardlebel.com/scripts/xsi/scripts/others/xsiblog/pythonloopingtechniques/
>> when you use a single test bed for your performance analisys, and one that involves such a highly SDK dependant thing like XSI’s internal collections, you’re totally voiding your efforts.
[Bernard] With all due respect to your experience and knowledge, I largely disagree here. Again, this article is all about using Python looping techniques in XSI, *nothing more*. The code runs in XSI at all times, in the same build, on the same machine, same OS, same Python version; in short, under the same conditions throughout. In this article I have questioned Python *in XSI*, period. Of course it depends on the SDK. So? Does that mean we should try to benchmark Python's performance? Of course the results may be different from running similar code in another environment. But I did not talk about that.
One last thing…
>> hopefully now you understand what I meant when I said I find the article flawed in its approach and in the adamantine quality some of the statements have in them.
I take what I do very seriously, and a considerable amount of time has been spent on these posts. Therefore, I consider that it would be the most elementary courtesy and respect from you, if you wish to further refute my statements, to show a similar commitment and provide tangible material to support your statements. I think I have gone far enough with this for you to go past the commenting approach and take an educating one.
I know you are a knowledgeable guy and I respect that, and I would be grateful if you could teach me something; I'm happy to revisit my positions, as long as we're on the same level of seriousness. Still, I thank you for your comments; they have already taught me a few things.
Bernard
December 14th, 2005 at 4:51 am
ok, let's drop the confrontational thing, which was never my intention to start.
I see what you're trying to get across in the article, and I never said your results were wrong.
I said the way your statements try to encompass the whole looping techniques thing is what I'm arguing.
remember when I said that you should try different testbeds? and that applying mathematical operations to an array will be the fastest by mapping or comprehension and that such a situation wasn't so uncommon (it's in fact more performance critical than walking selections or curves, as such topology walks are at the heart of deformers, which NEED to be fast).
in the following code, which is representative of the time required to create lists or not in each technique and how they are created, you can see the speed progression I described in my reply.
it is just one of MANY examples where performance ladders are considerably different from that in your conclusions.
the inner workings of an interpreter and the various caching/loading/reloading and instancing operations needed to run them can tip the scales in considerably different ways in different situations.
IE: notice how a lambda function will make map slightly faster, but will slow down list comprehension considerably.
if the site kills the formatting, indent once lines:
5, 6, 19, 20, 21, 30, 32, 34, 43, 44, 45
I hope it helps furthering your understanding of programming and improving your scripts:
——
Editor's note:
Script removed.
Please refer to the script in this comment.
——
December 14th, 2005 at 5:32 am
and I'm a dummy for having forgotten the syntax of list comprehension and having had two typos in the script.
here's the correct version, ignore the previous one and the lambda thing, was in muppet mode and didn't test things.
import win32com
import time

def simpleAddition(foo):
    foo = foo +1
    return foo

#Application.CreatePrim("Sphere", "MeshSurface", "", "")
#Application.SetValue("sphere.polymsh.geom.subdivu", 300, "")
#Application.SetValue("sphere.polymsh.geom.subdivv", 300, "")

overallStartTime = time.clock()
tupPointsPosition = Application.Selection(0).ActivePrimitive.Geometry.Points.PositionArray

startTime = time.clock()
forRangeAppendPointsPosition = [[],[],[]]
for i in range(len(tupPointsPosition[0])):
    forRangeAppendPointsPosition[0].append(simpleAddition(tupPointsPosition[0][i]))
    forRangeAppendPointsPosition[1].append(simpleAddition(tupPointsPosition[1][i]))
    forRangeAppendPointsPosition[2].append(simpleAddition(tupPointsPosition[2][i]))
endTime = time.clock()
print "for range appending the new value " + str(endTime-startTime)
testList = forRangeAppendPointsPosition
if testList == forRangeAppendPointsPosition: print "check"

startTime = time.clock()
forInPointPosition = [[],[],[]]
for i in tupPointsPosition[0]:
    forInPointPosition[0].append(simpleAddition(i))
for i in tupPointsPosition[1]:
    forInPointPosition[1].append(simpleAddition(i))
for i in tupPointsPosition[2]:
    forInPointPosition[2].append(simpleAddition(i))
endTime = time.clock()
print "for each element and appending " + str(endTime-startTime)
if testList == forInPointPosition: print "check"

startTime = time.clock()
forRangeInPlacePointsPosition = [list(tupPointsPosition[0]),list(tupPointsPosition[1]),list(tupPointsPosition[2])]
for i in range(len(tupPointsPosition[0])):
    forRangeInPlacePointsPosition[0][i] = simpleAddition(tupPointsPosition[0][i])
    forRangeInPlacePointsPosition[1][i] = simpleAddition(tupPointsPosition[1][i])
    forRangeInPlacePointsPosition[2][i] = simpleAddition(tupPointsPosition[2][i])
endTime = time.clock()
print "for range modifying in place " + str(endTime-startTime)
if testList == forRangeInPlacePointsPosition: print "check"

#no for each modifying in place as retrieving the index of the value
#makes modification in place unreliable and dangerous
#or overly complicated and slow to make it safe

startTime = time.clock()
mapPosX = map(simpleAddition, tupPointsPosition[0])
mapPosY = map(simpleAddition, tupPointsPosition[1])
mapPosZ = map(simpleAddition, tupPointsPosition[2])
mapPointsPosition = [mapPosX, mapPosY, mapPosZ ]
endTime = time.clock()
print "mapping the function " + str(endTime-startTime)
if testList == mapPointsPosition: print "check"

startTime = time.clock()
mapLambdaPosX = map(lambda x: x+1, tupPointsPosition[0])
mapLambdaPosY = map(lambda x: x+1, tupPointsPosition[1])
mapLambdaPosZ = map(lambda x: x+1, tupPointsPosition[2])
# Collect the lambda results (the posted version reused mapPosX/Y/Z here)
mapLambdaPointsPosition = [mapLambdaPosX, mapLambdaPosY, mapLambdaPosZ ]
endTime = time.clock()
print "mapping a lambda function " + str(endTime-startTime)
if testList == mapLambdaPointsPosition: print "check"

startTime = time.clock()
listCompPointsPosition = [[simpleAddition(i) for i in tupPointsPosition[0]],[simpleAddition(i) for i in tupPointsPosition[1]],[simpleAddition(i) for i in tupPointsPosition[2]]]
endTime = time.clock()
print "list comprehension " + str(endTime-startTime)
if testList == listCompPointsPosition: print "check"

startTime = time.clock()
listCompDirectPointsPosition = [[i+1 for i in tupPointsPosition[0]],[i+1 for i in tupPointsPosition[1]],[i+1 for i in tupPointsPosition[2]]]
endTime = time.clock()
print "list comprehension with embedded function " + str(endTime-startTime)
if testList == listCompDirectPointsPosition: print "check"

overallEndTime = time.clock()
print "total time elapsed: " + str(overallEndTime-overallStartTime)
December 18th, 2005 at 9:32 pm
Most interesting, Raffaele. Perhaps I'm pushing my luck, but I would like to see a set of timing results, if you don't mind. Simply to compare with mine, I swear.
#INFO : for range appending the new value 0.383707485695
#INFO : check
#INFO : for each element and appending 0.321049263254
#INFO : check
#INFO : for range modifying in place 0.337336047004
#INFO : check
#INFO : mapping the function 0.170906125977
#INFO : check
#INFO : mapping a lambda function 0.147533536157
#INFO : check
#INFO : list comprehension 0.222757300532
#INFO : check
#INFO : list comprehension with embedded function 0.11899168944
#INFO : check
#INFO : total time elapsed: 2.75101092793
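For readers without XSI at hand, the four non-loop techniques from the script can be compared in a plain Python interpreter. This is a rough sketch, not the script that produced the numbers above: `row` is a made-up tuple of floats standing in for one row of the position array, the function names are invented for the illustration, and the standard `timeit` module replaces the manual `time.clock()` bookkeeping. The ranking may well differ from the one logged in XSI, and between Python versions.

```python
import timeit

# Stand-in for one row of a position array: a tuple of floats
row = tuple(float(i) for i in range(20000))

def simple_addition(x):
    return x + 1

def map_function():
    return list(map(simple_addition, row))

def map_lambda():
    return list(map(lambda x: x + 1, row))

def comp_function():
    return [simple_addition(x) for x in row]

def comp_inline():
    return [x + 1 for x in row]

techniques = (map_function, map_lambda, comp_function, comp_inline)
timings = {}
for func in techniques:
    # Best of 3 runs of 5 calls each, to reduce noise from other processes
    timings[func.__name__] = min(timeit.repeat(func, number=5, repeat=3))

# Print fastest first
for name in sorted(timings, key=timings.get):
    print("%-14s %.4f s" % (name, timings[name]))
```

Taking the minimum of several repeats is the usual way to use `timeit.repeat`, since slower runs mostly reflect interference from the rest of the system.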
>> I said the way your statements try to encompass the whole looping techniques thing is what I’m arguing.
I would like to point you to the last paragraph of my initial post. But I take your point nonetheless.
Thanks for the code, I appreciate it.
Bernard
December 19th, 2005 at 12:44 am
they'd be pretty much in line with yours for this particular example, in fact yours are slightly more flattering to mapping than what I got on my 4 years old laptop, which is what I wrote and tested the script on.
would you still want me to post them? (no need to swear on what the purpose is, as I said the point was never to try and belittle you, otherwise I would have brought forward the subject of poutine, not mapping functions).
in other cases, especially on much more complex operations than a simple addition, the difference map can make will go up tenfold, and we've had cases where just mapping a function, with no other optimization, boosted a snippet's speed by well over x10.
in some other cases it wouldn't make much of a difference, and in some others (although seldom encountered) a lambda could even slow things down a bit (the peculiarity of lambda functions is the way the process is mapped ahead of execution, and in cases where this process is voided, or not applied to a large enough table, it can be counterproductive).
Different techniques will require different steps.
IE: manipulating in place can be a lot faster than creating by appending while the loop runs, but you won't always be able to pre-determine the size of the list in all dimensions (especially for very generically purposed pieces of code), or it would need counter-productive procedures or just too much time to test it in all situations.
also, if you were to take away from the equation the time needed to create or manipulate lists and other elements which differ from technique to technique, results would change again, and while knowing the performance of each part of the execution can be relevant, in the end it's the performance of the whole, possibly in situations that are real-world related, that matters.
this is, obviously, neglecting other important elements of software design, but for sheer speed based operations (like manipulating geometry) the self-contained example is quite pertinent.
I'll be happy to acknowledge the last paragraph of the initial post in regards to my comment and consider it crossed out.
does this make my initial point any clearer or is there anything I left out or mis-interpreted/mis-presented.
cheers.
Raff
December 19th, 2005 at 12:07 pm
>> they’d be pretty much in line with yours for this particular example, infact yours are slightly more flattering to mapping then what I got on my 4 years old laptop, which is what I wrote and tested the script on.
would you still want me to post them?
[Bernard] I think you answered my question; I was interested in the time curve on your side of things. Thanks for that.
>> (no need to swear on what the purpose is, as I said the point was never to try and belittle you, otherwise I would have brought forward the subject of poutine, not mapping functions).
[Bernard] mmmm poutine… What a fascinating subject, isn't it? That said, I can hardly see how you could belittle me (or a Canadian) on such a subject. ;)
>> IE: manipulating in place can be a lot faster then creating by appending while the loop runs, but you won’t always be able to pre-determine the size of the list in all dimensions (especially for very generically purposed pieces of code), or it would need counter-productive procedures or just too much time to test it in all situations.
Okay, here I have a question for you. Unless I have missed something, there is no way to specify a list length in Python as in VBScript or JScript. On the other hand, a tuple has a fixed length, and its elements can't be modified in place. So I'm not sure what you meant with that statement. If you are dealing with a list, then its length doesn't really matter; on the other hand, if you are dealing with a tuple, then you have no choice but to make copies. Correct?
>> I’ll be happy to acknowledge the last paragraph of the initial post in regards to my comment and consider it crossed out.
[Bernard] Thank you.
>> does this make my initial point any clearer or is there anything I left out or mis-interpreted/mis-presented.
Fair enough, I think we have reached a consensus and finally seem to be talking the same language by now.
Cheers
Bernard
December 19th, 2005 at 7:08 pm
yes, you have no choice but to make copies for tuples.
however, creating a list by walking the tuple, processing a row, and appending a new entry can be slower than casting the tuple to a list and modifying in place with another technique.
casting a tuple to a list by list() or creating a pre-determined sized list through list comprehension will be nearly instantaneous; and then you can modify it in place.
the speed boost will still be considerable over creating the list by processing and appending the entry at every cycle.
Also, if you multiply a None in a list I think that triggers a python optimization to generate a presized list even faster, something like myList = [None]*Size, but don't quote me on this one as I don't remember where or when I read it, and using list comp is usually so fast that I never concerned myself with finding a better way to pre-size a list.
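The three ways of building the result list described in this comment look like this in plain Python. It is only an illustration of the syntax: `source` is a made-up tuple standing in for data returned by XSI, and no claim is made here about which variant is fastest, which is something to measure on real data.

```python
source = (0.5, 1.5, 2.5, 3.5)   # stand-in for a tuple returned by XSI

# 1. Build the result by appending one entry per iteration
appended = []
for value in source:
    appended.append(value + 1)

# 2. Cast the tuple to a list, then overwrite each slot in place
in_place = list(source)
for i in range(len(in_place)):
    in_place[i] = in_place[i] + 1

# 3. Pre-size a list with placeholders, then fill it by index
presized = [None] * len(source)
for i, value in enumerate(source):
    presized[i] = value + 1
```

`[None] * n` is ordinary list repetition and does produce a list of n slots; the tuple itself is never modified in any of the three variants.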
December 19th, 2005 at 7:16 pm
oops… I should also add that you shouldn't limit the domain of your thinking to the specifics of this one example.
this one case uses the position array because it was the most convenient way for me to generate the need for a large list in XSI and provide an example rooted in day2day needs.
while xsi almost always returns tuples when dealing with returning arrays to the user, there are a lot of instances where you could be working with external data as well, and that could very well return a list right away.
this example aside though, often one ends up applying these techniques and tricks to custom data that won't necessarily be a tuple, nor easily pre-determined in its layout (IE: especially when working with databases and walking one you never know with how many branching offs you will end up with in some searches), but that is probably out of the scope of the article.
July 17th, 2008 at 11:54 pm
Hi,
The conversation that has taken place in the replies to this article is important in defining how the article’s information should be applied; however, you probably lost most beginners and a good portion of everyone else by the end of it ;). This article is great (with a few disclaimers warning about assumed usage)!
As someone already mentioned, readability should win the day when speed of various methods is either too close to worry about, or simply unnecessary. I would suggest creating good, clean, easy to read, and edit, code that works first, then optimize where necessary. Especially when working with a group of people that have to work with your code…and especially when working to a deadline. In production a developer can’t always write awesome code just to stroke their “ego”, you have to spend the time where the production will best benefit. This is a hard-won lesson and can apply to anything from writing code to animating.
In Python, I always reach for the object-iterating for loop. I find range-based looping is rarely something I will use by choice (in JScript it was the primary looping method, but while learning Python I read a book which said I would find that I simply wouldn't need it most of the time. I have found this to be true). I would only move to other methods of looping to address performance issues. This is where this article gives some great clues to new XSI/Python users.
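For new XSI/Python users, the difference between the two styles looks like this. The snippet is a generic illustration with made-up names (`names` is just a list of strings, not a collection of scene objects), and it adds `enumerate()` for the cases where the index really is needed.

```python
names = ["null1", "null2", "null3"]   # hypothetical object names

# Range-based loop: the index is the loop variable, the item is looked up
by_index = []
for i in range(len(names)):
    by_index.append(names[i].upper())

# Iterating the sequence directly: no index bookkeeping
direct = []
for name in names:
    direct.append(name.upper())

# When the index is genuinely needed, enumerate() supplies both
numbered = ["%d:%s" % (i, name) for i, name in enumerate(names)]
```

The first two loops produce the same list; the direct form simply has fewer moving parts to get wrong.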
Nice one!
- Rafe