Monday, October 30, 2006

pthreads: when light becomes heavy

Pthreads, or POSIX threads, are often described as "lightweight processes" and are usually recommended over fork(), exec() and calls to shared memory and semaphore routines. This is because creating a thread is supposed to take fewer resources than creating a process, and switching between threads is supposed to be cheaper. Threads (pthreads) may be implemented at the OS level or supported by appropriate hardware.

Although pthreads seem to take up far fewer resources, the way an OS is configured can drastically alter the resource requirements of creating them. This was the situation I landed in when I was using pthreads via Python on a large SGI Altix system (google for: ANUSF).
The stack size on this Altix system was set to a default of 1GB, which resulted in a stack allocation of 1GB per thread even if no work was done (or memory allocated) within these threads. Initially I thought that this was a problem with the Python interpreter on IA64, so I coded up a small skeleton program in C using pthreads; to my surprise this also resulted in an allocation of 1GB per thread. Next, I tried using OpenMP "threads" and was pleased to see that the memory of this process didn't shoot up like that of its pthreads counterpart.
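
To make the arithmetic concrete, here is a minimal sketch that reads the stack limit of the current process and estimates how much address space the thread stacks alone would reserve. It assumes a Unix-like system where Python's `resource` module is available, and it assumes (as is typically the case on Linux with glibc) that the default thread stack size follows the soft stack limit. The helper name `estimate_reserved_stack_bytes` and the thread count of 16 are only for illustration.

```python
import resource

GB = 1024 ** 3


def estimate_reserved_stack_bytes(num_threads, stack_bytes):
    """Address space reserved for stacks, assuming one fixed-size stack per thread."""
    return num_threads * stack_bytes


# Query the soft and hard stack limits of this process (what "ulimit -s" reports).
soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
if soft == resource.RLIM_INFINITY:
    print("soft stack limit: unlimited")
else:
    print("soft stack limit: %d MB" % (soft // (1024 * 1024)))

# Illustrative numbers: 16 threads with a 1 GB stack each reserve 16 GB.
print(estimate_reserved_stack_bytes(16, GB) // GB, "GB reserved for stacks")
```

The reservation grows linearly with the number of threads, which is why a 1GB default hurts even when the threads do nothing.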

After some consultation with my instructor, I discovered that you can set the stack size of pthreads using the pthread_attr_setstacksize() function (check google codesearch for examples). But this meant either rewriting all my Python code in C or writing a full thread wrapper for Python in C.

Determined not to do that, I set out to find ways to handle this in Python itself. I discovered that you can actually set the stack size in Python, but to my dismay this had only been introduced in the latest 2.5 release, and there was no way that the 2.3 version of Python on the Altix machines was going to be updated.
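
For reference, the Python-level call is `threading.stack_size()`, available from 2.5 onwards. The following is a minimal sketch of how it could be used, written for a current Python; the 8MB figure and the toy `worker` function are assumptions for illustration, not the code I was actually running.

```python
import threading

# Assumed value: request 8 MB stacks instead of the system default.
STACK_BYTES = 8 * 1024 * 1024

# Applies only to threads created after this call.
# May raise ValueError or RuntimeError if the platform rejects the size.
threading.stack_size(STACK_BYTES)

results = []


def worker(n):
    # Toy workload standing in for the real per-thread work.
    results.append(n * n)


threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))
```

Calling `threading.stack_size()` with no argument returns the size currently in effect, which is a quick way to check that the request was accepted.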

After googling around a bit I discovered what is called Stackless Python. This essentially reduces the usage of python stacks by maintaining a common stack. But this had problems of its own: first and foremost, it is not standard Python and has to be installed separately. Secondly, there is a lot of debate about the merits of Stackless Python, and disagreement with the main Python development community.

Ruling this out, by sheer chance I googled for "microthreads" and came across an interesting article by David Mertz. The article suggested using generators in Python to achieve user-level cooperative multithreading. It was particularly interesting for me as it was my first introduction to the wonderful generators in Python. I began toying around with this idea, but finally discovered that I would still require preemptive multithreading for my particular application.
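
The generator idea can be sketched roughly as follows. This is my own minimal round-robin scheduler, not the code from the article; the names `countdown` and `run_round_robin` are made up for illustration. Each "microthread" is a generator that does a bit of work and then yields to hand control back.

```python
from collections import deque


def countdown(name, n, log):
    """A toy microthread: record one step, then yield control."""
    while n > 0:
        log.append((name, n))
        n -= 1
        yield  # cooperative switch point: control returns to the scheduler


def run_round_robin(tasks):
    """Resume each generator in turn until all of them are exhausted."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # run the task up to its next yield
        except StopIteration:
            continue            # task finished, do not reschedule it
        queue.append(task)      # task still alive, put it at the back


log = []
run_round_robin([countdown("a", 2, log), countdown("b", 3, log)])
print(log)
# prints [('a', 2), ('b', 3), ('a', 1), ('b', 2), ('b', 1)]
```

Since a task only gives up control at a `yield`, one task that never yields blocks all the others; that is the cooperative part, and the reason this does not replace preemptive threads.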

I had known about "ulimit" in bash (or "limit" in csh), and had frequently used it to query the system limits, but had never intentionally used it to change those limits. As soon as I remembered this command, it was obvious what I should do: wrap the python execution in a shell script, and issue a ulimit with 8MB (or so) as the maximum stack limit. That way only the python process is affected by the change, and the rest of the system's processes remain intact.
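
In bash the wrapper boils down to a `ulimit -s 8192` (the value is in kilobytes) before launching python. The same idea can be sketched in Python itself by lowering the limit in a child process just before it starts. This is an illustrative equivalent, not the script I used: it assumes a Unix-like system and a current Python, and the names `limit_stack` and `run_with_small_stack` are hypothetical.

```python
import resource
import subprocess
import sys

STACK_BYTES = 8 * 1024 * 1024  # 8 MB, the "or so" value from the text


def limit_stack():
    """Runs in the child between fork() and exec(); the parent is untouched."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    new_soft = STACK_BYTES
    if hard != resource.RLIM_INFINITY:
        new_soft = min(STACK_BYTES, hard)  # the soft limit may not exceed the hard limit
    resource.setrlimit(resource.RLIMIT_STACK, (new_soft, hard))


def run_with_small_stack(argv):
    """Launch argv as a child process with the reduced stack limit."""
    return subprocess.run(argv, preexec_fn=limit_stack,
                          stdout=subprocess.PIPE, check=True)


# Demo: the child simply reports the soft stack limit it sees.
child_code = "import resource; print(resource.getrlimit(resource.RLIMIT_STACK)[0])"
result = run_with_small_stack([sys.executable, "-c", child_code])
print("child soft stack limit:", result.stdout.decode().strip())
```

The limit has to be in place before the child starts, which is why it is set in the pre-exec hook (or in the wrapper script) rather than from inside the already running program. Note that `preexec_fn` is documented as unsafe when the parent itself is running threads, so this belongs in a small launcher script.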

In the end the solution to the problem turned out to be simple, but I learned a lot in the process.
Now I am writing a small MicroThreads interface using generators, which I will soon be posting here. (Note: there are many other implementations of this idea, but I just want to have some fun with generators.)

daylight saving: googly!

Today, I was caught unaware by a practice followed in countries not near the equator.
Canberra, where I am presently staying, moved to daylight saving of +1 hour for the "summer time" .. and, evidently not aware of this change, I reached my workplace one hour late. Surprised, my instructor asked me, "how come you are late today?"
"late?" I looked at my mobile and said "probably u r early, its 8:45am".
"no, the time changed on saturday ... it is quarter to 10!" he said promptly with a smile.

... after some time I realized what daylight saving was all about: so far only heard of, now experienced!
Have a look at http://webexhibits.org/daylightsaving/c.html to know more about the rationale behind it.

Later on I realized that my Nokia 6600 has an option to adjust automatically to daylight saving, which obviously was set to off!

Tuesday, October 10, 2006

and the web office ...

google just combined its writely and spreadsheet apps:

http://docs.google.com/

updated to blogger beta

just now updated to the new blogger beta...

google code search!

google has released a wonderful code search utility!!

http://www.google.com/codesearch

it has found most of the code that I have released... except one: MeTA Studio.
there seem to be many projects with a similar name!