DPT Performance and Stack Overflows
I finally finished the private contract I was doing on Saturday, so on Sunday morning I sat down to start generating results for the presentation I’m giving to the CSIR on Thursday. Things went well until about 2 minutes in, when the program crashed. Running it from within VS, it died with an unhandled exception: stack overflow… I panicked, and instead of driving to work to do some research on stack overflows, I assumed that my code was broken. After debugging for around 8 hours with no results, apart from figuring out that the problem was in the grow operation, I had a mini breakdown. I left it all and dedicated the entire day today to fixing my “broken” code. It was the best thing I could have done!
I came in this morning feeling calm and relaxed, sat down, ran my program again and realized that it was an exception being thrown and not a fatal error! I did a Google search on the exception and figured out it was due to my program’s stack being too small, so my “broken” code wasn’t broken at all; it was suffering from a severe case of PEBCAK (google it)…
The locate critical sets function uses recursion to locate the initial sets, and outdoor images have large areas (sky) that are the same color, so it was recursing thousands of levels deep. Obviously the default 1 MB stack can’t store all that; increasing the stack size to 25 MB fixed everything! If anyone is interested, the option for setting the stack size is found under Project -> Properties -> Linker -> System -> Stack Reserve Size (in bytes).
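Raising the stack size is one fix; another common one is to drop the recursion and keep the pending pixels in an explicit stack on the heap. The sketch below is not the locate critical sets function itself, just a generic illustration of the idea: regionSize is a hypothetical helper that counts the 4-connected pixels sharing the start pixel’s value, and it assumes a row-major 8-bit grayscale buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not the DPT code): counts the pixels in the
// 4-connected region that has the same value as (startX, startY).
// The work list lives in a std::vector on the heap, so a huge uniform
// area (like sky) cannot overflow the call stack.
std::size_t regionSize(const std::vector<std::uint8_t>& pixels,
                       int width, int height, int startX, int startY) {
    if (startX < 0 || startY < 0 || startX >= width || startY >= height) {
        return 0;
    }
    const std::size_t w = static_cast<std::size_t>(width);
    const std::uint8_t target = pixels[static_cast<std::size_t>(startY) * w + startX];

    std::vector<bool> visited(pixels.size(), false);
    std::vector<std::pair<int, int> > pending;  // explicit stack replaces recursion
    pending.push_back(std::make_pair(startX, startY));
    visited[static_cast<std::size_t>(startY) * w + startX] = true;

    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    std::size_t count = 0;

    while (!pending.empty()) {
        const std::pair<int, int> p = pending.back();
        pending.pop_back();
        ++count;
        for (int i = 0; i < 4; ++i) {
            const int nx = p.first + dx[i];
            const int ny = p.second + dy[i];
            if (nx < 0 || ny < 0 || nx >= width || ny >= height) {
                continue;  // neighbour is outside the image
            }
            const std::size_t idx = static_cast<std::size_t>(ny) * w + nx;
            if (!visited[idx] && pixels[idx] == target) {
                visited[idx] = true;  // mark on push so each pixel is queued once
                pending.push_back(std::make_pair(nx, ny));
            }
        }
    }
    return count;
}
```

With this shape the memory limit is the heap rather than the stack reserve size, so a completely uniform image is handled the same way as a busy one.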
So I ran the simulation and the results are interesting. I ran it on 10 different images, with 9 different sizes ranging from 100% to 10% in 10% intervals. Apart from a few anomalies the graphs look pretty linear, which is a lot better than the previous results, which seemed exponential. See for yourself.
UPDATE: These are the updated performance graphs from the averaged results of 10 runs. I’m still not too sure what to make of them, so I won’t even hazard a guess… One thing is clear though: we need to do more research…
The Year So Far
Okay, halfway through January I’ve burned through all my disposable income and have started using my savings for day-to-day expenses. The holidays were expensive, but that’s expected.
I found a new flat yesterday and am signing the lease this afternoon. I’m still a little worried about living with someone, but it’s something I need to experience. This whole moving fiasco is proving to be expensive, so I’ve had to take on a new private contract just to make ends meet. I’ll be spending a few sleepless nights on it. Once that’s complete, I will have a few days to try and develop a GUI for my DPT program and also prepare a 15-minute presentation on it. I have to present our results to a group of people at the CSIR. I hate public speaking and this event worries me.
In other news the e8400’s are being released in a week!!! I’m 2 weeks away from my new 4 GHz baby; I can’t wait to see what it can do. I’m going to have to go for a P35 motherboard since the x38/x48 chipsets don’t clock as high. If I have some extra money after the move I’ll get 2 new hard drives and some water cooling…
There don’t seem to be enough hours in the day right now for everything I want to do, and I seem to be experiencing a major dip in my concentration and energy mid-afternoon, so from around 1:30 – 5:00 I’m like a vegetable.
I have sold off the rights to my JATE template engine, so releasing the source isn’t going to happen anytime soon. I should be glad, as this is less work for me.
Another exciting development is that I have been given the opportunity to offer my surveillance system as a new 3rd year project, so I can get the final year students to upgrade and polish off my already working system. I’m pretty excited, since if they get this right they can make a lot of money with it afterwards (that’s enough incentive for them to work that bit harder).
Oh, I forgot to mention that for the GUI I’m going to have to re-learn MFC. The last time I used it was 4 years ago, and I really don’t remember anything. It kind of worries me since I’ll have 2 days to learn it, but I guess I should be used to being constantly under pressure.
The new year and my plans…
Hello 2008. It’s a new year that’s barely started, and already it promises to keep me super busy with lots and lots of work.
I’m gonna be doing my honors course (it’s a graduate program, for those that don’t know) in computer science. My subject choices are mainly geared towards game development and advanced programming: Swarm Intelligence, Advanced Computer Graphics, Generic C++ Programming, Data Compression, Software Architecture, Computer Networks and some informatics course to simply fill up the credits.
I’ve been doing some performance tests on the DPT algorithm we’ve developed (we’ve named it the fast pulse transform), and while it’s fast, it’s extremely memory intensive: a 1632×1224 (2 MP) image ends up using over 2 GB of physical RAM to decompose. This is pretty crazy, and seeing that I only have 2 GB of RAM on my home machine and a paltry 1 GB on my work machine, I can’t decompose certain images. We discovered some redundancy in our algorithm and are going to update the program to correct this. I’m also going to try and reduce memory usage wherever I can.
I was simply going to use short ints instead of ints, but it seems that on a 32-bit x86 architecture using shorts requires an extra instruction per operation, so it’s again a memory vs performance tradeoff.
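To put a number on the memory side of that tradeoff, here is a rough sketch of what one value per pixel costs at each width. bufferBytes is a hypothetical helper, and it assumes a plain buffer with one value per pixel; the actual DPT data structures are not shown here and clearly use far more than this.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bytes needed to store one value of type T per pixel.
// Fixed-width types are used because sizeof(short) and sizeof(int) are
// platform dependent.
template <typename T>
constexpr std::size_t bufferBytes(std::size_t width, std::size_t height) {
    return width * height * sizeof(T);
}
```

For the 1632×1224 image, bufferBytes<std::int32_t>(1632, 1224) comes to 7,990,272 bytes and bufferBytes<std::int16_t>(1632, 1224) to 3,995,136 bytes, so every per-pixel buffer that can safely hold 16-bit values is halved.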
The initial performance estimates for the algorithm weren’t very promising, with both the prime critical sets search and the decomposition showing exponential complexity. I’m not so worried about this, as this is the first unoptimized version of a brand new algorithm, and future refinement will definitely improve on this.
In other news: I will be upgrading at the end of January to a shiny new Penryn-based E8400, which from initial reports on the internet should overclock to 4 GHz easily. I still need to pick out a motherboard, and if I get a contract in January perhaps even get some water cooling (going to cost me around R2000 for a full setup). I don’t really mind air cooling, and water shouldn’t help my overclock too much, but I’m kinda sick of the constant whoosh my PC currently makes. I want good old sweet silence.
And the reason for my upgrade? It’s twofold: I need a new motherboard with a better layout for my video card, and I will need to run some really intensive simulations this year, so having a monstrous PC will reduce the time they take drastically. Some of the simulations can take up to a day to complete on a 3 GHz P4, so I’ll be looking at about half that time with a 4 GHz C2D.
I can’t believe I’m actually buying high-end PC stuff for academic purposes. It’s so weird being able to justify PC upgrades.
High Resolution C++ Timing on Windows
Accurate timing is something that seems simple until you actually try it. At first people say: just get two timestamps and subtract them. Sure, that works for per-second timing, but what about millisecond or even microsecond timing?
Just get 2 millisecond timestamps and subtract them, right? No. The problem is that when the CPU is heavily loaded, the timestamps will often be significantly off. I noticed this in my surveillance project, where a specified time of 10 seconds ended up being between 5 and 15 real seconds depending on the load on the processor.
So how do you do accurate high resolution timing in Windows? In the past people used the now antiquated multimedia timers; these days you simply use the Windows API QueryPerformanceCounter and QueryPerformanceFrequency functions.
The QueryPerformanceCounter call returns the value of the performance counter (in counts) at that point, and QueryPerformanceFrequency returns the number of counts per second for that processor… You can see where I’m going with this: take a count before and after, subtract them, and divide by the frequency to get the elapsed seconds. Simple, huh?
The only other thing you need to take into account is that the count is stored in the LARGE_INTEGER union type. Since the count is a 64-bit int, you need to use the QuadPart (LONGLONG) member of the union to read it.
Here’s a very basic C++ timing class: HRTimer.h
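The listing itself is missing here, so what follows is only a minimal sketch of what such a class might look like, not the original HRTimer.h. The method names (start, stop, elapsedSeconds, elapsedMilliseconds) are assumptions. On Windows it uses QueryPerformanceCounter and QueryPerformanceFrequency as described above; the std::chrono branch is only a stand-in I added so the sketch also compiles on other platforms.

```cpp
// HRTimer.h (sketch; the class layout and method names are assumptions)
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#else
#include <chrono>
#endif

class HRTimer {
public:
    HRTimer() : start_(0), stop_(0), frequency_(queryFrequency()) {}

    void start() { start_ = queryCounter(); }
    void stop() { stop_ = queryCounter(); }

    // Elapsed time between start() and stop(): counts divided by counts per second.
    double elapsedSeconds() const {
        return static_cast<double>(stop_ - start_) / static_cast<double>(frequency_);
    }

    double elapsedMilliseconds() const { return elapsedSeconds() * 1000.0; }

private:
    // Current counter value, in counts.
    static std::int64_t queryCounter() {
#ifdef _WIN32
        LARGE_INTEGER count;
        QueryPerformanceCounter(&count);
        return count.QuadPart;  // 64-bit member of the union
#else
        // Stand-in for non-Windows builds: monotonic clock in nanoseconds.
        using namespace std::chrono;
        return duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count();
#endif
    }

    // Counts per second.
    static std::int64_t queryFrequency() {
#ifdef _WIN32
        LARGE_INTEGER freq;
        QueryPerformanceFrequency(&freq);
        return freq.QuadPart;
#else
        return 1000000000LL;  // the stand-in counter ticks in nanoseconds
#endif
    }

    std::int64_t start_;
    std::int64_t stop_;
    std::int64_t frequency_;
};
```

Usage would be along the lines of: create an HRTimer, call start() before the code being measured and stop() after it, then read elapsedMilliseconds(). The frequency is queried once in the constructor since it does not change while the system is running.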

