Discrete Pulse Transforms and Rants…
I don’t know if I mentioned this before, but my father is a maths professor at the University of Pretoria. I’ll be doing my honors year next year, and I may be able to get credit for one of my subjects by helping my dad develop an algorithm/program to do the discrete pulse transform in 2D. The inventor of these transforms (also referred to as LULU operators) is a maths professor at the University of Stellenbosch.
My role isn’t an important one: my task is to take my dad’s semi-working (or not) MATLAB programs, rework and optimize them in C++, and then maybe focus on porting them over to a GPGPU format. So far this has been a tiring task, since MATLAB programming is basically scripting, and really badly formatted scripting at that. Now I have to take managed code written by my dad, who has almost no programming experience, and convert it into a fast and efficient C++ program.
Architecture-wise this is proving to be a nightmare, as I don’t know C++ intimately (and by that I mean knowing every little trick in the language), even though I’ve been working in C++ for years. I now have to think about the smallest performance problems. For example, will a lookup table really be faster, or will the resulting cache misses make it slower? If you remember, I tried using a lookup table in a previous project and it turned out to be a lot slower; I think this might be the reason.
The most embarrassing thing is that I work in the computer science department; this department produces almost as many computer science research papers as every other department countrywide put together. And there isn’t really anyone that I can talk to about extreme C++ optimization and techniques. There is way too much focus on software engineering and documentation, and not enough on the serious in-depth topics concerned with programming and its techniques. I’ve completed my undergraduate degree and the term “cache miss” has never cropped up; I think this is absolutely unacceptable. And now the department has moved almost all the undergraduate courses to Java, including the data structures and algorithms courses, so the degree might as well be a BSc “java monkey” or a BSc “I can’t program anything complicated”.
The entire degree program and evaluation system here has just been one frustration after another. Everything I’ve learnt I’ve had to teach myself, and in the courses that interested me and where I had a thorough understanding of the material (namely AI and graphics) my marks were bad. I’m just bad at memorizing material verbatim. I’m sorry, but I’m really angry about this: I’m probably in the top 10 programmers in my year group and my marks are sitting towards the bottom of the spectrum, just because my brain doesn’t work well with remembering exact phrases and dry theory. And the worst part is that I’m one of the top programmers not because I’m a great programmer or super smart, but because everyone else here is an idiot; it’s easy to memorize class notes and vomit them back up at exam time. I thought it was just here, but I’m starting to think that it might be the same everywhere…
Anyways I’ve gotten off topic with that rant about the sorry state of computer science in academia and my disillusionment with it.
I had a meeting with my dad and that professor from Stellenbosch and I felt like such an idiot. At least it seems that in mathematics there are still people who have talent and know what they are talking about; that discussion was crazy intense. It’s awesome to actually feel like an idiot for a change, as I don’t run into many humbling experiences in my field, unfortunately. The 2D application of the LULU operators / DPT hasn’t really been explored, and we don’t really know if we’ll even be able to apply it to a 2D image, but I guess that’s the goal of research. At least I’m learning things bit by bit…
Now I’m still carrying on with my conversions, but I’m stuck here staring at profiling results and I have a lot of ideas for different implementations. I’m so tired of having to write 6 different versions, test them to see how they perform, and then sit in front of Google and hope to find out what causes the performance differences. I’d love to just meet someone I can ask a ton of technical questions about compiler optimizations, cache techniques, storage of instructions in registers, etc. I’m so tired of self-study…
Guess it’s just wishful thinking at this stage. At the end of the day the cold reality is that I can’t rely on anyone here for advice.
I’m tired now and this rant has gone completely off topic. I’ll write up my initial progress in the weeks to come… I’ve also started on my game dev project: it’s going to be a total conversion for UE3, which should be tons of fun and a nice challenge. I’m pretty much living my life one challenge after another…
Another Disappointment: Crysis
The hype machine in 2007 has been in overdrive: first Bioshock, then Hellgate: London and now Crysis. Bioshock won many awards but in all honesty wasn’t that great a game. Yes, it looked pretty, but the gameplay was boring and repetitive; the only reason I forced myself to finish it was the storyline.
Hellgate: London, from what I hear, is terrible. I’m not even going to waste my time on it when I haven’t heard a single good thing about it, from atrocious maps to bad gameplay to bugs popping up. It’s starting to be compared to one of those bad Diablo 2 clones that came out.
Now the latest and greatest game at the moment is Crysis! The hype machine has outdone itself for this game. Yes, it’s the prettiest game available and … well, that’s about it. The game is let down by a pretty bland story and even blander gameplay. It feels like just another average shooter, or like a Far Cry expansion.
Coming back to the atrocious story: in the first part of the game you get stealthily inserted onto the island so as not to cause an incident, and 2 hours later, with no proper motivation, the US army invades as if it’s just been sitting around waiting to invade. Plot hole deluxe. The story is the same old one: alien artifacts are found, they were dormant, we woke them up, Oh NOES!!, the aliens are teh_evilz, We musts killz themz N0w! ugh…
Don’t get me wrong, the engine is utterly amazing from a technological standpoint and the game features some of the best modelling and texturing available, but in the end who cares if the game isn’t fun to play? It’s like putting a Ferrari shell on a Toyota Prius: it’ll look gorgeous, but it’s still a Prius and will drive like one.
I’m around halfway through the game on hard difficulty and honestly I can’t seem to find a reason to carry on playing. It’s boring the crap out of me. I was utterly hooked on Call of Duty 4, finished the game in 2 sessions, and am still playing its multiplayer enthusiastically, but yeah, Crysis… sigh…
There will no doubt be a large group of idiots claiming it to be the best thing since sliced bread, but then again it’s probably the same idiots that buy and enjoy console games aimed at single iq players. I want exciting, challenging, immersive gameplay, not shiny-looking boredom… I’m sorry, but Crysis for me was a “far cry” from what it was made out to be. There are too many idiots that judge a game on how good it looks; they seem to forget that a game is just that, a game, not an art exhibition…
Crytek – you guys have amazing programmers, don’t you think it’s about time to hire some amazing game designers to go with them…
A long weekend: 8800gtx
What was supposed to be an exciting and fun weekend turned out to be a complete nightmare filled with swearing, regret, some blood, some embarrassment, some more swearing, a lot of bandwidth wasted on campus researching and finally some screaming and yet more swearing…
I received an 8800gtx for work. I’m going to be doing some GPGPU programming on the card using CUDA for another image processing project, and I’m going to start learning DirectX 10 programming. Stay tuned for those blogs as they promise to be quite interesting. I’m going to learn a lot in the next couple of weeks.
I was super excited when I got this card as my 7800gt was starting to show its age; it was an absolute gem of a card but sadly it needed to be retired. I received the card, rushed home, ripped out the 7800gt and grabbed the 8800gtx (which at that point seemed only a slightly larger card) to install it in its place. Then the swearing started. See if you can see why in the image below?

The damn card wouldn’t fit inside the case! And I have (had) a full tower! The lip of the hard drive bays blocked the card. I have (had) a high-end case that I thought would last me for years to come, except that the card won’t fit, nor will this card fit in 80% of the cases on the market!!! It seems there are only a select few cases with an 8800gtx-compatible rating.
Anyways, the second problem is that, due to the immense size of the cooler, the card won’t fit over my chipset cooler, so I’m reduced to using the second PCI-e slot, which unfortunately on my motherboard is an 8x slot and not a 16x slot. Since I never planned on running two cards in SLI I bought a 570 chipset motherboard; I figured one 16x slot was more than enough. The performance drop should be minor, somewhere around 5-10% (I hope)…
At the end of the day I did get the card installed after 10 minutes of twisting and angling the card to get past the lip of the drive bays. Once installed, I was left with a wonderful 1-2mm of clearance between the bays and the card.
Thank goodness, it’s all over… So I got into Windows, installed the drivers, configured the card and fired up one of the games I was currently playing (since I’m on holiday now I actually have the time to sit and play for a bit, plus so many amazing titles have just been released). 15 minutes in, my machine blue-screened with an nv4_disp error (an nvidia driver error), an error with which I’m extremely familiar thanks to hours spent overclocking: the new card was overheating. I tried to manually set the fan to 100% via the drivers and figured this had worked, so I tried playing again, at which point it crashed again. I was really getting pissed off now, as I hadn’t expected a stock card to overheat, not to mention the fact that I was hoping to overclock the card a bit. So I ripped off the side panel, set up my house fan blowing directly on the card, fired up the game, maxed out the graphics settings and tried to kill the card! No luck, rock stable! Finally…
On the upside, this card is a monster: nothing I could throw at it slowed it down. It’s terrifying. The absolute raw processing power of this card is exceptional, but this performance comes at a price (apart from the monetary cost, of course): the card runs extremely hot, idling at around 70c and hitting the mid to high 80s at load… absolutely scary. I even contemplated importing a water cooling setup for it.
I was planning on buying an aftermarket cooler for the card even before I got it. I usually need to justify those kinds of purchases to myself, and it takes a while, but this time it’s starting to look like I won’t have to, since it’s not just a nice add-on, it’s a necessity…
Also, I sold my 7800GT for R800, and this morning I was also forced to sell my case for R650 (I loved my case so this isn’t something I’m all that happy about), so now I have R1450 for a new case… (money which I was going to use for a new hard drive or two)
Since the card is now stable (albeit using a ghetto solution), I came in to the office to check whether the CUDA framework downloads had finished (which they hadn’t, due to varsity’s atrocious connection of late) and to do some research on a new case.
As for cases there are two candidates: the gorgeous, cool and expensive Cooler Master CM 832 (R2000) or the elegant but hotter (temperature-wise) and much, much heavier Antec P182 (R1300). There is also the cheaper Antec 900 (R1000), but it’s ugly as hell and pretty small, and I’d literally pay money to not have to look at it! All the other cases are either not large enough for the card or just look cheap, with millions of flashing lights and fragile plastic bits. One nice thing about Antec and Cooler Master: both those cases are built like tanks so I know they’ll last, both have won numerous awards and both are considered the top two cases on the market right now. You’re starting to see why this choice is hard… I’m really undecided between the two as each one has its pros and cons… Money isn’t really an issue here, I just want to get the best case possible, one that will last me years…

Coolermaster CM Stacker 832

Antec p182

Antec 900
While doing this I also did some research on a bug I found in the nvidia drivers (can’t disable aspect ratio scaling) and stumbled upon the reason for my overheating: there is a bug in the drivers for the 8800gtx that never turns the fan on the card’s cooler up to 100%, since it tries to keep the noise levels down, but in Africa with our high ambient temps this is required. Setting the fan to 100% was the first thing I had done, and now I find out that it had been disregarded due to a driver bug and that I needed an external program to do it manually…
So what was supposed to be my first taste of CUDA ended up being a very frustrating and now potentially expensive weekend. I’m going home now to the very unwelcome sight of my PC’s innards lying bare on my desk. I’m going to eat something and just go to bed…
Radon Transform
Wow, it’s been a rough couple of weeks: I had to hand in my graphics project, study for a statistics test, fight off my allergies (I hate spring) and then study for my finals. At least I have my degree now.
Finally!
Anyways, I promised I’d write about the radon transformation I used to convert from the extracted images to a numerical format suitable for input into our neural network. This technique is extremely effective and is already used in industry for just such purposes. We tested it on demo day with very minimal data and it worked remarkably well.
Before I get knee deep in the technical aspects of the system, I need to mention this: due to the preprocessing done on the detected motion, there is no need for a complicated AI system; the radon transformation and the recursive feature extractor together remove a lot of noise and problems that might otherwise have been present. The radon transform especially helps, as it has scaling built in, so this does not have to be taken into account later on. Also, objects similar in shape have extremely similar radon transformations, so the training time of the neural network was reduced, as was the number of hidden neurons necessary.
In the final demo we used a neural network with 408 inputs and only 4 hidden neurons. Scary, isn’t it?
Introduction:
Now back to the nitty gritty: the radon transform. If you Google the “radon transform” you’ll probably get the Wikipedia page with a scary looking equation. I also got a fright the first time I saw this but after some research it’s really simple.
The basic idea of the radon transform (or my modified version thereof) is simple: if you look at your 2D image in the XY plane, you simply flatten the image onto the X axis (figure 1), then divide the X axis into several beams and work out the number of pixels within each beam. Your output will be the pixel contributions of the object to each beam. Then you rotate the object and flatten it once again. Doing this for multiple angles will give you a very good representation of the object’s shape.
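To make the "flatten and count" idea concrete, here is a minimal C++ sketch of a single un-rotated projection. It is not the code from radonTransformer.h: the BinaryImage struct and the flattenOntoX function are hypothetical names made up for illustration, and it assumes whole pixels are simply counted into the beam their column falls in (no sub-beams yet).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical binary image: row-major storage, non-zero means "object pixel".
struct BinaryImage {
    std::size_t width;
    std::size_t height;
    std::vector<unsigned char> pixels;
};

// Flatten the image onto the X axis and count the object pixels per beam.
// Columns are split into numBeams groups of roughly equal width.
std::vector<double> flattenOntoX(const BinaryImage& img, std::size_t numBeams) {
    assert(numBeams > 0);
    assert(img.pixels.size() == img.width * img.height);
    std::vector<double> beams(numBeams, 0.0);
    for (std::size_t y = 0; y < img.height; ++y) {
        for (std::size_t x = 0; x < img.width; ++x) {
            if (img.pixels[y * img.width + x] != 0) {
                // Map the column index to a beam index in [0, numBeams).
                const std::size_t beam = x * numBeams / img.width;
                beams[beam] += 1.0;
            }
        }
    }
    return beams;
}
```

Rotating the object and calling something like this again for each angle would give the multi-angle representation described above; the sub-beam refinement later in the post replaces the crude whole-pixel counting.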
The most basic (and unfortunately most commonly used) technique for image classification is to simply get the centroid of the object and then trace the outline of the object, giving you a silhouette. Now this doesn’t sound so bad, does it? Well, it is. Firstly, it doesn’t handle broken-up images well (not without major preprocessing or modification), and it also loses a lot of detail and can produce false matches. In figure 1 below we have the radon transform of a solid circle and a hollow circle. A standard outline trace would give exactly the same result for these obviously different shapes, while, as you can see, the radon transform (in one projection) gives completely different results. Again, this preprocessing will take the strain off the neural network (or other AI technique we’ll use for classification).
Okay, now for the technical details: as you remember, we flatten the image according to some projection. Figure 2 shows some of these projections. If you look at figure 2 you might notice that the flattened image’s top border can be seen as the graph of some function, so the number of pixels in a beam is approximately the area under the graph between the left and right end points of the beam.
Now that picture is misleading, as you might think that it is a square object that we’ve rotated and flattened, but it is in fact a single pixel. The algorithm works on a per-pixel basis. Instead of actually flattening the object, we simply work out the equation of the graph for a single rotated pixel and then use that to run through all the pixels in the object, work out the left-most and right-most extent of each one, and add them to their respective beams.
Now some of you are screaming that if we just rotate the pixels it will be wrong as we aren’t rotating the entire object but that is taken into account later on.
Now how do we calculate the area under the graph for each pixel and how do we figure out what beam to add it to since a beam will have lots and lots of pixels in it? Also the beam widths will differ per object.
What we do is simply divide the beams into lots of sub-beams, so that multiple sub-beams pass through each pixel. Then for each pixel we work out the left-most and right-most sub-beams that pass through the pixel. This becomes the domain for the equation of the graph we had earlier, and we loop through each sub-beam, calculate the pixel’s contribution to it (the area under the graph) and add it to the sub-beam total. This is shown in figure 3. What you’ll also notice from figure 3 is that there is a small degree of approximation to reduce the calculations required for the area, but remember that we’re talking about fractions of a pixel here, so the total approximation error can easily be ignored.
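As a rough illustration of the sub-beam bookkeeping, the sketch below handles only the simplest case, where the pixel is not rotated and its graph is a flat box of width 1 and height 1. The function name addBoxPixel and the convention that positions are measured in pixels from the left edge of sub-beam 0 are assumptions made for this example; the real code may organise things differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Add one un-rotated pixel (a box of width 1 and height 1 on the projection
// axis) to the sub-beam totals.
// centre:       pixel centre on the axis, in pixels from the left edge of sub-beam 0
// subBeamWidth: width of one sub-beam in pixels (assumed well below 1, so that
//               several sub-beams pass through each pixel)
void addBoxPixel(std::vector<double>& subBeams, double centre, double subBeamWidth) {
    assert(subBeamWidth > 0.0);
    const double left = centre - 0.5;
    const double right = centre + 0.5;
    // Left-most and right-most sub-beams that pass through the pixel,
    // clamped to the sub-beams that actually exist.
    long first = static_cast<long>(std::floor(left / subBeamWidth));
    long last = static_cast<long>(std::floor(right / subBeamWidth));
    first = std::max(first, 0L);
    last = std::min(last, static_cast<long>(subBeams.size()) - 1);
    for (long i = first; i <= last; ++i) {
        // Overlap between the pixel and sub-beam i; the area is overlap * height 1.
        const double lo = std::max(left, i * subBeamWidth);
        const double hi = std::min(right, (i + 1) * subBeamWidth);
        if (hi > lo) {
            subBeams[static_cast<std::size_t>(i)] += hi - lo;
        }
    }
}
```

With a sub-beam width of 0.25 and a pixel centred at 1.0, the pixel covers sub-beams 2 to 5 and each one receives 0.25, so the pixel still contributes a total area of 1.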
Now for each projection we run through each pixel and add it to the appropriate sub-beams. Once this is complete we sum the sub-beams up into the initial number of beams and then divide each beam by the scaling factor. The scaling factor is simply the total pixels over the beam width; this reduces the total area for the beams to 1. So every object gets reduced to an n-beam representation where the sum of all beams is equal to 1.
Okay, my explanation is very basic and I’m sure mathematicians would point out various mistakes and so on, but I’m trying to make this easy to understand and follow; it is not meant as a 100% mathematically accurate explanation. Obviously, if you wish to implement something like this, you wouldn’t use only my guide here as a reference. I’ve also left out some details, but they should become apparent from the explanations below.
I’m struggling to find a good way to structure this guide, so I’m just going to run through the algorithm quickly to finish off.
Preparation:
The first steps we need to take before we can process the object are to get the total number of pixels and work out the centroid and the approximate radius of the object. Using the radius we work out the number of beams and sub-beams we need for the transformation. Remember that we want several sub-beams to pass through each pixel.
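The preparation step could look something like the sketch below. The post does not say how the radius or the beam counts are actually computed, so everything here is an assumption: the radius is taken as the distance from the centroid to the farthest pixel centre plus half a pixel diagonal, the number of beams is fixed by the caller, and sub-beams are chosen to be at most a quarter of a pixel wide. Pixel, ObjectStats, BeamLayout, measureObject and layoutBeams are all hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pixel { int x; int y; };

struct ObjectStats {
    std::size_t numPixels;
    double cx, cy;   // centroid
    double radius;   // approximate radius around the centroid
};

// Count the pixels, work out the centroid and an approximate radius.
ObjectStats measureObject(const std::vector<Pixel>& pixels) {
    assert(!pixels.empty());
    double sumX = 0.0, sumY = 0.0;
    for (const Pixel& p : pixels) { sumX += p.x; sumY += p.y; }
    const double cx = sumX / pixels.size();
    const double cy = sumY / pixels.size();
    double maxDistSq = 0.0;
    for (const Pixel& p : pixels) {
        const double dx = p.x - cx, dy = p.y - cy;
        maxDistSq = std::max(maxDistSq, dx * dx + dy * dy);
    }
    // Pad by half a pixel diagonal so whole pixels fit inside the radius (assumption).
    return ObjectStats{pixels.size(), cx, cy, std::sqrt(maxDistSq) + std::sqrt(0.5)};
}

struct BeamLayout {
    double beamWidth;             // in pixels
    std::size_t subBeamsPerBeam;
    double subBeamWidth;          // in pixels
};

// Spread numBeams across the object's diameter and split each beam so that
// no sub-beam is wider than maxSubBeamWidth pixels.
BeamLayout layoutBeams(double radius, std::size_t numBeams, double maxSubBeamWidth = 0.25) {
    assert(radius > 0.0 && numBeams > 0 && maxSubBeamWidth > 0.0);
    const double beamWidth = 2.0 * radius / numBeams;
    std::size_t perBeam = static_cast<std::size_t>(std::ceil(beamWidth / maxSubBeamWidth));
    if (perBeam == 0) perBeam = 1;
    return BeamLayout{beamWidth, perBeam, beamWidth / perBeam};
}
```

Fixing the number of beams and letting the beam width follow the radius is one way to get the built-in scaling mentioned earlier, since a large and a small copy of the same shape end up with the same number of outputs.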
Projection:
Now we run each of our projection functions to calculate the sub-beam totals. I’ll run through the basic procedure for a projection:
1. Work out the center of the pixel on the new axis (this is where the rotation of the object comes into play).
2. Work out the left-most and right-most sub-beams that pass through the pixel.
3. For each sub-beam, add the pixel’s contribution to it.
Note: For some equations there is an incline to the graph, so this needs to be calculated too and processed separately, i.e. work out the left-most and right-most sub-beams for the increasing incline and for the decreasing incline, and then use those to work out each section separately.
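Here is one possible C++ sketch of a projection at an arbitrary angle. It takes a different route from the one described above: instead of approximating the area and treating the two inclines in separate loops, it uses the exact cumulative area of the rotated pixel's profile (a trapezoid with an increasing incline, a flat top and a decreasing incline) and subtracts the values at the two edges of each sub-beam. The names profileAreaLeftOf and projectObject, the parameter layout and the choice to shift the axis by the radius are all assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pixel { int x; int y; };

// Area of a rotated unit pixel's profile lying left of offset t from the
// pixel centre, where a = |cos(theta)| and b = |sin(theta)|.
double profileAreaLeftOf(double t, double a, double b) {
    assert(std::max(a, b) > 0.0);
    const double outer = 0.5 * (a + b);            // half-width of the whole profile
    const double inner = 0.5 * std::fabs(a - b);   // half-width of the flat top
    const double height = 1.0 / std::max(a, b);    // height of the flat top
    if (t <= -outer) return 0.0;
    if (t >= outer) return 1.0;
    const double rampWidth = outer - inner;
    if (t < -inner) {                              // on the increasing incline
        const double d = t + outer;
        return 0.5 * height * d * d / rampWidth;
    }
    if (t <= inner) {                              // on the flat top
        return 0.5 * height * rampWidth + height * (t + inner);
    }
    const double d = outer - t;                    // on the decreasing incline
    return 1.0 - 0.5 * height * d * d / rampWidth;
}

// Project every pixel onto the axis at angle theta (radians) and add its
// contribution to the sub-beams. The axis is shifted by radius so that the
// object starts at position 0.
void projectObject(const std::vector<Pixel>& pixels, double cx, double cy,
                   double radius, double theta, double subBeamWidth,
                   std::vector<double>& subBeams) {
    assert(subBeamWidth > 0.0 && !subBeams.empty());
    const double c = std::cos(theta), s = std::sin(theta);
    const double a = std::fabs(c), b = std::fabs(s);
    const double halfWidth = 0.5 * (a + b);
    const long count = static_cast<long>(subBeams.size());
    for (const Pixel& p : pixels) {
        // Step 1: centre of the pixel on the new axis.
        const double centre = (p.x - cx) * c + (p.y - cy) * s + radius;
        // Step 2: left-most and right-most sub-beams passing through the pixel.
        long first = static_cast<long>(std::floor((centre - halfWidth) / subBeamWidth));
        long last = static_cast<long>(std::floor((centre + halfWidth) / subBeamWidth));
        first = std::max(first, 0L);
        last = std::min(last, count - 1);
        // Step 3: add the pixel's contribution to each of those sub-beams.
        for (long i = first; i <= last; ++i) {
            const double lo = i * subBeamWidth - centre;
            const double hi = (i + 1) * subBeamWidth - centre;
            subBeams[static_cast<std::size_t>(i)] +=
                profileAreaLeftOf(hi, a, b) - profileAreaLeftOf(lo, a, b);
        }
    }
}
```

At 0 degrees the ramps have zero width and this reduces to the flat box case; at 45 degrees the flat top vanishes and the profile is a triangle. Either way each pixel adds a total area of 1 across the sub-beams it touches, as long as none of them fall outside the array.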
Combine Sub-beams:
Now once all the sub-beams have been calculated, we work out the scaling factor, which is beamWidth / numPixels. We then sum all the sub-beams into beams (per projection) and multiply each one by the scaling factor. And that’s it. We have our complete numerical representation of our image.
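A small sketch of the combining step, under stated assumptions: combineSubBeams is a hypothetical name, and the scaling factor is passed in rather than hard-coded. The text above uses beamWidth / numPixels; whether that gives a total of exactly 1 depends on the units of the sub-beam totals, which aren't spelled out here. If each pixel contributes an area of 1 to the sub-beams, a factor of 1.0 / numPixels makes the beams sum to 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum consecutive groups of subBeamsPerBeam sub-beams into beams for one
// projection, then multiply each beam by the scaling factor.
std::vector<double> combineSubBeams(const std::vector<double>& subBeams,
                                    std::size_t subBeamsPerBeam, double scale) {
    assert(subBeamsPerBeam > 0);
    assert(subBeams.size() % subBeamsPerBeam == 0);
    std::vector<double> beams(subBeams.size() / subBeamsPerBeam, 0.0);
    for (std::size_t i = 0; i < subBeams.size(); ++i) {
        beams[i / subBeamsPerBeam] += subBeams[i];
    }
    for (double& beam : beams) {
        beam *= scale;
    }
    return beams;
}
```

Running this once per projection and concatenating the results would give the flat list of numbers that is fed to the neural network.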
Note: I used only 8 projections as I had very limited CPU time left at this stage of the project and had to limit the amount of processing that needed to be done. Obviously more projections would be better, but then again too many would be worse. A fine balance needs to be found; I personally think that 8 projections are more than sufficient for my needs. Again, GPGPU programming would be so useful here!
C++ Source Code: radonTransformer.h


