Tag: computer science


FFT in an SQL Query

I just came across this, and thought I’d share it with you whilst I’m working on a couple of things and don’t have time to post properly.

The Fast Fourier Transform, implemented in an SQL query. This guy must be absolutely insane. Kudos to him!



On Autonomous Robotics

Long time no blog! I’ve now finished uni and have graduated, and thought I would put up a reflection on my final year project, entitled Autonomous Robot Operation for Simulated Disaster Environments.

It was a group project which I undertook with 4 other Computer Science students in my year. The five of us worked with the Warwick Mobile Robotics group, who have for some years been developing tele-operated robots as a final year group project for entrance into the RoboCup Rescue competition. The competition tests robots in simulated earthquake damaged buildings, with the aim being to identify “victims” – simulated signs of life that might suggest a human is trapped. The competition has sections tailored for both autonomous and tele-operated robots, and this year WMR wished to compete in both regions. They would design and build a robot using which we would implement algorithms performing Simultaneous Localisation and Mapping (SLAM) and Victim Identification.

I would be working with Jonathan Abbey on the SLAM algorithms – an exciting and difficult set of problems. SLAM is the problem of an autonomous robot navigating an unknown environment whilst simultaneously generating an accurate map of that environment. It’s an unsolved problem – there are many potential solutions, but none are complete. My tasks involved Data Association, the problem of determining which of the features the robot can currently see have been seen before; Occupancy Grid Mapping, which generates a map of the environment by recording what (if anything) occupies each grid location, and can be visualised as a standard map; as well as dealing with low-level device code and porting the code onto the robot. Two other members of the team, Jan Vosecky and Jason Cook, worked on Victim Identification code, which attempted to identify potential victims through vision using both standard and Infra-Red cameras.

As part of the project, I was fortunate enough to travel to Magdeburg, Germany for the RoboCup German Open. There, I was able to observe the solutions put forward by other teams. Of most interest were the entrants by Darmstadt and Koblenz. Both these teams have worked hard for many years on these problems, and have incredible robots which are very impressive to watch (check out Koblenz’s videos on YouTube – particularly the loop closing, which is just phenomenal). Our robot, unfortunately, was significantly less impressive. Infact, if it made a single turn we considered it to be a good run. The project will continue, and has a good team of CS students taking it over next year, so with a bit more development WMR should be very competitive in the autonomous region at future competitions.

More than anything, though, this project has really piqued my interest in autonomous robotics. It’s really interesting to read about the DARPA Grand Challenge, which pits unmanned vehicles against each other. These vehicles are incredibly impressive, with an ability to not only map their environment and complete navigational objectives, but to also observe obstacles in the form of other vehicles and to obey traffic laws.

I was watching the Grand Prix this weekend, and I had a thought: how far can we push autonomous vehicles? Could we, for instance, go so far as to create an autonomous Formula 1 series?

Imagine it: F1 cars, minus the heavy drivers and safety shells, autonomously driving around race tracks at speed. This poses more challenges than it is possible to list here, but amongst the major of those is the sheer speed of motion compared to the speed of computation. Can we process sensor data fast enough to react to the environment? Can we get sensors that feed data fast enough for us to even be able to react in a reasonable time at speed? Real-time SLAM is something no-one seems to worry about at the moment (and perhaps rightly so – if it’s not solved, why make it faster?), but it’s a challenge that needs to be faced sooner or later if autonomous robots are going to fulfill everything researchers are dreaming of.

Another challenge is simply keeping control of the car – F1 drivers can detect and react quickly to understeer, oversteer, tyre graining and a host of other problems that haunt these high performance cars. Can we make a computer that can also deal with all these?

And how about overtaking? Could we design intelligent AI algorithms which would plan and execute maneuvers – and could these algorithms perhaps teach us something, with no fear and the ability to rapidly judge and weigh multiple options?

I, for one, would love to see this. It looks like the University of Essex might be working along these lines, but it’s quite far away. Maybe if I get bored of Web Development I’ll try and start something…

In the meantime, enjoy a lovely video of our robot from our project, along with some fairly horrible music I composed using Garage Band.


On the importance of Data Integrity

Not long ago, I attended a seminar by Chris Date at Warwick Uni DCS on the topic on the advantages of the Closed World Assumption as opposed to the Open World Assumption. I wont be talking about these here, but I’ll just give a quick introduction. In the closed world assumption, if a set of data exists in a database, then it is assumed to be true (for example, Employee Steve has ID 1 and earns 20K). If a set of data could exist in the database, but doesn’t, then it is assumed to be false. In the open world assumption, however, this data can be either true or false – the assumption is that it doesn’t exist in the database because we don’t know if it’s true or not. It’s a very interesting topic, perhaps I’ll talk about it some other time.

The talk was incredibly interesting, but there was one thing he said which really struck a chord, and that was:

The most important property a database can have is that its data is correct.

This struck me in particular as a result of a number of issues I’ve had with recent projects whereby the entire website was crippled because of a couple of bit of incorrect data in the database. Could this have been avoided? Definitely. I could have added lots of constraints to the database, and I could have performed more unit tests on the code itself (I’d only just gotten into doing unit tests at this time, so they weren’t quite as robust as they could have  been). It got me wondering, however, whether it would be useful to unit test that database design itself. It’s all good having “correct” code, but if the underlying database will allow inconsistencies, then there’s still a risk that the system can break. I’m not sure if unit testing the database design at a level below the code would be worthwhile, but it’s something to think about!


Research Directions

I’ve recently finished the first term of my final year of university, and for probably the first time, I’m actually sad to have completed one of my modules. The module was entitled “Research Directions”, and is (apparently) fairly unique to Warwick University. It’s organised as a series of graduate seminars, meaning the module as a whole covers a wide range of subjects, mainly being taught by guest lecturers who are experts in their particular fields. This meant that we were given decent groundings in areas including Quantum Computing, Bioinformatics and Trading Algorithms. Each member of the group presented a topic taken from the Communications of the ACM, which gave us brief introductions to more topics such as Smoothed Complexity, Probabilistic Databases and Model Checking.

What made this module great? It was relaxed, free-flowing and interesting. What more can you ask for? If only all modules were like this.


Theory and Applications

The degree I do at Warwick is very theory-heavy, in that we learn a lot of algorithms, techniques and equations, but put very little of it into practice. I quite enjoy the theory part of it, but it’s always fun to do the implementation (after all, isn’t the general perception of Computer Science that it’s a degree in programming?). Today, I got one of those wonderful moments where the theory is put into practice outside of the modules, when I had to implement Bresenham’s line drawing algorithm as part of our group project. For some reason, I found this really exciting! Of course, I had to look up how to do it, even though the module was only last year…


On Work and Music

Wow, I’ve kind of neglected this haven’t I? Sorry!

I’ve been really busy recently with my project. I’ve been having loads of issues with Seg Faults and Memory Leaks and the like taking up pretty much all my time. I’ll post up a lovely debrief once it’s all over and done with. There’s about 5 weeks til the presentations, by which point everything needs to be working, so it’s time to get into serious panic mode. There’s plenty to do yet, but I’m sure it’ll get there.

Walking home from uni today I had the strange desire to build a web app. I’ve done very little web work in a good while, besides a small flash game which I will be unleashing soon hopefully. I’d quite like to get a good start done on one of the various ideas I have for Imaginary Roots or Slicecake, but just don’t have the time. I’ll have to hope this enthusiasm doesn’t dwindle over the next few weeks!

I’ve been listening to a lot of Jazz Fusion recently, mostly Metro and Greg Howe. Does anyone happen to know any really good Jazz/Jazz Fusion guitar teachers around? I’d love to take some with Greg Howe, but can’t really afford the $75 an hour fee. Still, it would probably be much better than lessons I’d get anywhere else.


Scrabble is Big!

I ran some simulations over the weekend, and I’ve found some interesting statistics I thought I’d share with you.

After 10,000 simulated games, the average number of moves per game was 23, and the average number of choices per move (i.e. the number of moves you could actually make at each move) is around 970. This means that a graph which stored all possible configurations of the board would contain around 970^23 nodes (ignoring repetitions). That’s huge! As a contrast, chess (which is considered pretty damned hard to solve) has on average 57 moves per game (from a quick Google search, this may be inaccurate) and has a branching factor of around 35, making a graph of size 35^57.
What have I gotten myself into?!



One deadline out of the way! We had to hand in the specification last Thursday, and so here’s a bit of an update on where I am with the project.

Firstly, the goals of the project. Mainly, this is going to be a reasearch project, and so I will be looking a lot into the existing methodologies in computer based Scrabble agents, and trying to extend them where I can. The goals of the project are going to be to:

  • First, define what optimality means for a Scrabble strategy
  • Produce an optimal algorithm
  • Attempt to find a greedy algorithm which produces comparable results to the current world class agents

After doing this, I will hopefully find the time to attempt to generalise the findings to games of the same classification to Scrabble (imperfect information games), but this is more of a pie-in-the-sky objective.
I used LaTeX to produce the specification report, and I have to say it’s proved invaluable. It makes document production so much easier, I will definitely be using it in the future. I’ve been a bit soft and have started using TeX Shop, which takes a lot of the headache out of it (syntax highlighting, a few useful macros, and pdf production in a handy little button). Give it a go!

Update: I’ve just looked back at this, and it’s amazing how the project evolved from these somewhat optimistic initial objectives. But I suppose the majority of projects have the same form of evolution.


SQL and Normalisation

I was just having a discussion with a co-worker about some SQL they were doing. They had 3 tables, which looked something like this:

user( user_id, ... );
pages( page_id, ... );
comments( comment_id, user_id, page_id, ... );

So that people can comment on both users and pages. The user_id and page_id in the comments table both defaulted to NULL, and only one would be non-null at any one time. He argued that this was a fine solution to the problem since it minimised the number of tables you had, and that he would gain no actual data storage by separating it out and having two join tables between user-comments and page-comments. I disagree.

I believe that since a comment is an entity which can be applied to numerous things, the table for it should be self-contained, i.e. not have any foreign keys for the things it is applied to. This changes, of course, when the comments apply to only one thing, when it stops being “comments” and starts being “user_comments”, for example. Although the actual relation between user-comments and page-comments is one-to-many, because there are multiple entities I believe it is different case, and thus deserves join tables as per a many-to-many relation. In terms of code, having the join tables makes things a lot simpler (although he argued that you lose a bit of efficieny since you need two queries to save an entity instead of just one for his design). Also, flexibility is greatley increased with join tables – what if the client wants to add a “media” table, which can also have comments? Are you going to have to go back and modify all your queries so that it caters for having 3 possible null fields instead of 2?

My collegue was asking me because he wanted to do a query which did some joins, and then did some other joins based on the value of the non-null field in comments. I wasn’t sure of the actual query, so don’t know if the join-tables solution would have helped (though it proably would…), but he was going down the route of adding SQL conditional statements into his query. Again, here I believe that as little logic processing as possible should be done in everything except the language you are building in (be it PHP or whatever).

What do you think? Am I over-normalising my tables?